Visualizing Chinese media 2 months, 16 days ago. by nickster

For data geeks interested in the developing world, few places are more compelling sources of numbers than China. This owes much to its legendary economic growth, the staggering size of its population and global footprint, and its hybrid political system. But there is another, often overlooked characteristic of the country at work here: its relentless pursuit of what it calls “scientific development”, which emphasizes scientific research as a means to achieve social harmony and balanced economic growth. That pursuit has led to an explosion in data-fueled, science-based policy. As a result, China is now one of the largest and most sophisticated data-gathering entities in the world.

There’s a good reason for this. Unlike China’s early post-revolution cadres, the ranks of China’s top leadership today are brimming with scientists and engineers, including President Hu Jintao, who has a degree in hydraulic engineering. When government “works”, these technocrats steer Chinese policy down a painfully cautious course based on five-, ten-, and even twenty-year plans crafted to satisfy discrete social, economic, and technological benchmarks. At any given moment, the country is teeming with pilot projects spanning areas like subsidized housing, health care, industrial development, and family planning, which will ultimately be scrutinized by the country’s National Development and Reform Commission for use at the national level.

This science-based approach is exactly why China has recently come forward with ambitious carbon emissions targets: global warming has a direct, significant impact on its population, and therefore on social stability. None of these projects could be completed without good data, and China knows it.

While we’ve been emphasizing social media data in recent posts, we hope to shine more light on the state of Chinese data and bring more of it into the repository in the near future. To this end, and as a special holiday treat, we’re releasing a visualization of major Chinese websites we scraped this past October during the country’s meticulously executed celebration of the 60th anniversary of its founding. We find the bright colors and flashing lights to be particularly seasonally appropriate.

Click here for the visualization

Twittersong 1 year, 1 month ago. by mrflip

Took the 50M twitter messages we saw between mid-November and mid-January and used Wordle to make a word cloud:  http://bit.ly/tweetcloud Fun!

(If you’re not familiar with a word cloud: the larger a word, the more often it was used. The colors & positions don’t mean anything, they’re just for fun. We stripped out the little words (a, the, with, …), leaving everything that appeared more than 10,000 times in the 50 million+ tweets we examined.)
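For the curious, the counting step behind a word cloud like this can be sketched in a few lines of Python. This is only an illustration, not the code we actually ran: the function name `word_frequencies`, the tiny `STOP_WORDS` set, and the sample messages are all made up for the example, and a real run would use a much longer stop-word list and a `min_count` of 10,000.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real list of
# "little words" would be much longer.
STOP_WORDS = {"a", "the", "with"}


def word_frequencies(messages, stop_words=STOP_WORDS, min_count=1):
    """Count words across messages, skipping stop words.

    Returns (word, count) pairs in decreasing order of frequency,
    keeping only words that appear at least `min_count` times.
    """
    counts = Counter()
    for message in messages:
        # Lowercase, then pull out runs of letters, digits and apostrophes.
        words = re.findall(r"[a-z0-9'’]+", message.lower())
        counts.update(w for w in words if w not in stop_words)
    return [(w, n) for w, n in counts.most_common() if n >= min_count]


# Made-up sample messages, just to show the shape of the result.
sample = ["The snow is awesome", "Snow with a video", "awesome snow"]
print(word_frequencies(sample, min_count=2))
# → [('snow', 3), ('awesome', 2)]
```

The sorted list of pairs is what a tool like Wordle turns into font sizes: the bigger the count, the bigger the word.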

Then I looked again at the filtered list and noticed something… just awesome.

Here are the forty most-commonly used words, in their exact order of decreasing frequency:

It’s time, Twitter. Love/Christmas blog:

Home! Thanks, people…

Night post:

Getting happy
watching morning
that’s tonight.
Tomorrow: looking news, trying nice? Check.

2009: Hope.
Week: 2008.

Little video:

snow.

Live free. Life. Awesome days!

Doing:

Feel house ready.
Look cool.
Sleep.
Yeah world!

I like your poem, Twitter.
A lot.

(more…)

Geography of Newspaper Endorsements in the 2008 US Presidential Election 1 year, 4 months ago. by mrflip

A visualization experiment on interconnecting datasets:

[Figure: Bubble chart and histogram showing endorsements by size of newspaper's circulation. Caption: Geography of Newspaper Endorsements in the 2008 US Presidential Election]

Apart from the unsurprising evidence that (choose one: [[Obama is the overwhelming choice]] -OR- [[there is overwhelming liberal media bias]]), I’m struck by the mismatch between papers’ endorsements and their “Red State” vs “Blue State” alignment.

  • I think the amount of red in the blue states is a market effect. If you’re the Boston Herald, there’s no percentage in agreeing with the Boston Globe; similarly Daily News vs New York Post, SF Examiner vs SF Chronicle. That’s why the Tribune endorsement, even accounting for hometown bias, is so striking. I don’t mean that they’re cynically pandering; rather, that in a market with multiple papers, readers and journalists are efficiently sorted into two separate camps. (And the axis doesn’t have to be political: though the Chronic and the Statesman are politically distinct, I see their main difference being lifestyle vs. traditional news.)
  • The amount of blue in the red states highlights how foolishly incomplete the “Red State/Blue State” model is for anything but electoral college returns. The largest part of the Red/Blue split is Rural/Urban: look at the electoral cartogram for the last election and almost every city is blue, even in the South and the Mountain states, and almost all of our rural area is red. The exceptions, chiefly Dallas, Houston and Boise, stand noticeably alone as having red unpaired with blue. (Though in this election even the Houston Chronicle is endorsing Obama.) I’m going to try to make a map colored by county, but there are no good off-the-shelf tools for doing this (that I’ve found).

This seems to speak to why so many on the right feel there’s an MSM bias: 50% of the country is urban, 50% rural, but newspapers are located exclusively in urban areas [see below]. So, surprisingly, the major right-leaning papers are all located in parts of the country we consider highly leftish. The largest urban areas are thus both the most liberal and the most likely to have a sizeable conservative target audience.

(more…)