That said, from polls we know that the overall vote is going to be very close to 50/50, so the lopsided totals are a sign of skew in Twitter users, not the country.
No, because those phrases do not appear to be linked to who voted or how they voted in any statistically meaningful way. All you can derive from this is that on election day people are talking about the election in high volume. Hardly a revolutionary insight.
What we used for counting is slightly different from what you see in the Twitter widgets (yeah, those tweets are from Twitter directly).
In our backend, we have a pretty conservative filter that matches a bag of phrases, such as "voted for barrack obama", "voted for pres obama", etc. The accuracy is over 95%. Of course, political tweets are full of sarcasm and humor, and Twitter is full of demographic bias. This is just a fun project for us.
We don't remove RT tweets; instead, we count each user only once. If a user retweeted Michelle, s/he will probably vote for Obama. But if a user has several tweets in favor of Obama, that user is still counted only once.
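For anyone curious what a filter like this might look like, here is a minimal sketch, not the project's actual code. The two Obama phrases are the ones quoted above; the Romney phrases, the function names, and the rule that a user's first classified tweet wins are all assumptions made for illustration.

```python
import re

# Hypothetical phrase lists. The Obama phrases are quoted from the comment
# above; the Romney phrases are assumed for illustration only.
PHRASES = {
    "obama": ["voted for barrack obama", "voted for pres obama"],
    "romney": ["voted for mitt romney", "voted for gov romney"],
}


def classify(text):
    """Return a candidate name only if exactly one candidate's phrases match."""
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = {
        name
        for name, phrases in PHRASES.items()
        if any(phrase in normalized for phrase in phrases)
    }
    # Conservative: ambiguous or unmatched tweets are thrown away.
    return hits.pop() if len(hits) == 1 else None


def count_votes(tweets):
    """Count each user at most once.

    tweets is an iterable of (user_id, text) pairs. Retweets are kept, but
    only a user's first classified tweet counts (an assumed tie-break rule).
    """
    first_label = {}
    for user_id, text in tweets:
        label = classify(text)
        if label is not None and user_id not in first_label:
            first_label[user_id] = label
    counts = {name: 0 for name in PHRASES}
    for label in first_label.values():
        counts[label] += 1
    return counts
```

A tweet that only says "I voted today" matches nothing and is dropped, which is the "classify only if we're confident" behavior; the cost is that most tweets are ignored.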
How can you draw this conclusion at this point in the process? I'm genuinely curious about your filtering scheme and how it extracts information from such a noisy data stream.
This is not scientific research, so I didn't compute std, t-stats, etc. But I did pull a few hundred tweets from our database and counted how many we had classified wrongly. That's where the number comes from. The filtering scheme is very simple: classify only if we're confident. There are many tweets containing "voted", but we only took the ones we had strong confidence in and threw away the rest. For the complete set of keywords used for filtering, please feel free to email.
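If someone did want an error bar on a hand-checked accuracy figure like this, a Wilson score interval is one common choice. The sketch below is only an illustration: the function name is made up, and any counts you pass in would be your own, since the actual sample size and error count are not given here.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical usage: 285 correct out of 300 hand-checked tweets.
low, high = wilson_interval(285, 300)
```

With a sample of a few hundred tweets the interval is several percentage points wide, so a single accuracy figure should be read as approximate.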
Very interesting. Of course, the map is biased to those who use the internet (which is why it shows Obama winning every state), but I wonder if this could still be used for predictions by normalizing each state to a known percentage for the "safe" states.
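One possible reading of that normalization idea, sketched under loud assumptions: measure how far the Twitter share sits from a known share in the "safe" states, on a log-odds scale, and subtract that gap elsewhere. The function names and the choice of a single constant log-odds offset are mine, not anything the project does, and every share must be strictly between 0 and 1.

```python
import math


def logit(p):
    """Log-odds of a share p, with 0 < p < 1."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a share."""
    return 1 / (1 + math.exp(-x))


def estimate_bias(twitter_share, known_share):
    """Mean log-odds gap between Twitter and known shares.

    Both arguments are dicts of state -> Obama share; only states present
    in both (the assumed "safe" states) are used.
    """
    states = twitter_share.keys() & known_share.keys()
    if not states:
        raise ValueError("no overlapping states to calibrate on")
    gaps = [logit(twitter_share[s]) - logit(known_share[s]) for s in states]
    return sum(gaps) / len(gaps)


def adjust(share, bias):
    """Remove the estimated log-odds bias from a raw Twitter share."""
    return inv_logit(logit(share) - bias)
```

This assumes the skew is the same in every state, which is a strong assumption; if Twitter demographics differ by state, a single offset will over- or under-correct.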
Fortunately, we'll have the data to compare against actual state or county results, since most of these folks leave their location settings enabled, too.
While it won't predict this election, the correlational data could be applied to future elections, assuming the primary political leanings of the electorate don't change too much in 4 years.
Well they would be mostly actual votes, but it causes Obama voters on Twitter to be more likely to mention having voted than Romney voters. You'd probably get more accurate results if you eliminated all these.
I personally find it scary when people think that "God's will" has anything to do with electing the leader of the most powerful country in the world for the next four years.
It starts to sound a lot like fundamentalism, and makes me wonder how different this is from parts of the world in constant violence due to "God's will".
I'm much more worried by the fact that voting is so widely considered "our civic duty" than by the fact that some people believe there is a God whose will affects the world.
- hate Americans who are like 'I always vote republican so of course I voted for romney'
- I'm not afraid to say who I voted for. I voted for Obama cause lets face it, Romney's just a dickhead.
- Sweet satire. "Why I voted for Mitt Romney" http://www.salon.com/2012/11/06/why_i_voted_for_mitt_romney/....