The live site trendingtopics.org is using MySQL for all 3 million articles and i...

CodeChutney · on Aug 4, 2009

Thanks for the quick reply. How many machines are running MySQL for you?

I was reading this blog post: http://www.metabrew.com/article/anti-rdbms-a-list-of-distrib...

I have not tried HBase or Hypertable myself yet, but the blog post says that they still have latency issues. What are your views?

pskomoroch · on Aug 4, 2009

We're just using a single c1.medium instance for the database right now. Trendingtopics.org is a relatively low-traffic, read-only site, and most of the reads are for a handful of URLs on the front page, which can be cached.

Also, after processing the raw log data with Hadoop, we only need to store and look up 3M records in the MySQL presentation layer, which is well within the capabilities of a tuned RDBMS. Many Rails sites are backed by MySQL, so I thought linking Hadoop/Hive to a common data workflow would make for a good example.

I've been hearing that recent improvements in HBase 0.20 could make it a contender (http://stackoverflow.com/questions/1022150/is-hbase-stable-a...), and some high-volume sites like Mahalo are already using it. That said, there are other data stores (Cassandra, Voldemort, Tokyo Tyrant) that might be worth exploring if a relational database isn't cutting it for you.