Building a complete Tweet index (blog.twitter.com)
117 points by ChrisArchitect on Nov 18, 2014 | 13 comments



Will this new index be accessible via the Twitter API? It seems like this only impacts manual searches via the web/client.


Not that I know of. This historical index has actually been in place for many months, but Twitter only recently decided to publish this article. The API sometimes has less than a week of data; I've noticed as little as 3-4 days. And on some days (roughly once every 3 months) it returns 50% less data, but so does the web app.


Yeah, this really isn't interesting to anyone except maybe their investors. If they provided this functionality in their API it would've been big news.


Interesting to think that half a trillion tweets is about 70TB of data, and that can be stored for about $10k/year in Dropbox.


Tweets contain a lot of metadata too though. In practice a tweet is really about 3K, not 140 bytes.


Yes, taking the example in the API docs [1] it gives me 3K of JSON, which can be gzip’d down to 1K.

[1]: https://dev.twitter.com/rest/reference/get/statuses/show/%3A...
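
For what it's worth, here is a rough back-of-the-envelope sketch of how those per-tweet sizes scale to half a trillion tweets. The assumptions are mine: decimal terabytes, and "3K" and "1K" read as 3,000 and 1,000 bytes.

```python
# Back-of-the-envelope totals for the sizes quoted in this thread.
# Assumptions: decimal units (1 TB = 10**12 bytes), "3K" = 3,000 bytes,
# "1K" = 1,000 bytes.
TWEETS = 500_000_000_000  # "half a trillion tweets"
TB = 10**12


def total_tb(bytes_per_tweet: int) -> float:
    """Total storage in terabytes for a given average tweet size."""
    return TWEETS * bytes_per_tweet / TB


for label, size in [
    ("text only, 140 B", 140),
    ("full JSON, ~3 KB", 3_000),
    ("gzipped JSON, ~1 KB", 1_000),
]:
    print(f"{label}: {total_tb(size):,.0f} TB")
```

So the 70TB figure only holds for the bare 140-byte text; with metadata it is closer to 1.5PB raw, or around 0.5PB gzipped, under these assumptions.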


And Dropbox is a storage middleman. Twitter could store it for less (and anyway, they are already storing it).


They're probably highly compressible as well.
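
A quick way to check that intuition is to gzip a batch of tweet-like JSON records and compare sizes. This is only a sketch: the record layout and field names below are made up for illustration and are not the real Twitter API schema, and real tweets will compress differently.

```python
import gzip
import json

# Hypothetical tweet-like records; the field names are illustrative only.
records = [
    {
        "id": i,
        "text": f"Example tweet number {i} about building a search index",
        "user": {"id": i % 10, "screen_name": f"user{i % 10}", "verified": False},
        "retweet_count": 0,
        "lang": "en",
    }
    for i in range(100)
]

raw = json.dumps(records).encode("utf-8")
packed = gzip.compress(raw)

# Repeated keys and similar values are what make this kind of data shrink.
print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes")
```

The repeated field names across records are the main reason the batch shrinks, which is presumably also true of real tweet metadata.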


Without sounding facetious: Am I able to search (or at least paginate) through all of my own DMs yet?


Didn't backtweet do this? It was the most useful twitter search ever before Twitter bought them and shut it down...


That's just news for "investors". I don't see it coming to the API, so I wonder why this is on Hacker News...


It's a technical post on their engineering blog about how they implemented it. Why wouldn't that belong on Hacker News?


Well, if it is used for the web client, one could probably use the same URL calls Twitter makes in the background to possibly get better real-time tweets than the API gives. I haven't tested it yet to see if that is true, but one would probably have more flexibility doing that anyway, being able to make calls from multiple IP addresses without having to worry about rate limiting by API key.



