
It is 1% of all tweets, which is about 50 tweets per second. Not the full firehose, but still very useful.


It is up to 1% per stream. With 2 streams you can stream up to 2% (minus overlaps). That is the theoretical limit; practice isn't quite that good, but it's not far off. In reality you can collect about 4-5 million tweets per day with one stream. If you filter on common terms like "http", "and", "for", "the", etc., you can easily collect tens of millions per day with 20 or so streams. If you take the time to develop some tech and pull hundreds of streams designed to have minimal overlap, you can pull 100+ million per day.
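As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes the "about 50 tweets per second" figure quoted above for a single 1% stream and a constant rate all day. The function names are made up for illustration, and the multi-stream figures are upper bounds that ignore overlap.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumption: one 1% stream delivers ~50 tweets/sec (figure quoted upthread).
STREAM_RATE_PER_SEC = 50


def tweets_per_day(rate_per_sec: int) -> int:
    """Daily volume for a stream running at a constant rate."""
    return rate_per_sec * SECONDS_PER_DAY


def upper_bound_per_day(n_streams: int) -> int:
    """Best case for n streams: zero overlap between them."""
    return n_streams * tweets_per_day(STREAM_RATE_PER_SEC)


def min_streams_for(target_per_day: int) -> int:
    """Fewest streams that could reach the target if nothing overlapped."""
    return math.ceil(target_per_day / tweets_per_day(STREAM_RATE_PER_SEC))


one_stream = tweets_per_day(STREAM_RATE_PER_SEC)  # 4,320,000 per day
twenty_streams = upper_bound_per_day(20)          # 86,400,000 per day at best
needed = min_streams_for(100_000_000)             # 24 streams at zero overlap
```

One stream at 50 tweets/sec works out to about 4.3 million per day, which lines up with the 4-5 million figure. The gap between 24 ideal streams and the "hundreds of streams" mentioned for 100+ million per day is presumably overlap and streams that don't fill their cap.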

And how do you pull more than 1 stream, you ask? The limit on streams isn't per app, it is per authorization. If you develop an app that has "Sign in with Twitter" and you collect auth keys for 100 people, you can run 100 streams, because the auth limits are per user, not per app.


Interested dummy here: You can get a 1% stream per user authorisation (sounds weird to me)? How do you minimize overlap?

I once had access to a 1% stream but thought it was a test version of the firehose (and that everyone would get the same 1%, to prevent the kind of combining you describe).


There is a sample stream which is supposed to be 1%, but the keyword-based filter streams are also theoretically up to 1% of tweets each. You can set up streams for different words and broaden your coverage.

https://developer.twitter.com/en/docs/tweets/filter-realtime...
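Since streams filtered on different words will return some of the same tweets, anything combining them has to deduplicate. A minimal sketch, assuming each tweet arrives as a dict with a unique "id" field and that the streams can be treated as plain iterables. The function names are hypothetical, and a real collector would read the streams concurrently rather than one after another.

```python
from typing import Any, Dict, Iterable, Iterator, List, Set

Tweet = Dict[str, Any]


def merge_unique(*streams: Iterable[Tweet]) -> Iterator[Tweet]:
    """Yield each tweet once, keyed on its "id", across all streams."""
    seen: Set[Any] = set()
    for stream in streams:
        for tweet in stream:
            tweet_id = tweet["id"]
            if tweet_id in seen:
                continue  # already delivered by an earlier stream
            seen.add(tweet_id)
            yield tweet


def overlap_fraction(*streams: List[Tweet]) -> float:
    """Share of received tweets that were duplicates of one already seen."""
    total = sum(len(s) for s in streams)
    if total == 0:
        return 0.0
    unique = sum(1 for _ in merge_unique(*streams))
    return (total - unique) / total


# Toy data standing in for two keyword-filtered streams.
stream_a = [{"id": 1, "text": "the cat"}, {"id": 2, "text": "the dog"}]
stream_b = [{"id": 2, "text": "the dog"}, {"id": 3, "text": "and so on"}]

merged_ids = [t["id"] for t in merge_unique(stream_a, stream_b)]
dup_share = overlap_fraction(stream_a, stream_b)
```

Here one of the four received tweets is a duplicate, so the merged result has three tweets. For a long-running collector the set of seen IDs grows without bound, so it would need to be capped or expired in some way.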


I could probably figure out the optimal set of keywords from all the twitter sample hose data I have...
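One way to approach that with a saved sample, sketched under heavy assumptions: treat it as a maximum-coverage problem and pick keywords greedily, each time taking the word that matches the most not-yet-covered tweets. Greedy selection is a standard heuristic here and is not guaranteed to be optimal. The sketch also ignores the 1% cap per stream, uses naive whitespace tokenization, and expects lowercase candidate keywords. All names are made up for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def greedy_keywords(
    texts: Sequence[str], candidates: Sequence[str], k: int
) -> Tuple[List[str], Set[int]]:
    """Pick up to k keywords, each adding the most newly covered tweets.

    Returns the chosen keywords and the indices of the covered texts.
    """
    token_sets = [set(text.lower().split()) for text in texts]
    # For every candidate, the indices of the sample tweets it would match.
    matches: Dict[str, Set[int]] = {
        kw: {i for i, tokens in enumerate(token_sets) if kw in tokens}
        for kw in candidates
    }

    chosen: List[str] = []
    covered: Set[int] = set()
    for _ in range(k):
        remaining = [kw for kw in candidates if kw not in chosen]
        if not remaining:
            break
        # max() keeps the first candidate on ties, so the result is deterministic.
        best = max(remaining, key=lambda kw: len(matches[kw] - covered))
        if not matches[best] - covered:
            break  # no remaining keyword adds any coverage
        chosen.append(best)
        covered |= matches[best]
    return chosen, covered


sample = ["the cat sat", "the dog ran", "a bird and a fish", "hello world"]
picked, covered_idx = greedy_keywords(sample, ["the", "and", "cat", "hello"], k=2)
```

On this toy sample "the" is picked first because it matches two tweets, then "and", which adds one more. "cat" is never chosen since every tweet it matches is already covered by "the", which is the overlap-minimizing behaviour you would want from separate streams.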



