"This is pre-alpha quality software, the result of a three
day project. It will crash your browser, leak your
passwords and destroy your home. This is actually my first
Chrome extension and I am no expert javascript programmer."
It's written in Ruby but doesn't really depend on Rails or anything else to run. We use it to compare a list of URLs between the version we are currently testing in CircleCI and our production site. Any differences are highlighted in a nice gallery, and we get a message in our Hipchat room.
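In rough outline, the check could look something like this (a minimal sketch, assuming plain HTTP fetches, made-up hostnames, and a hypothetical notify_hipchat helper; the real tool renders the differences as a gallery):

    require 'net/http'
    require 'digest'

    STAGING    = 'https://staging.example.com'  # assumption: the CircleCI build under test
    PRODUCTION = 'https://www.example.com'      # assumption: the live site

    # Fetch a path from a given base URL and return the response body.
    def body_for(base, path)
      Net::HTTP.get(URI("#{base}#{path}"))
    end

    # Keep the paths whose content differs between the two environments.
    changed = %w[/ /pricing /about].reject do |path|
      Digest::SHA256.hexdigest(body_for(STAGING, path)) ==
        Digest::SHA256.hexdigest(body_for(PRODUCTION, path))
    end

    puts "Differences found: #{changed.join(', ')}" unless changed.empty?
    # notify_hipchat(changed)  # hypothetical: post a link to the gallery in the room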
I run mine as a NAS, and the performance is pretty terrible. The portability is great, though! I'm somewhat nomadic at the moment, so it's great to have a 1TB NAS that I can pack in my laptop bag (1TB external 2.5" drive + rpi: minimal weight too).
ubuntu@c1-10-1-18-157:~$ sysbench --test=cpu --cpu-max-prime=2000 --num-threads=4 run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 2000

Test execution summary:
    total time:                          6.7674s
    total number of events:              10000
    total time taken by event execution: 27.0485
    per-request statistics:
         min:                             2.69ms
         avg:                             2.70ms
         max:                             7.00ms
         approx. 95 percentile:           2.70ms

Threads fairness:
    events (avg/stddev):           2500.0000/17.36
    execution time (avg/stddev):   6.7621/0.00
Because everything written in Ruby is junk? This project can be used by any system that uses a cron-like syntax for scheduling jobs, regardless of the languages and frameworks that system is built with.
I agree that this software appears versatile and is probably well written and well thought out; the developer(s) clearly worked hard on it, and my hat is off to them. However, I just don't like Ruby: to my eye, its syntax lacks orthogonality and common sense. This, again, is my personal opinion. I'm not saying everything in Ruby is junk at all. There is some pretty awesome software (e.g. Chef) out there that uses Ruby.
I'm simply saying that, in my own personal opinion and by my own preferences, Ruby (the language itself) is junk.
We certainly don't use Heroku for the price. I've got a fair amount of experience running distributed apps on AWS, and I have found that Heroku does provide a lot of value-added services on top (from deployment to database provisioning). For us, the best reason to use Heroku is that when AWS has a problem, Heroku can actually do something about it. Try calling AWS support when a whole availability zone has just gone down!
Of course, we are a Rails shop that has been on Heroku since we launched, so Heroku is really tailor-made for us. If you're happy on raw AWS, good for you. Every hour I'm not thinking about AWS is an hour I can be writing a feature for our 10-month-old startup. The extra cost of Heroku is negligible when viewed in that light.
We find that neither New Relic nor Analytics gives the full picture: some of our pages are heavily cached, while others (e.g. checkout processing) are computationally expensive, database-heavy, and communicate with other systems (e.g. payment processors) that can be a big bottleneck. Both New Relic and GA tend to just average those together (although with GA you can create new views that focus on specific pages). You are right that 'number of visitors on site' does not reflect our site performance in every respect.
We first conceived of Dynosaur as a plugin-based autoscaler (with GA and New Relic plugins to start with), but we've found the times we really need to scale fast are the times we have a lot of traffic generated from press stories, etc. (like this from today, if you will excuse the shameless plug: http://dealbook.nytimes.com/2014/01/21/a-start-up-run-by-fri...), and using the Analytics live API allows us to react a little quicker than if we waited for New Relic to tell us our response times are getting slow. So far, we're happy enough with just a Google Analytics plugin.
One possible improvement would be to scale differently based on different traffic / performance metrics across the site. I think New Relic or other performance instrumentation would be very useful for that.
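For the curious, a minimal version of the kind of rule we mean might look like the sketch below; ga_active_visitors and heroku_scale_to are hypothetical stand-ins for the GA realtime plugin and the Heroku API call, and the constants are assumptions you'd tune per app:

    VISITORS_PER_DYNO = 100   # assumption: concurrent visitors one dyno can handle
    MIN_DYNOS = 2             # never scale below this
    MAX_DYNOS = 20            # cap the bill if a metric goes haywire

    # Translate a live visitor count into a dyno count, clamped to sane bounds.
    def desired_dynos(active_visitors)
      (active_visitors.to_f / VISITORS_PER_DYNO).ceil.clamp(MIN_DYNOS, MAX_DYNOS)
    end

    loop do
      visitors = ga_active_visitors               # hypothetical: GA realtime API query
      heroku_scale_to(desired_dynos(visitors))    # hypothetical: Heroku platform API call
      sleep 60                                    # poll interval
    end

The appeal of the realtime visitor count is that it moves within minutes of a press spike, before response times have degraded enough for New Relic to notice.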
Nice work; it looks pretty solid. If you want to offload the responsibility of determining which GA metric events signify a potential spike, you could abstract it out and instead have the plugin use GA intelligence event alerts set up by your analytics team. This would keep the respective subject-matter experts in their realms of expertise, ideally allowing a more tailored, ongoing approach to what, where, and how scaling fluctuations get triggered, and the dev team wouldn't be responsible for ongoing management of the scaling trigger rules (well, to a certain extent).
Just a thought. I know you abstracted it out the way you did so as not to tie it to just GA, but if GA is your analytics platform of record, it could be worth pursuing. Cheers!
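One way that could stay pluggable, sticking with the plugin abstraction already described (everything here is hypothetical, sketched against an assumed interface where each plugin reports how many dynos it wants and the scaler takes the maximum):

    # Base interface each scaling plugin implements.
    class Plugin
      def desired_dynos
        raise NotImplementedError
      end
    end

    # Hypothetical plugin that defers to GA intelligence event alerts
    # configured by the analytics team, instead of raw metric thresholds.
    class GaAlertPlugin < Plugin
      def desired_dynos
        alert_firing? ? 10 : 2   # assumption: scale hard while an alert fires
      end

      def alert_firing?
        false  # hypothetical: query the GA alerts API here
      end
    end

    # The scaler honours the most demanding plugin.
    def combined_estimate(plugins)
      plugins.map(&:desired_dynos).max
    end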
That really depends on your app and your caching strategy. If you are doing something complex or memory-intensive (or your app is poorly optimized), or if your users are highly engaged, it could be in the low tens of users! If you are doing a lot of caching, you should be able to serve static pages to several hundred or more users per dyno.
Dude! This is so boss. I've been looking for exactly this for Techendo - as you can guess our traffic varies by time of day. Being able to ramp services up as I need them would help tremendously.
We haven't benchmarked against other autoscalers or done any direct comparisons.
Our main motivation was to write something that scales the way we scale manually (i.e. when we get a lot of traffic due to press hits, etc., we scale up based on GA realtime data as well as New Relic response times). When we were given access to the GA live API, it just seemed like a natural fit.
Just speaking off the top of my head here, but services like HireFire ping the Heroku API to poll for information on when to scale, and the Heroku API is a little rate-limited. You'll probably get a more responsive result using GA.
Whoa! So sorry to be treading on your namespace! I did a Google search for 'Dynosaur' and didn't get any hits, if I recall. We will put our heads together and come up with a new name for our project later today.
The same might apply to an overloaded API / anything not serving HTML pages to browsers.
Heroku provides log-runtime-metrics, which includes current CPU load / pending CPU tasks. The Librato addon also shows request throughput in its dashboard UI. I'm not sure where it gets this data from; if not logging, then perhaps New Relic?
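If you do want the CPU numbers programmatically, the metrics arrive as plain key=value pairs on your log drain, so pulling them out is a one-liner. A small sketch (the line below is an abbreviated example of the format; treat the exact keys as an assumption worth checking against your own logs):

    # One abbreviated log-runtime-metrics line from a Heroku log drain.
    line = 'source=web.1 sample#load_avg_1m=2.46 sample#load_avg_5m=1.06'

    # Collect every sample#key=value pair into a hash.
    metrics = line.scan(/sample#(\S+)=(\S+)/).to_h
    puts "1m load average: #{metrics['load_avg_1m']}"   # => 1m load average: 2.46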