
I remember when this debate was raging a couple years ago.


My Apple fanboi buddies are telling me that the Surface is ugly. I don't really get that... seems like every tablet is just a border + glass at this point, and mostly indistinguishable from the others. If the delta is just the kickstand + keyboard, I'm a fan. Given that I own an iPhone, a Kindle Fire, and an Xbox, I claim brand neutrality in the religious war. :)


On the plus side, the level of transparency that AWS displays and the detail that they provide seem above and beyond the call of duty. I find it refreshing, and I hope that other companies follow suit so that customers can understand the details of operational issues, calibrate expectations appropriately, and make informed decisions.


They're less transparent and responsive than most datacenter or network providers -- it's just that most of those providers hide their outage information behind an NDA, so only customer contacts get it, vs. making it public.


Yeah, a good datacenter will have SLAs around the root-cause analysis document for any failures. Like a preliminary report within a day and a final report within 7 days.


I've also had a few cases where providers either outright lied or only gave details if you persisted in requesting them. Having to play that game gets old…


I still think that the future of Facebook is data (not specifically picture data; that's just one piece). Advances in hardware and cloud computing have kick-started an era where data mining and machine learning are pervasive in every field from pharmaceutical research to banking to sports and everywhere in between. Facebook is the single authoritative source for a whole bunch of personal data. People have started thinking about the basic applications for that data when it comes to things like online shopping (you're a 31 year old white male with an 8 month old daughter who likes rock and roll, so you probably want to buy diapers, zinfandel, and a Radiohead album), but I think that lots of new, interesting applications will emerge over the next several years that don't violate user privacy and monetize well.


Agree with the other comments. Choice of programming language is in some ways tangential to concerns over vendor lock-in, since other people can always come along with a different implementation of the language, unless of course programming languages end up being found to be patentable (http://tech.slashdot.org/story/12/04/13/1646215/oracle-and-g...).


Amazon.com (Seattle) -

My team is hiring 3-4 SDEs and a Manager for a very exciting new Platform as a Service (PaaS) offering that will become a key piece of infrastructure for the Amazon.com Retail Website (RCX) as well as other products like Amazon Web Services (AWS) and Kindle. If you're interested in building super large scale distributed systems, applying machine learning to scale hosting up or down to meet constantly changing traffic flows, and having a massive impact on the world's biggest online retailer... come join us!

http://careers.stackoverflow.com/jobs/18233/senior-software-...


I still don't understand the value add of MapReduce (in its various implementations, including Hadoop) versus clustered SQL. I see articles like this one that seem to imply that there is a niche for MapReduce for tasks like simple string search on super large data sets: http://gigaom.com/2009/04/14/mapreduce-vs-sql-its-not-one-or.... It just seems odd that people are so quick to throw away 40+ years of research on how to structure relational data and how to optimize queries against large data sets.
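
To make the comparison concrete, here is roughly what that "simple string search on a super large data set" case looks like in the MapReduce model, written as a pair of Hadoop Streaming-style scripts. This is just a sketch; the file names, the search pattern, and the counting scheme are all made up for illustration.

    #!/usr/bin/env python
    # mapper.py -- Hadoop Streaming feeds input lines on stdin and
    # collects tab-separated key/value pairs from stdout.
    import sys

    PATTERN = "error"  # hypothetical search string

    for line in sys.stdin:
        if PATTERN in line:
            print("match\t1")  # one partial count per matching line

    #!/usr/bin/env python
    # reducer.py -- sums the partial counts emitted by the mappers.
    import sys

    total = 0
    for line in sys.stdin:
        _key, _sep, value = line.rstrip("\n").partition("\t")
        total += int(value)
    print("match\t%d" % total)

You can simulate the whole job locally with

    cat access.log | ./mapper.py | sort | ./reducer.py

The SQL version is a one-line SELECT COUNT(*) with a LIKE clause, but it assumes the data has already been loaded into a schema; the streaming version runs as-is over raw files scattered across n machines.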


Hadoop is free for 400 (or n) computers. Let's say you need 100 computers (full time) to run the equivalent clustered query in Oracle, once you account for the time needed to get the data into the right format. How much do the licenses for that cost? (And even if the SQL system were free, you're still going to have to do an ETL job, so why not ETL into the free solution?)

You need to process a couple of terabytes of data, but you only need to do it once a month, and even highly tuned SQL will bring the system to its knees. So 90% of the time, really expensive servers sit idle, provisioned for that one monthly peak. (And Oracle isn't going to be available on a cloud service.)

So, yeah, if someone had a free/cheap SQL solution for these kinds of ETL datasets that was able to handle these embarrassingly parallel problems, people might be interested. Perhaps you are aware of solutions I am not.
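
To put rough numbers on it, here's the back-of-envelope version of that argument as a script. Every dollar figure in it is a placeholder assumption for illustration, not a real quote:

    # Hypothetical cost sketch; all prices are made-up assumptions.
    NODES = 100                 # machines the monthly job needs
    CORES_PER_NODE = 8
    LICENSE_PER_CORE = 10_000   # assumed per-core commercial DB license
    SERVER_COST = 5_000         # assumed price of one commodity server

    # Licensed cluster: hardware plus per-core licenses, idle ~90% of the time.
    licensed = NODES * (SERVER_COST + CORES_PER_NODE * LICENSE_PER_CORE)

    # Hadoop on the same hardware: no license line item at all.
    hadoop = NODES * SERVER_COST

    # Or skip owning hardware and rent 400 cloud nodes for a day, once a month.
    RATE_PER_NODE_HOUR = 0.50   # assumed cloud price
    rental_per_run = 400 * 24 * RATE_PER_NODE_HOUR

    print(f"licensed cluster: ${licensed:,}")                # $8,500,000 up front
    print(f"hadoop cluster:   ${hadoop:,}")                  # $500,000 up front
    print(f"cloud rental:     ${rental_per_run:,.0f}/run")   # $4,800 per month

With numbers anywhere in that ballpark, the license line item dominates everything else, which is the whole point.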


If you're like me and you haven't played with S3 and are wondering what an object is, here's a relevant excerpt from the Wikipedia entry on S3 to add some context:

"S3 stores arbitrary objects (computer files) up to 5 terabytes in size, each accompanied by up to 2 kilobytes of metadata. Objects are organized into buckets (each owned by an Amazon Web Services or AWS account), and identified within each bucket by a unique, user-assigned key. Amazon Machine Images (AMIs) which are modified in the Elastic Compute Cloud (EC2) can be exported to S3 as bundles."

http://en.wikipedia.org/wiki/Amazon_S3
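
And if you want to poke at it yourself, the mental model really is just "key -> blob, per bucket". Here's a minimal round trip using the boto3 library; the bucket name, key, and metadata below are hypothetical, and it assumes you already have AWS credentials configured:

    # pip install boto3
    import boto3

    s3 = boto3.client("s3")

    # Store an object: an arbitrary blob of bytes under a user-assigned key,
    # plus some of the user-defined metadata the excerpt mentions.
    s3.put_object(
        Bucket="example-bucket",        # hypothetical bucket
        Key="reports/summary.txt",      # hypothetical key
        Body=b"hello from S3",
        Metadata={"author": "me"},
    )

    # Fetch it back by bucket + key.
    obj = s3.get_object(Bucket="example-bucket", Key="reports/summary.txt")
    print(obj["Body"].read())  # b'hello from S3'

There's no directory tree underneath; "reports/summary.txt" is just one flat key that happens to contain a slash.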

