The New M4 Instance Type (amazon.com)
62 points by stevencorona on June 11, 2015 | 22 comments



Just how custom is the E5-2676 v3 processor?

I see this particular model is not in Intel's list of models [1], nor in Wikipedia's list [2].

I suspect that, as the model number suggests, it is virtually identical to the E5-2670 v3 [3], and that a 100 MHz higher base clock and a 100 MHz lower Turbo Boost frequency are all the customization there is (but what about the exact turbo profile?). Still, the fact that Intel does this at all, even for a customer as large as Amazon (how much are they ordering, actually?), is an interesting development.

[1] Intel® Xeon® Processor E5 v3 Family http://ark.intel.com/products/family/78583/Intel-Xeon-Proces...

[2] Wikipedia: List of Intel Xeon microprocessors http://en.wikipedia.org/wiki/List_of_Intel_Xeon_microprocess...

[3] Intel® Xeon® Processor E5-2670 v3 (30M Cache, 2.30 GHz) http://ark.intel.com/products/81709/Intel-Xeon-Processor-E5-...


All those different models you see in the E5 v3 product line aren't separate products; they're bins of probably just 2-3 different CPUs. Many of the chips made will have defects, and the number of working cores, their maximum stable frequency, and their thermal properties determine which product number they get sold as. Amazon probably just gets one of these bins: the chips that come out of testing meeting their specs are sold to them as a 2676. Intel doesn't actually have to custom-design any hardware.

http://en.wikipedia.org/wiki/Product_binning


It's probably easy for Intel to reclassify the processors since Amazon probably provides the exact thermal conditions they will be in. If you say that the CPU temperature won't exceed X degrees due to the cooling systems, you can push the core clock speed higher.


There's a James Hamilton talk from the last re:Invent where he kinda implies what's going on. Because Amazon is running in a datacenter and can guarantee the temperature, they are able to make assumptions and clock the chip less conservatively.

http://www.slideshare.net/mobile/AmazonWebServices/spot301-a...


Oracle also has custom SKUs. It's just different fuses, and since Intel already makes a dozen SKUs from each die, it's easy for them to make one more. Personally, I'm thinking of ordering an 11-core SKU.


I don't have a great need for VPSes, but I still like running a few.

I have never tried Amazon's VPS offering because their pricing is really confusing, especially when I just want the smallest VPS to play with. This announcement is just as unclear on actual price as any other pricing page on Amazon's site.

I like the way DigitalOcean makes it simple to understand how much it costs to try their smallest VPS. Why can't Amazon do something like that? A simple chart of monthly prices.


I use both VPS providers (for my own little experiments), and deploy to AWS at scale. There are two major reasons I can think of why, from the perspective of "I just want a monthly VPS", AWS's pricing feels so opaque. At the end, I'll also answer your actual question of what the dang thing costs.

1. Short Instance Lifetimes

An idiomatic system running on AWS (with autoscaling, load-balancing, and CloudFormation managing AMI changes to enable immutable infrastructure) has an average instance lifetime (much) lower than one month.

In my current operation, most instances live a few days at most, and many far fewer: some are literally just "background processes" instantiated as whole machines, and they terminate themselves when their job is done. When instances are charged by the hour, this costs far less than it would seem!
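To make that concrete, here's a rough back-of-the-envelope sketch in Python. The hourly rate is an assumption derived from the on-demand t2.micro price I quote near the end (~$9.52/mo, i.e. roughly $0.013/hr), and the workload numbers are made up:

    # Rough cost sketch, not an AWS API call. Rate and workload are assumptions.
    HOURLY_RATE = 0.013        # assumed on-demand $/hr for a small instance
    HOURS_PER_MONTH = 730

    # 20 short-lived workers that each run ~3 hours a day:
    worker_hours = 20 * 3 * 30             # 1,800 instance-hours per month
    burst_cost = worker_hours * HOURLY_RATE

    # The same 20 machines left running all month, VPS-style:
    always_on_cost = 20 * HOURS_PER_MONTH * HOURLY_RATE

    print("short-lived workers: $%.2f/mo" % burst_cost)      # ~$23/mo
    print("always-on servers:   $%.2f/mo" % always_on_cost)  # ~$190/mo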

AWS isn't a VPS provider, really. You don't use it to run your snowflakes (http://martinfowler.com/bliki/SnowflakeServer.html). It's for building systems, living bodies made out of cells that are created and die without affecting the body as a whole.

A pretty valid comparison is that AWS encourages you to use VMs in about the same way that Erlang encourages you to use processes.

You might also consider that hosts like Heroku consume AWS and then provide something with monthly fees. When AWS calls itself Infrastructure-as-a-Service, it means it; it's not for end-user developers, because end-user developers don't need control over infrastructure. AWS is for ops-teams who need to declaratively instantiate an entire virtual data center.

If you want to play at being an ops-team (and I'm not knocking that desire; it's fun and educational) you can use AWS for your tiny web-app. But it's the wrong fit, precisely because you won't need most of what AWS is trying to give you, or have answers to the compromises it wants to know your position on, because those compromises only matter at scale.

AWS also has ...

2. A Heterogeneous Substrate

AWS has both old hardware and new hardware lying around, as well as older and newer data centers, and its regions (collections of interlinked data centers) are variously over- or under-capacity. The prices are tuned to incentivize you to use whatever currently costs AWS the least to give you.

The us-east-1 region, for example, is more expensive to launch instances in, mostly because it's pretty hard to find more places to put new DCs in the area it's in, so they want to prevent everyone from crowding into it (especially given that it's the oldest region and the default).

AWS spot instances are another example; if you're willing to take whatever people are using the least of on AWS whenever it becomes available, you can boot up an instance for even cheaper than its nominal price, because you're running it using machines that would otherwise be (pretty much guaranteed to be) idle.
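If you want to poke at spot prices yourself, here's a minimal sketch using boto3 (the AWS SDK for Python); the instance type and region are arbitrary examples, not anything from the article:

    # Peek at recent spot prices; illustrative only.
    import datetime
    import boto3

    ec2 = boto3.client("ec2", region_name="us-west-2")   # example region
    resp = ec2.describe_spot_price_history(
        InstanceTypes=["m4.large"],                      # example instance type
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.datetime.utcnow() - datetime.timedelta(hours=1),
        MaxResults=10,
    )
    for entry in resp["SpotPriceHistory"]:
        print(entry["AvailabilityZone"], entry["SpotPrice"])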

One interesting thing AWS does, is make newer instance types only compatible with newer parts of their computational substrate. AWS never truly deprecates anything: non-VPC instances, 32-bit instances, PV instances, and instances with ephemeral public IPs, are still all there for systems that were designed way back then with those in mind and are expected to stay working. (This is also why us-east-1 is still the default, by the way. Old orchestration scripts assume they'll continue to get us-east-1 if they leave the region off. You don't want your new instances suddenly sneaking off half-way across the country from your old ones...)

But AWS isn't gonna bring up any new hardware to run the legacy stuff if they can help it. There are only so many e.g. 32-bit boxes in the AWS DCs, and AWS isn't going to allocate more, since one legacy customer moves away from 32-bit instances at about the same rate that some other legacy customer scales up to need more 32-bit instances. So 32-bit instance types (which are all legacy instance types, now) all stay the same price.

The new instance-types, meanwhile—like t2.micro or this m4 series—can afford to be cheaper, since they're restricted to only launch on the modern parts of the substrate. This means both that they will get cheaper over time (because AWS will continue to grow the modern part of the substrate), and also that they start out cheaper, because the modern designs are more efficient for AWS to run in terms of computational/network overhead, power consumption, cooling, cost of replacement parts, etc.

---

Anyway, to answer your question, I didn't bother with the docs; I used the AWS calculator (http://calculator.s3.amazonaws.com/index.html). (Side note: it's "AWS calculator", not "EC2 calculator", for a reason. It's very likely some cost other than compute will swamp your compute costs. Bandwidth, for example, or S3 storage, or RDS Postgres instances with provisioned IOPS.)

In the us-west-2 region (a newish region with modern DCs), a t2.micro instance (the newest tiny instance type) run on-demand (no prepaid reservation) costs $9.52/mo. The cheapest you can get is a t2.micro with a 3-year upfront reservation: $4.20/mo, or $151.00 total for a VM that will sit around for three years.
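A quick arithmetic check on those figures (the numbers are the ones just quoted):

    on_demand_monthly = 9.52    # t2.micro, on-demand, us-west-2
    reserved_total = 151.00     # 3-year upfront reservation
    months = 36

    print("reserved:  $%.2f/mo" % (reserved_total / months))               # ~$4.19/mo
    print("on-demand: $%.2f over 3 years" % (on_demand_monthly * months))  # ~$342.72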

But if you just want to try AWS, all of this is irrelevant: you get a year of hobbyist-grade resources (including e.g. one year of t2.micro instance-hours) for free (http://aws.amazon.com/free/) when you sign up.


I wonder why the m4 instances are less expensive than the m3s: m4.xlarge is $0.252/hr vs. m3.xlarge at $0.266/hr. The c4 instances are more expensive than the c3 instances, though.
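For what it's worth, the gap works out like this (prices are the hourly ones just quoted):

    m3_xlarge, m4_xlarge = 0.266, 0.252   # $/hr, on-demand
    print("m4.xlarge is %.1f%% cheaper than m3.xlarge"
          % ((1 - m4_xlarge / m3_xlarge) * 100))   # about 5.3%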


I look at this the other way around: the older instances are more expensive compared to the newer ones once those arrive. And that is a common thing in the hosting business, EC2 or not.


I think the m3s were cheaper than the m2s when they were launched.


Yeah, I think Amazon does this on purpose to encourage people to move to the new lineup, so that the old, underperforming, energy-inefficient lineup can be decommissioned sooner.


I'm curious how the C-state control works. What does /proc/cpuinfo look like? Are they letting mwait through for real? Is this Xen or real virtualization? I wish Amazon would document their virtual hardware platform more clearly.
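One way to get part of the answer from inside an instance is to read /proc/cpuinfo directly; here's a minimal sketch (Linux only; the "monitor" flag is what advertises MONITOR/MWAIT support to the guest):

    # Print the CPU model and whether the mwait/monitor flag is exposed.
    def cpuinfo_summary(path="/proc/cpuinfo"):
        model, flags = None, set()
        with open(path) as f:
            for line in f:
                if line.startswith("model name") and model is None:
                    model = line.split(":", 1)[1].strip()
                elif line.startswith("flags"):
                    flags |= set(line.split(":", 1)[1].split())
        return model, flags

    model, flags = cpuinfo_summary()
    print(model)
    print("mwait exposed:", "monitor" in flags)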


Is that core count right for the big instance? The same CPU is used on the data-heavy nodes [1], and there it says it maxes out at 36 cores. Or does Amazon save some cores for its own use on those boxes?

[1] https://aws.amazon.com/blogs/aws/next-generation-of-dense-st...


There don't seem to be any official specifications for the Xeon E5-2676 v3, probably because it's a custom CPU made just for Amazon.

Random sources found via Google seem to suggest that it has somewhere between 10 and 12 cores, which would translate to between 40 and 48 hyperthreaded vCPUs in a dual-socket setup.
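The arithmetic behind that estimate, with the core counts treated as guesses rather than confirmed specs:

    sockets, threads_per_core = 2, 2
    for cores_per_socket in (10, 11, 12):
        vcpus = cores_per_socket * sockets * threads_per_core
        print("%d cores/socket -> %d vCPUs" % (cores_per_socket, vcpus))
    # 10 -> 40, 11 -> 44, 12 -> 48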

So perhaps Amazon has found a way to set aside fewer (or no) cores for their own uses on the new m4 lineup. Or maybe it's the other way around and the data-heavy nodes require more cores to be set aside than usual.


Yeah, it's possible that Amazon does the latter there: it saves some virtual cores for its own purposes, dedicating them to parity and the like to get I/O higher.


A bit surprising that these instances are EBS only and don't have any ephemeral disks at all.


That's been the norm for all new instances launched by AWS in the last couple years. They often follow that up after 6-12 months with ephemeral versions of these instance types.

Hopefully they continue to release instances with ephemeral storage, because most distributed systems design assumes uncorrelated failure.


MB/s is probably incorrect for the EBS throughput instance limits; those values make more sense if they use Mb/s.


Blog post author here - you are correct, I was wrong, and the fix is rolling out now!


Glad to see you following the discussion, Jeff. Maybe you could address my and others' curiosity re: the exact nature of the custom Intel Xeon E5-2676 v3 CPU? I don't think Intel could justify making any serious customization even for a customer as large as Amazon, but I would be very interested to know exactly how custom it is.


Universal request to all when doing technical documentation: "MBytes/sec" and "Mbits/sec" are much clearer, and leave only the doubt as to whether you mean 2^20 or 10^6 (and in 95% of use cases, 10^6 is the correct choice, both by convention and in terms of what you are actually measuring; data rates are almost always in SI units).
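A small worked example of the difference (the 500 here is just an illustrative value):

    rate_mbit = 500.0                          # Mbits/sec, SI: 500 * 10**6 bit/s
    rate_mbyte = rate_mbit / 8                 # MBytes/sec, SI (10**6 bytes)
    rate_mibyte = rate_mbit * 1e6 / 8 / 2**20  # MiB/sec, binary (2**20 bytes)
    print("%.0f Mbit/s = %.1f MB/s = %.1f MiB/s" % (rate_mbit, rate_mbyte, rate_mibyte))
    # 500 Mbit/s = 62.5 MB/s = 59.6 MiB/s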


The page linked in the post confirms your assertion: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptim...



