Terraform 0.12 (hashicorp.com)
281 points by burntcaramel on May 22, 2019 | 161 comments


I honestly love Terraform as a product. It was one of probably three tools I've used in my entire career that made me feel immediately more productive. After using it for a very short period of time I was shocked developers continued to struggle through CF templates and the fragility the whole process entailed.


Viewing Terraform solely through the lens of cloud automation and in comparison with CloudFormation is a shortsighted mistake. Terraform has providers for plenty of other services that don't qualify as "cloud things" and lack proper configuration tooling of their own. In a very general sense, Terraform is a terrific resource management tool with state versioning & locking built in. For example, there's a terraform-kafka-provider[1], which can be used to manage topics in a Kafka cluster. Could shell scripts be written to accomplish the same goal? Of course. Would those scripts inevitably develop in a haphazard, organic way to form a buggy and incomplete implementation of something like Terraform? You bet!

"Cloud" configuration may have been Terraform's proverbial toe in the water, but the truly untapped potential lies in the other providers. Anything that can be packaged as a Terraform provider exposing resource abstractions can be easily managed using convenient HCL syntax. This, IMHO, is the unfortunately buried lede.

[1]: https://github.com/Mongey/terraform-provider-kafka
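
As a sketch of what that looks like in practice (based on the provider's README; attribute names may differ between provider versions, and the broker address is a placeholder):

```hcl
provider "kafka" {
  bootstrap_servers = ["localhost:9092"] # placeholder broker address
}

resource "kafka_topic" "events" {
  name               = "events"
  replication_factor = 2
  partitions         = 25

  # Per-topic broker settings, managed declaratively like any other resource.
  config = {
    "retention.ms"   = "604800000"
    "cleanup.policy" = "delete"
  }
}
```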


Someday folks will rediscover the potential of Puppet for these use cases. Until then, I'm content to watch countless alternative implementations come and go.


Before using Terraform I tried to use Puppet to manage AWS infrastructure. It was a fun but short adventure. The idea of having all your configuration in one tool is nice, but Puppet just isn't that tool. The thing I found lacking the most, which made Terraform the tool for the job, is awareness of state. Having the triangle of what configuration you want (HCL), what configuration you expect to have (state), and what configuration you actually got (real world), with the tools to observe the differences between these and the ability to make careful changes, is just what you need for important infrastructure changes. With Puppet every change felt like fire-and-pray. Sure, you can run a 'noop', but if anything changes between that and the actual run that could trigger another resource, you would be caught by surprise. On the OS/app level the impact can often be contained, but on infra, not so much: you would have to reduce Puppet's powerful features so much that they would no longer be a benefit.


You are implying that people who use Terraform are not aware of Puppet. Puppet is a terribly complicated thing compared to Terraform. We use both on a daily basis and everybody agrees that we need to move away from Puppet. Terraform + Ansible is the way to go for us.


It is nice that Puppet defines a graph of dependencies, compared to doing this in Ansible. What I find tough is that you have another layer of indirection when using Puppet to configure your tools: you first have to figure out the Puppet module's configuration and how it maps onto the actual tool's configuration.


I wonder why you got downvoted. As someone who has been using Ansible for years to accomplish what was said (and what you accomplish with Puppet), I wonder what I am missing out on with Terraform.


There's a lot of overlap but, in general, Terraform is focused on orchestrating things while Puppet is more about configuration management.

Plus, there are the different philosophies of mutable/immutable infrastructure that the differing capabilities/limitations of each tool encourage.


The nice thing about Terraform (and Ansible), IMO, is that they don't require a daemon but just run locally (or on CI), with some shared state in an object store.


Here's how Lyft used "Masterless SaltStack at Scale" https://youtu.be/7ffHKH9H5_Q and the getting started documentation https://docs.saltstack.com/en/latest/topics/tutorials/quicks...


Neither does Puppet. Many people run it in a masterless "one-shot" configuration.


What would be your comparison between the solutions? As in, why would I pick Puppet over Ansible or Terraform for a masterless use-case?


Doesn't Puppet only support success/changed/error statuses for resources? How does that work with AWS infrastructure, where you may need to remove dependencies before the change, update the resource in place, or update other resources to point at what you just created? You need at least 5 states, I believe? (These depend on properties, not resources, so notification is not enough.)


I wish we could add providers by having terraform load python modules (or tcl, lua, js) and call it a day. Instead they chose to offer a statically-linked binary that forks gRPC plugin servers, and supporting that looks like a hell of a lot more work than writing and running a script.


This is short-sighted. Terraform and CloudFormation are not even in the same league. One of them works and can actually be used for Infrastructure as Code; the other one does not roll back in the face of failure: it effectively craps out for reasons ranging from network failures and process crashes to even normal operation. One is heavy kool-aid with bugs that go unresolved for years; the other behaves as advertised.

Sorry, but I have yet to meet a developer that used Terraform and willingly wants to keep using it after seeing it fail. When your tool cannot keep track of the resources it created or when it gets into situations where it doesn’t allow you to do certain things (like deleting all resources created) and you have to relearn how the underlying cloud works, it’s time to move on.

Do yourselves a favor and use Cloudformation (or your favorite cloud’s equivalent and just move on with your life)


I mean, this is harsh, but there's a running joke that the big feature terraform is missing is a -twice flag so that it'll re-run itself on failure, since that's what you end up having to do anyways.

Also, the terraform language, HCL? It's, I guess there's no better way to put this: not good.

Am I misunderstanding the complexity of what Terraform is trying to do? To me, it looks like a bunch of tiny API clients tied together with a topological sort --- in other words, it's just another species of "make". It feels like it carries a whole lot more complexity than that concept warrants, but just enough simplicity to make it not a serious programming language. It's, to me, one of those frustrating uncanny valley systems.

I am prepared to be totally wrong about this.


Yeah, HCL's limitations have been an enormous thorn in my side for a long time. HCL2 (Terraform 0.12) is a big step forward, but even still I pine for a proper programming language, even if it has foot-guns.

I do think you're selling Terraform short, though. Sure, the core is the toposort-create-things. But it also stores the state of its created things and (crucially) has the ability to diff the actual state of resources against what it thinks they ought to be.

Being able to inspect existing resources and diff state is also what lets it import existing resources, so that they can be Terraform-managed going forward.

Terraform is also capable of determining if its planned changes can be performed in-place or if they require resources to be destroyed and re-created. That boils down to a boolean flag on a field, ultimately, but it's still something a dead-simple make clone probably wouldn't do well.
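
For cases where destroy-and-recreate would cause downtime, Terraform also lets you override the ordering with a lifecycle block (a minimal sketch; the resource and data source names here are illustrative):

```hcl
resource "aws_instance" "web" {
  ami           = data.aws_ami.web.image_id # illustrative data source
  instance_type = "t2.micro"

  lifecycle {
    # Bring the replacement up before tearing down the old instance.
    create_before_destroy = true
  }
}
```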

I've only really used Terraform seriously for AWS, so I'm not sure about the other providers, but the Terraform AWS Provider has an enormous amount of work behind it. Basically every resource API has schema validation written in the AWS provider, and depending on the resource there are often eventual-consistency issues handled by the provider. See for example [0].

In contrast, AWS CloudFormation: can't import existing resources; isn't always sure whether an update will require replacement or not; and, as of ~6 months ago, can detect configuration drift, but not correct it (!). Of course, CloudFormation wins in other areas...

[0]: https://github.com/terraform-providers/terraform-provider-aw...


Long-time user of Terraform here, never used CloudFormation. Where does CloudFormation beat Terraform?


CloudFormation has an “easy button” if you are part of an organization with a business support plan from AWS. If you can’t figure out something you can submit a ticket and start a chat.

Also CF is the “native” language of AWS. There are plenty of getting started examples from AWS where they give you the template. Also Elastic Beanstalk extensibility is built on top of CF.

Not to mention Codestar that will set up environments for you for common use cases and exports templates and the lambda environment lets you configure everything from the console, test it and then you can export the CF definition.


So, I think it does win in a few areas...

Like one sibling comment mentioned, getting support from AWS is nice. You can buy Terraform support, too, but knowing HashiCorp it'd probably cost more than most peoples' AWS bills in their entirety. (This is me being a little cheeky and unfair—HashiCorp handles community support via GitHub Issues for Terraform really well.)

CloudFormation's built straight into AWS, so there's no need to set up state file storage or locking or worry about a state file at all, really. This has its own set of drawbacks, but it's nice for getting started, and in theory it makes CloudFormation more robust out-of-the-box.

CloudFormation StackSets are really nice if you have identical resources that need to be placed in many different regions and accounts. (Example: we use it to place GuardDuty and Config, both regional services, in every region.) With Terraform, this means copy-and-paste, as far as I know, though maybe Terraform 0.12 sets up the groundwork to make this better?

The Service Catalog is basically a way to let technical end-users manage products via CloudFormation templates. Something could be built to do the same for Terraform, but I don't think there is anything like that right now.

CloudFormation does (attempt to) roll back to a known-good state if an update fails. Terraform just stops in the middle of what it was doing. I have mixed feelings about that.

I find YAML to have better tooling/editor support than HCL, though I actually prefer HCL.

There are existing CloudFormation wrappers (e.g. Troposphere) that can give you a full programming language on top of CloudFormation. To my knowledge, there isn't anything similar for Terraform.

Some things that have changed with 0.12:

CloudFormation has had `AWS::NoValue` for as long as I can remember. Terraform <=0.11 had special values (like empty string, number zero, etc.) that were special-cased as no value. Terraform 0.12 now has `null` properly.
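
In 0.12 syntax that looks something like the following (the resource and variable here are illustrative; setting an argument to null means the provider treats it as unset):

```hcl
variable "kms_key_id" {
  type    = string
  default = null # omit the attribute entirely unless the caller sets it
}

resource "aws_ebs_volume" "data" {
  availability_zone = "us-west-2a"
  size              = 40
  # In 0.11 you'd have passed "" and hoped the provider special-cased it.
  kms_key_id        = var.kms_key_id
}
```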

Terraform <=0.11's ternary operators were maddening because both sides were evaluated, which led to errors, unlike CloudFormation's !If. In Terraform 0.12, the ternary only evaluates one side.
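
A sketch of the classic failure mode, which 0.12's lazy evaluation fixes (variable names are illustrative):

```hcl
variable "names" {
  type    = list(string)
  default = []
}

locals {
  # In 0.11 this errored on an empty list because both branches were
  # evaluated; in 0.12 only the chosen branch is evaluated.
  first = length(var.names) > 0 ? var.names[0] : "default"
}
```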


Cloudformation will leave your infrastructure in a consistent state. Period. It will consistently drive things from point A to point B or roll back to point A if it cannot get to B. It will also correctly remember all the resources it created and allow you to properly identify them and/or delete them. These sound like table stakes but Terraform cannot do this. Add the frustrations of HCL on top and it's a big no-go for me.


I've had CloudFormation fail to roll back plenty of times, though never in a situation where it actually mattered.

I've never had Terraform "forget" resources or disallow me from deleting them (unless I requested that). Maybe those are bugs that've since been fixed?

It sounds like we've had very different experiences with these tools.


My experience with CloudFormation has been consistent with what you just said. CF will fail to delete resources and then lock them in that state, requiring AWS tickets and reps and all that. I have not had anything like that in TF.


> can detect configuration drift, but not correct it

if by "correct it" you mean update the current template to match what's there, that'd be good, as AFAIK there's no easy way to do this at the moment. if by "correct it" you mean revert or change resources, no thanks. that sounds like a production accident waiting to happen, and you can un-drift (?) resources manually already.


I do mean revert or change resources. Making manual changes outside the context of an IaC tool is largely madness, in my opinion, although sometimes the situation warrants it.

Auditability and controls are one of the many facets of IaC. We require code changes to be approved by another developer, and similarly we require infrastructure changes to be approved by another developer. Regularly working outside the IaC tool would be in violation of that policy.

It's in this approval step that the change-set (CFN) or plan (Terraform) should be carefully reviewed by a human. If someone's made manual changes, reversion of them should appear here, and those should be unusual and eyebrow-raising. At that point, it's either fix the IaC definition of the infrastructure, manually un-drift it as you say, or do some workaround to ignore specific changes.

(To reiterate, IMO no one should ever run CFN/Terraform unattended on prod infrastructure, and there should always be a step to review the change-set/plan.)

I'll also say that the sword cuts both ways when it comes to prod outages and manual changes. Not so long ago, I ran into a prod-impacting issue when turning on multi-AZ for an RDS instance. In the other regions that had multi-AZ enabled, someone had manually added an extra parameter to the RDS parameter group, one that was required for certain app functionality to work. No one ever added it back to CloudFormation and that knowledge was eventually lost. When we enabled multi-AZ in a different region, we expected no problems at all, but instead we ended up with a whole section of app functionality breaking.

(This would've been before drift detection was a thing in CloudFormation, but actually I don't think RDS parameter groups are supported in CloudFormation's drift detection right now anyway. [0])

[0] https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui...


AWS Config provides additional guarantees/monitoring/auditing, but many people don't use it because it costs money.


Speaking of, they recently changed the pricing model for (part of) Config. It should be much cheaper now. https://aws.amazon.com/blogs/aws/new-updated-pay-per-use-pri...


detecting resources that are already there may be useful in a limited set of scenarios (prototypes / retrofitting) but definitely should not be the way you build things. thinking through what your cloud infra is doing and expressing it is the way to go.

there is also a way to do this for cloudformation. look up cloudformer.


I agree, importing resources is a limited use-case. However... it's one I find myself in more often than I would like. Reality is often disappointing. At least, with the companies I've work(ed) with.

That said, I don't think CloudFormer does the same thing. I haven't used it before, so please correct me if I'm wrong, but to me it looks like it takes existing resources and generates a CloudFormation template out of them. But then you're still expected to upload that template and create a new stack, with all brand-new CloudFormation managed resources—is that right?

So for example, CloudFormer for an RDS instance would probably be a no-go.

In the Terraform case, after you've imported resources, there's no need to re-create them. There's the question of if what's in the template will match what the resources actually are, but there are also tools to generate Terraform templates straight off resources.


Well said. It's basically Ant for the cloud, in that it's an annoyingly limited DSL with just enough power to work, but in a verbose and frustrating way.

Speaking of real code (not YAML or HCL) as infrastructure, anyone have experience with Pulumi?


I've been using Pulumi for a few weeks now and it works pretty much exactly as advertised (for AWS at least). I am yet to run into any major hurdles and for any minor issues I have had the Pulumi team have been very responsive (they have a public Slack channel available).

It actually uses Terraform's APIs under the hood, which is comforting in a way because you know it's building on a solid foundation.

I would not go back to writing HCL after experiencing Pulumi if I can avoid it, using a "real" programming language just feels a whole lot more natural and allows for much more powerful abstractions.


It basically depends on the formation of your team.

A bunch of old-school sysadmins who "don't code"? Terraform is rigid and on-rails enough that it probably helps keep things sensible compared to just using boto. Almost like it was a framework specifically designed to do that sort of thing.

It does sort of suck though, but what sucks less?

Edit: My solution is to stick as much as possible into k8s, but obviously that comes with its own warts, and to be fair to Terraform, a lot of Terraform's warts are just the underlying APIs' warts leaking through.


The most sensible way I’ve used TF was to create an api for others to consume, containing models of terraform modules.

Users can pick and choose which modules, how many, and set some params, which are then validated.

The components are then rendered as json, fully declarative, no counts, no fancy TF hacks.

Data sources and a couple of home grown providers take care of whatever needs to be dynamic.

This way a lot of foot-guns are removed for consumers, and it simplifies the whole thing. At the expense of writing some code... worth it though, in a more enterprisey setting.


No, it's pretty straightforward, terraform is just intended to parse and execute random functions in a graph generated by a template language. It is absolutely a form of 'make': overcomplicated and generic to serve the needs of supporting random companies' infrastructure as versioned code, without having to write the implementation bits.

It's a good idea and not a bad design, but the user experience is pretty bad and nearly all the operational aspect is an afterthought. It was not built to be sold as a product, hence it kind of sucks as a product, but it's fine as a free tool. The fact that it's the best free tool we have for this task speaks volumes about how most companies are deathly afraid to work as a community to build better solutions.


Well, judging from the release notes, with for-loops you are now one step closer to a real programming language. Looks like the beginning of the end of the "declarative" paradigm, which keeps people banging their heads against the wall like this: https://blog.gruntwork.io/terraform-tips-tricks-loops-if-sta...


Those are list / map comprehensions. So it's still declarative—you can't iterate and do anything while iterating. Only transform items in a list.

https://www.hashicorp.com/blog/hashicorp-terraform-0-12-prev...
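
For example, a 0.12 for expression can only map inputs to outputs, with no side effects along the way (a minimal sketch; the names are illustrative):

```hcl
variable "users" {
  type    = list(string)
  default = ["alice", "bob"]
}

output "home_dirs" {
  # A comprehension: it transforms the list, nothing more.
  value = [for u in var.users : "/home/${u}"]
}
```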


The complexity of Terraform does not come from the happy path. It comes from when things go wrong and you have to understand what went wrong and how to correct it. Now you're in a world of pain, since TF just throws its hands in the air and you're left holding the bag. I have heard several people refer to Terraform as "Terrafail". I have also observed good dev/devops people struggling to make it work when shit hits the fan.


Well, if you don't follow the obvious advice of storing your state in an S3 bucket... If the idea of storing your state somewhere is so abhorrent to you, you can also tag your resources and import them with Ansible/Bash as per the obligatory CloudFormation workflow. Then your state becomes local-only. That's a pretty useless thing to do for purely ideological reasons, though: storing your state in a cloud-agnostic way is what allows being able to manage a Gitlab repo, a Cloudflare domain and AWS or GCP instances in the same tool. That is what "multi-cloud" means.
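
Storing state remotely is a one-time backend configuration; a minimal sketch (bucket and table names here are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-tf-state"           # placeholder bucket name
    key            = "prod/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "my-tf-locks"           # enables state locking
  }
}
```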

If you are making a nonsensical argument like "not IaC because it can't roll back", I can say CloudFormation is not IaC because it can't even import a local file to grab some values. Or run external commands.


Yes it can run external commands via custom resources.


Terraform allows you to call an arbitrary shell command on your machine, not just trigger some webhook.
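
For instance, via the local-exec provisioner (a sketch; the command here is just an illustration):

```hcl
resource "null_resource" "announce" {
  provisioner "local-exec" {
    # Runs on the machine executing `terraform apply`.
    command = "echo 'deployment finished' >> deploy.log"
  }
}
```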


What is the practical difference between “calling an arbitrary shell command” and “calling an arbitrary lambda”?


Honestly I don't even know how to respond to such a comment. Let's just leave it at that.


Yes, because in scenario 1 you're running code that you write outside of the template, and in scenario 2 you're calling out to custom code....


CF doesn't roll back either; the rollback is not done by the client but by the CF backend, which is a very different story.


I was going to respond, but instead I will say: I hope that you enjoy using CF, and I will continue enjoying Terraform.


When working solely on AWS, I prefer CloudFormation, since I feel it's easier to organize my deployments into CloudFormation Stacks.


Agreed about CF. AWS puts a huge emphasis on CloudFormation when they know it is sub-par. Ansible is pretty cool too because you can easily convert between YAML/JSON for CF, and add a lot of flexibility based on variables and other things such as error handling that you can't do with Terraform.


I don't agree it is subpar. I have a fairly long list of reasons for preferring CF over Terraform. I agree on combining CF with Ansible though. My recipe is to wrap a CF template in an Ansible role. It makes for a versioned chunk of automation that composes well with my other roles, including ones that are not CF based. Plus I can do non-CF setup tasks using any Ansible module, and even run tests, to get pretty much full end-to-end automation for a given stack. I haven't found a way to get the same degree of power and flexibility from Terraform without resorting to writing custom plugins.


Definitely not subpar. Cloudformation is an incredible service. I’ve used it to deploy thousands of stacks. It just works.


does it rollback in the face of failure? /s



i know, hence the /s

it was a jab at Terraform’s behavior


My experience of TF is very much the opposite. It looks fantastic on paper (well, in the browser) and I was excited to start using it. But the shine quickly wore off. It claims to be multi-cloud, but it's not as if you can take some TF from AWS and run it on Azure: it's a rewrite. And it claims to be able to import existing, running cloud configurations so migrating to it should be easy, but that doesn't work, so it's a rewrite. And it is terrible at modifying existing things; the developers of TF seem to believe that destroying everything and rebuilding from scratch is the way to, say, add another node. Of course, by the time you figure all this out you have invested so much political capital in persuading the organisation and your peers to adopt it that you're stuck with it. But the lesson is really: use whatever is native to each cloud, such as CF on AWS. Beyond any trivial deployment you're locked in anyway, so you might as well enjoy it.


Well, I feel the same about terraform. I much prefer cloudformation.


I'm an ex Googler and 2nd time founder of a company in the computer vision / robotics space and started hacking when I was 14. I'd consider myself pretty knowledgeable in my area -- yet, I must admit that I have absolutely no clue what Hashicorp and Terraform do and why everybody likes it.

I am very curious though. Would you mind explaining this from a high level (to someone who knows cloud technology about as well as your dog)?


Terraform serves to translate a descriptive statement about a desired infrastructure state into the sequence of API calls that will bring about that state.

Given something like "I want three auto-scaling groups, each containing a minimum of three instances of type m4.xlarge, in the AWS us-west-2 region. And I want an S3 bucket that has permissions set up so that only the code running on those instances can read and write to the bucket. And I want a load balancer between all of them. And I want the instances to run Ubuntu 18.04 and to install these 6 dependencies on startup. And I want a large pepperoni pizza[1]." Terraform will read your credentials and make it happen.

[1] https://github.com/ndmckinley/terraform-provider-dominos
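
A hedged sketch of a small piece of that description (names, AMI ID, and sizes here are purely illustrative):

```hcl
provider "aws" {
  region = "us-west-2"
}

resource "aws_s3_bucket" "app_data" {
  bucket = "example-app-data" # illustrative bucket name
}

resource "aws_instance" "app" {
  count         = 3
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "m4.xlarge"
}
```

Run `terraform apply` against that and Terraform computes which API calls are needed to make reality match the description, in dependency order.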


Just curious: what exactly about the description on the website is unclear?

> Provision and Manage any Infrastructure Use infrastructure as code to consistently provision any cloud, infrastructure, and service.


The terms "provision", "manage", "infrastructure" and "service" alone can each have 10s of different meanings. Hence if you multiply it out that sentence could have 10000s of different meanings.

Hence it's meaningless to me.


Both CloudFormation and Terraform are tools that allow a dev to define Cloud Resources (databases, networks, permissions, instances, all that) as code/markup, and then run those markup files through their respective engines (CloudFormation or Terraform) and have those resources get created/modified/whatever.

They both have their drawbacks. Terraform obviously suffers from not being a first-class service to AWS (since their service CloudFormation is a direct competitor). It is also possible to accidentally discard your cloud "state" file, which keeps track of every instance that already exists and its current state (so TF can do a "diff" on what is there vs. what you're trying to apply); that definitely causes some headaches. In my experience, the benefits of the tool's design decisions far, far outweigh any of the cons.

On the contrary, I have never had a good experience with CloudFormation. The workflow is long/slow, some of the AWS Best Practices are hilariously bad (looking at you, "Paste this entire Python file as a text string into a yaml file"), and more than a few times have I gotten into a state where CF-generated instances cannot be deleted and require the intervention of an AWS rep.


Hey, thanks for the clarification!


what are the other two tools?


My biggest hope is that this means that some effort can be directed back to fixing bugs. I love what terraform can do, but I hate what I sometimes need to write to make it work.

Also, I'll go out on a limb and say that I dislike the flexibility of iteration allowed in HCL 2. I know that people overwhelmingly asked for it, but my opinion is that it demonstrates a fundamental misunderstanding of how the v. 11 and earlier system was designed, and just how powerful completely declarative code can be.


IMO they should have dumped HCL once they caught a whiff of what Pulumi is up to; I'd much rather go with actual languages where, when you learn them, you pick up some valued knowledge. You can also do a whole lot more with TypeScript than you can with HCL 2.


+1 on Pulumi.

It's a bit disconcerting when one of the BIG aspects of the Terraform 0.12 release is some support for preliminary types.

Pulumi just uses Typescript.

I'm not entirely sure why Terraform is going in the direction of reinventing the wheel when it can leverage stuff out there.


Agreed, the data model of Terraform simply doesn't match the problem domain. You need an algorithm to codify the pattern and then data to fill in the params. Terraform doesn't allow you to create the patterns you need in a way that's debuggable and doesn't allow for code reuse. For simple setups it's not apparent there's a problem, but when they get more complex it's nearly impossible to use.

Additionally you have to rewrite everything for each cloud provider, so it just expands the work required, all with no real IDE integration. I’d just write against the cloud provider APIs directly or look at Pulumi when they get interactive debugging.


I used terraform with my AWS deployments because there were lots of examples and I pretty much was able to find a solution that matched my problem and copy paste.

Then, I needed to launch infra in GCP and I messed around with terraform unsuccessfully for a few days before writing about 10 lines of gcloud CLI commands into a makefile.

Now I just check a makefile into my project and just break things up into little shell scripts.

Solved so many problems and headaches.


I'm so tempted to abandon CloudFormation for a Makefile with AWS CLI commands.

If there wasn't a chip on my shoulder telling me I had to use what the next person would expect (else, as a contractor, I risk being seen as unprofessional), I'd do it in a heartbeat.

I've never used Terraform but CloudFormation just seems to suck, the documentation is poor and relatively few people are sharing their stack files. I've lost count of the number of times I've hit an error only to find Google hasn't heard of it.


IMO both of you guys should take a look at Pulumi. You could still use a makefile but using their TS clients to access the APIs is much more intuitive and easier to diagnose than shell scripts.


I can't see paying an ongoing monthly fee unless I was dealing with really big and ever changing architecture. Most of the time, I am writing everything into a log because I will want the ability to rebuild everything in the event something goes down.


I wouldn't say it becomes impossible to use at scale; rather, what happens at scale with Terraform is that the opinionated nature of HCL makes itself present in ways you probably won't ever have to deal with if you're shepherding very small fleets or single-serving resources.

Personally I’m excited that loop operations now exist in 0.12

But thanks for the reference to Pulumi, had not heard of this and it looks very interesting.


The Terraform language extension in VSCode is very good.


It's not good: it doesn't understand the TF types and recommends values that are invalid for the context, there's no inline help (you have to go look it up on the website), refactoring a variable name doesn't fully work, and you can't set breakpoints or trace through the execution to determine what's going on... I mean, it's practically at a mid-'80s level of tooling, although most of those things worked even back then.

The lesson here is that if you plan on making a language, even a DSL, you want to be sure you're really up for it since it's a lot of work.


The biggest issue we have with Terraform (and other Hashicorp tools in general, really) is their different configuration formats, but mostly it is HCL limitations.

I haven't found a solution for it. I have lots of resources that are almost identical except for a few arguments, but right now it seems like my options are to either write my own Ruby script to output .hcl files, or keep two resources that are almost identical and live with the mistakes that future modifications could entail.

I haven't had the time to look into HCL2 yet, but maybe it is solved.


+1 for the question about a "standard" or recommended way to parse and generate .tf files from a script.

I have two use cases in mind:

Use case 1: need a reproducible way to generate Terraform folders for multiple almost-identical deployments (this can be solved via modules)

Use case 2: need a way to "promote" a deployment from staging to production (load a .tf file, change a few basic params, then save again to a different folder; this doesn't really work with templates since I want to load an existing .tf file)

One thing that might work is to use the combination of two tools:

https://github.com/virtuald/pyhcl (reads .tf, can export .tf.json)

https://github.com/kvz/json2hcl (converts .tf.json to .tf)

but it feels hacky...

Any other recommendations for .tf parsing and generation? (preferably in python or scriptable via python)


Your favorite tool that produces json should be able to work. I'd probably use jsonnet, which has a multi-file mode so that one template file can render into many json files.


You can actually use JSON files in place of HCL: https://www.terraform.io/docs/configuration/syntax-json.html

main.tf.json
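Since Terraform accepts the JSON syntax natively, a small Python script can emit main.tf.json directly, with no HCL parsing or generation involved. A sketch (resource and AMI names here are illustrative):

```python
import json

# Sketch: render almost-identical aws_instance resources from Python
# into Terraform's JSON configuration syntax (main.tf.json).
def render_instances(amis, count=3):
    resources = {}
    for name, ami in amis.items():
        resources["nomadclients_%s" % name] = {
            "ami": ami,
            "instance_type": "t2.micro",
            "count": count,
        }
    return {"resource": {"aws_instance": resources}}

config = render_instances({"tick": "ami-0aaa", "tock": "ami-0bbb"})
with open("main.tf.json", "w") as f:
    json.dump(config, f, indent=2)
```

Terraform picks up .tf.json files alongside regular .tf files in the same folder, so this also covers the "promote to a different folder with a few changed params" case.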


You can't write a local module for the resource that takes the differences as inputs?


Maybe I can, but I am failing to grasp how templates or the modules work, we have these resources where only the ami is different (and sometimes the count):

    resource "aws_instance" "nomadclients_tick" {
      ami = "${data.aws_ami.nomadclient_tick.image_id}"
      instance_type = "t2.micro"

      count = 3

      iam_instance_profile   = "${aws_iam_instance_profile.consul-join.name}"
      subnet_id              = "${element(aws_subnet.consul.*.id, count.index)}"

      vpc_security_group_ids = [
        "${aws_security_group.rule01.id}",
        "${aws_security_group.rule02.id}",
        "${aws_security_group.rule03.id}",
        "${aws_security_group.rule04.id}",
      ]
    }

    resource "aws_instance" "nomadclients_tock" {
      ami = "${data.aws_ami.nomadclient_tock.image_id}"
      instance_type = "t2.micro"

      count = 3

      iam_instance_profile   = "${aws_iam_instance_profile.consul-join.name}"
      subnet_id              = "${element(aws_subnet.consul.*.id, count.index)}"

      vpc_security_group_ids = [
        "${aws_security_group.rule01.id}",
        "${aws_security_group.rule02.id}",
        "${aws_security_group.rule03.id}",
        "${aws_security_group.rule04.id}",
      ]
    }
The examples I have found feel very complicated.


I personally would solve this issue with a local module (very easy to set up, just a subfolder with a main.tf that defines the resources[0] then it's called as here[1]) with two input variables[2]: a required ami variable, and an optional count variable which has a default of 3. I hope that helps.

[0] https://www.terraform.io/docs/modules/index.html

[1] https://www.terraform.io/docs/modules/sources.html#local-pat...

[2] https://www.terraform.io/docs/configuration/variables.html
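Concretely, a minimal local module for this might look like the following (a sketch; variable, resource, and path names are illustrative, and it keeps the pre-0.12 interpolation style used above):

    # modules/nomadclient/main.tf
    variable "ami" {}

    variable "instance_count" {
      default = 3
    }

    resource "aws_instance" "this" {
      ami           = "${var.ami}"
      instance_type = "t2.micro"
      count         = "${var.instance_count}"

      # shared iam_instance_profile, subnet_id, and
      # vpc_security_group_ids would live here too
    }

    # root configuration
    module "nomadclients_tick" {
      source = "./modules/nomadclient"
      ami    = "${data.aws_ami.nomadclient_tick.image_id}"
    }

    module "nomadclients_tock" {
      source = "./modules/nomadclient"
      ami    = "${data.aws_ami.nomadclient_tock.image_id}"
    }

The shared bits are written once inside the module, and each call site only states what actually differs.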


One way I can think of that will solve this problem is to do something like:

    resource "aws_instance" "nomadclients" {
      ami = "${count.index > 2 ? data.aws_ami.nomadclient_tick.image_id : data.aws_ami.nomadclient_tock.image_id}"
      instance_type = "t2.micro"

      count = 6

      iam_instance_profile   = "${aws_iam_instance_profile.consul-join.name}"
      subnet_id              = "${element(aws_subnet.consul.*.id, count.index)}"

      vpc_security_group_ids = [
        "${aws_security_group.rule01.id}",
        "${aws_security_group.rule02.id}",
        "${aws_security_group.rule03.id}",
        "${aws_security_group.rule04.id}",
      ]
    }
The downside to this approach (in TF < 0.12), however, will become apparent when you want to modify the number of instances in one or both pools. This arises from the way HCL v1 tracks the count index for each resource in the state: each resource is linked to a specific index, and when that index shifts, Terraform will attempt to adjust the resources at each respective index accordingly.

Because of these headaches, we decided to abandon HCL for these use cases and ended up writing our own preprocessor that takes a JSON configuration and generates individual JSON-based HCL (i.e. .tf.json) that terraform can then use. We can leverage proper templating to generate many variants of a single resource in a higher-level language, while still using terraform to manage the infrastructure, and it only adds one additional step to the plan + apply process.


Alternatively you could put the AMIs in an array, and just iterate over that array.

  locals {
    nomad_ami = [
      "${data.aws_ami.nomadclient_tick.image_id}",
      "${data.aws_ami.nomadclient_tock.image_id}",
    ]
  }

  resource "aws_instance" "nomadclients" {
    count = 6
    ami   = "${element(local.nomad_ami, count.index)}"
  
    ...
  }


How does a var, template, or even a generic resource that runs a local script not solve this problem?


We are very heavy users of terraform. We have thousands of lines of HCL and our own providers etc...

Terraform 0.x before 0.12 was heavily limited by the syntax; you just worked around so many limitations. We now actively sidestep those limitations: we started writing our own Jinja templates that just generate vanilla Terraform. No counts and no if/else needed.

Maybe with .12 we can move some of these back to plain terraform.

Upgrade so far is not smooth though, there are a lot of pains with the type system vs the plain old “I’ll figure it out for you”.

ALL of this being said, I really appreciate Hashicorp’s work on this, we could not imagine our life without terraform.


Terraform is an amazing piece of technology, but the biggest thing holding it back is HCL. I guess the intention was for it to serve as a happy middle ground between a full-blown programming language and a configuration language: enabling some abstractions, yet rigid enough to stop developers getting too carried away. Unfortunately, I think trying to have it both ways doesn't really work and ends up leaving those writing the code frustrated. It's almost as if it's teasing you: you can feel there's all this power under the hood, and you are given things like modules which are nice, but you always feel like you're on a short leash.

I switched from Terraform to Pulumi for a personal project recently and haven't looked back (no affiliation). Writing it with Typescript means you get excellent IDE support (using VSCode here) and access to the enormous JS ecosystem. I've also found myself creating a number of useful abstractions - that I would never have bothered to with HCL - like IAM helper functions, eg:

    const limitedReadAccessPolicy = createPolicy(
      "product-table-read-access",
      allow(
        ["dynamodb:Get*", "dynamodb:Query"],
        [table.arn, interpolate`${table.arn}/index/*`]
      )
    );

vs

    resource "aws_iam_policy" "product_read_access" {
      name        = "madu_${var.env_name}_product_read"

      policy = <<EOF
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": [
            "dynamodb:Get*",
            "dynamodb:Query"
          ],
          "Effect": "Allow",
          "Resource": [
            "${aws_dynamodb_table.products.arn}",
            "${aws_dynamodb_table.products.arn}/index/*"
          ]
        }
      ]
    }
    EOF
    }


Terraform and Packer are the two most common tools I use with my DevOps consulting company. Most all of my clients previously were creating cloud resources manually within the U/I control panel. Moving to Terraform is a game changer in terms of transparency, auditability, and automation. With the release of Terraform Cloud[1] which provides centralized state storage, locking, history, and integration with GitHub the HashiCorp stack is a no-brainer.

[1] https://app.terraform.io/signup?utm_source=banner&utm_campai...


Congratulations to the team for a monumental release! The significance of the changes involved to ship 0.12 seems understated. This release paves the path to implement many highly requested features and fixes. Most noteworthy of which, in my opinion, is module counts.


I’ve been very excited for this release. I’m a huge fan of Terraform—When I first started using it I quickly couldn’t imagine working without it, despite its many rough edges. But it continues to improve (I can think of very few projects that have had such large and consistent improvement) and this release is another huge step forward. Props to Hashicorp for the great work.


If you love terraform, please also look at Pulumi (I have no affiliation with them).

https://pulumi.io/reference/vs/terraform.html

https://github.com/pulumi/tf2pulumi


Honestly I like the idea of using declarative code for infra stuff because this way it becomes an inventory-like configuration language. There is a reason why YAML dominates today.


Terraform does not use YAML. And in recent updates they are trying to make HCL more "programming language"-like. Pulumi is already there (it uses an actual programming language).


Of course Terraform does not use YAML. I mean that YAML, or other approaches like HCL, allow for a declarative style while retaining some constructs like for-loops or conditionals.

Having your infrastructure in an inventory-like codebase makes it clearer to reason about. It can't go off the rails the way most programs eventually do.


Pulumi is declarative. You can't tell Pulumi what to do, you can only tell it what you want. (Disclosure: I work on Pulumi.)


I see there's a way to not use the hosted service for storing state files, but how do you address things like locking in Pulumi?


I haven't used that scenario, but you are responsible for locking. I guess if you use S3 or something similar to store/retrieve state you can easily make it work.


I work heavily with the Deployment Manager on GCP, so I am basically used to describing infrastructure using Python. What seriously bugged me about TF is how difficult it was to do simple stuff like having a nested loop within a resource. I know that 0.12 is supposed to help there, and I am looking forward to the moment when TF evolves into a proper programming language, supporting saner formats (already happening with the JSON integration).


As someone who has used Terraform since its first release, I still think that Terraform has one fundamental flaw:

It always looks at things using provider-specific resources, while IMHO it should just expose a bunch of predefined resource types (see the rOCCI specs, e.g.) and then allow you to attach a specific provider to them.

IMHO the biggest win as a user would be not having to have an implementation for every provider over and over. Do we really need to have a consul module for Azure, AWS, Tudeluuu and god knows who? No.

That being said, Terraform in the long run still is the most reliable tool in that space.

The whole situation about state management is... lacking. Experience says the one thing no client ever wants in the cloud but always on prem is state.


> The whole situation about state management is... lacking. Experience says the one thing no client ever wants in the cloud but always on prem is state.

Hm, can you elaborate? S3 state seems perfectly serviceable, and I don't immediately see why I would want to operate on-prem resources just to maintain state.


S3 is Amazon. In some cases you want to exert control over your state without being dependent on external parties.

Currently Terraform Enterprise is the only approach to solving that, and it's a good one, but... only if you're one of the big guns, and even those sometimes think twice because the infrastructure state requires much more security than that offers.


There is nothing stopping you from backing up state in s3 and git. Just have CICD to sync it up.

State really shouldn't have anything secret in it in the first place, so 3rd party having access to it shouldn't matter.


S3 compatible on-prem storage like minio might work for you


Sec policies.


This, and also total lack of control.


Many different clients will be happy with many different situations. Current and previous 3 clients all happy with state not on-prem.


Question that's been bugging me and I haven't quite wrapped my head around --

For light usage, I find managing Terraform's state to be a significant hurdle. You basically have no choice but to set up secure remote storage unless you want to check passwords into source control. In contrast `kubectl apply` is so easy to use since it's stateless. It just creates or updates any resources provided, and it even supports --prune if you want the set of configuration to be treated as comprehensive.
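That stateless flow looks roughly like this (label and path are illustrative):

```shell
# apply a whole directory of manifests; with --prune, resources matching
# the selector that exist in the cluster but not in the files are deleted
kubectl apply -f ./manifests/ --prune -l app=myapp
```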

It seems like the main things that Terraform adds:

1. The ability to work with providers that require you to store their generated IDs to reference later. With kubernetes, the kind and name of the resource is enough to identify it; it does get assigned a UID, but you don't have to include that in the configuration since keys that are excluded are left as-is.

2. The ability to work with multiple different providers. I'm not sure how often you have a single terraform project (is that the term?) with more than one provider, but I guess using the same set of tools, even if the configuration is provider-dependent, is nice.

Is that accurate? Does Terraform offer any other advantages?

If you were building a configuration mechanism for your system from scratch to allow your users to configure it as code, would you make a Terraform provider over a command line tool that can apply [--prune] that same configuration?


Terraform shines when you combine multiple providers: kubectl manages Kube, but Terraform helps you integrate K8s with the rest of your infrastructure (monitoring, Sentry, GitHub and GitLab, etc.)

> I find managing Terraform's state to be a significant hurdle

Have a look at https://app.terraform.io/signup/account, it's the free version of Terraform Enterprise and it makes managing your state very easy.

> If you were building a configuration mechanism for your system from scratch to allow your users to configure it as code, would you make a Terraform provider over a command line tool that can apply [--prune] that same configuration?

I would make a Terraform provider, you get plans, modules, the possibility to integrate with other providers easily and creating a new Terraform provider is actually very easy!


I've been chasing infrastructure-as-code for a while, but I keep getting blocked by Terraform missing provider APIs (GCP Cloud Function Runtimes outside Nodejs, most recently), Kubernetes complexity, and generally being too lazy to sit down and bang out something in code that I can quickly iterate on with a console or quick CLI commands. I've had some wins with Serverless Framework, but outside that I've yet to see the payoff from the time I have put towards it. Am I defective?


Just not working at a large enough scale.

Infrastructure as code becomes progressively more important as you add resources until you're at the point where terraform does the job of a whole team of administrators who would be spending all day clicking through UIs.


The UI is often a good way to quick get things done. Terraform requires a bit more investment, but has a huge payoff in the long run.

Personally, I think staging/production is the greatest advantage. I can experiment and break things on staging, then when I apply the plan to production I can be confident I’m getting the same infrastructure.

Infrastructure-as-code brings many programming amenities. You can write comments. You can easily see what a resource depends on and you can grep the repository to see what depends on it. You can version control it, which gives you a record of who changed what and possibly why. You can do code reviews.

It does take more time than clicking through a UI, but I can’t imagine operating all but the smallest infrastructure without Terraform or similar.


No not defective; sounds like you could use some ansible in your life.

It's easy to misuse ansible though. People try to consume it like puppet and use it for whole-system-declarative-state, which is usually not worth the squeeze in ansible.

Treat it as glorified bash-scripting with smarter data handling and yaml-driven config files, and you'll have success.


To be fair, for a Kubernetes deployment you don't need a lot of knowledge: either use k3s or Kubespray and just edit some variables (bare metal).


The upgrade path for this has been an absolute nightmare, made worse by the fact that we sat on the `google` provider 1.16 (lots of GKE stuff like node taints were moved out to `google-beta` in `google` 2.0, and TF 0.12 requires `google` 2.5 at least). There isn't even a straightforward transformation for many things. I get why it's this way, but brace yourself.

I also made the mistake of `terraform plan`ning and updating my code as I went along. Just use `terraform validate`. Otherwise you're going to inadvertently promote the statefile before you're done dealing with all the issues (and you don't want that because it prevents you from aborting your upgrade and switching back to 0.11 till you're all ready). Not a real problem because the statefile is versioned but an annoyance nonetheless.

The type issues were mostly easy to deal with, except for where things that were assignments now being blocks. For instance, look at this:

      master_authorized_networks_config {
        cidr_blocks {
          cidr_block = "10.0.0.0/8"
          display_name = "Example /8"
        }
        cidr_blocks {
          cidr_block = "10.1.2.3/32"
          display_name = "Example /32"
        }
      }

That looked like:

    master_authorized_networks_config = {
        cidr_blocks = [
          {
            cidr_block   = "10.0.0.0/8"
            display_name = "Example /8"
          },
          {
            cidr_block   = "10.1.2.3/32"
            display_name = "Example /32"
          },
        ]
      }
before, and `terraform 0.12upgrade` isn't about to help you navigate this. Especially if you previously assigned from a variable: in that case, it's going to make this monstrosity of a `for_each` over that thing. Jesus Christ.

Still, I'm thrilled for the new stuff with the more type-safety. Not going to complain. If this is the price, then I'll pay. I just wish they'd done more to help the upgrade, but it's an 0.x release so fine.
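For what it's worth, the 0.12 construct for generating repeated nested blocks like those `cidr_blocks` from a variable is a `dynamic` block. A sketch, assuming a hypothetical `var.authorized_cidrs` list of objects with `cidr_block` and `display_name` attributes:

      master_authorized_networks_config {
        dynamic "cidr_blocks" {
          for_each = var.authorized_cidrs
          content {
            cidr_block   = cidr_blocks.value.cidr_block
            display_name = cidr_blocks.value.display_name
          }
        }
      }

Once it's written this way, changing the set of networks is just a change to the variable's value.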


Good stuff. It looks like they've addressed a few of the ergonomic issues in their language with this release. The old way of doing iteration and having to use string interpolations just to reference variables were annoying.


Hashicorp got it right with Vagrant and its Ruby-based DSL. At the end of the day you could always drop down to a proper programming language and express what you wanted if the DSL was lacking. Why they had to invent this HCL monstrosity is beyond me.


I’m a little unclear on terraform in practice.

Are you supposed to download this, use its language and syntax (which is all its own thing) to define your services, and then export that to the YAML setup that AWS CloudFormation (for example) is expecting?

I assume there are reasons I wouldn’t just define it myself in YAML directly?


As others have said, no exporting is involved. You write roughly-json-esque code which you then apply to a cloud provider.

The state of the infrastructure is stored, ideally, in the cloud. You apply your code during which terraform identifies which changes need to be made by comparing the current state with your local changes and executes those changes as you watch.

The end result is well-defined, testable, repeatable cloud infrastructure provisioning.

Infrastructure as code. Store it in git. Profit.


Where do you store your state files? The only project I've worked on that used Terraform stored state in a git repo, which seemed like a nightmare with multiple people doing deploys to the same environment. Others have suggested it would be better stored on a network accessible share.


I use a versioned S3 bucket to store state in my project; terraform also supports "locks" to avoid race conditions/parallel execution. I've set up Buildkite to run `terraform plan`, and then if the output looks good I can approve the next step to run `terraform apply`.

Atlantis (https://github.com/runatlantis/atlantis) is also a really cool project for managing teams working on terraform projects.


Personally, my team uses S3 to store state paired with DynamoDB to manage state locks. Translation, when someone applies a layer a remote lock is created for the duration of their interaction. It’s a fantastic system.

I imagine using git to store state would be mightily inconvenient.
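That pairing is just a backend block (bucket and table names here are hypothetical; the DynamoDB table needs a `LockID` string hash key):

    terraform {
      backend "s3" {
        bucket         = "example-terraform-state"
        key            = "prod/terraform.tfstate"
        region         = "us-east-1"
        dynamodb_table = "terraform-locks"  # provides the state lock
        encrypt        = true
      }
    }

With that in place, a second `terraform apply` against the same state blocks until the first releases the lock.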


Usually your favorite S3-like service. Do most of your development in modules, then have “roots” that cover well-defined subsets of your infrastructure (say db, app servers, load balancer and firewall rules for a single app), then use workspaces to manage different environments. The docs for terraform are actually quite good.


Store them in a bucket as soon as possible.


>Infrastructure as code. Store it in git. Profit.

Ok. But so are CloudFormation YAML files, mostly.

So is the advantage of Terraform that it works beyond AWS?


For me yes. I can do things like: Spinup a kubernetes cluster, add a deployment with the latest image from my docker registry, add a loadbalancer with an external IP and create a cloudflare A record for my domain with that IP - all within the same tool.


Ok, but if you are (as in my case) staying entirely in AWS (Dynamo, Lambda, S3, and IoT Core)... it seems less important.

I never leave the AWS ecosystem at all.

I’m not challenging what you do, just trying to figure out what makes sense for me.


huge difference between storing the infrastructure definition vs storing the infrastructure state. cloudformation manages the state for you and will not fuck it up / allow you to fuck it up


My opinion? Because CloudFormation is full of shit regarding "always being able to rollback". After dealing with 100+ failed CF rollbacks, I stopped caring about that phantom feature. That's around the same time I started using Terraform. At this point I only use CF for things I can not put in TF directly - namely ASGs.

Regarding IaC "rollback" capabilities - I don't think they really exist in a way that makes it reasonable for people to depend on. The path forward is to stand up new infrastructure, canary test, etc., then steer traffic to your new nodes via load balancers, and finally destroy the old ones. I loathe trying to maintain a fleet of nodes that can drift into various states of disrepair. I love the idea of blowing everything away and having a 100% clean and predictable environment again. It makes me happy.

</end rant>

Hope that helps - I really have fallen in love with Terraform after having to cobble down N number of CLI tools for various cloud providers, hypervisor providers, etc. Terraform at least gives an easy to use language and abstracts me above recursive API call logic that I would otherwise need to write for basic things.


That does help. I’ve never heard about CF not working as expected.

Hmm, I’ll have to look into it some more. I really like the idea of layering my application into dynamo set up, lambdas, polices, etc. and being able to update a specific layer, pull it down and put the new one up.

Last thing I want is more headache.


Terraform creates and manages changes for you. You should not be using Cloudformation or writing CF templates at all while using it.


No. It invokes infra changes directly, Cloudformation doesn't enter into this at all


There's no exporting. You write everything in their format to define your infrastructure, and then Terraform turns it into AWS API calls.


That makes sense.

Now, other than the portability of AWS to Azure to GCP, why do I want this?

Because with CloudFormation I can pull down and put up my “stacks” in layers and this isn’t necessarily the same as API calls (although on the backend maybe it is).


Terraform can manage much more than AWS, Azure, and GCP.[1]

[1] https://www.terraform.io/docs/providers/


Basically Terraform is cloud agnostic, so you can (theoretically) define something in Terraform and then use it on AWS, or GCP, or Azure.


That's not at all how it works.

Terraform supports each of those providers, but the resources are specific to each provider.

You cannot use the same Terraform configuration on AWS and Azure.

Nor would you want to, per se. As far as I know, this is by design.


You can’t share provider-specific resources across providers of course, but you can absolutely share other configuration data across providers. I find it extremely useful - for example it’s fairly trivial to bounce between scaleway, AWS, DO, Vultr, etc.


You can also use the same basic cicd pipeline to manage infrastructure on different providers instead of a custom pipeline for each provider tailored to their tool.


Fair criticism, I've never used it, just what I've picked up from their marketing material.


[flagged]


Personal attacks will get you banned here, regardless of how ignorant or annoying some other comment is. Please review the guidelines and follow them when posting to HN: https://news.ycombinator.com/newsguidelines.html


People often complain about HCL being very limiting, that they are sick of templating HCL, etc.

Has anyone tried the approach of writing some go code that imports terraform, rather than using the terraform CLI? This would give you the full power of golang to set up your resources.


First of all, I don't want to dismiss the fact that terraform is a great tool and mostly on the right path.

But HCL is giving me Puppet DSL PTSD.

Folks at Hashicorp should just embed JS engine and let users write definitions using a real language (JS).

It's totally doable using library such-as Otto.


For a second I confused this with TerraGen (https://planetside.co.uk/) and got all excited about reliving a nostalgic youth.


From a language and semantics perspective, any comparisons to Nix?


The language improvements are nowhere near a match for what Pulumi brings to the table, without sacrificing any features.


Terraform still isn't stable?

On the other hand, I'm using React Native, lol


Finally


> Terraform 0.12 is a major update

Looks like a minor update to me...


Before a stable version, in x.y.z with x being 0, a bump in y signals a major release


I am astounded that after all this time it is still 0.y.z

I mean, it has been around for like 5+ years...

I am even more astounded that people are happy to use a product that by its own definition is not stable.

The same happens in Ruby a lot: you find a gem that claims to follow semantic versioning yet is still at 0.y.z after years of production use, which flies in the face of https://semver.org/#how-do-i-know-when-to-release-100


Well we can argue about the hundreds of Apache projects that are not even stable after decades.

Open-source projects move at a different pace than commercial solutions


Where does Terraform claim to follow semver?


I never said it did. I said rubygems did and often fail at following that.

However I know of no common versioning scheme where 0.y.z is considered production ready.


Where doesn't it claim it? Using a versioning scheme like x.y.z _by default_ means semver


No it doesn't. Why would you think it does?


I moved away from Terraform a long time ago. Ansible was way more powerful, handled errors and issues with state changes. Terraform was super picky about how you had to operate, slow and HCL was a terrible markup language. I wasn't really a huge fan of Ansible either though and more recently have been doing things in regular shell/Bash scripts. Now with Kubernetes and service brokers, there is no need for Terraform.


Ansible can very easily, and often does, end up in situations where two runs of the same playbook have drastically different results. Roles/playbooks slowly become Bash scripts written in a YAML layer parsed as a Jinja2 template... and the project turns into a mess of many layers of indirection.

it attempts to and encourages declarative configuration, but is very hard to keep that way. it is difficult and requires determination to make Terraform do something in a non-declarative fashion.

the end result is that when I look at a Terraform configuration I can very easily tell what is going on, because the end result is exactly what I read. where with Ansible, it very much depends on understanding the current state of the server you are about to run this playbook on, and you just have to cross your fingers and hope for the best.


I have many Ansible roles that I trust to do the right thing over and over again. Though, I never use bash for scripting, and when I do shell out in Ansible I keep it extremely simple and always use a `when` or `creates` condition to keep it idempotent.
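For example, the `creates` pattern looks like this (task name and paths are illustrative):

```yaml
# run a bootstrap script at most once: `creates` makes the task a no-op
# (reported as "ok", not "changed") when the marker file already exists
- name: bootstrap the app
  command: /opt/app/install.sh
  args:
    creates: /opt/app/.installed
```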


Maybe you never worked in a company with good ansible practices.


following best practices with C will get you code without memory leaks or overflow errors, yet here we are in 2019 still routinely dealing with those in mission critical software written by people who understand what they are doing.

Terraform is a tool designed to overcome the problems that "best practices" will supposedly prevent you from introducing in your Ansible playbook.

don't get me wrong. Ansible is a wonderful tool for certain applications (configuration management). I would just never use it to spin up infrastructure again.


While there is certainly overlap, these products are trying to solve different problems.

Ansible seems to focus on managing infrastructure, Terraform shines when creating infrastructure.


What about setting up and tearing down EKS clusters? I haven't used terraform but I learned about it through looking at this:

https://learn.hashicorp.com/terraform/aws/eks-intro

So, I'd say Kubernetes itself does more heavy lifting but not all of it?


I use Terraform to create EKS; it works pretty well. I have it set up to use Spot instances with a standard compute node as well as GPU nodes when needed.

https://github.com/JasonCarter80/contoso_aws_k8s/tree/master...


Spinning up an EKS cluster is something you usually do once, and if you need multiple EKS clusters then you can bootstrap others from an initial cluster that you deploy with eksctl or AWS CLI.



