A more mature take on stateless Terraform

Created on November 12, 2023 at 11:03 am

Opinions expressed here are solely my own and do not express the views or opinions of my current or past employers.

A year and a half ago, I wrote this piece on why “Terraform should have remained stateless”.

It was a critique of Terraform’s stated reasons for requiring state, an account of why that state makes using Terraform painful, and a proposal for a none state backend that would let Terraform operate just as well without state.

As time passed, I tried to back my words with code, and gained a deeper understanding of Terraform’s need for state and of how we could make it stateless without significant changes to its design.

What you are about to read is my self-critique of that piece, along with a revised proposal for a stateless Terraform.

What I got wrong: Terraform needs state

I was wrong: Terraform needs some form of state.

These are the reasons Terraform’s documentation gives:

Mapping to the real world

You do not really need state to do this.

Most infrastructure resources have one or more attributes that can serve as the “primary key” of the resource.

These attributes need only meet two requirements:

must be user-defined before resource creation, so we can write them down in code; and

must uniquely identify the resource, at least within the scope of the provider that manages said resource.

Some resources have simple attributes that meet these requirements, like an AWS S3 bucket’s name: it is both user-defined before creation and globally unique.

Other resources do not: an AWS EC2 instance’s name is user-defined before creation, but not unique; whereas its ID is globally unique, but only known after creation. For most of these, however, we have things like tags that can be enforced by policy to be unique.
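To make the idea concrete, here is a minimal Python sketch of mapping declared resources to real ones by key alone. Everything in it is hypothetical: the LiveResource type, the terraform_key tag name, and the in-memory list standing in for a provider's "list resources" API are assumptions for illustration, not anything Terraform or AWS defines.

```python
from dataclasses import dataclass, field


@dataclass
class LiveResource:
    """A resource as a (hypothetical) provider API might report it."""
    kind: str
    provider_id: str  # e.g. an EC2 instance ID, only known after creation
    name: str
    tags: dict = field(default_factory=dict)


def natural_key(resource):
    """Return a user-defined, unique key for the resource, or None if it has none."""
    if resource.kind == "s3_bucket":
        # A bucket name is user-defined before creation and unique on its own.
        return ("s3_bucket", resource.name)
    if resource.kind == "ec2_instance":
        # Instance names are not unique, so assume a tag that policy keeps unique.
        tag = resource.tags.get("terraform_key")
        return ("ec2_instance", tag) if tag else None
    return None


def map_to_real_world(declared_keys, live_resources):
    """Map each declared key to a live provider ID (or None), without any state."""
    index = {}
    for resource in live_resources:
        key = natural_key(resource)
        if key is None:
            continue
        if key in index:
            raise ValueError(f"key {key!r} is not unique")
        index[key] = resource.provider_id
    return {key: index.get(key) for key in declared_keys}
```

A declared key that maps to None is a resource that still has to be created; a key that raises ValueError is exactly the case where the uniqueness policy was not enforced.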

Terraform’s documentation will tell you that since “not all resources support” such attributes, Terraform needs state.

And here it comes down to opinion.

This makes Terraform significantly worse for the majority of its users (users of public cloud providers with APIs whose resources all have such attributes) so the remaining 1% (or less) can use it.

The way I see it, if a given provider’s resources do not have such attributes, it is not up to Terraform to work around that problem.

I can understand, however, that a newly released Terraform optimizing for adoption would have made this compromise at the time.

Metadata

Truth is, I did not understand this one when I first read it. It was not until I tried making Terraform stateless that I understood what this (in my defense, vaguely named) reason meant.

When a resource already exists but its code has been removed, Terraform still needs a way to determine that resource’s dependents so that everything can be destroyed in the right order.

For this you absolutely need some form of state.

My old proposal did not solve this, but the revised one will.
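A minimal Python sketch of the ordering problem, assuming the old dependency map is somehow available. The resource names and the depends_on mapping are made up for illustration; the sort itself uses the standard library's graphlib, computing a creation order and reversing it for destruction.

```python
from graphlib import TopologicalSorter


def destroy_order(depends_on):
    """Return resources in a safe destroy order.

    depends_on maps each resource to the set of resources it depends on.
    Dependencies are created first, so they must be destroyed last.
    """
    creation_order = list(TopologicalSorter(depends_on).static_order())
    return list(reversed(creation_order))


# Hypothetical old resource set: instance -> subnet -> vpc.
old_tree = {
    "aws_instance.web": {"aws_subnet.main"},
    "aws_subnet.main": {"aws_vpc.main"},
    "aws_vpc.main": set(),
}
order = destroy_order(old_tree)
```

The sort is the easy part. The point is that the depends_on mapping for resources whose code is gone has to be remembered somewhere, and that somewhere is state in one form or another.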

Performance

This is admittedly optional.

Syncing

This one I also did not fully understand at the time.

What this means is you need a way of preventing simultaneous changes to the same resource set.

Contrary to what I said, this is not just a problem of state: during apply you are modifying not only the state, but also the resource set itself.

However, the fact that state backends are not required to support locking tells us this is somewhat optional too.

What I got right: Terraform’s state is not necessary

As we have seen, if you are in the vast majority of users whose Terraform providers’ resources all have user-defined, unique attributes (such as those from AWS or Kubernetes), your need for state boils down to being able to calculate the old resource dependency tree.

While Terraform’s state does solve this, we can very much do this with the same tool we all use to work with Terraform together: Git!

My revised proposal

A git state backend.

Have Terraform plug into your Git history, take the Terraform code from the previous commit, and use that to calculate the old tree.

A pre-init script (à la Stacks) on top of the default local state backend could do that too, without the need to extend Terraform.
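As a rough sketch of what such a script could start from, the Python below reads a file as of an earlier commit with git show and compares resource addresses between the old and new code. The function names are hypothetical, and the regex is deliberately naive: it only sees top-level resource blocks in HCL syntax and knows nothing about modules, count, for_each, or dependencies. A real implementation would need a proper HCL parser.

```python
import re
import subprocess

# Naive matcher for lines like: resource "aws_instance" "web" {
RESOURCE_RE = re.compile(r'^\s*resource\s+"([^"]+)"\s+"([^"]+)"', re.MULTILINE)


def read_file_at(rev, path, repo="."):
    """Return the contents of `path` as of commit `rev`, via `git show REV:PATH`."""
    result = subprocess.run(
        ["git", "-C", repo, "show", f"{rev}:{path}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def resource_addresses(hcl_text):
    """Return the set of `type.name` addresses declared in the given code."""
    return {f"{rtype}.{name}" for rtype, name in RESOURCE_RE.findall(hcl_text)}


def removed_resources(old_hcl, new_hcl):
    """Addresses present in the old code but missing from the new code."""
    return sorted(resource_addresses(old_hcl) - resource_addresses(new_hcl))
```

Under these assumptions, calling removed_resources(read_file_at("HEAD~1", "main.tf"), open("main.tf").read()) would list the resources that the latest commit deleted from main.tf, which are the ones whose old dependency tree has to be recovered from history.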

If you really need locking, we could decouple the notion of locking from state in Terraform, as it is not just state we are locking, but rather the resource set as a whole. Then you could use whichever mechanism suits you best to just lock. Alternatively, you can lock through your Terraform automation solution of choice, à la Atlantis.
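To illustrate how small "just lock" can be once it is separate from state, here is a Python sketch of an exclusive lock built on atomic file creation. The resource_set_lock name and the lock path are hypothetical, and a lock file only helps when every runner shares the same filesystem; a shared service would be needed otherwise.

```python
import os
from contextlib import contextmanager


@contextmanager
def resource_set_lock(lock_path):
    """Hold an exclusive lock on a resource set for the duration of the block.

    Raises FileExistsError if someone else already holds the lock.
    """
    # O_EXCL makes creation fail if the file exists, so only one holder wins.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())  # record who holds the lock
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A wrapper script could run plan and apply inside a with resource_set_lock(...) block, so that a second run against the same root module fails fast instead of racing the first.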

Now, this assumes you use Git with some Terraform automation on top of it. While this may not be the case for all Terraform users, I think it’s pretty accurate to say it is for the vast majority of us. Furthermore, this proposal is non-exclusive. The git state backend can coexist with all other state backends, if you need them.

The implementation of this proposal would make moving resources on/off and between Terraform root modules extremely easy. State surgery would cease to be a thing!

But is this really “stateless” Terraform?

Yes, and no.

It is “stateless” as in it does not need its own state, but it still requires some form of state storage for the old resource set. Git, in this case.

I will leave that up to your interpretation.

So, should Terraform have remained stateless?

That was the original title of my first piece, the (arrogant) statement that Terraform should have remained stateless.

Now that I have a deeper understanding of how we could have kept Terraform stateless, I can see how Terraform could have preferred not to depend on Git (or any other Terraform-external code storage) and instead remain a self-contained solution.

Given that, and seeing it through the lens of initially optimizing for adoption and maximizing support, I concede to tune my statement down to:
