In this blog post, we'll explore Terraform state management in depth. We'll start by explaining what Terraform state is and why it's so important. Then, we'll explore the best practices for storing your state files, including choosing the proper backend and implementing locking to prevent conflicts. 

We'll also discuss scenarios to consider when deciding whether to split or keep a single Terraform state file.

Finally, we'll discuss the benefits of remote state storage in Terraform. It allows you to share state files across projects and teams, which can improve collaboration in infrastructure management.

Common mistakes made around the state

Understanding these points can help you avoid potential issues and maintain the integrity of your Terraform-managed infrastructure.

1. Storing State Locally

One of the most common mistakes is storing the Terraform state file locally on a developer's machine, where the rest of the team cannot access it. Instead, store the state file on a remote backend, such as Amazon S3, Google Cloud Storage, or Terraform Cloud.

2. Failing to Implement Locking

Another common mistake is not implementing locking mechanisms for the Terraform state file. Locking is essential to prevent concurrent modifications, particularly when multiple users or processes work on the same infrastructure.

When using a remote backend, configure the appropriate locking mechanism, such as DynamoDB for Amazon S3.

3. Manually Editing the State File

Directly editing the Terraform state file is not recommended, as it can cause unexpected behavior and state corruption. The state file format is effectively a private API, and manual edits can easily break the state and cause issues with future Terraform runs.

Instead, use the provided Terraform commands, such as terraform state pull, terraform state push, and terraform state mv, to inspect and manipulate the state as needed.
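As a rough sketch of what working through the CLI instead of the file can look like, the commands below back up the state, list what it tracks, and rename one resource. The resource addresses aws_instance.web and aws_instance.app are hypothetical.

```shell
# Keep a local backup of the current state before changing anything.
terraform state pull > backup.tfstate

# List the resource addresses tracked in the state.
terraform state list

# Rename a resource in the state without destroying it
# (aws_instance.web and aws_instance.app are hypothetical addresses).
terraform state mv aws_instance.web aws_instance.app
```

After a move like this, the resource block in the configuration has to be renamed to match, otherwise the next plan will propose to destroy and recreate the resource.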

4. Lack of Versioning and Backup

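Another mistake is keeping no versions or backups of the state file, which makes a corrupted or accidentally overwritten state hard to recover. With versioning enabled on the state storage, tracking changes, collaborating with your team, and rolling back to a previous version if needed all become easier.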
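As a minimal sketch, assuming the state lives in an S3 bucket and the AWS provider (v4 or later) is in use, versioning can be switched on with the aws_s3_bucket_versioning resource. The resource label "state" and the bucket name are placeholders.

```hcl
# Assumption: the bucket already exists; its name here is a placeholder.
resource "aws_s3_bucket_versioning" "state" {
  bucket = "terraform-state-mana"

  versioning_configuration {
    status = "Enabled"
  }
}
```

With versioning enabled, each write of the state object keeps the previous version, which can be restored if a run leaves the state in a bad shape.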

5. Inconsistent State Management Across Environments

Maintaining consistent state management practices across different environments (e.g., development, staging, production) is important.

Inconsistencies in state file location can lead to confusion, errors, and potential security risks.

Ensure your Terraform state management strategy is consistently applied across all environments to maintain a predictable infrastructure.

By being aware of these common mistakes and following best practices for Terraform state management, you can ensure the long-term stability, security, and maintainability of your infrastructure as code.

Introduction to Terraform State Management 

Terraform is a tool for managing your infrastructure as code (IaC). But did you know that the state of your infrastructure is just as important as the Terraform code itself? This state information is stored in a file called terraform.tfstate, and Terraform needs it to understand what's happening in your infrastructure.

At the end of this blog, we will discuss Firefly and how it helps manage the state.

The state file keeps track of all the resources that Terraform has created, their attributes, and how they're connected. Without this state information, Terraform would not know what to do when you rerun it. It might try to create resources that already exist, or it might miss significant changes.

That's why managing your Terraform state is so important. You must ensure the state file is stored safely and everyone on your team can access it. Otherwise, you might run into problems down the line.

Storing Terraform State Files

One of the first decisions you'll need to make when working with Terraform is where to store your state file. By default, Terraform stores the state file locally on the machine running the Terraform commands. You should only use this approach if you are trying out Terraform for the first time.

Selecting the Right Backend

The recommended best practice is to store your state file in a remote backend, such as Amazon S3, Azure Blob Storage, or Google Cloud Storage. These cloud-based storage options offer several advantages:

  • Remote state files can be shared among team members, making it easier to collaborate on infrastructure changes.
  • Remote backends often provide encryption and access control features to secure your state file.

To configure a remote backend, add a backend block inside the terraform block of your configuration, specifying the backend type and the necessary configuration options, such as the bucket name, region, and access credentials.

Remember to configure the state backend before you start using Terraform to manage your infrastructure.

Here's an example of how you might configure an AWS S3 backend:

1. Create an S3 Bucket

  • Navigate to the S3 service in the AWS Management Console.
  • Click on "Create bucket".
  • Enter a unique name for your bucket, such as "terraform-state-mana".
  • Choose the appropriate AWS Region for your bucket.
  • Configure other bucket settings as needed, then click "Create bucket."
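If you would rather create the bucket with Terraform than through the console, the steps above can be sketched roughly as follows. The region and the resource label "terraform_state" are assumptions, and the public access block is an extra hardening step that the console steps do not mention.

```hcl
provider "aws" {
  region = "us-east-1" # assumption: use the region you chose for the bucket
}

# The bucket that will hold the state file.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-state-mana"
}

# Assumption: state files should never be publicly readable.
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

A bootstrap configuration like this has to keep its own state somewhere, so it is usually applied once with local state, or migrated afterwards.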

Importance of DynamoDB for Terraform State Management

The state file is typically stored in a remote backend, such as an S3 bucket, to ensure it is versioned and secure.

However, without additional measures, the state file can become a point of conflict when multiple users or processes try to access and modify it simultaneously. This can lead to a corrupted state and data loss in your infrastructure. This is where DynamoDB comes into play!

DynamoDB is a fully managed NoSQL database service provided by AWS that offers fast and predictable performance with seamless scalability. When used with an S3 backend for Terraform state, DynamoDB provides a locking mechanism to ensure that only one user or process can modify the state file at a time.

2. Create a DynamoDB Table

  • Navigate to the DynamoDB service in the AWS Management Console.
  • Click on "Create table".
  • Enter a name for your table, such as "lock-table".
  • Set the primary (partition) key to "LockID" with the key type as "String".
  • Configure any other table settings as needed, then click "Create".
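The same table can also be described in Terraform. This is a sketch that assumes on-demand billing and uses a hypothetical resource label, "terraform_lock"; the table name and key match the console steps above.

```hcl
resource "aws_dynamodb_table" "terraform_lock" {
  name         = "lock-table"
  billing_mode = "PAY_PER_REQUEST" # assumption: on-demand capacity
  hash_key     = "LockID"          # the S3 backend expects this exact key name

  attribute {
    name = "LockID"
    type = "S" # "S" is the DynamoDB type code for String
  }
}
```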

3. Configure Terraform Backend

In your Terraform configuration, add an s3 backend block inside the terraform block, pointing at the bucket and DynamoDB table created in the previous steps.
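A backend block along these lines would match the bucket and table from the previous steps. The key (the object path of the state file inside the bucket) and the region are assumptions, so adjust them to your setup.

```hcl
terraform {
  backend "s3" {
    bucket         = "terraform-state-mana"
    key            = "terraform.tfstate" # assumption: object path within the bucket
    region         = "us-east-1"         # assumption: the bucket's region
    dynamodb_table = "lock-table"        # table used for state locking
    encrypt        = true                # encrypt the state object at rest
  }
}
```

Recent Terraform releases also offer S3-native locking as an alternative to a DynamoDB table, so it is worth checking the S3 backend documentation for the version you run.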

4. Initialize Terraform with the S3 Backend

Run terraform init in the directory containing this configuration. This initializes the S3 backend so that Terraform stores its state in your S3 bucket and uses the DynamoDB table for locking.
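For this step, the invocation would look roughly like this. The second command is only relevant if the directory already holds a local state file that should move to the new backend.

```shell
# Run from the directory that contains the backend configuration.
terraform init

# If this directory already has a local terraform.tfstate, this variant
# asks Terraform to copy it to the newly configured backend.
terraform init -migrate-state
```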

After completing these steps, your Terraform state will be stored in the specified S3 bucket, and the state file will be locked using the DynamoDB table to prevent concurrent modifications.

DynamoDB Overview

Here, the lock-table table holds a lock entry whenever someone runs a command such as terraform plan or terraform apply, and while that lock is held, only that user can make changes, as identified by the LockID. The table also stores a Digest value, a checksum of the state file, which changes each time the state is updated.

Locking State Files

When working within a team, it's essential to implement state locking to prevent multiple users from making changes to the state file simultaneously. This can help avoid conflicts and unexpected behavior in your infrastructure.

Most remote state backends, including the AWS S3 backend we configured earlier, support state locking. In the example above, we used a DynamoDB table called “lock-table” to handle the locking.

When you run Terraform commands that modify the state, Terraform automatically acquires a lock on the state file. This lock is released when the command completes successfully. If another user tries to make changes while the lock is held, Terraform prevents them, ensuring that your state file remains consistent.
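Two CLI options are useful when a lock gets in the way, for example when a teammate's run is still in progress or a crashed process left a lock behind. This is a sketch: the 60-second timeout is an arbitrary choice, and LOCK_ID is a placeholder.

```shell
# Wait up to 60 seconds for an existing lock to clear instead of failing immediately.
terraform plan -lock-timeout=60s

# Release a stale lock manually; LOCK_ID is a placeholder for the ID
# printed in Terraform's lock error message.
terraform force-unlock LOCK_ID
```

Use force-unlock only when you are sure no other run is still active, since removing a live lock reintroduces the risk of concurrent writes.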

By storing your state files in a remote backend and enabling locking, you can ensure that your Terraform infrastructure remains consistent and that your team can collaborate effectively on infrastructure changes.

Isolating Terraform State Files

As your Terraform-managed infrastructure grows, isolating and organizing your state files becomes increasingly important. This can help you maintain a clear separation of concerns, make it easier to manage different environments, and improve collaboration within your team.

Let’s look at the diagram below on how we can isolate our state files for different scenarios.

Isolation through Workspaces

One technique for isolating state files is to use Terraform workspaces. Workspaces allow you to create multiple state files within a single Terraform configuration, each representing a different environment or use case.

For example, you might have separate workspaces for your development, staging, and production environments. This ensures that changes in one environment don't accidentally affect the others.

You can use the terraform workspace command to create and switch between workspaces. Here's how it works:

  • Create a new workspace: terraform workspace new dev
  • Switch to an existing workspace: terraform workspace select prod
  • List all available workspaces: terraform workspace list

When you switch between workspaces, Terraform will automatically load the corresponding state file, ensuring you work with the correct infrastructure resources.

Workspace-specific Configurations

In addition to isolating state files, workspaces allow you to define workspace-specific configurations and variables. This can be particularly useful when you need to customize the behavior of your infrastructure based on the environment.

For example, you could use a different instance type or a different VPC in your production environment compared to your development environment. You can achieve this by defining workspace-specific variables and using them in your Terraform configurations.
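One way to sketch this is a map keyed by workspace name, looked up through the built-in terraform.workspace value. The variable name, the local name, and the instance types below are all illustrative assumptions.

```hcl
# Hypothetical map from workspace name to instance type.
variable "instance_types" {
  type = map(string)
  default = {
    dev     = "t3.micro"
    staging = "t3.small"
    prod    = "t3.large"
  }
}

locals {
  # terraform.workspace holds the name of the currently selected workspace.
  # Fall back to a small instance type for workspaces not listed in the map.
  instance_type = lookup(var.instance_types, terraform.workspace, "t3.micro")
}
```

Resources in the same configuration can then reference local.instance_type and pick up the value for whichever workspace is selected.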

Scenarios for splitting state or keeping a single state file

Managing Terraform state effectively is important for maintaining the integrity of your infrastructure, whether you are just beginning your Terraform journey or joining a team where Terraform is already implemented. Understanding the concepts around organizing and splitting your Terraform state files is important for making informed decisions that suit your project's needs.

When you're new to Terraform, you might start with a single configuration file containing all your infrastructure as code. As you learn more and start collaborating with others, you'll likely need to scale your Terraform configuration to accommodate team growth and evolving infrastructure requirements. In this scenario, it's essential to understand how to structure your Terraform configuration files for improved testing, reusability, and scalability.

Scenario 1: State for Smaller Projects

One Environment (Dev, Prod, Staging all in one)

Keeping everything in a single Terraform state file is often convenient for smaller projects with just a few resources. This approach simplifies state management and makes it easier to work on your infrastructure.

This scenario is typically suitable when you have a relatively small infrastructure, such as one or two instances, a few network resources, and a handful of other components, and you know that the infrastructure will not grow significantly over time. In this case, maintaining a single state file can be more straightforward, as there are fewer moving parts to manage.

The trade-off is that, with a single state file, operations such as terraform plan have to load and refresh every resource in that state, so they are generally slower than with multiple state files, where Terraform only works with the state relevant to the resources being managed. For a small project, this cost is usually negligible.

To implement this scenario, you'll use a single state file, e.g., single_env.tfstate, and configure your Terraform backend to point at it.
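A backend block for this layout might look like the sketch below, which assumes the S3 bucket and DynamoDB table from the earlier example. The region is an assumption.

```hcl
terraform {
  backend "s3" {
    bucket         = "terraform-state-mana"
    key            = "single_env.tfstate" # one state file for everything
    region         = "us-east-1"          # assumption
    dynamodb_table = "lock-table"
  }
}
```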

This keeps all your state information in one place, making managing and collaborating on your infrastructure easy.

Scenario 2: State per Environment

Separate State Files for Each Environment

As your project grows and you have multiple environments (e.g., development, staging, production), it is preferable to maintain separate state files for each environment. This helps prevent unintended changes in one environment from affecting the others.

To implement this scenario, you'll create a directory for each environment (e.g., dev, prod, staging) and give each one its own Terraform state file (e.g., dev.tfstate, prod.tfstate, staging.tfstate). Then, you'll configure the Terraform backend in each directory to use the appropriate state file.
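As an illustration, the backend block in the prod directory could look like this. The file name backend.tf, the key layout, and the region are assumptions, and the bucket and table are reused from the earlier example.

```hcl
# prod/backend.tf (hypothetical file name)
terraform {
  backend "s3" {
    bucket         = "terraform-state-mana"
    key            = "prod/prod.tfstate" # state file for this environment only
    region         = "us-east-1"         # assumption
    dynamodb_table = "lock-table"
  }
}
```

The other environment directories would carry the same block with their own key, for example "staging/staging.tfstate", so each environment reads and writes only its own state.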

This approach helps maintain clear boundaries and dependencies between your different environments.

Scenario 3: Split State by Resource Type

Separate State Files for Resource Types

When managing a large and complex infrastructure with Terraform, it can be beneficial to split your state files based on the different types of resources. This approach can make working on and managing specific parts of your infrastructure easier without dealing with the entire state file.

Splitting state files by resource type improves the manageability of your Terraform configuration. A single monolithic state file can become problematic as your infrastructure grows. By separating the state into logical groupings based on resource types, you can limit the impact of a mistake, improve performance, enable parallel workflows, enhance visibility and collaboration, and facilitate selective backups and restores.

When working with multiple state files, operations such as terraform plan are generally faster than with a single, large monolithic state file, because Terraform only needs to load and refresh the state relevant to the resources being managed rather than the entire infrastructure.

For example, you might have separate state files for resources like compute, networking, storage, databases, and so on. This allows you to manage and maintain these infrastructure components more effectively, without the overhead of a single, large state file.

By splitting up your state this way, you can improve your Terraform-managed infrastructure's overall manageability, scalability, and reliability, especially as it grows in complexity over time.

In this scenario, you'll create a directory for each environment (e.g., dev, prod, staging) and, within each environment's directory, separate state files for different resource types (e.g., iamrole.tfstate, s3.tfstate). Then, you'll configure your Terraform backend to use the appropriate state file for each resource type.
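Under the same assumptions as the earlier backend examples (bucket, table, and region), the configuration that manages IAM roles for production might declare its backend like this. The directory layout in the comment is hypothetical.

```hcl
# prod/iamrole/backend.tf (hypothetical layout: one directory per resource type)
terraform {
  backend "s3" {
    bucket         = "terraform-state-mana"
    key            = "prod/iamrole.tfstate" # only IAM role resources live here
    region         = "us-east-1"            # assumption
    dynamodb_table = "lock-table"
  }
}
```

The configuration for S3 resources would use its own key, such as "prod/s3.tfstate", so a plan in one directory never touches the other's state.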

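Finally, you can use the terraform import command if you have existing resources to import into your split state files. The command takes the resource address from your configuration and the provider-specific ID of the existing resource:

terraform import -state=iam.tfstate <resource address> <resource ID>

The -state flag applies to local state files. With a remote backend, run the command from the directory whose backend points at the target state instead.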

Terraform state management using Firefly

Firefly is a cloud asset management solution that enables DevOps, SREs, and platform engineering teams to control their entire cloud footprint and manage it more efficiently and consistently using Infrastructure-as-Code (IaC).

Firefly's non-intrusive platform provides visibility into your cloud resources, allowing you to monitor and track changes, optimize cloud asset management, and replace manual effort with codified processes.

Firefly scans your entire cloud footprint, including your Terraform states, to determine which parts of your existing infrastructure are codified and which are unmanaged. It then automatically generates Terraform code to turn unmanaged resources and configuration drifts into managed ones, ensuring your cloud remains in its desired state.

Firefly's continuous comparison of your IaC to your actual cloud configuration helps identify drift, misconfiguration, and policy violations, allowing you to quickly remediate issues.

Firefly provides a seamless way to navigate from your codified cloud resources directly to the underlying Terraform state files that manage them. In the Firefly dashboard, you can view all the resources in your cloud environment currently managed by Terraform, known as your "Codified" resources. 

For each of these codified resources, Firefly provides details on the Terraform configuration that defines it. 

From this Codified resource view, you can directly navigate to the specific Terraform state file (.tfstate) managing that particular resource. For example, here is the view for the IAM role “staging1111.role”.

This allows you to easily inspect the contents of the Terraform state file under the IaC Stacks section, including all the resources in that state, their current state, and any relevant metadata or dependencies. 

Now, if you go to the backends section under IaC Explorer, you can quickly locate the resources associated with the listed state file.

Frequently Asked Questions 

Q. How do you manage the state file in Terraform?

Terraform uses a state file to keep track of the resources it has created and their current state. To manage the state file, you can use Terraform commands like terraform state pull, terraform state push, and terraform state rm to inspect, update, and remove resources from the state. It's important not to manually edit the Terraform state file, as this can lead to inconsistencies between the state and your infrastructure.

Q. What is the difference between Terraform backend and state?

The Terraform backend is responsible for storing and retrieving the state data. The state file itself contains the serialized representation of the infrastructure that Terraform manages. The backend determines where and how this state data is stored, whether that's a local file, a remote storage service, or a custom backend implementation.

Q. What if the Terraform state file is deleted?

If the Terraform state file is deleted, Terraform will no longer be able to track the resources it has created. This can lead to inconsistencies between the state and the actual infrastructure. To recover from a deleted state file, you would need to re-import the resources into a new state using terraform import.

Q. How do I manage multiple Terraform state files?

To manage multiple Terraform state files, you can use Terraform workspaces or separate state files for different environments or components of your infrastructure. This allows you to keep your state organized and avoid conflicts between different parts of your infrastructure.

Q. What are Terraform state commands?

Some key Terraform state commands include:

  • terraform state pull: Pulls the current state and outputs it to stdout
  • terraform state push: Uploads a local state file to the backend
  • terraform state rm: Removes one or more items from the Terraform state
  • terraform state mv: Moves an item in the state to a new address
  • terraform state show: Displays attributes of a single resource in the state

Conclusion

This blog focuses on the importance and best practices for managing Terraform state files. The main points are:

Terraform state is the core of Infrastructure as Code (IaC). It tracks the current state of your cloud resources. Properly managing the state is critical to ensuring your deployments are consistent, reliable, and secure.

Terraform's state locking is essential to prevent conflicts when multiple people or processes update the state simultaneously. This ensures consistency and prevents unintended changes.

Tools like Firefly can enhance Terraform state management by providing visibility into your whole infrastructure setup, automatically detecting and fixing state drift, and acting as a safety net for your deployments.