What Is Drift in Terraform?

Drift occurs when the actual configuration of your Azure resources no longer matches what is recorded in your Terraform state file. The state file acts as Terraform’s record of your infrastructure. It keeps a detailed record of how your resources were configured when Terraform last applied changes, serving as a reference for their intended state. If changes are made outside Terraform, such as through the Azure portal, the Azure CLI, or ARM templates, this creates a mismatch referred to as drift.

Let’s say you use Terraform to create a storage account in Azure with public access disabled. Later, someone enables public access through the Azure portal without updating Terraform. In this case, the resource's actual configuration no longer matches the Terraform state file. The state will still record public access as disabled until the next terraform plan or terraform apply refreshes it, which could lead to security vulnerabilities or misconfigured applications that rely on restricted access.

How Does Drift Occur in Azure?

Now that we know what drift is, let’s take a look at how it occurs. Drift in Azure typically appears when resources are modified outside of Terraform. These changes can be intentional, such as adjusting configurations through the Azure portal or Azure CLI, or accidental. Here are the most common scenarios that may lead to drift within Terraform:

  • Direct Changes via Cloud Console: Teams often make quick changes directly in the Azure portal, such as turning on public access for a storage account or changing a virtual machine's size. While these changes might solve immediate needs, they don’t get recorded in Terraform’s state file, causing drift.
  • Changes Through CLI Tools or Scripts: Another way drift happens is when someone uses Azure CLI or scripts to update resources. For example, resizing a VM or changing a network configuration outside Terraform means the state file is no longer accurate.
  • Conflicts from Other IaC Tools: If you use tools like Bicep or ARM templates to update an Azure resource, such as changing the SKU of an App Service, Terraform’s state file won’t record those updates. For example, if you upgrade an App Service Plan from Basic to Standard using Bicep, Terraform will still think the resource is set to Basic. This mismatch creates drift, making Terraform unaware of the actual configuration within your infrastructure.

Understanding these reasons for drift helps you take steps to detect and fix it. 

How Can Drift Affect Your Infrastructure?

Drift may initially seem harmless, but it can lead to some significant issues if not addressed correctly. Here are some pointers on how it can affect your infrastructure:

  • Security Issues: When changes are made outside of your code, they can bypass the security measures that are defined within your Terraform configuration. For example, if public access to a storage account is enabled through the Azure portal, data such as customer records, financial documents, or API keys could be exposed to the internet. Terraform would remain unaware of this change, leaving your infrastructure vulnerable to unauthorized access.
  • Compliance Issues: Many organizations rely on Terraform configurations to enforce compliance with standards like GDPR or HIPAA. Drift can result in configurations that no longer meet these requirements. For example, if someone disables encryption on a storage account through the Azure portal, it could lead to a compliance violation, putting your organization at risk of regulatory penalties.
  • Deployment Failures Within Your CI/CD: Drift can also disrupt your CI/CD pipelines. For example, if the size of a virtual machine is changed through the Azure portal or Azure CLI, Terraform might try to revert it during the next CI/CD run. This can lead to conflicts, deployment errors, and even service interruptions.
  • Increased Cloud Costs: Drift can easily increase your organization’s cloud bills. For example, someone might resize a virtual machine to a larger size during high traffic, such as a spike in users accessing an application, but if it is never scaled back down, you’ll pay for unused capacity. Similarly, turning on backups or lifecycle policies directly through the Azure portal without updating Terraform can lead to extra charges you didn’t plan for.

These pointers show us why addressing drift is important for maintaining a reliable and efficient configuration.

How to Detect Drift in Azure?

Now that we understand what drift is and how it can impact your infrastructure, let’s see how you can identify it in your Azure environment. In this section, we’ll deploy resources using Terraform, change their configuration through the Azure CLI, and then use terraform plan to identify the drift. This process will help us see how Terraform detects the differences between the desired state and the actual configuration.

We’ll start by setting up a resource group, a storage account, and a storage container. In this configuration, public network access is not set explicitly for the storage account, so it stays at the provider default, which is enabled. Below is the Terraform code to implement this configuration:

provider "azurerm" {
  features {}
}

variable "env" {
  description = "Environment name"
  type        = string
  default     = "dev"
}

variable "location" {
  description = "Azure region for resources"
  type        = string
  default     = "eastus"
}

variable "name_prefix" {
  description = "Prefix for all resource names"
  type        = string
  default     = "infrasity"
}

resource "azurerm_resource_group" "rg" {
  name     = "${var.name_prefix}-rg-${var.env}-${var.location}"
  location = var.location
}

resource "azurerm_storage_account" "storage" {
  name                     = "${var.name_prefix}st${var.env}${replace(var.location, "-", "")}"
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = azurerm_resource_group.rg.location
  account_tier             = "Standard"
  account_replication_type = "LRS"

  tags = {
    Environment = var.env
    Project     = var.name_prefix
  }
}

resource "azurerm_storage_container" "container" {
  name                  = "container-${var.env}"
  storage_account_name  = azurerm_storage_account.storage.name
  container_access_type = "private"
}

output "storage_account_name" {
  value = azurerm_storage_account.storage.name
}

output "resource_group_name" {
  value = azurerm_resource_group.rg.name
}

output "storage_container_name" {
  value = azurerm_storage_container.container.name
}

Now, run the terraform init command to initialize your configuration. This will download the required provider plugins and set up the working directory. 

Before applying the configuration, run the terraform plan command to preview the resources that Terraform will create. This will show you a detailed list of the resources that Terraform will add, including their configurations. It gives you a chance to review and confirm the planned changes.

Next, run the terraform apply command to deploy these resources defined in the Terraform configuration.

Terraform will display the same plan as earlier but now allows you to confirm the deployment. Review the plan and type yes to apply the changes. Once complete, you’ll see the outputs for the storage account name, resource group name, and storage container name.
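
The init, plan, and apply steps above can be sketched as one small shell function. This is only an illustration: the function name deploy and the plan file name tfplan are assumptions made for this example, not part of the original walkthrough. Note that applying a saved plan file skips the interactive yes prompt, because the plan was already reviewed when it was written.

```shell
# Sketch of the init -> plan -> apply workflow. Run it from the directory
# that holds the configuration above. "deploy" and "tfplan" are
# hypothetical names chosen for this example.
deploy() {
  terraform init || return 1              # download the azurerm provider
  terraform plan -out=tfplan || return 1  # preview the changes and save the plan
  terraform apply tfplan || return 1      # apply exactly the saved plan (no prompt)
  terraform output                        # print the outputs defined in the configuration
}
```

Calling deploy from the configuration directory runs the four commands in order and stops at the first one that fails.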

Now, use the Azure CLI to modify the configuration directly, which will not be reflected in Terraform’s state file. For this example, we will disable public network access for the storage account with the following command:

az storage account update \
  --name <storage_account_name> \
  --resource-group <resource_group_name> \
  --public-network-access Disabled

Remember that you must replace <storage_account_name> and <resource_group_name> with the actual values from the Terraform outputs.
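
Instead of copying the values by hand, the placeholders can be filled from the Terraform outputs. The sketch below assumes it runs in the same working directory as the configuration and that the output names match the ones defined earlier. The function name introduce_drift is hypothetical.

```shell
# Sketch: read the names from the Terraform outputs and make the
# out-of-band change with the Azure CLI. "introduce_drift" is a
# hypothetical helper name used only for this example.
introduce_drift() {
  sa_name=$(terraform output -raw storage_account_name) || return 1
  rg_name=$(terraform output -raw resource_group_name) || return 1

  # Same command as above, with the placeholders filled in.
  az storage account update \
    --name "$sa_name" \
    --resource-group "$rg_name" \
    --public-network-access Disabled
}
```

Using terraform output -raw avoids the surrounding quotes that the default output format prints, so the values can be passed straight to az.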

This command is an external change that Terraform is unaware of, so the storage account has now drifted. Run the terraform plan command again to check for drift within the resources.

Terraform will refresh its view of the resources in Azure, compare it with the state file and your configuration, and display the differences. In this case, it detects a change in the public_network_access_enabled configuration for the storage account. Terraform highlights the drift and plans an in-place update to bring the resource back to its intended configuration.

Here, Terraform shows that the public_network_access_enabled attribute was modified outside Terraform (from true to false), which caused the drift. Terraform plans to set it back to true during the next terraform apply operation.
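
To look at the drift without proposing any change to the resource, the live value can be read from Azure and compared with a refresh-only plan. This is a minimal sketch: the function name inspect_drift is hypothetical, and the publicNetworkAccess query path is an assumption about the JSON that az storage account show returns.

```shell
# Sketch: show the live setting, then let Terraform report what changed
# outside of it. "inspect_drift" is a hypothetical helper name.
inspect_drift() {
  sa_name=$(terraform output -raw storage_account_name) || return 1
  rg_name=$(terraform output -raw resource_group_name) || return 1

  # Value as Azure reports it right now (query path is an assumption).
  live=$(az storage account show \
    --name "$sa_name" \
    --resource-group "$rg_name" \
    --query publicNetworkAccess \
    --output tsv) || return 1
  echo "Live publicNetworkAccess: $live"

  # A refresh-only plan lists objects that changed outside Terraform
  # without planning any change to the infrastructure itself.
  terraform plan -refresh-only
}
```

A refresh-only plan is useful here because it separates "what changed in Azure" from "what Terraform would do about it", which a normal plan shows together.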

By running terraform plan, you can detect and analyze drift within your Azure infrastructure and make sure that the resources stay aligned with your desired Terraform configuration. Now you know how to detect drift. If you want to learn how to fix it, you can refer to this blog.

What Are Some of the Best Practices That Help You Prevent Drift?

Now that we’ve explored how drift occurs and how to detect it, let’s look at some practical ways to minimize it within your Azure infrastructure. These practices will help you keep your configurations consistent with your Terraform state.

  • Always Make Changes Through IaC Tools: Avoid making changes directly through the Azure portal or CLI. By handling every update to a Terraform-managed resource through Terraform itself, the state file remains aligned with the actual resource configuration. For example, scaling a virtual machine or updating storage configurations should always be done in the Terraform code and rolled out with terraform apply.
  • Run terraform plan Regularly: Running terraform plan frequently helps catch differences early. For example, conducting weekly checks within your team can identify if someone accidentally enabled public access on a storage account. This makes sure that any differences are resolved before they escalate into more significant issues.
  • Add Drift Checks to CI/CD Pipelines: Integrate drift detection into your CI/CD pipelines by adding a terraform plan step before deploying changes. This makes sure that any unexpected differences are identified and addressed before they disrupt the deployments.
  • Use Consistent Tagging: Apply consistent tags like managed_by=terraform to your resources. This makes it easier to identify Terraform-managed resources and avoid unintentional changes through the CLI or portal.
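
The scheduled and CI/CD checks described in this list can be sketched with the -detailed-exitcode flag of terraform plan, which exits with 0 when there is nothing to change, 1 on error, and 2 when the plan contains changes. The function name check_drift and the messages it prints are assumptions made for this example.

```shell
# Sketch of a drift gate for a pipeline or scheduled job.
# "check_drift" is a hypothetical helper name.
check_drift() {
  rc=0
  # -input=false keeps the command non-interactive in CI.
  terraform plan -detailed-exitcode -input=false >/dev/null || rc=$?

  case "$rc" in
    0) echo "no drift"; return 0 ;;
    2) echo "drift detected"; return 2 ;;
    *) echo "plan failed"; return 1 ;;
  esac
}
```

A pipeline step could call check_drift and fail the job on a non-zero return. Keep in mind that exit code 2 also appears when the code contains changes that have not been applied yet, so the check only isolates drift when it runs against an already-applied configuration.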

Following these practices will help you reduce drift and maintain a secure, consistent, and compliant Azure infrastructure.

Even with best practices, using terraform plan has its limits. It only detects drift in resources that are managed by Terraform. If someone from your team creates a resource directly in the Azure portal or using another tool, Terraform won’t track or identify changes to these resources. This leaves gaps that can lead to unnoticed misconfigurations, compliance problems, or unexpected costs.

Firefly: A Tool to Simplify Managing IaC Drift

Firefly helps close this gap by providing a complete view of your infrastructure. It doesn’t just check Terraform-managed resources; it scans all your resources, including those created outside of Terraform. This makes sure that nothing is missed within your infrastructure.

Firefly provides a centralized dashboard that offers a detailed overview of your cloud infrastructure. This dashboard categorizes resources into different states, such as managed, unmanaged, drifted, and ghost resources. Managed resources are those governed by IaC tools like Terraform, while unmanaged resources are created outside of Terraform’s scope, such as directly through the Azure portal. Drifted resources are those where the actual state has deviated from the desired state.

One of Firefly’s standout features is its ability to generate codified versions of unmanaged resources. For example, if a virtual machine or storage account was created outside Terraform, Firefly can generate the necessary Terraform code to bring that resource under IaC management. This allows you to standardize your infrastructure without starting from scratch, saving time and reducing manual effort.

Firefly also simplifies the process of identifying and addressing drift. Firefly displays drifted resources with clear comparisons between the desired configuration and the actual state. For example, if the desired configuration specifies public access disabled for a storage account, but the current state has it enabled, Firefly highlights this difference. It provides actionable insights, so you can quickly restore the resource to its intended configuration.

By offering a complete view of your cloud environment, Firefly makes sure that no resource is ignored. Whether it’s identifying unmanaged resources, detecting drift, or converting resources into Terraform-managed configurations, Firefly bridges the gap between partial visibility and complete control over your infrastructure. This makes it an essential tool for maintaining consistency, security, and compliance across your Azure environment.

Frequently Asked Questions

1. What is drift in Azure?

Drift occurs when the actual configuration of Azure resources no longer matches the desired state defined in your Terraform configuration. This happens when changes are made outside Terraform.

2. How to fix drift in Terraform?

To fix drift, run terraform plan to identify mismatches, then run terraform apply to bring your resources back to the desired state defined in the Terraform configuration. If the external change should be kept instead, update the configuration to match it before applying.
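
Both ways of resolving drift can be sketched as one helper. The function name reconcile_drift, its revert and accept arguments, and the plan file name tfplan are hypothetical and used only for illustration.

```shell
# Sketch: resolve drift by reverting it or by accepting it.
# "reconcile_drift" and its two modes are hypothetical names.
reconcile_drift() {
  case "$1" in
    revert)
      # Push the configuration back onto the resource.
      terraform plan -out=tfplan && terraform apply tfplan
      ;;
    accept)
      # Record the live values in the state file only. The configuration
      # must also be edited to match, or the next plan will propose a revert.
      terraform apply -refresh-only
      ;;
    *)
      echo "usage: reconcile_drift revert|accept" >&2
      return 64
      ;;
  esac
}
```

For the storage account example, reconcile_drift revert would re-enable public network access, while reconcile_drift accept keeps it disabled in Azure and in the state file.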

3. What is the risk of drift?

Drift can lead to security vulnerabilities, compliance violations, deployment failures, and increased cloud costs due to unmanaged or misconfigured resources.

4. How can I detect drift in Terraform?

You can detect drift by running the terraform plan command. It shows the differences between the current state of your resources and the desired state in your Terraform configuration.

5. Does drift affect cost management?

Yes, drift can increase costs. For example, scaled-up resources may not scale back down, or additional features like backups enabled outside Terraform can lead to unexpected charges.