State Data in Terraform

February 27, 2026


Terraform’s state is the persistent data structure that allows declarative configuration to work in practice. Without state, Terraform would have no reliable way to understand what already exists, what it manages, and what needs to change.

This chapter explains:

What Terraform state data consists of
Where it is stored
How Terraform uses it internally
How resources, data sources, outputs, and metadata are represented

Why State Exists

Terraform is declarative. You describe the desired end state of your infrastructure. To determine what actions are necessary, Terraform must compare:

Configuration (what you declare now)
Real infrastructure (what exists in the provider)
Previous known state (what Terraform believes exists)

The state file bridges configuration and real-world infrastructure. It allows Terraform to compute a safe and minimal execution plan.

What Comprises Terraform State?

The state file (typically terraform.tfstate) is a JSON document. You should not edit it by hand (the terraform state subcommands exist for controlled changes), but understanding its structure is important.

At a high level, state contains:

Resource mappings
Data source mappings
Outputs
Metadata
Dependency graph information

Let’s examine each.

1. Resource Mapping

The most critical part of state is the mapping between:

A resource in your configuration
A real object in a provider (e.g., cloud resource)

Example configuration:

resource "aws_s3_bucket" "assets" {
  bucket = "my-app-assets"
}

Terraform stores in state:

The resource address (aws_s3_bucket.assets)
The provider-specific ID (e.g., bucket name or ARN)
All known attributes returned by the provider
Dependency information

This enables Terraform to:

Update the correct real-world object
Detect drift
Destroy the correct infrastructure when requested

Without state, Terraform would not know which S3 bucket belongs to which configuration block.

2. Data Source Mapping

Data sources are read-only queries to providers.

Example:

data "aws_vpc" "default" {
  default = true
}

Unlike resources, data sources do not create infrastructure. However, Terraform still records their evaluated results in state.

Why?

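To record resolved values for use during a run (data sources are normally re-read on each plan, so this is not a long-lived cache)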
To allow dependency resolution
To enable consistent planning within a run

Important distinction:

Resources → managed objects (Terraform owns lifecycle)
Data sources → fetched objects (Terraform reads but does not manage)

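In the state file, data sources are stored in the same resources list as managed resources, but their entries carry "mode": "data" instead of "mode": "managed", and Terraform never creates, updates, or destroys the underlying objects.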
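
As a minimal sketch of the distinction, the configuration below combines a data source with a managed resource that consumes it. The security group and its name "app-sg" are hypothetical additions for illustration, and the example assumes AWS provider credentials and region are supplied by the environment.

```hcl
# Read-only lookup: Terraform records the result in state but never
# creates, updates, or destroys the VPC itself.
data "aws_vpc" "default" {
  default = true
}

# Managed resource: Terraform owns this object's full lifecycle.
# The name "app-sg" is a hypothetical example value.
resource "aws_security_group" "app" {
  name   = "app-sg"
  vpc_id = data.aws_vpc.default.id # this reference makes the resource depend on the data source
}
```

After an apply, both blocks appear in state, but only the security group would be removed by terraform destroy.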

3. Outputs

Outputs are values exported from a module:

output "bucket_name" {
  value = aws_s3_bucket.assets.id
}

State stores:

Output name
Output value
Whether it is sensitive

This enables:

Communication between separate configurations (only root module outputs are persisted in state)
The terraform output command
Remote state data usage
Integration with automation systems

If another project uses:

data "terraform_remote_state" "infra" { ... }

It reads outputs directly from stored state.

Outputs therefore act as a public interface of your infrastructure module.
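
A fuller version of the remote state lookup might look like the following sketch. It assumes the producing project stores its state in an S3 backend; the bucket name, key, and region are hypothetical placeholders, and bucket_name refers to the output defined earlier.

```hcl
# Consumer project: read the outputs of another project's state.
# Bucket, key, and region are hypothetical placeholders.
data "terraform_remote_state" "infra" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "infra/terraform.tfstate"
    region = "eu-central-1"
  }
}

# Only root-module outputs of the producing project are visible here.
output "assets_bucket" {
  value = data.terraform_remote_state.infra.outputs.bucket_name
}
```

Note that the consumer needs read access to the whole state snapshot, not just its outputs, which matters once the state contains sensitive values.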

4. Metadata

State also contains metadata such as:

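State format version
Terraform version that last wrote the state
Serial number (incremented when a new snapshot is written)
Lineage (unique ID assigned when the state is created)
Provider configuration references (recorded per resource)

The serial number lets Terraform and backends detect stale or conflicting writes. For example, terraform state push refuses to overwrite a stored state whose serial is higher than the one being pushed. The serial does not by itself prevent concurrent runs; that is the job of state locking.

The lineage ensures that Terraform does not accidentally overwrite one state with a snapshot from an unrelated one.

The backend configuration is not part of the state snapshot itself; Terraform keeps it in the .terraform directory of the working directory.

Together, this metadata supports safety and consistency checks.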

5. Dependency Graph Information

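Terraform builds a dependency graph on every run, mainly from the references between blocks in your configuration. State complements this: each resource instance in state can record a dependencies list naming the resources it depended on when it was last applied.

This allows Terraform to:

Apply changes in the correct order
Destroy in reverse dependency order, even for resources that have already been removed from the configuration
Identify implicit dependencies from references between resources

Although the graph is rebuilt on each run, the recorded dependencies cover objects that still exist in state but no longer appear in configuration.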
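
The two ways a dependency can arise in configuration can be sketched as follows. The aws_s3_object resources, their keys, and their contents are hypothetical examples added for illustration, and the snippet assumes AWS provider version 4 or later.

```hcl
resource "aws_s3_bucket" "assets" {
  bucket = "my-app-assets"
}

# Implicit dependency: the reference to aws_s3_bucket.assets.id tells
# Terraform to create the bucket first and to destroy this object first.
resource "aws_s3_object" "readme" {
  bucket  = aws_s3_bucket.assets.id
  key     = "README.txt"
  content = "managed by Terraform"
}

# Explicit dependency: no attribute reference exists, so the ordering
# is declared by hand with depends_on.
resource "aws_s3_object" "marker" {
  bucket     = "my-app-assets"
  key        = "marker.txt"
  content    = "created after the bucket"
  depends_on = [aws_s3_bucket.assets]
}
```

References are usually preferable, because they keep the ordering and the data flow in one place; depends_on is meant for relationships Terraform cannot infer.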

Where Is State Stored?

Local State (Default)

By default:

terraform.tfstate
terraform.tfstate.backup

Stored locally in your working directory.

This is suitable only for:

Personal projects
Experiments
Non-collaborative workflows

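It is not suitable for teams: the file lives on a single machine, so it cannot be shared safely, and the local backend's file lock only guards against concurrent runs on that same machine.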

Remote Backends

For production use, state should be stored remotely.

Common backends:

S3-compatible storage (e.g. Amazon S3)
Google Cloud Storage
Azure Blob Storage
HCP Terraform (formerly Terraform Cloud) / Terraform Enterprise
HTTP backends

Depending on the backend and how it is configured, remote backends provide:

State locking
Versioning
Encryption
Access control
Team collaboration

State locking prevents two engineers from running apply simultaneously and corrupting the state, which could in turn lead to duplicated or orphaned infrastructure.
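
A minimal sketch of a remote backend with encryption and locking, assuming Amazon S3 is used. The bucket name and key are hypothetical placeholders, and use_lockfile assumes a recent Terraform release that supports S3-native locking.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state" # hypothetical bucket name
    key     = "prod/network/terraform.tfstate"
    region  = "eu-central-1"
    encrypt = true # server-side encryption of the state object

    # S3-native lock file; assumes a Terraform release that supports it
    # (older releases used a DynamoDB table via dynamodb_table instead).
    use_lockfile = true
  }
}
```

After adding or changing a backend block, terraform init has to be run again so that Terraform can configure the backend and, if needed, migrate existing state.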

How Terraform Uses State

Terraform operates in a sequence:

1. Refresh Phase

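Terraform queries providers and compares real infrastructure to state.

If drift is detected:

The refreshed values are used for planning (in current Terraform versions, the stored state is only rewritten when a plan is applied)
Differences appear in the plan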

2. Plan Phase

Terraform compares:

Desired configuration
Current state

It computes actions:

Create
Update
Replace
Destroy

3. Apply Phase

As apply executes the plan:

State is updated to reflect the new infrastructure reality (operations that succeeded are recorded even if a later step fails)
Serial number increments

State is therefore both:

A record of managed infrastructure
A mechanism for computing future changes

Resource vs Data Source Mapping in State

Understanding this distinction is important architecturally.

Aspect	Resource	Data Source
Creates infrastructure	Yes	No
Lifecycle managed	Yes	No
Stored in state	Yes	Yes
Can be destroyed	Yes	No
Used for dependency resolution	Yes	Yes

Data sources behave like lookups whose latest results are recorded in state, whereas resources represent owned infrastructure objects.

State and Drift Detection

Drift occurs when infrastructure changes outside Terraform (e.g., manual cloud console modification).

Because state contains previously known attributes, Terraform can:

Detect attribute mismatches
Propose corrective updates
Reconcile infrastructure

This is one of the most important reasons state must be accurate and protected.

Sensitive Data in State

State may contain:

Passwords
API keys
Private IP addresses
Connection strings

Marking a value as sensitive only redacts it in CLI output; the raw value is still stored in plain text in state.

Therefore:

Remote backend encryption is critical
Access to state must be tightly controlled
State files must never be committed to version control
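
To illustrate what the sensitive flag does and does not do, here is a minimal sketch. The variable and output name db_password is a hypothetical example.

```hcl
variable "db_password" {
  type      = string
  sensitive = true # redacted in plan and apply output
}

output "db_password" {
  value     = var.db_password
  sensitive = true # shown as <sensitive> in apply output, but written to state in plain text
}
```

Anyone who can read the state file can read this output's value, which is why the access and encryption rules above apply regardless of the flag.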

Internal Structure (Conceptual Overview)

A simplified state structure looks like:

{
  "version": 4,
  "terraform_version": "1.x.x",
  "serial": 12,
  "lineage": "uuid",
  "resources": [
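    {
      "mode": "managed",
      "type": "aws_s3_bucket",
      "name": "assets",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "id": "my-app-assets",
            "arn": "...",
            "region": "eu-central-1"
          }
        }
      ]
    }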
  ],
  "outputs": {
    "bucket_name": {
      "value": "my-app-assets",
      "sensitive": false
    }
  }
}

Actual state files are more complex, but this illustrates the conceptual components.

Architectural Implications

In my opinion, state management is the most operationally critical aspect of Terraform.

Good practices include:

Always use a remote backend in team environments
Enable versioning on state storage
Enable locking
Restrict write access
Treat state as sensitive data
Avoid splitting infrastructure into too many tiny states without reason
Separate unrelated domains into different states

State boundaries are architectural boundaries.
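
The storage-related practices above (versioning, encryption, restricted access) could be sketched for an S3 state bucket as follows. This is an illustration under assumptions, not a complete hardening guide: the bucket name is hypothetical, the snippet assumes AWS provider version 4 or later, and IAM policies for write access are left out.

```hcl
# Bucket that holds Terraform state; the name is a hypothetical placeholder.
resource "aws_s3_bucket" "tf_state" {
  bucket = "example-terraform-state"
}

# Keep previous state snapshots so a bad write can be rolled back.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state objects at rest by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# State is sensitive data: block every form of public access.
resource "aws_s3_bucket_public_access_block" "tf_state" {
  bucket                  = aws_s3_bucket.tf_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

A bucket like this is usually created in a small, separate bootstrap configuration, since it has to exist before other projects can use it as their backend.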

Summary

Terraform state:

Maps configuration to real infrastructure
Stores resource and data source information
Exposes outputs
Maintains metadata for safety and conflict detection
Enables drift detection
Powers plan and apply operations

Without state, Terraform would be a static templating engine.
With state, it becomes a reliable infrastructure management system.

Advanced topics that build on this chapter include:

State migration
Importing existing infrastructure
State refactoring (terraform state mv)
Backend architecture patterns
Monorepo vs multi-state strategies

About Author

Mathias Bothe

I am Mathias from Heidelberg, Germany. I am a passionate IT freelancer with 15+ years of experience in programming, especially in developing web-based applications for companies that range from small startups to the big players out there. I created Bosycom and initiated several software projects.