Terraform for Network Engineers: Part Four

Wrapping up our Terraform series, I hope it’s empowered you to automate and manage network infrastructure efficiently!

Sudarshan V

20 Jul 2024 — 6 min read


Welcome back to the final part of my Terraform for Network Engineers series! In this installment, we're diving into the often tricky territory of managing the state file in Terraform. I'll also walk you through the concept of remote state management and show you how it can streamline your infrastructure management. Let's get started!

Managing the State file

The state file is the backbone of Terraform. It records the resources Terraform has created and their last known state. Terraform compares this record against your configuration to work out what changes are needed whenever you run terraform plan or terraform apply. By default, Terraform saves this file locally as terraform.tfstate.

While this works fine if you're the only one handling the infrastructure, it can quickly become a headache for a team. Multiple engineers working on the same setup can run into conflicts and inconsistencies. The solution? Store the state file remotely in a shared location that everyone on the team can access. This way, everyone stays on the same page, and your infrastructure management becomes much smoother.

Remote State Management

Remote state management is all about storing the state file in a place that everyone on the team can access. This setup lets multiple engineers work on the same infrastructure without stepping on each other's toes. Here are some of the key benefits:

Consistency: With the state file in a shared location, everyone on the team has access to the same information, keeping your infrastructure consistent.
Collaboration: When everyone works with the same state file, you avoid conflicts and make teamwork smoother.
Security: A secure, remote location for your state file helps protect sensitive information from unauthorized access.
Backup: Many remote state management solutions offer backup and versioning features, so you can recover from mistakes or roll back changes easily.
Locking: Some remote state management tools include locking mechanisms to prevent multiple engineers from updating the state file at the same time, avoiding potential issues.

Terraform supports a range of backends for storing state remotely. You can find a complete list of supported backends in the Terraform documentation. In this section, we'll zero in on using Amazon S3 as our remote state backend.

Using Amazon S3 as a Remote State Backend

Amazon S3 is a popular choice for storing Terraform state files due to its scalability, durability, and robust security features. To use Amazon S3 as your remote state backend, follow these steps:

Create an S3 Bucket: First, log in to the AWS Management Console and navigate to the S3 service. Click the "Create bucket" button and follow the on-screen instructions to set up a new bucket. Make sure to enable versioning on the bucket; this will allow for state file versioning and recovery if needed.
Configure Terraform: After creating the S3 bucket, we need to configure Terraform to use it as the remote state backend. We'll do this by creating a new Terraform configuration file named backend.tf and adding the following configuration:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.58.0"
    }
  }
  backend "s3" {
    bucket = "myblogremotestate"
    key    = "state/terraform.tfstate"
    region = "ap-southeast-2"
  }
}

Contents of backend.tf
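
If you would rather not click through the console, the bucket from step one could also be created with Terraform. The following is a minimal sketch, assuming it lives in a separate bootstrap configuration that keeps its own local state (the bucket has to exist before any backend can point at it). The resource label "state" is illustrative; the bucket name and region are the ones used in backend.tf.

```hcl
# Bootstrap configuration (assumed to be separate from the main project).
provider "aws" {
  region = "ap-southeast-2"
}

# The bucket that will hold the remote state.
resource "aws_s3_bucket" "state" {
  bucket = "myblogremotestate"
}

# Keep every revision of the state file so it can be recovered.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# State can contain secrets, so block all public access.
resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

Keeping this in its own directory avoids the chicken-and-egg problem of a configuration trying to store its state in a bucket it has not created yet.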

In addition to the backend configuration, Terraform needs AWS credentials to reach the bucket. The S3 backend reads them from the standard AWS environment variables, so we'll add the following to the .env file and load it into the shell session before running Terraform:

export AWS_ACCESS_KEY_ID="KeyIDhere"
export AWS_SECRET_ACCESS_KEY="KeySecretHere"

Updated .env file

I have also made minor changes to the palo.tf file to source the module from a git repository. The updated palo.tf file is as follows:

module "paloalto_security_policy" {
  source           = "github.com/SudarshanVK/palo-module.git"
  device_group     = "DEMO"
  source_zone      = ["Inside"]
  destination_zone = ["DMZ"]
  rules = [
    {
      name        = "rule-1"
      source      = ["any"]
      destination = ["10.10.10.1/32", "10.10.10.2/32"]
      services    = ["tcp-80", "tcp-443", "tcp-8080"]
      application = ["any"]
    },
    {
      name        = "rule-2"
      source      = ["any"]
      destination = ["10.10.10.10/32"]
      services    = ["any"]
      application = ["any"]
    }
  ]
}

Updated palo.tf file

Once we've set up the configuration, the next step is to initialize the backend. Run terraform init; Terraform detects the new backend and prompts you to copy the existing local state to it (terraform init -migrate-state requests the migration explicitly). After the state file has been successfully copied, you can proceed with terraform apply as usual. From now on, Terraform will use the remote state file stored in your S3 bucket.

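With the state file safely tucked away in Amazon S3, every engineer on the team works from the same state instead of a private local copy. Because the state no longer lives alongside the code, we can also keep the configuration files in version control without committing terraform.tfstate, making it easier to collaborate on infrastructure changes. Keep in mind that the backend block above does not enable state locking, so two applies run at the same time can still collide; the S3 backend supports locking as an additional setting.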
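
The backend block shown earlier stores the state in S3 but does not lock it on its own. Here is a minimal sketch of adding locking and encryption, assuming a DynamoDB table named terraform-locks (a hypothetical name) already exists with a string partition key called LockID. This block would replace the backend "s3" block in backend.tf rather than sit next to it.

```hcl
terraform {
  backend "s3" {
    bucket = "myblogremotestate"
    key    = "state/terraform.tfstate"
    region = "ap-southeast-2"

    # Encrypt the state object at rest.
    encrypt = true

    # Hypothetical table name; the table needs a string
    # partition key named "LockID" for locking to work.
    dynamodb_table = "terraform-locks"
  }
}
```

After changing the backend block, terraform init needs to be run again. Newer Terraform releases also offer an S3-native lock file setting (use_lockfile) as an alternative to DynamoDB, so it is worth checking the S3 backend documentation for the version you run.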

And that's it! With these steps, we've made our Terraform state management more robust and team-friendly.

Reviewing the PAN-OS Provider: Challenges and Considerations

Many readers have raised questions about the PAN-OS provider in Terraform, so I thought it would be helpful to address some of the common issues and challenges associated with it. In this section, we'll review the PAN-OS provider, discussing its limitations, and some tips for overcoming common obstacles. Let's dive in!

Importing Existing Configuration: Not all resources in the PAN-OS provider support import, and those that do may not cope well with large numbers of resources. This is a problem when you want to bring existing configuration under Terraform management. A practical workaround is to replicate the existing policies in Terraform, place the Terraform-managed rules above the manually created ones, and delete the manual policies once everything is verified. This approach isn't ideal, but it is a way forward.
Committing and Pushing Changes: The PAN-OS provider does not commit or push changes as part of terraform apply. You need to write your own script for this and run it as an additional step after applying changes, which disrupts the typical Terraform workflow and adds complexity to the process.
Provider Timeouts: When managing a large number of resources in a single state file, you may encounter timeout issues with the PAN-OS provider. To mitigate this, you can either rerun the plan and apply stages when a timeout occurs, or limit the number of concurrent operations with the -parallelism flag (the default is 10), for example terraform apply -parallelism=2. Lower parallelism slows execution, but it helps avoid timeout errors and reduces the load on the device.
Provider Documentation: The documentation for the PAN-OS provider often lacks detail on acceptable values and configurations for resources. This makes it difficult to understand how to properly set up your resources and can lead to a lot of trial and error. Engaging with community forums or consulting with others who have experience with the PAN-OS provider can provide valuable insights and help overcome documentation gaps.
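
For the commit step described above, one option is to let Terraform call your commit script at the end of an apply. This is only a sketch under assumptions: ./scripts/panos-commit.sh is a hypothetical script you would write yourself, and terraform_data requires Terraform 1.4 or later. It is not a feature of the PAN-OS provider.

```hcl
resource "terraform_data" "panos_commit" {
  # Re-run the commit whenever palo.tf changes.
  triggers_replace = filesha256("${path.module}/palo.tf")

  # Wait until the policy module has finished applying.
  depends_on = [module.paloalto_security_policy]

  provisioner "local-exec" {
    # Hypothetical script that commits (and pushes) the candidate config.
    command = "./scripts/panos-commit.sh"
  }
}
```

Keying the trigger on the file hash means the script only runs when the policy definition changes. The trade-off is that a failed commit has to be retried by hand or by tainting the resource, so some teams may still prefer a separate pipeline step.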

Terraform Review: A Network Engineer's Perspective

Terraform is widely celebrated in the tech world, but it has its own set of challenges and limitations. Here's a review of some key issues I've encountered:

Complex Expressions: Terraform’s rich set of functions and operators can create highly flexible configurations. However, this flexibility can also lead to very complex expressions that make configuration files hard to read and understand, especially for newcomers.
Limited Error Handling: When something goes wrong, Terraform's error messages are often vague, making troubleshooting a challenge. To dig deeper, you may need to enable debug logging by setting the TF_LOG environment variable (for example TF_LOG=DEBUG) before running Terraform.
Handling IP Addresses: Terraform's built-in support for IP addressing is limited to a few CIDR helper functions such as cidrhost and cidrsubnet, and it has no native IP address type. That can complicate network configurations, and for anything beyond basic subnet math you often have to resort to external data sources or other workarounds.
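
Terraform does include a handful of CIDR helper functions that cover basic subnet math, and naming intermediate steps in a locals block is one way to keep expressions readable. The sketch below derives the two rule-1 destination addresses from a supernet; the 10.10.0.0/16 supernet and the local names are assumptions made for illustration.

```hcl
locals {
  # Hypothetical supernet that contains the DMZ range.
  dmz_supernet = "10.10.0.0/16"

  # Carve a /24 out of the /16: 8 extra prefix bits, network number 10.
  dmz_subnet = cidrsubnet(local.dmz_supernet, 8, 10) # "10.10.10.0/24"

  # Build /32 host entries for hosts 1 and 2 in that subnet.
  web_hosts = [for n in [1, 2] : "${cidrhost(local.dmz_subnet, n)}/32"]
}

output "web_hosts" {
  # ["10.10.10.1/32", "10.10.10.2/32"]
  value = local.web_hosts
}
```

The resulting list could be passed to the destination argument of a rule instead of hard-coded addresses. Splitting the calculation into named locals keeps each expression short enough to read at a glance.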

By understanding these limitations, you can better navigate the challenges of using Terraform and find ways to work around them in your network engineering tasks.

Wrapping Up: Terraform for Network Engineers Series

As we wrap up the Terraform for Network Engineers series, I hope you've found it helpful for diving into Terraform and automating your network infrastructure.

To recap, let’s revisit the quote from HashiCorp that we started with:

HashiCorp Terraform is an infrastructure as code tool that lets you define both cloud and on-prem resources in human-readable configuration files that you can version, reuse, and share. You can then use a consistent workflow to provision and manage all of your infrastructure throughout its lifecycle.

Throughout this series, we’ve explored each aspect of this statement. We discussed how to:

Define custom modules to keep configuration files readable.
Store configurations in version control systems for reuse and sharing.
Manage the state file and utilize remote state management to provision and handle infrastructure throughout its lifecycle.

Thank you for joining me on this journey, and I hope these insights will help you make the most of Terraform in your network engineering endeavors.