Git Repository Management for Large Infrastructure Projects

Infrastructure at scale can be challenging due to complexities across environments, components, and teams. Git is frequently chosen as the control system for infrastructure as code (IaC) because of its version control history, auditability features, and overall compatibility with the automation workflows common in product development. However, as infrastructure-related projects become larger in scope, the management of Git repositories can also become a not inconsequential engineering concern. Monorepos can create contention, polyrepos can create broken visibility, and branching strategies can create inconsistent deployments. In this article, we describe how to organize, structure, and govern Git repositories for infrastructure projects at scale, with attention to the in-practice objectives of deployment safety, audit accuracy, and operational predictability.

Repository Models: Monorepo vs Polyrepo

The choice made between using one monolithic repository (monorepo) or several isolated repositories (polyrepo) can establish the base-level structure of the version control approach to infrastructure.

The monorepo approach includes all of the infrastructure code – for environments, components, and/or teams – into one Git repository. This makes it easy to do a global refactor and ensure atomic updates across dependencies, but adds the overhead of degrading performance in Git operations and can create random coupling of unrelated modules. For organizations with large teams doing modifications in shared directories or committing changes frequently, Git performance bottlenecks and merge conflicts are common.

The polyrepo approach separates the infrastructure into multiple repositories, often aligned against environments (e.g., infra-prod, infra-staging), components (e.g., networking, kubernetes, monitoring), or teams. Polyrepos segregate responsibility and reduce the potential for merge conflicts, but create duplication as tooling, branching, and policy enforcement do not apply universally across repos. There are hybrid models where foundational infrastructure (e.g., VPCs, IAMs, DNS) is in a separate repository from application-specific infrastructure (e.g., Helm charts/service definitions) that is co-located with the application code, and when using these models, you are strictly responsible for maintaining dependency boundaries and automated promotion workflows to avoid cross-repo drift.

Directory and Module Structure

When structuring within a repository, it should be deterministic, intentional, and constructed for instrumentation. Regardless of the format for your infrastructure modules (Terraform, Pulumi, Helm, etc.), all will be broken down to reusable constructs and organized into directory structures with managed interfaces.

A common structure in a monorepo with Terraform could be:

/modules/
- vpc/
- eks/
- redis/

/environments/
- dev/
- staging/
- prod/

All environments depend on and import modules and only provide overlays of configuration. This aids in the reuse of modules and provides lower amounts of duplicate code. All modules must be versioned independently, via Git tags or commit hashes. Each environment can be wired into CI pipelines to deploy when changes are made in a merge, and can have plan outputs validated in GitHub pull requests.

It is paramount to avoid coupling environments tightly. Sharing backends, variable files, or states will limit your ability to deploy environments independently when you want to perform phased rollouts, testing, and rollback of environments.

Branching for Infrastructure Projects

The branching strategy related to infrastructure repositories needs to consider production safety, promotion workflow, and approval gates. Infrastructure changes typically take effect immediately, and it is more challenging to reflect and control them, at least fully, until applied on real instances to indicate the results or changes before the action occurs, when compared to application code, which does not have to do so, and completion of the change is not experienced until you deploy the code.

Some common branching models are:

Trunk-based development: all features are merged into a mainline branch (i.e., a main branch) and deployed, as part of the CI/CD processes, continuously with automated pipelines, which keeps your merge debt as low as possible, and you need to do detailed testing before merging and need to ensure environments do not consider each other.
Environment-specific branches: dev, stage, prod, etc. Changes are either cherry-picked or promoted through the use of pull requests. This allows for gatekeeping but also causes drift and adds complexity to synchronizing changes.
Promotion Branches: Changes are made against the main and then cherry-picked or merged against related branches. Changes against the branches are tracked to identify what has progressed overall to release, but this usually takes coordination.

Branch protection rules must be in place to prevent merging of changes that have not gone through review. If the environments are to be productionized, you should at a minimum enforce merge approvals, automated checks, and human sign-off approval. The Git histories should be immutable post-merge to allow for audits.

Tagging and Versioning of Infrastructure Code

Tags in Git provide immutable references to the various versions of your infrastructure. Unlike application artifacts, there may not always be binaries or containers produced by infrastructure. It is fine to use the Git commit hash or tag as the reference for the deployment.

Module repositories should use semantic versioning (v1.2.3) to track changes for their modules. If the changes are breaking changes, then the version change is 1 major version increment and is isolated/changed from the previous version. Tags should be enforced using automated release pipelines for tags, along with a version compatibility matrix for modules.

Promotion of tags should be tied to CI/CD workflows. For example, if v1.3.0 is tagged in the repository, you could create an automated plan and run it against the staging environment and then manually promote/tag it into the prod or production environment. Using tags as your input for deployment can enable repeatability and rollback.

CI/CD System Integration for GitOps

Repository-based infrastructure should initiate polling to CI/CD systems that have the ability to run infrastructure plans, apply to make changes, and enforce policy checks. Infrastructure repositories using systems like Atlantis, Spacelift, GitHub Actions, custom runners, etc, will invoke commands such as terraform plan, kubectl diff, or pulumi preview to visualize changes in pull requests. An infrastructure repository must have pipelines that:

Identify when changes have been made to infrastructure code
Produce a “plan”, or a “diff”, comparing the old and new state of a resource
Export artifacts for review (for example, a plan file, graph, etc.)
Make the changes when the plan/diff has been approved
Retain the commit hash that has been applied for audit trail purposes.

These workflows must be idempotent and stateless across runs. State management (Terraform backends, Kubernetes state, etc.) must happen outside the pipeline runner, ensuring that all comparisons are made with a consistent state.

For changes to shared infrastructure components (including IAM policies, VPC configuration, etc), we want to introduce a validation stage to avoid unintentionally impacting other resources. These validations can take the form of policy-as-code checks (e.g., OPA, Conftest), linting (TFLint, kubeval), and simulation of the resource.

Access Control and Audit Trail

Git repositories for infrastructure must implement access controls that align with access to infrastructure. If a developer does not have permission to deploy to production, they should not have write access to production infrastructure repositories. Role-based access control (RBAC) modelled in Git platforms should match infrastructure roles.

Audit trails should include:
Who was the author of each change
The commit hash of the deployment
A record of production merges approvals
Any CI logs associated with the plan and apply step.

Such auditability will support root cause analysis and post-incident review. Infrastructure Git repositories should be included in security audits and change management processes.

Don’t Get Drowned in History, and Repository Performance

Infrastructure repos can have performance issues with git due to large binary files, large diffs (or diffs with a lot of history), or a large number of commits. In order to embrace Git performance:

Don’t commit state files or plan artifacts to Git
Use .gitignore to avoid tool-generated files
Use Git LFS if you must
Squash merge where it makes sense to reduce noise in history
Archive “legacy” directories that are no longer used.

Git maintenance operations (git gc, git repack, etc) can be run on a periodic basis. If a repository cannot be further optimized, consider splitting it up based on isolation boundaries.

Conclusion

By not managing Git repositories properly with your infrastructure projects, you create operational and security risks that impact the stability of your deployments and the integrity of your systems. Uncommitted or untracked changes can lead to state differences across environments, introducing a greater risk of configuration drift. In the absence of version tagging or a reliable way to reference configuration, rollback in the event of outages becomes unreliable, as you may not be able to restore your infrastructure to a known good state.

The lack of branch protection rules or audit trails creates a potential for either unauthorized or accidental changes to reach production, while poor organization of directories or repository sprawl adds to the risk of experiencing merge conflicts, duplicate code, and inconsistent propagation of fixes. This lack of sufficient CI/CD integration can also result in broken deployments, bypassed validations, and fully unverified plans. Similarly, wide write access or not performing commit scanning opens up security considerations where infrastructure components can be manipulated due to human error or credentials being compromised.

In the end, all of these failures lead to outages of service, compliance violations, and slow incident resolution in production systems. Finally, without intentionally managed Git repositories, large infrastructure projects will lose valuable transparency, security against unknown changes, and the operational control we have built or should facilitate.

Managing Git Repositories for Large-Scale Infrastructure Projects

Repository Models: Monorepo vs Polyrepo

Directory and Module Structure

Branching for Infrastructure Projects

Tagging and Versioning of Infrastructure Code

CI/CD System Integration for GitOps

Access Control and Audit Trail

Don’t Get Drowned in History, and Repository Performance

Conclusion

Improwised Editorial Team

Unlocking Developer Potential: How Platform Engineering Transforms Developer Experience

Struggling with CI/CD Security? Here’s How Advanced DevSecOps Fixes It Fast

How to Build a High-Impact Internal Developer Platform: Step-by-Step Blueprint, Tools, and Best Practices

Struggling with slow deployments and rising cloud costs?