Compliance Operations Without DevOps: How Karpenter's Config-Driven Model Eliminates Manual Infrastructure Drift

The compliance officer walks into your weekly sync with a simple question: "Who changed the node configuration last month, and why?" The room goes silent. Your lead developer pulls up the AWS console. Your other engineer digs through Slack messages. Someone mentions there might be a Terraform repository somewhere, but nobody is sure if the state file reflects what is actually running. Twenty minutes later, you still have no answer.

This scenario plays out constantly at growing companies. The infrastructure that runs your production workloads evolved organically. Someone applied a hotfix directly in the console during an outage. Another engineer tweaked a node group size through the CLI without updating the committed configuration. Your Terraform state drifted from reality months ago, and nobody noticed until the auditor asked.

For healthcare startups navigating HIPAA compliance, this lack of infrastructure change documentation is not just embarrassing. It is a compliance violation waiting to happen.

The Terraform Sprawl Problem

Terraform promised compliance through infrastructure as code. For most teams without dedicated DevOps engineers, the reality is different. What starts as a clean module structure becomes a tangled web of custom configurations, copy-pasted blocks, and "temporary" workarounds that become permanent fixtures.

Consider what a typical small team's Terraform setup looks like after two years of organic growth:

infrastructure/
├── terraform/
│   ├── modules/
│   │   ├── eks-cluster/
│   │   ├── eks-cluster-v2/          # "new" version, partially migrated
│   │   ├── eks-cluster-prod-only/   # production-specific overrides
│   │   ├── node-groups/
│   │   ├── node-groups-gpu/         # added for ML workloads
│   │   └── karpenter/               # half-implemented
│   ├── environments/
│   │   ├── staging/
│   │   │   ├── main.tf
│   │   │   ├── main.tf.backup       # from last failed apply
│   │   │   └── terraform.tfstate    # local state, never migrated
│   │   └── production/
│   │       ├── main.tf
│   │       └── overrides.tf         # "temporary" hotfixes
│   └── README.md                    # last updated 18 months ago

The state files tell their own story. The staging environment uses local state because "we'll migrate to S3 later." Production uses remote state, but three engineers have admin access and someone ran terraform apply from their laptop during an incident without committing the changes. The drift between your committed configuration and your running infrastructure is unknown.

When your compliance officer asks for a change log of infrastructure modifications over the past quarter, you face a choice: spend days reconstructing history from CloudTrail logs, commit messages, and Slack threads, or admit that you do not have a clear audit trail.
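The unknown drift described above can at least be measured. The sketch below wraps `terraform plan -detailed-exitcode`, which exits with 0 when the plan is empty, 2 when changes are pending, and 1 on error. The function names and status labels are illustrative choices, and `check_drift` assumes Terraform is installed and the directory is already initialized. A "changes-pending" result only says that configuration and real infrastructure disagree; it cannot say who caused the difference or why.

```python
import subprocess


def interpret_plan_exit_code(code):
    """Map the exit code of `terraform plan -detailed-exitcode` to a status label."""
    if code == 0:
        return "in-sync"          # plan succeeded and is empty
    if code == 2:
        return "changes-pending"  # plan succeeded and found differences
    return "error"                # exit code 1 (or anything unexpected)


def check_drift(workdir):
    """Run a non-interactive plan in `workdir` and report its status.

    Assumes the terraform binary is on PATH and `terraform init` has been run.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(result.returncode)


if __name__ == "__main__":
    # Exercise only the pure mapping here; check_drift needs a real workspace.
    for code in (0, 1, 2):
        print(code, interpret_plan_exit_code(code))
```

Running a check like this on a schedule turns "nobody noticed" into a dated signal, but it still leaves the forensic work of explaining each difference.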

Why Small Teams Cannot Maintain Custom IaC

The problem is not Terraform itself. The problem is that infrastructure as code requires infrastructure as code expertise. Your team has five backend engineers, a frontend developer, and a designer. None of them went to school for HCL syntax, state file management, or Kubernetes resource planning.

Yet someone needs to maintain this infrastructure. So you do what every growing startup does: assign it to your most senior engineer as a side responsibility. They learn enough Terraform to be dangerous. They copy configurations from blog posts and Stack Overflow. They make changes that work, even if they do not understand exactly why.

This creates several compliance-relevant problems:

  • No peer review: Only one person understands the infrastructure code, so there is no meaningful code review process for changes.
  • Inconsistent application: Changes are applied from different machines, at different times, sometimes through CI/CD and sometimes directly.
  • Drift accumulation: Small manual changes accumulate over time, each one widening the gap between documented and actual state.
  • Knowledge concentration: When that senior engineer leaves, so does all institutional knowledge about how the infrastructure actually works.

HIPAA requires documented, repeatable infrastructure controls. Your auditor wants to see evidence that infrastructure changes follow a defined process with clear accountability. They want to trace any configuration to when it was changed, by whom, and for what reason. Custom Terraform sprawl makes this nearly impossible without significant forensic effort.

Declarative Infrastructure Through Rack Parameters

Convox takes a different approach. Instead of requiring teams to maintain custom Terraform modules, Convox exposes infrastructure configuration through declarative rack parameters. These parameters are managed through a single interface, tracked through the Convox API, and auditable through the Console.

Enabling Karpenter on a Convox Rack looks like this:

$ convox rack params set karpenter_auth_mode=true karpenter_enabled=true -r production
Setting parameters... OK

Configuring node lifecycle policies:

$ convox rack params set karpenter_node_expiry=720h karpenter_consolidation_enabled=true -r production
Setting parameters... OK

Setting resource limits to prevent runaway scaling:

$ convox rack params set karpenter_cpu_limit=200 karpenter_memory_limit_gb=800 -r production
Setting parameters... OK

Every parameter change is a single command. Every change is logged. Every change creates a traceable event in the Convox API. When your compliance officer asks about node configuration changes, you can point to a clear history of exactly what changed, when, and which user initiated the change.
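To make the "what changed, when, and which user" history concrete, here is a minimal Python sketch of answering the auditor's question from a list of change records. The `ParamChange` record shape, the helper name, and the sample users and dates are assumptions for illustration, not the Convox API schema; adapt the fields to whatever your event history actually contains.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ParamChange:
    """Hypothetical record of one parameter change (not the Convox API schema)."""
    timestamp: datetime
    user: str
    parameter: str
    value: str


def changes_in_window(changes, start, end, prefix="karpenter_"):
    """Return changes to parameters starting with `prefix` in [start, end), oldest first."""
    selected = [
        c for c in changes
        if start <= c.timestamp < end and c.parameter.startswith(prefix)
    ]
    return sorted(selected, key=lambda c: c.timestamp)


# Illustrative data only; the users and dates are made up.
history = [
    ParamChange(datetime(2024, 3, 4, 14, 30), "alice@example.com", "karpenter_node_expiry", "720h"),
    ParamChange(datetime(2024, 3, 18, 9, 5), "bob@example.com", "node_type", "t3.medium"),
    ParamChange(datetime(2024, 4, 2, 11, 0), "alice@example.com", "karpenter_cpu_limit", "200"),
]

march = changes_in_window(history, datetime(2024, 3, 1), datetime(2024, 4, 1))
for change in march:
    print(change.timestamp.isoformat(), change.user, change.parameter, change.value)
```

The point of the sketch is that the question reduces to a filter over structured records rather than a search through logs, commits, and chat threads.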

The complete list of Karpenter parameters is documented and versioned. There is no drift between documentation and implementation because the parameters are the implementation.

The Audit Trail Difference

Consider the compliance documentation effort for the same infrastructure change under both approaches.

Scenario: Updating Node Expiry Policy

Your security team recommends rotating nodes every 30 days instead of the default to ensure nodes pick up the latest AMI patches. Here is what that looks like with custom Terraform versus Convox rack parameters:

| Concern | Custom Terraform | Convox |
|---|---|---|
| Change Request | Create ticket, find correct module, understand HCL syntax for NodePool expireAfter | Create ticket, reference karpenter_node_expiry parameter |
| Implementation | Modify HCL, open PR, review, merge, run terraform plan, run terraform apply | convox rack params set karpenter_node_expiry=720h |
| Audit Evidence | Git commit, PR approval, CI/CD logs, state file diff | Single API event with user, timestamp, parameter, value |
| Drift Risk | Manual apply could skip PR, state could drift from committed config | No drift possible; parameters are the source of truth |
| Expertise Required | Terraform, HCL, Karpenter CRDs, state management | Read parameter documentation |

The Convox approach collapses the entire change management process into a single, auditable action. There is no gap between "what was approved" and "what was applied" because they are the same thing.

Exporting Configuration for Audit Documentation

When preparing for a HIPAA audit, you need to document your current infrastructure configuration. With Convox, this is straightforward:

$ convox rack params -r production
karpenter_enabled                    true
karpenter_auth_mode                  true
karpenter_node_expiry                720h
karpenter_consolidation_enabled      true
karpenter_consolidate_after          30s
karpenter_cpu_limit                  200
karpenter_memory_limit_gb            800
karpenter_capacity_types             on-demand
karpenter_instance_families          c5,m6i,r5
high_availability                    true
private                              true
node_type                            t3.medium
...

This output represents your complete infrastructure configuration in a human-readable format. You can include it directly in audit documentation. You can diff it against previous exports to show exactly what changed between audit periods. You can version it in your documentation repository without worrying about state file conflicts or HCL syntax.
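Diffing two exports can be done with any text diff tool, but a parameter-aware comparison is easier to paste into an audit report. The Python sketch below assumes the two-column `name value` layout shown in the output above. The function names and the earlier `1440h` value are illustrative, not taken from a real rack.

```python
def parse_params(text):
    """Parse two-column `name value` lines into a dict."""
    params = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank lines, a "..." elision marker, or names with no value
        name, value = parts
        params[name] = value.strip()
    return params


def diff_params(old, new):
    """Return {name: (old_value, new_value)} for added, removed, or changed parameters."""
    changed = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changed[name] = (before, after)
    return changed


# Illustrative exports from two audit periods.
previous_export = """
karpenter_node_expiry                1440h
karpenter_cpu_limit                  200
"""
current_export = """
karpenter_node_expiry                720h
karpenter_cpu_limit                  200
karpenter_memory_limit_gb            800
...
"""

changes = diff_params(parse_params(previous_export), parse_params(current_export))
for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after}")
```

A value of `None` on either side marks a parameter that was absent from that export, which is worth calling out separately in the report.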

For more detailed infrastructure state, you can also access the rack information:

$ convox rack -r production
Name      production
Provider  aws
Router    router.production.0a1b2c3d4e5f.convox.cloud
Status    running
Version   3.24.0

The combination of rack parameters and rack status gives auditors a complete picture of your infrastructure configuration without requiring them to parse Terraform state files or understand HCL syntax.

Karpenter Configuration as Compliance Documentation

Karpenter's config-driven model through Convox parameters maps directly to HIPAA technical safeguards. Consider how each parameter category addresses specific compliance requirements:

Access Controls and Resource Limits

HIPAA requires limiting access to only what is necessary. Karpenter resource limits enforce this at the infrastructure level:

$ convox rack params set \
  karpenter_cpu_limit=200 \
  karpenter_memory_limit_gb=800 \
  -r production
Setting parameters... OK

These parameters create documented, enforceable limits on infrastructure scaling. Your compliance documentation can state: "Production infrastructure is limited to a maximum of 200 vCPUs and 800 GB of memory through platform-enforced controls." The karpenter_cpu_limit and karpenter_memory_limit_gb parameters provide the evidence.
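A statement like that is only useful if it stays true. As a rough sketch, the Python check below compares exported parameter values against the ceilings written in your compliance documentation. The `check_limits` helper and the dictionary layout are assumptions for illustration; the parameter names and the 200 and 800 values come from the command above.

```python
def check_limits(params, documented):
    """Compare numeric rack parameters against documented ceilings.

    `params` maps parameter names to exported string values.
    `documented` maps parameter names to the integer ceiling in your policy.
    Returns a list of findings; an empty list means the export matches the policy.
    """
    findings = []
    for name, ceiling in documented.items():
        raw = params.get(name)
        if raw is None:
            findings.append(f"{name}: not set")
            continue
        try:
            value = int(raw)
        except ValueError:
            findings.append(f"{name}: non-numeric value {raw!r}")
            continue
        if value > ceiling:
            findings.append(f"{name}: {value} exceeds documented ceiling {ceiling}")
    return findings


documented = {"karpenter_cpu_limit": 200, "karpenter_memory_limit_gb": 800}
exported = {"karpenter_cpu_limit": "200", "karpenter_memory_limit_gb": "800"}

for finding in check_limits(exported, documented) or ["export matches documented limits"]:
    print(finding)
```

The check treats a missing parameter as a finding rather than a pass, since an unset limit is not the same as a limit within policy.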

System Integrity and Patch Management

HIPAA requires maintaining system integrity through regular updates. Karpenter node expiry ensures nodes are regularly replaced with fresh AMIs:

$ convox rack params set karpenter_node_expiry=720h -r production
Setting parameters... OK

This single parameter caps node lifetime at 720 hours (30 × 24), so every node is replaced at least every 30 days and its replacement launches from a fresh AMI with the latest security patches. Your compliance documentation can reference the karpenter_node_expiry parameter as evidence of your patch management process.
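Auditors tend to think in days while the parameter is expressed in hours, so it can help to convert the value mechanically instead of by hand. The sketch below assumes durations use the hour, minute, and second suffixes that appear in this post (`720h`, `5m`, `30s`); the `parse_duration` helper is illustrative and rejects anything else.

```python
import re
from datetime import timedelta

# Suffixes seen in this post's examples mapped to timedelta keyword names.
_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text):
    """Parse strings such as '720h', '5m', '30s', or '1h30m' into a timedelta."""
    matches = re.findall(r"(\d+)([hms])", text)
    # Require the matched pieces to reconstruct the whole input, so stray
    # characters or unsupported units are rejected instead of ignored.
    if not matches or "".join(number + unit for number, unit in matches) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    total = timedelta()
    for number, unit in matches:
        total += timedelta(**{_UNITS[unit]: int(number)})
    return total


expiry = parse_duration("720h")
print(f"karpenter_node_expiry=720h is {expiry.days} days")
```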

Cost Controls and Capacity Management

While not strictly a HIPAA requirement, demonstrating controlled infrastructure growth is often part of broader compliance programs:

$ convox rack params set \
  karpenter_capacity_types=on-demand \
  karpenter_instance_families=c5,m6i,r5 \
  -r production
Setting parameters... OK

These parameters document your infrastructure purchasing model and approved instance types. The karpenter_capacity_types and karpenter_instance_families parameters become part of your change control documentation.
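If your change control documentation lists approved capacity types and instance families, the exported values can be checked against that list. This is a small Python sketch under the assumption that both parameters are comma-separated strings, as in the command above; the `unapproved_values` helper and the approved sets are illustrative.

```python
def unapproved_values(raw, approved):
    """Return entries of a comma-separated parameter value that are not in `approved`."""
    entries = [item.strip() for item in raw.split(",") if item.strip()]
    return [item for item in entries if item not in approved]


# Approved lists as they might appear in change control documentation.
approved_capacity_types = {"on-demand"}
approved_instance_families = {"c5", "m6i", "r5"}

print(unapproved_values("on-demand", approved_capacity_types))
print(unapproved_values("c5,m6i,r5", approved_instance_families))
```

An empty list means every configured value is on the approved list; anything returned is a candidate for either a change request or a rollback.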

The BYOC Compliance Advantage

Convox's Bring Your Own Cloud model is particularly relevant for HIPAA compliance. Your infrastructure runs in your own AWS account, not on shared infrastructure managed by a third party. This means:

  • PHI never leaves your account: Patient data stays within your AWS security boundary, satisfying data residency requirements.
  • You control the BAA relationship: Your Business Associate Agreement is with AWS directly, not with a platform vendor that might change their compliance posture.
  • Full audit trail access: CloudTrail, VPC Flow Logs, and all AWS-native logging tools remain under your control.
  • No vendor concentration risk: Your compliance posture does not depend on a single platform vendor's security practices.

This architecture means Convox rack parameters provide an additional layer of infrastructure audit documentation on top of your existing AWS compliance controls, not as a replacement for them.

Practical Implementation for Healthcare Startups

Here is a complete example of setting up a HIPAA-ready Karpenter configuration through Convox:

# Enable Karpenter with security-focused defaults
$ convox rack params set \
  karpenter_auth_mode=true \
  karpenter_enabled=true \
  karpenter_capacity_types=on-demand \
  karpenter_node_expiry=720h \
  karpenter_consolidation_enabled=true \
  karpenter_consolidate_after=5m \
  karpenter_cpu_limit=100 \
  karpenter_memory_limit_gb=400 \
  -r production
Setting parameters... OK

This configuration establishes several compliance-relevant controls:

  • On-demand instances only (no spot interruptions for production workloads)
  • 30-day maximum node lifetime for automated patching
  • Active consolidation to minimize attack surface (fewer idle nodes)
  • Conservative resource limits as a runaway scaling safeguard

After applying these parameters, export the configuration for your compliance documentation:

$ convox rack params -r production > infrastructure-config-$(date +%Y%m%d).txt

Store these exports in your compliance documentation repository. When your auditor asks about infrastructure controls, you have timestamped, human-readable evidence of your configuration at any point in time.
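Once several dated exports exist, a common audit request is "show me the configuration as of a given date." The Python sketch below picks the newest export taken on or before that date, assuming the `infrastructure-config-YYYYMMDD.txt` naming produced by the command above. The `export_in_effect` helper and the sample filenames are illustrative. Note that an export is a snapshot: it shows the configuration at export time, not any changes made between exports.

```python
import re
from datetime import date, datetime

_PATTERN = re.compile(r"^infrastructure-config-(\d{8})\.txt$")


def export_in_effect(filenames, on):
    """Return the newest export dated on or before `on`, or None if there is none."""
    dated = []
    for name in filenames:
        match = _PATTERN.match(name)
        if not match:
            continue  # unrelated file in the repository
        try:
            exported = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a real calendar date
        if exported <= on:
            dated.append((exported, name))
    return max(dated)[1] if dated else None


# Illustrative repository listing.
files = [
    "infrastructure-config-20240101.txt",
    "infrastructure-config-20240215.txt",
    "infrastructure-config-20240401.txt",
    "notes.txt",
]
print(export_in_effect(files, date(2024, 3, 10)))
```

Exporting on a fixed schedule, and again after every parameter change, keeps the gap between snapshots small enough for this lookup to be meaningful.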

Moving Beyond Terraform Drift

The fundamental difference between managing infrastructure through custom Terraform and through Convox rack parameters is the elimination of drift as a possibility. With custom Terraform:

  • State can drift from committed configuration if someone runs apply without committing
  • Running infrastructure can drift from state if someone makes console changes
  • Multiple state files can exist for the same resources
  • Detecting drift requires running terraform plan and understanding the output

With Convox rack parameters:

  • Parameters are the source of truth; there is no separate state to drift
  • Every parameter change is an API event with a clear audit trail
  • Current configuration is always available through convox rack params
  • No Terraform expertise required to understand or modify configuration

This matters for infrastructure change documentation because your audit evidence is always accurate. You never have to explain why your documented configuration differs from your running infrastructure, because they cannot differ.

The Team Expertise Reality

Most healthcare startups are not infrastructure companies. You are building patient engagement platforms, telehealth solutions, clinical workflow tools, or health data analytics. Your competitive advantage comes from your domain expertise and your product, not from your ability to manage Kubernetes clusters.

HIPAA compliance is a requirement, not a differentiator. The question is not whether you need infrastructure controls and audit documentation. The question is how much of your team's limited engineering bandwidth you want to spend maintaining those controls.

Convox's approach assumes you do not have a dedicated DevOps engineer. The Karpenter integration exposes the functionality you need for compliance through a simple parameter interface, without requiring your backend engineers to become infrastructure specialists.

When your compliance officer asks "Who changed the node configuration last month and why?", you can answer immediately. The Convox Console shows exactly which user set which parameters at which timestamp. No forensic investigation required. No Terraform state archaeology. Just clear, auditable infrastructure change documentation that satisfies HIPAA's requirement for documented, repeatable controls.

Get Started

Convox deploys into your own AWS account, keeping PHI within your infrastructure boundary while providing the declarative infrastructure controls you need for HIPAA compliance. The Getting Started Guide walks through installation and your first deployment.

Create a free account and deploy your first Rack in minutes. For healthcare organizations with specific compliance requirements, reach out to our team to discuss HIPAA-ready infrastructure configurations.

Let your team focus on what matters.