Cloud & Serverless Architecture

FinOps for Architects: Controlling Cloud Spend Before It Controls Your Business

Sritharan K
March 28, 2026
8 min read

Cloud cost surprises happen at a predictable point in a company's growth: somewhere between the early stage where the bill is small enough to ignore and the growth stage where it is suddenly the third-largest line item on the P&L. By the time finance flags it, the architectural decisions that caused the problem are 18 months old and expensive to reverse.

Architects own this problem whether the job description says so or not. The decisions made during system design determine whether a platform costs $8,000 or $80,000 per month at 10x scale. This post covers the practical patterns and tooling that keep cloud spend rational from day one.

The Tagging Strategy That Makes Everything Else Possible

You cannot optimize what you cannot attribute. The first FinOps problem in most organizations is that nobody knows which team, which product, or which customer is responsible for a given line on the cloud bill. The fix is a mandatory tagging strategy applied at the infrastructure provisioning layer, not enforced by hope.

Enforce tags via policy at the cloud provider level. In AWS, use Service Control Policies (SCPs) to deny resource creation that lacks required tags. In Azure, use Azure Policy with a deny effect. In GCP, use Organization Policies.
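Provider-level deny policies are the backstop; a lightweight pre-deploy check in CI catches missing tags earlier, before the policy rejects the apply. A minimal sketch of such a check — the required tag set (`team`, `product`, `environment`) and the resource dictionary shape are illustrative assumptions, not a standard:

```python
# Required tags every provisioned resource must carry.
# The exact set here is an illustrative assumption.
REQUIRED_TAGS = {"team", "product", "environment"}

def missing_tags(resource_tags: dict[str, str]) -> set[str]:
    """Return the required tags that are absent or blank on a resource."""
    return {
        key for key in REQUIRED_TAGS
        if not resource_tags.get(key, "").strip()
    }

def validate_plan(resources: list[dict]) -> list[str]:
    """Check planned resources (e.g. parsed from a Terraform plan JSON)
    and report tag violations as human-readable strings."""
    errors = []
    for res in resources:
        absent = missing_tags(res.get("tags", {}))
        if absent:
            errors.append(f"{res['address']}: missing tags {sorted(absent)}")
    return errors
```

Run it as a CI step that fails the build when `validate_plan` returns anything; the SCP then only ever fires on changes that bypassed the pipeline.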

Once tags are consistent, you can build cost allocation reports that show exactly what each team or product is spending, down to the service level. This is the data that makes engineering conversations with finance productive rather than defensive.

Right-Sizing: The Most Common Waste

Overprovisioned compute is responsible for 30-40% of wasted cloud spend in most organizations that have not actively addressed it. Services get sized for peak capacity during initial provisioning and never revisited. Three months later you have a production cluster of r5.4xlarge instances running at 8% CPU.

The fix is not to under-provision. It is to build right-sizing reviews into your architecture process and to use auto-scaling rather than static provisioning for variable workloads.

The Vertical Pod Autoscaler in recommendation mode gives you data-driven right-sizing suggestions without automatically restarting pods. Run it for two to four weeks, review the recommendations, then apply them manually. After a few cycles you will have a reliable baseline.
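The core of any recommendation is the same idea: take a high percentile of observed usage and add headroom. A simplified sketch of that calculation — this mirrors the spirit of VPA's recommender but is not its actual algorithm, and the percentile and headroom values are illustrative defaults:

```python
import math

def recommend_cpu_request(samples_millicores: list[float],
                          target_percentile: float = 0.9,
                          headroom: float = 1.15) -> int:
    """Suggest a CPU request (in millicores) from observed usage samples:
    the target percentile of usage, plus a safety margin. A simplified
    illustration of percentile-based right-sizing, not VPA's algorithm."""
    ordered = sorted(samples_millicores)
    # Index of the target percentile within the sorted samples.
    idx = min(len(ordered) - 1, math.ceil(target_percentile * len(ordered)) - 1)
    return math.ceil(ordered[idx] * headroom)
```

Comparing the output against the current request makes the waste concrete: a pod requesting 4000m whose recommendation comes back at 300m is a 13x overprovision.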

Reserved Capacity vs Spot vs On-Demand: The Decision Framework

The pricing difference between these options is significant: On-Demand is the baseline, Reserved Instances are 30-60% cheaper for the same instance type, and Spot Instances are 70-90% cheaper but can be interrupted with two minutes' notice. The right mix depends on workload characteristics.

  • Reserved Instances or Savings Plans: use for baseline compute that runs continuously. API servers, databases, and background workers with predictable load. Commit to 1-year (30-40% savings) or 3-year (50-60% savings) terms only for workloads you are confident will continue.
  • Spot Instances: use for batch jobs, ML training, CI/CD workers, and any workload that can handle interruption. Run Spot instances in a separate node group and never place stateful workloads or primary API servers on Spot.
  • On-Demand: use for stateful workloads where interruption is unacceptable (primary databases, session-bearing services) and for unpredictable burst capacity above your reserved baseline.
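The framework above reduces to a short rule chain. A sketch that encodes it — the trait names and their ordering are my reading of the bullets, not an official decision tree:

```python
def pricing_model(interruptible: bool, stateful: bool,
                  predictable_baseline: bool) -> str:
    """Map workload traits to a pricing model, following the decision
    framework above. An illustrative encoding, not an AWS standard."""
    if interruptible and not stateful:
        return "spot"        # batch jobs, ML training, CI/CD workers
    if predictable_baseline:
        return "reserved"    # steady APIs, databases, background workers
    return "on-demand"       # stateful or unpredictable remainder
```

Encoding the rules this way also makes them auditable: you can run the function over an inventory export and diff the recommended model against what each workload actually runs on.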

Kubernetes Cost Optimization: The Details That Matter

Kubernetes clusters have specific cost patterns that are worth addressing explicitly. The most impactful ones are namespace-level resource quotas, cluster autoscaler configuration, and idle workload cleanup.

OpenCost (the open-source core of Kubecost, now a CNCF project) gives you per-namespace, per-deployment cost allocation inside the cluster, complementing the tag-based attribution you get from your cloud provider. It integrates with Prometheus and surfaces costs in Grafana, so engineers see cost data in the same dashboard where they see latency and error rates.
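The mechanics behind per-namespace allocation are simple to illustrate: a node's cost is split across tenants in proportion to the resources they request. A one-dimensional sketch of that idea (real allocators also weigh memory, GPU, and idle capacity):

```python
def allocate_node_cost(node_cost_per_hour: float,
                       cpu_requests_by_ns: dict[str, float]) -> dict[str, float]:
    """Split a node's hourly cost across namespaces in proportion to
    their CPU requests. A single-dimension illustration of request-based
    allocation; production tools also account for memory and idle cost."""
    total = sum(cpu_requests_by_ns.values())
    if total == 0:
        # No requests: nothing to attribute, all cost is idle.
        return {ns: 0.0 for ns in cpu_requests_by_ns}
    return {
        ns: round(node_cost_per_hour * req / total, 4)
        for ns, req in cpu_requests_by_ns.items()
    }
```

The zero-request branch is the interesting one operationally: cost that cannot be attributed to any namespace is idle spend, and tracking it over time is a direct measure of cluster right-sizing.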

Cost-Aware Architecture Decisions

Some of the most expensive architectural decisions look cheap at design time. These are the ones worth flagging explicitly during architecture reviews:

  • Data transfer costs: egress between availability zones, between regions, and between cloud and internet is not free. A microservices architecture that makes cross-AZ calls on every request can have data transfer costs that exceed compute costs at scale. Co-locate services that communicate heavily.
  • NAT Gateway costs: in AWS, traffic from private subnets to the internet routes through NAT Gateways at $0.045 per GB processed, on top of the hourly charge. A service pulling large external datasets through NAT will generate significant costs. Use VPC endpoints (gateway endpoints for S3 and DynamoDB, interface endpoints for other AWS services) so that traffic to AWS services never touches the NAT Gateway at all.
  • Log volume: shipping 100GB per day to a managed logging service costs roughly $1,500/month before retention. Design log sampling for high-volume low-signal events (health checks, static asset requests) from day one.
  • Database connection counts: serverless functions that each open a new database connection at invocation bypass connection pooling entirely. A PgBouncer or RDS Proxy layer between Lambda and PostgreSQL is not optional at scale.
  • S3 request pricing: the difference between s3:GetObject (per-request billing) and using CloudFront as a caching layer in front of S3 is significant for high-traffic media or asset serving.
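The data transfer line items above are easy to estimate up front with list prices. A back-of-envelope sketch — the rates are public us-east-1 list prices at the time of writing and vary by region, so treat the constants as assumptions to verify against your own bill:

```python
NAT_GATEWAY_PER_GB = 0.045   # AWS NAT Gateway data processing, USD/GB
CROSS_AZ_PER_GB = 0.02       # cross-AZ transfer, USD/GB ($0.01 each direction)

def monthly_transfer_cost(cross_az_gb: float, nat_gb: float) -> float:
    """Rough monthly estimate for the two transfer line items called out
    above. List prices for us-east-1; verify against your region."""
    return round(cross_az_gb * CROSS_AZ_PER_GB + nat_gb * NAT_GATEWAY_PER_GB, 2)
```

Even a rough version of this at design time surfaces the surprises: 10 TB/month of cross-AZ chatter plus 5 TB through NAT is over $400/month before any compute is counted.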

Tooling That Gives Real Visibility

Manual cost review every month does not work. You need automated anomaly detection that alerts when a service's spend increases by more than 20% week-over-week. These tools cover the main use cases:

  • AWS Cost Explorer with anomaly detection: built-in, set up in 10 minutes, alerts on unexpected spend spikes per service or per tag
  • Infracost: integrates into CI/CD pipelines and adds a cost estimate comment to every pull request that changes infrastructure. Engineers see the cost impact of their changes before merge.
  • OpenCost: open-source Kubernetes cost allocation. Runs inside the cluster, integrates with Prometheus and Grafana, gives per-workload cost breakdowns.
  • cloud-nuke / aws-nuke: automated cleanup of unused resources (old AMIs, unused EBS volumes, stopped EC2 instances, orphaned load balancers). Run on a schedule in non-production environments.
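The week-over-week rule from the opening of this section is simple enough to run as a standalone scheduled job against exported billing data, as a complement to the managed tools. A sketch — the input shape (per-service lists of weekly totals) is an assumption for illustration:

```python
def spend_anomalies(weekly_spend: dict[str, list[float]],
                    threshold: float = 0.20) -> list[str]:
    """Flag services whose latest week's spend grew by more than
    `threshold` over the prior week -- the 20% week-over-week rule
    described above, reduced to a standalone check."""
    flagged = []
    for service, weeks in weekly_spend.items():
        if len(weeks) < 2 or weeks[-2] == 0:
            continue  # not enough history to compare
        growth = (weeks[-1] - weeks[-2]) / weeks[-2]
        if growth > threshold:
            flagged.append(f"{service}: +{growth:.0%} week-over-week")
    return flagged
```

Wire the output to the owning team's alert channel, keyed by the tags from the first section, so the person who can explain the spike is the one who sees it.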

The Organizational Side

FinOps is not purely a technical problem. The tools only work if teams see cost as their responsibility. Three practices that make this cultural shift happen:

  • Show teams their own costs: put a cost dashboard in the engineering team's daily standup rotation. Engineers who can see what their services cost make better decisions without being told to.
  • Include cost in pull request reviews: with Infracost in the CI pipeline, cost becomes a first-class code review concern alongside correctness and performance.
  • Set team-level budgets with alerts: give each team a monthly cloud budget and configure alerts at 80% of budget. Finance stops being a surprise and starts being a planning input.

The Architecture Review Checklist

For any new service or significant infrastructure change, these questions should be answered before the design is finalized:

  • What is the estimated monthly cost at current scale, and what does it look like at 5x and 10x?
  • Which pricing model applies: compute, storage, requests, data transfer, or some combination?
  • Where are the cross-AZ or cross-region data flows, and have they been minimized?
  • Is the compute right-sized for the expected load, or has it been over-provisioned for safety?
  • Can any batch or background workloads run on Spot or ARM instances?
  • Does the service produce log or metric volume that is proportional to its business value?
  • Are there cost alerts configured so the team knows within 24 hours if spend doubles unexpectedly?
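The first checklist question does not need a sophisticated model. Splitting spend into a fixed part and a load-proportional part gets you a defensible 5x/10x projection in minutes. A deliberately simple linear sketch — real services often have step changes (a second cluster, a larger database tier) that this ignores:

```python
def project_monthly_cost(fixed: float, per_unit: float,
                         units_now: float, multiplier: float) -> float:
    """Project monthly cost at a scale multiplier, splitting spend into
    a fixed part (control planes, NAT hours, minimum cluster size) and
    a part that scales with load. A linear first approximation; real
    services often have step changes this ignores."""
    return round(fixed + per_unit * units_now * multiplier, 2)
```

For example, $2,000/month of fixed infrastructure plus $0.50 per thousand requests at 10M requests projects to $7,000 today but $52,000 at 10x — exactly the kind of number worth seeing before the design is finalized.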

Cloud cost governance is easier to build into the architecture process than to retrofit after the fact. The window to make the high-leverage decisions is at design time. An hour spent on cost modeling during architecture review is worth ten hours of optimization six months later.

Planning a complex Python or FastAPI migration? I specialize in auditing and executing large-scale backend transformations.

Book a Strategy Call