A backend developer at a fast-growing fintech company needs to deploy a new microservice. They open a ticket with the DevOps team requesting a Kubernetes namespace, an RDS database, a CI/CD pipeline, and access to monitoring. The ticket sits in a queue for two weeks. When it finally gets processed, the DevOps engineer copy-pastes configuration from another service, forgets to enable encryption on the database, and grants overly permissive IAM roles because figuring out the correct permissions would take another day.
This scenario plays out thousands of times daily across the industry. Developers wait while operations teams become bottlenecks. Corners get cut on security because proper configuration is hard. Shadow IT emerges as frustrated developers spin up resources outside approved channels. Everyone loses.
Platform engineering offers a different path: build self-service infrastructure where doing the right thing is the easy thing. Instead of tickets and wait times, developers get a catalog of pre-approved, secure components they can provision themselves. Instead of copy-pasted configs that drift from security standards, they get golden paths—opinionated templates where security is built in from the start.
The Platform Engineering Revolution
Platform engineering emerged from a simple observation: the DevOps promise of "you build it, you run it" created cognitive overload. Developers are excellent at writing business logic. Expecting them to also be experts in Kubernetes, Terraform, observability, security, and compliance is unrealistic. The result was either teams that moved slowly while learning infrastructure, or teams that moved fast while cutting corners.
An Internal Developer Platform (IDP) solves this by providing a product-like experience for infrastructure. Think of it as the "Amazon shopping experience" for development resources: a curated catalog of products (infrastructure components), one-click provisioning, and standardized delivery (deployment pipelines)—all with guardrails that prevent unsafe choices.
The key insight is that abstraction is not about hiding complexity—it's about encoding expertise. When your platform team configures a database template, they embed decisions about encryption, backup retention, network isolation, and access patterns. Developers consuming that template get all those decisions for free, without needing to understand them.
Golden Paths: The Secure Way is the Easy Way
Golden paths are the core concept in platform engineering. They're not restrictions—they're paved roads. Just as a highway gets you to your destination faster than bushwhacking through the wilderness, golden paths get developers to production faster than ad-hoc configuration.
Consider the difference between these two experiences:
Without golden paths: A developer wants to create a new service. They look at an existing service's repository, copy files they think are relevant, modify them based on half-remembered conversations, open PRs to infrastructure repos they don't fully understand, wait for reviews from overloaded platform engineers, fix issues found in review, deploy to staging, discover they forgot to configure logging, fix that, redeploy, discover the service can't connect to the database because security groups are wrong, file a ticket, wait...
With golden paths: A developer opens the developer portal, clicks "Create New Service," fills in a form (service name, team owner, data classification), and clicks submit. Three minutes later, they have a repository with a working CI/CD pipeline, a Kubernetes namespace with appropriate resource limits, network policies configured, security scanning enabled, logging integrated, and documentation automatically generated. They write their business logic and deploy.
The magic is that the second approach isn't just faster—it's also more secure. Every security control is built into the template. Developers can't forget to enable encryption because the template doesn't allow unencrypted options.
Building a Secure Service Template with Backstage
Backstage, originally developed at Spotify, has become the de facto standard for developer portals. Its templating system lets you define golden paths that generate entire project scaffolds with a few form inputs.
Here's a template that creates a production-ready microservice with security built in from the start:
# Backstage template: Secure Microservice Generator
# This template creates a complete, production-ready service
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: secure-microservice
  title: Secure Microservice
  description: |
    Creates a production-ready microservice with:
    - Hardened container configuration
    - Security scanning in CI/CD
    - Branch protection and code review requirements
    - Automatic registration in service catalog
spec:
  owner: platform-team
  type: service

  # Gather information from the developer
  parameters:
    - title: Service Details
      required: [name, owner]
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z][a-z0-9-]*$'
          description: Lowercase letters, numbers, and hyphens only
        owner:
          title: Owning Team
          type: string
          ui:field: OwnerPicker
          description: The team responsible for this service
        dataClassification:
          title: Data Classification
          type: string
          enum: [public, internal, confidential, restricted]
          default: internal
          description: |
            Determines security controls applied:
            - Public: No sensitive data
            - Internal: Employee-only data
            - Confidential: Customer data, requires encryption
            - Restricted: Financial/health data, requires audit logging

  # Execute these steps to create the service
  steps:
    # Generate code from secure skeleton template
    - id: fetch
      name: Generate Secure Code Skeleton
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
          dataClassification: ${{ parameters.dataClassification }}

    # Create repository with security settings enabled
    - id: publish
      name: Create Repository
      action: publish:github
      input:
        repoUrl: github.com?owner=myorg&repo=${{ parameters.name }}
        defaultBranch: main
        # Security: Require reviews before merging to main
        branchProtectionEnabled: true
        requireCodeOwnerReviews: true
        dismissStaleReviews: true
        # Security: Block merges until checks pass
        requiredStatusChecks:
          - security-scan   # Vulnerability scanning
          - secret-scan     # Check for leaked secrets
          - tests           # Unit/integration tests
          - lint            # Code quality

    # Add to service catalog for visibility and governance
    - id: register
      name: Register in Service Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml

    # Trigger security setup workflow
    - id: security
      name: Configure Security Scanning
      action: github:actions:dispatch
      input:
        repoUrl: github.com?owner=myorg&repo=${{ parameters.name }}
        workflowId: setup-security.yml
        workflowInputs:
          dataClassification: ${{ parameters.dataClassification }}
Notice how security is woven throughout: branch protection is enabled by default, code reviews are required, security scans must pass before merging, and data classification drives what controls get applied. A developer creating a service handling "restricted" data automatically gets stricter encryption requirements and audit logging—without needing to know those requirements exist.
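The `setup-security.yml` workflow dispatched in the final step is where data classification turns into concrete controls. As a rough sketch (the job layout and the helper scripts here are assumptions for illustration, not part of the template above), it might branch on the classification input like this:

```yaml
# Hypothetical setup-security.yml - illustrative only
name: Setup Security
on:
  workflow_dispatch:
    inputs:
      dataClassification:
        required: true
        type: string
jobs:
  configure:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Stricter controls kick in automatically for higher classifications
      - name: Enable audit logging for restricted data
        if: inputs.dataClassification == 'restricted'
        run: ./scripts/enable-audit-logging.sh      # hypothetical helper
      - name: Require customer-managed encryption keys
        if: contains('confidential restricted', inputs.dataClassification)
        run: ./scripts/require-kms-encryption.sh    # hypothetical helper
```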
The Secure Container: Dockerfile as a Golden Path
Container security is notoriously easy to get wrong. A typical internet tutorial produces a Dockerfile that runs as root, builds on a full OS image carrying hundreds of unnecessary packages, and defines no health checks. Each of these is a security issue.
Golden path Dockerfiles encode security best practices:
# Golden Path Dockerfile - Security by Default
# This is what developers get when they use our template

# Build stage: Install dependencies and compile
FROM cgr.dev/chainguard/node:latest AS builder
# Why Chainguard? These images are rebuilt daily with latest patches,
# contain minimal packages (smaller attack surface), and include SBOMs
WORKDIR /app

# Install dependencies first (better layer caching)
COPY package*.json ./
RUN npm ci --omit=dev --ignore-scripts
# --omit=dev: Skip devDependencies in the production image
# --ignore-scripts: Don't run arbitrary postinstall scripts
# This prevents supply chain attacks via malicious npm scripts

COPY . .
RUN npm run build

# Production stage: Minimal runtime image
FROM cgr.dev/chainguard/node:latest
# Using same base ensures compatibility and consistency
WORKDIR /app

# Copy only what's needed for production
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules

# Security: Run as non-root user
# Chainguard images use 'nonroot' by default, but be explicit
USER nonroot:nonroot

# Security: Drop all capabilities and prevent privilege escalation
# This is enforced at runtime via Kubernetes, but document the intent

# Observability: Health check for Kubernetes probes
# Exec form, because distroless Chainguard images ship no shell
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s \
  CMD ["node", "healthcheck.js"]

# Document the port (doesn't actually expose it)
EXPOSE 8080

# Use array syntax to prevent shell interpretation issues
CMD ["node", "dist/index.js"]
When developers use the golden path, they get all these security decisions for free. They don't need to know why Chainguard images are better than the official Node.js images, or why running as non-root matters. The platform team made those decisions once, and every service benefits.
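The Dockerfile's comment about dropping capabilities points at the runtime half of the golden path: the Kubernetes manifests the template generates can pin those settings in the pod spec. A minimal sketch, using standard `securityContext` fields (the pod and image names are illustrative):

```yaml
# Pod-level and container-level hardening the template can generate
apiVersion: v1
kind: Pod
metadata:
  name: example-service
spec:
  securityContext:
    runAsNonRoot: true              # Kubelet rejects images that resolve to root
    seccompProfile:
      type: RuntimeDefault          # Default syscall filtering
  containers:
    - name: app
      image: ghcr.io/myorg/example-service:latest
      securityContext:
        allowPrivilegeEscalation: false   # Block setuid-style escalation
        readOnlyRootFilesystem: true      # Immutable container filesystem
        capabilities:
          drop: ["ALL"]                   # Drop every Linux capability
```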
Self-Service with Guardrails: Freedom Within Boundaries
The traditional approach to infrastructure security was gatekeeping: developers request resources, operations approve or deny. This doesn't scale. Platform engineering replaces gatekeeping with guardrails: developers can provision resources themselves, but the platform constrains what's possible.
Consider database provisioning. The old model: developer files a ticket, DBA reviews the request, provisions the database manually, hopefully remembers to enable encryption, sets up backups, configures access. Lots of human decisions, lots of opportunities for mistakes.
The platform engineering model: developer selects "PostgreSQL Database" from the catalog, specifies size and name, clicks create. The platform provisions a database that's automatically encrypted, backed up, isolated to the correct network, and accessible only from their services. No human decisions, no mistakes possible.
Crossplane: Kubernetes-Native Infrastructure Provisioning
Crossplane extends Kubernetes to provision infrastructure across any cloud provider. Combined with compositions, you can create self-service APIs for cloud resources with security built in:
# Crossplane Composition: Secure PostgreSQL Database
# This defines what developers get when they request a database
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: secure-postgres-aws
  labels:
    provider: aws
    database: postgres
spec:
  compositeTypeRef:
    apiVersion: database.company.io/v1alpha1
    kind: PostgresInstance
  resources:
    # The actual RDS instance
    - name: rds-instance
      base:
        apiVersion: rds.aws.upbound.io/v1beta1
        kind: Instance
        spec:
          forProvider:
            engine: postgres
            engineVersion: "15"
            # These security settings are ENFORCED - developers can't change them
            storageEncrypted: true          # Data at rest encryption
            deletionProtection: true        # Prevent accidental deletion
            publiclyAccessible: false       # Never expose to internet
            autoMinorVersionUpgrade: true   # Automatic security patches
            # Backup policy - also enforced
            backupRetentionPeriod: 7        # Keep 7 days of backups
            backupWindow: "03:00-04:00"     # Backups during off-hours
            copyTagsToSnapshot: true        # Maintain tagging in backups
            # Monitoring - developers don't need to configure this
            enabledCloudwatchLogsExports:
              - postgresql
              - upgrade
            performanceInsightsEnabled: true
            performanceInsightsRetentionPeriod: 7
      # These fields can be customized by developers
      patches:
        - fromFieldPath: "spec.size"
          toFieldPath: "spec.forProvider.instanceClass"
          transforms:
            - type: map
              map:
                small: db.t3.micro
                medium: db.t3.small
                large: db.t3.medium
        - fromFieldPath: "spec.storageGB"
          toFieldPath: "spec.forProvider.allocatedStorage"

    # Security group - tightly controlled network access
    - name: security-group
      base:
        apiVersion: ec2.aws.upbound.io/v1beta1
        kind: SecurityGroup
        spec:
          forProvider:
            description: "Database SG - managed by platform"
            # Only allow connections from application VPC
            ingress:
              - fromPort: 5432
                toPort: 5432
                protocol: tcp
                # CIDR is patched based on environment
            # No egress - database doesn't need to initiate connections
            egress: []
Now developers can create databases with a simple Kubernetes manifest:
# What developers actually write - the platform handles the rest
apiVersion: database.company.io/v1alpha1
kind: PostgresInstance
metadata:
  name: orders-db
  namespace: orders-team
spec:
  size: medium
  storageGB: 50
That's it. From that short manifest, they get an encrypted, backed-up, monitored, properly networked PostgreSQL database. They can't accidentally make it public, skip encryption, or disable backups—those options simply don't exist in the API they're using.
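The reason unsafe options "don't exist" is that the developer-facing API is itself defined by the platform team, via a CompositeResourceDefinition (XRD). A rough sketch of what could sit behind `PostgresInstance` (the schema details are assumptions, and claim/composite naming is simplified here) shows that `size` and `storageGB` are the only knobs exposed:

```yaml
# Hypothetical XRD: the schema that defines the developer-facing API
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xpostgresinstances.database.company.io
spec:
  group: database.company.io
  names:
    kind: XPostgresInstance
    plural: xpostgresinstances
  claimNames:
    kind: PostgresInstance        # The namespaced kind developers create
    plural: postgresinstances
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                size:
                  type: string
                  enum: [small, medium, large]  # Only platform-approved sizes
                storageGB:
                  type: integer
                  maximum: 500                  # Platform-set ceiling
              required: [size, storageGB]
```

Anything not in this schema, such as `publiclyAccessible` or `storageEncrypted`, is simply unreachable from the developer side.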
Policy as Code: Automated Guardrails
Golden paths work well when developers use them. But what about resources created outside the golden path? Policy as code creates a safety net that catches misconfigurations regardless of how resources were created.
The key insight is that policies should be preventative, not detective. It's better to block a misconfigured resource from deploying than to detect it running in production. Kubernetes admission controllers make this possible—they intercept every resource creation request and can reject those that violate policy.
Gatekeeper: The Kubernetes Policy Enforcer
OPA Gatekeeper integrates Open Policy Agent with Kubernetes admission control. Here are policies that prevent common security mistakes:
# Policy: Containers Must Run as Non-Root
# Why: Running as root means a container escape gives attackers root on the host
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredSecurityContext
metadata:
  name: require-non-root
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces: ["kube-system", "gatekeeper-system"]
  parameters:
    runAsNonRoot: true
    allowPrivilegeEscalation: false
    requiredDropCapabilities: ["ALL"]
    readOnlyRootFilesystem: true

# If someone tries to deploy this:
#   securityContext:
#     runAsUser: 0   # root!
#
# Gatekeeper rejects it with:
#   "Container must not run as root user"
---
# Policy: Containers Must Use Approved Base Images
# Why: Prevent developers from using unscanned, vulnerable images
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: approved-images-only
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      # Only allow images from trusted, scanned registries
      - "gcr.io/distroless/"   # Google's minimal images
      - "cgr.dev/chainguard/"  # Chainguard hardened images
      - "ghcr.io/myorg/"       # Our organization's registry

# If someone tries to deploy this:
#   image: some-random-dockerhub/image:latest
#
# Gatekeeper rejects it with:
#   "Image from unauthorized repository"
---
# Policy: All Containers Must Have Resource Limits
# Why: Without limits, one container can starve others (DoS) or cause cost explosion
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sContainerLimits
metadata:
  name: require-resource-limits
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    # Maximum allowed limits (prevents runaway costs)
    cpu: "2"
    memory: "4Gi"
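Constraint kinds like `K8sAllowedRepos` aren't built into Gatekeeper; each one is defined by a ConstraintTemplate that carries the validation logic in Rego. A compressed sketch of what could sit behind the image-repository policy (adapted loosely from the community gatekeeper-library; details here are illustrative):

```yaml
# Hypothetical ConstraintTemplate backing the K8sAllowedRepos constraint
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos          # The kind the constraint above instantiates
      validation:
        openAPIV3Schema:
          type: object
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          # Image must start with at least one approved repo prefix
          satisfied := [ok | repo := input.parameters.repos[_]
                             ok := startswith(container.image, repo)]
          not any(satisfied)
          msg := sprintf("Image from unauthorized repository: %v", [container.image])
        }
```

The platform team writes the template once; individual constraints like `approved-images-only` are then just parameterized instances of it.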
These policies are evaluated on every pod creation—whether the pod comes from a golden path template, a manually-written manifest, or a third-party Helm chart. Nothing escapes the policy check.
Terraform Sentinel: Catching Infrastructure Misconfigurations
For infrastructure outside Kubernetes (S3 buckets, IAM roles, networking), Terraform Sentinel policies provide similar guardrails:
# Sentinel Policy: S3 Buckets Must Be Secure
import "tfplan/v2" as tfplan

# Find all S3 buckets being created or modified
s3_buckets = filter tfplan.resource_changes as _, rc {
    rc.type is "aws_s3_bucket" and
    rc.mode is "managed" and
    (rc.change.actions contains "create" or
     rc.change.actions contains "update")
}

# Rule: All buckets must have encryption enabled
encryption_enabled = rule {
    all s3_buckets as _, bucket {
        bucket.change.after.server_side_encryption_configuration is not null
    }
}

# Rule: All buckets must block public access
public_access_blocked = rule {
    all s3_buckets as _, bucket {
        bucket.change.after.block_public_acls is true and
        bucket.change.after.block_public_policy is true and
        bucket.change.after.ignore_public_acls is true and
        bucket.change.after.restrict_public_buckets is true
    }
}

# Rule: All buckets must have versioning (for recovery from ransomware/deletion)
versioning_enabled = rule {
    all s3_buckets as _, bucket {
        bucket.change.after.versioning[0].enabled is true
    }
}

# Main policy: All rules must pass
main = rule {
    encryption_enabled and
    public_access_blocked and
    versioning_enabled
}

# If someone tries to create a public bucket, Terraform plan fails:
#   "Policy check failed: public_access_blocked rule returned false"
This policy runs during `terraform plan`, before any resources are created. A developer can't even see what a misconfigured bucket would look like—the plan itself is rejected.
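How strictly a Sentinel policy applies is configured separately from the policy itself, in the policy set's `sentinel.hcl`. A sketch (file names assumed) showing the policy pinned at the strictest enforcement level:

```hcl
# sentinel.hcl - attaches policies to the policy set with enforcement levels
policy "s3-buckets-must-be-secure" {
  source            = "./s3-buckets-must-be-secure.sentinel"
  enforcement_level = "hard-mandatory"  # Failure blocks the run; no override
}
```

Sentinel supports three levels: "advisory" (warn only), "soft-mandatory" (overridable by authorized users), and "hard-mandatory" (never overridable), which is the right default for security policies like this one.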
Integrated Security Scanning: Shift Left, Shift Everywhere
Security scanning shouldn't be something developers think about—it should happen automatically, constantly, invisibly. Platform engineering integrates scanning into every stage of the software lifecycle.
The Unified Security Pipeline
Rather than asking each team to configure their own security tools, the platform provides a reusable workflow that teams can call with one line:
# Platform-provided reusable workflow: Comprehensive Security Scanning
# Teams include this in their CI with: uses: ./.github/workflows/security.yml
name: Platform Security Pipeline

on:
  workflow_call:
    inputs:
      image:
        required: true
        type: string
        description: Container image to scan

jobs:
  security-scan:
    runs-on: ubuntu-latest
    permissions:
      security-events: write  # For uploading to GitHub Security tab
      contents: read
    steps:
      - uses: actions/checkout@v4

      # Secret Detection: Find leaked credentials before they reach production
      - name: Scan for Secrets
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          extra_args: --only-verified  # Reduce false positives

      # Static Analysis: Find security bugs in code
      - name: Static Application Security Testing (SAST)
        uses: returntocorp/semgrep-action@v1
        with:
          # Use curated rulesets for security issues
          config: >
            p/security-audit
            p/owasp-top-ten
            p/nodejs
            p/typescript

      # Dependency Scanning: Find vulnerable packages
      - name: Scan Dependencies (SCA)
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'  # Fail build on critical/high

      # Container Scanning: Find issues in the built image
      - name: Scan Container Image
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ inputs.image }}
          severity: 'CRITICAL,HIGH'
          exit-code: '1'

      # Infrastructure as Code Scanning: Find misconfigurations
      - name: Scan IaC (Terraform/Kubernetes)
        uses: bridgecrewio/checkov-action@master
        with:
          directory: ./infrastructure
          framework: terraform,kubernetes,dockerfile
          soft_fail: false  # Fail on any issue

      # Generate SBOM for compliance and incident response
      - name: Generate Software Bill of Materials
        uses: anchore/sbom-action@v0
        with:
          image: ${{ inputs.image }}
          artifact-name: sbom-${{ github.run_id }}
Individual teams don't need to understand each scanning tool, configure rules, or handle updates. They call the platform workflow, and security happens:
# In a team's CI workflow - one line enables comprehensive security
jobs:
  security:
    uses: platform-team/workflows/.github/workflows/security.yml@v1
    with:
      image: ghcr.io/myorg/${{ github.repository }}:${{ github.sha }}
The Developer Experience: Making Security Invisible
The ultimate goal of platform engineering is to make security invisible—not absent, but so seamlessly integrated that developers don't have to think about it. Security becomes a property of the platform, not a task for developers.
This requires more than just automation; it requires excellent developer experience:
- Security scan results appear in pull requests, not in separate dashboards developers never check
- Remediation guidance is specific and actionable: "Upgrade lodash from 4.17.15 to 4.17.21 to fix CVE-2020-8203" instead of "Critical vulnerability found"
- The service catalog shows security posture at a glance: Which services have open vulnerabilities? When was the last security scan?
- Documentation is generated, not written: Golden paths produce consistent architecture diagrams, runbooks, and security documentation
- Compliance evidence is collected automatically: SOC 2 auditors get reports, not interview requests
Security Visibility in the Developer Portal
Backstage plugins integrate security information directly into the developer experience:
# Service catalog entry with security annotations
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  annotations:
    # Integrations pull security data into the portal
    snyk.io/org-name: myorg
    snyk.io/project-ids: payment-service
    trivy.dev/image: ghcr.io/myorg/payment-service:latest
    sonarqube.org/project-key: payment-service
    # Compliance tracking
    compliance.company.io/frameworks: "pci-dss,soc2"
    compliance.company.io/data-classification: "restricted"
    compliance.company.io/last-audit: "2025-01-15"
    compliance.company.io/audit-status: "passed"
spec:
  type: service
  lifecycle: production
  owner: payments-team
  dependsOn:
    - resource:payments-database
    - component:auth-service
When developers view the payment-service in Backstage, they see a unified dashboard showing vulnerability counts from Snyk, code quality from SonarQube, container scan results from Trivy, compliance status, dependencies, and team ownership—all in one place.
Continuous Compliance: Security as a Feature
Compliance is often treated as periodic audit preparation—scramble to collect evidence, document controls, and pray you pass. Platform engineering transforms compliance from an event into a continuous process.
When security controls are built into golden paths, compliance evidence generates automatically:
- Encryption at rest: Proven by Crossplane configurations that don't allow unencrypted databases
- Access control: Proven by Gatekeeper policies that enforce RBAC
- Vulnerability management: Proven by CI/CD scan results showing all deployments pass security checks
- Change management: Proven by GitHub branch protection requiring code reviews
- Audit logging: Proven by platform configuration that enables logging for all services
When auditors ask "how do you ensure databases are encrypted?", the answer isn't "we train developers to enable encryption." It's "our database provisioning API doesn't have an unencrypted option. Here's the Crossplane composition proving it."
Building Your Platform: Where to Start
Platform engineering is a journey, not a destination. You don't need to build everything at once. Start with the highest-impact, lowest-effort improvements:
Phase 1: Golden Paths for Common Patterns
Identify your organization's most common deployment patterns. What does 80% of your software look like? Build golden paths for those first. A template for "Node.js microservice with PostgreSQL" might cover half your organization's services.
Phase 2: Policy Guardrails
Deploy policy enforcement for the highest-risk misconfigurations. Running containers as root, using public S3 buckets, skipping encryption—these are the mistakes that cause breaches. Block them with policy before adding more sophisticated controls.
Phase 3: Self-Service Infrastructure
Replace tickets with APIs. If teams frequently request databases, message queues, or cache clusters, build self-service provisioning. Each ticket eliminated is developer time saved and standardization improved.
Phase 4: Developer Portal
Unify the experience with a developer portal. Backstage is the standard choice, but the key is having one place where developers can discover services, create new ones, view security status, and find documentation.
The Platform Mindset
The most important change in platform engineering isn't tooling—it's mindset. Platform teams are product teams. Their customers are internal developers. Their success metric is developer productivity, not infrastructure metrics.
When security is built into the platform, it stops being friction and becomes a feature. Developers don't complain about security requirements; they appreciate that the platform handles compliance for them. Security teams don't fight with developers; they collaborate on building better golden paths.
The goal isn't to control developers—it's to free them. Free them from wrestling with infrastructure. Free them from deciphering compliance requirements. Free them from configuring security tools. Free them to write the business logic that actually matters. That's what platform engineering is about.