AWS Infrastructure
Overview of the AWS services powering TerraGuard in production, including the migration from serverless Lambda to the Go event processor.
Overview
TerraGuard runs on AWS with a mix of managed services and self-hosted components on EC2. The architecture balances cost efficiency with operational simplicity, using App Runner for the backend API and EC2 for services that require persistent connections or specialized runtimes.
AWS Services
Compute
| Service | Purpose | Details |
|---|---|---|
| App Runner | Backend API hosting | Auto-scaling, deploys from ECR images |
| EC2 (Server B) | Support services | c7g.xlarge ARM64, runs crawler + geopop (the standalone search service was removed) |
Storage
| Service | Purpose | Details |
|---|---|---|
| RDS PostgreSQL | Primary database | PostGIS + pgVector extensions enabled |
| S3 | Static assets & reports | Generated PDF reports, uploaded documents |
| ECR | Container registry | Docker images under terra-guard/ namespace |
Networking & Delivery
| Service | Purpose | Details |
|---|---|---|
| CloudFront | CDN | Caches static assets, terminates TLS |
| Route 53 | DNS | Domain management for all services |
Application Integrations (Runtime)
Beyond hosting, the Backend API calls these AWS services directly at runtime (via boto3,
authenticated with AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION):
| Service | Purpose | Details |
|---|---|---|
| S3 | Document storage | Client uploads go straight to S3 via presigned URLs; the extraction pipeline reads them back. Bucket from AWS_S3_BUCKET_NAME, keyed under events/documents/. |
| SES | Transactional email | Auth emails, notifications, and reports. In local/dev, USE_MAILPIT=true swaps SES for Mailpit so nothing leaves the machine. |
| Textract | Document OCR | Optional, off by default (AWS_TEXTRACT_ENABLED=false). When enabled, extracts text from scanned PDFs/images during document ingestion. |
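The settings in the table above could be read along these lines. This is a minimal sketch, not the backend's actual code: `AwsRuntimeConfig`, `env_flag`, `document_key`, and `presigned_upload_url` are hypothetical names, and the key layout below the `events/documents/` prefix (a document ID followed by the file name) is an assumption. Only the environment variable names and the prefix come from this page.

```python
import os
import posixpath
import uuid
from dataclasses import dataclass

# Prefix taken from the table above; everything below it is an assumed layout.
DOCUMENT_PREFIX = "events/documents/"


def env_flag(env, name, default=False):
    """Interpret an environment value such as 'true' or '0' as a boolean."""
    value = env.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class AwsRuntimeConfig:  # hypothetical container, not the backend's real class
    region: str
    bucket: str
    use_mailpit: bool
    textract_enabled: bool

    @classmethod
    def from_env(cls, env=os.environ):
        return cls(
            region=env.get("AWS_REGION", ""),
            bucket=env.get("AWS_S3_BUCKET_NAME", ""),
            use_mailpit=env_flag(env, "USE_MAILPIT"),
            # Textract is off unless explicitly enabled.
            textract_enabled=env_flag(env, "AWS_TEXTRACT_ENABLED"),
        )


def document_key(filename, doc_id=None):
    """Build an object key under the documents prefix (assumed layout)."""
    doc_id = doc_id or uuid.uuid4().hex
    # Drop any directory part the client sent along with the file name.
    return f"{DOCUMENT_PREFIX}{doc_id}/{posixpath.basename(filename)}"


def presigned_upload_url(cfg, key, expires_in=900):
    """Return a presigned PUT URL so the client can upload straight to S3."""
    import boto3  # imported lazily so the helpers above work without boto3

    # boto3 picks up AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY on its own;
    # the region is passed explicitly because it is read from AWS_REGION here.
    s3 = boto3.client("s3", region_name=cfg.region)
    return s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": cfg.bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Keeping the boolean parsing in one helper means `USE_MAILPIT` and `AWS_TEXTRACT_ENABLED` are interpreted the same way, and an unset Textract flag resolves to disabled, matching the default described above.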
SQS is no longer in the ingestion path. The Go event processor polls sources and writes
to PostgreSQL directly; AWS_SQS_URL and the SQS handler remain only for replaying captured
message dumps in tests. See Legacy Services.
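Replaying a captured dump in a test might look like the sketch below. It assumes the dump was saved in the shape of an SQS `ReceiveMessage` response (a `Messages` list whose entries carry a JSON string in `Body`); the real dump format and handler signature are not documented here, so `replay_dump` and the `handler` callback are hypothetical.

```python
import json


def replay_dump(dump_text, handler):
    """Feed every captured message body to handler and return the count."""
    dump = json.loads(dump_text)
    handled = 0
    for message in dump.get("Messages", []):
        # Assumption: each Body is itself a JSON document.
        handler(json.loads(message["Body"]))
        handled += 1
    return handled
```

Because the replay only needs the dump text and a callable, a test can run it without AWS credentials or a live queue.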
Legacy Services (Deprecated)
| Service | Former Purpose | Status |
|---|---|---|
| SQS | Event ingestion queue | Replaced by Go event processor direct writes |
| DynamoDB | Event deduplication state | Replaced by PostgreSQL-based dedup |
| Lambda | Event processing functions | Replaced by Go event processor |
AWS CLI Configuration
All AWS CLI commands use the tg profile:
# Configure the profile
aws configure --profile tg
# Example commands
aws ecr get-login-password --profile tg --region us-east-1
aws s3 ls s3://terraguard-assets --profile tg
aws ssm send-command --profile tg --instance-ids i-05033852181296c97 ...
Migration: Lambda + SQS to Go Event Processor
The original event ingestion pipeline used a serverless architecture built on Lambda, SQS, and DynamoDB (see Legacy Services).
This was replaced with the Go event processor for several reasons:
- Cold start latency -- Lambda cold starts added 3-5 seconds to event processing, delaying time-sensitive disaster alerts
- SQS complexity -- Dead letter queues, retry policies, and visibility timeouts added operational overhead with little benefit for the throughput level
- DynamoDB cost -- Per-request pricing for deduplication lookups became expensive as the polling frequency increased
- Debugging difficulty -- Distributed traces across Lambda, SQS, and DynamoDB were hard to follow compared to a single-process log stream
The new architecture is a single Go binary that handles polling, deduplication, normalization, and database writes.
The event processor runs as a Docker container on Server B alongside the other Go services. It polls data sources on a configurable schedule, deduplicates against PostgreSQL, and sends a webhook to the backend API to trigger the enrichment pipeline via Inngest.
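The normalize, deduplicate, write, and notify flow described above can be illustrated with a short sketch. It is written in Python for readability and is not the Go implementation: an in-memory SQLite database stands in for PostgreSQL, the `events` table and its columns are hypothetical, the sample feed name is invented, and `notify` is a placeholder for the webhook call to the backend API. The one technique it is meant to show is deduplication through a unique key and `ON CONFLICT DO NOTHING`, which PostgreSQL also supports.

```python
import hashlib
import sqlite3


def open_demo_db():
    """In-memory SQLite stand-in for PostgreSQL (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "dedup_key TEXT PRIMARY KEY, source TEXT NOT NULL, title TEXT NOT NULL)"
    )
    return conn


def normalize(raw):
    """Bring a raw source record into one canonical shape."""
    return {
        "source": raw["source"].strip().lower(),
        "external_id": str(raw["id"]).strip(),
        "title": raw.get("title", "").strip(),
    }


def dedup_key(event):
    """Stable key: the same source and external ID always hash the same."""
    material = f'{event["source"]}:{event["external_id"]}'
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def ingest(conn, raw_events, notify):
    """Insert unseen events, then notify once per new event."""
    new_events = []
    for raw in raw_events:
        event = normalize(raw)
        cur = conn.execute(
            "INSERT INTO events (dedup_key, source, title) VALUES (?, ?, ?) "
            "ON CONFLICT (dedup_key) DO NOTHING",
            (dedup_key(event), event["source"], event["title"]),
        )
        if cur.rowcount == 1:  # 0 means the key already existed
            new_events.append(event)
    conn.commit()
    # Notify only after the commit so the receiver can read the rows.
    for event in new_events:
        notify(event)
    return len(new_events)
```

Letting the database enforce uniqueness keeps the check and the write in a single statement, so a repeated poll of the same source inserts nothing and triggers no further notifications.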
Cost Optimization
Key cost decisions in the current architecture:
- App Runner over ECS/EKS -- Simpler scaling model for a single backend service, no cluster management overhead
- Single EC2 instance (Server B) -- Co-locating three lightweight services on one c7g.xlarge is cheaper than running three separate App Runner services
- ARM64 (Graviton) -- Server B uses Graviton processors for better price/performance on Go and Rust workloads
- Inngest Cloud over self-hosted -- Managed job queue eliminates the need for a dedicated Redis instance in production (Redis is only used locally)
- Vercel for frontend -- Free tier covers the Next.js deployment, avoiding App Runner costs for static/SSR content