Server B Deployment
Deployment architecture for Server B, the EC2 instance hosting TerraGuard's crawler and geocoding services with Caddy reverse proxy and automated CI/CD.
Overview
Server B is a single EC2 instance that hosts support services behind a Caddy reverse proxy. It provides web content crawling and reverse geocoding to the backend API. (Web/news search is no longer hosted here; it now runs inside the Backend API using Serper.dev and Brave Search. See Search Layer.)
Instance Specification
| Property | Value |
|---|---|
| Instance Type | c7g.xlarge (ARM64 Graviton3) |
| vCPUs | 4 |
| Memory | 8 GB |
| Instance ID | i-05033852181296c97 |
| Elastic IP | 34.200.207.223 |
| OS | Amazon Linux 2023 (ARM64) |
| Region | us-east-1 |
Services
TerraGuard support services run as Docker containers managed by Docker Compose:
| Service | Domain | Internal Port | Description |
|---|---|---|---|
| tg-web-crawler | crawler.terraguard.ai | 8091 | Async web crawling with strategy fallback |
| tg-geo-pop | geopop.terraguard.ai | 8077 | Reverse geocoding & population analysis |
Supporting containers that are not publicly exposed:
| Service | Internal Port | Description |
|---|---|---|
| Crawler Worker | 8092 | Python crawl4ai browser-based extraction |
Caddy Reverse Proxy
Caddy serves as the entry point for all external traffic. It handles TLS termination, automatic HTTPS certificate provisioning via Let's Encrypt, and API key authentication.
Automatic HTTPS
Caddy automatically obtains and renews TLS certificates from Let's Encrypt for each domain it serves. No manual certificate management is required.
API Key Authentication
All service endpoints are protected at the Caddy layer with API key validation. Requests must include the correct key in the X-API-Key header:
curl -H "X-API-Key: your-api-key" \
  https://crawler.terraguard.ai/v1/health
This means the individual services do not need to implement their own authentication: Caddy rejects unauthenticated requests before they reach the backend containers.
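The Caddy-level gate can be exercised from any client. The following is a minimal sketch rather than part of the deployment: `http_status` and `check_auth_gate` are hypothetical helper names, and the sketch assumes that an authenticated request to the health endpoint returns 200 while a request without the key is answered with 401.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the API key gate; not part of the deployment.

# Print only the HTTP status code for a URL, optionally sending an API key.
# Usage: http_status URL [API_KEY]
http_status() {
  local url="$1" key="${2-}"
  if [ -n "$key" ]; then
    curl --silent --output /dev/null --write-out '%{http_code}' --max-time 10 \
      --header "X-API-Key: $key" "$url"
  else
    curl --silent --output /dev/null --write-out '%{http_code}' --max-time 10 "$url"
  fi
}

# Succeed only if the request without a key gets 401 and the request
# with the key gets 200 (assumed behaviour of the health endpoint).
# Usage: check_auth_gate URL API_KEY
check_auth_gate() {
  local url="$1" key="$2" code_anon code_auth
  code_anon="$(http_status "$url")"
  code_auth="$(http_status "$url" "$key")"
  [ "$code_anon" = "401" ] && [ "$code_auth" = "200" ]
}
```

For example, `check_auth_gate https://crawler.terraguard.ai/v1/health "$API_KEY" && echo "gate OK"` prints `gate OK` only when both conditions hold.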
Caddyfile Structure
The Caddy configuration routes each domain to its corresponding container:
crawler.terraguard.ai {
    @authenticated header X-API-Key {env.API_KEY}
    handle @authenticated {
        reverse_proxy tg-web-crawler-api:8091
    }
    respond 401
}
geopop.terraguard.ai {
    @authenticated header X-API-Key {env.API_KEY}
    handle @authenticated {
        reverse_proxy geopop-api:8077
    }
    respond 401
}
Docker Compose
All services are orchestrated with a single docker-compose.yml file:
# Start all services
docker compose up -d
# View logs
docker compose logs -f
# Restart a specific service
docker compose restart tg-web-crawler-api
# Pull latest images and recreate
docker compose pull && docker compose up -d
Images are pulled from ECR. Each service image is built and pushed by its GitHub Actions CI/CD pipeline.
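After `docker compose up -d`, it can be useful to confirm that every service defined in the compose file actually came up. The following is a minimal sketch. It assumes Docker Compose v2 (for the `--status` filter on `docker compose ps`) and that it is run from the directory containing docker-compose.yml; `stopped_services` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list services that are defined in docker-compose.yml
# but are not currently reported as running. Prints one service per line.
stopped_services() {
  local defined running svc
  # All service names declared in the compose file.
  defined="$(docker compose config --services)"
  # Service names that currently have a running container.
  running="$(docker compose ps --services --status running)"
  for svc in $defined; do
    # -x: match the whole line, -F: treat the name as a literal string.
    printf '%s\n' "$running" | grep -qxF -- "$svc" || printf '%s\n' "$svc"
  done
}
```

An empty result from `stopped_services` means all defined services report as running; any name it prints is a candidate for `docker compose logs` on that service.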
Deployment Process
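Deployments are fully automated via GitHub Actions: each service's pipeline builds its image and pushes it to ECR, and the instance runs the images pulled from there.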
Manual Deployment
If needed, you can deploy manually via SSM:
aws ssm send-command \
  --profile tg \
  --instance-ids i-05033852181296c97 \
  --document-name "AWS-RunShellScript" \
  --parameters 'commands=["cd /opt/terraguard && docker compose pull && docker compose up -d"]'
Or SSH directly (for debugging only):
ssh -i terraguard-search-vps.pem ec2-user@34.200.207.223
Health Checks
Verify all services are running:
# Crawler API
curl -H "X-API-Key: $API_KEY" https://crawler.terraguard.ai/v1/health
# GeoPop API
curl -H "X-API-Key: $API_KEY" https://geopop.terraguard.ai/api/v1/health
All endpoints should return HTTP 200 with a JSON health status.
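The two checks above can be combined into a single pass/fail step, for instance at the end of a manual deployment. This is a minimal sketch: `HEALTH_URLS` and `check_all_health` are hypothetical names, and only the HTTP status is inspected, not the JSON body.

```shell
#!/usr/bin/env bash
# Health endpoints taken from the curl commands above, one per line.
HEALTH_URLS="https://crawler.terraguard.ai/v1/health
https://geopop.terraguard.ai/api/v1/health"

# Hypothetical helper: request every health endpoint with the given API key,
# print one OK/FAIL line per endpoint, and return non-zero if any is not 200.
# Usage: check_all_health API_KEY
check_all_health() {
  local key="$1" failed=0 url code
  for url in $HEALTH_URLS; do
    # --write-out prints only the status code; the body is discarded.
    code="$(curl --silent --output /dev/null --write-out '%{http_code}' \
      --max-time 10 --header "X-API-Key: $key" "$url")"
    if [ "$code" = "200" ]; then
      printf 'OK   %s\n' "$url"
    else
      printf 'FAIL %s (HTTP %s)\n' "$url" "$code"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_all_health "$API_KEY"` exits with status 0 only when both services answer with HTTP 200, which makes it usable as a gate in a script.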