System Architecture
Complete technical overview of the TerraGuard platform architecture, including all five services and the embedded search layer, database design, inter-service communication, and infrastructure components.
Architecture Overview
TerraGuard is composed of five purpose-built services, each optimized for its workload, plus a search layer embedded in the Backend API. The platform follows a microservices architecture where services communicate over HTTP, share a PostgreSQL database, and coordinate through an event-driven job queue.
Service Inventory
| Service | Language | Port | Purpose |
|---|---|---|---|
| tg-frontend | TypeScript (Next.js 15) | 4000 | Interactive dashboard, maps, event detail views |
| tg-backend-api | Python (FastAPI) | 5601 | Core API, AI agents, job orchestration, notifications, embedded search layer |
| tg-web-crawler-api | Go + Python | 4003/4004 | Async web content extraction with strategy fallback |
| geopop-api | Rust | 5605 | Reverse geocoding, population grids, exposure analysis |
| tg-event-processor | Go | 5606 | Source polling, normalization, dedup, correlation |
Web and news search are no longer a standalone service. They run inside tg-backend-api as the app/common/search_providers.py module, which calls Serper.dev (primary) and Brave Search (fallback on error) directly. See Search Layer.
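The primary-then-fallback behavior described above can be sketched as follows. This is a minimal illustration, not the contents of app/common/search_providers.py: the function name `search_with_fallback`, the `SearchProvider` type, and the two stand-in providers are all hypothetical, and real providers would make HTTPS calls to Serper.dev and Brave Search.

```python
from typing import Callable

# A provider is any callable that takes a query and returns a list of result dicts.
# (Hypothetical shape; the real module's interface is not documented here.)
SearchProvider = Callable[[str], list[dict]]


def search_with_fallback(
    query: str, primary: SearchProvider, fallback: SearchProvider
) -> list[dict]:
    """Try the primary provider; use the fallback only if the primary raises."""
    try:
        return primary(query)
    except Exception:
        # Any primary failure (timeout, HTTP error, bad payload) triggers the fallback.
        return fallback(query)


# Stand-ins for the Serper.dev and Brave Search clients, so the sketch runs offline.
def failing_serper(query: str) -> list[dict]:
    raise TimeoutError("primary provider timed out")


def fake_brave(query: str) -> list[dict]:
    return [{"title": f"result for {query}", "provider": "brave"}]


results = search_with_fallback("flood jakarta", failing_serper, fake_brave)
print(results[0]["provider"])
```

Note that in this sketch an empty result list from the primary is returned as-is; only an exception switches providers, which matches "fallback on error" as stated above.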
Supporting Infrastructure
| Component | Port | Role |
|---|---|---|
| Inngest Server | 4007 | Event-driven job queue orchestrator |
| Mailpit | 4011/4012 | Local email testing (SMTP/UI) |
| Redis | 4013 | Inngest state store, caching |
External APIs called by the Backend API: Serper.dev (primary search) and Brave Search (search fallback). Neither runs locally — no SearXNG or Tor containers are part of the topology.
Database Schema
The PostgreSQL database uses PostGIS for geospatial queries and pgVector for embedding similarity search. The core tables form a three-tier event hierarchy.
Three-Tier Event Hierarchy
- raw_event_records -- The original payload from each data source, stored verbatim for audit and replay. An MD5 hash of the payload enables exact-duplicate rejection.
- normalized_event_records -- Source-specific data transformed into a common schema. Each record maps to exactly one raw record and belongs to one disaster event. A staleness timestamp prevents older data from overwriting newer updates.
- disaster_events -- The canonical event entity that users interact with. Created through spatial-temporal correlation of normalized records. Carries the computed severity score, priority level, and lifecycle status.
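The two guards mentioned in the hierarchy, exact-duplicate rejection by MD5 hash and the staleness check, can be illustrated with a small in-memory sketch. The class `RawEventStore`, the function names, and the sample payloads are assumptions made for illustration; the real platform stores these records in PostgreSQL, and its column names are not given here.

```python
import hashlib
from datetime import datetime, timezone


def payload_hash(payload: bytes) -> str:
    """MD5 hex digest of the verbatim payload, used only as a duplicate key."""
    return hashlib.md5(payload).hexdigest()


class RawEventStore:
    """In-memory stand-in for raw_event_records (hypothetical)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def insert(self, payload: bytes) -> bool:
        """Return True if stored, False if an identical payload was already seen."""
        digest = payload_hash(payload)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True


def should_overwrite(existing_ts: datetime, incoming_ts: datetime) -> bool:
    """Staleness guard: accept an update only if it is strictly newer."""
    return incoming_ts > existing_ts


store = RawEventStore()
first = store.insert(b'{"source": "example", "magnitude": 6.1}')    # new payload
repeat = store.insert(b'{"source": "example", "magnitude": 6.1}')   # byte-identical
revised = store.insert(b'{"source": "example", "magnitude": 6.3}')  # changed payload

older = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
newer = datetime(2025, 1, 1, 12, 5, tzinfo=timezone.utc)
print(first, repeat, revised, should_overwrite(newer, older))
```

Hashing the raw bytes means only byte-identical payloads are rejected; a revised payload from the same source still passes through and is then subject to the staleness check at the normalized tier.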
Inter-Service Communication
Service-to-service calls use HTTP/JSON, synchronous except for the Event Processor's fire-and-forget webhook. The Inngest job queue is asynchronous and event-driven.
Communication Patterns
| Pattern | Used Between | Mechanism |
|---|---|---|
| Webhook | Event Processor to Backend API | Fire-and-forget HTTP POST |
| Request/Response | Backend API to Crawler/GeoPop and external search APIs (Serper/Brave) | Synchronous HTTP with timeouts |
| Event-driven | Backend API to Inngest | Async job dispatch and execution |
| Database | All services to PostgreSQL | Direct connection (asyncpg / pgx) |
| MCP Protocol | Report Agent to MCP Servers | JSON-RPC over HTTP |
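As an illustration of the last row, the sketch below builds and parses JSON-RPC messages of the kind the Report Agent might exchange with an MCP server. It assumes JSON-RPC 2.0 framing and the MCP `tools/call` method; the tool name `reverse_geocode`, its arguments, and the sample response are hypothetical, and no HTTP transport is shown.

```python
import itertools
import json

# Monotonic request ids so responses can be matched to requests.
_request_ids = itertools.count(1)


def build_jsonrpc_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request object with a fresh numeric id."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }


def parse_jsonrpc_response(body: str, expected_id: int) -> object:
    """Return the result, or raise on an error object or a mismatched id."""
    message = json.loads(body)
    if message.get("id") != expected_id:
        raise ValueError("response id does not match request id")
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"JSON-RPC error {err['code']}: {err['message']}")
    return message["result"]


# Hypothetical tool name and arguments, for illustration only.
request = build_jsonrpc_request(
    "tools/call",
    {"name": "reverse_geocode", "arguments": {"lat": -6.2, "lon": 106.8}},
)
wire_body = json.dumps(request)  # what would be sent as the HTTP POST body

# A made-up response, shaped as a server might reply.
response_body = json.dumps(
    {"jsonrpc": "2.0", "id": request["id"], "result": {"country": "ID"}}
)
result = parse_jsonrpc_response(response_body, request["id"])
print(result)
```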
Technology Stack Rationale
Each service uses the language best suited to its workload:
Go for Event Processor and Crawler API -- These services handle high-throughput I/O: polling multiple sources on tight schedules and managing concurrent crawl jobs. Go's goroutine model and low memory footprint make it ideal for this kind of concurrent network I/O.
Python for Backend API -- The backend orchestrates AI agents, manages complex business logic, and integrates with the pydantic-ai and LangGraph frameworks. Python's ecosystem for machine learning, LLM tooling, and rapid API development (FastAPI) makes it the natural choice.
Rust for GeoPop API -- Reverse geocoding against population grid data (GeoTIFF rasters) is CPU-intensive and latency-sensitive. Rust provides the performance needed to scan geographic datasets and return results in single-digit milliseconds.
TypeScript/Next.js for Frontend -- Server-side rendering, React Server Components, and the App Router provide the best developer experience for building a data-heavy dashboard with real-time map interactions.
Docker Compose Topology
In local development, the supporting infrastructure (Redis, Inngest, Mailpit) runs in containers via Docker Compose; the compose file lives in the Backend API's local-services directory. The application services themselves run directly on the host for hot reload.
Search is not containerized: in every environment, the Backend API calls Serper.dev and Brave Search, both external SaaS APIs, directly over HTTPS. In production on AWS, all services are containerized and deployed behind an Application Load Balancer.
Port Allocation Reference
4000 tg-frontend (Next.js)
5601 tg-backend-api (FastAPI)
4003 tg-web-crawler-api (Go API)
4004 tg-web-crawler-api (Python worker)
5605 geopop-api (Rust)
5606 tg-event-processor (Go)
4007 Inngest Server
4011 Mailpit SMTP
4012 Mailpit Web UI
4013 Redis
4021 MCP: Brave Search
4022 MCP: Vector Search
4023 MCP: GeoPop
4024 MCP: Database Search
4025 MCP: Document Links
Next Steps
- Event Processing Pipeline -- How events flow from source to database
- Scoring & Priority -- How events are scored and classified
- Notification Engine -- How alerts reach response teams