TerraGuard

System Architecture

Complete technical overview of the TerraGuard platform architecture, including all five services and the embedded search layer, database design, inter-service communication, and infrastructure components.

Architecture Overview

TerraGuard is composed of five purpose-built services, each optimized for its workload, plus a search layer embedded in the Backend API. The platform follows a microservices architecture where services communicate over HTTP, share a PostgreSQL database, and coordinate through an event-driven job queue.


Service Inventory

Service             Language                 Port       Purpose
tg-frontend         TypeScript (Next.js 15)  4000       Interactive dashboard, maps, event detail views
tg-backend-api      Python (FastAPI)         5601       Core API, AI agents, job orchestration, notifications, embedded search layer
tg-web-crawler-api  Go + Python              4003/4004  Async web content extraction with strategy fallback
geopop-api          Rust                     5605       Reverse geocoding, population grids, exposure analysis
tg-event-processor  Go                       5606       Source polling, normalization, dedup, correlation

Web and news search are no longer a standalone service. They run inside tg-backend-api as the app/common/search_providers.py module, which calls Serper.dev (primary) and Brave Search (fallback on error) directly. See Search Layer.
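The primary/fallback behaviour described above can be sketched as follows. This is a minimal illustration, not the contents of app/common/search_providers.py: the function name search_with_fallback, the AllProvidersFailed exception, and the two fake providers are hypothetical stand-ins, and no real Serper.dev or Brave Search request is made.

```python
from typing import Callable, Sequence

# A provider is any callable that takes a query and returns result dicts.
SearchProvider = Callable[[str], list[dict]]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error (hypothetical)."""


def search_with_fallback(
    query: str, providers: Sequence[tuple[str, SearchProvider]]
) -> tuple[str, list[dict]]:
    """Try providers in order; move to the next one only when a call raises."""
    errors: list[str] = []
    for name, provider in providers:
        try:
            return name, provider(query)
        except Exception as exc:  # fallback triggers on error, not on empty results
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-ins for the real HTTP clients (assumptions, no network access).
def fake_serper(query: str) -> list[dict]:
    raise TimeoutError("simulated primary outage")


def fake_brave(query: str) -> list[dict]:
    return [{"title": f"result for {query}", "source": "brave"}]


used, results = search_with_fallback(
    "flood warning", [("serper", fake_serper), ("brave", fake_brave)]
)
```

Here the simulated primary raises, so the call is answered by the second provider. An empty result list from the primary would be returned as-is, which matches the "fallback on error" wording.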

Supporting Infrastructure

Component       Port       Role
Inngest Server  4007       Event-driven job queue orchestrator
Mailpit         4011/4012  Local email testing (SMTP/UI)
Redis           4013       Inngest state store, caching

External APIs called by the Backend API: Serper.dev (primary search) and Brave Search (search fallback). Neither runs locally — no SearXNG or Tor containers are part of the topology.

Database Schema

The PostgreSQL database uses PostGIS for geospatial queries and pgVector for embedding similarity search. The core tables form a three-tier event hierarchy.


Three-Tier Event Hierarchy

  1. raw_event_records -- The original payload from each data source, stored verbatim for audit and replay. An MD5 hash of the payload enables exact-duplicate rejection.

  2. normalized_event_records -- Source-specific data transformed into a common schema. Each record maps to exactly one raw record and belongs to one disaster event. A staleness timestamp prevents older data from overwriting newer updates.

  3. disaster_events -- The canonical event entity that users interact with. Created through spatial-temporal correlation of normalized records. Carries the computed severity score, priority level, and lifecycle status.
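Two of the guards in this hierarchy, exact-duplicate rejection by MD5 hash and the staleness check on normalized records, can be sketched in memory as below. The class and field names (RecordStore, NormalizedRecord, source_updated_at) are hypothetical, and hashing a key-sorted JSON encoding is an assumption; the source only states that an MD5 hash of the payload is used.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


def payload_hash(payload: dict) -> str:
    """MD5 of a key-sorted JSON encoding (assumed canonical form)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


@dataclass
class NormalizedRecord:
    raw_hash: str                # links back to exactly one raw record
    disaster_event_id: int       # the canonical event this record belongs to
    source_updated_at: datetime  # staleness timestamp


class RecordStore:
    """In-memory stand-in for the raw and normalized tables."""

    def __init__(self) -> None:
        self.seen_hashes: set[str] = set()
        self.normalized: dict[str, NormalizedRecord] = {}

    def ingest_raw(self, payload: dict) -> bool:
        """Return False when an identical payload was already stored."""
        digest = payload_hash(payload)
        if digest in self.seen_hashes:
            return False
        self.seen_hashes.add(digest)
        return True

    def upsert_normalized(self, key: str, record: NormalizedRecord) -> bool:
        """Apply the update only if it is strictly newer than the stored one."""
        current = self.normalized.get(key)
        if current is not None and record.source_updated_at <= current.source_updated_at:
            return False
        self.normalized[key] = record
        return True


store = RecordStore()
first = store.ingest_raw({"id": "eq-1", "mag": 5.1})
duplicate = store.ingest_raw({"mag": 5.1, "id": "eq-1"})  # same data, different key order

newer = NormalizedRecord(payload_hash({"id": "eq-1", "mag": 5.1}), 7,
                         datetime(2024, 1, 2, tzinfo=timezone.utc))
older = NormalizedRecord(newer.raw_hash, 7, datetime(2024, 1, 1, tzinfo=timezone.utc))
applied_newer = store.upsert_normalized("usgs:eq-1", newer)
applied_older = store.upsert_normalized("usgs:eq-1", older)  # rejected as stale
```

In a real deployment both checks would more likely be enforced in PostgreSQL (for example a unique constraint on the hash column and a conditional update), which this sketch does not attempt to show.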

Inter-Service Communication

All service-to-service communication uses synchronous HTTP/JSON except for the Inngest job queue, which is asynchronous and event-driven.


Communication Patterns

Pattern           Used Between                                                           Mechanism
Webhook           Event Processor to Backend API                                         Fire-and-forget HTTP POST
Request/Response  Backend API to Crawler/GeoPop and external search APIs (Serper/Brave)  Synchronous HTTP with timeouts
Event-driven      Backend API to Inngest                                                 Async job dispatch and execution
Database          All services to PostgreSQL                                             Direct connection (asyncpg / pgx)
MCP Protocol      Report Agent to MCP Servers                                            JSON-RPC over HTTP
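For the MCP row, the JSON-RPC 2.0 envelope that travels over HTTP can be sketched as below. The envelope fields (jsonrpc, id, method, params, result, error) come from the JSON-RPC 2.0 specification and tools/call is the MCP method for invoking a tool; the tool name reverse_geocode, its arguments, and the reply payload are hypothetical, and the HTTP transport itself is left out.

```python
import itertools
import json
from typing import Any

_request_ids = itertools.count(1)  # monotonically increasing request ids


class JsonRpcError(RuntimeError):
    """Raised for malformed replies or replies carrying an error object."""


def build_request(method: str, params: dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 request body."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}
    )


def parse_response(body: str, expected_id: int) -> Any:
    """Return the result, or raise if the reply is mismatched or an error."""
    message = json.loads(body)
    if message.get("jsonrpc") != "2.0" or message.get("id") != expected_id:
        raise JsonRpcError("malformed or mismatched response")
    if "error" in message:
        err = message["error"]
        raise JsonRpcError(f"{err['code']}: {err['message']}")
    return message["result"]


# Hypothetical tool call; the server reply is simulated rather than fetched.
request_body = build_request(
    "tools/call",
    {"name": "reverse_geocode", "arguments": {"lat": 35.68, "lon": 139.69}},
)
decoded = json.loads(request_body)
simulated_reply = json.dumps(
    {"jsonrpc": "2.0", "id": decoded["id"], "result": {"country": "JP"}}
)
result = parse_response(simulated_reply, decoded["id"])
```

Matching the reply id against the request id is what lets a client pair responses with requests when several calls are in flight.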

Technology Stack Rationale

Each service uses the language best suited to its workload:

Go for Event Processor and Crawler API -- These services handle high-throughput I/O: polling multiple sources on tight schedules and managing concurrent crawl jobs. Go's goroutine model and low memory footprint make it ideal for this kind of concurrent network I/O.

Python for Backend API -- The backend orchestrates AI agents, manages complex business logic, and integrates with the pydantic-ai and LangGraph frameworks. Python's ecosystem for machine learning, LLM tooling, and rapid API development (FastAPI) makes it the natural choice.

Rust for GeoPop API -- Reverse geocoding against population grid data (GeoTIFF rasters) is CPU-intensive and latency-sensitive. Rust provides the performance needed to scan geographic datasets and return results in single-digit milliseconds.

TypeScript/Next.js for Frontend -- Server-side rendering, React Server Components, and the App Router provide the best developer experience for building a data-heavy dashboard with real-time map interactions.

Docker Compose Topology

In local development, Docker Compose runs the supporting infrastructure. The backend API's local-services directory contains the compose file for Redis, Inngest, and Mailpit. Search is not containerized: the Backend API calls Serper.dev and Brave Search directly over HTTPS.


Application services run directly on the host for hot-reload during development, while supporting infrastructure (Redis, Inngest, Mailpit) runs in containers. Search providers (Serper.dev, Brave Search) are external SaaS APIs reached over HTTPS in every environment. In production on AWS, all services are containerized and deployed behind an Application Load Balancer.

Port Allocation Reference

4000  tg-frontend (Next.js)
5601  tg-backend-api (FastAPI)
4003  tg-web-crawler-api (Go API)
4004  tg-web-crawler-api (Python worker)
5605  geopop-api (Rust)
5606  tg-event-processor (Go)
4007  Inngest Server
4011  Mailpit SMTP
4012  Mailpit Web UI
4013  Redis
4021  MCP: Brave Search
4022  MCP: Vector Search
4023  MCP: GeoPop
4024  MCP: Database Search
4025  MCP: Document Links
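The allocation above can be kept in one place in code, which makes accidental port collisions easy to catch. The sketch below copies the port numbers from the list; the dictionary keys, the base_url and duplicate_ports helpers, and the localhost default are assumptions for illustration.

```python
from collections import defaultdict

# Port numbers are taken from the reference list; key names are illustrative.
LOCAL_PORTS: dict[str, int] = {
    "tg-frontend": 4000,
    "tg-backend-api": 5601,
    "tg-web-crawler-api": 4003,         # Go API
    "tg-web-crawler-api-worker": 4004,  # Python worker
    "geopop-api": 5605,
    "tg-event-processor": 5606,
    "inngest": 4007,
    "mailpit-smtp": 4011,
    "mailpit-ui": 4012,
    "redis": 4013,
    "mcp-brave-search": 4021,
    "mcp-vector-search": 4022,
    "mcp-geopop": 4023,
    "mcp-database-search": 4024,
    "mcp-document-links": 4025,
}


def base_url(service: str, host: str = "localhost") -> str:
    """Build the local HTTP base URL for a service (not meaningful for SMTP or Redis)."""
    return f"http://{host}:{LOCAL_PORTS[service]}"


def duplicate_ports(ports: dict[str, int]) -> dict[int, list[str]]:
    """Return every port that is claimed by more than one service."""
    owners: dict[int, list[str]] = defaultdict(list)
    for service, port in ports.items():
        owners[port].append(service)
    return {port: names for port, names in owners.items() if len(names) > 1}


geopop_url = base_url("geopop-api")
conflicts = duplicate_ports(LOCAL_PORTS)
```

Running duplicate_ports in a startup check or a unit test flags a collision before two processes try to bind the same port.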
