AI Agents & MCP Servers
The six AI agents powering TerraGuard's analysis pipeline, plus the five MCP tool servers that enable the report generation agent to produce comprehensive disaster reports.
Overview
TerraGuard uses six specialized AI agents, each designed for a specific task in the disaster analysis pipeline. The agents use different LLM providers based on the task requirements -- GPT-4o-mini for structured classification and extraction, Gemini 2.0 Flash for grounded summarization, and GPT-4.1 for complex multi-step reasoning.
All agents are implemented with the pydantic-ai framework, except for the Report Generation Agent, which uses LangGraph for its ReAct reasoning loop. Each agent has a well-defined input/output contract, which makes it testable in isolation.
Agent Inventory
| Agent | Model | Framework | Purpose |
|---|---|---|---|
| GDACS Response | GPT-4o-mini | pydantic-ai | Severity assessment, deployment recommendations |
| GDACS RAG | GPT-4o-mini | pydantic-ai | Knowledge retrieval from vector DB |
| News Filter | GPT-4o-mini | pydantic-ai | Classify URLs as genuine news articles |
| News Briefing | Gemini 2.0 Flash | pydantic-ai | 24-hour event summaries with search grounding |
| Notify | GPT-4o-mini | pydantic-ai | Draft professional disaster alert emails |
| Report Generation | GPT-4.1 | LangGraph ReAct | Multi-section reports using 5 MCP tools |
GDACS Response Agent
Model: GPT-4o-mini | Trigger: New GDACS event ingested
The GDACS Response Agent processes raw GDACS alert data and produces a structured assessment. It receives the full GDACS payload (alert level, event description, affected areas, population estimates) and returns:
- Severity assessment -- qualitative analysis beyond the numeric alert level
- Impact summary -- expected humanitarian impact in plain language
- Deployment recommendation -- whether the event warrants organizational response
- Key concerns -- specific risks (aftershocks, storm surge, dam integrity, etc.)
The agent uses a structured output schema (pydantic model) to ensure consistent, parseable responses that can be stored directly in the database.
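The four output fields listed above map naturally onto a typed schema. The following is a minimal sketch only: it uses a standard-library dataclass as a stand-in for the pydantic model, and the field names, the `Deployment` values, and the `parse_assessment` helper are all hypothetical, not TerraGuard's actual schema.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class Deployment(str, Enum):
    # Hypothetical values; the text only says the agent recommends whether to respond.
    RESPOND = "respond"
    MONITOR = "monitor"
    NO_ACTION = "no_action"


@dataclass(frozen=True)
class GdacsAssessment:
    # Field names are assumptions derived from the four bullets above.
    severity_assessment: str
    impact_summary: str
    deployment_recommendation: Deployment
    key_concerns: list[str] = field(default_factory=list)


def parse_assessment(raw_json: str) -> GdacsAssessment:
    """Parse a model response and reject anything that does not match the schema."""
    data = json.loads(raw_json)
    for name in ("severity_assessment", "impact_summary"):
        value = data.get(name)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{name} must be a non-empty string")
    concerns = data.get("key_concerns", [])
    if not isinstance(concerns, list) or not all(isinstance(c, str) for c in concerns):
        raise ValueError("key_concerns must be a list of strings")
    if "deployment_recommendation" not in data:
        raise ValueError("deployment_recommendation is required")
    return GdacsAssessment(
        severity_assessment=data["severity_assessment"],
        impact_summary=data["impact_summary"],
        # Enum lookup raises ValueError for values outside the allowed set.
        deployment_recommendation=Deployment(data["deployment_recommendation"]),
        key_concerns=concerns,
    )
```

Rejecting malformed responses at this boundary is what allows the result to be stored without further checks.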
GDACS RAG Agent
Model: GPT-4o-mini | Trigger: User asks a question about an event
The RAG (Retrieval-Augmented Generation) agent enables conversational Q&A about any disaster event. When a user asks a question:
- The question is embedded using the same model used to create the knowledge base embeddings
- A pgVector similarity search retrieves the top-k relevant chunks from disaster_event_embeddings
- The retrieved context is passed to the agent along with the user's question
- The agent generates an answer grounded exclusively in the retrieved content
The agent is instructed to decline answering if the retrieved context does not contain sufficient information, rather than hallucinating. Source citations link back to the specific external_knowledge_base entries that informed the answer.
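The retrieve-then-answer flow, including the decline path, can be sketched in memory as follows. This is an illustration under stated assumptions: in TerraGuard the similarity search runs in pgVector, whereas here a plain cosine similarity over Python tuples stands in for it. The `Chunk` type, the `k` and `min_score` defaults, and the prompt wording are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source_id: int                 # hypothetical id of an external_knowledge_base entry
    text: str
    embedding: tuple[float, ...]


def cosine_similarity(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, chunks, k=5, min_score=0.3):
    """Return up to k (score, chunk) pairs scoring at least min_score, best first."""
    scored = [(cosine_similarity(query_embedding, c.embedding), c) for c in chunks]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, hits):
    """Return None when nothing relevant was retrieved, so the caller can decline."""
    if not hits:
        return None
    context = "\n".join(f"[{chunk.source_id}] {chunk.text}" for _, chunk in hits)
    return (
        "Answer only from the context below and cite source ids.\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Keeping the source id next to each chunk in the prompt is one simple way to let the answer cite the entries it drew on.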
News Filter Agent
Model: GPT-4o-mini | Trigger: Knowledge discovery pipeline finds candidate URLs
Not every URL returned by a web search is a genuine news article. Search results frequently include social media posts, forum threads, product pages, weather widgets, and other non-news content. The News Filter Agent classifies each URL as either a valid news article or not.
The agent receives:
- URL
- Page title (from search result)
- Snippet (from search result)
- Domain name
It returns a binary classification with a confidence score and a brief reason. Only URLs classified as genuine news articles proceed to the crawling stage.
The agent processes URLs in batches of up to 20 to minimize API calls. A single prompt evaluates all URLs in the batch, returning a classification for each.
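The batching and gating described above can be sketched as follows. The `Candidate` and `Verdict` types mirror the inputs and outputs listed in this section, but their names and field names are assumptions; the LLM call that turns one batch into verdicts is deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence

MAX_BATCH_SIZE = 20  # from the text: up to 20 URLs per prompt


@dataclass(frozen=True)
class Candidate:
    url: str
    title: str
    snippet: str
    domain: str


@dataclass(frozen=True)
class Verdict:
    url: str
    is_news: bool
    confidence: float
    reason: str


def batched(items: Sequence[Candidate], size: int = MAX_BATCH_SIZE) -> Iterator[Sequence[Candidate]]:
    """Yield consecutive slices of at most `size` candidates."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def urls_to_crawl(verdicts: Sequence[Verdict]) -> list[str]:
    """Keep only URLs classified as genuine news articles."""
    return [v.url for v in verdicts if v.is_news]
```

With 45 candidate URLs, this yields three prompts of 20, 20, and 5 URLs rather than 45 separate calls.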
News Briefing Agent
Model: Gemini 2.0 Flash | Trigger: Scheduled (24-hour interval) or on-demand
The News Briefing Agent produces comprehensive 24-hour summaries of disaster event developments, using Gemini for its large context window and built-in search grounding capability.
Google Gemini is used well beyond this agent. Several Backend API services run on gemini-2.5-flash (via pydantic-ai's GeminiModel and the google.genai client), including AI event summaries, the knowledge-base chat agent, dynamic report generation, and message-template previews. The News Briefing Agent runs on gemini-2.0-flash. All Gemini calls authenticate with GEMINI_API_KEY.
The agent receives:
- Event metadata (type, location, timeline)
- All indexed knowledge base content from the past 24 hours
- Latest search results from the search layer (Serper + Brave)
It produces a structured briefing with:
- Situation overview -- current status in 2-3 paragraphs
- Key developments -- bulleted list of changes since last briefing
- Casualty and damage figures -- latest confirmed numbers with source attribution
- Response status -- humanitarian and government response actions
- Outlook -- expected evolution over the next 24-48 hours
Notify Agent
Model: GPT-4o-mini | Trigger: Notification engine passes all five gates
The Notify Agent drafts professional disaster alert emails. It receives event data, population exposure numbers, and recent search results, then produces email-ready content.
Key requirements for the agent:
- Tone must be professional and factual, suitable for humanitarian response teams
- No speculation -- only confirmed information from provided sources
- Include specific numbers (magnitude, population, coordinates)
- Subject line must include priority level and event type
See the Notification Engine documentation for full context on when and how this agent is invoked.
Report Generation Agent
Model: GPT-4.1 | Framework: LangGraph ReAct | Trigger: User requests a report
The Report Generation Agent is the most sophisticated agent in TerraGuard. It uses a ReAct (Reasoning + Acting) loop to iteratively gather information from five MCP tool servers and compose a comprehensive 10-section disaster report.
ReAct Loop
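Unlike the other agents, which each make a single LLM call, the Report Agent operates in a multi-step loop: it reasons about what information is still missing, calls a tool to fetch it, observes the result, and repeats.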
The agent typically executes 8-15 tool calls before it has sufficient information to write the report. A minimum source threshold is enforced -- the agent must gather information from at least 8 distinct sources before it can begin writing.
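The loop and the source threshold can be sketched in plain Python. This is not the LangGraph implementation: `choose_action` stands in for the LLM's reasoning step, each tool is assumed to return the list of source identifiers it consulted, and the cap of 15 calls is an assumption taken from the upper end of the typical range above.

```python
from typing import Callable

MIN_DISTINCT_SOURCES = 8   # from the text
MAX_TOOL_CALLS = 15        # assumption: upper end of the typical 8-15 range


def run_react_loop(
    choose_action: Callable[[list[dict]], tuple[str, str]],
    tools: dict[str, Callable[[str], list[str]]],
    max_tool_calls: int = MAX_TOOL_CALLS,
) -> tuple[list[dict], set[str], bool]:
    """Alternate reasoning and tool calls until enough distinct sources are gathered.

    Returns the observations, the distinct sources, and whether writing may begin.
    """
    observations: list[dict] = []
    sources: set[str] = set()
    for _ in range(max_tool_calls):
        if len(sources) >= MIN_DISTINCT_SOURCES:
            break
        tool_name, query = choose_action(observations)   # reason: pick the next tool
        found = tools[tool_name](query)                   # act: call it
        sources.update(found)                             # observe: record what came back
        observations.append({"tool": tool_name, "query": query, "sources": found})
    return observations, sources, len(sources) >= MIN_DISTINCT_SOURCES
```

Counting distinct sources rather than tool calls means that repeated hits on the same page do not bring the agent any closer to writing.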
10 Report Sections
- Executive Summary -- One-page overview for decision-makers
- Event Overview -- Type, location, timeline, measurements
- Impact Assessment -- Casualties, displacement, infrastructure damage
- Population Exposure -- Detailed demographic analysis from GeoPop
- Humanitarian Needs -- Shelter, water, food, medical priorities
- Response Overview -- Government and NGO actions taken
- Situation Forecast -- Expected evolution (aftershocks, storm path, etc.)
- Media & Public Information -- Key news coverage and official statements
- Historical Context -- Previous events in the same region
- Recommendations -- Actionable next steps for the organization
Source Tier Priority
The agent prioritizes information sources in a defined order:
| Tier | Source Type | Examples |
|---|---|---|
| 1 | Official agencies | UN OCHA, GDACS, USGS, national disaster agencies |
| 2 | International organizations | IFRC, WHO, UNICEF situation reports |
| 3 | Quality news sources | Reuters, AP, BBC, Al Jazeera |
| 4 | Internal knowledge base | Previously indexed and validated content |
| 5 | General web sources | Other search results |
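One way to apply the tier order is to tag every finding with its tier and sort before writing. The sketch below assumes a hypothetical `Finding` record; the document does not specify how the agent resolves conflicts between sources beyond this ordering.

```python
from dataclasses import dataclass
from enum import IntEnum


class SourceTier(IntEnum):
    # Values follow the table above; lower numbers take priority.
    OFFICIAL_AGENCY = 1
    INTERNATIONAL_ORG = 2
    QUALITY_NEWS = 3
    INTERNAL_KB = 4
    GENERAL_WEB = 5


@dataclass(frozen=True)
class Finding:
    claim: str
    source: str
    tier: SourceTier


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Order findings by tier; sorted() is stable, so ties keep their input order."""
    return sorted(findings, key=lambda f: f.tier)
```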
MCP Tool Servers
The Report Generation Agent accesses external data through five MCP (Model Context Protocol) servers. Each server exposes a set of tools that the agent can call during its ReAct loop.
Brave Search MCP (:4021)
Provides real-time web and news search capabilities. The agent uses this to find the latest information not yet in the knowledge base. Returns structured results with titles, URLs, snippets, and publication dates.
Vector Search MCP (:4022)
Queries the pgVector embeddings stored in disaster_event_embeddings. This surfaces previously crawled and indexed content that is semantically relevant to the agent's query. Returns text chunks with similarity scores and source attribution.
GeoPop MCP (:4023)
Proxies requests to the GeoPop API for population and geographic analysis. Returns affected population counts at multiple radii, country and administrative region names, and land/sea classification.
Database Search MCP (:4024)
Provides read-only access to the PostgreSQL database. The agent can query event details, measurement histories, timeline data, and related events. All queries are parameterized and read-only -- the agent cannot modify data.
Document Links MCP (:4025)
Extracts content from URLs and documents. When the agent finds a relevant URL via Brave Search, it can use this tool to extract the full text. Also handles PDF extraction for situation reports and official documents.
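For reference, the five servers and their ports can be collected in one lookup table. Only the port numbers come from this page; the dictionary keys, the `localhost` default, and the `http` scheme are assumptions for illustration.

```python
# Ports as listed in the section headings above; key names are hypothetical.
MCP_SERVER_PORTS = {
    "brave_search": 4021,
    "vector_search": 4022,
    "geopop": 4023,
    "database_search": 4024,
    "document_links": 4025,
}


def mcp_base_url(server: str, host: str = "localhost") -> str:
    """Build a base URL for a named MCP server (host and scheme are assumptions)."""
    try:
        port = MCP_SERVER_PORTS[server]
    except KeyError:
        raise ValueError(f"unknown MCP server: {server!r}") from None
    return f"http://{host}:{port}"
```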
Agent Error Handling
All agents follow the same error handling pattern:
- Timeouts: 30-second default, 120 seconds for the Report Agent
- Retries: Maximum 2 retries with exponential backoff
- Fallbacks: Each agent has a template-based fallback that produces basic output from structured data alone
- Cost controls: Token limits are set per agent to prevent runaway costs
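The retry and fallback behaviour can be sketched as a small wrapper. This is an illustration, not the project's code: the one-second base delay is an assumption, the helper name is hypothetical, and timeouts and token limits are left to the underlying client rather than enforced here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

MAX_RETRIES = 2  # from the text: at most 2 retries after the first attempt


def call_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    max_retries: int = MAX_RETRIES,
    base_delay: float = 1.0,                       # assumption: delays of 1s, then 2s
    sleep: Callable[[float], None] = time.sleep,   # injectable so tests need not wait
) -> T:
    """Try primary up to 1 + max_retries times, then use the template fallback."""
    for attempt in range(max_retries + 1):
        try:
            return primary()
        except Exception:
            if attempt == max_retries:
                break
            sleep(base_delay * 2 ** attempt)       # exponential backoff between attempts
    return fallback()
```

Passing the fallback as a callable means the template output is only built when every attempt has failed.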