v1.4.1  ·  Apache 2.0  ·  Open Source  ·  Security Hardened

CivicRecords AI

Open-source AI for municipal open records

Every city in America processes public records requests — and most staff still search file shares manually, review hundreds of pages by hand, and track deadlines in spreadsheets. CivicRecords AI changes that. It runs entirely inside your city's network, ingesting your documents and making them searchable with natural language AI queries, flagging potential exemptions, and managing the full request lifecycle from intake to response. No cloud subscription. No vendor lock-in. Your data never leaves the building.

Step 1

Install Docker Desktop, or Docker Engine on Linux (required; not included in the setup scripts)

docker.com/get-started

Step 2

Clone & run the setup script

bash install.sh

Step 3

Open your browser

http://localhost:8080

ⓘ  A Windows double-click installer (unsigned, ships with every tagged release) handles the Docker-stack setup end-to-end. macOS and Linux continue to use the setup scripts below — Docker Desktop (macOS) or Docker Engine (Linux) must be installed first.

GitHub Repository  ·  Download Installer (Linux / macOS)  ·  Download Installer (Windows)  ·  User Manual  ·  README

Architecture

Deployment stack: the entire system runs inside Docker Compose on the city's own network. The browser talks to an nginx frontend, which fronts the FastAPI API and a Celery worker, backed by Postgres + pgvector, Redis, and Ollama. No cloud dependency, no outbound traffic by default.

LLM call flow: records-ai routes every model call through civiccore.llm. Application code calls a thin app.llm.client shim, which passes through context assembly with sanitization, prompt template resolution with three-step override, the model registry, and the provider factory into a local Ollama provider. Local-first; OpenAI and Anthropic providers are optional, opt-in extras.

Sovereignty / data boundary: all runtime components (FastAPI plus Celery, Postgres with pgvector, Ollama, and local volumes) live inside the city's on-prem network. Connector credentials are encrypted at rest with Fernet. Every byte of data and every model call stays on city infrastructure unless an operator explicitly opts in to a cloud provider.
Architecture diagram: 7 Docker services on a single machine. All LLM inference local via Ollama. Zero outbound data.
Client layer: municipal staff browser (React 18 + shadcn/ui + Tailwind CSS), HTTP :8080
Application layer: nginx (frontend); FastAPI application server :8000 (Auth, Search API, Workflow, Exemptions, Departments, Compliance, LLM / Fed, Audit Logger)
Data & AI layer: PostgreSQL 17 + pgvector :5432 (SQL queries); Redis 7.2 task queue :6379 (task dispatch); Ollama LLM runtime :11434 (LLM inference)
Async workers: Celery Worker (ingestion + embedding; consumes tasks, stores embeddings, embed / infer); Celery Beat (scheduler)
17 Core Features  ·  7 Docker Services  ·  51 Jurisdictions (50 States + DC)  ·  1.4.1 Stable Release

What it does

Seventeen integrated capabilities — all running locally, all under your control.

🔍

AI-Powered Search

Natural language hybrid search combining pgvector semantic similarity and PostgreSQL full-text search via Reciprocal Rank Fusion. Results include source attribution and confidence scores. Optional LLM-synthesized answer summaries, clearly labeled as AI drafts.
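To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of document IDs. It is an illustration only, not the project's implementation: the function name, the sample IDs, and the smoothing constant k = 60 (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


semantic_hits = ["doc-7", "doc-2", "doc-9"]  # hypothetical pgvector similarity order
fulltext_hits = ["doc-2", "doc-4", "doc-7"]  # hypothetical full-text search order
fused = reciprocal_rank_fusion([semantic_hits, fulltext_hits])
```

Documents that rank well in both lists rise to the top, which is why doc-2 and doc-7 lead the fused order here even though neither list agrees on which is first.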

📄

Document Ingestion

Two-track pipeline handles PDF, DOCX, XLSX, CSV, email (.eml), HTML, and plain text. Scanned documents use Gemma 4 multimodal AI with Tesseract OCR fallback. Sentence-aware chunking with configurable overlap ensures context is never lost at chunk boundaries.
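Sentence-aware chunking with overlap can be sketched as follows. This is a simplified illustration under stated assumptions: the regex sentence splitter, the character budget, and the function name are hypothetical, and the real pipeline may split and size chunks differently.

```python
import re


def chunk_sentences(text: str, max_chars: int = 200, overlap_sentences: int = 1) -> list[str]:
    """Group whole sentences into chunks, repeating the last few sentences
    of each chunk at the start of the next one."""
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward so context spans the boundary.
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = chunk_sentences("Alpha one. Beta two. Gamma three. Delta four.", max_chars=25)
```

Because chunks always break between sentences and each new chunk starts with the tail of the previous one, a passage that straddles a boundary still appears intact in at least one chunk.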

🔒

Exemption Detection

Rules-primary engine with built-in PII patterns — SSN, phone, email, credit card, date of birth — plus statutory keyword matching across all 50 states and DC (180 rules). Auditability dashboard tracks acceptance/rejection rates by category with CSV/JSON export. All flags require explicit human confirmation.
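A rules-first PII pass might look like the sketch below. The regular expressions are deliberately simple illustrations, not the shipped rule set, and the Flag type and its "pending_review" status are hypothetical names chosen to mirror the requirement that every flag waits for human confirmation.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumption); production rules would be stricter.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


@dataclass
class Flag:
    category: str
    start: int
    end: int
    status: str = "pending_review"  # a flag is a suggestion, never auto-accepted


def flag_pii(text: str) -> list[Flag]:
    """Return candidate PII spans in document order for a human to review."""
    flags = [
        Flag(category, match.start(), match.end())
        for category, pattern in PII_PATTERNS.items()
        for match in pattern.finditer(text)
    ]
    return sorted(flags, key=lambda flag: flag.start)


sample = "SSN 123-45-6789, call 555-867-5309 or jane@example.gov."
flags = flag_pii(sample)
```

The function only proposes spans; accepting or rejecting a flag is left to the reviewer, which is also what makes acceptance and rejection rates measurable per category.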

📋

Request Workflow

Full lifecycle tracking across 10 statuses: received, clarification needed, assigned, searching, in review, ready for release, drafted, approved, fulfilled, and closed. Deadline alerts surface approaching and overdue requests. Each request also carries a timeline, messaging, fee tracking, and AI-drafted response letters.
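The ten statuses and the deadline alerts can be modeled roughly as below. The enum values follow the list above, but the identifier spellings, the three-day warning window, and the choice to silence alerts for fulfilled and closed requests are assumptions; statutory deadlines vary by jurisdiction.

```python
from datetime import date
from enum import Enum
from typing import Optional


class RequestStatus(str, Enum):
    RECEIVED = "received"
    CLARIFICATION_NEEDED = "clarification_needed"
    ASSIGNED = "assigned"
    SEARCHING = "searching"
    IN_REVIEW = "in_review"
    READY_FOR_RELEASE = "ready_for_release"
    DRAFTED = "drafted"
    APPROVED = "approved"
    FULFILLED = "fulfilled"
    CLOSED = "closed"


# Assumption: finished requests no longer need deadline alerts.
TERMINAL = {RequestStatus.FULFILLED, RequestStatus.CLOSED}


def deadline_alert(status: RequestStatus, due: date, today: date,
                   warn_days: int = 3) -> Optional[str]:
    """Classify a request as 'overdue', 'approaching', or None (no alert)."""
    if status in TERMINAL:
        return None
    days_left = (due - today).days
    if days_left < 0:
        return "overdue"
    if days_left <= warn_days:
        return "approaching"
    return None
```

For example, a request in the searching status that is due in two days would be reported as approaching, and the same request one day past its due date as overdue.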

Compliance by Design

Hash-chained audit logs with SHA-256 tamper evidence and automatic archival. Human-in-the-loop enforced at the API layer — no auto-redaction, no auto-approval. Ships with 5 compliance templates: AI Use Disclosure, CAIA Impact Assessment, AI Governance Policy, Response Letter Disclosure, and Data Residency Attestation. Verification script confirms zero outbound data transmission.
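The tamper-evidence idea behind a hash-chained log can be shown in a few lines. This is a generic sketch of the technique, not the project's schema: the field names, the all-zero genesis value, and the JSON canonicalization are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous entry's hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], payload: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev_hash": prev, "hash": entry_hash(prev, payload)})


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


audit_log: list[dict] = []
append_entry(audit_log, {"actor": "clerk-1", "action": "flag_accepted"})
append_entry(audit_log, {"actor": "clerk-2", "action": "letter_approved"})
```

Since each hash covers the one before it, changing an old entry invalidates every later entry, so tampering is detectable by a single pass over the log.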

🔗

Federation-Ready

Service account API keys (hashed before storage) enable authenticated REST access between CivicRecords AI instances. The foundation for cross-jurisdiction record discovery — a single staff member can query cooperating cities without leaving their own system.
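Storing only a hash of each service account key can be sketched as follows. The source does not say which hash or key format is used, so the SHA-256 digest, the 32-byte random key, and both function names are assumptions; a plain fast hash is reasonable here only because the key is long and random rather than a human-chosen password.

```python
import hashlib
import hmac
import secrets


def issue_api_key() -> tuple[str, str]:
    """Create a key; return (plaintext shown once to the operator, digest to store)."""
    plaintext = secrets.token_urlsafe(32)
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, digest


def verify_api_key(presented: str, stored_digest: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)


plaintext_key, stored = issue_api_key()
```

With this layout a database leak exposes only digests, and the constant-time comparison avoids leaking how many leading characters of a guess were correct.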

🚀

Guided Onboarding

Step-by-step setup wizard walks administrators through initial configuration — LLM model selection, department setup, user creation, and first document ingestion — so the system is production-ready in under an hour.

📁

Systems Catalog

Structured registry of every record system your jurisdiction operates — file shares, databases, email archives, cloud drives. Each system tracks location, retention schedule, and responsible department, forming the backbone of responsive search.

🔌

Connector Framework

Pluggable adapter architecture lets administrators connect CivicRecords AI directly to live record sources. Ships with four implemented connector types: file_system (local/mounted directories), manual_drop (watched drop folders), rest_api (generic REST API — API key / Bearer / OAuth2 / Basic; JSON/XML/CSV; page/offset/cursor pagination), and odbc (SQL databases via pyodbc, row-as-document with SQL-injection guards). Per-source cron scheduling (croniter, UTC) with configurable presets. Idempotent sync via content-hash dedup for binary sources and source-path dedup for structured sources. Roadmap: IMAP email, SMB/NFS, SharePoint.
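The two dedup strategies that make sync idempotent can be illustrated with in-memory sets. This is a sketch under assumptions: the function names are hypothetical, SHA-256 is assumed as the content hash, and a real connector would persist the seen keys in the database rather than in a Python set.

```python
import hashlib


def should_ingest_binary(content: bytes, seen_hashes: set[str]) -> bool:
    """Content-hash dedup: skip a file whose bytes were already ingested."""
    digest = hashlib.sha256(content).hexdigest()
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    return True


def should_ingest_structured(source_path: str, seen_paths: set[str]) -> bool:
    """Source-path dedup: skip a row or record whose path was already ingested."""
    if source_path in seen_paths:
        return False
    seen_paths.add(source_path)
    return True


hashes: set[str] = set()
paths: set[str] = set()
first = should_ingest_binary(b"%PDF-1.7 minutes", hashes)
repeat = should_ingest_binary(b"%PDF-1.7 minutes", hashes)
```

Running the same sync twice therefore ingests nothing new the second time: identical bytes hash to the same digest even if the file was renamed, while structured rows are keyed by where they came from.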

📈

Analytics Dashboard

Real-time metrics on request volume, average response time, exemption rates by category, and staff workload. Exportable reports help demonstrate statutory compliance and identify process bottlenecks before they become deadline violations.

Response Letter Generation

AI-drafted response letters pre-populated with request details, responsive documents, and statutory exemption citations. Staff review and approve every letter before it leaves the system — the AI drafts, humans decide.

🔔

Notification Service

Configurable email and in-app alerts for deadline approach, assignment changes, and requester follow-ups. Notification rules are set per department so each team gets the alerts that matter to them — no noise, no missed deadlines.

🌟

Design System

Built on shadcn/ui with a consistent component library. Responsive shell: fixed 240px sidebar on desktop, hamburger-driven slide-in drawer below 768px with focus trap, ESC close, and auto-close on route change. Keyboard-navigable throughout. Admin forms use programmatic label↔input associations, role="alert" validation errors with actionable copy, role="radiogroup" for segmented choices, and 44px touch targets, targeting WCAG 2.2 AA. Every screen follows the same visual language, so staff do not have to relearn the interface from one page to the next.

🕑

Request Timeline & Messaging

Every request carries a full chronological timeline — intake, assignment, search queries, review actions, approvals, and response. Built-in secure messaging lets staff communicate with requesters and internal reviewers from within the request record.

🏢

Department Access Controls

Staff are scoped to their assigned department — they see only their department's requests, documents, and exemption flags. Admins retain full org-wide visibility. Department CRUD with audit logging ensures every access boundary change is tracked.
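Department scoping reduces to a filter applied before any results are returned. The sketch below uses hypothetical User and Request types and hypothetical role names ("admin", "staff"); it shows the rule itself, not how the API layer enforces it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    role: str        # "admin" or "staff" (hypothetical role names)
    department: str


@dataclass(frozen=True)
class Request:
    id: int
    department: str


def visible_requests(user: User, requests: list[Request]) -> list[Request]:
    """Admins see everything; staff see only their own department's requests."""
    if user.role == "admin":
        return list(requests)
    return [r for r in requests if r.department == user.department]


all_requests = [Request(1, "police"), Request(2, "clerk"), Request(3, "police")]
clerk_view = visible_requests(User("dana", "staff", "clerk"), all_requests)
admin_view = visible_requests(User("lee", "admin", "it"), all_requests)
```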

📜

Compliance Templates

Five ready-to-use compliance documents ship with the product: AI Use Disclosure, Response Letter Disclosure Language, CAIA Impact Assessment, AI Governance Policy, and Data Residency Attestation. Templates render with your city profile data and are customizable by admins.

🔴

Sync Failure Tracking & Circuit Breaker

Per-record failure tracking with two-layer retry — task-level exponential backoff absorbs transient network errors; record-level retry handles persistent connector failures (up to 5 attempts over 7 days). Automatic circuit breaker suspends a source after 5 consecutive full-run failures and sends admin notifications. Health status (Healthy / Degraded / Circuit Open) displayed live on every source card. Admin UI surfaces failed records with bulk retry and dismiss actions, per-record history, and a one-click unpause path once the underlying issue is resolved.
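The circuit breaker behavior can be sketched as a small state holder per source. The threshold of 5 consecutive failures comes from the description above; the class and method names are hypothetical, and treating any nonzero failure streak as "Degraded" is an assumption about how that status is defined.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 5  # consecutive full-run failures before the source is suspended


@dataclass
class SourceHealth:
    consecutive_failures: int = 0
    circuit_open: bool = False

    def record_run(self, succeeded: bool) -> None:
        """Update state after a full sync run; a suspended source ignores runs."""
        if self.circuit_open:
            return
        if succeeded:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_THRESHOLD:
            self.circuit_open = True  # an admin notification would be sent here

    def unpause(self) -> None:
        """Admin action once the underlying issue is resolved."""
        self.circuit_open = False
        self.consecutive_failures = 0

    @property
    def status(self) -> str:
        if self.circuit_open:
            return "Circuit Open"
        return "Degraded" if self.consecutive_failures else "Healthy"


source = SourceHealth()
for _ in range(4):
    source.record_run(succeeded=False)
status_after_four = source.status
source.record_run(succeeded=False)
status_after_five = source.status
```

A single successful run resets the streak, so only an unbroken series of failures suspends the source, and nothing runs again until an administrator unpauses it.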


Tech stack

Standard, well-supported open-source components. No proprietary runtimes. No licensing surprises.

Backend
Python 3.12 + FastAPI
SQLAlchemy 2.0, Alembic, Celery, fastapi-users
Frontend
React 18
Tailwind CSS, shadcn/ui — 14 admin pages
Database
PostgreSQL 17 + pgvector
Hybrid semantic + full-text search in one store
LLM Runtime
Ollama
Gemma 4 recommended; nomic-embed-text for embeddings
Queue / Cache
Redis 7.2 + Celery
Async ingestion and scheduled tasks
Deployment
Docker Compose
7 services, single-command install, no internet required
Licenses
All permissive
MIT, Apache 2.0, BSD, LGPL, MPL — no AGPL or SSPL
Hardware
8+ cores / 32 GB RAM
50 GB disk · AMD GPU auto-detection (ROCm/DirectML) · runs on existing city hardware

Supported platforms

Identical Docker containers on every OS. Install scripts included for each platform.

Windows 10 / 11  (Docker Desktop)
macOS 13+  (Docker Desktop)
Ubuntu 22.04+  (Docker Engine)
Debian 12+  (Docker Engine)

All platforms run identical Linux containers. Windows users get install.ps1; macOS and Linux users get install.sh.