Secure Database Gateway for AI in Go
Table of Contents
Heimdall: Building a Secure Database Gateway for AI in Go
As large language models (LLMs) like Claude, Gemini, GPT become increasingly integrated into our workflows, the need to safely connect them to live data has never been more critical. How do you let an AI query a database without risking a hallucinated DROP TABLE or something? How do you prevent an LLM from reasoning about which dialect of SQL to use?
This was the question that led to the creation of Heimdall, a universal database gateway designed to be the “all-seeing watchman” between AI and data. Written in Go, Heimdall acts as a Model Context Protocol (MCP) server, providing a secure, governed, and observable layer for database interactions. Currently it works with relational databases like Postgres, Oracle, Mysql, Mssql and Sqlite.
In this post, we’ll dive deep into Heimdall’s architecture, exploring the design patterns, security principles that make it a robust solution for this modern challenge, focussing on the technical decisions and trade-offs.
The “Why”: The Problem with Direct Database Access
Giving an LLM direct access to a database connection string is a non-starter for several reasons:
- Unrestricted Access: The AI could issue any query, including
DROP TABLE,DELETEwithout aWHEREclause, or access sensitive tables. - Data Exfiltration: It could
SELECT *on a massive table, exfiltrating millions of records and overwhelming its own context window. - Sensitive Data Exposure: Results could include PII, financial data, or other sensitive information that the AI should never see.
- Resource Exhaustion: A poorly written or malicious query could lock tables or consume all available database connections.
Heimdall solves these problems by never exposing credentials directly. Instead, it provides the AI with a set of dynamically generated tools and resources, and every query sent through those tools is rigorously vetted by a security pipeline.
Architectural Overview
Heimdall is designed with a clean, modular architecture that emphasizes separation of concerns, testability, and extensibility.
The core components are assembled at startup in cmd/heimdall/main.go:
- Configuration: A single YAML file defines all databases, their types, DSNs, and security policies.
- Database Manager: A connection manager abstracts away the specifics of each database dialect (PostgreSQL, MySQL, etc.) using a
Dialectinterface. This is a classic example of the Strategy Pattern. It handles connection pooling and dialect-specific SQL syntax. - MCP Server: This component, built on
mark3labs/mcp-go, handles all communication with the AI overstdio. - Execution Engine: This is the orchestrator that ties everything together. When a tool call comes in, the engine passes the query through the SafeGuard pipeline before execution.
- SafeGuard Pipeline: The heart of Heimdall’s security model.
- Observability: Key events, such as tool calls, SafeGuard violations, and query execution times, are logged in a structured format (e.g., JSON) to standard output, allowing for easy integration with modern logging and monitoring platforms.
At its core, the request flow is simple but powerful. An AI model, such as Claude, makes a request through a client application. The client, via the Model Context Protocol (MCP), invokes a specific tool registered by Heimdall (e.g., read_finance_db). Heimdall receives this request, and the Execution Engine sends the SQL query through the full SafeGuard pipeline. If the query is deemed safe, it’s executed against the target database. The results are then processed by the Masker before being returned through the MCP server to the AI. This entire process is opaque to the model; it only sees a tool that safely returns data.
Dynamic Tool Registration: A Two-Layer Approach
A core feature of Heimdall is its ability to dynamically expose database capabilities as tools. This is achieved through a secure, two-layer system that separates the safe, high-level tools shown to the LLM from the powerful, low-level tools that execute the work.
1. High-Level Predefined Tools (The LLM’s Interface)
The primary way an LLM interacts with a database is through high-level tools defined directly in the heimdall.yaml configuration. These tools represent specific, semantic actions, like get_todays_sales_summary. Each definition maps the tool name to a hardcoded SQL query.
For example, a resource like finance_db://daily_summary (as seen in examples/heimdall.yaml) has its SQL query defined and managed entirely within Heimdall’s configuration. The LLM can invoke this tool, but it cannot change the underlying query. This is the foundation of Heimdall’s security: the SQL is trusted because it comes from the configuration, not the model.
2. Low-Level Generic Tools (The Internal Engine)
Under the hood, Heimdall uses a set of generic, low-level tools for the actual database interaction. These correspond to the Go files in internal/tools/:
inspect_<db_id>: For schema discovery.read_<db_id>: A generic executor for runningSELECTqueries.task_<db_id>: A generic executor forINSERT,UPDATE, etc.
The read tool, for instance, is implemented to accept a SQL string from its caller.
How They Work Together
This is why the two-layer setup is secure, when an LLM calls a high-level tool like daily_summary:
- Heimdall’s handler for that predefined tool is activated.
- The handler reads the safe, corresponding SQL query from the configuration file.
- It then internally invokes the low-level
readtool, programmatically injecting the predefined SQL into the request.
This means the low-level read tool gets its SQL from a trusted, internal source (the high-level tool’s handler), not from the LLM. This two-step process allows Heimdall to use secure, parameterized queries defined in its configuration, completely preventing SQL injection while providing a powerful and flexible architecture.
The SafeGuard Pipeline: A Multi-Layered Security Model
Even though the SQL for predefined tools originates from a trusted configuration file, but we don’t even want to trust our own configuration files, the SafeGuard pipeline provides a crucial defense-in-depth layer. It protects against human error (e.g., an administrator accidentally putting a DELETE statement in a read-only tool’s configuration) and automates best practices like row limiting. This ensures that even if one layer fails or is misconfigured, others are in place to prevent a breach.
1. AST-Based Analysis
First, the query is parsed into an Abstract Syntax Tree (AST) using vitess/sqlparser. The Analyzer walks the AST to determine the statement type (SELECT, INSERT, etc.) and extract table names. Relying on an AST is far more reliable than simple keyword matching.
2. Operation Enforcement
The Enforcer checks the detected statement type against the allowed_operations list for the database. If a database is configured to be read-only (["SELECT"]), any attempt to run an INSERT is rejected immediately. It also checks queries against a table allow_list.
3. Row Limit Injection
To prevent the AI from requesting enormous datasets, the Limiter automatically injects a row limit into every SELECT query. This requires dialect-specific logic (e.g., LIMIT n, TOP n, FETCH FIRST n ROWS ONLY). The Limiter is smart enough to find and replace existing limits if they are more permissive than the configured max_rows.
4. Query Timeouts
Every query is executed within a context.WithTimeout. This prevents long-running queries from tying up database resources.
5. Data Masking
After a query executes but before the results are sent back to the AI, the Masker steps in. For any columns whose names match a masking rule in the configuration, it can either:
- Redact: Replace a column’s value with
[REDACTED]. - Hash: Replace a value with a deterministic, salted SHA-256 hash. This is useful for preserving the ability to group by or join on an ID without revealing the ID itself.
For example, a database configuration in heimdall.yaml might include:
mask:
- column_name: "email_address"
type: "redact"
- column_name: "user_ssn"
type: "hash"
This ensures that even if a query accidentally selects these columns, their sensitive values will never be exposed to the LLM.
Golang Design Patterns in Action
Here’s how we used standard Go patterns to build this secure and extensible tool.
The Dialect Interface for Extensibility
The internal/database/dialect.go interface abstracts all database-specific logic, making it straightforward to add support for new databases without changing core application logic.
type Dialect interface {
DriverName() string
NormalizeDSN(dsn string) (string, error)
ReadOnlySessionSQL() string
LimitSQL(maxRows int) LimitPlacement
}
Dependency Injection for Testability
Heimdall relies on dependency injection to create loosely coupled, testable components. Instead of components creating their own dependencies, they receive them as inputs during construction. The main function acts as the assembler, constructing services and “injecting” them into the components that need them.
A clear example is the creation of the execution.Engine:
func main() {
// ... load config ...
// Construct dependencies
manager, _ := database.NewManager(cfg)
analyzer := safeguard.NewAnalyzer()
enforcer := safeguard.NewEnforcer(analyzer)
limiter := safeguard.NewLimiter(analyzer)
masker := masking.NewMasker(salt)
// Inject dependencies into the engine
engine := execution.NewEngine(manager, enforcer, limiter, masker)
// ... start server with the engine
}
This pattern is the key to effective unit testing. For example, to test the execution.Engine in isolation, we don’t need a real database. We can simply pass mock implementations of its dependencies (like the database.Manager or safeguard.Enforcer) and assert that the engine behaves correctly based on the inputs it receives from those mocks.
Pure Go Drivers for Portability
Using pure Go database drivers avoids CGO, resulting in a statically-linked binary with no external dependencies that can be deployed anywhere.
Getting Hands-On: Running Heimdall
Want to try it yourself? It only takes a few minutes to spin up locally and connect it to an LLM.
Building and Running the Binary
-
Build the binary:
# From the root of the repository make buildThis creates a single executable file named
heimdall. -
Run Heimdall:
# Run heimdall, pointing it to your config file ./heimdall -config examples/heimdall.yamlThe server will start and listen for JSON-RPC messages on
stdio.
Connecting to an LLM (Claude Desktop Example)
Heimdall acts as an MCP server. To connect it to a client like the Claude Desktop app, edit your claude_desktop_config.json to include the command to run Heimdall.
{
"mcpServers": {
"heimdall": {
"command": "/path/to/your/heimdall/binary",
"args": ["-config", "/path/to/your/heimdall.yaml"]
}
}
}
When Claude starts, it will launch the Heimdall process, and the dynamically registered tools will become available to the model.
A Robust Testing Strategy
Heimdall has a multi-layered testing strategy, including unit tests and a full integration test suite that uses Docker to spin up real databases.
Local Databases with Docker
The docker-compose.yml file defines services for PostgreSQL, MySQL, and MSSQL.
- To start all databases:
docker-compose up -d - Database Seeding: On startup, each database is seeded with test data from the
data/init/directory.
Running the Tests
The Makefile provides convenient targets for running tests.
- Unit Tests:
make test(do not require Docker) - Integration Tests:
make test-integration(require Docker)
Design Trade-offs and Security Considerations
Here are Heimdall’s design trade-offs we accepted
Security Focus: Is Heimdall Immune to SQL Injection?
This is the most critical question for any security gateway, and Heimdall was designed from the ground up to eliminate this threat. The answer is yes, Heimdall is immune to SQL injection because of a fundamental design choice: the LLM never provides raw SQL.
An LLM interacting with Heimdall does not generate a query like SELECT * FROM employees WHERE id = 123. Instead, it might invoke a tool with parameters that are defined in Heimdall’s configuration, for example: read_employees(id=123).
Heimdall’s execution.Engine maps this tool call to a predefined SQL query string from its configuration, such as SELECT * FROM employees WHERE id = ?;. It then passes the LLM-provided value (123) as a parameter to the database driver. This allows the driver to use secure parameterized queries (also known as prepared statements).
This approach ensures that values from the LLM are treated strictly as data, not as executable code. It is impossible for a malicious value (like 'x' OR 1=1) to break out of the data context and alter the structure of the SQL query.
While the SafeGuard pipeline provides crucial defense-in-depth (enforcing which tools can be run, against which tables, and limiting results), the primary protection against SQL injection is the tool-based abstraction itself. By never allowing the LLM to write raw SQL, Heimdall’s design makes this entire category of attack impossible.
Conclusion
Heimdall is more than just a tool; it’s a blueprint for how to think about securely integrating powerful, non-deterministic AI with legacy, deterministic systems. By applying classic software engineering principles—layered security, modular design, and solid abstractions—we can build a bridge between these two worlds that is both powerful and safe.
The project demonstrates that with careful architecture, Go is an excellent choice for building high-performance, secure infrastructure for the new generation of AI-powered applications.