Chat with a PostgreSQL DB using SQLChatAgent¶
The SQLChatAgent is
designed to facilitate interactions with an SQL database using natural language.
A ready-to-use script based on the SQLChatAgent is available in the langroid-examples
repo at examples/data-qa/sql-chat/sql_chat.py
(and also in a similar location in the main langroid repo).
This tutorial walks you through how you might use the SQLChatAgent if you were
to write your own script from scratch. We also show some of the internal workings of this Agent.
The agent uses the schema context to generate SQL queries based on a user's input. Here is a tutorial on how to set up an agent with your PostgreSQL database. The steps for other databases are similar. Since the agent implementation relies on SqlAlchemy, it should work with any SQL DB that supports SqlAlchemy. It offers enhanced functionality for MySQL and PostgreSQL by automatically extracting schemas from the database.
Before you begin¶
Security: SQLChatAgent executes LLM-generated SQL
SQLChatAgent runs SQL produced by a language model against your
database. Because the LLM is influenceable by prompt injection — both
through the user's question and through any data the LLM reads back from
the database — an attacker who can shape either input can attempt to
coerce the agent into running dangerous SQL.
The agent defaults to SELECT-only execution (CVE-2026-25879) and
rejects queries that match known dangerous primitives
(e.g. PostgreSQL COPY ... FROM PROGRAM, MySQL INTO OUTFILE, SQLite
load_extension, MSSQL xp_cmdshell). See
Security configuration below.
Independently of the agent's defenses, you should always:
- Connect using a least-privilege database role that does not have
pg_execute_server_program,FILE,xp_cmdshell, or other privileges enabling code execution or filesystem access. - Treat the agent's prompt input as untrusted if it can come from end users.
- Avoid pointing the agent at databases that contain other tenants' data.
Data Privacy Considerations
Since the SQLChatAgent uses the OpenAI GPT-4 as the underlying language model, users should be aware that database information processed by the agent may be sent to OpenAI's API and should therefore be comfortable with this.
-
Install PostgreSQL dev libraries for your platform, e.g.
sudo apt-get install libpq-devon Ubuntu,brew install postgresqlon Mac, etc.
-
Follow the general setup guide to get started with Langroid (mainly, install
langroidinto your virtual env, and set up suitable values in the.envfile). Note that to use the SQLChatAgent with a PostgreSQL database, you need to install thelangroid[postgres]extra, e.g.:pip install "langroid[postgres]"orpoetry add "langroid[postgres]"oruv add "langroid[postgres]"poetry install -E postgresoruv pip install --extra postgres -r pyproject.toml
If this gives you an error, try pip install psycopg2-binary in your virtualenv.
Initialize the agent¶
from langroid.agent.special.sql.sql_chat_agent import (
SQLChatAgent,
SQLChatAgentConfig,
)
agent = SQLChatAgent(
config=SQLChatAgentConfig(
database_uri="postgresql://example.db",
)
)
Configuration¶
The following components of SQLChatAgentConfig are optional but strongly
recommended for improved results:
context_descriptions: A nested dictionary that specifies the schema context for the agent to use when generating queries, for example:
{
"table1": {
"description": "description of table1",
"columns": {
"column1": "description of column1 in table1",
"column2": "description of column2 in table1"
}
},
"employees": {
"description": "The 'employees' table contains information about the employees. It relates to the 'departments' and 'sales' tables via foreign keys.",
"columns": {
"id": "A unique identifier for an employee. This ID is used as a foreign key in the 'sales' table.",
"name": "The name of the employee.",
"department_id": "The ID of the department the employee belongs to. This is a foreign key referencing the 'id' in the 'departments' table."
}
}
}
By default, if no context description json file is provided in the config, the agent will automatically generate the file using the built-in Postgres table/column comments.
-
schema_tools: When set toTrue, activates a retrieval mode where the agent systematically requests only the parts of the schemas relevant to the current query. When this option is enabled, the agent performs the following steps:- Asks for table names.
- Asks for table descriptions and column names from possibly relevant table names.
- Asks for column descriptions from possibly relevant columns.
- Writes the SQL query.
Setting schema_tools=True is especially useful for large schemas where it is costly or impossible
to include the entire schema in a query context.
By selectively using only the relevant parts of the context descriptions, this mode
reduces token usage, though it may result in 1-3 additional OpenAI API calls before
the final SQL query is generated.
Security configuration¶
SQLChatAgentConfig exposes two flags that govern which SQL the agent will
execute:
allowed_statement_types: List[str] = ["SELECT"]— the SQL statement types the agent is permitted to run. Each query is parsed withsqlglot; any statement whose top level falls outside this list is rejected before it reaches the database. Multi-statement queries are validated statement-by-statement, so a payload likeSELECT 1; DROP TABLE tis also rejected.allow_dangerous_operations: bool = False— whenTrue, disables the statement-type allowlist and the dangerous-pattern blocklist. Only set this if (a) the agent's prompt input is fully trusted, and (b) the database connection uses a least-privilege role.
In addition to the statement-type check, the agent maintains a dialect-aware blocklist of patterns that enable code execution, arbitrary file access, or other escape primitives — for example:
| Dialect | Blocked construct (examples) |
|---|---|
| PostgreSQL | COPY … FROM/TO PROGRAM, pg_read_server_files, lo_import/export, CREATE EXTENSION, CREATE FUNCTION |
| MySQL | INTO OUTFILE, INTO DUMPFILE, LOAD_FILE(…), LOAD DATA |
| SQLite | load_extension(…), ATTACH DATABASE |
| SQL Server | xp_cmdshell, sp_OACreate, OPENROWSET, BULK INSERT |
To enable database writes from the agent, extend the allowlist explicitly rather than disabling all checks:
config = SQLChatAgentConfig(
database_uri="postgresql://example.db",
allowed_statement_types=["SELECT", "INSERT", "UPDATE", "DELETE"],
)
If you need DDL or other operations the allowlist doesn't cover, the
recommended approach is still to keep allow_dangerous_operations=False
and grant the DB role exactly the privileges you need, rather than relying
on the agent to filter SQL.
Putting it all together¶
In the code below, we will allow the agent to generate the context descriptions
from table comments by excluding the context_descriptions config option.
We set schema_tools to True to enable the retrieval mode.
from langroid.agent.special.sql.sql_chat_agent import (
SQLChatAgent,
SQLChatAgentConfig,
)
# Initialize SQLChatAgent with a PostgreSQL database URI and enable schema_tools
agent = SQLChatAgent(gi
config = SQLChatAgentConfig(
database_uri="postgresql://example.db",
schema_tools=True,
)
)
# Run the task to interact with the SQLChatAgent
task = Task(agent)
task.run()
By following these steps, you should now be able to set up an SQLChatAgent
that interacts with a PostgreSQL database, making querying a seamless
experience.
In the langroid repo we have provided a ready-to-use script
sql_chat.py
based on the above, that you can use right away to interact with your PostgreSQL database:
This script will prompt you for the database URI, and then start the agent.