Complete reference for all modules, classes, functions, and tools in the PII Agent project.
- `data/model.py` — ORM Models
- `data/load_csv.py` — Data Loader
- `data/docker-compose.yml` — Database Container
- `src/agno/db-agent.py` — Agno Agent
- `src/agno/prompts.py` — Agno Prompts
- `src/msft-agent-framework/data-agent.py` — MSFT Agent
- Test Coverage
## data/model.py — ORM Models

SQLAlchemy ORM definitions for the database schema.

### Base

SQLAlchemy 2.x declarative base class. All ORM models inherit from this.

### Person

ORM model for the `persons` table. Stores PII records with all fields as strings.
Table name: `persons`

| Column | Type | Constraints | Description |
|---|---|---|---|
| `id` | `String` | Primary Key | Unique record identifier (SSN-format string) |
| `gender` | `String` | — | `"m"` or `"f"` |
| `birthdate` | `String` | — | Date in `YYYY/MM/DD` format |
| `maiden_name` | `String` | — | Maiden name |
| `lname` | `String` | — | Last name |
| `fname` | `String` | — | First name |
| `address` | `String` | — | Street address |
| `city` | `String` | — | City |
| `state` | `String` | — | Two-letter state code |
| `zip` | `String` | — | ZIP code (stored as string to preserve leading zeros) |
| `phone` | `String` | — | Phone number |
| `email` | `String` | — | Email address |
| `cc_type` | `String` | — | Credit card type (`"m"` = Mastercard, `"v"` = Visa) |
| `cc_number` | `String` | — | Credit card number (space-separated groups) |
| `cc_cvc` | `String` | — | Card verification code (stored as string to preserve leading zeros) |
| `cc_expiredate` | `String` | — | Card expiration date in `YYYY/MM/DD` format |
All columns use `Mapped[str]` with `mapped_column(String)`. All dates and numeric-looking values are intentionally stored as strings for demo simplicity.
## data/load_csv.py — Data Loader

Idempotent CSV loader that populates the `persons` table from a CSV file.
```python
DEFAULT_DATABASE_URL = "postgresql+psycopg://postgres:MyDbPassword@localhost:5432/pii_demo"
CSV_PATH = Path(__file__).parent / "sample-pii-data.csv"
```

### load_csv(engine, csv_path)

Load CSV rows into the `persons` table, skipping rows whose `id` already exists.
Parameters:

| Parameter | Type | Description |
|---|---|---|
| `engine` | `sqlalchemy.Engine` | SQLAlchemy engine connected to the target database |
| `csv_path` | `pathlib.Path` | Path to the CSV file to load |

Returns: `int` — Number of new rows inserted.
Behavior:

- Calls `Base.metadata.create_all(engine)` to ensure the `persons` table exists
- Reads the CSV file with `csv.DictReader`
- For each row, checks whether a `Person` with that `id` already exists
- Skips existing rows; inserts new rows
- Commits the session and returns the insert count
Idempotency: Running `load_csv()` multiple times with the same data is safe. The second run returns `0`.
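A minimal stdlib-only sketch of this idempotent behavior, using `sqlite3` and a three-column table in place of the project's SQLAlchemy/Postgres setup (column names match the real schema, but the function is illustrative):

```python
import csv
import sqlite3


def load_csv(conn: sqlite3.Connection, csv_path: str) -> int:
    """Insert rows whose id is not already present; return the insert count.
    Mirrors the behavior list above in stdlib-only form."""
    # Ensure the table exists (stands in for Base.metadata.create_all)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS persons (id TEXT PRIMARY KEY, fname TEXT, lname TEXT)"
    )
    existing = {row[0] for row in conn.execute("SELECT id FROM persons")}
    inserted = 0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["id"] in existing:  # skip rows already loaded
                continue
            conn.execute(
                "INSERT INTO persons (id, fname, lname) VALUES (?, ?, ?)",
                (row["id"], row["fname"], row["lname"]),
            )
            existing.add(row["id"])
            inserted += 1
    conn.commit()
    return inserted
```

Running it twice against the same file inserts on the first pass and returns `0` on the second, which is exactly the property the idempotency tests assert.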
CLI entry point. Reads `DATABASE_URL` from the environment (falls back to `DEFAULT_DATABASE_URL`), creates an engine, calls `load_csv()`, and prints the result.
Usage:

```shell
python -m data.load_csv
# Output: Loaded 30 new rows into persons table.
```

## data/docker-compose.yml — Database Container

PostgreSQL 16 container definition.
| Setting | Value |
|---|---|
| Image | `postgres:16` |
| `POSTGRES_USER` | `postgres` |
| `POSTGRES_PASSWORD` | `MyDbPassword` |
| `POSTGRES_DB` | `pii_demo` |
| Host port | 5432 |
| Volume | `pgdata` (named volume at `/var/lib/postgresql/data`) |
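The settings above correspond to a compose file roughly like the following (a sketch, not the repository's exact file; the service name `db` is an assumption):

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: MyDbPassword
      POSTGRES_DB: pii_demo
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
```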
Commands:

```shell
docker compose -f data/docker-compose.yml up -d    # Start
docker compose -f data/docker-compose.yml down     # Stop
docker compose -f data/docker-compose.yml down -v  # Stop and delete data
```

## src/agno/db-agent.py — Agno Agent

Agno framework agent for querying the PII database.
| Setting | Value |
|---|---|
| Name | `"Records Manager"` |
| Model | `LiteLLM(id="claude-sonnet-4-5")` |
| API base | `os.getenv("LLM_BASE_URL")` |
| API key | `os.getenv("LLM_API_KEY")` |
| `top_p` | `None` (disabled) |
Uses Agno's built-in `PostgresTools`:

```python
PostgresTools(
    host="localhost",
    port=5432,
    db_name="pii_demo",
    user="postgres",
    password="MyDbPassword",
    table_schema="public",
)
```

`PostgresTools` provides automatic schema discovery and SQL execution. The agent does not need to call individual tools — the framework handles tool orchestration internally.
Instructions (inline system prompt):

```
You are a records manager for an information broker. You have access to a Postgres
database with information about the people we deal with. You can run SQL queries to
get information about these people and provide it to the user.
```

Usage:

```shell
python src/agno/db-agent.py
```

Single-shot execution. Edit the `agent.print_response(...)` call to change the query.
## src/agno/prompts.py — Agno Prompts

```python
instructions = """
You are a helpful assistant that provides information about people inside of our database."""
```

A simple system prompt string. Not currently used by `db-agent.py` (which defines its own inline instructions).
## src/msft-agent-framework/data-agent.py — MSFT Agent

Microsoft Agent Framework agent with schema-aware tools and an interactive REPL.
```python
DB_CONNINFO = os.getenv(
    "DB_CONNINFO",
    "host=localhost port=5432 dbname=pii_demo user=postgres password=MyDbPassword",
)
```

A libpq-format connection string, overridable via the `DB_CONNINFO` environment variable.
All tools are decorated with `@tool(approval_mode="never_require")`, meaning they execute without user confirmation.

### get_tables()

List all tables in the public schema.
Parameters: None
Returns: Comma-separated table names, or `"No tables found."`, or an error message.
SQL executed:

```sql
SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public' ORDER BY table_name;
```

### get_columns(table_name)

Get column names and data types for a given table.
Parameters:

| Parameter | Type | Description |
|---|---|---|
| `table_name` | `Annotated[str, "Name of the table to inspect"]` | Table to inspect |
Returns: Newline-separated `"column_name (data_type)"` strings, or `"No columns found for table '...'."`, or an error message.
SQL executed:

```sql
SELECT column_name, data_type FROM information_schema.columns
WHERE table_name = %s ORDER BY ordinal_position;
```

### run_query(sql)

Execute a SQL query and return formatted results.
Parameters:

| Parameter | Type | Description |
|---|---|---|
| `sql` | `Annotated[str, "SQL query to execute against the database"]` | SQL to execute |
Returns: Pipe-separated header and rows, or `"Query returned no rows."`, or `"Query executed successfully (no results returned)."` for non-SELECT statements, or an error message.
Output format:

```
column1 | column2 | column3
value1 | value2 | value3
value4 | value5 | value6
```
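The return-value conventions above can be sketched independently of the database driver. Here `sqlite3` stands in for the real Postgres connection, and the function name merely mirrors the tool (illustrative, not the project's code):

```python
import sqlite3


def run_query(conn: sqlite3.Connection, sql: str) -> str:
    """Execute SQL and format results as pipe-separated rows, following
    the four return cases documented above."""
    try:
        cur = conn.execute(sql)
    except sqlite3.Error as exc:
        return f"Error: {exc}"
    if cur.description is None:  # non-SELECT statement: no result set
        conn.commit()
        return "Query executed successfully (no results returned)."
    rows = cur.fetchall()
    if not rows:
        return "Query returned no rows."
    header = " | ".join(col[0] for col in cur.description)
    body = "\n".join(" | ".join(str(v) for v in row) for row in rows)
    return f"{header}\n{body}"
```

The pipe-separated plain-text format keeps the tool output easy for the model to read back without any client-side table rendering.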
Agent setup:

```python
SYSTEM_PROMPT = (
    "You are a records manager with access to a Postgres database. "
    "Before answering questions about the data, use get_tables and get_columns "
    "to discover the schema. Then use run_query to execute SQL queries. "
    "Present results clearly to the user."
)

client = OpenAIChatClient(
    base_url=os.getenv("LLM_BASE_URL"),
    api_key=os.getenv("LLM_API_KEY"),
    model_id=os.getenv("LLM_MODEL_ID", "gpt-4o"),
)

agent = Agent(
    client=client,
    instructions=SYSTEM_PROMPT,
    name="PII Data Agent",
    tools=[get_tables, get_columns, run_query],
)
```

Interactive REPL loop:
- Creates a session with `agent.create_session()`
- Prompts the user with `"You: "`
- Streams agent responses via `agent.run(input, session=session, stream=True)`
- Exits on `"quit"`, `"exit"`, empty input, `EOFError`, or `KeyboardInterrupt`
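The loop described above can be skeletonized like this, with the agent call and input source abstracted away (`read_line` and `respond` are hypothetical stand-ins for `input()` and the streaming agent call):

```python
def should_exit(user_input: str) -> bool:
    """True for the exit conditions listed above: 'quit', 'exit', or empty input."""
    return user_input.strip().lower() in ("quit", "exit", "")


def run_repl(read_line, respond) -> None:
    """Minimal REPL skeleton: prompt, check exit conditions, hand the
    input to the agent, print its reply, repeat."""
    while True:
        try:
            user_input = read_line("You: ")
        except (EOFError, KeyboardInterrupt):
            break  # Ctrl-D / Ctrl-C ends the session
        if should_exit(user_input):
            break
        print(respond(user_input))
```

Keeping the exit check in its own function makes the termination rules trivial to unit-test without a live agent.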
Usage:

```shell
python src/msft-agent-framework/data-agent.py
```

## Test Coverage

### data/tests/test_model.py

All tests use an in-memory SQLite database. No Postgres required.
| Test | Assertion |
|---|---|
| `test_person_table_columns()` | All 16 expected column names exist in the `persons` table |
| `test_person_primary_key()` | `id` is the sole primary key column |
### data/tests/test_load_csv.py

Uses a temporary 2-row CSV file and in-memory SQLite. Cleans up temp files in `finally` blocks.
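The temp-file pattern might look like this sketch (the helper name and field names are illustrative, not the actual test code):

```python
import csv
import os
import tempfile


def load_rows_from_temp_csv():
    """Write a 2-row CSV to a temp file, read it back, and clean up in a
    finally block, mirroring the test setup described above."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)  # close the low-level handle; we reopen by path
    try:
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["id", "fname", "lname"])
            writer.writeheader()
            writer.writerow({"id": "1", "fname": "Johnson", "lname": "White"})
            writer.writerow({"id": "2", "fname": "Jane", "lname": "Doe"})
        with open(path, newline="") as f:
            return list(csv.DictReader(f))
    finally:
        os.remove(path)  # cleanup runs even if an assertion fails
```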
| Test | Assertion |
|---|---|
| `test_load_csv_inserts_rows()` | Loading a 2-row CSV inserts exactly 2 rows |
| `test_load_csv_is_idempotent()` | Running `load_csv` twice inserts 0 rows on the second run; total remains 2 |
| `test_load_csv_maps_fields_correctly()` | Specific fields match expected values: `fname="Johnson"`, `lname="White"`, `email="jwhite@domain.com"`, `cc_number="5270 4267 6450 5516"` |
### src/msft-agent-framework/test_tools.py

Requires: a running PostgreSQL container with loaded data (30 rows).
| Test | Assertion |
|---|---|
| `test_get_tables_returns_persons()` | `"persons"` appears in the list of public schema tables |
| `test_get_columns_returns_expected()` | `"fname"` and `"email"` are in the column list; total is 16 columns |
| `test_run_query_returns_rows()` | `SELECT COUNT(*) FROM persons` returns 30 |
Running the tests:

```shell
# Unit tests (fast, no dependencies)
python -m pytest data/tests/ -v

# Smoke tests (requires live database)
python -m pytest src/msft-agent-framework/test_tools.py -v

# Single test
python -m pytest data/tests/test_model.py::test_person_table_columns -v