# PostgreSQL Analyzer - Practical MCP + Skills Demo

A production-ready database analysis tool using MCP (Model Context Protocol) and Skills. Connect to any PostgreSQL database, explore schemas, query data, and generate insights.
## What This Does

```
┌────────────────────────────────────────────────────────────────────┐
│                          User Request                              │
│        "Analyze my orders table and find revenue trends"           │
└────────────────────────────────────┬───────────────────────────────┘
                                     │
    ┌────────────────────────────────▼─────────────────────────────┐
    │                  Skill: postgres-analyzer                    │
    │  ┌─────────────────────────────────────────────────────────┐ │
    │  │ Triggers on: database, analysis, query, insights        │ │
    │  │                                                         │ │
    │  │ Workflow:                                               │ │
    │  │ 1. get_schema()       → Understand table structure      │ │
    │  │ 2. get_table_stats()  → Get row counts, samples         │ │
    │  │ 3. execute_query()    → Run revenue analysis SQL        │ │
    │  │ 4. analyze_column()   → Check date ranges, distributions│ │
    │  │ 5. Synthesize         → Generate insights report        │ │
    │  └─────────────────────────────────────────────────────────┘ │
    └────────────────────────────────┬─────────────────────────────┘
                                     │ MCP Protocol
    ┌────────────────────────────────▼─────────────────────────────┐
    │                MCP Server: postgres-analyzer                 │
    │  ┌─────────────────────────────────────────────────────────┐ │
    │  │ Tools:                                                  │ │
    │  │ • get_schema()      - List tables/columns               │ │
    │  │ • execute_query()   - Run SELECT queries                │ │
    │  │ • get_table_stats() - Stats + sample data               │ │
    │  │ • analyze_column()  - Deep column analysis              │ │
    │  │                                                         │ │
    │  │ Safety: Read-only, query limits, injection protection   │ │
    │  └─────────────────────────────────────────────────────────┘ │
    └────────────────────────────────┬─────────────────────────────┘
                                     │ psycopg2
    ┌────────────────────────────────▼─────────────────────────────┐
    │                    PostgreSQL Database                       │
    │            (Any accessible PostgreSQL instance)              │
    └──────────────────────────────────────────────────────────────┘
```
## Quick Start

### 1. Install Dependencies

```bash
cd pg_analyzer_demo
pip install -r pg_mcp_server/requirements.txt
```
### 2. Set Database Connection

```bash
export PG_CONNECTION_STRING="postgresql://user:password@host:port/database"

# Examples:
# Local database:
export PG_CONNECTION_STRING="postgresql://postgres:secret@localhost:5432/myapp"

# Supabase:
export PG_CONNECTION_STRING="postgresql://postgres.xxxx:password@aws-0-region.pooler.supabase.com:5432/postgres"

# Railway/Render:
export PG_CONNECTION_STRING="postgresql://user:pass@host.render.com:5432/dbname"
```
### 3. Test with Demo Client

```bash
python demo.py
```
This interactively guides you through:
- Schema discovery
- Table analysis
- Custom queries
- Column deep-dives
## Components

### MCP Server (`pg_mcp_server/server.py`)

Exposes 4 tools for safe database access:
| Tool | Parameters | Returns |
|---|---|---|
| `get_schema` | `table_name` (optional) | All tables or specific table schema |
| `execute_query` | `query`, `limit` | Query results as markdown table |
| `get_table_stats` | `table_name`, `sample_size` | Row count, column stats, sample rows |
| `analyze_column` | `table_name`, `column_name` | Distribution, nulls, top values |
**Safety Features:**
- **Read-only**: Rejects INSERT/UPDATE/DELETE/DROP/CREATE
- **Query limits**: Auto-limits to 100 rows (max 1000)
- **Connection pooling**: Proper cleanup
- **SQL injection protection**: Uses parameterized queries
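The read-only and limit checks above can be sketched as a small pre-execution guard. This is an illustrative sketch, not the actual code in `pg_mcp_server/server.py`; the function name and regex are assumptions.

```python
import re

# Statements the server rejects outright (hypothetical guard, mirroring the
# safety features listed above).
FORBIDDEN = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|DROP|CREATE|ALTER|TRUNCATE|GRANT)\b",
    re.IGNORECASE,
)

DEFAULT_LIMIT = 100  # auto-limit
MAX_LIMIT = 1000     # hard cap

def validate_query(query, limit=None):
    """Reject writes/DDL and clamp the row limit before execution."""
    if FORBIDDEN.match(query):
        raise ValueError("Only read-only SELECT queries are allowed")
    if not re.match(r"^\s*(SELECT|WITH)\b", query, re.IGNORECASE):
        raise ValueError("Query must start with SELECT or WITH")
    limit = DEFAULT_LIMIT if limit is None else min(int(limit), MAX_LIMIT)
    return query, limit
```

A real server would additionally rely on a read-only database role (see Security Considerations below) rather than regex checks alone.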
### Skill (`pg_analyzer_skill/SKILL.md`)

Teaches the AI:
- **When to use**: Database questions, analysis needs
- **Workflow**: Discovery → Deep Dive → Insights
- **SQL Patterns**: Common analysis queries
- **Safety Rules**: Read-only, performance, PII warnings
- **Output Format**: Structured insights with recommendations
### Helper Script (`pg_analyzer_skill/scripts/generate_report.py`)

Generates formatted markdown reports from analysis results.
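For a sense of what such a report builder might look like, here is a minimal sketch; the real `generate_report.py` may expect a different input shape, and the function signature here is an assumption.

```python
def generate_report(table, row_count, findings):
    """Build a small markdown report (illustrative sketch only)."""
    lines = [
        f"## {table} Analysis",
        "",
        f"**Rows**: {row_count:,}",
        "",
        "**Key Findings**:",
    ]
    lines += [f"- {f}" for f in findings]
    return "\n".join(lines)
```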
## Using with Kimi Code CLI

### Setup

#### 1. Configure MCP Server

Add to `~/.kimi/mcp.json`:
```json
{
  "mcpServers": {
    "postgres": {
      "command": "python3",
      "args": ["/absolute/path/to/pg_mcp_server/server.py"],
      "env": {
        "PG_CONNECTION_STRING": "postgresql://user:pass@host:5432/db"
      }
    }
  }
}
```
#### 2. Install Skill

```bash
mkdir -p ~/.config/agents/skills/
cp -r pg_analyzer_skill ~/.config/agents/skills/
```

Or use a local skill path:

```bash
kimi --skills-dir ./pg_analyzer_skill
```
#### 3. Start Kimi

```bash
kimi
```
## Example Sessions

### Schema Exploration

```
User: What tables are in my database?

→ get_schema()
← Shows all tables with column counts

Kimi: Your database has 4 tables:
• users (8 columns) - User accounts
• orders (10 columns) - Purchase orders
• products (6 columns) - Product catalog
• order_items (5 columns) - Line items
```
### Data Analysis

```
User: Analyze the orders table

→ get_table_stats("orders", sample_size=5)
← Row count: 15,420 | Columns: 10 | Sample data...

→ analyze_column("orders", "total_amount")
← Min: $12.50 | Max: $2,450.00 | Avg: $187.30

→ analyze_column("orders", "status")
← completed: 9,252 (60%) | pending: 4,626 (30%) | cancelled: 1,542 (10%)

→ execute_query("SELECT DATE_TRUNC('month', created_at) as month,
                        SUM(total_amount) as revenue
                 FROM orders WHERE status = 'completed'
                 GROUP BY 1 ORDER BY 1 LIMIT 12")
← Monthly revenue trend data

Kimi: ## Orders Table Analysis

**Overview**: 15,420 orders with healthy volume

**Key Findings**:
- Average order value: $187.30 (healthy for retail)
- 10% cancellation rate - investigate why
- Revenue trending up 15% MoM

**Recommendations**:
1. Review cancellation process - rate is above industry avg (5-7%)
2. Promote products in $150-250 range (sweet spot)
3. Follow up on pending orders (30% of total)
```
### Custom Queries

```
User: Who are my top 10 customers by revenue?

→ execute_query("""
    SELECT u.email, SUM(o.total_amount) as total
    FROM users u
    JOIN orders o ON u.id = o.user_id
    WHERE o.status = 'completed'
    GROUP BY u.id, u.email
    ORDER BY total DESC
    LIMIT 10
""")

Kimi: Here are your top 10 customers:

| Email | Total Revenue |
|-------|---------------|
| john@example.com | $12,450 |
| sarah@example.com | $11,230 |
...
```
## Real-World Use Cases

### 1. Data Quality Audit

```
User: Check data quality in the users table

Kimi runs:
1. get_table_stats("users") - Overview
2. analyze_column("users", "email") - Check for nulls, duplicates
3. analyze_column("users", "created_at") - Date range validation
4. execute_query("SELECT COUNT(*) FROM users WHERE email NOT LIKE '%@%'")

Output: Data quality report with issues and recommendations
```
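The email checks above can also be illustrated client-side over sample rows (the server-side version runs the equivalent SQL directly). This is a hedged sketch; the function name and return shape are assumptions, not part of the repo.

```python
from collections import Counter

def email_quality(rows):
    """Summarize nulls, duplicates, and invalid emails in sample rows."""
    emails = [r.get("email") for r in rows]
    non_null = [e for e in emails if e]
    # Values appearing more than once among non-null emails
    dupes = [e for e, n in Counter(non_null).items() if n > 1]
    # Same heuristic as the NOT LIKE '%@%' query above
    invalid = [e for e in non_null if "@" not in e]
    return {
        "null_count": len(emails) - len(non_null),
        "duplicates": dupes,
        "invalid": invalid,
    }
```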
### 2. Business Metrics Dashboard

```
User: Give me a business overview

Kimi analyzes:
- User growth (signups by month)
- Revenue trends (completed orders)
- Product performance (top sellers)
- Churn indicators (inactive users)

Output: Executive summary with charts (as markdown tables)
```
### 3. Anomaly Detection

```
User: Find any unusual patterns in orders

Kimi checks:
- Orders with extreme amounts (outliers)
- Sudden spikes in cancellations
- Unusual time patterns (3am orders)
- Duplicate transactions

Output: Anomaly report with investigation queries
```
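One common way to flag "extreme amounts" is the 1.5 × IQR fence; a minimal sketch, assuming the order totals have already been fetched via `execute_query`:

```python
import statistics

def iqr_outliers(amounts):
    """Return values outside the 1.5 * IQR fences (a standard heuristic)."""
    q1, _, q3 = statistics.quantiles(amounts, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [a for a in amounts if a < lo or a > hi]
```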
## Configuration Reference

### Environment Variables

| Variable | Required | Description |
|---|---|---|
| `PG_CONNECTION_STRING` | Yes | PostgreSQL connection URI |
| `PG_POOL_SIZE` | No | Connection pool size (default: 5) |
| `PG_QUERY_TIMEOUT` | No | Query timeout in seconds (default: 30) |
### Connection String Format

```
postgresql://[user[:password]@][host][:port][/dbname][?param1=value1&...]
```

Examples:

```
postgresql://localhost/mydb
postgresql://user:secret@localhost:5432/mydb?sslmode=require
postgresql://user:pass@host.supabase.co:5432/postgres?sslmode=require
```
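A connection URI can be sanity-checked before handing it to psycopg2, using only the standard library (no database needed). The helper name is an assumption for illustration:

```python
from urllib.parse import urlparse

def check_conn_string(uri):
    """Parse a PostgreSQL URI and return its components (port defaults to 5432)."""
    parts = urlparse(uri)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"Unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,
        "dbname": parts.path.lstrip("/"),
    }
```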
## Security Considerations

### MCP Server Safety
- **Read-Only Enforcement**: Only SELECT queries allowed
- **Query Limits**: Max 1000 rows returned
- **No DDL**: CREATE/ALTER/DROP rejected
- **Connection Isolation**: Per-request connections
### Best Practices
- Use read-only database users
- Enable SSL for remote connections
- Monitor query logs
- Set appropriate query timeouts
## Extending

### Adding New Tools

Edit `pg_mcp_server/server.py`:

```python
Tool(
    name="get_slow_queries",
    description="Find slow running queries from pg_stat_statements",
    inputSchema={
        "type": "object",
        "properties": {
            "limit": {"type": "integer", "default": 10}
        }
    },
)
```
### Adding Analysis Patterns

Edit `pg_analyzer_skill/SKILL.md`:
### Cohort Analysis
```sql
SELECT
DATE_TRUNC('month', first_order) as cohort,
COUNT(*) as users
FROM (
SELECT user_id, MIN(created_at) as first_order
FROM orders GROUP BY user_id
) first_orders
GROUP BY 1
```
## Troubleshooting
### Connection Issues
```bash
# Test connection manually
psql "$PG_CONNECTION_STRING" -c "SELECT 1"
# Check server is running
pg_isready -h localhost -p 5432
```
### Permission Errors

Create a read-only user:
```sql
CREATE USER analyst WITH PASSWORD 'safe_password';
GRANT CONNECT ON DATABASE mydb TO analyst;
GRANT USAGE ON SCHEMA public TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;
```
### Performance

For large tables, add WHERE clauses:

```sql
-- Good: Limited time range
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days'

-- Bad: Full table scan
SELECT * FROM orders
```
## Comparison: MCP vs Direct Connection
| Aspect | MCP + Skills | Direct SQL |
|---|---|---|
| Safety | ✅ Read-only enforced | ⚠️ User responsibility |
| Guidance | ✅ AI knows analysis patterns | ❌ Manual SQL writing |
| Insights | ✅ Automatic synthesis | ❌ Raw data only |
| Reusability | ✅ Skill applies to any DB | ❌ Custom each time |
| Setup | ⚠️ Requires configuration | ✅ Direct access |
## Resources
## License

MIT - Use this as a foundation for your own database analysis tools!