# PostgreSQL Analyzer - Practical MCP + Skills Demo

A production-ready database analysis tool using **MCP (Model Context Protocol)** and **Skills**. Connect to any PostgreSQL database, explore schemas, query data, and generate insights.

## What This Does

```
┌────────────────────────────────────────────────────────────────────┐
│                            User Request                            │
│         "Analyze my orders table and find revenue trends"          │
└──────────────────────────────────┬─────────────────────────────────┘
                                   │
   ┌───────────────────────────────▼──────────────────────────────┐
   │                   Skill: postgres-analyzer                   │
   │ ┌──────────────────────────────────────────────────────────┐ │
   │ │ Triggers on: database, analysis, query, insights         │ │
   │ │                                                          │ │
   │ │ Workflow:                                                │ │
   │ │ 1. get_schema()        → Understand table structure      │ │
   │ │ 2. get_table_stats()   → Get row counts, samples         │ │
   │ │ 3. execute_query()     → Run revenue analysis SQL        │ │
   │ │ 4. analyze_column()    → Check date ranges, distributions│ │
   │ │ 5. Synthesize          → Generate insights report        │ │
   │ └──────────────────────────────────────────────────────────┘ │
   └───────────────────────────────┬──────────────────────────────┘
                                   │  MCP Protocol
   ┌───────────────────────────────▼──────────────────────────────┐
   │                MCP Server: postgres-analyzer                 │
   │ ┌──────────────────────────────────────────────────────────┐ │
   │ │ Tools:                                                   │ │
   │ │ • get_schema()        - List tables/columns              │ │
   │ │ • execute_query()     - Run SELECT queries               │ │
   │ │ • get_table_stats()   - Stats + sample data              │ │
   │ │ • analyze_column()    - Deep column analysis             │ │
   │ │                                                          │ │
   │ │ Safety: Read-only, query limits, injection protection    │ │
   │ └──────────────────────────────────────────────────────────┘ │
   └───────────────────────────────┬──────────────────────────────┘
                                   │  psycopg2
   ┌───────────────────────────────▼──────────────────────────────┐
   │                     PostgreSQL Database                      │
   │            (Any accessible PostgreSQL instance)              │
   └──────────────────────────────────────────────────────────────┘
```

## Quick Start

### 1. Install Dependencies

```bash
cd pg_analyzer_demo
pip install -r pg_mcp_server/requirements.txt
```

### 2. Set Database Connection

```bash
export PG_CONNECTION_STRING="postgresql://user:password@host:port/database"

# Examples:

# Local database:
export PG_CONNECTION_STRING="postgresql://postgres:secret@localhost:5432/myapp"

# Supabase:
export PG_CONNECTION_STRING="postgresql://postgres.xxxx:password@aws-0-region.pooler.supabase.com:5432/postgres"

# Railway/Render:
export PG_CONNECTION_STRING="postgresql://user:pass@host.render.com:5432/dbname"
```

### 3. Test with Demo Client

```bash
python demo.py
```

This interactively guides you through:

- Schema discovery
- Table analysis
- Custom queries
- Column deep-dives

## Components

### MCP Server (`pg_mcp_server/server.py`)

Exposes 4 tools for safe database access:

| Tool | Parameters | Returns |
|------|------------|---------|
| `get_schema` | `table_name` (optional) | All tables or specific table schema |
| `execute_query` | `query`, `limit` | Query results as markdown table |
| `get_table_stats` | `table_name`, `sample_size` | Row count, column stats, sample rows |
| `analyze_column` | `table_name`, `column_name` | Distribution, nulls, top values |

**Safety Features:**

- Read-only: Rejects INSERT/UPDATE/DELETE/DROP/CREATE
- Query limits: Auto-limits to 100 rows (max 1000)
- Connection pooling: Proper cleanup
- SQL injection protection: Uses parameterized queries

### Skill (`pg_analyzer_skill/SKILL.md`)

Teaches the AI:

1. **When to use**: Database questions, analysis needs
2. **Workflow**: Discovery → Deep Dive → Insights
3. **SQL Patterns**: Common analysis queries
4. **Safety Rules**: Read-only, performance, PII warnings
5. **Output Format**: Structured insights with recommendations

### Helper Script (`pg_analyzer_skill/scripts/generate_report.py`)

Generates formatted markdown reports from analysis results.

## Using with Kimi Code CLI

### Setup

**1. Configure MCP Server**

Add to `~/.kimi/mcp.json`:

```json
{
  "mcpServers": {
    "postgres": {
      "command": "python3",
      "args": ["/absolute/path/to/pg_mcp_server/server.py"],
      "env": {
        "PG_CONNECTION_STRING": "postgresql://user:pass@host:5432/db"
      }
    }
  }
}
```

**2. Install Skill**

```bash
mkdir -p ~/.config/agents/skills/
cp -r pg_analyzer_skill ~/.config/agents/skills/
```

Or use a local skill path:

```bash
kimi --skills-dir ./pg_analyzer_skill
```

**3. Start Kimi**

```bash
kimi
```

### Example Sessions

#### Schema Exploration

```
User: What tables are in my database?

→ get_schema()
← Shows all tables with column counts

Kimi: Your database has 4 tables:
• users (8 columns) - User accounts
• orders (10 columns) - Purchase orders
• products (6 columns) - Product catalog
• order_items (5 columns) - Line items
```

#### Data Analysis

```
User: Analyze the orders table

→ get_table_stats("orders", sample_size=5)
← Row count: 15,420 | Columns: 10 | Sample data...

→ analyze_column("orders", "total_amount")
← Min: $12.50 | Max: $2,450.00 | Avg: $187.30

→ analyze_column("orders", "status")
← completed: 9,252 (60%) | pending: 4,626 (30%) | cancelled: 1,542 (10%)

→ execute_query("SELECT DATE_TRUNC('month', created_at) as month, SUM(total_amount) as revenue FROM orders WHERE status = 'completed' GROUP BY 1 ORDER BY 1 LIMIT 12")
← Monthly revenue trend data

Kimi: ## Orders Table Analysis

**Overview**: 15,420 orders with healthy volume

**Key Findings**:
- Average order value: $187.30 (healthy for retail)
- 10% cancellation rate - investigate why
- Revenue trending up 15% MoM

**Recommendations**:
1. Review cancellation process - rate is above industry avg (5-7%)
2. Promote products in $150-250 range (sweet spot)
3. Follow up on pending orders (30% of total)
```

#### Custom Queries

```
User: Who are my top 10 customers by revenue?

→ execute_query("""
    SELECT u.email, SUM(o.total_amount) as total
    FROM users u
    JOIN orders o ON u.id = o.user_id
    WHERE o.status = 'completed'
    GROUP BY u.id, u.email
    ORDER BY total DESC
    LIMIT 10
""")

Kimi: Here are your top 10 customers:

| Email | Total Revenue |
|-------|---------------|
| john@example.com | $12,450 |
| sarah@example.com | $11,230 |
...
```

## Real-World Use Cases

### 1. Data Quality Audit

```
User: Check data quality in the users table

Kimi runs:
1. get_table_stats("users") - Overview
2. analyze_column("users", "email") - Check for nulls, duplicates
3. analyze_column("users", "created_at") - Date range validation
4. execute_query("SELECT COUNT(*) FROM users WHERE email NOT LIKE '%@%'")

Output: Data quality report with issues and recommendations
```

### 2. Business Metrics Dashboard

```
User: Give me a business overview

Kimi analyzes:
- User growth (signups by month)
- Revenue trends (completed orders)
- Product performance (top sellers)
- Churn indicators (inactive users)

Output: Executive summary with charts (as markdown tables)
```

### 3. Anomaly Detection

```
User: Find any unusual patterns in orders

Kimi checks:
- Orders with extreme amounts (outliers)
- Sudden spikes in cancellations
- Unusual time patterns (3am orders)
- Duplicate transactions

Output: Anomaly report with investigation queries
```

## Configuration Reference

### Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `PG_CONNECTION_STRING` | Yes | PostgreSQL connection URI |
| `PG_POOL_SIZE` | No | Connection pool size (default: 5) |
| `PG_QUERY_TIMEOUT` | No | Query timeout in seconds (default: 30) |

### Connection String Format

```
postgresql://[user[:password]@][host][:port][/dbname][?param1=value1&...]

Examples:
postgresql://localhost/mydb
postgresql://user:secret@localhost:5432/mydb?sslmode=require
postgresql://user:pass@host.supabase.co:5432/postgres?sslmode=require
```

## Security Considerations

### MCP Server Safety

1. **Read-Only Enforcement**: Only SELECT queries allowed
2. **Query Limits**: Max 1000 rows returned
3. **No DDL**: CREATE/ALTER/DROP rejected
4. **Connection Isolation**: Per-request connections

### Best Practices

- Use read-only database users
- Enable SSL for remote connections
- Monitor query logs
- Set appropriate query timeouts

## Extending

### Adding New Tools

Edit `pg_mcp_server/server.py`:

```python
Tool(
    name="get_slow_queries",
    description="Find slow running queries from pg_stat_statements",
    inputSchema={
        "type": "object",
        "properties": {
            "limit": {"type": "integer", "default": 10}
        }
    },
)
```

### Adding Analysis Patterns

Edit `pg_analyzer_skill/SKILL.md`:

````markdown
### Cohort Analysis

```sql
SELECT
    DATE_TRUNC('month', first_order) as cohort,
    COUNT(*) as users
FROM (
    SELECT user_id, MIN(created_at) as first_order
    FROM orders
    GROUP BY user_id
) first_orders
GROUP BY 1
```
````

## Troubleshooting

### Connection Issues

```bash
# Test connection manually
psql "$PG_CONNECTION_STRING" -c "SELECT 1"

# Check server is running
pg_isready -h localhost -p 5432
```

### Permission Errors

Create a read-only user:

```sql
CREATE USER analyst WITH PASSWORD 'safe_password';
GRANT CONNECT ON DATABASE mydb TO analyst;
GRANT USAGE ON SCHEMA public TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;
```

### Performance

For large tables, add WHERE clauses:

```sql
-- Good: Limited time range
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days'

-- Bad: Full table scan
SELECT * FROM orders
```

## Comparison: MCP vs Direct Connection

| Aspect | MCP + Skills | Direct SQL |
|--------|--------------|------------|
| **Safety** | ✅ Read-only enforced | ⚠️ User responsibility |
| **Guidance** | ✅ AI knows analysis patterns | ❌ Manual SQL writing |
| **Insights** | ✅ Automatic synthesis | ❌ Raw data only |
| **Reusability** | ✅ Skill applies to any DB | ❌ Custom each time |
| **Setup** | ⚠️ Requires configuration | ✅ Direct access |

## Resources

- [MCP Documentation](https://modelcontextprotocol.io/)
- [PostgreSQL Docs](https://www.postgresql.org/docs/)
- [psycopg2 Guide](https://www.psycopg.org/docs/)

## License

MIT - Use this as a foundation for your own database analysis tools!
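
## Appendix: Implementation Sketches

The read-only enforcement and row limits described under Security Considerations amount to a small validation step before any SQL reaches the database. Here is a minimal sketch of that idea; the function and constant names (`validate_query`, `FORBIDDEN`, `MAX_LIMIT`) are hypothetical, not the demo server's actual code:

```python
import re

# Write/DDL statements a read-only server should refuse outright.
FORBIDDEN = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|DROP|CREATE|ALTER|TRUNCATE|GRANT)\b",
    re.IGNORECASE,
)
# Only SELECT (or WITH ... SELECT) statements are allowed through.
SELECT_ONLY = re.compile(r"^\s*(SELECT|WITH)\b", re.IGNORECASE)

DEFAULT_LIMIT = 100   # rows returned when the caller gives no limit
MAX_LIMIT = 1000      # hard ceiling on returned rows

def validate_query(query, limit=None):
    """Reject non-SELECT statements and cap the number of rows returned."""
    if FORBIDDEN.match(query) or not SELECT_ONLY.match(query):
        raise ValueError("Only SELECT queries are allowed")
    effective = min(limit or DEFAULT_LIMIT, MAX_LIMIT)
    # Wrap the query in a subselect so any LIMIT it already contains
    # is still capped by the server-side ceiling.
    return f"SELECT * FROM ({query.rstrip().rstrip(';')}) AS q LIMIT {effective}"
```

Keyword filtering alone is a heuristic, which is why the Best Practices above also recommend a read-only database user as a second layer of defense.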
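`execute_query` returns results as a markdown table, and the report helper formats analysis output the same way. The rendering itself is a few lines; this is an illustrative sketch (a hypothetical helper, not the demo's actual `generate_report.py`):

```python
def rows_to_markdown(columns, rows):
    """Render column names and row tuples as a GitHub-style markdown table."""
    header = "| " + " | ".join(columns) + " |"
    divider = "|" + "|".join("---" for _ in columns) + "|"
    body = ["| " + " | ".join(str(v) for v in row) + " |" for row in rows]
    return "\n".join([header, divider] + body)
```

For example, `rows_to_markdown(["email", "total"], cur.fetchall())` turns a psycopg2 result set directly into the kind of table shown in the Custom Queries session above.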
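The null share and top values that `analyze_column` reports can also be computed client-side from a sample of rows. The sketch below is illustrative only (the actual tool may well do this aggregation in SQL instead):

```python
from collections import Counter

def summarize_values(values):
    """Summarize a sampled column: null percentage and most common values."""
    non_null = [v for v in values if v is not None]
    null_pct = 100.0 * (len(values) - len(non_null)) / len(values)
    top = Counter(non_null).most_common(3)  # (value, count) pairs
    return {"null_pct": null_pct, "top_values": top}
```

Running this over a sample of the `status` column would yield the completed/pending/cancelled breakdown shown in the Data Analysis session.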