# PostgreSQL Analyzer - Practical MCP + Skills Demo

A production-ready database analysis tool using MCP (Model Context Protocol) and Skills. Connect to any PostgreSQL database, explore schemas, query data, and generate insights.
## What This Does

```
┌────────────────────────────────────────────────────────────────────┐
│                          User Request                              │
│        "Analyze my orders table and find revenue trends"           │
└────────────────────────────────────┬───────────────────────────────┘
                                     │
    ┌────────────────────────────────▼─────────────────────────────┐
    │                  Skill: postgres-analyzer                    │
    │  ┌─────────────────────────────────────────────────────────┐ │
    │  │ Triggers on: database, analysis, query, insights        │ │
    │  │                                                         │ │
    │  │ Workflow:                                               │ │
    │  │ 1. get_schema()       → Understand table structure      │ │
    │  │ 2. get_table_stats()  → Get row counts, samples         │ │
    │  │ 3. execute_query()    → Run revenue analysis SQL        │ │
    │  │ 4. analyze_column()   → Check date ranges, distributions│ │
    │  │ 5. Synthesize         → Generate insights report        │ │
    │  └─────────────────────────────────────────────────────────┘ │
    └────────────────────────────────┬─────────────────────────────┘
                                     │ MCP Protocol
    ┌────────────────────────────────▼─────────────────────────────┐
    │                MCP Server: postgres-analyzer                 │
    │  ┌─────────────────────────────────────────────────────────┐ │
    │  │ Tools:                                                  │ │
    │  │ • get_schema()      - List tables/columns               │ │
    │  │ • execute_query()   - Run SELECT queries                │ │
    │  │ • get_table_stats() - Stats + sample data               │ │
    │  │ • analyze_column()  - Deep column analysis              │ │
    │  │                                                         │ │
    │  │ Safety: Read-only, query limits, injection protection   │ │
    │  └─────────────────────────────────────────────────────────┘ │
    └────────────────────────────────┬─────────────────────────────┘
                                     │ psycopg2
    ┌────────────────────────────────▼─────────────────────────────┐
    │                    PostgreSQL Database                       │
    │            (Any accessible PostgreSQL instance)              │
    └──────────────────────────────────────────────────────────────┘
```
## Quick Start

### 1. Install Dependencies

```bash
cd pg_analyzer_demo
pip install -r pg_mcp_server/requirements.txt
```
### 2. Set Database Connection

```bash
export PG_CONNECTION_STRING="postgresql://user:password@host:port/database"

# Examples:
# Local database:
export PG_CONNECTION_STRING="postgresql://postgres:secret@localhost:5432/myapp"

# Supabase:
export PG_CONNECTION_STRING="postgresql://postgres.xxxx:password@aws-0-region.pooler.supabase.com:5432/postgres"

# Railway/Render:
export PG_CONNECTION_STRING="postgresql://user:pass@host.render.com:5432/dbname"
```
### 3. Test with Demo Client

```bash
python demo.py
```
This interactively guides you through:
- Schema discovery
- Table analysis
- Custom queries
- Column deep-dives
## Components

### MCP Server (`pg_mcp_server/server.py`)

Exposes 4 tools for safe database access:
| Tool | Parameters | Returns |
|---|---|---|
| `get_schema` | `table_name` (optional) | All tables or specific table schema |
| `execute_query` | `query`, `limit` | Query results as markdown table |
| `get_table_stats` | `table_name`, `sample_size` | Row count, column stats, sample rows |
| `analyze_column` | `table_name`, `column_name` | Distribution, nulls, top values |
**Safety Features:**
- **Read-only**: Rejects INSERT/UPDATE/DELETE/DROP/CREATE
- **Query limits**: Auto-limits to 100 rows (max 1000)
- **Connection pooling**: Proper cleanup
- **SQL injection protection**: Uses parameterized queries
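The read-only and limit checks above can be sketched as a small pre-execution guard. This is an illustrative sketch, not the actual code in `pg_mcp_server/server.py`; the function name and regex are assumptions.

```python
import re

# Statements the server rejects outright (hypothetical guard, mirroring the
# safety features listed above).
FORBIDDEN = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|DROP|CREATE|ALTER|TRUNCATE|GRANT)\b",
    re.IGNORECASE,
)

DEFAULT_LIMIT = 100  # auto-limit
MAX_LIMIT = 1000     # hard cap

def validate_query(query, limit=None):
    """Reject writes/DDL and clamp the row limit before execution."""
    if FORBIDDEN.match(query):
        raise ValueError("Only read-only SELECT queries are allowed")
    if not re.match(r"^\s*(SELECT|WITH)\b", query, re.IGNORECASE):
        raise ValueError("Query must start with SELECT or WITH")
    limit = DEFAULT_LIMIT if limit is None else min(int(limit), MAX_LIMIT)
    return query, limit
```

A real server would additionally rely on a read-only database role (see Security Considerations below) rather than regex checks alone.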
### Skill (`pg_analyzer_skill/SKILL.md`)

Teaches the AI:
- **When to use**: Database questions, analysis needs
- **Workflow**: Discovery → Deep Dive → Insights
- **SQL Patterns**: Common analysis queries
- **Safety Rules**: Read-only, performance, PII warnings
- **Output Format**: Structured insights with recommendations
### Helper Script (`pg_analyzer_skill/scripts/generate_report.py`)

Generates formatted markdown reports from analysis results.
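For a sense of what such a report builder might look like, here is a minimal sketch; the real `generate_report.py` may expect a different input shape, and the function signature here is an assumption.

```python
def generate_report(table, row_count, findings):
    """Build a small markdown report (illustrative sketch only)."""
    lines = [
        f"## {table} Analysis",
        "",
        f"**Rows**: {row_count:,}",
        "",
        "**Key Findings**:",
    ]
    lines += [f"- {f}" for f in findings]
    return "\n".join(lines)
```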
## Using with Kimi Code CLI

### Setup

#### 1. Configure MCP Server

Add to `~/.kimi/mcp.json`:
```json
{
  "mcpServers": {
    "postgres": {
      "command": "python3",
      "args": ["/absolute/path/to/pg_mcp_server/server.py"],
      "env": {
        "PG_CONNECTION_STRING": "postgresql://user:pass@host:5432/db"
      }
    }
  }
}
```
#### 2. Install Skill

```bash
mkdir -p ~/.config/agents/skills/
cp -r pg_analyzer_skill ~/.config/agents/skills/
```

Or use a local skill path:

```bash
kimi --skills-dir ./pg_analyzer_skill
```
#### 3. Start Kimi

```bash
kimi
```
## Example Sessions

### Schema Exploration

```
User: What tables are in my database?

→ get_schema()
← Shows all tables with column counts

Kimi: Your database has 4 tables:
• users (8 columns) - User accounts
• orders (10 columns) - Purchase orders
• products (6 columns) - Product catalog
• order_items (5 columns) - Line items
```
### Data Analysis

```
User: Analyze the orders table

→ get_table_stats("orders", sample_size=5)
← Row count: 15,420 | Columns: 10 | Sample data...

→ analyze_column("orders", "total_amount")
← Min: $12.50 | Max: $2,450.00 | Avg: $187.30

→ analyze_column("orders", "status")
← completed: 9,252 (60%) | pending: 4,626 (30%) | cancelled: 1,542 (10%)

→ execute_query("SELECT DATE_TRUNC('month', created_at) as month,
                        SUM(total_amount) as revenue
                 FROM orders WHERE status = 'completed'
                 GROUP BY 1 ORDER BY 1 LIMIT 12")
← Monthly revenue trend data

Kimi: ## Orders Table Analysis

**Overview**: 15,420 orders with healthy volume

**Key Findings**:
- Average order value: $187.30 (healthy for retail)
- 10% cancellation rate - investigate why
- Revenue trending up 15% MoM

**Recommendations**:
1. Review cancellation process - rate is above industry avg (5-7%)
2. Promote products in $150-250 range (sweet spot)
3. Follow up on pending orders (30% of total)
```
### Custom Queries

```
User: Who are my top 10 customers by revenue?

→ execute_query("""
    SELECT u.email, SUM(o.total_amount) as total
    FROM users u
    JOIN orders o ON u.id = o.user_id
    WHERE o.status = 'completed'
    GROUP BY u.id, u.email
    ORDER BY total DESC
    LIMIT 10
""")

Kimi: Here are your top 10 customers:

| Email | Total Revenue |
|-------|---------------|
| john@example.com | $12,450 |
| sarah@example.com | $11,230 |
...
```
## Real-World Use Cases

### 1. Data Quality Audit

```
User: Check data quality in the users table

Kimi runs:
1. get_table_stats("users") - Overview
2. analyze_column("users", "email") - Check for nulls, duplicates
3. analyze_column("users", "created_at") - Date range validation
4. execute_query("SELECT COUNT(*) FROM users WHERE email NOT LIKE '%@%'")

Output: Data quality report with issues and recommendations
```
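The email checks above can also be illustrated client-side over sample rows (the server-side version runs the equivalent SQL directly). This is a hedged sketch; the function name and return shape are assumptions, not part of the repo.

```python
from collections import Counter

def email_quality(rows):
    """Summarize nulls, duplicates, and invalid emails in sample rows."""
    emails = [r.get("email") for r in rows]
    non_null = [e for e in emails if e]
    # Values appearing more than once among non-null emails
    dupes = [e for e, n in Counter(non_null).items() if n > 1]
    # Same heuristic as the NOT LIKE '%@%' query above
    invalid = [e for e in non_null if "@" not in e]
    return {
        "null_count": len(emails) - len(non_null),
        "duplicates": dupes,
        "invalid": invalid,
    }
```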
### 2. Business Metrics Dashboard

```
User: Give me a business overview

Kimi analyzes:
- User growth (signups by month)
- Revenue trends (completed orders)
- Product performance (top sellers)
- Churn indicators (inactive users)

Output: Executive summary with charts (as markdown tables)
```
### 3. Anomaly Detection

```
User: Find any unusual patterns in orders

Kimi checks:
- Orders with extreme amounts (outliers)
- Sudden spikes in cancellations
- Unusual time patterns (3am orders)
- Duplicate transactions

Output: Anomaly report with investigation queries
```
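One common way to flag "extreme amounts" is the 1.5 × IQR fence; a minimal sketch, assuming the order totals have already been fetched via `execute_query`:

```python
import statistics

def iqr_outliers(amounts):
    """Return values outside the 1.5 * IQR fences (a standard heuristic)."""
    q1, _, q3 = statistics.quantiles(amounts, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [a for a in amounts if a < lo or a > hi]
```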
## Configuration Reference

### Environment Variables

| Variable | Required | Description |
|---|---|---|
| `PG_CONNECTION_STRING` | Yes | PostgreSQL connection URI |
| `PG_POOL_SIZE` | No | Connection pool size (default: 5) |
| `PG_QUERY_TIMEOUT` | No | Query timeout in seconds (default: 30) |
### Connection String Format

```
postgresql://[user[:password]@][host][:port][/dbname][?param1=value1&...]
```

Examples:

```
postgresql://localhost/mydb
postgresql://user:secret@localhost:5432/mydb?sslmode=require
postgresql://user:pass@host.supabase.co:5432/postgres?sslmode=require
```
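A connection URI can be sanity-checked before handing it to psycopg2, using only the standard library (no database needed). The helper name is an assumption for illustration:

```python
from urllib.parse import urlparse

def check_conn_string(uri):
    """Parse a PostgreSQL URI and return its components (port defaults to 5432)."""
    parts = urlparse(uri)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"Unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,
        "dbname": parts.path.lstrip("/"),
    }
```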
## Security Considerations

### MCP Server Safety
- **Read-Only Enforcement**: Only SELECT queries allowed
- **Query Limits**: Max 1000 rows returned
- **No DDL**: CREATE/ALTER/DROP rejected
- **Connection Isolation**: Per-request connections
### Best Practices
- Use read-only database users
- Enable SSL for remote connections
- Monitor query logs
- Set appropriate query timeouts
## Extending

### Adding New Tools

Edit `pg_mcp_server/server.py`:

```python
Tool(
    name="get_slow_queries",
    description="Find slow running queries from pg_stat_statements",
    inputSchema={
        "type": "object",
        "properties": {
            "limit": {"type": "integer", "default": 10}
        }
    },
)
```
### Adding Analysis Patterns

Edit `pg_analyzer_skill/SKILL.md`:
### Cohort Analysis
```sql
SELECT
DATE_TRUNC('month', first_order) as cohort,
COUNT(*) as users
FROM (
SELECT user_id, MIN(created_at) as first_order
FROM orders GROUP BY user_id
) first_orders
GROUP BY 1
```
## Troubleshooting
### Connection Issues
```bash
# Test connection manually
psql "$PG_CONNECTION_STRING" -c "SELECT 1"
# Check server is running
pg_isready -h localhost -p 5432
```
### Permission Errors

Create a read-only user:
```sql
CREATE USER analyst WITH PASSWORD 'safe_password';
GRANT CONNECT ON DATABASE mydb TO analyst;
GRANT USAGE ON SCHEMA public TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;
```
### Performance

For large tables, add WHERE clauses:

```sql
-- Good: Limited time range
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '30 days'

-- Bad: Full table scan
SELECT * FROM orders
```
## Comparison: MCP vs Direct Connection
| Aspect | MCP + Skills | Direct SQL |
|---|---|---|
| Safety | ✅ Read-only enforced | ⚠️ User responsibility |
| Guidance | ✅ AI knows analysis patterns | ❌ Manual SQL writing |
| Insights | ✅ Automatic synthesis | ❌ Raw data only |
| Reusability | ✅ Skill applies to any DB | ❌ Custom each time |
| Setup | ⚠️ Requires configuration | ✅ Direct access |
## Resources
## License

MIT - Use this as a foundation for your own database analysis tools!