# Troubleshooting Guide

## Common Issues and Solutions

### 1. TiDB Health Check Failing

#### Symptom

```
dependency failed to start: container tidb is unhealthy
```

The Docker health check fails even though you can connect to TiDB from your host machine.

#### Root Cause

The original health check tried to run the `mysql` command inside the TiDB container:

```yaml
healthcheck:
  test: ["CMD", "mysql", "-h", "127.0.0.1", "-P", "4000", "-u", "root", "-e", "SELECT 1"]
```

The TiDB Docker image doesn't include the MySQL client binary, so this check always failed.

#### Solution ✅

Use TiDB's built-in HTTP status endpoint instead:

```yaml
healthcheck:
  test: ["CMD", "wget", "-q", "-O-", "http://127.0.0.1:10080/status"]
  interval: 10s
  timeout: 5s
  retries: 3
  start_period: 10s
```

**Why this works:**

- TiDB exposes a status endpoint on port 10080
- `wget` is available in the container
- The endpoint returns HTTP 200 when TiDB is ready
- `start_period` gives TiDB time to initialize before health checks begin

### 2. Docker Compose Version Warning

#### Symptom

```
WARN[0000] version is obsolete, it will be ignored
```

#### Solution ✅

Remove the `version` field from `docker-compose.yml`. Modern Docker Compose (v2) doesn't need it.

**Before:**

```yaml
version: '3.8'
services:
  ...
```

**After:**

```yaml
services:
  ...
```

### 3. Service Dependencies Not Starting in Order

#### Symptom

Services fail because their dependencies aren't ready yet.

#### Solution ✅

Use proper health checks and dependency conditions:

```yaml
dm-worker:
  depends_on:
    tidb:
      condition: service_healthy
    dm-master:
      condition: service_healthy
```

**Important:**

- Each dependency must have a working health check
- `start_period` keeps checks that fail during startup from counting toward `retries`, which prevents false negatives

### 4. dm-init Fails to Start

#### Symptom

```
Error: dm-init exits immediately
```

#### Check:

```bash
docker logs dm-init
```

#### Common Causes:

**a) .env not configured:**

```bash
# Check if .env exists and has real values
cat .env
```

**Solution:**

```bash
# Copy template and edit
cp .env.example .env
vim .env
```

**b) Test database not reachable:**

```bash
# Check that TiDB is reachable from a container on the same network as dm-init
docker run --rm --network tidb-network pingcap/dm:latest \
  sh -c "wget -q -O- http://tidb:10080/status"
```

**c) Script syntax error:**

```bash
# Check init script
sh -n scripts/init-dm.sh
```

### 5. Containers Keep Restarting

#### Check Status:

```bash
docker ps -a
docker logs <container-name>
```

#### Common Issues:

**a) Port already in use:**

```
Error: bind: address already in use
```

**Solution:** Change ports in `docker-compose.yml`:

```yaml
ports:
  - "14000:4000"  # Changed from 4000:4000
```

**b) Out of memory:**

```
Error: OOM killed
```

**Solution:** Increase memory limits or free up system resources.

**c) Permission issues:**

```
Error: permission denied
```

**Solution:** Check volume permissions or run:

```bash
docker compose down -v  # Remove containers and volumes (deletes stored data)
docker compose up -d    # Recreate
```

### 6. Sync Task Not Running

#### Check Status:

```bash
./status.sh
# or
./sync-control.sh status
```

#### Common Issues:

**a) Task not created:**

```bash
# Check if source is configured
docker exec dm-master /dmctl --master-addr=dm-master:8261 operate-source show
```

**Solution:**

```bash
./sync-control.sh reinit
```

**b) Wrong credentials:**

Check logs:

```bash
docker logs dm-worker
```

Fix `.env` and reinit:

```bash
vim .env
./sync-control.sh reinit
```

**c) Table doesn't exist:**

Verify that the tables exist on the source database:

```sql
-- Connect to test DB and check
SHOW TABLES FROM your_database;
```

### 7. Connection Refused to TiDB

#### Symptom

```
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (61)
```

#### Checks:

**a) Is TiDB running?**

```bash
docker ps | grep tidb
```

**b) Is it healthy?**

```bash
docker ps
# Look for "(healthy)" status
```

**c) Is port exposed?**

```bash
docker port tidb
# Should show: 4000/tcp -> 0.0.0.0:4000
```

**d) Test from inside container:**

```bash
docker exec tidb wget -q -O- http://127.0.0.1:10080/status
```

#### Solutions:

**If container not running:**

```bash
docker compose up -d tidb
docker logs tidb
```

**If unhealthy:**

```bash
# Wait for health check
sleep 15
docker ps
# If still unhealthy, check logs
docker logs tidb
```

**If port not exposed:**

```bash
# Recreate container
docker compose down
docker compose up -d
```

### 8. Data Not Syncing

#### Verify sync is running:

```bash
./sync-control.sh status
```

#### Check sync lag:

Look for the "syncer" section in the status output.

#### Common Issues:

**a) Sync paused:**

```bash
./sync-control.sh resume
```

**b) Sync stopped with error:**

```bash
# Check error in status output
./sync-control.sh status
# Fix the issue, then restart
./sync-control.sh restart
```

**c) Network issues:**

```bash
# Test connectivity from dm-worker to source
docker exec dm-worker ping -c 3 your-test-db-host
```

**d) Binlog not enabled on source:**

The source database must have binlog enabled for incremental sync.

### 9. Slow Sync Performance

#### Check resource usage:

```bash
docker stats
```

#### Solutions:

**a) Increase worker resources:**

Edit `docker-compose.yml`:

```yaml
dm-worker:
  deploy:
    resources:
      limits:
        cpus: '2'
        memory: 2G
```

**b) Optimize batch size:**

See [TiDB DM Documentation](https://docs.pingcap.com/tidb-data-migration/stable/tune-configuration) for advanced tuning.

### 10. Docker Compose v1 vs v2 Issues

#### Symptom

```
docker: 'compose' is not a docker command
```

#### Solution

See [DOCKER_COMPOSE_V2.md](DOCKER_COMPOSE_V2.md) for:

- Upgrading to v2
- Creating an alias
- Compatibility mode

## Diagnostic Commands

### Check everything at once:

```bash
# Service status
docker compose ps

# Health checks
docker ps

# Logs (all services)
docker compose logs --tail=50

# Logs (specific service)
docker compose logs --tail=50 tidb

# Resource usage
docker stats --no-stream

# Network connectivity
docker network inspect tidb-network
```

### Test connectivity:

```bash
# From host to TiDB
mysql -h 127.0.0.1 -P 4000 -u root -e "SELECT 1"

# From host (HTTP)
curl http://127.0.0.1:10080/status

# From container to TiDB
docker run --rm --network tidb-network pingcap/dm:latest \
  wget -q -O- http://tidb:10080/status
```

### Reset everything:

```bash
# Stop and remove everything (including data)
docker compose down -v

# Start fresh
./start.sh
```

## Getting Help

### Collect diagnostic information:

```bash
# Create a diagnostic report
echo "=== Docker Version ===" > diagnostic.txt
docker --version >> diagnostic.txt
docker compose version >> diagnostic.txt

echo -e "\n=== Container Status ===" >> diagnostic.txt
docker ps -a >> diagnostic.txt

echo -e "\n=== TiDB Logs ===" >> diagnostic.txt
docker logs tidb --tail=50 >> diagnostic.txt 2>&1

echo -e "\n=== DM Worker Logs ===" >> diagnostic.txt
docker logs dm-worker --tail=50 >> diagnostic.txt 2>&1

echo -e "\n=== DM Init Logs ===" >> diagnostic.txt
docker logs dm-init >> diagnostic.txt 2>&1

echo -e "\n=== Network Info ===" >> diagnostic.txt
docker network inspect tidb-network >> diagnostic.txt

echo "Report saved to diagnostic.txt"
```

### Useful resources:

- [TiDB Documentation](https://docs.pingcap.com/tidb/stable)
- [TiDB DM Documentation](https://docs.pingcap.com/tidb-data-migration/stable)
- Project documentation:
  - [README.md](README.md)
  - [SYNC_GUIDE.md](SYNC_GUIDE.md)
  - [DATAGRIP_SETUP.md](DATAGRIP_SETUP.md)
  - [DOCKER_COMPOSE_V2.md](DOCKER_COMPOSE_V2.md)

## Still Having Issues?

If none of these solutions work:

1. Check logs: `docker compose logs`
2. Create a diagnostic report (see above)
3. Check whether it's a known issue in the TiDB/DM GitHub issues
4. Verify that your environment meets the prerequisites (see [README.md](README.md))
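
As a rough aid for step 1, the sketch below scans log text for the error strings quoted in this guide and prints the section that covers each one. The function name `triage_logs` is hypothetical and not part of this project, and the patterns are only the symptom strings shown in the sections above; real logs may word the same failures differently, so treat a miss as "no hint", not "no problem".

```bash
#!/usr/bin/env bash
# triage_logs (hypothetical helper): read log text on stdin and print the
# section of this guide that matches each known error string.
# Returns 0 if at least one pattern matched, 1 otherwise.
triage_logs() {
  local logs matched pattern section
  logs="$(cat)"   # capture all of stdin so it can be searched once per pattern
  matched=1

  # Each line of the here-document is "pattern|section".
  while IFS='|' read -r pattern section; do
    # -F: fixed string, -i: case-insensitive; output is discarded
    if printf '%s\n' "$logs" | grep -Fi -- "$pattern" >/dev/null; then
      echo "See section $section"
      matched=0
    fi
  done <<'EOF'
is unhealthy|1. TiDB Health Check Failing
address already in use|5a. Port already in use
OOM killed|5b. Out of memory
permission denied|5c. Permission issues
Can't connect to MySQL server|7. Connection Refused to TiDB
EOF

  return "$matched"
}
```

One possible way to use it after sourcing the function is `docker compose logs --tail=200 2>&1 | triage_logs`. The pattern table is kept as plain text so new symptom strings can be added without touching the loop.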