# Troubleshooting Guide

## Common Issues and Solutions

### 1. TiDB Health Check Failing

#### Symptom
```
dependency failed to start: container tidb is unhealthy
```

The Docker health check fails even though you can connect to TiDB from the host machine.

#### Root Cause
The original health check ran the `mysql` client inside the TiDB container:
```yaml
healthcheck:
  test: ["CMD", "mysql", "-h", "127.0.0.1", "-P", "4000", "-u", "root", "-e", "SELECT 1"]
```

The TiDB Docker image doesn't include the MySQL client binary, so this check always failed.

#### Solution ✅
Use TiDB's built-in HTTP status endpoint instead:
```yaml
healthcheck:
  test: ["CMD", "wget", "-q", "-O-", "http://127.0.0.1:10080/status"]
  interval: 10s
  timeout: 5s
  retries: 3
  start_period: 10s
```

**Why this works:**
- TiDB exposes an HTTP status endpoint on port 10080
- `wget` is available in the container
- The endpoint returns HTTP 200 when TiDB is ready, so the probe exits with status 0
- `start_period` gives TiDB time to initialize: probe failures during this window do not count toward `retries`

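If the container still reports `unhealthy`, it can help to look at what the probe itself returned. The sketch below wraps `docker inspect` in a small helper. The function name `health_report` is illustrative and not part of the project scripts; the container name `tidb` is taken from the error message above.

```bash
# Print the health state Docker has recorded for a container as JSON:
# current status, failing streak, and the output of recent probes.
health_report() {
  local container="${1:-tidb}"  # defaults to the container named in this guide
  docker inspect --format '{{json .State.Health}}' "$container"
}
```

Running `health_report tidb` prints a JSON object whose `Log` array holds the exit code and output of the most recent probes, which usually shows whether the probe command was missing or simply could not connect.
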
### 2. Docker Compose Version Warning

#### Symptom
```
WARN[0000] version is obsolete, it will be ignored
```

#### Solution ✅
Remove the `version` field from `docker-compose.yml`. Modern Docker Compose (v2) doesn't need it.

**Before:**
```yaml
version: '3.8'
services:
  ...
```

**After:**
```yaml
services:
  ...
```

### 3. Service Dependencies Not Starting in Order

#### Symptom
Services fail because dependencies aren't ready yet.

#### Solution ✅
Use proper health checks and dependency conditions:

```yaml
dm-worker:
  depends_on:
    tidb:
      condition: service_healthy
    dm-master:
      condition: service_healthy
```

**Important:**
- Each dependency must have a working health check
- `start_period` prevents false negatives during startup

### 4. dm-init Fails to Start

#### Symptom
```
Error: dm-init exits immediately
```

#### Check:
```bash
docker logs dm-init
```

#### Common Causes:

**a) .env not configured:**
```bash
# Check if .env exists and has real values
cat .env
```

**Solution:**
```bash
# Copy template and edit
cp .env.example .env
vim .env
```

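Reading `.env` by eye is easy to get wrong when the file is long. As a rough sketch, the helper below lists every key whose value is empty. The function name `check_env_file` is hypothetical, and it only detects empty values: it cannot tell whether a non-empty value is still a placeholder copied from `.env.example`.

```bash
# List keys in an env file that have an empty value ("" counts as empty).
# Returns 0 if every key has a value, 1 if any is empty, 2 if the file is missing.
check_env_file() {
  local file="${1:-.env}"
  if [ ! -f "$file" ]; then
    echo "missing: $file" >&2
    return 2
  fi
  awk -F= '
    /^[[:space:]]*(#|$)/ { next }                       # skip comments and blank lines
    { value = substr($0, index($0, "=") + 1) }          # everything after the first "="
    value == "" || value == "\"\"" { print $1; bad = 1 }
    END { exit bad + 0 }
  ' "$file"
}
```

For example, `check_env_file .env || echo "fill in the keys listed above"` prints the offending keys before the reminder.
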
**b) Test database not reachable:**
```bash
# Check network connectivity from a throwaway container on the same network
docker run --rm --network tidb-network pingcap/dm:latest \
  sh -c "wget -q -O- http://tidb:10080/status"
```

**c) Script syntax error:**
```bash
# Check the init script for syntax errors without running it
sh -n scripts/init-dm.sh
```

### 5. Containers Keep Restarting

#### Check Status:
```bash
docker ps -a
docker logs <container_name>
```

#### Common Issues:

**a) Port already in use:**
```
Error: bind: address already in use
```

**Solution:** Change ports in `docker-compose.yml`:
```yaml
ports:
  - "14000:4000"  # Changed from 4000:4000
```

**b) Out of memory:**
```
Error: OOM killed
```

**Solution:** Increase memory limits or free up system resources.

**c) Permission issues:**
```
Error: permission denied
```

**Solution:** Check volume permissions, or recreate the volumes (this deletes the data stored in them):
```bash
docker compose down -v  # Remove containers and volumes
docker compose up -d    # Recreate
```

### 6. Sync Task Not Running

#### Check Status:
```bash
./status.sh
# or
./sync-control.sh status
```

#### Common Issues:

**a) Task not created:**
```bash
# Check if source is configured
docker exec dm-master /dmctl --master-addr=dm-master:8261 operate-source show
```

**Solution:**
```bash
./sync-control.sh reinit
```

**b) Wrong credentials:**
Check logs:
```bash
docker logs dm-worker
```

Fix `.env` and reinit:
```bash
vim .env
./sync-control.sh reinit
```

**c) Table doesn't exist:**
Verify that the tables exist on the source database:
```sql
-- Connect to the test DB, then run:
SHOW TABLES FROM your_database;
```

### 7. Connection Refused to TiDB

#### Symptom
```
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (61)
```

#### Checks:

**a) Is TiDB running?**
```bash
docker ps | grep tidb
```

**b) Is it healthy?**
```bash
docker ps
# Look for "(healthy)" status
```

**c) Is port exposed?**
```bash
docker port tidb
# Should show: 4000/tcp -> 0.0.0.0:4000
```

**d) Test from inside container:**
```bash
docker exec tidb wget -q -O- http://127.0.0.1:10080/status
```

#### Solutions:

**If container not running:**
```bash
docker compose up -d tidb
docker logs tidb
```

**If unhealthy:**
```bash
# Wait for health check
sleep 15
docker ps

# If still unhealthy, check logs
docker logs tidb
```

**If port not exposed:**
```bash
# Recreate container
docker compose down
docker compose up -d
```

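A fixed `sleep 15` may be too short or too long depending on the machine. As an alternative, the sketch below polls the container's health status until it becomes `healthy` or a timeout expires. The function name `wait_healthy` and its default timeout and interval are assumptions, not part of the project scripts.

```bash
# Poll a container's health status until it is "healthy" or the timeout expires.
# Usage: wait_healthy <container> [timeout_seconds] [interval_seconds]
wait_healthy() {
  local container="$1" timeout="${2:-60}" interval="${3:-2}" elapsed=0 status=""
  while [ "$elapsed" -lt "$timeout" ]; do
    # An unknown container or a missing health check leaves status empty.
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)" || status=""
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "$container not healthy after ${timeout}s (last status: ${status:-unknown})" >&2
  return 1
}
```

A typical call would be `wait_healthy tidb 60 || docker logs tidb`, which shows the logs only when TiDB fails to become healthy in time.
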
### 8. Data Not Syncing

#### Verify sync is running:
```bash
./sync-control.sh status
```

#### Check sync lag:
Look for the "syncer" section in the status output.

#### Common Issues:

**a) Sync paused:**
```bash
./sync-control.sh resume
```

**b) Sync stopped with error:**
```bash
# Check error in status output
./sync-control.sh status

# Fix the issue, then restart
./sync-control.sh restart
```

**c) Network issues:**
```bash
# Test connectivity from dm-worker to source
docker exec dm-worker ping -c 3 your-test-db-host
```

**d) Binlog not enabled on source:**
The source database must have binary logging (binlog) enabled for incremental sync.

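One way to check the binlog settings is to query the source with the `mysql` client from the host. This is a sketch under assumptions: the function names are hypothetical, the host, port, and user are placeholders for your own values, authentication options are left out, and the `ROW` format check reflects DM's documented requirement for row-based binlogs rather than anything stated in this guide.

```bash
# Read one MySQL system variable from the source database.
# Authentication options (for example -p) are omitted; add what your setup needs.
source_variable() {
  local host="$1" port="$2" user="$3" name="$4"
  mysql -h "$host" -P "$port" -u "$user" -N -s \
    -e "SHOW VARIABLES LIKE '$name'" | awk '{ print $2 }'
}

# Succeed only if binary logging is on and uses the ROW format.
check_binlog() {
  local host="$1" port="$2" user="$3" log_bin format
  log_bin="$(source_variable "$host" "$port" "$user" log_bin)"
  format="$(source_variable "$host" "$port" "$user" binlog_format)"
  echo "log_bin=$log_bin binlog_format=$format"
  [ "$log_bin" = "ON" ] && [ "$format" = "ROW" ]
}
```

For example, `check_binlog your-test-db-host 3306 root || echo "binlog is not ready for incremental sync"` prints both settings and flags a source that would not support incremental replication.
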
### 9. Slow Sync Performance

#### Check resource usage:
```bash
docker stats
```

#### Solutions:

**a) Increase worker resources:**
Edit `docker-compose.yml`:
```yaml
dm-worker:
  deploy:
    resources:
      limits:
        cpus: '2'
        memory: 2G
```

**b) Optimize batch size:**
See [TiDB DM Documentation](https://docs.pingcap.com/tidb-data-migration/stable/tune-configuration) for advanced tuning.

### 10. Docker Compose v1 vs v2 Issues

#### Symptom
```
docker: 'compose' is not a docker command
```

#### Solution
See [DOCKER_COMPOSE_V2.md](DOCKER_COMPOSE_V2.md) for:
- Upgrading to v2
- Creating an alias
- Compatibility mode

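Scripts that must run on both versions can detect which command is available before calling it. The helper below is a minimal sketch; the name `compose_cmd` is hypothetical and is not taken from DOCKER_COMPOSE_V2.md or the project scripts.

```bash
# Print the Compose command available on this machine:
# "docker compose" (v2 plugin) if it works, otherwise "docker-compose" (v1 binary).
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    echo "Docker Compose not found" >&2
    return 1
  fi
}
```

The result can be used unquoted so that the two-word form splits into a command and a subcommand, for example `$(compose_cmd) ps`.
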
## Diagnostic Commands

### Check everything at once:
```bash
# Service status
docker compose ps

# Health checks
docker ps

# Logs (all services)
docker compose logs --tail=50

# Logs (specific service)
docker compose logs --tail=50 tidb

# Resource usage
docker stats --no-stream

# Network connectivity
docker network inspect tidb-network
```

### Test connectivity:
```bash
# From host to TiDB
mysql -h 127.0.0.1 -P 4000 -u root -e "SELECT 1"

# From host (HTTP)
curl http://127.0.0.1:10080/status

# From container to TiDB
docker run --rm --network tidb-network pingcap/dm:latest \
  wget -q -O- http://tidb:10080/status
```

### Reset everything:
```bash
# Stop and remove everything (including data)
docker compose down -v

# Start fresh
./start.sh
```

## Getting Help

### Collect diagnostic information:
```bash
# Create a diagnostic report
echo "=== Docker Version ===" > diagnostic.txt
docker --version >> diagnostic.txt
docker compose version >> diagnostic.txt 2>&1

echo -e "\n=== Container Status ===" >> diagnostic.txt
docker ps -a >> diagnostic.txt

echo -e "\n=== TiDB Logs ===" >> diagnostic.txt
docker logs tidb --tail=50 >> diagnostic.txt 2>&1

echo -e "\n=== DM Worker Logs ===" >> diagnostic.txt
docker logs dm-worker --tail=50 >> diagnostic.txt 2>&1

echo -e "\n=== DM Init Logs ===" >> diagnostic.txt
docker logs dm-init >> diagnostic.txt 2>&1

echo -e "\n=== Network Info ===" >> diagnostic.txt
docker network inspect tidb-network >> diagnostic.txt 2>&1

echo "Report saved to diagnostic.txt"
```

### Useful resources:
- [TiDB Documentation](https://docs.pingcap.com/tidb/stable)
- [TiDB DM Documentation](https://docs.pingcap.com/tidb-data-migration/stable)
- Project documentation:
  - [README.md](README.md)
  - [SYNC_GUIDE.md](SYNC_GUIDE.md)
  - [DATAGRIP_SETUP.md](DATAGRIP_SETUP.md)
  - [DOCKER_COMPOSE_V2.md](DOCKER_COMPOSE_V2.md)

## Still Having Issues?

If none of these solutions work:

1. Check logs: `docker compose logs`
2. Create a diagnostic report (see above)
3. Check whether it is a known issue in the TiDB or DM GitHub issue trackers
4. Verify that your environment meets the prerequisites (see [README.md](README.md))