# Data Sync Guide
## How Sync Works
Your TiDB Data Migration (DM) setup continuously syncs data from your test environment to the local TiDB instance.
```
Test TiDB
    │  (DM reads changes)
    ▼
DM Worker
    │  (Applies to local)
    ▼
Local TiDB
```
## Automatic Sync Setup
When you run `./start.sh`, the sync is **automatically configured and started**:
1. ✅ Reads your `.env` file for credentials and table list
2. ✅ Generates the DM task configuration
3. ✅ Configures the source connection (test TiDB)
4. ✅ Starts the sync task
5. ✅ Begins syncing data (full + incremental)
**You don't need to do anything manually!**
## Sync Modes
The sync is configured with `task-mode: "all"`:
- **Full sync**: Initial copy of all existing data
- **Incremental sync**: Continuous replication of changes (INSERT, UPDATE, DELETE)
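As a rough illustration of how the `.env` values could become a DM task configuration with `task-mode: "all"`, here is a minimal sketch. It is not the actual contents of `scripts/init-dm.sh`. The function name `generate_task_config`, the source ID `test-source`, the rule name `sync-tables`, and the target host `tidb` are all assumptions made for the example; only the task name `test-to-local` and the `.env` variables come from this guide.

```bash
#!/usr/bin/env bash
# Sketch only: the real scripts/init-dm.sh may build its config differently.
# "test-source", "sync-tables", and the target host "tidb" are assumed names.

generate_task_config() {
  local database="$1" tables_csv="$2" source_id="${3:-test-source}"
  local table
  local -a tables

  # Split the comma-separated TABLES value into an array.
  IFS=',' read -r -a tables <<< "$tables_csv"

  cat <<EOF
name: "test-to-local"
task-mode: "all"            # full copy first, then incremental replication
target-database:
  host: "tidb"
  port: 4000
  user: "root"
mysql-instances:
  - source-id: "${source_id}"
    block-allow-list: "sync-tables"
block-allow-list:
  sync-tables:
    do-tables:
EOF

  # Emit one allow-list entry per table, ignoring stray whitespace.
  for table in "${tables[@]}"; do
    table="${table//[[:space:]]/}"
    printf '      - db-name: "%s"\n        tbl-name: "%s"\n' "$database" "$table"
  done
}
```

Calling `generate_task_config "$DATABASE_NAME" "$TABLES"` after loading `.env` prints the YAML to standard output, so it can be inspected before being handed to DM.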
## Managing Sync
### Easy Way (Recommended)
Use the [`sync-control.sh`](sync-control.sh) script:
```bash
# Check sync status
./sync-control.sh status
# Stop sync
./sync-control.sh stop
# Start sync
./sync-control.sh start
# Pause sync (temporarily)
./sync-control.sh pause
# Resume sync
./sync-control.sh resume
# Restart sync (stop + start)
./sync-control.sh restart
# Re-initialize configuration
./sync-control.sh reinit
```
### Advanced Way (dmctl)
Use `dmctl` directly:
```bash
# Check status
docker exec dm-master /dmctl --master-addr=dm-master:8261 query-status test-to-local
# Stop task
docker exec dm-master /dmctl --master-addr=dm-master:8261 stop-task test-to-local
# Start task
docker exec dm-master /dmctl --master-addr=dm-master:8261 start-task test-to-local
# Pause task
docker exec dm-master /dmctl --master-addr=dm-master:8261 pause-task test-to-local
# Resume task
docker exec dm-master /dmctl --master-addr=dm-master:8261 resume-task test-to-local
```
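For scripting, the task stage can be extracted from the JSON that `query-status` prints. The helpers below are a minimal sketch: `task_stages` and `all_running` are hypothetical names, and the parsing simply prints every `stage` value it finds in the output, so it assumes the usual `"stage": "Running"` style fields rather than relying on a full JSON parser.

```bash
#!/usr/bin/env bash
# Sketch only: extracts "stage" values from dmctl query-status JSON on stdin.

# Print each stage value (for example Running or Paused), one per line.
task_stages() {
  grep -o '"stage"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Succeed only if at least one stage was found and every stage is "Running".
all_running() {
  local stages
  stages="$(task_stages || true)"
  [ -n "$stages" ] && ! grep -qv '^Running$' <<< "$stages"
}
```

A possible use, with the same `dmctl` invocation as above: `docker exec dm-master /dmctl --master-addr=dm-master:8261 query-status test-to-local | all_running && echo "sync is running"`.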
## Checking Sync Status
### Quick Check
```bash
./status.sh
```
This shows:
- Source configuration
- Task status (running, paused, stopped)
- Current sync position
- Error messages (if any)
- Local databases
### Detailed Status
```bash
./sync-control.sh status
```
### Verify Data Sync
Connect to local TiDB and verify:
```sql
-- Connect from a shell first:
--   mysql -h 127.0.0.1 -P 4000 -u root
-- Check databases
SHOW DATABASES;
-- Switch to your database
USE your_database;
-- Check tables
SHOW TABLES;
-- Verify row count
SELECT COUNT(*) FROM table1;
-- Compare with source (if you have access)
-- Run the same query on test environment
```
## Configuration Files
### Environment Variables (`.env`)
This is where you configure what to sync:
```bash
# Source database
TEST_DB_HOST=your-test-tidb-host
TEST_DB_PORT=4000
TEST_DB_USER=root
TEST_DB_PASSWORD=your-password
# What to sync
DATABASE_NAME=your_database
TABLES="table1,table2,table3"
```
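Before starting the stack, it can help to confirm that `.env` defines everything listed above. The following is a minimal sketch with a hypothetical helper name, `check_env`; it is not part of the provided scripts. It treats `TEST_DB_PASSWORD` as optional, on the assumption that an empty password may be valid for some test instances.

```bash
#!/usr/bin/env bash
# Sketch only: report required .env variables that are unset or empty.
# Returns 0 when all are present, 1 otherwise.

check_env() {
  local env_file="${1:-./.env}"
  local -a required=(TEST_DB_HOST TEST_DB_PORT TEST_DB_USER DATABASE_NAME TABLES)

  [ -f "$env_file" ] || { echo "missing file: $env_file" >&2; return 1; }

  # Work in a subshell so the caller's environment is left untouched.
  (
    missing=0
    # Ignore values inherited from the caller; only the file should count.
    unset "${required[@]}"
    # shellcheck disable=SC1090
    . "$env_file"
    for name in "${required[@]}"; do
      if [ -z "${!name:-}" ]; then
        echo "missing: $name" >&2
        missing=1
      fi
    done
    exit "$missing"
  )
}
```

Running `check_env && ./start.sh` would then refuse to start when a required value is absent.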
### Task Template (`configs/task.yaml`)
This is just a **template for reference**. The actual task config is generated by [`scripts/init-dm.sh`](scripts/init-dm.sh).
### Source Config (`configs/source.yaml`)
A template for the source database connection. Like the task config, the actual file is generated dynamically.
## Common Scenarios
### Adding/Removing Tables
1. Edit `.env` and update the `TABLES` variable:
```bash
TABLES="table1,table2,table3,new_table4"
```
2. Re-initialize:
```bash
./sync-control.sh reinit
```
### Changing Source Database
1. Edit `.env` with new credentials
2. Restart everything:
```bash
docker compose down
./start.sh
```
### Resetting Sync (Fresh Start)
```bash
# Stop and remove containers, networks, and volumes (-v)
docker compose down -v
# Start fresh
./start.sh
```
### Pausing Sync Temporarily
```bash
# Pause (without stopping containers)
./sync-control.sh pause
# Resume when ready
./sync-control.sh resume
```
## Monitoring
### View Logs
```bash
# All services
docker compose logs -f
# DM Worker only
docker compose logs -f dm-worker
# DM Master only
docker compose logs -f dm-master
# Init script logs
docker logs dm-init
```
### Check DM Master Status
```bash
docker exec dm-master /dmctl --master-addr=dm-master:8261 operate-source show
```
### Check DM Worker Status
```bash
docker ps | grep dm-worker
```
## Troubleshooting
### Sync Not Starting
**Check init logs:**
```bash
docker logs dm-init
```
**Common issues:**
- Wrong credentials in `.env`
- Test database not accessible
- Tables don't exist in source
**Solution:**
```bash
# Fix .env, then:
./sync-control.sh reinit
```
### Sync Stopped with Errors
**Check error message:**
```bash
./sync-control.sh status
```
**Common errors:**
- Network connectivity issues
- Permission denied on source
- Table schema mismatch
**Solution:**
```bash
# Fix the underlying issue, then:
./sync-control.sh restart
```
### Data Not Syncing
**Verify task is running:**
```bash
./sync-control.sh status
```
**Check if tables exist:**
```bash
# On source (load .env first so that TEST_DB_HOST is set)
mysql -h "$TEST_DB_HOST" -P 4000 -u root -p -e "SHOW TABLES FROM your_database;"
# On local
mysql -h 127.0.0.1 -P 4000 -u root -e "SHOW TABLES FROM your_database;"
```
**Compare row counts:**
```bash
# Row count on local; run the same query against the source and compare
mysql -h 127.0.0.1 -P 4000 -u root -e "SELECT COUNT(*) FROM your_database.table1;"
```
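The manual comparison can be wrapped in a small script that loops over every table. This is a sketch under stated assumptions: `count_rows` and `compare_counts` are hypothetical helpers, the `TEST_DB_*` variables are expected to be loaded from `.env`, and any password is supplied through the `MYSQL_PWD` environment variable. Counts may differ briefly while incremental sync is catching up, so a mismatch is a prompt to look closer rather than proof of a failure.

```bash
#!/usr/bin/env bash
# Sketch only: compare per-table row counts between source and local TiDB.

# Run COUNT(*) against one server. -N drops the header row, -B gives plain output.
count_rows() {
  local host="$1" port="$2" user="$3" db="$4" table="$5"
  mysql -h "$host" -P "$port" -u "$user" -N -B \
    -e "SELECT COUNT(*) FROM \`$db\`.\`$table\`;"
}

# Compare every table in a comma-separated list; return 1 on any mismatch.
compare_counts() {
  local db="$1" tables_csv="$2" table src dst status=0
  local -a tables
  IFS=',' read -r -a tables <<< "$tables_csv"

  for table in "${tables[@]}"; do
    table="${table//[[:space:]]/}"
    src="$(count_rows "${TEST_DB_HOST:?set TEST_DB_HOST}" "${TEST_DB_PORT:-4000}" "${TEST_DB_USER:-root}" "$db" "$table")"
    dst="$(count_rows 127.0.0.1 4000 root "$db" "$table")"
    if [ "$src" = "$dst" ]; then
      echo "OK   $table: $src rows"
    else
      echo "DIFF $table: source=$src local=$dst"
      status=1
    fi
  done
  return "$status"
}
```

With `.env` loaded into the shell, `compare_counts "$DATABASE_NAME" "$TABLES"` prints one line per table and exits non-zero if any count differs.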
## Performance Tuning
### Adjust Sync Speed
Edit `docker-compose.yml`:
```yaml
dm-worker:
  deploy:
    resources:
      limits:
        cpus: '2'     # Increase CPU
        memory: 2G    # Increase memory
```
Then restart:
```bash
docker compose down
docker compose up -d
```
### Monitor Resource Usage
```bash
docker stats
```
## Best Practices
1. **Always check status** after starting: `./status.sh`
2. **Monitor logs** during initial sync: `docker compose logs -f dm-worker`
3. **Verify data** in local TiDB after sync completes
4. **Use pause/resume** instead of stop/start for temporary halts
5. **Keep `.env` secure** - it contains credentials
6. **Test connectivity** before sync: `./test-connection.sh`
## FAQ
**Q: Is the sync real-time?**
A: Near real-time. Changes are replicated with minimal delay (usually seconds).
**Q: What happens if my laptop sleeps?**
A: Sync resumes automatically once the containers are running again; DM continues from its last saved checkpoint.
**Q: Can I sync from multiple sources?**
A: Yes, but it requires manual DM configuration. This setup handles a single source.
**Q: Does it sync schema changes?**
A: Yes, DDL statements are replicated (CREATE, ALTER, DROP).
**Q: Can I sync to a different database name locally?**
A: That requires a custom task configuration. By default, data syncs to the same database name.
**Q: How do I exclude certain tables?**
A: Remove them from `TABLES` in `.env` and run `./sync-control.sh reinit`.
## See Also
- [README.md](README.md) - Main documentation
- [DATAGRIP_SETUP.md](DATAGRIP_SETUP.md) - Connect with GUI clients
- [scripts/init-dm.sh](scripts/init-dm.sh) - Initialization script
- [TiDB DM Documentation](https://docs.pingcap.com/tidb-data-migration/stable)