# Data Sync Guide

## How Sync Works

Your TiDB Data Migration (DM) setup continuously syncs data from your test environment to the local TiDB instance.

```
Test TiDB ──────┐
                │ (DM reads changes)
                │
                ▼
            DM Worker
                │ (Applies to local)
                │
                ▼
            Local TiDB
```

## Automatic Sync Setup

When you run `./start.sh`, the sync is **automatically configured and started**:

1. ✅ Reads your `.env` file for credentials and table list
2. ✅ Generates the DM task configuration
3. ✅ Configures the source connection (test TiDB)
4. ✅ Starts the sync task
5. ✅ Begins syncing data (full + incremental)

**You don't need to do anything manually!**

## Sync Modes

The sync is configured with `task-mode: "all"`:

- **Full sync**: Initial copy of all existing data
- **Incremental sync**: Continuous replication of changes (INSERT, UPDATE, DELETE)

## Managing Sync

### Easy Way (Recommended)

Use the [`sync-control.sh`](sync-control.sh) script:

```bash
# Check sync status
./sync-control.sh status

# Stop sync
./sync-control.sh stop

# Start sync
./sync-control.sh start

# Pause sync (temporarily)
./sync-control.sh pause

# Resume sync
./sync-control.sh resume

# Restart sync (stop + start)
./sync-control.sh restart

# Re-initialize configuration
./sync-control.sh reinit
```

### Advanced Way (dmctl)

Use `dmctl` directly:

```bash
# Check status
docker exec dm-master /dmctl --master-addr=dm-master:8261 query-status test-to-local

# Stop task
docker exec dm-master /dmctl --master-addr=dm-master:8261 stop-task test-to-local

# Start task
docker exec dm-master /dmctl --master-addr=dm-master:8261 start-task test-to-local

# Pause task
docker exec dm-master /dmctl --master-addr=dm-master:8261 pause-task test-to-local

# Resume task
docker exec dm-master /dmctl --master-addr=dm-master:8261 resume-task test-to-local
```

## Checking Sync Status

### Quick Check

```bash
./status.sh
```

This shows:

- Source configuration
- Task status (running, paused, stopped)
- Current sync position
- Error messages (if any)
- Local databases

### Detailed Status

```bash
./sync-control.sh status
```

### Verify Data Sync

Connect to local TiDB and verify:

```sql
-- Connect from a shell first:
--   mysql -h 127.0.0.1 -P 4000 -u root

-- Check databases
SHOW DATABASES;

-- Switch to your database
USE your_database;

-- Check tables
SHOW TABLES;

-- Verify row count
SELECT COUNT(*) FROM table1;

-- Compare with source (if you have access):
-- run the same query on the test environment
```

## Configuration Files

### Environment Variables (`.env`)

This is where you configure what to sync:

```bash
# Source database
TEST_DB_HOST=your-test-tidb-host
TEST_DB_PORT=4000
TEST_DB_USER=root
TEST_DB_PASSWORD=your-password

# What to sync
DATABASE_NAME=your_database
TABLES="table1,table2,table3"
```

### Task Template (`configs/task.yaml`)

This is just a **template for reference**. The actual task config is generated by [`scripts/init-dm.sh`](scripts/init-dm.sh).

### Source Config (`configs/source.yaml`)

Template for the source database connection. The actual source config is also generated dynamically.

## Common Scenarios

### Adding/Removing Tables

1. Edit `.env` and update the `TABLES` variable:

   ```bash
   TABLES="table1,table2,table3,new_table4"
   ```

2. Re-initialize:

   ```bash
   ./sync-control.sh reinit
   ```

### Changing Source Database

1. Edit `.env` with the new credentials
2. Restart everything:

   ```bash
   docker compose down
   ./start.sh
   ```

### Resetting Sync (Fresh Start)

```bash
# Stop and remove everything (including volumes)
docker compose down -v

# Start fresh
./start.sh
```

### Pausing Sync Temporarily

```bash
# Pause (without stopping containers)
./sync-control.sh pause

# Resume when ready
./sync-control.sh resume
```

## Monitoring

### View Logs

```bash
# All services
docker compose logs -f

# DM Worker only
docker compose logs -f dm-worker

# DM Master only
docker compose logs -f dm-master

# Init script logs
docker logs dm-init
```

### Check DM Master Status

```bash
docker exec dm-master /dmctl --master-addr=dm-master:8261 operate-source show
```

### Check DM Worker Status

```bash
docker ps | grep dm-worker
```

## Troubleshooting

### Sync Not Starting

**Check init logs:**

```bash
docker logs dm-init
```

**Common issues:**

- Wrong credentials in `.env`
- Test database not accessible
- Tables don't exist in source

**Solution:**

```bash
# Fix .env, then:
./sync-control.sh reinit
```

### Sync Stopped with Errors

**Check error message:**

```bash
./sync-control.sh status
```

**Common errors:**

- Network connectivity issues
- Permission denied on source
- Table schema mismatch

**Solution:**

```bash
# Fix the underlying issue, then:
./sync-control.sh restart
```

### Data Not Syncing

**Verify task is running:**

```bash
./sync-control.sh status
```

**Check if tables exist:**

```bash
# On source (host, port, and user as set in .env; load .env into your shell first)
mysql -h "$TEST_DB_HOST" -P "$TEST_DB_PORT" -u "$TEST_DB_USER" -p -e "SHOW TABLES FROM your_database;"

# On local
mysql -h 127.0.0.1 -P 4000 -u root -e "SHOW TABLES FROM your_database;"
```

**Compare row counts:**

```bash
# Count rows locally, then run the same query against the source and compare
mysql -h 127.0.0.1 -P 4000 -u root -e "SELECT COUNT(*) FROM your_database.table1;"
```

## Performance Tuning

### Adjust Sync Speed

Edit `docker-compose.yml`:

```yaml
dm-worker:
  deploy:
    resources:
      limits:
        cpus: '2'     # Increase CPU
        memory: 2G    # Increase memory
```

Then restart:

```bash
docker compose down
docker compose up -d
```

### Monitor Resource Usage

```bash
docker stats
```

## Best Practices

1. **Always check status** after starting: `./status.sh`
2. **Monitor logs** during initial sync: `docker compose logs -f dm-worker`
3. **Verify data** in local TiDB after sync completes
4. **Use pause/resume** instead of stop/start for temporary halts
5. **Keep `.env` secure**: it contains credentials
6. **Test connectivity** before sync: `./test-connection.sh`

## FAQ

**Q: Is the sync real-time?**
A: Near real-time. Changes are replicated with minimal delay (usually seconds).

**Q: What happens if my laptop sleeps?**
A: Sync will resume automatically when containers restart.

**Q: Can I sync from multiple sources?**
A: Yes, but that requires manual DM configuration. This setup is for a single source.

**Q: Does it sync schema changes?**
A: Yes, DDL statements are replicated (CREATE, ALTER, DROP).

**Q: Can I sync to a different database name locally?**
A: That requires a custom task configuration. By default, data syncs to the same database name.

**Q: How do I exclude certain tables?**
A: Remove them from `TABLES` in `.env` and run `./sync-control.sh reinit`.

## See Also

- [README.md](README.md) - Main documentation
- [DATAGRIP_SETUP.md](DATAGRIP_SETUP.md) - Connect with GUI clients
- [scripts/init-dm.sh](scripts/init-dm.sh) - Initialization script
- [TiDB DM Documentation](https://docs.pingcap.com/tidb-data-migration/stable)
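
## Appendix: Row-Count Check (Sketch)

The "Compare row counts" step under Troubleshooting can be scripted so that every table listed in `TABLES` is checked at once. The following is a minimal sketch, not part of this setup: the function names `row_count`, `report`, and `compare_tables` are hypothetical, it assumes the `.env` variables shown in this guide are already loaded into the shell, and it assumes the local TiDB accepts `root` with no password on `127.0.0.1:4000`, as in the examples above. Passing the password through the `MYSQL_PWD` environment variable is also an assumption about your `mysql` client.

```bash
# Hypothetical helpers: compare per-table row counts on source and local TiDB.

# Print COUNT(*) for one table.
# Args: host port user password database table
row_count() {
  MYSQL_PWD="$4" mysql -h "$1" -P "$2" -u "$3" -N -B \
    -e "SELECT COUNT(*) FROM \`$5\`.\`$6\`;"
}

# Print one result line; return 1 when the counts differ.
# Args: table source_count local_count
report() {
  if [ "$2" = "$3" ]; then
    echo "OK $1 $2"
  else
    echo "MISMATCH $1 source=$2 local=$3"
    return 1
  fi
}

# Check every table in the comma-separated TABLES variable.
# Returns 1 if any table has differing counts.
compare_tables() {
  status=0
  for table in $(printf '%s' "$TABLES" | tr ',' ' '); do
    src=$(row_count "$TEST_DB_HOST" "${TEST_DB_PORT:-4000}" \
      "$TEST_DB_USER" "$TEST_DB_PASSWORD" "$DATABASE_NAME" "$table")
    dst=$(row_count 127.0.0.1 4000 root "" "$DATABASE_NAME" "$table")
    report "$table" "$src" "$dst" || status=1
  done
  return "$status"
}
```

To try it, load the variables with `set -a; . ./.env; set +a` and then call `compare_tables`. Counts can differ briefly while incremental replication is catching up, so a mismatch is only meaningful if it persists across repeated runs.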