# Data Sync Guide

## How Sync Works

Your TiDB Data Migration (DM) setup continuously syncs data from your test environment to the local TiDB instance.

```
Test TiDB ──────┐
                │
        (DM reads changes)
                │
                ▼
            DM Worker
                │
        (Applies to local)
                │
                ▼
           Local TiDB
```

## Automatic Sync Setup

When you run `./start.sh`, the sync is **automatically configured and started**:

1. ✅ Reads your `.env` file for credentials and table list
2. ✅ Generates the DM task configuration
3. ✅ Configures the source connection (test TiDB)
4. ✅ Starts the sync task
5. ✅ Begins syncing data (full + incremental)

**You don't need to do anything manually!**

## Sync Modes

The sync is configured with `task-mode: "all"`:

- **Full sync**: Initial copy of all existing data
- **Incremental sync**: Continuous replication of changes (INSERT, UPDATE, DELETE)

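
To make `task-mode: "all"` concrete, here is a minimal sketch of how a task config could be rendered from a database name and a comma-separated table list. It is not the project's `scripts/init-dm.sh`: the function name `render_task_config`, the source ID `test-tidb`, the rule name `sync-rule`, and the target host `tidb` are assumptions made for illustration. Only the task name `test-to-local` and the `task-mode` value come from this guide.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: print a DM task config for one database and a
# comma-separated table list. Names marked "assumed" are illustrative only.

render_task_config() {
  local db="$1"       # database to replicate, e.g. the value of DATABASE_NAME
  local tables="$2"   # comma-separated tables, e.g. the value of TABLES
  local tbl old_ifs

  cat <<EOF
name: "test-to-local"
task-mode: "all"            # full copy first, then incremental replication
target-database:
  host: "tidb"              # assumed service name of the local TiDB
  port: 4000
  user: "root"
  password: ""
mysql-instances:
  - source-id: "test-tidb"  # assumed source ID
    block-allow-list: "sync-rule"
block-allow-list:
  sync-rule:
    do-tables:
EOF

  # Emit one allow-list entry per table (no spaces expected around commas).
  old_ifs="$IFS"
  IFS=','
  for tbl in $tables; do
    printf '      - db-name: "%s"\n        tbl-name: "%s"\n' "$db" "$tbl"
  done
  IFS="$old_ifs"
}
```

Calling `render_task_config your_database "table1,table2"` prints the YAML to stdout, so it can be inspected before anything is handed to DM.
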
## Managing Sync

### Easy Way (Recommended)

Use the [`sync-control.sh`](sync-control.sh) script:

```bash
# Check sync status
./sync-control.sh status

# Stop sync
./sync-control.sh stop

# Start sync
./sync-control.sh start

# Pause sync (temporarily)
./sync-control.sh pause

# Resume sync
./sync-control.sh resume

# Restart sync (stop + start)
./sync-control.sh restart

# Re-initialize configuration
./sync-control.sh reinit
```

### Advanced Way (dmctl)

Use `dmctl` directly:

```bash
# Check status
docker exec dm-master /dmctl --master-addr=dm-master:8261 query-status test-to-local

# Stop task
docker exec dm-master /dmctl --master-addr=dm-master:8261 stop-task test-to-local

# Start task
docker exec dm-master /dmctl --master-addr=dm-master:8261 start-task test-to-local

# Pause task
docker exec dm-master /dmctl --master-addr=dm-master:8261 pause-task test-to-local

# Resume task
docker exec dm-master /dmctl --master-addr=dm-master:8261 resume-task test-to-local
```

## Checking Sync Status

### Quick Check

```bash
./status.sh
```

This shows:

- Source configuration
- Task status (running, paused, stopped)
- Current sync position
- Error messages (if any)
- Local databases

### Detailed Status

```bash
./sync-control.sh status
```

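
For scripted checks, the task stage can be pulled out of the status output. The sketch below assumes that `dmctl query-status` prints JSON containing `"stage": "<Name>"` fields (for example `Running` or `Paused`). The helper names `task_stages` and `sync_is_running` are hypothetical, and the exact output shape may differ between DM versions.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: read dmctl query-status output on stdin.

# Print every "stage" value found in the input, one per line.
task_stages() {
  grep -o '"stage": *"[A-Za-z]*"' | sed 's/.*"\([A-Za-z]*\)"$/\1/'
}

# Succeed only when at least one stage is reported and all are "Running".
sync_is_running() {
  local stages
  stages="$(task_stages)"
  [ -n "$stages" ] && ! printf '%s\n' "$stages" | grep -qv '^Running$'
}
```

A possible use: `docker exec dm-master /dmctl --master-addr=dm-master:8261 query-status test-to-local | sync_is_running && echo "sync is running"`.
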
### Verify Data Sync

Connect to local TiDB and verify:

```sql
-- Connect from a shell first:
--   mysql -h 127.0.0.1 -P 4000 -u root

-- Check databases
SHOW DATABASES;

-- Switch to your database
USE your_database;

-- Check tables
SHOW TABLES;

-- Verify row count
SELECT COUNT(*) FROM table1;

-- Compare with source (if you have access):
-- run the same query on the test environment
```

## Configuration Files

### Environment Variables (`.env`)

This is where you configure what to sync:

```bash
# Source database
TEST_DB_HOST=your-test-tidb-host
TEST_DB_PORT=4000
TEST_DB_USER=root
TEST_DB_PASSWORD=your-password

# What to sync
DATABASE_NAME=your_database
TABLES="table1,table2,table3"
```

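
Before starting a sync, it can help to confirm that every variable listed above is present. The following is a small sketch, not part of the project's scripts: the function name `check_env_file` is hypothetical, and it assumes that none of these values may be empty, which may not hold if your source user has no password.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: report required keys that are missing or empty
# in an env file. Returns non-zero when anything is missing.

check_env_file() {
  local env_file="${1:-.env}"
  local missing=0
  local key

  if [ ! -f "$env_file" ]; then
    echo "no such file: ${env_file}"
    return 1
  fi

  for key in TEST_DB_HOST TEST_DB_PORT TEST_DB_USER TEST_DB_PASSWORD DATABASE_NAME TABLES; do
    # A key counts as set when its value, after an optional opening quote,
    # starts with a character that is neither a quote nor whitespace.
    if ! grep -Eq "^${key}=\"?[^\"[:space:]]" "$env_file"; then
      echo "missing or empty: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_env_file .env` prints one line per problem and exits non-zero if any were found.
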
### Task Template (`configs/task.yaml`)

This is just a **template for reference**. The actual task config is generated by [`scripts/init-dm.sh`](scripts/init-dm.sh).

### Source Config (`configs/source.yaml`)

Template for the source database connection. Like the task config, the actual file is generated dynamically.

## Common Scenarios

### Adding/Removing Tables

1. Edit `.env` and update the `TABLES` variable:

   ```bash
   TABLES="table1,table2,table3,new_table4"
   ```

2. Re-initialize:

   ```bash
   ./sync-control.sh reinit
   ```

### Changing Source Database

1. Edit `.env` with the new credentials
2. Restart everything:

   ```bash
   docker compose down
   ./start.sh
   ```

### Resetting Sync (Fresh Start)

```bash
# Stop and remove everything (-v also deletes the volumes, including local data)
docker compose down -v

# Start fresh
./start.sh
```

### Pausing Sync Temporarily

```bash
# Pause (without stopping containers)
./sync-control.sh pause

# Resume when ready
./sync-control.sh resume
```

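
If you pause sync in order to run a single command against the local database, a wrapper can make sure the resume step is not forgotten. This is a sketch only: `with_sync_paused` and the `SYNC_CONTROL` override variable are hypothetical names, and it assumes `./sync-control.sh pause` and `resume` exit non-zero on failure.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run a command while sync is paused, then resume.
# SYNC_CONTROL is an assumed override for the control script path.

with_sync_paused() {
  local ctl="${SYNC_CONTROL:-./sync-control.sh}"
  local rc=0

  "$ctl" pause || return 1

  # Run the wrapped command and remember its exit code.
  "$@" || rc=$?

  # Resume even if the wrapped command failed.
  "$ctl" resume || return 1
  return "$rc"
}
```

For example, `with_sync_paused mysql -h 127.0.0.1 -P 4000 -u root -e "SELECT COUNT(*) FROM your_database.table1;"` would take the count while replication is paused.
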
## Monitoring

### View Logs

```bash
# All services
docker compose logs -f

# DM Worker only
docker compose logs -f dm-worker

# DM Master only
docker compose logs -f dm-master

# Init script logs
docker logs dm-init
```

### Check DM Master Status

```bash
docker exec dm-master /dmctl --master-addr=dm-master:8261 operate-source show
```

### Check DM Worker Status

```bash
docker ps | grep dm-worker
```

## Troubleshooting

### Sync Not Starting

**Check init logs:**

```bash
docker logs dm-init
```

**Common issues:**

- Wrong credentials in `.env`
- Test database not accessible
- Tables don't exist in source

**Solution:**

```bash
# Fix .env, then:
./sync-control.sh reinit
```

### Sync Stopped with Errors

**Check error message:**

```bash
./sync-control.sh status
```

**Common errors:**

- Network connectivity issues
- Permission denied on source
- Table schema mismatch

**Solution:**

```bash
# Fix the underlying issue, then:
./sync-control.sh restart
```

### Data Not Syncing

**Verify task is running:**

```bash
./sync-control.sh status
```

**Check if tables exist:**

```bash
# On source (assumes TEST_DB_HOST is set in your shell)
mysql -h "$TEST_DB_HOST" -P 4000 -u root -p -e "SHOW TABLES FROM your_database;"

# On local
mysql -h 127.0.0.1 -P 4000 -u root -e "SHOW TABLES FROM your_database;"
```

**Compare row counts:**

```bash
# Row count on local; run the same query against the source and compare
mysql -h 127.0.0.1 -P 4000 -u root -e "SELECT COUNT(*) FROM your_database.table1;"
```

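
The row-count comparison can be wrapped in a small verification script. The sketch below is an illustration under assumptions: `count_rows` and `compare_counts` are hypothetical helper names, and the `mysql` client is assumed to be installed. Counts can differ briefly while incremental sync is catching up, so a mismatch is a hint to re-check rather than proof of a problem.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: compare row counts between source and local.

# Print the row count for one table. -N and -B make mysql print the bare
# number without column headers. Extra arguments (host, port, user, ...)
# are passed through to the mysql client.
count_rows() {
  local table="$1"
  shift
  mysql "$@" -N -B -e "SELECT COUNT(*) FROM ${table};"
}

# Print a one-line verdict; return non-zero on mismatch.
compare_counts() {
  local table="$1" source_count="$2" local_count="$3"
  if [ "$source_count" -eq "$local_count" ]; then
    echo "OK ${table}: ${local_count} rows"
  else
    echo "MISMATCH ${table}: source=${source_count} local=${local_count}"
    return 1
  fi
}

# Example usage (not executed here):
#   src="$(count_rows your_database.table1 -h "$TEST_DB_HOST" -P 4000 -u root -p)"
#   loc="$(count_rows your_database.table1 -h 127.0.0.1 -P 4000 -u root)"
#   compare_counts your_database.table1 "$src" "$loc"
```
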
## Performance Tuning

### Adjust Sync Speed

Edit `docker-compose.yml`:

```yaml
dm-worker:
  deploy:
    resources:
      limits:
        cpus: '2'    # Increase CPU
        memory: 2G   # Increase memory
```

Then restart:

```bash
docker compose down
docker compose up -d
```

### Monitor Resource Usage

```bash
docker stats
```

## Best Practices

1. **Always check status** after starting: `./status.sh`
2. **Monitor logs** during initial sync: `docker compose logs -f dm-worker`
3. **Verify data** in local TiDB after sync completes
4. **Use pause/resume** instead of stop/start for temporary halts
5. **Keep `.env` secure** - it contains credentials
6. **Test connectivity** before sync: `./test-connection.sh`

## FAQ

**Q: Is the sync real-time?**
A: Near real-time. Changes are replicated with minimal delay (usually seconds).

**Q: What happens if my laptop sleeps?**
A: Sync will resume automatically when the containers restart.

**Q: Can I sync from multiple sources?**
A: Yes, but it requires manual DM configuration. This setup handles a single source.

**Q: Does it sync schema changes?**
A: Yes, DDL statements are replicated (CREATE, ALTER, DROP).

**Q: Can I sync to a different database name locally?**
A: That requires a custom task configuration. By default, data syncs to the same database name.

**Q: How do I exclude certain tables?**
A: Remove them from `TABLES` in `.env` and run `./sync-control.sh reinit`.

## See Also

- [README.md](README.md) - Main documentation
- [DATAGRIP_SETUP.md](DATAGRIP_SETUP.md) - Connect with GUI clients
- [scripts/init-dm.sh](scripts/init-dm.sh) - Initialization script
- [TiDB DM Documentation](https://docs.pingcap.com/tidb-data-migration/stable)