Commit b562f81

Support tidb cdc connector source #7199
1 parent c859071 commit b562f81

File tree

37 files changed

+4184
-0
lines changed

config/plugin_config

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ connector-cdc-mongodb
 connector-cdc-sqlserver
 connector-cdc-postgres
 connector-cdc-oracle
+connector-cdc-tidb
 connector-clickhouse
 connector-datahub
 connector-dingtalk

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
1+
# TiDB CDC
2+
3+
> TiDB CDC source connector
4+
5+
## Supported Engines
6+
7+
> SeaTunnel Zeta<br/>
8+
> Flink <br/>
9+
10+
## Key features
11+
12+
- [ ] [batch](../../concept/connector-v2-features.md)
13+
- [x] [stream](../../concept/connector-v2-features.md)
14+
- [x] [exactly-once](../../concept/connector-v2-features.md)
15+
- [ ] [column projection](../../concept/connector-v2-features.md)
16+
- [x] [parallelism](../../concept/connector-v2-features.md)
17+
- [ ] [support user-defined split](../../concept/connector-v2-features.md)
18+
19+
## Description
20+
21+
The TiDB CDC connector reads snapshot data and incremental data from a TiDB database. This document
describes how to set up the TiDB CDC connector to snapshot data and capture streaming events in a TiDB database.
23+
24+
## Supported DataSource Info
25+
26+
| Datasource       | Supported versions                                                                                                                                   | Driver                   | URL                              | Maven                                                                |
|------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|----------------------------------|----------------------------------------------------------------------|
| MySQL            | <li> [MySQL](https://dev.mysql.com/doc): 5.5, 5.6, 5.7, 8.0.x </li><li> [RDS MySQL](https://www.aliyun.com/product/rds/mysql): 5.6, 5.7, 8.0.x </li> | com.mysql.cj.jdbc.Driver | jdbc:mysql://localhost:3306/test | https://mvnrepository.com/artifact/mysql/mysql-connector-java/8.0.28 |
| tikv-client-java | 3.2.0                                                                                                                                                | -                        | -                                | https://mvnrepository.com/artifact/org.tikv/tikv-client-java/3.2.0   |
30+
31+
## Using Dependency
32+
33+
### Install JDBC Driver
34+
35+
#### For Flink Engine
36+
37+
> 1. You need to ensure that the [JDBC driver jar package](https://mvnrepository.com/artifact/mysql/mysql-connector-java) and the [tikv-client-java jar package](https://mvnrepository.com/artifact/org.tikv/tikv-client-java/3.2.0) have been placed in the directory `${SEATUNNEL_HOME}/plugins/`.
38+
39+
#### For SeaTunnel Zeta Engine
40+
41+
> 1. You need to ensure that the [JDBC driver jar package](https://mvnrepository.com/artifact/mysql/mysql-connector-java) and the [tikv-client-java jar package](https://mvnrepository.com/artifact/org.tikv/tikv-client-java/3.2.0) have been placed in the directory `${SEATUNNEL_HOME}/lib/`.
42+
43+
Please download the MySQL driver and tikv-client-java jars and put them in the `${SEATUNNEL_HOME}/lib/` directory. For example: `cp mysql-connector-java-xxx.jar $SEATUNNEL_HOME/lib/`
44+
45+
## Data Type Mapping
46+
47+
|                                        MySQL Data Type                                          | SeaTunnel Data Type |
|-------------------------------------------------------------------------------------------------|---------------------|
| BIT(1)<br/>TINYINT(1)                                                                           | BOOLEAN             |
| TINYINT                                                                                         | TINYINT             |
| TINYINT UNSIGNED<br/>SMALLINT                                                                   | SMALLINT            |
| SMALLINT UNSIGNED<br/>MEDIUMINT<br/>MEDIUMINT UNSIGNED<br/>INT<br/>INTEGER<br/>YEAR             | INT                 |
| INT UNSIGNED<br/>INTEGER UNSIGNED<br/>BIGINT                                                    | BIGINT              |
| BIGINT UNSIGNED                                                                                 | DECIMAL(20,0)       |
| DECIMAL(p, s) <br/>DECIMAL(p, s) UNSIGNED <br/>NUMERIC(p, s) <br/>NUMERIC(p, s) UNSIGNED        | DECIMAL(p,s)        |
| FLOAT<br/>FLOAT UNSIGNED                                                                        | FLOAT               |
| DOUBLE<br/>DOUBLE UNSIGNED<br/>REAL<br/>REAL UNSIGNED                                           | DOUBLE              |
| CHAR<br/>VARCHAR<br/>TINYTEXT<br/>MEDIUMTEXT<br/>TEXT<br/>LONGTEXT<br/>ENUM<br/>JSON            | STRING              |
| DATE                                                                                            | DATE                |
| TIME(s)                                                                                         | TIME(s)             |
| DATETIME<br/>TIMESTAMP(s)                                                                       | TIMESTAMP(s)        |
| BINARY<br/>VARBINARY<br/>BIT(p)<br/>TINYBLOB<br/>MEDIUMBLOB<br/>BLOB<br/>LONGBLOB <br/>GEOMETRY | BYTES               |
63+
64+
## Source Options
65+
66+
| Name | Type | Required | Default | Description |
67+
|------------------------------|---------|----------|---------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
68+
| base-url | String | Yes | - | The URL of the JDBC connection. Refer to a case: `jdbc:mysql://tidb0:4000/inventory`. |
69+
| username | String | Yes | - | Name of the database to use when connecting to the database server. |
70+
| password | String | Yes | - | Password to use when connecting to the database server. |
71+
| pd-addresses | String | Yes | - | TiKV cluster's PD address |
72+
| database-name | List | Yes | - | Database name of the database to monitor. |
73+
| table-name | List | Yes | - | Table name of the database to monitor. The table name needs to include the database name. |
74+
| startup.mode | Enum | No | INITIAL | Optional startup mode for TiDB CDC consumer, valid enumerations are `initial`, `earliest`, `latest` and `specific`. <br/> `initial`: Synchronize historical data at startup, and then synchronize incremental data.<br/> `earliest`: Startup from the earliest offset possible.<br/> `latest`: Startup from the latest offset.<br/> `specific`: Startup from user-supplied specific offsets. |
75+
| tikv.grpc.timeout_in_ms | Long | No | - | TiKV GRPC timeout in ms. |
76+
| tikv.grpc.scan_timeout_in_ms | Long | No | - | TiKV GRPC scan timeout in ms. |
77+
| tikv.batch_get_concurrency | Integer | No | - | TiKV GRPC batch get concurrency |
78+
| tikv.batch_scan_concurrency | Integer | No | - | TiKV GRPC batch scan concurrency |
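
As a rough illustration of how the optional settings in the table combine with the required ones, the fragment below sketches a source block that skips the snapshot phase and tunes the TiKV client. The hosts and database names are the placeholders used elsewhere in this document. The timeout and concurrency values are illustrative assumptions, not recommendations, and the lowercase `latest` spelling is assumed from the enumeration names listed in the table.

```
source {
  TiDB-CDC {
    # Required options (placeholder hosts and names)
    base-url = "jdbc:mysql://tidb0:4000/inventory"
    pd-addresses = "pd0:2379"
    username = "root"
    password = ""
    database-name = "inventory"
    table-name = "products"

    # Optional: start from the latest offset instead of taking a snapshot first
    # (assumed spelling; the table lists initial, earliest, latest and specific)
    startup.mode = "latest"

    # Optional TiKV client tuning; values are illustrative assumptions
    tikv.grpc.timeout_in_ms = 20000
    tikv.grpc.scan_timeout_in_ms = 20000
    tikv.batch_get_concurrency = 20
    tikv.batch_scan_concurrency = 5
  }
}
```

Leaving `startup.mode` unset falls back to the documented default, which synchronizes historical data first and then switches to incremental data.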
79+
80+
## Task Example
81+
82+
### Simple
83+
84+
```
env {
  parallelism = 1
  job.mode = "STREAMING"
  checkpoint.interval = 5000
}

source {
  # This is an example source plugin, **only for testing and demonstrating the source plugin feature**
  TiDB-CDC {
    result_table_name = "products_tidb_cdc"
    base-url = "jdbc:mysql://tidb0:4000/inventory"
    driver = "com.mysql.cj.jdbc.Driver"
    tikv.grpc.timeout_in_ms = 20000
    pd-addresses = "pd0:2379"
    username = "root"
    password = ""
    database-name = "inventory"
    table-name = "products"
  }
}

transform {
}

sink {
  jdbc {
    source_table_name = "products_tidb_cdc"
    url = "jdbc:mysql://tidb0:4000/inventory"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "root"
    password = ""
    database = "inventory"
    table = "products_sink"
    generate_sink_sql = true
    primary_keys = ["id"]
  }
}
```
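
To check that change events are arriving before writing them to a real target, the `jdbc` sink in the example above could be swapped for SeaTunnel's `Console` sink, which prints incoming rows to the job log. This is a minimal sketch, assuming the `Console` sink plugin is installed and that the source still registers its output as `products_tidb_cdc`:

```
sink {
  # Prints each captured row to the job log; intended for local verification only
  Console {
    source_table_name = "products_tidb_cdc"
  }
}
```

The `env` and `source` blocks stay as in the example above; only the `sink` block changes.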
123+
124+
## Changelog
125+
126+
- Add TiDB CDC Source Connector
127+
128+
### next version
129+
