Kettle Doris Plugin
The Kettle Doris Plugin writes data from other data sources into Doris from within Kettle.
The plug-in imports data through the Stream Load function of Doris and must be used together with Kettle.
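For orientation, Stream Load is an HTTP interface on the Doris FE, so an import boils down to an HTTP PUT against a table-specific URL. The sketch below shows the general shape of such a request using curl. The host, port, database, table, and the `DORIS_USER` / `DORIS_PASSWORD` variables are placeholders, and the helper function names are made up for this example. It illustrates the interface only and is not the plug-in's actual implementation.

```shell
#!/usr/bin/env bash
# Sketch only: host, database, table, and credentials are placeholders.

# Build the Stream Load endpoint for one FE node, database, and table.
stream_load_url() {
  local fe_node="$1" database="$2" table="$3"
  printf 'http://%s/api/%s/%s/_stream_load\n' "$fe_node" "$database" "$table"
}

# Upload a local CSV file with curl. Defined for illustration, not invoked here.
# Assumes DORIS_USER and DORIS_PASSWORD are set in the environment.
stream_load_csv() {
  local fe_node="$1" database="$2" table="$3" file="$4"
  curl --location-trusted \
    -u "${DORIS_USER}:${DORIS_PASSWORD}" \
    -H "Expect: 100-continue" \
    -H "column_separator:," \
    -T "$file" \
    -XPUT "$(stream_load_url "$fe_node" "$database" "$table")"
}

# Print the endpoint for a hypothetical local FE.
stream_load_url "127.0.0.1:8030" "demo_db" "demo_table"
# prints: http://127.0.0.1:8030/api/demo_db/demo_table/_stream_load
```

The plug-in issues requests of this kind on your behalf, so in normal use you only fill in the step parameters in Kettle.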
About Kettle
Kettle is an open source ETL (Extract, Transform, Load) tool first developed by Pentaho and one of the core components of the Pentaho product suite. It is mainly used for data integration and data processing: extracting data from various sources, cleaning and transforming it, and loading it into a target system.
For more information, please refer to: https://pentaho.com/
User Manual
Download and install Kettle
Download Kettle from https://pentaho.com/download/#download-pentaho. After downloading, unzip the archive and run spoon.sh to start Kettle. You can also compile Kettle yourself; refer to the Compilation chapter.
Compile Kettle Doris Plugin
cd doris/extension/kettle
mvn clean package -DskipTests
After compiling, unzip the plug-in package and copy it to the plugins directory of Kettle:
cd assemblies/plugin/target
unzip doris-stream-loader-plugins-9.4.0.0-343.zip
cp -r doris-stream-loader ${KETTLE_HOME}/plugins/
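A quick way to confirm that the plug-in landed in the right place is to check for the `doris-stream-loader` directory under Kettle's plugins folder. The following is a minimal sketch: the function name and the `/opt/kettle` fallback path are assumptions made for this example, so substitute your own `KETTLE_HOME`.

```shell
#!/usr/bin/env bash
# Sketch only: /opt/kettle is a hypothetical fallback, not a documented default.

# Report whether the plug-in directory exists under the given Kettle home.
check_plugin_installed() {
  local kettle_home="$1"
  if [ -d "${kettle_home}/plugins/doris-stream-loader" ]; then
    echo "installed"
  else
    echo "missing"
  fi
}

check_plugin_installed "${KETTLE_HOME:-/opt/kettle}"
```

If the check prints `installed` but the step does not appear in Kettle, restarting Spoon is a reasonable next thing to try.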
Build a job
In Kettle, find Doris Stream Loader under batch loading and use it to build a job.
Click Start Running the Job to run it and complete the data synchronization.
Parameter Description
Key | Default Value | Required | Comment |
---|---|---|---|
Step name | -- | Y | Step name |
fenodes | -- | Y | Doris FE HTTP address; multiple addresses are supported, separated by commas |
Database | -- | Y | Doris database to write to |
Target table | -- | Y | Doris table to write to |
Username | -- | Y | Username to access Doris |
Password | -- | N | Password to access Doris |
Maximum number of rows for a single import | 10000 | N | Maximum number of rows for a single import |
Maximum bytes for a single import | 10485760 (10MB) | N | Maximum byte size for a single import |
Number of import retries | 3 | N | Number of retries after import failure |
StreamLoad properties | -- | N | Stream Load request headers |
Delete Mode | N | N | Whether to enable delete mode. By default, Stream Load performs insert operations. After the delete mode is enabled, all Stream Load writes are delete operations. |
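To make the two batch-size limits concrete, the sketch below shows one plausible reading of how they interact: a buffered batch is sent as soon as either the row limit or the byte limit is reached. This is an assumption drawn from the parameter descriptions, not a statement about the plug-in's internals, and the function and variable names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: assumes a batch is flushed when EITHER limit is reached.

MAX_ROWS=10000       # default "Maximum number of rows for a single import"
MAX_BYTES=10485760   # default "Maximum bytes for a single import" (10MB)

# Decide whether the currently buffered batch should be sent to Doris.
should_flush() {
  local buffered_rows="$1" buffered_bytes="$2"
  if [ "$buffered_rows" -ge "$MAX_ROWS" ] || [ "$buffered_bytes" -ge "$MAX_BYTES" ]; then
    echo "flush"
  else
    echo "buffer"
  fi
}

should_flush 500 2048        # prints: buffer (below both limits)
should_flush 10000 2048      # prints: flush  (row limit reached)
should_flush 500 10485760    # prints: flush  (byte limit reached)
```

Under this reading, lowering either limit produces smaller and more frequent imports, while raising them produces fewer and larger ones.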