Next, let's look at how to synchronize data from MySQL to HDFS in real time.
Preparations
First, create a MySQL table, then start the Hadoop cluster.
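As a concrete example, the source table could look like the following; the table and column names here are placeholders, not taken from the original walkthrough:

```sql
-- Hypothetical source table. QueryDatabaseTable detects new rows via a
-- "Maximum-value Column", so an auto-incrementing key (or a timestamp
-- column) is convenient to have.
CREATE TABLE users (
    id   INT AUTO_INCREMENT PRIMARY KEY,  -- incremental fetch column
    name VARCHAR(64) NOT NULL
);
```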
Processors
The flow needs the following processors. QueryDatabaseTable queries the data in MySQL; note that this processor exports its results in Avro format. ConvertAvroToJSON then converts the Avro output to JSON, and SplitJson splits the resulting JSON array into individual records, from which the fields we need can be extracted. Finally, PutHDFS takes the records produced by SplitJson and writes them to HDFS.
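The logic of the four processors above can be sketched in plain Python. This is only an illustration of the data transformations, not NiFi code; the sample rows and the output path are assumptions:

```python
import json

# Stand-in for QueryDatabaseTable's result set (the real processor emits
# these records in Avro format; plain dicts are used here for clarity).
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]

# ConvertAvroToJSON step: the record set becomes one JSON array.
json_array = json.dumps(rows)

# SplitJson step: the array is split into one document per record.
records = [json.dumps(r) for r in json.loads(json_array)]

# PutHDFS step: each record is written out as its own file; a local
# /tmp path stands in for an HDFS directory here.
for r in records:
    rec = json.loads(r)
    with open(f"/tmp/user_{rec['id']}.json", "w") as f:
        f.write(r)
```

In the real flow, NiFi moves each record along as a flowfile, so the "split then write" steps are wired together with processor connections rather than a loop.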
Let's first get familiar with the QueryDatabaseTable processor. Looking at its configuration, you can see that it requires a Database Connection Pooling Service, i.e. a database connection pool, which can be set up here.
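For MySQL this is typically a DBCPConnectionPool controller service. A sketch of its key properties follows; the host, database name, user, and driver path are placeholder assumptions:

```
Database Connection URL      jdbc:mysql://localhost:3306/test
Database Driver Class Name   com.mysql.cj.jdbc.Driver
Database Driver Location(s)  /opt/drivers/mysql-connector-j.jar
Database User                root
Password                     ******
```

The MySQL JDBC driver jar is not bundled with NiFi, so it must be downloaded separately and pointed to via the driver location property.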