Because of the epidemic I have been slacking off for quite a while; now I'm finally getting back to Flink SQL.
The Flink project on my machine has already been upgraded to 1.10, and the official site has new documentation, so I took advantage of the weekend to try out the new version of the SQL API (and step on some pits).
I started directly from the earlier Flink SQL sample by 云邪 (the pom had already been tidied up in advance).
To recap the simple case: receive user behavior events from Kafka, group them by time, compute PV and UV, and write the results to MySQL.
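For context, here is a minimal Java sketch of how such a SQL script can be driven with the Flink 1.10 Table API. The class name, job name, and the placeholder methods are my own; in a real job the methods would return the SQL statements shown below:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class PvUvJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Flink 1.10: pick the blink planner in streaming mode
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);

        // The three statements stand for the contents of the SQL file:
        // source DDL, sink DDL, and the INSERT that wires them together.
        tEnv.sqlUpdate(sourceDdl());  // CREATE TABLE user_log ... ('connector.type' = 'kafka', ...)
        tEnv.sqlUpdate(sinkDdl());    // CREATE TABLE pvuv_sink ... ('connector.type' = 'jdbc', ...)
        tEnv.sqlUpdate(insertSql());  // INSERT INTO pvuv_sink SELECT ... FROM user_log GROUP BY ...
        tEnv.execute("user_behavior pv/uv");
    }

    // Placeholders: fill in with the SQL from the file below.
    private static String sourceDdl() { return ""; }
    private static String sinkDdl()   { return ""; }
    private static String insertSql() { return ""; }
}
```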
First, the dependencies:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table</artifactId>
    <version>${flink.version}</version>
    <type>pom</type>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java-bridge_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<!-- or... -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-scala-bridge_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-common</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-scala_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner-blink_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-jdbc_2.11</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-csv</artifactId>
    <version>${flink.version}</version>
</dependency>
Note the several new table-related dependencies, such as flink-jdbc_2.11-1.10.0.jar.
The corresponding SQL file:
--sourceTable
CREATE TABLE user_log (
    user_id VARCHAR,
    item_id VARCHAR,
    category_id VARCHAR,
    behavior VARCHAR,
    ts TIMESTAMP(3)
) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'user_behavior',
    'connector.startup-mode' = 'earliest-offset',
    'connector.properties.0.key' = 'zookeeper.connect',
    'connector.properties.0.value' = 'venn:2181',
    'connector.properties.1.key' = 'bootstrap.servers',
    'connector.properties.1.value' = 'venn:9092',
    'update-mode' = 'append',
    'format.type' = 'json',
    'format.derive-schema' = 'true'
);

--sinkTable
CREATE TABLE pvuv_sink (
    dt VARCHAR,
    pv BIGINT,
    uv BIGINT
) WITH (
    'connector.type' = 'jdbc',
    'connector.url' = 'jdbc:mysql://venn:3306/venn',
    'connector.table' = 'pvuv_sink',
    'connector.username' = 'root',
    'connector.password' = '123456',
    'connector.write.flush.max-rows' = '1'
);

--insert
INSERT INTO pvuv_sink(dt, pv, uv)
SELECT
    DATE_FORMAT(ts, 'yyyy-MM-dd HH:00') dt,
    COUNT(*) AS pv,
    COUNT(DISTINCT user_id) AS uv
FROM user_log
GROUP BY DATE_FORMAT(ts, 'yyyy-MM-dd HH:00');
Run it.
The first problem I hit: "Type TIMESTAMP(6) of table field 'ts' does not match with the physical type TIMESTAMP(3) of the 'ts' field of the TableSource return type".
Apparently TIMESTAMP defaults to TIMESTAMP(6), which does not match the timestamps in the source data ("ts": "2017-11-26T01:00:01Z"). Changing the column type to ts TIMESTAMP(3) fixed it.
If you hit no other pits, it runs directly and the data is written to MySQL.
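For reference, the MySQL side needs a matching table for the JDBC sink to write into. A minimal sketch (column types chosen by me to mirror the sink schema above; adjust lengths and keys as needed):

```sql
-- hypothetical MySQL table backing the pvuv_sink JDBC sink
CREATE TABLE pvuv_sink (
    dt VARCHAR(20),
    pv BIGINT,
    uv BIGINT
);
```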
Next I looked at the Kafka connector in SQL. In Flink 1.10 SQL, Kafka only supports three formats: csv, json, and avro. (I tried json and csv.)
Two SQL jobs, covering reading and writing json and csv.
I modified the sink table above directly to write to Kafka:
--sinkTable
CREATE TABLE user_log_sink (
    dt VARCHAR,
    pv BIGINT,
    uv BIGINT
) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'user_behavior_sink',
    'connector.properties.zookeeper.connect' = 'venn:2181',
    'connector.properties.bootstrap.servers' = 'venn:9092',
    'update-mode' = 'append',
    'format.type' = 'json'
);
But it would not run; it reported the following error:
AppendStreamTableSink requires that Table has only insert changes.
WTF, 'update-mode' is clearly set to 'append' above.
Then I thrashed around for a while with nothing to show for it: reading the official documentation, modifying the SQL and the configuration...
I spent a lot of time here.
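In hindsight the error makes sense: a non-windowed GROUP BY aggregate produces an updating result (each incoming record can change an already-emitted row), while the Kafka sink is an AppendStreamTableSink that only accepts inserts. One commonly used way to keep the aggregate append-only, sketched here as an alternative and not something this post actually ran, is to aggregate over an event-time window so each result row is final once the window fires. It assumes ts is declared as the event-time attribute in the source DDL:

```sql
-- sketch: assumes the source declares ts as event time, e.g.
--   WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
-- in the CREATE TABLE user_log DDL
INSERT INTO user_log_sink(dt, pv, uv)
SELECT
    DATE_FORMAT(TUMBLE_START(ts, INTERVAL '1' HOUR), 'yyyy-MM-dd HH:00') AS dt,
    COUNT(*) AS pv,
    COUNT(DISTINCT user_id) AS uv
FROM user_log
GROUP BY TUMBLE(ts, INTERVAL '1' HOUR);
```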
Finally, on a whim, I tried outputting the source content directly, without any transformation:
--insert
INSERT INTO user_log_sink(dt, pv, uv)
SELECT user_id, item_id, category_id, behavior, ts
FROM user_log;
The sink was modified accordingly:
--sinkTable
CREATE TABLE user_log_sink (
    user_id VARCHAR,
    item_id VARCHAR,
    category_id VARCHAR,
    behavior VARCHAR,
    ts TIMESTAMP(3)
) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'user_behavior_sink_1',
    'connector.properties.zookeeper.connect' = 'venn:2181',
    'connector.properties.bootstrap.servers' = 'venn:9092',
    'update-mode' = 'append',
    'format.type' = 'json'
);
And just like that, it worked...
Once I read through the official site and other documentation I should understand why (note: I'll add it here when I know).
Then came the final pit.
When writing csv I hit it: in previous versions I had been using flink-shaded-jackson 2.7.9-3.0, which does not contain CsvSchema, so I got this error:
Caused by: java.lang.ClassNotFoundException: org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema$Builder
I replaced the flink-shaded-jackson version with the one the Flink code base uses, 2.9.8-7.0.
With that, writing json and csv through the Kafka connector completed smoothly.
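If the dependency is declared directly in the project (an assumption on my part; it may also come in transitively), the version swap above might look like this in pom.xml:

```xml
<!-- sketch: pin flink-shaded-jackson to the version used by the Flink 1.10 code base -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-shaded-jackson</artifactId>
    <version>2.9.8-7.0</version>
</dependency>
```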
And finally, the full SQL:
--sourceTable
CREATE TABLE user_log (
    user_id VARCHAR,
    item_id VARCHAR,
    category_id VARCHAR,
    behavior VARCHAR,
    ts TIMESTAMP(3)
) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'user_behavior',
    'connector.properties.zookeeper.connect' = 'venn:2181',
    'connector.properties.bootstrap.servers' = 'venn:9092',
    'connector.startup-mode' = 'earliest-offset',
    'format.type' = 'json'
    -- 'format.type' = 'csv'
);

--sinkTable
CREATE TABLE user_log_sink (
    user_id VARCHAR,
    item_id VARCHAR,
    category_id VARCHAR,
    behavior VARCHAR,
    ts TIMESTAMP(3)
) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'user_behavior_sink',
    'connector.properties.zookeeper.connect' = 'venn:2181',
    'connector.properties.bootstrap.servers' = 'venn:9092',
    'update-mode' = 'append',
    -- 'format.type' = 'json'
    'format.type' = 'csv'
);

--insert
INSERT INTO user_log_sink(user_id, item_id, category_id, behavior, ts)
SELECT user_id, item_id, category_id, behavior, ts
FROM user_log;
The SQL files have been uploaded to GitHub: Flink-rookic; the dependencies in pom.xml have been updated as well.
It has been a while since I last wrote. Next I plan to try the kafka / mysql / hbase / es / file / hdfs and other connectors with SQL, and then explore more of SQL.