Two ways for Flink to read and write MySQL

 

So far I have gotten three ways of reading from and writing to MySQL to work: the first directly uses Flink's built-in JDBCInputFormat and JDBCOutputFormat, the second uses a custom source and sink, and the third connects to MySQL via DDL (this last one only ran successfully when debugged in IDEA; it fails after packaging and uploading, so the comparison at the end covers only the first two).

Add the dependencies

 
<!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.17</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-jdbc -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-jdbc_2.12</artifactId>
    <version>1.10.0</version>
</dependency>
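Approaches 1 and 2 only need the two dependencies above. The DDL approach shown later also uses the Table API with the old planner, so extra Table modules are needed at compile time. A sketch of what I would expect the additional Flink 1.10 dependencies to look like (the artifact names assume the same Scala 2.12 suffix as above):

<!-- Table API bridge and old planner, assumed for the DDL approach -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java-bridge_2.12</artifactId>
    <version>1.10.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner_2.12</artifactId>
    <version>1.10.0</version>
</dependency>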

Approach 1: the built-in JDBCInputFormat and JDBCOutputFormat

 
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.io.jdbc.JDBCOutputFormat;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.types.Row;

public class ReadWriteMysqlByJDBC {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment fbEnv = ExecutionEnvironment.getExecutionEnvironment();

        // The field types must correspond one to one with the queried columns,
        // otherwise the values cannot be read.
        TypeInformation[] fieldTypes = new TypeInformation[]{
                BasicTypeInfo.STRING_TYPE_INFO,
                BasicTypeInfo.STRING_TYPE_INFO,
                BasicTypeInfo.STRING_TYPE_INFO,
                BasicTypeInfo.STRING_TYPE_INFO,
                BasicTypeInfo.STRING_TYPE_INFO,
                BasicTypeInfo.STRING_TYPE_INFO,
                BasicTypeInfo.STRING_TYPE_INFO};
        RowTypeInfo rowTypeInfo = new RowTypeInfo(fieldTypes);

        // Read from MySQL
        DataSet<Row> dataSource = fbEnv.createInput(JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("com.mysql.jdbc.Driver")
                .setDBUrl("jdbc:mysql://xxx/xxx")
                .setUsername("xxx")
                .setPassword("xxx")
                .setQuery("xxx")
                .setRowTypeInfo(rowTypeInfo)
                .finish());

        // Write to MySQL
        dataSource.output(JDBCOutputFormat.buildJDBCOutputFormat()
                .setDrivername("com.mysql.jdbc.Driver")
                .setDBUrl("jdbc:mysql://xxx/xxx")
                .setUsername("xxx")
                .setPassword("xxx")
                .setQuery("xxx")
                .finish());

        fbEnv.execute();
    }
}

With the parallelism set to 2, the show plan looks like this:

[show plan screenshot]

Approach 2: a custom source and sink

source

 
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MysqlSource extends RichSourceFunction<SourceVo> {
    private static final Logger logger = LoggerFactory.getLogger(MysqlSource.class);

    private Connection connection = null;
    private PreparedStatement ps = null;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        Class.forName("com.mysql.jdbc.Driver");  // load the JDBC driver
        connection = DriverManager.getConnection("jdbc:mysql://xxx/xxx", "xxx", "xxx");  // obtain the connection
        ps = connection.prepareStatement("xxx");
    }

    @Override
    public void run(SourceContext<SourceVo> ctx) throws Exception {
        try {
            ResultSet resultSet = ps.executeQuery();
            while (resultSet.next()) {
                SourceVo vo = new SourceVo();
                vo.setxxx(resultSet.getString("xxx"));
                ctx.collect(vo);
            }
        } catch (Exception e) {
            logger.error("runException:{}", e);
        }
    }

    @Override
    public void cancel() {
        try {
            // Close the statement before the connection.
            if (ps != null) {
                ps.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (Exception e) {
            logger.error("cancelException:{}", e);
        }
    }
}

sink

 
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class MysqlSink extends RichSinkFunction<SourceVo> {
    private Connection connection;
    private PreparedStatement preparedStatement;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // Load the JDBC driver
        Class.forName("com.mysql.jdbc.Driver");
        // Obtain the database connection
        connection = DriverManager.getConnection("jdbc:mysql://xxx/xxx", "xxx", "xxx");
        preparedStatement = connection.prepareStatement("xxx");
    }

    @Override
    public void close() throws Exception {
        super.close();
        if (preparedStatement != null) {
            preparedStatement.close();
        }
        if (connection != null) {
            connection.close();
        }
    }

    @Override
    public void invoke(SourceVo value, Context context) throws Exception {
        try {
            preparedStatement.setString(1, value.getxxx());
            preparedStatement.executeUpdate();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

main

 
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment fsEnv = StreamExecutionEnvironment.getExecutionEnvironment();
    DataStreamSource<SourceVo> source = fsEnv.addSource(new MysqlSource());
    source.addSink(new MysqlSink());
    fsEnv.execute();
}

With the parallelism set to 2, the show plan looks like this:

[show plan screenshot]

Approach 3: reading and writing MySQL via DDL

 
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class ReadWriteMysqlByDDL {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment();
        EnvironmentSettings fsSettings = EnvironmentSettings.newInstance().useOldPlanner().inStreamingMode().build();
        StreamTableEnvironment tableEnvironment = StreamTableEnvironment.create(streamEnv, fsSettings);

        String sourceTable = "CREATE TABLE sourceTable (\n" +
                "  FTableName VARCHAR,\n" +
                "  FECName VARCHAR\n" +
                ") WITH (\n" +
                "  'connector.type' = 'jdbc',  -- use the jdbc connector\n" +
                "  'connector.url' = 'jdbc:mysql://xxx/xxx',  -- jdbc url\n" +
                "  'connector.table' = 'xxx',  -- table name\n" +
                "  'connector.username' = 'xxx',  -- username\n" +
                "  'connector.password' = 'xxx',  -- password\n" +
                "  'connector.write.flush.max-rows' = '1'  -- defaults to 5000 rows, lowered to 1 for the demo\n" +
                ")";
        tableEnvironment.sqlUpdate(sourceTable);

        String sinkTable = "CREATE TABLE sinkTable (\n" +
                "  FID VARCHAR,\n" +
                "  FRoomName VARCHAR\n" +
                ") WITH (\n" +
                "  'connector.type' = 'jdbc',  -- use the jdbc connector\n" +
                "  'connector.url' = 'jdbc:mysql://xxx/xxx',  -- jdbc url\n" +
                "  'connector.table' = 'xxx',  -- table name\n" +
                "  'connector.username' = 'xxx',  -- username\n" +
                "  'connector.password' = 'xxx',  -- password\n" +
                "  'connector.write.flush.max-rows' = '1'  -- defaults to 5000 rows, lowered to 1 for the demo\n" +
                ")";
        tableEnvironment.sqlUpdate(sinkTable);

        String query = "SELECT FTableName as tableName, FECName as ecName FROM sourceTable";
        Table table = tableEnvironment.sqlQuery(query);
        table.filter("tableName === 'xxx'").select("'1', ecName").insertInto("sinkTable");

        streamEnv.execute();
    }
}

Strangely, this approach fails when the packaged jar is uploaded and show plan is requested:

 
2020-05-22 12:09:48,198 WARN org.apache.flink.runtime.webmonitor.handlers.JarPlanHandler - Configuring the job submission via query parameters is deprecated. Please migrate to submitting a JSON request instead.
2020-05-22 12:09:48,201 WARN org.apache.flink.runtime.webmonitor.handlers.JarPlanHandler - Configuring the job submission via query parameters is deprecated. Please migrate to submitting a JSON request instead.
2020-05-22 12:09:48,201 WARN org.apache.flink.runtime.webmonitor.handlers.JarPlanHandler - Configuring the job submission via query parameters is deprecated. Please migrate to submitting a JSON request instead.
2020-05-22 12:09:48,372 ERROR org.apache.flink.runtime.webmonitor.handlers.JarPlanHandler - Unhandled exception.
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: SQL validation failed. findAndCreateTableSource failed.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
    at org.apache.flink.client.program.OptimizerPlanEnvironment.getPipeline(OptimizerPlanEnvironment.java:79)
    at org.apache.flink.client.program.PackagedProgramUtils.getPipelineFromProgram(PackagedProgramUtils.java:101)
    at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:56)
    at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.toJobGraph(JarHandlerUtils.java:128)
    at org.apache.flink.runtime.webmonitor.handlers.JarPlanHandler.lambda$handleRequest$1(JarPlanHandler.java:100)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.table.api.ValidationException: SQL validation failed. findAndCreateTableSource failed.
    at org.apache.flink.table.calcite.FlinkPlannerImpl.validateInternal(FlinkPlannerImpl.scala:130)
    at org.apache.flink.table.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
    at org.apache.flink.table.sqlexec.SqlToOperationConverter.convert(SqlToOperationConverter.java:124)
    at org.apache.flink.table.planner.ParserImpl.parse(ParserImpl.java:66)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:464)
    at connector.mysql.ReadWriteMysqlByDDL.main(ReadWriteMysqlByDDL.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
    ... 10 more
Caused by: org.apache.flink.table.api.TableException: findAndCreateTableSource failed.
    at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:55)
    at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:92)
    at org.apache.flink.table.catalog.DatabaseCalciteSchema.convertCatalogTable(DatabaseCalciteSchema.java:138)
    at org.apache.flink.table.catalog.DatabaseCalciteSchema.convertTable(DatabaseCalciteSchema.java:97)
    at org.apache.flink.table.catalog.DatabaseCalciteSchema.lambda$getTable$0(DatabaseCalciteSchema.java:86)
    at java.util.Optional.map(Optional.java:215)
    at org.apache.flink.table.catalog.DatabaseCalciteSchema.getTable(DatabaseCalciteSchema.java:76)
    at org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:83)
    at org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:289)
    at org.apache.calcite.sql.validate.EmptyScope.resolve_(EmptyScope.java:143)
    at org.apache.calcite.sql.validate.EmptyScope.resolveTable(EmptyScope.java:99)
    at org.apache.calcite.sql.validate.DelegatingScope.resolveTable(DelegatingScope.java:203)
    at org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl(IdentifierNamespace.java:105)
    at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:177)
    at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1008)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:968)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3122)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3104)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3376)
    at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
    at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1008)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:968)
    at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:216)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:943)
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:650)
    at org.apache.flink.table.calcite.FlinkPlannerImpl.validateInternal(FlinkPlannerImpl.scala:126)
    ... 20 more
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath.

Reason: Required context properties mismatch.

The following properties are requested:
connector.password=xxx
connector.table=xxx
connector.type=jdbc
connector.url=jdbc:mysql://xxx/xxx
connector.username=xxx
connector.write.flush.max-rows=100
schema.0.data-type=VARCHAR(2147483647)
schema.0.name=FTableName
schema.1.data-type=VARCHAR(2147483647)
schema.1.name=FECName

The following factories have been considered:
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
org.apache.flink.table.sources.CsvBatchTableSourceFactory
org.apache.flink.table.sources.CsvAppendTableSourceFactory
    at org.apache.flink.table.factories.TableFactoryService.filterByContext(TableFactoryService.java:322)
    at org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:190)
    at org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:143)
    at org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:96)
    at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:52)
    ... 47 more

I don't yet know the root cause.
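One unverified hypothesis, reading the stack trace: only the Kafka and CSV factories were considered, which suggests the JDBC table factory was not discoverable on the classpath at runtime. Flink locates table factories through META-INF/services files, and a common reason a Table API job works in the IDE but not as an uploaded fat jar is that the fat-jar build lets one dependency's service file overwrite another's. A sketch of the usual countermeasure, assuming the job jar is built with maven-shade-plugin (the plugin version and the rest of the pom are omitted):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- Merge META-INF/services files instead of overwriting them,
                         so the JDBC TableFactory entry survives shading. -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>

Alternatively, dropping the flink-jdbc jar (and the MySQL driver) into the cluster's lib/ directory would sidestep the packaging question entirely.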

Comparison

  1. If no parallelism is set, the source and sink of both approaches run with the default parallelism of 1.
  2. If a higher parallelism is set, the source of approach 1 reads with that parallelism, so every subtask executes the same query and the data is read repeatedly; the source of approach 2 does not, since RichSourceFunction is a non-parallel SourceFunction that Flink always runs with parallelism 1 (two ways to avoid the duplicate reads are sketched after this list).
  3. With approach 1 the data comes back as Row and must also be written as Row, which makes POJOs inconvenient to use.
  4. Approach 1 returns a DataSet, approach 2 a DataStreamSource.
  5. Approach 1 needs a BasicTypeInfo for every field, which is unwieldy when there are many columns.
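A minimal sketch of the two workarounds mentioned in point 2, using the same Flink 1.10 flink-jdbc API as approach 1. The URL, credentials, queries, id ranges, and the single STRING column are placeholders of my own, not values from the original setup:

import java.io.Serializable;

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.io.jdbc.split.GenericParameterValuesProvider;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.types.Row;

public class JdbcParallelReadSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(2);  // job-wide parallelism stays 2
        RowTypeInfo rowTypeInfo = new RowTypeInfo(BasicTypeInfo.STRING_TYPE_INFO);

        // Workaround 1: pin only the JDBC source to parallelism 1, so a single
        // subtask runs the query; downstream operators keep parallelism 2.
        DataSet<Row> singleReader = env.createInput(JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("com.mysql.jdbc.Driver")
                .setDBUrl("jdbc:mysql://xxx/xxx")
                .setUsername("xxx")
                .setPassword("xxx")
                .setQuery("SELECT name FROM t")  // placeholder query
                .setRowTypeInfo(rowTypeInfo)
                .finish())
                .setParallelism(1);

        // Workaround 2: keep the source parallel but give each subtask a
        // disjoint split. The query is parameterized and the provider hands
        // every input split different bind values, so nothing is read twice.
        Serializable[][] splits = {{0L, 4999L}, {5000L, 9999L}};  // placeholder id ranges
        DataSet<Row> parallelReader = env.createInput(JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("com.mysql.jdbc.Driver")
                .setDBUrl("jdbc:mysql://xxx/xxx")
                .setUsername("xxx")
                .setPassword("xxx")
                .setQuery("SELECT name FROM t WHERE id BETWEEN ? AND ?")  // placeholder query
                .setRowTypeInfo(rowTypeInfo)
                .setParametersProvider(new GenericParameterValuesProvider(splits))
                .finish());

        singleReader.print();
        parallelReader.print();
    }
}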


Reposted from blog.csdn.net/wangshuminjava/article/details/108123589