Flink: A First Acquaintance

Since the company has decided to adopt Flink as the preferred distributed computing framework in an upcoming product, I recently took some time to learn Flink. This article uses Kafka as the data source.

  1. Flink installation: Download Flink from the official website; the version I downloaded is 'Apache Flink 1.8.1 for Scala 2.11'. After downloading, extract it locally to /usr/local/flink-1.8.1.
  2. Flink startup: 2.1 cd /usr/local/flink-1.8.1/conf; 2.2 edit flink-conf.yaml, find the 'jobmanager.rpc.address:' entry and set it to 'jobmanager.rpc.address: localhost'; 2.3 cd /usr/local/flink-1.8.1/bin and run start-cluster.sh. Flink now starts locally, and we can open the Flink UI in a browser at http://localhost:8081
  3. Flink API usage: Create a Java Maven project and add the Flink and Kafka components to the pom file:
    <project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>com.damon</groupId>
        <artifactId>flink</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <packaging>jar</packaging>
    
        <name>flink</name>
        <url>http://maven.apache.org</url>
    
        <properties>
            <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
            <flink.version>1.4.1</flink.version>
            <deploy.dir>./target/flink/</deploy.dir>
        </properties>
    
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-web</artifactId>
                <version>1.5.10.RELEASE</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-java</artifactId>
                <version>${flink.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-core</artifactId>
                <version>${flink.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-streaming-java_2.11</artifactId>
                <version>${flink.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-clients_2.11</artifactId>
                <version>${flink.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-connector-kafka-0.9_2.11</artifactId>
                <version>${flink.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-runtime_2.11</artifactId>
                <version>${flink.version}</version>
            </dependency>
            <dependency>
                <groupId>junit</groupId>
                <artifactId>junit</artifactId>
                <version>3.8.1</version>
                <scope>test</scope>
            </dependency>
            <dependency>
                <groupId>com.google.code.gson</groupId>
                <artifactId>gson</artifactId>
                <version>2.8.5</version>
            </dependency>
            <dependency>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
                <version>1.2.17</version>
            </dependency>
            <dependency>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-api</artifactId>
                <version>1.7.26</version>
            </dependency>
            <dependency>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
                <version>1.7.25</version>
                <scope>compile</scope>
            </dependency>
        </dependencies>
    
    
        <build>
            <finalName>flinkpackage</finalName>
            <sourceDirectory>src/main/java</sourceDirectory>
            <resources>
                <!-- Control how resource files are copied -->
                <resource>
                    <directory>src/main/resources</directory>
                    <targetPath>${project.build.directory}</targetPath>
                </resource>
            </resources>
            <plugins>
                <!-- Set the source file encoding -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <configuration>
    <!--                    <defaultLibBundleDir>lib</defaultLibBundleDir>-->
                        <source>1.8</source>
                        <target>1.8</target>
                        <encoding>UTF-8</encoding>
                    </configuration>
                </plugin>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-shade-plugin</artifactId>
                    <version>1.2.1</version>
                    <executions>
                        <execution>
                            <phase>package</phase>
                            <goals>
                                <goal>shade</goal>
                            </goals>
                            <configuration>
                                <transformers>
                                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                        <mainClass>com.damon.flink.App</mainClass>
                                    </transformer>
                                    <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                        <resource>reference.conf</resource>
                                    </transformer>
                                </transformers>
                            </configuration>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </project>
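
    Building the project with mvn clean package runs the shade plugin during the package phase and produces a runnable fat jar with com.damon.flink.App as its main class.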

    In the main class, add the Flink setup code and consume data from the Kafka data source:

    package com.damon.flink;
    
    import com.damon.flink.model.Student;
    import com.damon.flink.sink.StudentSink;
    import com.google.gson.Gson;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    
    import java.util.Properties;
    
    public class App
    {
        private static Logger log = LoggerFactory.getLogger(App.class);
    
        private static Gson gson = new Gson();
    
        @SuppressWarnings({ "serial", "deprecation" })
        public static void main( String[] args ) throws Exception {
    
            String topic = "test.topic";
            final  StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
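            // checkpoint every 5 seconds; with checkpointing enabled, the Kafka consumer commits its offsets on completed checkpoints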
            env.enableCheckpointing(5000);
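            // use event-time semantics (a real job would also assign timestamps and watermarks)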
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
            Properties properties = new Properties();
            properties.setProperty("bootstrap.servers","localhost:9092");
            properties.setProperty("zookeeper.connect","localhost:2181");
            properties.setProperty("group.id","test-consumer-group");
            FlinkKafkaConsumer09<String> consumer09 = new FlinkKafkaConsumer09<String>(topic,new SimpleStringSchema(),properties);
    
            DataStream<String> kafkaStream = env.addSource(consumer09);
    
            // deserialize each JSON message into a Student and key the stream by the "gender" field
            DataStream<Student> studentStream = kafkaStream.map(student -> gson.fromJson(student, Student.class)).keyBy("gender");
    
            studentStream.addSink(new StudentSink());
    
            log.debug("Starting Flink job ...");
            // execute() blocks until the job terminates, so log before submitting
            env.execute("Flink Streaming Java API Skeleton");
    
        }
    
    }

    The Student model class:

    package com.damon.flink.model;
    
    public class Student {
        private String id;
        private String name;
        private String gender;
        private int age;
        private int score;
    
        public String getId() {
            return id;
        }
    
        public void setId(String id) {
            this.id = id;
        }
    
        public String getName() {
            return name;
        }
    
        public void setName(String name) {
            this.name = name;
        }
    
        public String getGender() {
            return gender;
        }
    
        public void setGender(String gender) {
            this.gender = gender;
        }
    
        public int getAge() {
            return age;
        }
    
        public void setAge(int age) {
            this.age = age;
        }
    
        public int getScore() {
            return score;
        }
    
        public void setScore(int score) {
            this.score = score;
        }
    }

    The custom StudentSink used to handle the consumed Kafka messages:

    package com.damon.flink.sink;
    
    import com.damon.flink.model.Student;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    
    public class StudentSink extends RichSinkFunction<Student> {
    
        private static Logger log = LoggerFactory.getLogger(StudentSink.class);
    
        @Override
        public void open(Configuration parameters) throws Exception {
            super.open(parameters);
        }
    
        @Override
        public void close() throws Exception
        {
            super.close();
        }
    
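        // invoke() is called once for every record that reaches the sink; a real sink would write to an external store here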
        @Override
        public void invoke(Student value, Context context) throws Exception {
            log.info("Student : "+value.getName()+", Score : "+value.getScore());
        }
    }
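
    Before starting the job, we need something that writes Student JSON onto the topic. Below is a minimal test-producer sketch (TestProducer is an illustrative class name, not part of the original project; it assumes the kafka-clients library is on the classpath and uses the same broker address and topic as the consumer configuration above):

    package com.damon.flink;
    
    import java.util.Properties;
    
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    
    public class TestProducer {
    
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            try {
                // a JSON payload with the same fields as the Student model above
                String json = "{\"id\":\"1\",\"name\":\"DamonTest\",\"gender\":\"male\",\"age\":20,\"score\":88}";
                producer.send(new ProducerRecord<>("test.topic", json));
                producer.flush();
            } finally {
                producer.close();
            }
        }
    }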

    Start the project and send messages with Kafka; the consumed messages can then be seen in the console:

    16:32:24.386 [flink-akka.actor.default-dispatcher-4] DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Sending heartbeat to JobManager
    16:32:29.387 [flink-akka.actor.default-dispatcher-5] DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Sending heartbeat to JobManager
    16:32:34.388 [flink-akka.actor.default-dispatcher-7] DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Sending heartbeat to JobManager
    16:32:39.386 [flink-akka.actor.default-dispatcher-7] DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Sending heartbeat to JobManager
    16:32:43.967 [Sink: Unnamed (7/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:43.043, Score : 67
    16:32:44.385 [flink-akka.actor.default-dispatcher-7] DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Sending heartbeat to JobManager
    16:32:44.587 [Sink: Unnamed (2/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:44.044, Score : 88
    16:32:45.413 [Sink: Unnamed (2/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:45.045, Score : 93
    16:32:45.925 [Sink: Unnamed (7/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:45.045, Score : 62
    16:32:46.339 [Sink: Unnamed (2/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:46.046, Score : 51
    16:32:47.059 [Sink: Unnamed (7/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:46.046, Score : 61
    16:32:47.370 [Sink: Unnamed (7/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:47.047, Score : 86
    16:32:47.986 [Sink: Unnamed (7/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:47.047, Score : 58
    16:32:48.392 [Sink: Unnamed (2/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:48.048, Score : 66
    16:32:48.806 [Sink: Unnamed (2/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:48.048, Score : 58
    16:32:49.320 [Sink: Unnamed (2/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:49.049, Score : 50
    16:32:49.388 [flink-akka.actor.default-dispatcher-7] DEBUG org.apache.flink.runtime.taskmanager.TaskManager - Sending heartbeat to JobManager
    16:32:49.735 [Sink: Unnamed (7/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:49.049, Score : 62
    16:32:50.150 [Sink: Unnamed (7/12)] INFO com.damon.flink.sink.StudentSink - Student : DamonTest-2019-07-28 16:32:50.050, Score : 50
  4. Uploading the project to Flink: Package the demo project above as a jar and upload it to the local Flink platform. The upload can be done visually through the Flink UI shown earlier, or with a script; here we use the flink script in the bin directory:
    192:bin damon$ ./flink run -c com.damon.flink.App /Users/damon/Project/flink/flink/target/flink-0.0.1-SNAPSHOT.jar

    If we now open the Flink UI, we can see the job under 'Running Jobs':

    As we keep sending messages through Kafka, the UI shows the number of messages and bytes processed, and the detailed message-processing logs can be seen under 'Task Managers':

    

At this point, our local Flink + Kafka demo project is complete. Next, I will look into deploying Flink on YARN.

 
