Kafka (14): Environment Setup (tarball method)

Official site: download the Apache Kafka tar package

Kafka depends on ZooKeeper and the JDK, so before installing Kafka we need to have ZooKeeper and JDK environments ready.

Once the JDK, ZooKeeper, and Kafka have been downloaded, extract and install them.
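As a quick sketch, extracting the Kafka tarball looks like this (the exact archive name depends on the Scala and Kafka versions you downloaded; 2.13-3.0.0 is just an example):

tar -xzf kafka_2.13-3.0.0.tgz
cd kafka_2.13-3.0.0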

Modify the configuration

Edit the Kafka configuration file config/server.properties:

# Unique identifier; broker.id must be unique within the Kafka cluster
broker.id=0
# IP of the machine Kafka is deployed on and the port it serves on
listeners=PLAINTEXT://your.ip:9092
# Where Kafka stores its message log files
log.dirs=/data/logs/kafka-logs
# Address Kafka uses to connect to ZooKeeper
zookeeper.connect=your.zk.ip:2181

Start the Kafka service

Basic syntax: kafka-server-start.sh [-daemon] server.properties

The server.properties configuration file is required; the optional -daemon flag runs the broker in the background.
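Note that ZooKeeper must already be running before the broker starts. If you use the scripts bundled with the Kafka distribution, a minimal way to bring it up is (assuming the default bundled config):

bin/zookeeper-server-start.sh -daemon config/zookeeper.properties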

# Start Kafka as a daemon (no logs printed to the console);
# the run log is written to the server.log file in the logs directory
bin/kafka-server-start.sh -daemon config/server.properties
# Or start it in the background with & instead:
bin/kafka-server-start.sh config/server.properties &
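To check that the broker actually came up, you can look for the Kafka JVM process or tail the startup log (a quick sketch; jps ships with the JDK):

# the broker appears as a Java process named Kafka
jps | grep -i kafka
# or inspect the startup log for errors
tail -n 50 logs/server.log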

Explanation of some core server.properties settings

| Parameter | Default | Description |
| --- | --- | --- |
| broker.id | 0 | Each broker is identified by a unique non-negative integer; you may pick the value yourself, but it must be unique within the cluster |
| log.dirs | /tmp/kafka-logs | Data directories; multiple paths can be configured, separated by commas. Each new partition is created in the path that currently holds the fewest partitions |
| listeners | PLAINTEXT://:9092 | Port on which the server accepts client connections; set the IP to the broker machine's own IP, e.g. PLAINTEXT://localhost:9092 |
| zookeeper.connect | localhost:2181 | ZooKeeper connection address; separate multiple addresses with commas |
| log.retention.hours | 168 | How long each log file is kept before deletion. The default retention time applies to all topics alike |
| num.partitions | 1 | Default number of partitions per topic |
| default.replication.factor | 1 | Default replica count for auto-created topics; a value of 2 or more is recommended |
| min.insync.replicas | 1 | When the producer sets acks to -1, min.insync.replicas specifies the minimum number of replicas that must acknowledge a write (each such replica's write must succeed); if this number is not reached, the producer's send raises an exception |
| delete.topic.enable | false | Whether deleting topics is allowed |

Common Kafka operations from the command line

Create a topic

bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic codertnt

On success the tool prints a confirmation such as: Created topic codertnt.

List the existing topics

bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Delete a topic

bin/kafka-topics.sh --delete --topic codertnt --bootstrap-server localhost:9092

Send messages

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic codertnt

Open another console to consume the messages, using for example the command below; each line typed on the producer side then shows up on the consumer side.
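A typical console-consumer command for this topic (assuming the same single broker; --from-beginning replays messages already in the topic):

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic codertnt --from-beginning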

Consume from multiple topics

Pass a regular expression to consume several topics at once (on newer Kafka releases, --whitelist is deprecated in favor of --include):

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --whitelist "codertnt|testtopic"

Unicast consumption

Start two consumers that belong to the same group; within one group, only one consumer will receive each message:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=group1 --topic codertnt 
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=group1 --topic codertnt 

Multicast consumption

Different groups can consume the same message at the same time; using different group.id values simulates multiple consumer groups:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=group1 --topic codertnt 
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=group2 --topic codertnt 

Result: both consumers receive every message.

View a consumer group's message offsets

Kafka records consumption offsets per consumer group, which is how it implements multicast consumption while storing each message only once.

Worth noting: consumption state is bound to the consumer group rather than to the individual consumers under it; all consumption progress is tracked at the group level.

bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group group1

  • current-offset: the offset up to which the consumer group has already consumed
  • log-end-offset: the end offset of the messages in the topic partition (the HW)
  • lag: the number of messages the consumer group has not yet consumed
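The describe output looks roughly like this (values here are illustrative, not from a real run):

GROUP   TOPIC     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID     HOST        CLIENT-ID
group1  codertnt  0          5               5               0    consumer-1-...  /127.0.0.1  consumer-1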

View partition details

bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic codertnt

  • leader: the replica of this partition that handles all read and write requests.
  • replicas: the brokers on which replicas of this partition exist.
  • isr: the subset of replicas that are currently alive and fully caught up with this partition.
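For a single-broker setup, the output looks something like this (illustrative):

Topic: codertnt  PartitionCount: 1  ReplicationFactor: 1  Configs:
    Topic: codertnt  Partition: 0  Leader: 0  Replicas: 0  Isr: 0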

Create a multi-partition topic

--partitions 2 creates the topic with two partitions:

bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 2 --topic codertntmulti

Expand the partition count

Kafka only supports increasing a topic's partition count, not decreasing it:

bin/kafka-topics.sh --alter --partitions 3 --bootstrap-server localhost:9092 --topic codertnt
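Re-run the describe command to confirm the topic now has three partitions:

bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic codertnt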

 

Trying out a cluster

We can start several Kafka broker instances on one machine to simulate a real cluster. To do this we create multiple server.properties files and change a few settings:

Copy the configuration files

cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties

Edit server-1.properties:

broker.id=1
listeners=PLAINTEXT://localhost:9093
log.dirs=/tmp/kafka-logs/kafka-logs-1

Edit server-2.properties:

broker.id=2
listeners=PLAINTEXT://localhost:9094
log.dirs=/tmp/kafka-logs/kafka-logs-2

Start the additional instances

bin/kafka-server-start.sh -daemon config/server-1.properties
bin/kafka-server-start.sh -daemon config/server-2.properties

Check whether all the brokers in your Kafka cluster have registered successfully.
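One quick way, a sketch using the zookeeper-shell script bundled with Kafka (assuming ZooKeeper on localhost:2181), is to list the registered broker ids; all three should appear:

bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids
# the last line of output should be: [0, 1, 2]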

Creating multi-partition topics, consuming, and so on in a cluster environment

Everything works the same as described above; the only difference is that the broker list now contains multiple addresses:

--broker-list localhost:9092,localhost:9093,localhost:9094
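For example, with all three brokers up you could create a topic replicated across the whole cluster (the topic name here is just an example):

bin/kafka-topics.sh --create --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --replication-factor 3 --partitions 2 --topic codertnt-replicated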

--------------- The following is the quick start section from the official installation docs ---------

  1. Official site: Apache Kafka, download the tar package
  2. See the official quick start: Apache Kafka
  3. The concrete steps are as follows (PS: some English reading ability required):

    STEP 1: GET KAFKA

    Download the latest Kafka release and extract it:

    $ tar -xzf kafka_2.13-3.0.0.tgz
    $ cd kafka_2.13-3.0.0

    STEP 2: START THE KAFKA ENVIRONMENT

    NOTE: Your local environment must have Java 8+ installed.

    Run the following commands in order to start all services in the correct order:

    # Start the ZooKeeper service
    # Note: Soon, ZooKeeper will no longer be required by Apache Kafka.
    $ bin/zookeeper-server-start.sh config/zookeeper.properties

    Open another terminal session and run:

    # Start the Kafka broker service
    $ bin/kafka-server-start.sh config/server.properties

    Once all services have successfully launched, you will have a basic Kafka environment running and ready to use.

    STEP 3: CREATE A TOPIC TO STORE YOUR EVENTS

    Kafka is a distributed event streaming platform that lets you read, write, store, and process events (also called records or messages in the documentation) across many machines.

    Example events are payment transactions, geolocation updates from mobile phones, shipping orders, sensor measurements from IoT devices or medical equipment, and much more. These events are organized and stored in topics. Very simplified, a topic is similar to a folder in a filesystem, and the events are the files in that folder.

    So before you can write your first events, you must create a topic. Open another terminal session and run:

    $ bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1

    All of Kafka's command line tools have additional options: run the kafka-topics.sh command without any arguments to display usage information. For example, it can also show you details such as the partition count of the new topic:

    $ bin/kafka-topics.sh --describe --topic quickstart-events --bootstrap-server localhost:9092
    Topic:quickstart-events  PartitionCount:1    ReplicationFactor:1 Configs:
        Topic: quickstart-events Partition: 0    Leader: 0   Replicas: 0 Isr: 0

    STEP 4: WRITE SOME EVENTS INTO THE TOPIC

    A Kafka client communicates with the Kafka brokers via the network for writing (or reading) events. Once received, the brokers will store the events in a durable and fault-tolerant manner for as long as you need—even forever.

    Run the console producer client to write a few events into your topic. By default, each line you enter will result in a separate event being written to the topic.

    $ bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
    This is my first event
    This is my second event

    You can stop the producer client with Ctrl-C at any time.

    STEP 5: READ THE EVENTS

    Open another terminal session and run the console consumer client to read the events you just created:

    $ bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
    This is my first event
    This is my second event

    You can stop the consumer client with Ctrl-C at any time.

    Feel free to experiment: for example, switch back to your producer terminal (previous step) to write additional events, and see how the events immediately show up in your consumer terminal.

    Because events are durably stored in Kafka, they can be read as many times and by as many consumers as you want. You can easily verify this by opening yet another terminal session and re-running the previous command again.

    STEP 6: IMPORT/EXPORT YOUR DATA AS STREAMS OF EVENTS WITH KAFKA CONNECT

    You probably have lots of data in existing systems like relational databases or traditional messaging systems, along with many applications that already use these systems. Kafka Connect allows you to continuously ingest data from external systems into Kafka, and vice versa. It is thus very easy to integrate existing systems with Kafka. To make this process even easier, there are hundreds of such connectors readily available.

    Take a look at the Kafka Connect section to learn more about how to continuously import/export your data into and out of Kafka.
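    The quickstart text above gives no commands for this step; as a sketch, the Kafka distribution ships example file connectors that you can try with a standalone Connect worker (on recent Kafka versions you may first need to add the connect-file jar to plugin.path in config/connect-standalone.properties):

    # seed the file that the example source connector reads (file names come from the sample configs)
    $ echo -e "foo\nbar" > test.txt
    # run a standalone Connect worker with the bundled file source and sink connectors
    $ bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
    # the sink connector writes out everything the source connector read
    $ cat test.sink.txt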

    STEP 7: PROCESS YOUR EVENTS WITH KAFKA STREAMS

    Once your data is stored in Kafka as events, you can process the data with the Kafka Streams client library for Java/Scala. It allows you to implement mission-critical real-time applications and microservices, where the input and/or output data is stored in Kafka topics. Kafka Streams combines the simplicity of writing and deploying standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster technology to make these applications highly scalable, elastic, fault-tolerant, and distributed. The library supports exactly-once processing, stateful operations and aggregations, windowing, joins, processing based on event-time, and much more.

    To give you a first taste, here's how one would implement the popular WordCount algorithm:

    KStream<String, String> textLines = builder.stream("quickstart-events");
    
    KTable<String, Long> wordCounts = textLines
                .flatMapValues(line -> Arrays.asList(line.toLowerCase().split(" ")))
                .groupBy((keyIgnored, word) -> word)
                .count();
    
    wordCounts.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));
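
    The fragment above assumes a StreamsBuilder named builder and the usual application boilerplate around it. A minimal self-contained sketch (the class name and application.id below are chosen here purely for illustration):

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Produced;

    public class WordCountDemo {                      // illustrative class name
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-demo"); // illustrative id
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> textLines = builder.stream("quickstart-events");
            KTable<String, Long> wordCounts = textLines
                    .flatMapValues(line -> Arrays.asList(line.toLowerCase().split(" ")))
                    .groupBy((keyIgnored, word) -> word)
                    .count();
            wordCounts.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));

            // build the topology and run it until the JVM shuts down
            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
            streams.start();
        }
    }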

    The Kafka Streams demo and the app development tutorial demonstrate how to code and run such a streaming application from start to finish.

    STEP 8: TERMINATE THE KAFKA ENVIRONMENT

    Now that you reached the end of the quickstart, feel free to tear down the Kafka environment—or continue playing around.

        Stop the producer and consumer clients with Ctrl-C, if you haven't done so already.
        Stop the Kafka broker with Ctrl-C.
        Lastly, stop the ZooKeeper server with Ctrl-C.

        If you also want to delete any data of your local Kafka environment including any events you have created along the way, run the command:

    $ rm -rf /tmp/kafka-logs /tmp/zookeeper

    CONGRATULATIONS!

    You have successfully finished the Apache Kafka quickstart.


Reposted from blog.csdn.net/CoderTnT/article/details/121145500