Cloud Computing and Big Data: Configuring Storm and Running WordCountTopology (a hand-holding, step-by-step tutorial!)
Preface
Cloud computing and big data are developing rapidly, and Storm, an efficient and reliable real-time computing framework, has attracted wide attention and adoption. This article starts from scratch and walks step by step through configuring a Storm environment and running WordCountTopology. It lists the required software and tools and guides you through the installation and configuration process. I am still working hard to improve, and I hope my blog posts are helpful to everyone.
1. Preparation
1.1 Virtual machines
This tutorial uses three virtual machines, listed below. The first node serves as the master node. Adjust the IP addresses and the ordinary user name to match your own environment.
All the files used in this article are available on a network disk:
https://pan.baidu.com/s/1MrzDAGUxqduU0HFNuTebdA?pwd=1234
Download them if you need them.
master 192.168.95.20
node1 192.168.95.22
node2 192.168.95.23
1.2 Create a normal user
Perform this step on all nodes.
Create a normal user named iot (or any name you prefer) and log in as that user.
Create the iot user:
useradd iot
Set the iot user's password on each node:
passwd iot
Grant the iot user sudo permissions on the master and node1 nodes:
chmod -v u+w /etc/sudoers
vi /etc/sudoers
Add the following line:
iot ALL=(ALL) ALL
Then type :wq! to save and exit.
When this is done, reboot and log in as the iot user.
Repeat these steps on node2.
1.3 Create directories
Perform this step on all nodes.
$ mkdir -p /opt/softwares   # stores the software packages
$ mkdir -p /opt/modules     # stores the extracted folders
Upload the installation packages to the softwares directory on the master node, then distribute /opt/softwares from the master node to node1:
$ sudo scp -r /opt/softwares [email protected]:/opt
On node1, check the transferred files:
cd /opt
ls
cd softwares
ls -hl
Repeat the transfer and the check for node2.
1.4 Place the downloaded software in the /opt/softwares directory
Do this on the master node. If you later need the software on a slave node, follow the same procedure.
2. Install dependent packages and software
2.1 Install dependent packages
Perform this step on all nodes.
$ sudo yum -y install gcc-c++ uuid* libtool libuuid libuuid-devel
2.2 Install and configure JDK
2.2.1 Install JDK
2.2.1.1 Decompress on the master node
If the JDK was already installed in the previous chapter, skip this step.
2.2.1.2 Check whether Java is installed successfully
Check Java on the master:
java -version
Check Java on node1:
java -version
Check Java on node2:
java -version
3. Install ZooKeeper
Unless stated otherwise, run the following steps on the master node.
3.1 Unzip the installation package
$ tar -zxvf apache-zookeeper-3.6.3-bin.tar.gz -C /opt/modules
3.2 Configure ZooKeeper
3.2.1 Edit the master node configuration file
If zoo.cfg does not exist yet, copy zoo_sample.cfg in the same conf directory to zoo.cfg first.
$ vi /opt/modules/apache-zookeeper-3.6.3-bin/conf/zoo.cfg
Add the following content at the end of the configuration file:
server.1=192.168.95.20:2888:3888
server.2=192.168.95.22:2888:3888
server.3=192.168.95.23:2888:3888
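3.2.2 myid file
3.2.2.1 Create the zkData directory and myid file
Perform this step on all nodes. ZooKeeper reads myid from the directory set as dataDir in zoo.cfg, so dataDir should point to this zkData directory.
$ mkdir -p /opt/modules/apache-zookeeper-3.6.3-bin/zkData/
$ cd /opt/modules/apache-zookeeper-3.6.3-bin/zkData/
$ touch myid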
3.2.2.2 First node
Perform this step on the first node (192.168.95.20).
$ vi /opt/modules/apache-zookeeper-3.6.3-bin/zkData/myid
Enter:
1
3.2.2.3 Second node
Perform this step on the second node (192.168.95.22).
$ vi /opt/modules/apache-zookeeper-3.6.3-bin/zkData/myid
Enter:
2
3.2.2.4 Third node
Perform this step on the third node (192.168.95.23).
$ vi /opt/modules/apache-zookeeper-3.6.3-bin/zkData/myid
Enter:
3
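Each node's myid has to match the N of its own server.N line in zoo.cfg, so the value can also be derived instead of typed by hand. The following is only a minimal sketch under assumptions: the function name myid_for_ip is made up for illustration, and it expects the server.N=IP:port:port format shown in section 3.2.1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ZooKeeper): print the N of the
# "server.N=<ip>:<port>:<port>" line in a zoo.cfg that matches the given IP.
myid_for_ip() {
  local cfg="$1" ip="$2"
  awk -F= -v ip="$ip" '
    $1 ~ /^server\.[0-9]+$/ {
      split($2, addr, ":")            # addr[1] is the host part
      if (addr[1] == ip) {
        sub(/^server\./, "", $1)      # keep only the numeric id
        print $1
      }
    }' "$cfg"
}

# Demo against a throwaway copy of the server lines from section 3.2.1.
demo_cfg="$(mktemp)"
cat > "$demo_cfg" <<'EOF'
server.1=192.168.95.20:2888:3888
server.2=192.168.95.22:2888:3888
server.3=192.168.95.23:2888:3888
EOF
myid_for_ip "$demo_cfg" 192.168.95.22   # prints 2
rm -f "$demo_cfg"
```

On a real node, the output could be redirected into /opt/modules/apache-zookeeper-3.6.3-bin/zkData/myid, passing that node's own IP address.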
3.3 Configure ZooKeeper environment variables
Perform this step on all nodes.
$ sudo vi /etc/profile
Add the following lines:
export ZOOKEEPER_HOME=/opt/modules/apache-zookeeper-3.6.3-bin
export PATH=$PATH:$ZOOKEEPER_HOME/bin
4. Install ZeroMQ
Perform these steps on all nodes.
4.1 Unpack the source package
$ cd /opt/softwares
$ rpm -ivh zeromq-4.3.4-37.5.src.rpm
$ cd ~/rpmbuild/SOURCES
$ tar -zxvf zeromq-4.3.4.tar.gz
4.2 Compile and install
$ cd zeromq-4.3.4/
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
4.3 Update the dynamic link library cache
$ sudo vi /etc/ld.so.conf
Append /usr/local/lib/ at the end of the file. This directory holds the library files of ZeroMQ and JZMQ.
$ sudo ldconfig   # reload the library paths
Reference: https://blog.51cto.com/u_2650279/6143472 ; https://www.656463.com/article/UbuntuxStormazdjfbs_3
5. Install JZMQ
Perform these steps on all nodes.
5.1 Install git
$ sudo yum install -y git
5.2 Download the JZMQ code
$ cd /opt/softwares
$ git clone https://github.com/zeromq/jzmq.git
Note: if the clone fails with
fatal: unable to access 'https://github.com/zeromq/jzmq.git/': Failed connect to github.com:443; Connection refused
you can try the following commands (reference: https://blog.csdn.net/weixin_44442186/article/details/124979085).
Remove the global proxy:
$ git config --global --unset http.proxy
$ git config --global --unset https.proxy
Set a global proxy (append your proxy address to each command; without a value, the command only prints the current setting):
$ git config --global http.proxy
$ git config --global https.proxy
Transfer the downloaded JZMQ code to node1 and node2 with scp.
5.3 Compile and install
Run the following commands in the terminal:
$ cd jzmq
$ cd jzmq-jni
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
6. Install Storm
Unless stated otherwise, run the following steps on the master node.
6.1 Unzip the Storm installation package
Unzip apache-storm-2.4.0.tar.gz to the /opt/modules directory (create the directory if it does not exist):
$ tar -zxvf apache-storm-2.4.0.tar.gz -C /opt/modules
6.2 Edit the Storm configuration file
$ vi /opt/modules/apache-storm-2.4.0/conf/storm.yaml
Make the following replacements:
6.2.1 Replacement 1
# storm.zookeeper.servers:
#     - "server1"
#     - "server2"
Replace with:
storm.zookeeper.servers:
    - "192.168.95.20"
    - "192.168.95.22"
    - "192.168.95.23"
6.2.2 Replacement 2
nimbus.seeds: ["host1", "host2", "host3"]
Replace with (use the addresses of your own nodes; in this tutorial nimbus runs on the master, 192.168.95.20):
nimbus.seeds: ["192.168.95.20", "192.168.95.22", "192.168.95.23"]
6.2.3 Set the local directory
Perform this step on all nodes.
storm.local.dir: "/tmp/storm"
This assumes that the directory has already been created.
Reference: https://blog.csdn.net/zjjcchina/article/details/120650514
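Because storm.local.dir must exist before the daemons start, it can help to create it straight from the value in storm.yaml. This is a rough sketch, not part of Storm itself: ensure_storm_local_dir is a hypothetical name, and the parsing assumes the value is written in double quotes on a single line, as above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read storm.local.dir from a storm.yaml and create it.
# Assumes the value is double-quoted, e.g. storm.local.dir: "/tmp/storm".
ensure_storm_local_dir() {
  local yaml="$1" dir
  dir="$(awk -F'"' '/^storm\.local\.dir:/ { print $2; exit }' "$yaml")"
  if [ -z "$dir" ]; then
    echo "storm.local.dir not set in $yaml" >&2
    return 1
  fi
  mkdir -p "$dir" && echo "$dir"
}

# Example (run on every node):
# ensure_storm_local_dir /opt/modules/apache-storm-2.4.0/conf/storm.yaml
```

The function prints the directory it created, or returns a non-zero status if the key is missing.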
6.3 Copy to the slave nodes
$ sudo chmod 777 /opt/modules
$ scp -r /opt/modules/apache-storm-2.4.0/ [email protected]:/opt/modules/
$ scp -r /opt/modules/apache-storm-2.4.0/ [email protected]:/opt/modules/
6.4 Configure Storm environment variables
Perform this step on all nodes.
$ sudo vi /etc/profile
Add the following lines:
#set storm environment
export STORM_HOME=/opt/modules/apache-storm-2.4.0
export PATH=$PATH:$STORM_HOME/bin
6.5 Make the environment variables take effect
Perform this step on all nodes.
$ source /etc/profile
7. Start ZooKeeper
This step must be completed before starting Storm.
Run it on all nodes.
Make sure the Java environment variables have been set:
$ sudo vim /etc/profile
$ cd /opt/modules/apache-zookeeper-3.6.3-bin/bin
$ sudo vi ./zkServer.sh
Add the following lines:
export JAVA_HOME=/usr/lib/jvm/java-openjdk
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
Then start ZooKeeper:
$ cd /opt/modules/apache-zookeeper-3.6.3-bin/bin
$ ./zkServer.sh start
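To confirm that the ensemble has formed, zkServer.sh status can be run on each node; it reports the node's role on a line beginning with "Mode:". The helper below is a small sketch for pulling out that role. The name zk_mode is hypothetical, and the demo feeds it sample text rather than a live server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the role from `zkServer.sh status` output,
# which reports it on a line of the form "Mode: leader".
zk_mode() {
  awk '$1 == "Mode:" { print $2 }'
}

# Example (run on each node after starting ZooKeeper):
# /opt/modules/apache-zookeeper-3.6.3-bin/bin/zkServer.sh status 2>&1 | zk_mode

# Self-contained demo with sample text in the same shape:
printf 'Using config: zoo.cfg\nMode: follower\n' | zk_mode   # prints follower
```

In a healthy three-node ensemble, one node should report leader and the other two follower. The status command may report an error until a majority of the nodes have been started.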
8. Start Storm
It is recommended to log in remotely with a client such as MobaXterm. A single node, such as 192.168.95.20, has to start several services, and each service stays in the foreground instead of returning to the shell prompt. With MobaXterm you can open multiple terminals and start the other services more conveniently.
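As an alternative to keeping one terminal open per service, the daemons can be pushed into the background with nohup. This is only a sketch under assumptions: start_bg and the LOG_DIR location are made up for illustration and are not something Storm requires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a long-lived command in the background with nohup,
# send its output to a log file, and print the PID.
# LOG_DIR is an assumed location chosen for this example.
LOG_DIR="${LOG_DIR:-/tmp/storm-daemon-logs}"

start_bg() {
  local name="$1"; shift
  mkdir -p "$LOG_DIR"
  nohup "$@" > "$LOG_DIR/$name.out" 2>&1 &
  echo "$!"
}

# Example on the master node:
# start_bg nimbus storm nimbus
# start_bg ui storm ui
# start_bg logviewer storm logviewer
```

The printed PID can be kept for stopping the service later with kill. Running the services in the foreground, as in the steps below, remains the easier way to watch their output while learning.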
8.1 Master node
8.1.1 Start the nimbus process
$ storm nimbus
8.1.2 Start the UI
$ vi /opt/modules/apache-storm-2.4.0/conf/storm.yaml
The Storm UI listens on port 8080 by default. If that port is already used by another process (Hadoop, for example), change it to another port.
Add:
ui.port: 19999 (note that there must be a space after the colon)
$ storm ui
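The space after the colon matters because YAML only treats "key: value" as a key-value pair; "ui.port:19999" is read as a single string. A quick check like the one below can catch that mistake before a daemon is started. The function name find_missing_space is hypothetical, and the pattern only looks at top-level keys.

```shell
#!/usr/bin/env bash
# Hypothetical check: list top-level "key:value" lines that lack the space
# YAML needs after the colon (for example "ui.port:19999").
# Prints matching lines with their line numbers; exits non-zero if none match.
find_missing_space() {
  grep -nE '^[A-Za-z0-9_.-]+:[^[:space:]]' "$1"
}

# Example:
# find_missing_space /opt/modules/apache-storm-2.4.0/conf/storm.yaml \
#   || echo "no missing spaces found"
```

Keys that end the line, such as storm.zookeeper.servers:, are not reported because nothing follows the colon.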
8.1.3 Start logviewer
$ storm logviewer
8.1.4 Check the processes on the master node
View the running processes directly with jps:
$ jps
8.2 Slave nodes
8.2.1 Start supervisor
$ storm supervisor
8.2.2 Start logviewer
$ storm logviewer
9. Storm Application Practice
9.1 Use Maven to manage storm-starter
Perform these steps on the master node.
9.1.1 Install Maven
9.1.1.1 Unzip
$ tar -zxvf apache-maven-3.9.2-bin.tar.gz -C /opt/modules/
9.1.2 Configure Maven environment variables
9.1.2.1 Set Maven environment variables
$ sudo vi /etc/profile
export MAVEN_HOME=/opt/modules/apache-maven-3.9.2
export PATH=$PATH:$MAVEN_HOME/bin
9.1.2.2 Make the environment variables take effect
$ source /etc/profile
9.1.2.3 Test whether Maven is installed successfully
$ mvn -version
9.1.3 Use Maven to manage the sample project storm-starter
9.1.3.1 Modify the Maven configuration file
$ cd /opt/modules/apache-maven-3.9.2/conf
$ vi settings.xml
Comment out the following block:
<mirror>
  <id>maven-default-http-blocker</id>
  <mirrorOf>external:http:*</mirrorOf>
  <name>Pseudo repository to mirror external repositories initially using HTTP.</name>
  <url>http://0.0.0.0/</url>
  <blocked>true</blocked>
</mirror>
9.1.3.2 Enter the storm-starter directory
$ cd /opt/modules/apache-storm-2.4.0/examples/storm-starter
9.1.3.3 Edit the pom.xml file
$ vi pom.xml
Between the <plugins> and </plugins> tags, add:
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.6.0</version>
  <executions>
    <execution>
      <goals>
        <goal>java</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <mainClass>org.apache.storm.starter.WordCountTopology</mainClass>
  </configuration>
</plugin>
9.1.3.4 Compile the storm-starter project
$ mvn clean install -DskipTests=true
This step downloads a large number of files, so please be patient.
9.1.3.5 Package into a jar
$ mvn package
9.2 Submit and run
$ cd /opt/modules/apache-storm-2.4.0/examples/storm-starter/target
$ storm jar ./storm-starter-2.4.0.jar org.apache.storm.starter.WordCountTopology wordcountTpy
Reference: https://blog.csdn.net/lt1693016523/article/details/82662071
The jar package is submitted and runs successfully.
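For intuition about what the submitted topology computes: as far as I understand the storm-starter example, it splits a stream of sentences into words and keeps a running count per word. The pipeline below is only a batch analogue in shell, not the topology's actual code, and count_words is a made-up name.

```shell
#!/usr/bin/env bash
# Batch analogue of the word count: split text into words, then count each one.
# Illustration only; the real topology does this continuously on a stream.
count_words() {
  tr -s '[:space:]' '\n' |      # one word per line (the "split" step)
    sed '/^$/d' |               # drop empty lines
    sort | uniq -c |            # group identical words and count them
    awk '{ print $2, $1 }'      # print as "word count"
}

printf 'the cow jumped over the moon\n' | count_words
```

In the output, the line for "the" shows a count of 2 and every other word a count of 1. In Storm the same two stages run as separate components in parallel across the cluster, which is what the UI in the next step lets you observe.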
9.2.1 UI monitoring
Once the task is submitted, you can monitor it in the Storm UI. Enter the following address in your browser:
192.168.95.20:19999 (change the IP address to match your own setup)
That completes the Storm configuration and the WordCountTopology run. The process is a bit cumbersome, but solving the problems one step at a time will improve your operations and maintenance skills in practice. Whatever else we do, our main task must not be interrupted. Whether you are preparing for the postgraduate entrance examination, the civil service examination, or the teaching qualification examination, I hope you will not lose sight of the main task or waste your energy and time on things that are not worth it. There will inevitably be regrets along the way, but there is still a chance. I hope that sooner or later we will each hear our own "Congratulations". Best wishes to you and to me.
Today’s Blog Bgm - "Congratulations" Singer: Judy Hopps Album: Hope (Remix)