Big data server environment configuration
Foreword:
I am new to this. I recently bought three cloud servers so that I could build a big data environment myself. They are trial servers and I am a student on a tight budget, so the configuration is modest: three servers, each with 1 core and 2 GB of memory.
1. Operating system selection
Because these are cloud servers, I was spared chores such as installing the operating system, which saves a lot of time. I chose CentOS 7.3.
2. Network card environment
Because I am using cloud servers, the network card does not need to be configured.
If you are using virtual machines or your own physical servers, you need to set NAT mode and assign the IP addresses yourself. The details are not covered here.
3. Download and install auxiliary tools
These tools simply make the servers more convenient to work with. If you are an expert, or you don't mind the inconvenience, you can work on the servers directly. The tools I chose are Xshell, Xftp, and Notepad++.
3.1 Install Xshell
3.1.1 Function
Xshell is a powerful secure terminal emulator for the Microsoft Windows platform that supports the SSH1, SSH2, and TELNET protocols. It connects securely to remote hosts over the Internet, and its design and features help users work comfortably in complex network environments.
From the Windows desktop, Xshell can access servers running different remote systems, which makes it a convenient way to control them from a terminal.
3.1.2. Download address
http://www.netsarang.com/products/xsh_overview.html
3.1.3 Installation
Just run the installer. I had already installed it beforehand, so there are no screenshots here. Remember to install it to a path you are familiar with.
3.1.4 Open configuration
Name: anything you like;
Host: the server's IP address, or a host name if you have configured a mapping for it on Windows (mappings are covered later in these notes);
Port: 22;
Click OK.
3.1.5 Enter account password
(Note: the first time you log in, you will be asked to verify the host key. Just select Yes.)
3.1.6 Successful connection
3.2 Install Xftp
3.2.1 Function
Xftp is a powerful SFTP and FTP file transfer program for the MS Windows platform. With Xftp, Windows users can securely transfer files between UNIX/Linux machines and Windows PCs. It suits both novice and advanced users: it uses standard Windows-style wizards, its simple interface works closely with other Windows applications, and it offers many powerful features for advanced users.
3.2.2. Download address
https://www.netsarang.com/products/xfp_overview.html
3.2.3 Installation
The installation is similar to Xshell.
3.2.4 Successful connection
3.3 Install notepad++
3.3.1 Function
My everyday notepad. It is a very distinctive editor.
3.3.2. Download address
https://notepad-plus-plus.org/
3.3.3 Installation
Nothing special to say here.
3.3.4 Successful connection
4. Create a new ordinary user
We shouldn't always operate the server as the root user, in case something gets deleted by accident.
After creating the new user and setting its password, give it the same permissions as the root user.
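The commands for this step did not survive in these notes, so here is a minimal sketch of how it could be done. The user name hadoop is taken from the chown command later in this post. The helper names, and the use of a drop-in file under /etc/sudoers.d (which the default CentOS sudoers file includes), are my own assumptions.

```shell
# Sketch only: run create_admin_user as root. The function names are hypothetical.

# Print the sudoers rule that lets a user run any command through sudo,
# mirroring the "root ALL=(ALL) ALL" line in /etc/sudoers.
sudoers_rule() {
    printf '%s ALL=(ALL) ALL\n' "$1"
}

# Create the user, set its password interactively, and grant sudo rights.
create_admin_user() {
    user="$1"
    useradd "$user"
    passwd "$user"
    # A drop-in file avoids editing /etc/sudoers by hand.
    sudoers_rule "$user" > "/etc/sudoers.d/$user"
    chmod 440 "/etc/sudoers.d/$user"
    # Check the sudoers syntax before relying on it.
    visudo -c
}

# Usage (as root):
#   create_admin_user hadoop
```

After this, the new user can prefix a command with sudo whenever root privileges are needed.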
5. Modify the machine name
A machine name is a matter of taste, but I am a bit obsessive about these things and can't leave the default in place.
On CentOS 7, one command is enough: hostnamectl set-hostname followed by the new name.
On CentOS 6, edit the HOSTNAME entry in /etc/sysconfig/network instead.
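For CentOS 7, a small sketch of the rename could look like the following. The name bigdata01 comes from the time synchronization section of this post. The validation helper is a hypothetical addition that only rejects obviously bad names, it is not a full RFC check.

```shell
# Sketch only: run set_machine_name as root. Both function names are hypothetical.

# Accept names made of letters, digits and hyphens that neither start
# nor end with a hyphen. Empty names are rejected too.
is_valid_hostname() {
    case "$1" in
        ""|-*|*-|*[!A-Za-z0-9-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Set the static host name with hostnamectl (available on CentOS 7).
set_machine_name() {
    if ! is_valid_hostname "$1"; then
        echo "invalid host name: $1" >&2
        return 1
    fi
    hostnamectl set-hostname "$1"
}

# Usage (as root):
#   set_machine_name bigdata01
#   hostname        # prints the current host name for a quick check
```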
6. Modify the mapping
In /etc/hosts, add an entry for every machine you want to reach by name.
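As an illustration, the mapping for a three-node cluster might be added like this. The host names and the 192.168.163.0 network segment appear later in this post, but the individual addresses are placeholders, so substitute the real IPs of your servers.

```shell
# Sketch only: the IP addresses are placeholders, not real server addresses.

# Print one "IP hostname" line per node, in /etc/hosts format.
hosts_entries() {
    cat <<'EOF'
192.168.163.101 bigdata01
192.168.163.102 bigdata02
192.168.163.103 bigdata03
EOF
}

# Usage (as root, on every node):
#   hosts_entries >> /etc/hosts
#   ping -c 1 bigdata02     # the name should now resolve
```

Every node gets the same three lines, so that each machine can resolve all the others as well as itself.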
7. Turn off the firewall
On CentOS 7, turning off the firewall takes only two commands: one to stop firewalld now and one to keep it from starting at boot.
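The two commands are not reproduced in these notes. Assuming the firewall is the default firewalld service, they would presumably be the pair wrapped in the hypothetical helper below.

```shell
# Sketch only: run disable_firewalld as root on each node.
disable_firewalld() {
    systemctl stop firewalld       # stop the firewall for the current session
    systemctl disable firewalld    # do not start it again at boot
}

# Usage (as root):
#   disable_firewalld
#   systemctl status firewalld     # should report the service as inactive
```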
Reference website: https://www.aliyun.com/jiaocheng/121592.html
On CentOS 6, the procedure is as follows.
First, stop iptables:
service iptables stop      # stop it for the current session
chkconfig iptables off     # do not start it at boot
Second, disable SELinux (the kernel security subsystem):
vi /etc/sysconfig/selinux
In that file, set SELINUX=disabled
8. Passwordless SSH setup
These servers need to connect to each other, and entering a password every time is impractical. Even the local machine sometimes asks for a password, for example when starting the ResourceManager. To avoid the hassle, this must be configured.
8.1 Configuration
Run ssh-keygen -t rsa and press Enter at each prompt to accept the defaults. This generates:
id_rsa -> private key
id_rsa.pub -> public key
On every server, run: ssh-copy-id <server name>
Run it once for each machine in the cluster.
Note: also run ssh-copy-id against the machine itself, so that it can log in to itself without a password.
authorized_keys -> holds the public keys that are allowed to log in; ssh-copy-id appends your public key to this file on the remote machine
known_hosts -> records the host keys of the machines you have connected to
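The steps above can be sketched as one small helper. The node names are the ones used in the time synchronization section of this post. The function name, the empty passphrase, and the default key path are assumptions.

```shell
# Sketch only: run distribute_key on every node as the cluster user.
# The node list is an assumption; list your own machines here.
NODES="bigdata01 bigdata02 bigdata03"

distribute_key() {
    # Generate an RSA key pair only if none exists yet
    # (empty passphrase, default location).
    if [ ! -f "$HOME/.ssh/id_rsa" ]; then
        ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    fi
    # Copy the public key to every node, including this one.
    # Each node asks for the password one last time.
    for node in $NODES; do
        ssh-copy-id "$node"
    done
}

# Usage:
#   distribute_key
#   ssh bigdata02 hostname    # should no longer ask for a password
```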
8.2 Troubleshooting
If it does not take effect, delete all files in the .ssh directory and regenerate the keys.
Alternatively, delete the .ssh directory itself. ssh-keygen will recreate it, so do not create it with mkdir.
8.3 Once SSH is configured, no password is needed, and you can start the service processes on multiple nodes directly, for example: sbin/start-dfs.sh
9. Cluster node time synchronization
I use cloud servers, so the time is synchronized automatically and no configuration is required.
Still, time synchronization across the cluster is essential.
There are many ways to do it, and plenty of tutorials online.
Below are my earlier notes from a deployment on VMware virtual machines.
1. Simulate the intranet environment
Pick one server in the cluster as the time server:
bigdata01: time server
bigdata02 and bigdata03: synchronize from bigdata01
2. Check the ntpd service (start ntpd only on the first machine; the others do not need it)
sudo service ntpd status
sudo service ntpd start
3. Enable ntpd at boot (on the first machine only)
sudo chkconfig ntpd on
4. Modify the configuration file
vi /etc/ntp.conf
[First change] Change the network segment to your own, and remove the leading # so that the line takes effect
# Hosts on local network are less restricted.
restrict 192.168.163.0 mask 255.255.255.0 nomodify notrap
[Second change] Since this is an intranet environment, the external servers are not needed, so comment them out
#server 0.centos.pool.ntp.org
#server 1.centos.pool.ntp.org
#server 2.centos.pool.ntp.org
[Third change] Enable the local clock as the time source, and remove the leading # so that the lines take effect
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
Save the file.
5. After modifying the configuration file, restart the ntpd service so that it re-reads the configuration
sudo service ntpd restart
6. Check the installed time service packages
rpm -qa | grep ntp
ntpdate-4.2.4p8-3.el6.centos.x86_64    (the tool used to synchronize)
ntp-4.2.4p8-3.el6.centos.x86_64        (the service run on the machine chosen as the time server)
7. Test a manual synchronization first
sudo /usr/sbin/ntpdate bigdata01
The error is within two or three minutes, which is acceptable
8. Add a crontab job on the nodes that need to synchronize (the second and third machines)
## sync time every 10 minutes
0-59/10 * * * * /usr/sbin/ntpdate bigdata01
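If you would rather not edit the crontab interactively, the entry can also be installed from a script. This is only a sketch: it assumes bigdata01 is the time server, and both function names are hypothetical.

```shell
# Sketch only: run install_ntp_cron as root on bigdata02 and bigdata03.

# Print the crontab entry that synchronizes with the given time server
# every 10 minutes.
ntp_cron_line() {
    printf '0-59/10 * * * * /usr/sbin/ntpdate %s\n' "$1"
}

# Keep the existing crontab entries, drop any earlier ntpdate entry,
# and append the new one.
install_ntp_cron() {
    {
        crontab -l 2>/dev/null | grep -v '/usr/sbin/ntpdate'
        ntp_cron_line "$1"
    } | crontab -
}

# Usage (as root):
#   install_ntp_cron bigdata01
#   crontab -l      # the new entry should be listed
```

Filtering out the old entry first means the function can be run again without piling up duplicate lines.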
10. Installation of Java environment
Big data frameworks such as Hadoop run on the JVM, so Java is a must.
10.1 Download from domestic mirror sources
java8 http://www.linuxidc.com/Linux/2015-05/117967.htm
10.2 Upload with a file transfer tool such as Xftp
Upload the installation package and extract the JDK to a directory of your choice. Any directory works, but installing it under a user's home directory is not recommended.
- I chose to install under the /opt directory
(Remember to change the ownership first with chown -R hadoop:hadoop /opt/ so that your own user can work there without root.)
10.3 Adding environment variables
Edit the profile with vi /etc/profile and configure the JDK environment variables
#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.8.0_161
export PATH=$PATH:$JAVA_HOME/bin
10.4 Verify the configuration: run source /etc/profile (or log in again) so that the changes take effect, then run java -version
The jps command lists the running Java processes
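Steps 10.2 to 10.4 can be sketched together as follows. The archive name jdk-8u161-linux-x64.tar.gz is an assumption inferred from the jdk1.8.0_161 directory above, and the function names are hypothetical.

```shell
# Sketch only: the archive name is assumed, adjust it to the file you uploaded.
JDK_ARCHIVE="jdk-8u161-linux-x64.tar.gz"
INSTALL_DIR="/opt/modules"

# Print the lines to append to /etc/profile.
java_profile_lines() {
    cat <<'EOF'
#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.8.0_161
export PATH=$PATH:$JAVA_HOME/bin
EOF
}

# Extract the JDK, register the environment variables, and verify.
install_jdk() {
    mkdir -p "$INSTALL_DIR"
    tar -zxf "$JDK_ARCHIVE" -C "$INSTALL_DIR"
    # /etc/profile belongs to root, so append through sudo.
    java_profile_lines | sudo tee -a /etc/profile > /dev/null
    # Reload the profile in the current shell, then check the version.
    . /etc/profile
    java -version
}

# Usage (from the directory that holds the archive):
#   install_jdk
```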