CDH Installation Part 4: Enable Kerberos Authentication

Enable Kerberos authentication

•          Install Kerberos

• Install and configure master KDC/Kerberos Server

Note: The Kerberos server can be any host on the same network as the Hadoop cluster

1. Install the packages required by the KDC: krb5-libs, krb5-server, and krb5-workstation

yum install krb5-libs krb5-server krb5-workstation

 

Verify the installation with the command rpm -qa | grep krb5

2. After the software installation is complete, first configure the /etc/krb5.conf file.

[libdefaults]

default_realm = EXAMPLE.COM # Change the default EXAMPLE.COM to the realm name you want to use

[realms]

EXAMPLE.COM = {

kdc = kerberos.example.com # Set this to the host name of the KDC

admin_server = kerberos.example.com # Same as above

}
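Depending on your DNS setup, you may also want a [domain_realm] section in the same file that maps host and domain names to the realm (a minimal sketch, assuming the example.com domain and EXAMPLE.COM realm used above):

[domain_realm]
.example.com = EXAMPLE.COM
example.com = EXAMPLE.COM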

3. Configure the /var/kerberos/krb5kdc/kdc.conf file. Pay attention to the following settings:

[realms]

EXAMPLE.COM = {

#master_key_type = aes256-cts

acl_file = /var/kerberos/krb5kdc/kadm5.acl

dict_file = /usr/share/dict/words

admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab

supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal

}

The realm EXAMPLE.COM here must be consistent with the configuration in /etc/krb5.conf.

4. Create a Kerberos database

This step may take a while. After creation, a set of files is generated under /var/kerberos/krb5kdc/, and you will be prompted to enter the database master password. The -s option creates a stash file so the KDC can start without being prompted for the master key.

kdb5_util create -r EXAMPLE.COM -s

Other operations:

Delete the Kerberos database. If the database needs to be rebuilt, destroy it first; this removes the principal-related files under /var/kerberos/krb5kdc:

kdb5_util -r EXAMPLE.COM destroy

5. Create an administrator principal; you will be prompted to enter a password (admin is used in this example). kadmin.local can be run directly on the KDC without Kerberos authentication.

/usr/sbin/kadmin.local -q "addprinc admin/admin"

Grant ACL permissions to the database administrator by modifying the kadm5.acl file; * represents all permissions. View it with:

 cat /var/kerberos/krb5kdc/kadm5.acl
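A typical kadm5.acl grants full permissions to every principal with the /admin instance; a sketch consistent with the admin/admin principal created above (adjust the realm if yours differs):

*/admin@EXAMPLE.COM     *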

6. Set the kerberos service to start at boot and turn off the firewall

chkconfig krb5kdc on

chkconfig kadmin on

chkconfig iptables off

7. Start the krb5kdc and kadmind processes

/usr/sbin/kadmind

/usr/sbin/krb5kdc

or

service krb5kdc start

service kadmin start

service krb5kdc status

8. Check that Kerberos is running normally

kinit admin/admin

9. Use admin to log in to Kerberos

kinit admin/admin@EXAMPLE.COM [obtain an initial ticket]

klist [view the current tickets]

10. Use the kadmin.local tool to create a user, and use the command listprincs to check whether the user has been created successfully.
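For example, a sketch assuming the user name test (also used in the next step) and the EXAMPLE.COM realm configured earlier:

kadmin.local -q "addprinc test"     # create the user; you will be prompted for a password
kadmin.local -q "listprincs"        # confirm that test@EXAMPLE.COM appears in the output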

11. Use the administrator to create a keytab authentication file for the user

Execute under kadmin

addprinc -randkey test@EXAMPLE.COM

xst -k service.keytab test

The default build directory is /tmp/

View keytab file

klist -k -t /etc/security/service.keytab

Or execute directly:

ktadd -k /root/wangjy.keytab -norandkey wangjy@EXAMPLE.COM

 

This keytab file is equivalent to the user's long-term key, which can be used for account authentication on any host.
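For example, to obtain a ticket with the keytab instead of a password (a sketch, assuming the keytab and test principal created above):

kinit -kt /etc/security/service.keytab test@EXAMPLE.COM    # authenticate using the keytab
klist                                                       # confirm a ticket was issued for test@EXAMPLE.COM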

12. Use the administrator to delete a user

Execute under kadmin

delprinc -force test@EXAMPLE.COM

 

• Install LDAP client

yum install openldap-clients

• Install Kerberos Client

Install Kerberos Client on other hosts in the cluster:

yum install krb5-libs krb5-workstation

•          Enable Kerberos authentication for the Hadoop environment

Note: The following steps apply only to the CDH 5.5.x version of Hadoop.

• Basic environment

1. Configure the KDC and its realm

2. Install openldap-clients on the Cloudera Manager Server host

3. Install krb5-workstation, krb5-libs on other nodes of the Hadoop cluster

4. Add the following configuration information to /var/kerberos/krb5kdc/kdc.conf on the host where the KDC is located

max_life = 1d 

max_renewable_life = 7d

kdc_tcp_ports = 88
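These options usually go in kdc.conf with kdc_tcp_ports under [kdcdefaults] and the lifetimes inside the realm stanza (a sketch, assuming the EXAMPLE.COM realm configured earlier):

[kdcdefaults]
kdc_tcp_ports = 88

[realms]
EXAMPLE.COM = {
  max_life = 1d
  max_renewable_life = 7d
  # ... existing settings such as acl_file and supported_enctypes stay as they are
}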

5. If YARN HA is enabled, you need to clear the related state in ZooKeeper:

Stop YARN and format the State Store. This can be done through the Cloudera Manager page.
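If you prefer the command line to the Cloudera Manager page, the ResourceManager state store can also be formatted after YARN has been stopped (a sketch; run it as a user with YARN administrator rights):

yarn resourcemanager -format-state-store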

• Installation steps

1. Prerequisites for enabling:

Set up a running KDC. Cloudera Manager supports MIT KDC and Active Directory.

The KDC should be configured with a non-zero ticket lifetime and a renewable lifetime. CDH does not work properly if tickets are not renewable.

If you want to use Active Directory, the OpenLDAP client libraries should be installed on the Cloudera Manager Server host. In addition, the Kerberos client libraries should be installed on all hosts.

After all the above conditions are confirmed, check Yes and go to the next step.

2. KDC information

3. KRB5 configuration

Whether to deploy krb5.conf to each node in the cluster

4. Import the KDC account manager credentials

5. Configure the HDFS DataNode ports

6. Services start successfully

• Possible errors

1. Communication failure with server while initializing kadmin interface

 

Reason:

The host specified for the admin server (also known as the master KDC) does not have the kadmind daemon running.

Workaround:

Make sure to specify the correct hostname for the master KDC. If the correct hostname is specified, make sure kadmind is running on the specified master KDC.
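A quick way to check this on the master KDC (a sketch using the service names set up earlier):

ps -ef | grep kadmind    # verify the kadmind daemon is running
service kadmin status    # or check it through the init script
service kadmin start     # start it if it is stopped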

•          Turn off Kerberos authentication

• Steps to disable

1. Modify the HDFS configuration (core-site.xml and hdfs-site.xml; see the sketch after this list)

hadoop.security.authentication -> simple

hadoop.security.authorization -> false

dfs.datanode.address -> from 1004 (for Kerberos) to 50010 (default)

dfs.datanode.http.address -> from 1006 (for Kerberos) to 50075 (default)

2. HBase configuration

hbase.security.authentication -> simple

hbase.security.authorization -> false

3. ZooKeeper configuration

enableSecurity -> false

4. Hue configuration

Delete Kerberos Ticket Renewer instance
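For reference, the HDFS settings from step 1 correspond to the following entries in core-site.xml and hdfs-site.xml (a sketch; on a CM-managed cluster you would normally change them through the Cloudera Manager UI rather than by editing the files directly):

<!-- core-site.xml -->
<property><name>hadoop.security.authentication</name><value>simple</value></property>
<property><name>hadoop.security.authorization</name><value>false</value></property>

<!-- hdfs-site.xml -->
<property><name>dfs.datanode.address</name><value>0.0.0.0:50010</value></property>
<property><name>dfs.datanode.http.address</name><value>0.0.0.0:50075</value></property>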

• Possible errors

DataNodes fail to start

Exception information:

java.io.IOException: Failed on local exception: java.net.SocketException: Permission denied; Host Details : local host is: "xxxxx"; destination host is: (unknown)

 

Workaround:

Restore the datanode's dfs.datanode.address to 50010 and dfs.datanode.http.address to 50075.

NameNode in YARN is in standby state

•          Development machines connect to Kerberos

•          Windows

1.        Configure environment variables: USERDNSDOMAIN=HADOOP.COM

2.        Modify mapred.properties

#kerberos authentication configuration

hadoop.security.authentication=kerberos

#kerberos.file.path=/etc/krb5.conf

kerberos.file.path=E:/etc/security/keytab/krb5.conf

hdfs.user=hdfs

#mapreduce authentication user

dfs.client.kerberos.principal=hebei@HADOOP.COM

dfs.client.keytab.file=/etc/security/keytab/hebei.keytab

#hive authentication user (replace <host> with the actual service host)

hive.dfs.client.kerberos.principal=hiveuser/<host>@HADOOP.COM

hive.dfs.client.keytab.file=E:/etc/security/keytab/hiveuser.keytab

 

FAQ

•          Disable access control

The Hadoop file system (HDFS) has permission control similar to Linux permissions. In a test environment, find the dfs.permissions configuration item and uncheck it.

Note: Production environments must keep HDFS permission control enabled.

•          Modify the number of file replicas

The Hadoop file system (HDFS) keeps three copies of data by default to ensure reliability. In a test environment, or if disk capacity is small, find the dfs.replication configuration item and modify the number of replicas (an integer greater than 0).

Note: Production environments should use three or more replicas, depending on the data and disk capacity.
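Note that dfs.replication only affects files written after the change; existing files keep their old replication factor unless it is changed explicitly, for example (a sketch, assuming a target factor of 2):

hadoop fs -setrep -R 2 /    # re-replicate everything under / to 2 copies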

•          Increase space when the Hadoop file system (HDFS) runs out of space

Insufficient space in the Hadoop file system (HDFS) will cause errors in programs such as MR and Hive. It is necessary to add storage to each device in the Hadoop cluster. After the operating system mounts the new storage, the configuration of some components in Hadoop must be modified.

On the CM management page (port 7180 of the master node), click HDFS to enter the HDFS page, and click Configure:

Search for the dfs.datanode.data.dir configuration item, click "+", and add a new storage directory /newdisk1/dfs/dn

Search for the hadoop.log.dir configuration item and change the original directory to the new storage directory /newdisk1/var/log/hadoop-mapreduce
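Before adding the new directory in Cloudera Manager, make sure it exists on every DataNode and is writable by the HDFS service user (a sketch; hdfs:hadoop is the usual CDH owner and may differ in your setup):

df -h /newdisk1                          # confirm the new storage is mounted
mkdir -p /newdisk1/dfs/dn
chown -R hdfs:hadoop /newdisk1/dfs/dn    # the DataNode must be able to write here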

On the CM management page (port 7180 of the master node), click YARN to enter the YARN page, click Configure, search for the following configuration items in turn, and change them to the new storage directories:

Configuration item: yarn.nodemanager.local-dirs Storage directory: /newdisk1/yarn/nm

Configuration item: yarn.nodemanager.log-dirs Storage directory: /newdisk1/var/log/hadoop-yarn/container

Configuration item: hadoop.log.dir Storage directory: /newdisk1/var/log/hadoop-yarn

•          If an error occurs or the installation is interrupted and you want to start the installation again, you can perform the following operations:

Master node: stop the server and agent

   /opt/cm-5.5.0/etc/init.d/cloudera-scm-server stop

   /opt/cm-5.5.0/etc/init.d/cloudera-scm-agent stop

   rm -rf /opt/cloudera/parcel-cache

   rm -rf /opt/cloudera/parcels

Clear the database:

drop database scm;

Rebuild the database:

CREATE DATABASE scm OWNER scm ENCODING 'UTF8';
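The drop and create statements above are PostgreSQL syntax; a sketch of running them, assuming Cloudera Manager was set up against a PostgreSQL database and a postgres superuser account is available:

sudo -u postgres psql
postgres=# drop database scm;
postgres=# CREATE DATABASE scm OWNER scm ENCODING 'UTF8';
postgres=# \q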

Slave nodes: stop the agent

 /opt/cm-5.5.0/etc/init.d/cloudera-scm-agent stop

 rm -rf /opt/cloudera

Restart the services:

Master node: start the server and agent

   /opt/cm-5.5.0/etc/init.d/cloudera-scm-server start

   /opt/cm-5.5.0/etc/init.d/cloudera-scm-agent start

Slave nodes: start the agent

/opt/cm-5.5.0/etc/init.d/cloudera-scm-agent start
