CDH high availability configuration for Hadoop clusters

1. HDFS high availability configuration

dfs.namenode.edits.dir (NameNode edits directory): the directory on the local file system where the NameNode writes its edit log. If it is not specified, the edits are stored in the NameNode data directory.

dfs.journalnode.edits.dir (JournalNode edits directory): the directory on the local file system of each JournalNode host where the JournalNode stores the NameNode edit log and related state.
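
As a rough illustration of the two properties above, they could appear in hdfs-site.xml as in the sketch below. The directory paths are assumptions chosen for the example and do not come from the original setup. In a CDH cluster these values are normally set through Cloudera Manager rather than by editing the file by hand.

```xml
<!-- hdfs-site.xml fragment (sketch only; both paths are hypothetical) -->
<configuration>
  <!-- Where the NameNode writes its edit log on the local file system.
       If omitted, the NameNode data directory is used instead. -->
  <property>
    <name>dfs.namenode.edits.dir</name>
    <value>file:///data/dfs/nn/edits</value>
  </property>

  <!-- Where each JournalNode stores the edit log and related state.
       This is a plain absolute path on the JournalNode host. -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/dfs/jn</value>
  </property>
</configuration>
```

Keeping the two directories on separate disks from the DataNode data is a common choice, but that is a general recommendation and not something the original configuration states.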

2. YARN high availability configuration

With YARN HA enabled, a Hive on YARN task may fail to return a result and report the following error:

Caused by: javax.servlet.ServletException: Could not determine the proxy server for redirection

Problem: YARN cannot determine which proxy server to redirect to.

Solution: disable YARN HA, that is, run the ResourceManager on a single master node only. In practice, Hive on YARN tasks can usually still run and return correct results with YARN HA enabled, but the same error is still reported.

The current environment runs with YARN HA enabled (ResourceManager is deployed on both node1 and node2). Hive on Spark programs executed in this environment complete normally and return correct results, but the log still contains the error: Could not determine the proxy server for redirection.

The statements used in this test:

select * from test_tb;

select count(*) from test_tb;

insert into test_tb values(2,'ushionagisa');
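
For reference, a two-ResourceManager setup like the one described (node1 and node2) corresponds roughly to yarn-site.xml properties such as the following. This is a sketch under assumptions: the cluster id, the ResourceManager ids rm1 and rm2, and the port 8088 are illustrative and are not taken from the original cluster, and the fragment is not presented as a fix for the error.

```xml
<!-- yarn-site.xml fragment (sketch only; cluster id, rm ids and ports are hypothetical) -->
<configuration>
  <!-- Setting this to false corresponds to the "disable YARN HA" workaround. -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <!-- Logical ids for the two ResourceManagers. -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node2</value>
  </property>
  <!-- Web UI address of each ResourceManager. -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>node1:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>node2:8088</value>
  </property>
</configuration>
```

In CDH these properties are generated by Cloudera Manager when ResourceManager HA is enabled or disabled, so the fragment is mainly useful for checking what the deployed configuration contains.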

Origin www.cnblogs.com/Raodi/p/11460848.html