Building Solr 7.3 Cloud on HDFS

1 Overview


SolrCloud is a Solr cluster that relies on ZooKeeper (zk) for centralized configuration management. It is fault tolerant, horizontally scalable, and highly available; it load-balances queries and fails over automatically, which makes it suitable for large-scale distributed indexing and search. This article walks through building the latest version of Solr (7.3.1 at the time of writing) on HDFS.


2 Environment


  • CentOS 7

  • JDK 8

  • ZooKeeper

  • Hadoop 2.7


3 Solr installation


SolrCloud is built against an external ZooKeeper ensemble, with Solr installed on the following four nodes:

emr-worker-1
emr-worker-2
emr-worker-3
emr-worker-4

On the emr-worker-1 node, download the installation package (http://www.apache.org/dyn/closer.lua/lucene/solr/7.3.1).

Extract the service installation script from the archive.

tar xzf solr-7.3.1.tgz solr-7.3.1/bin/install_solr_service.sh --strip-components=2


Install the Solr service. By default, Solr is installed under /opt, the startup configuration file solr.in.sh is installed under /etc/default, and the data, the solr.xml configuration file, and the logs go under /var/solr. These directories can be changed with parameters when running install_solr_service.sh; run the script with -help for details.


bash ./install_solr_service.sh solr-7.3.1.tgz -n

Configure the solr.in.sh file

SOLR_JAVA_MEM="-Xms4g -Xmx4g"
GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
 -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
ZK_HOST="emr-header-1:2181,emr-header-2:2181,emr-header-3:2181/solr"
SOLR_HOST="emr-worker-1"
SOLR_TIMEZONE="UTC+8"
SOLR_OPTS="$SOLR_OPTS -Dsolr.directoryFactory=HdfsDirectoryFactory \
-Dsolr.lock.type=hdfs \
-Dsolr.hdfs.home=hdfs://emr-header-1:8020/solr"
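One detail of ZK_HOST that is easy to get wrong: the /solr chroot is appended once, after the last host:port pair, not after every host. A minimal sketch of assembling the string, assuming a hypothetical helper named zk_host_string (it is not part of Solr):

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Solr): build a ZK_HOST connection string.
# Usage: zk_host_string <port> <chroot> <host>...
zk_host_string() {
  local port="$1" chroot="$2"
  shift 2
  local out="" h
  for h in "$@"; do
    # Join host:port pairs with commas; no comma before the first pair.
    out="${out:+$out,}${h}:${port}"
  done
  # The chroot path is appended exactly once, at the very end.
  printf '%s%s\n' "$out" "$chroot"
}

zk_host_string 2181 /solr emr-header-1 emr-header-2 emr-header-3
# prints emr-header-1:2181,emr-header-2:2181,emr-header-3:2181/solr
```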


Note: On each node, set SOLR_HOST to that node's own hostname.
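Because SOLR_HOST is the only setting above that differs between nodes, the edit can be scripted. The following is a minimal sketch, assuming the file already contains a SOLR_HOST= line (possibly commented out) or none at all; the function name set_solr_host is hypothetical and not part of Solr.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Solr): make a solr.in.sh-style file
# name the given host in its SOLR_HOST line.
# Usage: set_solr_host <file> <hostname>
set_solr_host() {
  local file="$1" host="$2"
  if grep -Eq '^#?[[:space:]]*SOLR_HOST=' "$file"; then
    # Rewrite the existing (possibly commented-out) line via a temp file.
    sed -E "s|^#?[[:space:]]*SOLR_HOST=.*|SOLR_HOST=\"${host}\"|" "$file" > "${file}.tmp" \
      && mv "${file}.tmp" "$file"
  else
    # No SOLR_HOST line yet: append one.
    printf 'SOLR_HOST="%s"\n' "$host" >> "$file"
  fi
}
```

On each node this could be called as set_solr_host /etc/default/solr.in.sh "$(hostname)", assuming the output of hostname matches the emr-worker-N names used here.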

Configure the solr.xml file

<str name="host">${host:emr-worker-1}</str>


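Note: On each node, set this to that node's own hostname. The ${host:...} syntax reads the host system property and falls back to the value after the colon when the property is not set.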

Then install and configure Solr on the other nodes in turn. Finally, create the /solr znode in ZooKeeper; this is the chroot referenced at the end of ZK_HOST.

[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha, hbase]
[zk: localhost:2181(CONNECTED) 1] create /solr ''
Created /solr
[zk: localhost:2181(CONNECTED) 2] ls /
[zookeeper, hadoop-ha, hbase, solr]
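The same znode can also be created without an interactive zkCli session, using the zk mkroot subcommand of the Solr control script. This is a sketch only: the SOLR_BIN variable and the create_solr_chroot function are names introduced here, and /opt/solr/bin/solr assumes the default install location.

```shell
#!/usr/bin/env bash
# Assumed location of the Solr control script after a default service install.
SOLR_BIN="${SOLR_BIN:-/opt/solr/bin/solr}"

# Hypothetical wrapper: create the chroot znode with 'bin/solr zk mkroot'.
# Usage: create_solr_chroot <zk host:port list> [chroot, default /solr]
create_solr_chroot() {
  local zk_hosts="$1" chroot="${2:-/solr}"
  # The -z value lists the ensemble WITHOUT the chroot suffix.
  "$SOLR_BIN" zk mkroot "$chroot" -z "$zk_hosts"
}

# Example (run on a node where Solr is installed):
#   create_solr_chroot emr-header-1:2181,emr-header-2:2181,emr-header-3:2181
```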


Start the Solr service on all four nodes

for h in "emr-worker-1" "emr-worker-2" "emr-worker-3" "emr-worker-4"
do
 ssh $h service solr start
done


To check the status of SolrCloud, run bin/solr status:

Found 1 Solr nodes:

Solr process 21527 running on port 8983
{
  "solr_home":"/var/solr/data",
  "version":"7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:30:57",
  "startTime":"2018-05-17T08:06:10.296Z",
  "uptime":"0 days, 0 hours, 1 minutes, 46 seconds",
  "memory":"112.8 MB (%2.9) of 3.8 GB",
  "cloud":{
    "ZooKeeper":"emr-header-1:2181,emr-header-2:2181,emr-header-3:2181/solr",
    "liveNodes":"4",
    "collections":"1"}
}
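The liveNodes field is the quickest sign that all four nodes have registered with ZooKeeper. As a sketch, the value can be pulled out of the status output with standard text tools; live_nodes is a hypothetical helper, and the pattern assumes the output format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `bin/solr status` output on stdin and print the
# numeric value of the first "liveNodes" field found.
live_nodes() {
  sed -n 's/.*"liveNodes":"\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Demonstration on a fragment of the output above. On a node one would run:
#   /opt/solr/bin/solr status | live_nodes
sample='    "liveNodes":"4",'
printf '%s\n' "$sample" | live_nodes   # prints 4
```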


4 Collection operation


Switch to solr user

su solr

Create collection

bin/solr create_collection -c collection1 -shards 4 -replicationFactor 2
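With -shards 4 and -replicationFactor 2 the collection consists of 4 x 2 = 8 replicas, so each of the four nodes hosts two of them if they are spread evenly. A small sketch of that arithmetic (replicas_per_node is a hypothetical helper, not a Solr command):

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many replicas each node must host when
# shards * replicationFactor replicas are spread evenly over the nodes.
# Usage: replicas_per_node <shards> <replicationFactor> <nodes>
replicas_per_node() {
  local shards="$1" rf="$2" nodes="$3"
  local total=$(( shards * rf ))
  # Ceiling division: a node cannot host a fraction of a replica.
  echo $(( (total + nodes - 1) / nodes ))
}

replicas_per_node 4 2 4   # prints 2
```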

Collection health check

bin/solr healthcheck -c collection1

Delete collection

bin/solr delete -c collection1




Origin blog.51cto.com/15060465/2679582