Setting up a Solr cluster (SolrCloud) environment

Introduction to SolrCloud

SolrCloud is the distributed search solution provided by Solr. It is used when large-scale, fault-tolerant, distributed indexing and retrieval are required. When the amount of indexed data in a system is small, SolrCloud is not needed; when the index is large and search request concurrency is high, SolrCloud is the appropriate choice.

SolrCloud is built on Solr and ZooKeeper. Its main idea is to use ZooKeeper as the configuration information center of the cluster.

It has several special features:

1) Centralized configuration information

2) Automatic fault tolerance

3) Near real-time search

4) Automatic load balancing when querying

Introduction to Zookeeper

As the name suggests, ZooKeeper is a zoo keeper: an administrator that looks after Hadoop (the elephant), Hive (the bee) and Pig (the pig). The distributed clusters of Apache HBase and Apache Solr also use ZooKeeper. ZooKeeper is a distributed, open-source coordination service for distributed applications; it started as a sub-project of Hadoop and is now a top-level Apache project.

In a cluster, ZooKeeper acts as the cluster management tool and the entry point to the cluster. Its main uses are:

  1. Cluster management

Master-slave management, load balancing and high-availability management. ZooKeeper is the entry point to the cluster, so it must itself run as a cluster to be highly available. Because ZooKeeper elects a leader by majority vote, the cluster should have at least three nodes.

  2. Centralized management of configuration files

When building a Solr cluster, you upload the Solr configuration files to ZooKeeper so that ZooKeeper manages them in one place. Each node fetches its configuration from ZooKeeper.

  3. Distributed lock

SolrCloud Structure

To reduce the load on any single machine, SolrCloud has multiple servers jointly handle indexing and search. This is implemented by splitting the index data into shards, with each shard served jointly by multiple servers. When an index or search request arrives, it is routed to the servers of the relevant shards.

SolrCloud requires Solr to be deployed together with ZooKeeper. ZooKeeper is cluster management software: since SolrCloud is made up of multiple servers, ZooKeeper is used to coordinate and manage them.

Setup steps

ZooKeeper cluster setup

Step 1: Upload the zookeeper-3.4.6.tar.gz installation package to the server.

Step 2: Unzip zookeeper: tar -zxvf zookeeper-3.4.6.tar.gz

Then make three copies of the extracted directory in /usr/local/solr-cloud, named zookeeper01, zookeeper02 and zookeeper03.

Step 3: Configure zookeeper.

1. Create a data folder in the zookeeper01 directory.

2. Create a myid file in the data folder.

3. Write 1 as the content of myid (for zookeeper02 the content is 2, for zookeeper03 it is 3).

4. Do the same for zookeeper02 and zookeeper03.

5. Go into the conf directory and rename zoo_sample.cfg to zoo.cfg, or make a copy of zoo_sample.cfg named zoo.cfg.

6. Edit zoo.cfg and set dataDir= to the newly created data folder.

(Path in this demo: /usr/local/solr-cloud/zookeeper01/data/)

7. Edit zoo.cfg and set clientPort to a port number that does not conflict (01: 2181, 02: 2182, 03: 2183).

(This is the port that clients connect to.)

8. Add the following to zoo.cfg:

server.1=192.168.33.10:2881:3881

server.2=192.168.33.10:2882:3882

server.3=192.168.33.10:2883:3883

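(On each line, the first port is used for communication between the servers and the second for leader election. Neither may be reused by another instance on the same machine.)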
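
The configuration steps above can also be scripted. The following is only a sketch under some assumptions: BASE and HOST are variable names chosen here for illustration, BASE defaults to a scratch directory so the script can be tried safely (set BASE=/usr/local/solr-cloud for the layout used in this article), and the script writes a fresh zoo.cfg instead of editing zoo_sample.cfg. The tickTime, initLimit and syncLimit values are the ones shipped in zoo_sample.cfg.

```shell
#!/bin/sh
# Sketch: generate data/myid and conf/zoo.cfg for three ZooKeeper instances.
# BASE and HOST are illustrative variables, not part of ZooKeeper itself.
BASE="${BASE:-$(mktemp -d)}"
HOST="${HOST:-192.168.33.10}"

for i in 1 2 3; do
    dir="$BASE/zookeeper0$i"
    mkdir -p "$dir/data" "$dir/conf"
    # myid holds the instance number and must match N in server.N below.
    echo "$i" > "$dir/data/myid"
    # Same defaults as zoo_sample.cfg, plus per-instance dataDir and clientPort.
    cat > "$dir/conf/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$dir/data
clientPort=$((2180 + i))
server.1=$HOST:2881:3881
server.2=$HOST:2882:3882
server.3=$HOST:2883:3883
EOF
done
echo "ZooKeeper instance configs written under $BASE"
```

Note that the script overwrites any existing zoo.cfg in the three conf directories, so it is best run right after copying the three instances.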

Step 4: Start zookeeper.

There is a bin directory in the Zookeeper directory. Start the zookeeper service using zkServer.sh.

Start command: ./zkServer.sh start

To start ZooKeeper in the foreground instead: ./zkServer.sh start-foreground (this mode prints the ZooKeeper log to the console)

Shutdown command: ./zkServer.sh stop

View service status: ./zkServer.sh status
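
Since there are three instances, each command has to be run three times. A small loop can do that; the sketch below assumes the zookeeper01 to zookeeper03 layout from this article. BASE, DRY_RUN and zk_all are names made up for the example, and DRY_RUN defaults to 1 so that the commands are only printed until you are happy with them.

```shell
#!/bin/sh
# Sketch: run the same zkServer.sh sub-command on all three instances.
# BASE is an illustrative variable; DRY_RUN=1 (the default here) only prints
# the commands so the loop can be checked before anything is started.
BASE="${BASE:-/usr/local/solr-cloud}"
DRY_RUN="${DRY_RUN:-1}"

zk_all() {
    action="$1"    # start, stop or status
    for i in 1 2 3; do
        script="$BASE/zookeeper0$i/bin/zkServer.sh"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$script $action"
        else
            "$script" "$action"
        fi
    done
}

zk_all start
zk_all status
```

Run it with DRY_RUN=0 to actually execute the commands. Keep in mind that status can report an error until at least two of the three instances are up, because the ensemble needs a majority before it can serve requests.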

Solr cluster setup

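Step 1: Install four Tomcat instances and change their port numbers in conf/server.xml so that they do not conflict. The HTTP ports become 8080~8083.

The ports of tomcat01 to tomcat04 are changed to:

Shutdown port: 8005, 8006, 8007, 8008

HTTP port: 8080, 8081, 8082, 8083

AJP port: 8009, 8010, 8011, 8012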
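
Editing three ports in four server.xml files by hand is error-prone, so here is a sketch of doing it with sed. shift_ports is a hypothetical helper name, and the sketch assumes each server.xml still contains the stock Tomcat ports 8005, 8080 and 8009. The demo runs on a cut-down server.xml in a scratch directory, not on a real Tomcat.

```shell
#!/bin/sh
# Sketch: shift the three default Tomcat ports in a server.xml by an offset.
# shift_ports is a hypothetical helper name; it assumes the file still
# contains the stock ports 8005 (shutdown), 8080 (HTTP) and 8009 (AJP).
shift_ports() {
    file="$1"
    offset="$2"
    sed -e "s/port=\"8005\"/port=\"$((8005 + offset))\"/" \
        -e "s/port=\"8080\"/port=\"$((8080 + offset))\"/" \
        -e "s/port=\"8009\"/port=\"$((8009 + offset))\"/" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demo on a cut-down server.xml in a scratch directory.
demo="$(mktemp -d)/server.xml"
cat > "$demo" <<'EOF'
<Server port="8005" shutdown="SHUTDOWN">
  <Connector port="8080" protocol="HTTP/1.1" redirectPort="8443" />
  <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
</Server>
EOF
shift_ports "$demo" 1    # tomcat02 uses offset 1: 8006, 8081, 8010
cat "$demo"
```

For the four instances in this article the offsets would be 0 to 3, applied to tomcat01/conf/server.xml through tomcat04/conf/server.xml.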

Step 2: Deploy Solr to Tomcat by copying the Solr web application from the stand-alone installation into each Tomcat.

(Copy the solr directory under the stand-alone solr/tomcat/webapps to solr-cloud/tomcat01/webapps/, and likewise for tomcat02 to tomcat04.)

Step 3: Create a solrhome for each solr instance.

Copy the solrhome directory of the stand-alone installation (under solr/) into the solr-cloud/ directory four times, naming the copies solrhome01 to solrhome04.

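Step 4: Point each Solr instance at its solrhome by editing the web.xml of the Solr web application in each Tomcat.

For tomcat01, edit solr-cloud/tomcat01/webapps/solr/WEB-INF/web.xml and set the value of the solr/home env-entry to the path of solrhome01. Do the same for the other three instances.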

Step 5: Edit the solr.xml file under each solrhome and set the two properties host and hostPort to the IP address and port number of that instance.

host: the IP address on which the current instance is running

hostPort: the port number on which the current instance is running, which is the HTTP port of its Tomcat
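
As a rough illustration of this step, the sketch below rewrites the two values with sed. set_solr_host is a hypothetical helper, and the demo solr.xml is a cut-down file that assumes the element names host and hostPort inside the solrcloud section; check your own solr.xml before relying on it.

```shell
#!/bin/sh
# Sketch: set host and hostPort in one solrhome's solr.xml.
# set_solr_host is a hypothetical helper; it assumes the <solrcloud> section
# uses the element names shown in the demo file below.
set_solr_host() {
    file="$1"; host="$2"; port="$3"
    sed -e "s|<str name=\"host\">[^<]*</str>|<str name=\"host\">$host</str>|" \
        -e "s|<int name=\"hostPort\">[^<]*</int>|<int name=\"hostPort\">$port</int>|" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demo on a cut-down solr.xml in a scratch directory.
demo="$(mktemp -d)/solr.xml"
cat > "$demo" <<'EOF'
<solr>
  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
  </solrcloud>
</solr>
EOF
set_solr_host "$demo" 192.168.33.10 8081    # solrhome02 pairs with tomcat02
cat "$demo"
```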

Step 6: Upload the configuration files to ZooKeeper, using the /root/solr-4.10.3/example/scripts/cloud-scripts/zkcli.sh script that ships with Solr.

Upload the /usr/local/solr-cloud/solrhome01/collection1/conf directory to ZooKeeper.

The ZooKeeper cluster must already be running.

(Uploading the conf directory of any one solrhome is enough.)

./zkcli.sh -zkhost 192.168.33.10:2181,192.168.33.10:2182,192.168.33.10:2183 -cmd upconfig -confdir /usr/local/solr-cloud/solrhome01/collection1/conf -confname myconf
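
The long address list is easy to mistype, so one option is to build it once and reuse it. The sketch below does that; zk_hosts and ZK_HOSTS are names invented for the example, and the upload command is only printed, not executed, so it can be reviewed first.

```shell
#!/bin/sh
# Sketch: build the comma-separated ZooKeeper address list once and reuse it.
# zk_hosts is a hypothetical helper: first argument is the host, the rest
# are client ports.
zk_hosts() {
    host="$1"; shift
    list=""
    for port in "$@"; do
        list="${list:+$list,}$host:$port"
    done
    echo "$list"
}

ZK_HOSTS="$(zk_hosts 192.168.33.10 2181 2182 2183)"
# Print the upload command instead of running it, so it can be reviewed first.
echo ./zkcli.sh -zkhost "$ZK_HOSTS" -cmd upconfig \
    -confdir /usr/local/solr-cloud/solrhome01/collection1/conf -confname myconf
```

The same address list is the value that Solr expects in the zkHost setting later on.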

Step 7: Check whether the upload succeeded.

Use ZooKeeper's own client, zkCli.sh (not Solr's zkcli.sh).

zkCli.sh is located in /usr/local/solr-cloud/zookeeper01/bin. The uploaded files should appear under /configs/myconf.

(Note: connecting to any one ZooKeeper node is enough.)

Step 8: Tell Solr where the ZooKeeper instances are. Add the following line to each Tomcat's bin/catalina.sh, before the first place where JAVA_OPTS is used:

JAVA_OPTS="-DzkHost=192.168.33.10:2181,192.168.33.10:2182,192.168.33.10:2183"

Every node (every Tomcat instance) needs this line.

Step 9: Start each Solr instance.

Use a script to start all the Solr instances:

vi start-all.sh
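
The contents of start-all.sh are not shown here, so what follows is only one possible version. It assumes the tomcat01 to tomcat04 directories from this article and Tomcat's standard bin/startup.sh; BASE and start_all are names chosen for the sketch.

```shell
#!/bin/sh
# start-all.sh (sketch): start every Tomcat instance found under BASE.
# BASE is an illustrative variable; the article's layout is
# /usr/local/solr-cloud/tomcat01 ... tomcat04.
BASE="${BASE:-/usr/local/solr-cloud}"

start_all() {
    for n in 01 02 03 04; do
        script="$BASE/tomcat$n/bin/startup.sh"
        if [ -x "$script" ]; then
            "$script"
        else
            echo "skipping tomcat$n: $script not found or not executable" >&2
        fi
    done
}

start_all
```

After saving the file, make it executable with chmod +x start-all.sh and run it with ./start-all.sh.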

View the startup log: tail -f tomcat01/logs/catalina.out

To check whether SolrCloud was built successfully, open the Solr admin page in a browser and click Cloud to see the Solr cluster.

Step 10: Cluster sharding.

Divide the cluster into two shards, each with two replicas (numShards=2: two shards; replicationFactor=2: two copies of each shard).

Open the link below in your browser:

http://192.168.33.10:8080/solr/admin/collections?action=CREATE&name=collection2&numShards=2&replicationFactor=2

If the response reports success, the collection was created.
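
The same request can be sent from the command line instead of a browser. The sketch below builds the URL and prints a matching curl command; create_url is a hypothetical helper, and it is assumed that curl is installed on the machine.

```shell
#!/bin/sh
# Sketch: build the Collections API CREATE URL and show the matching curl call.
# create_url is a hypothetical helper; arguments are base URL, collection
# name, number of shards and replication factor.
create_url() {
    echo "$1/admin/collections?action=CREATE&name=$2&numShards=$3&replicationFactor=$4"
}

URL="$(create_url http://192.168.33.10:8080/solr collection2 2 2)"
# The URL must be quoted on the command line: an unquoted & would send the
# command to the background and drop the remaining parameters.
echo curl "'$URL'"
```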

Split result:

Step 11: Delete unused collection1

Open the link below in your browser:

http://192.168.33.10:8080/solr/admin/collections?action=DELETE&name=collection1

If the response reports success, the collection was deleted.

The effect after deletion:

Troubleshooting

Question 1: When viewing the ZooKeeper service status, an error is reported: Error contacting service. It is probably not running.

ZooKeeper appears to start without problems, but checking its status always reports an error. The start command says it has started, yet the status check says it is not running:

Solution: It may really not be running. Although the start command reports success, the process can exit immediately afterwards. Stop it and start it again.

Solutions found online:

Open zkServer.sh and find the status section, that is, find the following line:

STAT=`echo stat | nc localhost $(grep clientPort "$ZOOCFG" | sed -e 's/.*=//') 2> /dev/null| grep Mode`

Add -q 1 (the digit 1, not the letter l) between nc and localhost; if -q 1 is already there, remove it instead. If the script does not contain this line, add it.

 

Question 2: When ZooKeeper is started in the foreground (log) mode, an error is reported:

Unexpected exception, exiting abnormally

java.net.BindException: Address already in use

Reason: The port is already in use.

Solution: 1. ps aux | grep 2181 will not show which process is using the port.

2. Use the lsof -i:2181 command to see what is using the port.

3. If lsof -i:2181 fails with -bash: lsof: command not found, lsof is not installed.

4. Install it with the yum install lsof command, then run step 2 again.

5. Once you know what is using the port, stop that process or change the conflicting port.

Question 3: After starting the Solr cluster, opening the Solr link in the browser reports an error:

Reason: As in Question 1, ZooKeeper reports that it started successfully, but in fact it is not really running.

Solution: Restart it, as in Question 1.
