Solr search engine for e-commerce search (CentOS service configuration)

Preliminary understanding
1. Lucene: a set of jar packages (components). It is not a complete search engine service; it provides many abstractions conceptually, but no ready-to-run implementation.
2. Solr: a complete, enterprise-level, distributed full-text search engine built on Lucene, well suited to e-commerce platforms.
Well-known domestic e-commerce platforms use this kind of full-text search over their catalog data (often stored alongside NoSQL stores). Unlike a "people nearby" feature, where the data changes constantly, product data does not change much; it is, after all, relatively fixed.

Solr features:
Full-text search
Hit highlighting
Paged search
Dynamic clustering
Database integration
Rich text processing
High scalability

By contrast, engines aimed at real-time search (Elasticsearch, for example) handle data sets that may change at any moment, as in social apps that look up "people nearby" on demand; for that workload they can be more efficient than Solr.

1. Download Solr and build our own search platform
Download from http://archive.apache.org/dist/lucene/solr
solr-4.10.3.zip (Windows)
solr-4.10.3.tgz (Linux)

2. Overall structure of the Solr service
Reference: http://www.cnblogs.com/HD/p/3977799.html

2.1 Deploy the war package
Copy solr-4.10.3/example/webapps/solr.war into Tomcat's webapps directory and start Tomcat; when the war is unpacked you get the solr web application folder.

2.2 Configure web.xml
Go to Tomcat's webapps directory and open solr/WEB-INF/web.xml in a text editor.
Add the following code at the end of the <web-app> element:
  <env-entry>
       <env-entry-name>solr/home</env-entry-name>
       <env-entry-value>/usr/local/solr-entry/solrhome</env-entry-value>
       <env-entry-type>java.lang.String</env-entry-type>
  </env-entry>


2.3 Configure the solrhome directory
Copy all the contents of solr-4.10.3/example/solr into the solrhome configuration directory /usr/local/solr-entry/solrhome.


2.4 Copy the jar packages
Open the folder solr-4.10.3/example/lib/ext and copy all the jar packages into webapps/solr/WEB-INF/lib under Tomcat's solr project.


2.5 Run the solr service platform
Start Tomcat and open http://localhost:<tomcat-port>/solr in a browser to reach the Solr admin console.


3. Word segmentation in Solr (the IKAnalyzer Chinese tokenizer)
Word segmentation needs jar support first: the IK tokenizer plugin,
IKAnalyzer2012FF_u1.jar, an extensible tokenizer toolkit.
This toolkit exposes a Lucene-compatible interface, so it can be used for segmentation; it splits Chinese sentences intelligently, judging subject, predicate and object.

3.1 Place the IKAnalyzer2012FF_u1.jar package under Tomcat's solr project: tomcat/webapps/solr/WEB-INF/lib/IKAnalyzer2012FF_u1.jar
Note: IKAnalyzer2012_u6.jar and other versions may have compatibility problems.


3.2 Tokenizer configuration files
ext_stopword.dic    stop-word dictionary
IKAnalyzer.cfg.xml  extended configuration file
mydict.dic          custom Chinese lexicon
The three files above can be downloaded online; place them under the following path:
tomcat/webapps/solr/WEB-INF/classes/



3.3 Configure the solrhome module: modify the schema.xml file

The configuration is roughly the same as for other tokenizers. Add the following between the <types></types> configuration items:
<!-- Configure the Chinese tokenizer; this fieldType (solr.TextField) usage differs from plain Lucene -->
<fieldType name="text_ik" class="solr.TextField">     
     <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>     
</fieldType>

Finally, the text_ik type can be used on fields:
<!-- Fields configured with the Chinese tokenizer -->
<field name="hailong_name" type="text_ik" indexed="true" stored="true" />
<field name="product_name" type="text_ik" indexed="true" stored="true" />  

For comparison, Solr's default general-purpose tokenizer configuration:
    <!-- A general text field that has reasonable, generic
         cross-language defaults: it tokenizes with StandardTokenizer,
	 removes stop words from case-insensitive "stopwords.txt"
	 (empty by default), and down cases.  At query time only, it
	 also applies synonyms. -->
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
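As a rough illustration, the index-time chain above (standard tokenization, case-insensitive stop-word removal, lower-casing) behaves something like the toy Python sketch below. This is not Solr's actual StandardTokenizer, and the stop-word list here is invented:

```python
# Toy model of Solr's text_general index-time analyzer chain:
# tokenize, drop stop words case-insensitively, then lower-case.
import re

def analyze(text, stopwords=frozenset()):
    tokens = re.findall(r"\w+", text)                           # crude StandardTokenizer stand-in
    tokens = [t for t in tokens if t.lower() not in stopwords]  # StopFilterFactory (ignoreCase=true)
    return [t.lower() for t in tokens]                          # LowerCaseFilterFactory

print(analyze("The Quick Brown Fox", stopwords={"the"}))
# -> ['quick', 'brown', 'fox']
```

Note that the query-time chain adds a SynonymFilterFactory on top of the same steps, which is why index and query analyzers are configured separately.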


Intelligent segmentation judges the subject, predicate and object of a sentence much as a human would.
我喜欢白富美 ("I like a bai-fu-mei") => 我 | 喜欢 | 白 | 富 | 美
Update the lexicon frequently, so that new words such as 白富美 come out as a single token.
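Dictionary-based segmenters like IK can be approximated by forward maximum matching: at each position, take the longest lexicon word that matches. A toy sketch (the lexicon here is invented, and IK's real algorithm is considerably more sophisticated), which also shows why the lexicon must be kept up to date:

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy longest-match segmentation against a word lexicon."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon or j == i + 1:  # fall back to a single char
                tokens.append(text[i:j])
                i = j
                break
    return tokens

print(forward_max_match("我喜欢白富美", {"我", "喜欢", "白富美"}))
# -> ['我', '喜欢', '白富美']
print(forward_max_match("我喜欢白富美", {"我", "喜欢"}))
# without 白富美 in the lexicon -> ['我', '喜欢', '白', '富', '美']
```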

Big data and cloud computing: insight into opportunity (a well-known e-commerce platform can analyze everyone's preferences, as long as they are its members)
Big data enterprise index (the platform owns the data directly)

4. Solr and the e-commerce platform: using the solr full-text retrieval module

Brief introduction
a. Simple retrieval
Page retrieval: searches issued through the Java solrj client
b. Complex retrieval

Simple retrieval: the Dataimport function indexes all of the e-commerce data from our real database into solr.
Dataimport maps the various types of data into our solr service.
The index holds the id (which points directly at the data), so queries are very fast; the full
data in the database is not stored in the solr service.
4.1 Create a mapping
4.1.1 Add the jar package to Tomcat's solr project
Introduce solr-4.10.3/dist/solr-dataimporthandler-4.10.3.jar into the solr service:

add it to apache-tomcat/webapps/solr/WEB-INF/lib

4.1.2 Configure solrconfig.xml (configure dataimport handler)
<!-- add by hailong 20170308 -->
   <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
	<lst name="defaults">
	     <!-- <str name="config">/usr/local/solr-entry/solrhome/collection1/conf/mysql-data-config.xml</str> -->
		 <!-- Put it in the same directory -->
        <str name="config">mysql-data-config.xml</str>
    </lst>
  </requestHandler>

The configuration above is added to /usr/local/solr-entry/solrhome/collection1/conf/solrconfig.xml.
Add it wherever suits your habits, somewhere easy to find; I put it after the system's last built-in handler, /replication.
Note: the data import also requires the mysql-connector-java-5.1.36.jar driver package; place it in WEB-INF/lib as well.


4.1.3 Add the mysql-data-config.xml file (configure the data source)
Add the mysql-data-config.xml file, which maps Solr fields to the database:
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
    <dataSource type="JdbcDataSource"
		driver="com.mysql.jdbc.Driver"
		url="jdbc:mysql://192.168.92.129:3306/solr_test"
		user="root"
		password="123456"/>
    <document>
        <entity name="product"  transformer="HTMLStripTransformer"
               query="SELECT pid, product_name, catalog_name, product_price, product_description, product_picture FROM product">
                <field column="pid" name="id" />
                <field column="product_name" name="product_name" />
		<field column="catalog_name" name="catalog_name" />
		<field column="product_price" name="product_price" />
		<field column="product_description" name="description" />
                <field column="product_picture" name="product_picture" />
        </entity>
    </document>
</dataConfig>

dataSource node configuration:
name: the name of the dataSource; the configuration file can hold multiple data sources, distinguished by name.
type: data source type, e.g. JdbcDataSource
driver: database driver class; put the driver jar in the lib directory in advance
url: database connection url
<field> configuration:
column: column name in the database query
name: field name in schema.xml

(Screenshots in the original post show the file location and the column-to-name correspondence.)

Note: product_name, as the search field, must be set to type="text_ik"; otherwise segmented (word-level) queries will not work.
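Conceptually, what the import does with each row is a column-to-field renaming, as declared by the <field column=... name=...> entries. A toy Python sketch (the real DataImportHandler also applies transformers such as HTMLStripTransformer, which this omits):

```python
# Column -> Solr field mapping, mirroring the <field> entries in mysql-data-config.xml.
FIELD_MAP = {
    "pid": "id",
    "product_name": "product_name",
    "catalog_name": "catalog_name",
    "product_price": "product_price",
    "product_description": "description",
    "product_picture": "product_picture",
}

def row_to_doc(row):
    """Turn one SQL result row (dict) into a Solr document (dict)."""
    return {FIELD_MAP[col]: val for col, val in row.items() if col in FIELD_MAP}

print(row_to_doc({"pid": 1, "product_name": "phone", "product_description": "<b>nice</b>"}))
# -> {'id': 1, 'product_name': 'phone', 'description': '<b>nice</b>'}
```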

4.1.4 Modify the schema.xml file (configure the fields)
Add <field> definitions whose name attributes correspond to the name values in mysql-data-config.xml.

This completes the configuration.

4.1.5 Build the index
Restart solr. Once it starts successfully, run a full import to build an index of the MySQL data as Solr doc files:
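The full import is triggered by an HTTP GET to the /dataimport handler configured earlier (it can also be triggered from the Dataimport tab of the admin console). A small helper to build that URL; command, clean and commit are standard DataImportHandler parameters, while the base URL and core name below are assumptions for your deployment:

```python
from urllib.parse import urlencode

def dataimport_url(base, command="full-import", clean=True, commit=True):
    """Build the URL that triggers a DataImportHandler run."""
    params = {"command": command,
              "clean": str(clean).lower(),    # wipe the existing index first
              "commit": str(commit).lower()}  # commit when the import finishes
    return f"{base}/dataimport?{urlencode(params)}"

print(dataimport_url("http://localhost:8080/solr/collection1"))
# -> http://localhost:8080/solr/collection1/dataimport?command=full-import&clean=true&commit=true
```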

Query the documents:

the MySQL data comes back.

With the above configuration complete, you can combine it with Spring/Java code so that the e-commerce PLP (product list page) loads from the solr index.

4.1.6 Configure solr logging
When something goes wrong, Tomcat prints no solr logs, which makes the cause hard to track down. Configure solr logging so problems can be diagnosed:
add a log4j properties file under Tomcat's solr project.

#  Logging level
log4j.rootLogger=WARN, file

#- size rotation with log cleanup.
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9

#- File to log to and log format
log4j.appender.file.File=logs/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m\n


Check that the logging jar packages imported earlier are present under tomcat/webapps/solr/WEB-INF/lib.

After running, the log file tomcat/bin/logs/solr.log appears.


Possible errors:
org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: id

When Solr builds the index, if a submitted doc carries no id field, Solr reports:
org.apache.solr.common.SolrException: Document [null] missing required field: id

This is because <uniqueKey>id</uniqueKey> is defined in Solr's schema.xml, and id is the unique key by default.
If your index field is not called id, you can change this:
<uniqueKey>kwid</uniqueKey>

and add required="true" to the field that now serves as the unique key, for example:
<field name="pid" type="string" indexed="true" stored="true" required="true"/>
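The error above simply means every submitted document must carry the uniqueKey field. A minimal client-side guard might look like this (illustrative only; the field name is whatever your schema's <uniqueKey> declares):

```python
def check_unique_key(doc, key="id"):
    """Raise before submitting a doc that would fail Solr's uniqueKey check."""
    if key not in doc or doc[key] in (None, ""):
        raise ValueError(f"Document is missing mandatory uniqueKey field: {key}")
    return doc

check_unique_key({"id": "p-1", "product_name": "phone"})  # passes silently
```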



Other details:
combined fields
multi-field combined queries
fine-grained control
hit highlighting
dynamic clustering
rich text processing
data integration

General idea for a web application using solr: the Java server maps to the solr service (HttpSolrServer); solr's web.xml maps to the solrhome entity data (the entity configuration module); the solr entity data maps to the database.
Application in e-commerce: paginate, return the PLP object list, and set it to highlight matches.
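For the pagination-plus-highlighting query the PLP needs, the request parameters can be assembled like this. q, start, rows, hl and hl.fl are standard Solr query parameters; the base URL and field names below are assumptions for your deployment (the solrj client builds the same request via SolrQuery):

```python
from urllib.parse import urlencode

def select_url(base, q, start=0, rows=10, hl_field=None):
    """Build a paginated /select query URL, optionally with highlighting."""
    params = {"q": q, "start": start, "rows": rows, "wt": "json"}
    if hl_field:
        params.update({"hl": "true", "hl.fl": hl_field,
                       "hl.simple.pre": "<em>", "hl.simple.post": "</em>"})
    return f"{base}/select?{urlencode(params)}"

# Second page of 20 products matching a product-name search, highlighted:
url = select_url("http://localhost:8080/solr/collection1",
                 "product_name:手机", start=20, rows=20, hl_field="product_name")
print(url)
```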
Reference:
Attribute Definition http://www.cnblogs.com/rainbowzc/p/3695058.html
Java Code for Search Engine Implementation with Solr Service
http://572327713.iteye.com/blog/2360936
http://www.cnblogs.com/easong/p/6258280.html
http://www.w2bc.com/article/204862
