[Solr] Chinese word segmentation configuration

Tip: Make sure the core has been created before configuring Chinese word segmentation. If it has not been created yet, you can create one with: solr create -c "your_core_name"

Effect preview before word segmentation: (screenshot not available)

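  1. Download the tokenizer:
    Download address: https://mvnrepository.com/artifact/com.github.magese/ik-analyzer/8.3.0
    or pull it in via Maven (the link above points to 8.3.0 and the snippet below to 8.4.0; use whichever release matches your Solr version):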

    <dependency>
        <groupId>com.github.magese</groupId>
        <artifactId>ik-analyzer</artifactId>
        <version>8.4.0</version>
    </dependency>
    
  2. Copy the jar package
    Put the downloaded jar package into the following directory: server\solr-webapp\webapp\WEB-INF\lib

  3. Modify the schema
    Before Solr 6.6 the schema file is schema.xml; from then on it is managed-schema. It is located in the server\solr\<your core folder>\conf folder, for example: server\solr\test001\conf
    Add the following content:

        <!-- IK tokenizer -->
        <fieldType name="text_ik" class="solr.TextField">
            <analyzer type="index">
                <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false" conf="ik.conf"/>
                <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
            <analyzer type="query">
                <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true" conf="ik.conf"/>
                <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
        </fieldType>
    
  4. Restart and verify
    Restart the Solr service: solr.cmd restart -p 8983
    (the original post also showed an alternative way to restart in a screenshot, which is not available)
    Then open the service address http://localhost:8983/ and check there that text is now split into Chinese words (the screenshot of the verification steps is not available).
    Done.
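
Besides checking in the admin page, the tokenizer can probably also be verified from a script. The following is a minimal Python sketch, not the author's method, that asks Solr's field analysis handler (/analysis/field) how the text_ik type splits a string. It assumes Solr is listening on localhost:8983, that the core is called test001 as in the example path above, and that the handler is reachable at its default path. The helper names build_analysis_url and analyze are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_analysis_url(base_url, core, text, field_type="text_ik"):
    """Build the URL for Solr's field analysis handler.

    base_url and core are assumptions about the local setup,
    e.g. "http://localhost:8983" and "test001".
    """
    params = urllib.parse.urlencode({
        "analysis.fieldtype": field_type,  # the fieldType added to the schema
        "analysis.fieldvalue": text,       # text to run through the index analyzer
        "wt": "json",
    })
    return f"{base_url.rstrip('/')}/solr/{core}/analysis/field?{params}"


def analyze(base_url, core, text, field_type="text_ik"):
    """Send the request to a running Solr instance and return the parsed JSON."""
    url = build_analysis_url(base_url, core, text, field_type)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Example call (needs a running Solr with the IK jar and the schema change in place):
# print(json.dumps(analyze("http://localhost:8983", "test001", "中文分词"),
#                  ensure_ascii=False, indent=2))
```

If the configuration took effect, the JSON reply should list the tokens produced by IKTokenizerFactory for the given text. If the class cannot be loaded, Solr normally reports an error instead, which usually means the jar is not in WEB-INF\lib or the service was not restarted.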

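Defining the text_ik type does not by itself apply it to any data; a field has to use that type. As a hedged sketch, assuming the core uses a managed (mutable) schema and has already been restarted with the new type, a field could be added through Solr's Schema API instead of editing the file by hand. The field name content, the core name test001, and the helper names below are hypothetical and only serve the illustration.

```python
import json
import urllib.request


def build_add_field_request(base_url, core, field_name, field_type="text_ik"):
    """Return (url, body_bytes) for a Schema API add-field command."""
    url = f"{base_url.rstrip('/')}/solr/{core}/schema"
    command = {
        "add-field": {
            "name": field_name,   # hypothetical field name, e.g. "content"
            "type": field_type,   # the IK-backed fieldType from step 3
            "indexed": True,
            "stored": True,
        }
    }
    return url, json.dumps(command).encode("utf-8")


def add_field(base_url, core, field_name, field_type="text_ik"):
    """POST the command to a running Solr instance and return the parsed JSON reply."""
    url, body = build_add_field_request(base_url, core, field_name, field_type)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example call (needs a running Solr; values are assumptions about the local setup):
# print(add_field("http://localhost:8983", "test001", "content"))
```

Documents indexed into that field afterwards would be tokenized with useSmart="false" at index time and useSmart="true" at query time, as configured in the fieldType.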

This article is referenced from: Introduction to the basics of Solr

Origin blog.csdn.net/ruisasaki/article/details/131322129