Lucene study notes (1)

Table of contents

Introduction to Lucene

Introduction to Lucene

Lucene uses

Lucene applicable scenarios

Features of Lucene

1. Stable and high index performance

2. Efficient, accurate and high-performance search algorithm

3. Cross-platform

Lucene architecture

Lucene integration

Lucene version selected

System Requirements

Integration: Introduce the Lucene core JAR into your application

Integration: Introduce the required Lucene module JARs into your application

Lucene module description

First introduce the core module of Lucene

Understand the composition of core modules


Introduction to Lucene

Introduction to Lucene

Lucene is one of the most widely used open-source full-text search toolkits written in Java. It provides a complete query engine and indexing engine, plus part of a text analysis (word segmentation) engine, with built-in support for two Western languages, English and German. Its purpose is to give software developers a simple, easy-to-use toolkit for adding full-text search to a target system, or for building a complete full-text search engine on top of it. Lucene is a project of the Apache Software Foundation; website: http://lucene.apache.org/

Lucene uses

Lucene serves as a toolkit in two ways: developers can embed it to add full-text search to an existing system, or use it as the foundation of a complete full-text search engine.

Lucene applicable scenarios

  • Adding full-text search over database data inside an application.
  • Developing standalone search engine services and systems.

Features of Lucene

1. Stable and high index performance

  • Can index more than 150 GB of data per hour on modern hardware.
  • Small memory requirements: only 1 MB of heap memory is required.
  • Incremental indexing is as fast as batch indexing.
  • The index is roughly 20% to 30% of the size of the indexed text.

2. Efficient, accurate and high-performance search algorithm

  • Ranked searching: the best results are returned first.
  • Powerful query types: phrase queries, wildcard queries, proximity queries, range queries, and more.
  • Fielded searching (for example title, author, content).
  • Sorting by any field; results from searching multiple indexes can be merged.
  • Updates and searches can run at the same time.
  • Highlighting, joins, and result grouping.
  • Fast searching.
  • Pluggable ranking models; the built-in options include the Vector Space Model and BM25.
  • Configurable storage engine (codecs).
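
As an illustration of the query types and fielded search listed above, the following sketch builds a few queries with classes from the org.apache.lucene.search package. It assumes lucene-core 7.3.0 is on the classpath. The class name QueryTypesSketch and the field names and terms (title, content, author, "luc*") are made up for the example.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.search.WildcardQuery;

/** Hypothetical helper that only constructs queries; it does not run them. */
public class QueryTypesSketch {

    /** Phrase query: "full" immediately followed by "text" in the content field. */
    public static Query phrase() {
        return new PhraseQuery("content", "full", "text");
    }

    /** Proximity query: the same phrase API with a slop of 2 positions. */
    public static Query proximity() {
        return new PhraseQuery(2, "content", "full", "search");
    }

    /** Wildcard query: any title term starting with "luc". */
    public static Query wildcard() {
        return new WildcardQuery(new Term("title", "luc*"));
    }

    /** Range query: author terms from "a" to "m", both ends inclusive. */
    public static Query range() {
        return TermRangeQuery.newStringRange("author", "a", "m", true, true);
    }

    /** Fielded search combined with a boolean query: both clauses must match. */
    public static Query combined() {
        return new BooleanQuery.Builder()
                .add(new TermQuery(new Term("title", "lucene")), BooleanClause.Occur.MUST)
                .add(range(), BooleanClause.Occur.MUST)
                .build();
    }

    public static void main(String[] args) {
        // Query.toString() gives a readable form of each query.
        System.out.println(phrase());
        System.out.println(proximity());
        System.out.println(wildcard());
        System.out.println(range());
        System.out.println(combined());
    }
}
```

Each query names the field it applies to, which is how fielded search is expressed. Any of these objects could be passed to an IndexSearcher once an index exists.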

3. Cross-platform

  • Written in pure Java.
  • Open source under the Apache License, so it can be used in both commercial and open-source projects.
  • Implementations of Lucene are available in other programming languages (such as C, C++, and Python), not only Java.

Lucene architecture

  1. Data collection
  2. Index creation
  3. Index storage
  4. Search (using the index)
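
The four stages can be made concrete with a toy inverted index in plain Java. This is only a sketch of the idea, not how Lucene is implemented: the class ToyInvertedIndex and its methods are hypothetical, the "analysis" is a simple lowercase split on non-word characters, and the index lives in memory instead of on disk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy illustration of collect -> index -> store -> search. Not Lucene's API. */
public class ToyInvertedIndex {
    // Stored documents, so that hits can be shown to the user.
    private final List<String> documents = new ArrayList<>();
    // Index storage: term -> sorted ids of the documents that contain it.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Index creation: split the text into lowercase terms and record the document id under each. */
    public int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    /** Search: look the term up in the index instead of scanning every document. */
    public List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<Integer>() : new ArrayList<Integer>(ids);
    }

    /** Returns the original text of a document. */
    public String get(int docId) {
        return documents.get(docId);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        // Data collection: a real system would read files, database rows, or web pages here.
        index.add("Lucene is a full-text search library");
        index.add("A database stores rows");
        index.add("Full-text search over database rows");
        for (int id : index.search("search")) {
            System.out.println(id + ": " + index.get(id));
        }
    }
}
```

The point of the sketch is the division of labor: the cost of analyzing text is paid once when a document is added, and a search is then a lookup in the stored term map.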

Lucene integration

Lucene version selected

These notes use version 7.3.0, the latest release at the time of writing: https://lucene.apache.org/

System Requirements

JDK 1.8 or later

Integration: Introduce the Lucene core JAR into your application

Method 1: Download the zip from the official website, unzip it, and copy the core JAR into your project.

Method 2: Declare the dependency in Maven.

Integration: Introduce the required Lucene module JARs into your application

Method 1: Download the zip from the official website, unzip it, and copy the JARs of the modules you need into your project.

Method 2: Declare the dependencies in Maven.

Lucene module description

  • core: the Lucene core library, covering analysis (word segmentation), indexing, and querying
  • analyzers-*: analyzers and tokenizers
  • facet: faceted indexing and search capabilities
  • grouping: collectors for grouping search results
  • highlighter: highlights search keywords in results
  • join: index-time and query-time joins for normalized content
  • queries: filters and queries that add to core Lucene
  • queryparser: query parsers and parsing framework
  • spatial: geospatial search support
  • suggest: auto-suggest and spellchecking support

First introduce the core module of Lucene

<!-- Lucene core module -->
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>7.3.0</version>
</dependency>
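
With the dependency in place, a quick way to check the integration is to index a couple of documents in memory and search them. The following is a minimal sketch, assuming lucene-core 7.3.0, where StandardAnalyzer and RAMDirectory are both part of the core module (later releases deprecate RAMDirectory). The class name LuceneCoreSmokeTest, the title field, and the sample titles are invented for the example.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical smoke test: index titles in memory, then search one term. */
public class LuceneCoreSmokeTest {

    /** Returns the stored titles that contain the given (already lowercase) term. */
    public static List<String> indexAndSearch(String[] titles, String term) {
        List<String> matches = new ArrayList<>();
        try (Directory dir = new RAMDirectory()) {
            // Create the index: StandardAnalyzer splits each title into lowercase terms.
            try (IndexWriter writer =
                         new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                for (String title : titles) {
                    Document doc = new Document();
                    doc.add(new TextField("title", title, Field.Store.YES));
                    writer.addDocument(doc);
                }
            } // closing the writer commits the documents

            // Search the index: TermQuery is not analyzed, so the term must be lowercase.
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs hits = searcher.search(new TermQuery(new Term("title", term)), 10);
                for (ScoreDoc hit : hits.scoreDocs) {
                    matches.add(searcher.doc(hit.doc).get("title"));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches;
    }

    public static void main(String[] args) {
        String[] titles = {"Lucene in Action", "Learning Java"};
        System.out.println(indexAndSearch(titles, "lucene"));
    }
}
```

If the class compiles and main prints the matching title, the core JAR is wired in correctly. For a persistent index, an on-disk directory such as FSDirectory would replace RAMDirectory.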

 

Understand the composition of core modules

 


Origin blog.csdn.net/qq_34050399/article/details/112369053