Sphinx Learning Notes: Installation Guide

One, Introduction to Sphinx

Sphinx is a full-text search engine developed by Andrew Aksyonoff of Russia. It is intended to give other applications fast, low-footprint full-text search with highly relevant results. Sphinx integrates easily with SQL databases and scripting languages. It currently has built-in support for MySQL and PostgreSQL data sources, and it can also read XML data in a specific format from standard input.
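As a rough illustration of the XML input mentioned above, here is a minimal sketch of a document stream in Sphinx's xmlpipe2 format, printed by a small shell function. The function name, the field names (title, content), the attribute name (published), and the sample values are all made up for illustration. In sphinx.conf such a stream would normally be attached to a source with `type = xmlpipe2` and an `xmlpipe_command` that prints the XML.

```shell
#!/bin/sh
# Print a tiny xmlpipe2 document stream on standard output.
# Field and attribute names below are hypothetical examples.
emit_xmlpipe2_sample() {
    cat <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset>
<sphinx:schema>
<sphinx:field name="title"/>
<sphinx:field name="content"/>
<sphinx:attr name="published" type="timestamp"/>
</sphinx:schema>
<sphinx:document id="1">
<title>Hello Sphinx</title>
<content>A first document for full-text indexing.</content>
<published>1262304000</published>
</sphinx:document>
</sphinx:docset>
EOF
}

emit_xmlpipe2_sample
```

Each `sphinx:document` carries a unique numeric id, the full-text fields declared in the schema, and any attributes.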

Sphinx has the following features:

a) fast indexing (peak performance of up to 10 MB/sec on a contemporary CPU);

b) fast searching (on 2 to 4 GB of text data, the average response time per search is under 0.1 seconds);

c) high scalability (known to handle more than 100 GB of text data, and a single-CPU system can handle 100 M documents);

d) a good relevance algorithm, with composite ranking based on phrase proximity and statistics (BM25);

e) support for distributed search;

f) support for phrase search;

g) document summary (excerpt) generation;

h) the ability to provide search services as a MySQL storage engine;

i) multiple search modes, including boolean, phrase, and word proximity;

j) multiple full-text fields per document (up to 32);

k) multiple additional attributes per document (for example, grouping information and timestamps);

l) support for word breaking.

Although MySQL's MyISAM engine provides full-text indexing, its performance is not impressive. A database is, after all, not well suited to this kind of work, so it makes sense to hand it to a more suitable program and relieve pressure on the database. Using Sphinx as the full-text indexing tool for MySQL is therefore a good choice. This week I mainly studied how to use the tool and recorded the general learning process as notes, which I hope will also help others who are learning it.

Two, Sphinx installation

Sphinx can be used with MySQL in two ways:

  1. Through API calls, for example querying with the API functions or methods for PHP, Java, and other languages. The advantages are that MySQL does not need to be recompiled, the server processes are loosely coupled, and the program can make calls flexibly and conveniently. The drawback is that an existing search program needs part of its code modified. Recommended for programmers.
  2. Through the plug-in (SphinxSE), which compiles Sphinx into MySQL as a plug-in and retrieves results with specific SQL statements. Its advantages are that results are easy to combine on the SQL side and can be returned directly to the client without a second query, and only the corresponding SQL in the program needs to change. However, this is inconvenient for programs developed with a framework, such as those using an ORM. MySQL also has to be recompiled, and MySQL 5.1 or later is required for pluggable storage engine support.
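To give an idea of what the "specific SQL statement" in the second way looks like, here is a minimal sketch that only prints a SphinxSE-style query and does not run it. The function name and the table name `t1` are hypothetical. It assumes a search table already created with `ENGINE=SPHINX`, and the quote escaping is deliberately simplistic (it does not handle backslashes or semicolons inside the keywords).

```shell
#!/bin/sh
# Build the SELECT statement that a SphinxSE search table expects:
# the search keywords and options travel inside the "query" column filter.
sphinxse_select() {
    table=$1
    keywords=$2
    mode=${3:-all}    # e.g. all, any, phrase, boolean, extended
    # Double any single quote so the keywords stay inside the SQL string.
    escaped=$(printf '%s' "$keywords" | sed "s/'/''/g")
    printf "SELECT * FROM %s WHERE query='%s;mode=%s';\n" "$table" "$escaped" "$mode"
}

sphinxse_select t1 "test it" any
# prints: SELECT * FROM t1 WHERE query='test it;mode=any';
```

With the API approach, by contrast, the application talks to the search daemon directly and then fetches the matching rows from MySQL itself.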

This article mainly covers the first way, through API calls. Sphinx is installed as follows:

# Download the latest stable version

wget http://www.sphinxsearch.com/downloads/sphinx-0.9.9.tar.gz

tar xzvf sphinx-0.9.9.tar.gz

cd sphinx-0.9.9

./configure --prefix=/usr/local/sphinx/   --with-mysql  --enable-id64

make

make install

Note: Sphinx installed this way does not support Chinese word segmentation.
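After `make install`, a quick check that the main command-line tools landed under the prefix can save some confusion later. The following is only a sketch: the function name is made up, and it assumes the `--prefix=/usr/local/sphinx/` value used above and the usual `indexer`, `searchd`, and `search` binaries.

```shell
#!/bin/sh
# Report which of the expected Sphinx binaries exist under a prefix.
# Returns 0 when all are present and executable, 1 otherwise.
check_sphinx_install() {
    prefix=${1:-/usr/local/sphinx}
    missing=0
    for tool in indexer searchd search; do
        if [ -x "$prefix/bin/$tool" ]; then
            echo "found:   $prefix/bin/$tool"
        else
            echo "missing: $prefix/bin/$tool"
            missing=1
        fi
    done
    return "$missing"
}

check_sphinx_install /usr/local/sphinx || echo "Sphinx installation looks incomplete"
```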

Three, Chinese word segmentation for Sphinx

Full-text search for Chinese differs from that for English and other Latin-script languages: the latter break words on special characters such as spaces, while Chinese must be segmented according to word meaning. There are two Chinese word segmentation plug-ins:

  1. Coreseek

Coreseek is currently the most widely used Chinese full-text search package for Sphinx. It provides LibMMSeg, a Chinese word segmentation library designed for Sphinx, and is developed on top of Sphinx.

  2. sfc (sphinx-for-chinese)

sfc (sphinx-for-chinese) is another Chinese word segmentation plug-in, contributed by a community user known as happy. Its Chinese dictionary uses xdict.
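The difference between space-delimited languages and Chinese described at the start of this section can be seen with a tiny experiment. This sketch (the function name is made up) counts the tokens produced by splitting on spaces only, which is roughly what a plain tokenizer does for English.

```shell
#!/bin/sh
# Count the tokens obtained by splitting a string on spaces.
# The body runs in a subshell so IFS and globbing changes do not leak out.
count_space_tokens() (
    IFS=' '
    set -f            # do not expand wildcards in the text
    set -- $1         # unquoted on purpose: let the shell split on IFS
    echo "$#"
)

count_space_tokens "full text search"    # prints 3
count_space_tokens "全文检索引擎"          # prints 1: no spaces, so no word boundaries
```

The Chinese phrase comes back as one token even though it contains several words, which is why a dictionary-based segmenter such as mmseg is needed.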

This article describes how to install Coreseek.

Four, Installing Coreseek (Sphinx with Chinese search support)

  1. Upgrade autoconf

Coreseek requires autoconf 2.64 or later, so autoconf must be upgraded first; otherwise the build will report errors. Download autoconf-2.64.tar.bz2 from http://download.chinaunix.net/download.php?id=29328&ResourceID=648 and install it as follows:

tar -jxvf autoconf-2.64.tar.bz2

cd autoconf-2.64

./configure

make

make install
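To confirm that the autoconf now on the PATH meets the 2.64 requirement, a small version comparison can help. This is a sketch that compares only the major and minor numbers. The helper name is made up, and it assumes the first line of `autoconf --version` ends with the version number.

```shell
#!/bin/sh
# Succeed when version $1 is at least version $2 (major.minor only).
version_ge() {
    have_major=${1%%.*}
    have_minor=${1#*.}; have_minor=${have_minor%%.*}
    need_major=${2%%.*}
    need_minor=${2#*.}; need_minor=${need_minor%%.*}
    if [ "$have_major" -gt "$need_major" ]; then
        return 0
    fi
    [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]
}

if command -v autoconf >/dev/null 2>&1; then
    # Keep only the last word of the first line, e.g. "2.64".
    installed=$(autoconf --version | sed -n '1s/.* //p')
    if version_ge "$installed" 2.64; then
        echo "autoconf $installed is new enough"
    else
        echo "autoconf $installed is older than 2.64"
    fi
else
    echo "autoconf not found on PATH"
fi
```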

  2. Download coreseek

The new version of coreseek bundles the dictionary and the Sphinx source in a single package, so only the coreseek package needs to be downloaded.

  3. Install mmseg (the segmentation dictionary used by coreseek)

tar xzvf coreseek-3.2.14.tar.gz

cd coreseek-3.2.14    # assumes the archive unpacks into this directory

cd mmseg-3.2.14

./bootstrap    # warnings in the output can be ignored; any errors must be resolved

./configure --prefix=/usr/local/mmseg3

make && make install

cd ..

  4. Install coreseek (sphinx)

cd csft-3.2.14

sh buildconf.sh    # warnings in the output can be ignored; any errors must be resolved

./configure --prefix=/usr/local/coreseek  --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql

make && make install

cd ..

  5. Test mmseg word segmentation and coreseek search

Note: the locale should be set to zh_CN.UTF-8 beforehand so that Chinese text displays correctly. On my system the en_US.UTF-8 locale also worked.
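A quick way to see whether the current shell is using a UTF-8 locale is to look at the locale environment variables. The helper below is a sketch with a made-up name. It only inspects the variable values (LC_ALL first, then LC_CTYPE, then LANG) and does not verify that the locale is actually installed.

```shell
#!/bin/sh
# Succeed when a locale name such as zh_CN.UTF-8 declares a UTF-8 charset.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]8) return 0 ;;
        *.[Uu][Tt][Ff]-8@*|*.[Uu][Tt][Ff]8@*) return 0 ;;
        *) return 1 ;;
    esac
}

# LC_ALL overrides LC_CTYPE, which overrides LANG.
current_locale=${LC_ALL:-${LC_CTYPE:-${LANG:-}}}
if is_utf8_locale "$current_locale"; then
    echo "locale '$current_locale' is UTF-8"
else
    echo "locale '$current_locale' is not UTF-8; try: export LANG=zh_CN.UTF-8"
fi
```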

cd testpack

cat var/test/test.xml    # the Chinese text should display correctly

/usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml

/usr/local/coreseek/bin/indexer -c etc/csft.conf --all

/usr/local/coreseek/bin/search -c etc/csft.conf 网络搜索    # the query means "network search"

At this point the search should return correct results, including word statistics like these:

words:

1. '网络': 1 documents, 1 hits

2. '搜索': 2 documents, 5 hits
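If the test is repeated often, the per-word statistics can be pulled out of the search output with a small filter. This is a sketch with a made-up function name. It assumes the statistics lines have exactly the shape shown above (number, quoted word, document count, hit count) and ignores every other line.

```shell
#!/bin/sh
# Read search output on stdin and print "word documents hits" per keyword.
parse_word_stats() {
    sed -n "s/^[0-9][0-9]*\. '\(.*\)': \([0-9][0-9]*\) documents, \([0-9][0-9]*\) hits\$/\1 \2 \3/p"
}

# Example with a line in the same format as the output above.
printf "words:\n1. 'search': 2 documents, 5 hits\n" | parse_word_stats
# prints: search 2 5
```

In practice the output of the `search` command would be piped into `parse_word_stats` instead of the `printf` sample.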

  6. Generate the mmseg thesaurus and configuration files

The new version generates these automatically.

Five, References

Sphinx Chinese guide

http://www.sphinxsearch.org/sphinx-tutorial

Applying Chinese word segmentation in Sphinx

http://www.sphinxsearch.org/archives/82

Sphinx 0.9.8 Reference Manual

Installing CoreSeek on BSD / Linux

http://www.coreseek.cn/products/products-install/install_on_bsd_linux/

Reproduced from: https://www.cnblogs.com/guolanzhu/p/4304550.html

Origin blog.csdn.net/weixin_34327761/article/details/94192962