tantivy vs. lucene: feature and write performance comparison

Hardware overview: CPU: 24, memory: 20G, disk: 10 * 2.7T.

Write performance (without adding IP geolocation information):

Write performance comparison

Metric                          tantivy                      lucene
Speed                           155272                       151633
Commit time (s), 500 * 1000     6-19                         3-4
Bulk time (s), 1000 dns         0.01-0.06                    0.2-0.3
Bulk time (s), 1000 tcpflow     0.1-0.2                      1.3-1.4
Bulk time (s), 1000 weblog      0.1-0.2                      1.3-1.4
CPU usage                       40-80 us, 5-15               60-80 us, 5sy
Disk usage                      20-90                        20-90
Data volume (tcpflow)           4_000_000 records, 870M      4_500_000 records, 1.3G
Thread configuration            10*2+10*2*3                  10*5

Features:

Query: supported query types.

Query                 tantivy   lucene
TermQuery             Y         Y
BooleanQuery          Y         Y
WildcardQuery         Y         Y
PhraseQuery           Y         Y
RangeQuery            Y         Y
FuzzyQuery            Y         Y
RegexpQuery           Y         Y
ConstantScoreQuery    Y         Y
PrefixQuery           N         Y
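
The only query type missing on the tantivy side is PrefixQuery, while RegexpQuery is available. As a rough illustration of why that gap is usually easy to work around, the sketch below shows that a prefix match over a term dictionary gives the same result as a regular expression of the form `prefix.*`. It is plain Python over an in-memory list, not tantivy or lucene code; the function names and sample terms are made up for the example, and it assumes the engine's regex has to match the whole term.

```python
import re

# Hypothetical term dictionary of one indexed field.
terms = ["tcp", "tcpflow", "tcpdump", "udp", "weblog"]

def prefix_query(term_dict, prefix):
    """Return every term that starts with the given prefix."""
    return [t for t in term_dict if t.startswith(prefix)]

def regexp_query(term_dict, pattern):
    """Return every term that the pattern matches in full."""
    compiled = re.compile(pattern, re.DOTALL)
    return [t for t in term_dict if compiled.fullmatch(t)]

def prefix_as_regexp(prefix):
    """Rewrite a prefix into an equivalent whole-term regular expression."""
    # re.escape keeps characters such as '.' or '*' in the prefix literal.
    return re.escape(prefix) + ".*"

by_prefix = prefix_query(terms, "tcp")
by_regexp = regexp_query(terms, prefix_as_regexp("tcp"))
```

Both `by_prefix` and `by_regexp` contain `tcp`, `tcpflow`, and `tcpdump`. Regex syntax differs between engines, so the escaping step would need to follow the target engine's rules.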

 

Collector: collects information from the documents matched by a query; used for sorting, filtering, and aggregation.

Collector               tantivy   lucene
TopCollector            Y         Y
TimeLimitingCollector   N         Y
CountCollector          Y         N
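
To make the collector idea concrete, here is a minimal sketch of a top-k collector and a counting collector. It is not the tantivy or lucene API: the class names only mirror the table, and the `collect(doc_id, score)` method and the `run_query` helper are assumptions made for the example. Each matching document is passed to every collector once, and each collector keeps only what it needs.

```python
import heapq

class CountCollector:
    """Counts matching documents without keeping them."""
    def __init__(self):
        self.count = 0

    def collect(self, doc_id, score):
        self.count += 1

class TopCollector:
    """Keeps the k highest-scoring documents in a min-heap."""
    def __init__(self, k):
        self.k = k
        self._heap = []  # (score, doc_id); the lowest kept score is at index 0

    def collect(self, doc_id, score):
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, (score, doc_id))
        elif score > self._heap[0][0]:
            # The new hit beats the weakest kept hit, so swap it in.
            heapq.heapreplace(self._heap, (score, doc_id))

    def results(self):
        """Return (doc_id, score) pairs, best score first."""
        return [(doc_id, score) for score, doc_id in sorted(self._heap, reverse=True)]

def run_query(hits, collectors):
    """Feed every (doc_id, score) hit to every collector."""
    for doc_id, score in hits:
        for collector in collectors:
            collector.collect(doc_id, score)

# Hypothetical hits produced by some query.
hits = [(0, 1.2), (1, 0.4), (2, 3.1), (3, 2.0)]
top = TopCollector(k=2)
count = CountCollector()
run_query(hits, [top, count])
```

A time-limiting collector would follow the same shape: it would wrap another collector and stop forwarding hits once a deadline has passed.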

 

Docvalues / fastfield: column-oriented storage that returns a field's value for a given doc; used for sorting, filtering, and aggregation.

 

Docvalues/fastfield
tantivy: fastfield (currently supports only numeric types)
lucene: Docvalues
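
The sketch below illustrates what such a per-document column makes possible. It is a simplified model, not how either library stores the data: one numeric field is held in an array whose position is the doc ID, so sorting, filtering, and aggregating the matched docs are all simple lookups. The field name, values, and helper functions are invented for the example.

```python
from array import array

# One numeric field stored as a column: position = doc ID, value = field value.
timestamp_column = array("q", [1700000300, 1700000100, 1700000200, 1700000050])

def sort_by_field(doc_ids, column):
    """Order matched docs by their field value, ascending."""
    return sorted(doc_ids, key=lambda doc_id: column[doc_id])

def filter_range(doc_ids, column, low, high):
    """Keep only docs whose field value lies in [low, high]."""
    return [doc_id for doc_id in doc_ids if low <= column[doc_id] <= high]

def max_field(doc_ids, column):
    """Aggregate: the largest field value among the matched docs."""
    return max(column[doc_id] for doc_id in doc_ids)

# Hypothetical doc IDs matched by a query.
matched = [0, 2, 3]
sorted_docs = sort_by_field(matched, timestamp_column)
recent_docs = filter_range(matched, timestamp_column, 1700000100, 1700000300)
latest = max_field(matched, timestamp_column)
```

Here `sorted_docs` is `[3, 2, 0]`, `recent_docs` is `[0, 2]`, and `latest` is `1700000300`.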

 

IndexWriter: writes data.

IndexWriter                                         tantivy                   lucene
Flush (without fsync, data may be in the buffer)    N (not currently found)   Y
Commit (fsync to disk)                              Y                         Y
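
The difference between the two operations can be shown at the file level. The following is a minimal sketch using ordinary Python file I/O, not either library's writer: `flush()` only hands the data to the operating system, where it may still sit in a buffer, while `os.fsync()` asks the OS to push it to the disk. The `write_segment` function and the file names are hypothetical.

```python
import os
import tempfile

def write_segment(path, payload, durable):
    """Write payload to path; fsync only when a durable commit is requested."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()                 # "flush": program buffer -> OS buffer
        if durable:
            os.fsync(f.fileno())  # "commit": OS buffer -> disk
    return os.path.getsize(path)

with tempfile.TemporaryDirectory() as tmp:
    flushed_size = write_segment(os.path.join(tmp, "flushed.seg"), b"doc-1", durable=False)
    committed_size = write_segment(os.path.join(tmp, "committed.seg"), b"doc-1", durable=True)
```

Both files read back with the same size while the machine is running; the two paths only differ after a crash or power loss, which is also why the fsync path tends to be the slower one.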

 

To sum up:

Features

tantivy has implemented most of lucene's features. The specific differences are listed in the tables above.

Write performance

Overall write throughput is similar.

For bulk indexing, tantivy is faster than lucene.

For commit, tantivy is slower than lucene (6-19 s versus 3-4 s in the write performance comparison).

Disk usage

tantivy uses less disk space: 870M for 4_000_000 tcpflow records versus 1.3G for 4_500_000 records with lucene, as listed in the write performance comparison.
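
Because the two runs indexed different numbers of records, the disk figures are easier to compare per record. The short calculation below uses only the numbers from the write performance comparison; it assumes that "M" and "G" mean 10^6 and 10^9 bytes, which the source does not state, and it assumes the speed figures are in the same unit for both engines.

```python
# Figures taken from the write performance comparison.
tantivy_speed, lucene_speed = 155272, 151633
tantivy_records, lucene_records = 4_000_000, 4_500_000

# Assumption: "870M" and "1.3G" are decimal units (10^6 and 10^9 bytes).
tantivy_bytes = 870 * 10**6
lucene_bytes = 1_300 * 10**6

# Relative speed difference, in percent of the lucene figure.
speed_gain_pct = (tantivy_speed - lucene_speed) / lucene_speed * 100

# Average index size per record.
tantivy_bytes_per_record = tantivy_bytes / tantivy_records
lucene_bytes_per_record = lucene_bytes / lucene_records

print(f"speed difference: {speed_gain_pct:.1f}%")
print(f"tantivy: {tantivy_bytes_per_record:.1f} bytes/record")
print(f"lucene:  {lucene_bytes_per_record:.1f} bytes/record")
```

Under these assumptions the speed difference is about 2.4%, and the index takes roughly 217.5 bytes per record with tantivy against roughly 288.9 with lucene.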

Origin www.cnblogs.com/vsop/p/11493045.html