Importing Data from MongoDB into Solr

Data import methods:

Full import and incremental (delta) import:

In the DataImportHandler configuration, `query` performs the full import: it selects all the data to be indexed. `deltaQuery` and `deltaImportQuery` are the two queries needed for incremental import. `deltaImportQuery` carries a filter on the unique database key, `id='${dataimporter.delta.id}'`; this `id` is the database primary key that Solr uses as the index key, and it is fixed. `deltaQuery` only needs a condition on an update-time column in the database, e.g. `updateDate > '${dataimporter.last_index_time}'`. Each time an import runs, Solr records the run time in the `dataimport.properties` file in the core's `conf` directory as `last_index_time=<date>`, precisely to make delta indexing convenient. As long as rows are updated or added in the database (provided the table has an `updateDate` column that is refreshed on every change), running a scheduled delta import ensures that every query against Solr returns the latest data.

The difference between a full index and a delta index: the first full import indexes all the data in the database, while a delta import only picks up rows that have been inserted, updated, or deleted since the last run. To use delta indexing, the table must have a column that identifies when a row changed; a timestamp works well, because updating the row also updates the timestamp, and Solr detects changes for the incremental index by comparing timestamps.
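The two import modes above can be sketched in a DataImportHandler `data-config.xml`. This is a minimal example, not the article's actual config: the MySQL data source, table name `item`, and credentials are illustrative assumptions; only the `query` / `deltaQuery` / `deltaImportQuery` pattern and the `updateDate` column come from the text.

```xml
<dataConfig>
  <!-- Hypothetical JDBC source; replace with your own database -->
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="root" password="***"/>
  <document>
    <entity name="item" pk="id"
            query="SELECT id, name, area, updateDate FROM item"
            deltaQuery="SELECT id FROM item
                        WHERE updateDate &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, name, area, updateDate FROM item
                              WHERE id='${dataimporter.delta.id}'">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="area" name="area"/>
    </entity>
  </document>
</dataConfig>
```

`deltaQuery` finds the ids of rows changed since `last_index_time`; `deltaImportQuery` then re-fetches each of those rows by its unique `id`.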

Example: importing MongoDB data into Solr

The open-source mongo-connector project implements incremental import into Solr:

1. Configure MongoDB in replica set mode

Once the MongoDB replica set is configured and started successfully, continue with the next step.
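The replica set step matters because mongo-connector tails the oplog, which only exists in replica-set mode. A minimal single-node setup might look like the following; the data path, log path, and replica set name `rs0` are illustrative assumptions (the port 27111 matches the connector command later in the article).

```shell
# Start mongod as a one-member replica set (paths and set name are assumptions)
mongod --port 27111 --dbpath /data/mongo/rs0 --replSet rs0 \
       --fork --logpath /data/mongo/rs0.log

# Initialize the replica set from the mongo shell
mongo --port 27111 --eval \
  'rs.initiate({_id: "rs0", members: [{_id: 0, host: "127.0.0.1:27111"}]})'

# Verify: the member should report itself as PRIMARY
mongo --port 27111 --eval 'rs.status().members[0].stateStr'
```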

2. Install mongo-connector
Install it with pip:
pip install 'mongo-connector[solr]'

3. Configure Solr

In solrconfig.xml, make sure the Luke request handler is registered:

<requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />

In managed-schema, change the unique key from

<uniqueKey>id</uniqueKey>

to

<uniqueKey>_id</uniqueKey>

Add the fields to index:

<field name="_id" type="string" indexed="true" stored="true" />
<field name="name" type="string" indexed="true" stored="true" />
<field name="area" type="string" indexed="true" stored="true" />

Comment out the original id field definition:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />

4. Once configured, open the following address; if the schema is returned as JSON, the configuration succeeded:
http://127.0.0.1:20001/solr/test01/admin/luke?show=schema&wt=json

5. Start the connector
From the mongo-connector directory, run:
mongo-connector -m 127.0.0.1:27111 -t http://127.0.0.1:20001/solr/test01 -d solr_doc_manager --auto-commit-interval=0

6. Insert data into the MongoDB database:
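The article does not show the insert itself; a sketch might look like this, where the database name `test` and collection name `test01` are assumptions (the fields match the Solr schema configured above):

```shell
# Insert a document whose fields match the Solr schema
# (database "test" and collection "test01" are illustrative names)
mongo 127.0.0.1:27111/test --eval \
  'db.test01.insert({name: "zhangsan", area: "beijing"})'
```

Because the connector is tailing the oplog, the insert should be pushed to Solr automatically, with no manual re-import.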

7. Query the data in Solr: if the inserted document appears in the results, the MongoDB data was successfully imported into Solr.
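To verify, you can hit the core's select handler directly; this query (using the host and core from the steps above) returns all indexed documents:

```shell
# Match all documents in the test01 core and return JSON
curl 'http://127.0.0.1:20001/solr/test01/select?q=*:*&wt=json'
```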


Origin www.cnblogs.com/shlerlock/p/11772773.html