I. INTRODUCTION
elasticsearch-py is the official low-level Python client library for Elasticsearch. Why is it called a low-level client? Because it is only a thin wrapper around Elasticsearch's REST API. This gives it maximum flexibility, but also makes it less convenient to use. Alongside this low-level client, Elastic also provides a high-level Python client library, elasticsearch-dsl, which will be covered in another article.
For more details, see the official documentation: https://elasticsearch-py.readthedocs.io/en/master/
II. INSTALLATION
Different versions of Elasticsearch require different versions of the client, so choose the client version to match your Elasticsearch version. The following is a quick reference:
# Elasticsearch 6.x
elasticsearch>=6.0.0,<7.0.0
# Elasticsearch 5.x
elasticsearch>=5.0.0,<6.0.0
# Elasticsearch 2.x
elasticsearch>=2.0.0,<3.0.0
Within the compatible major version, choose the latest release available.
pip install elasticsearch
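The version rule in the reference above (client major version equals server major version) can be written as a small helper. This is only a sketch: `client_requirement` is a hypothetical name, not part of elasticsearch-py, and it assumes the server version is a dotted string such as "6.4.2".

```python
def client_requirement(server_version: str) -> str:
    """Return a pip requirement pinning the client to the server's major version.

    Hypothetical helper for illustration; it is not part of elasticsearch-py.
    """
    major = int(server_version.split(".")[0])
    return f"elasticsearch>={major}.0.0,<{major + 1}.0.0"


print(client_requirement("6.4.2"))  # elasticsearch>=6.0.0,<7.0.0
```

The resulting string can be passed to pip in quotes, for example pip install "elasticsearch>=6.0.0,<7.0.0".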
III. API
3.1 API Documentation
All API methods map as closely as possible to the original REST API.
3.1.1 Global Options
Some parameters are added by the client itself and can be used with every API call.
1. ignore
Tells the client to ignore the given HTTP error status codes instead of raising an exception.
from elasticsearch import Elasticsearch
es = Elasticsearch()
# ignore 400 caused by IndexAlreadyExistsException when creating an index
es.indices.create(index='test-index', ignore=400)
# ignore 404 and 400
es.indices.delete(index='test-index', ignore=[400, 404])
2. timeout
Sets the timeout for a single request, in seconds, through the request_timeout parameter.
# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)
3. filter_path
Filters the response so that only the listed fields are returned.
es.search(index='test-index', filter_path=['hits.hits._id', 'hits.hits._type'])
3.1.2 Elasticsearch
The Elasticsearch class is the low-level client. It provides a direct mapping from Python to the Elasticsearch REST endpoints. An instance has the attributes cat, cluster, indices, ingest, nodes, snapshot and tasks, which give access to instances of CatClient, ClusterClient, IndicesClient, IngestClient, NodesClient, SnapshotClient and TasksClient respectively.
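As a rough illustration of how the namespaced attributes are used, the sketch below reads from two of them, cluster and indices. The function name `index_report` and the returned keys are made up for this example. It assumes `es` is an already constructed Elasticsearch instance.

```python
def index_report(es, index):
    """Gather a few facts about the cluster and one index via the sub-clients.

    Hypothetical helper for illustration; `es` is an Elasticsearch instance.
    """
    health = es.cluster.health()  # ClusterClient: cluster-wide health
    return {
        "status": health["status"],
        "node_count": health["number_of_nodes"],
        # IndicesClient: check whether the index exists
        "index_exists": bool(es.indices.exists(index=index)),
    }
```

The same pattern applies to the other attributes: es.cat, es.nodes, es.snapshot and so on each expose the methods of the corresponding client class.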
The Elasticsearch class also contains methods for many common operations, such as get, mget, search, index, bulk, create and delete. For details on how to use them, refer to the official elasticsearch-py documentation.
Before calling any of these methods, you need an instance of Elasticsearch. There are two ways to configure one: pass a connection class to the constructor, or pass the host and port of the nodes to connect to. In the end, the host and port are themselves passed on to the connection class.
# create connection to localhost using the ThriftConnection
es = Elasticsearch(connection_class=ThriftConnection)
# connect to localhost directly and another node using SSL on port 443
# and an url_prefix. Note that ``port`` needs to be an int.
es = Elasticsearch([
    {'host': 'localhost'},
    {'host': 'othernode', 'port': 443, 'url_prefix': 'es', 'use_ssl': True},
])
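Because the port has to be an int, host lists that come from configuration files or environment variables usually need converting first. The sketch below shows one way to do this. `parse_hosts` is a hypothetical helper, not part of elasticsearch-py, and it assumes plain "host" or "host:port" strings with no IPv6 addresses or URL schemes.

```python
def parse_hosts(specs):
    """Turn 'host' or 'host:port' strings into the dict form the client accepts.

    Hypothetical helper for illustration; assumes no IPv6 addresses or schemes.
    """
    hosts = []
    for spec in specs:
        host, sep, port = spec.partition(":")
        entry = {"host": host}
        if sep:
            entry["port"] = int(port)  # the client expects an int, not a string
        hosts.append(entry)
    return hosts


hosts = parse_hosts(["localhost", "othernode:443"])
```

The resulting list has the same shape as the host dictionaries in the example above and can be passed to Elasticsearch(hosts).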
3.1.3 Indices
indices is used to operate on and query indices, or, put differently, to operate on and query index-related metadata.
3.1.4 Ingest
ingest is used to manage ingest pipelines, which enrich documents as they are inserted.
3.1.5 Cluster
cluster is used to obtain cluster-related information, such as the cluster's health status and settings.
3.1.6 Nodes
nodes is used to obtain node-related information.
3.1.7 Cat
cat can be used to retrieve information such as aliases, shard allocation and document counts.
3.1.8 Snapshot
snapshot is used to manage snapshots.
3.1.9 Tasks
tasks is used for task management. The official documentation notes that the task management API is a new feature and may change in the future, so use it with care.
3.2 X-Pack APIs
X-Pack is an Elastic Stack extension that bundles security, alerting, monitoring, reporting and graph capabilities into one easy-to-install package.
3.2.1 Info
3.2.2 Graph Explore
3.2.3 Licensing API
3.2.4 Machine Learning
3.2.5 Security APIs
3.2.6 Watcher APIs
3.2.7 Migration APIs
3.3 Exceptions
This section lists the exceptions that may be raised when using elasticsearch-py.
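A common case is handling a 404 explicitly instead of passing ignore=404. The sketch below catches NotFoundError, one of the exceptions the library exports. The function name `drop_index` is made up for this example, and `es` is assumed to be an Elasticsearch instance.

```python
def drop_index(es, index):
    """Delete an index; return False instead of raising if it does not exist.

    Hypothetical helper for illustration; `es` is an Elasticsearch instance.
    """
    # Imported inside the function so this module loads even where the
    # client library is not installed.
    from elasticsearch import NotFoundError

    try:
        es.indices.delete(index=index)
        return True
    except NotFoundError:
        # Elasticsearch answered with HTTP 404: the index does not exist.
        return False
```

Other failures, such as connection problems, are deliberately not caught here and still propagate to the caller.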
3.4 Connection Layer API
The connection layer is responsible for connecting to the cluster.
3.4.1 Transport
Transport encapsulates the transport-related logic. It instantiates the individual connections and creates a connection pool to hold them.
3.4.2 Connection Pool
ConnectionPool is a container that holds and manages connections.
3.4.3 Connection Selector
A connection selector picks a connection from the pool. The best example is zone-aware selection: the selector automatically prefers connections to local nodes and chooses connections to other nodes only when no local node can be reached.
3.4.4 Urllib3HttpConnection
The default connection class.
3.5 Transport Classes
The transport classes listed here can be passed as the connection_class parameter when initializing Elasticsearch.
3.5.1 Connection
Connection is responsible for managing the connection to an Elasticsearch node.
3.5.2 Urllib3HttpConnection
A connection class based on urllib3. It is the default connection class.
3.5.3 RequestsHttpConnection
A connection class based on requests. Unless you need advanced features of requests, using this class is not recommended.
3.6 helpers
helpers is a collection of simple helper functions that abstract away some of the details of the raw API.
3.6.1 bulk helpers
The bulk API has specific formatting requirements that make it complicated to use directly, so several helper functions for the bulk API are provided here. For details on how to use them, refer to the official elasticsearch-py documentation.
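As a minimal sketch of how the bulk helpers are typically fed, the code below builds one action dictionary per document and hands the generator to helpers.bulk. The names `build_actions` and `bulk_index` are hypothetical, and `docs` is assumed to be a dict mapping document ids to document bodies. Clusters that still use mapping types may also expect a `_type` key in each action.

```python
def build_actions(index, docs):
    """Yield one bulk action per document in the dict format the helpers accept.

    Hypothetical helper; `docs` maps document ids to document bodies.
    """
    for doc_id, source in docs.items():
        yield {
            "_op_type": "index",  # index (create or overwrite) the document
            "_index": index,
            "_id": doc_id,
            "_source": source,
        }


def bulk_index(es, index, docs):
    """Send all documents in bulk requests; returns (success_count, errors)."""
    # Imported inside the function so this module loads even where the
    # client library is not installed.
    from elasticsearch import helpers

    return helpers.bulk(es, build_actions(index, docs))
```

Passing a generator rather than a list means the documents do not all have to be held in memory at once; the helper splits them into chunks itself.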
3.6.2 scan
scan is a simple abstraction over the scroll API.
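The sketch below shows one way scan might be used to walk every document in an index without handling scroll ids by hand. The name `iter_sources` is hypothetical, and the default match_all query is an assumption made for this example.

```python
def iter_sources(es, index, query=None):
    """Yield the _source of every document in `index` that matches `query`.

    Hypothetical helper for illustration; `es` is an Elasticsearch instance.
    """
    # Imported inside the function so this module loads even where the
    # client library is not installed.
    from elasticsearch import helpers

    body = query or {"query": {"match_all": {}}}
    # helpers.scan issues the scroll requests and yields one hit at a time.
    for hit in helpers.scan(es, index=index, query=body):
        yield hit["_source"]
```

Because the function is a generator, documents are fetched lazily as the caller iterates.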
3.6.3 reindex
reindex can be used to reindex all documents in one index that satisfy a given query into another index.
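A minimal sketch of the reindex helper with a query follows. The function name `copy_matching` and the choice of a term query are assumptions made for this example, and the target index is assumed to exist already or to be created automatically by the cluster.

```python
def copy_matching(es, source, target, field, value):
    """Copy documents whose `field` equals `value` from `source` to `target`.

    Hypothetical helper for illustration; `es` is an Elasticsearch instance.
    """
    # Imported inside the function so this module loads even where the
    # client library is not installed.
    from elasticsearch import helpers

    query = {"query": {"term": {field: value}}}
    # helpers.reindex scans the source index and bulk-indexes into the target.
    return helpers.reindex(es, source_index=source, target_index=target, query=query)
```

Leaving the query out would copy every document from the source index.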