Learning a Python library: elasticsearch-py

I. INTRODUCTION

elasticsearch-py is the official low-level Python client library for Elasticsearch. Why is it called low-level? Because it is only a thin wrapper around the Elasticsearch REST API: this gives maximum flexibility, but it also makes the library less convenient to use. Alongside this low-level client, Elastic also provides a high-level Python client library, elasticsearch-dsl, which will be covered in another article.

For more details, see the official documentation: https://elasticsearch-py.readthedocs.io/en/master/

II. INSTALLATION

Different Elasticsearch versions require different client versions, so choose the client version to install based on the version of your Elasticsearch cluster. The following is a quick reference:

# Elasticsearch 6.x
elasticsearch>=6.0.0,<7.0.0
# Elasticsearch 5.x
elasticsearch>=5.0.0,<6.0.0
# Elasticsearch 2.x
elasticsearch>=2.0.0,<3.0.0

Within the compatible major version, try to choose the latest release.

 pip install elasticsearch 
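The version table above follows a simple pattern: the client major version matches the server major version. A minimal sketch of that rule is shown here. The helper name `client_requirement` is hypothetical and not part of elasticsearch-py, and extending the pattern to major versions not listed in the table is an assumption.

```python
# Hypothetical helper (not part of elasticsearch-py): derive the pip
# requirement from the server version, following the table above.
def client_requirement(server_version: str) -> str:
    # Only the major version matters for compatibility.
    major = int(server_version.split(".")[0])
    return f"elasticsearch>={major}.0.0,<{major + 1}.0.0"


if __name__ == "__main__":
    # The result can be passed to pip, e.g. pip install "<requirement>"
    print(client_requirement("6.8.0"))
```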

III. API

3.1 API Documentation

All APIs are mapped as closely as possible to the original REST API.

3.1.1 Global Options

Some parameters are added by the client itself and can be used on all API calls.

1.ignore

Tells the client to ignore the given HTTP error status codes instead of raising an exception.

from elasticsearch import Elasticsearch
es = Elasticsearch()

# ignore 400 caused by IndexAlreadyExistsException when creating an index
es.indices.create(index='test-index', ignore=400)

# ignore 404 and 400
es.indices.delete(index='test-index', ignore=[400, 404])

2.timeout

Sets the timeout for a single request; the parameter is named request_timeout.

# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)

3.filter_path

Filters the response so that only the listed fields are returned.

es.search(index='test-index', filter_path=['hits.hits._id', 'hits.hits._type'])
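Because a filtered response contains only the requested keys, code that reads it should not assume the usual response structure is present; as far as I know, the `hits` key can be missing entirely when nothing matches. A minimal sketch of reading such a response defensively follows. The helper name `hit_ids` and the sample dictionaries are made up for illustration.

```python
# Hypothetical helper: collect document ids from a search response that
# was requested with filter_path=['hits.hits._id', 'hits.hits._type'].
def hit_ids(response):
    # Use .get() with defaults because filtered responses may omit keys.
    return [hit["_id"] for hit in response.get("hits", {}).get("hits", [])]


# Made-up sample data with the shape of a filtered response.
sample = {"hits": {"hits": [{"_id": "1", "_type": "doc"},
                            {"_id": "2", "_type": "doc"}]}}

if __name__ == "__main__":
    print(hit_ids(sample))
    print(hit_ids({}))  # a response with nothing in it
```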

3.1.2 Elasticsearch

Elasticsearch is the low-level client class; it provides a direct mapping from Python to the Elasticsearch REST endpoints. An instance has the attributes cat, cluster, indices, ingest, nodes, snapshot and tasks, through which you can access instances of CatClient, ClusterClient, IndicesClient, IngestClient, NodesClient, SnapshotClient and TasksClient.

The Elasticsearch class contains methods for many common operations, such as get, mget, search, index, bulk, create and delete. For the specific use of these methods, refer to the official elasticsearch-py documentation.

Before calling any of these methods you need an Elasticsearch instance. There are two ways to obtain one: pass a connection class to the constructor, or pass the host and port of the nodes to connect to. In the end, the host and port are handed on to the connection class as well.

# create connection to localhost using the ThriftConnection
# (ThriftConnection has to be imported first and is only available in older client releases)
es = Elasticsearch(connection_class=ThriftConnection)

# connect to localhost directly and another node using SSL on port 443
# and an url_prefix. Note that ``port`` needs to be an int.
es = Elasticsearch([
    {'host': 'localhost'},
    {'host': 'othernode', 'port': 443, 'url_prefix': 'es', 'use_ssl': True},
])
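If node addresses are stored as URLs, they can be turned into host dictionaries like the ones above. The following is a minimal sketch using only the standard library. The function name `host_from_url` is hypothetical; the keys `host`, `port`, `url_prefix` and `use_ssl` are the ones used in the example above.

```python
from urllib.parse import urlparse


# Hypothetical helper: build a host dictionary from a URL string.
def host_from_url(url):
    parsed = urlparse(url)
    host = {"host": parsed.hostname}
    if parsed.port is not None:
        # urlparse returns the port as an int, which is what is required.
        host["port"] = parsed.port
    if parsed.scheme == "https":
        host["use_ssl"] = True
    prefix = parsed.path.strip("/")
    if prefix:
        host["url_prefix"] = prefix
    return host


if __name__ == "__main__":
    hosts = [host_from_url(u) for u in ("http://localhost",
                                        "https://othernode:443/es")]
    print(hosts)  # this list could then be passed to Elasticsearch(hosts)
```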

3.1.3 Indices

indices is used to manage indices and to query information about them; in other words, it operates on and queries index-related metadata.

3.1.4 Ingest

ingest covers the ingest feature, which enriches documents as they are inserted.

3.1.5 Cluster

cluster is used to obtain cluster-related information, for example the cluster health status, settings and so on.

3.1.6 Nodes

nodes is used to obtain node-related information.

3.1.7 Cat

cat can be used to query aliases, shard information, document counts and similar information.

3.1.8 Snapshot

snapshot is used to manage snapshots.

3.1.9 Tasks

tasks is used for task management. The official documentation notes that the task API is a new feature and may change in the future, so use it with care.

3.2 X-Pack APIs

X-Pack is an Elastic Stack extension that bundles security, alerting, monitoring, reporting and graph capabilities into one easy-to-install package.

3.2.1 Info

3.2.2 Graph Explore

3.2.3 Licensing API

3.2.4 Machine Learning

3.2.5 Security APIs

3.2.6 Watcher APIs

3.2.7 Migration APIs

3.3 Exceptions

This section lists the exceptions that may be raised when using elasticsearch-py.
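As an alternative to the ignore parameter, an error status can be handled by catching the corresponding exception. The following is a minimal sketch that catches `NotFoundError` (raised for HTTP 404) when deleting an index. The function name `delete_index_if_exists` is hypothetical, and `es` is assumed to be an existing Elasticsearch client instance.

```python
# Hypothetical helper: delete an index and report whether it existed.
def delete_index_if_exists(es, name):
    # Imported inside the function so that this sketch can be loaded even
    # on a machine where elasticsearch-py is not installed.
    from elasticsearch.exceptions import NotFoundError

    try:
        es.indices.delete(index=name)
    except NotFoundError:
        # The index was not there; treat this as a normal outcome.
        return False
    return True
```

With a client created as `es = Elasticsearch()`, calling `delete_index_if_exists(es, 'test-index')` would return False instead of raising when the index is missing.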

3.4 Connection Layer API

The connection layer classes are responsible for connecting to the cluster.

3.4.1 Transport

Transport encapsulates the transport-related logic. It handles the instantiation of each connection and creates a connection pool to hold them.

3.4.2 Connection Pool

The connection pool is a container that holds the connection instances and manages them.

3.4.3 Connection Selector 

The connection selector chooses which connection to use. Its best use case is zone-aware selection: the local connection is selected automatically, and other nodes are chosen only when the local node cannot be reached.
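The zone-aware idea can be sketched in plain Python as a selection function. This is only an illustration of the logic, not the library's selector class: the function name `select_zone_aware`, the `zone_of` callback and the tuple-based connections are all assumptions made for the example.

```python
import random


# Hypothetical sketch of zone-aware selection: prefer connections in the
# local zone and fall back to any other connection only if none is local.
def select_zone_aware(connections, local_zone, zone_of):
    local = [conn for conn in connections if zone_of(conn) == local_zone]
    # An empty list is falsy, so this falls back to all connections.
    return random.choice(local or connections)


if __name__ == "__main__":
    # Connections are modelled here as (name, zone) tuples.
    conns = [("node-a", "zone-1"), ("node-b", "zone-2")]
    print(select_zone_aware(conns, "zone-2", lambda conn: conn[1]))
```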

3.4.4 Urllib3HttpConnection

The default connection class.

3.5 Transport Class

The transport module lists the connection classes that can be passed as the connection_class parameter when initializing Elasticsearch.

3.5.1 Connection

Connection is responsible for managing the connection to an Elasticsearch node.

3.5.2 Urllib3HttpConnection

The urllib3-based connection class; this is the default connection class.

3.5.3 RequestsHttpConnection

The requests-based connection class. Unless you need advanced features of requests, using this class is not recommended.

3.6 helpers

helpers is a collection of simple helper functions that abstract away some of the details of the raw API.

3.6.1 bulk helpers

The bulk API has strict formatting requirements, which makes it cumbersome to use directly, so several helper functions for the bulk API are provided here. For details on how to use them, refer to the official elasticsearch-py documentation.
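The bulk helpers accept an iterable of action dictionaries, which is usually produced by a generator. A minimal sketch of such a generator follows. The function name `generate_actions` and the sample documents are made up, and the optional `doc_type` argument reflects the assumption that older clusters (6.x and earlier) also expect a `_type` field.

```python
# Hypothetical generator: turn {id: document} pairs into bulk actions.
def generate_actions(index, docs, doc_type=None):
    for doc_id, source in docs.items():
        action = {"_index": index, "_id": doc_id, "_source": source}
        if doc_type is not None:
            # Assumption: only needed on clusters that still use types.
            action["_type"] = doc_type
        yield action


# Made-up sample documents.
docs = {"1": {"title": "first"}, "2": {"title": "second"}}

if __name__ == "__main__":
    for action in generate_actions("test-index", docs):
        print(action)
    # With a client instance es, the actions could be sent like this:
    #   from elasticsearch import helpers
    #   helpers.bulk(es, generate_actions("test-index", docs))
```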

3.6.2 scan

scan is a simple abstraction on top of the scroll API.
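To show what scan saves you from writing, here is a rough sketch of a manual scroll loop. It is not the library's implementation: the function name `scroll_hits` is hypothetical, `es` is assumed to be a client whose `search`, `scroll` and `clear_scroll` methods accept the keyword arguments used here, and the `body=` style matches the older client versions listed in the installation section.

```python
# Hypothetical sketch of the loop that helpers.scan hides.
def scroll_hits(es, index, query, scroll="2m", size=1000):
    # The first request opens the scroll context and returns the first page.
    resp = es.search(index=index, body=query, scroll=scroll, size=size)
    scroll_id = resp.get("_scroll_id")
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            # Fetch the next page; the scroll id may change between calls.
            resp = es.scroll(scroll_id=scroll_id, scroll=scroll)
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release the scroll context on the server when done.
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)
```

With the helper, the same iteration is a single call: `helpers.scan(es, query=..., index=...)`.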

3.6.3 reindex

reindex re-indexes all documents of one index that satisfy a given query into another index.
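Conceptually, re-indexing means reading the matching documents from the source index and writing them to the target index under the same ids. The core transformation can be sketched as follows. The function name `retarget_hits` and the sample hit are made up, and the hits are assumed to be the documents that already matched the query.

```python
# Hypothetical sketch: rewrite search hits into bulk actions that point
# at the target index, keeping the document id and source unchanged.
def retarget_hits(hits, target_index):
    for hit in hits:
        yield {
            "_index": target_index,
            "_id": hit["_id"],
            "_source": hit["_source"],
        }


if __name__ == "__main__":
    # Made-up hit with the shape of an entry in hits.hits.
    hits = [{"_index": "old-index", "_id": "1", "_source": {"title": "first"}}]
    print(list(retarget_hits(hits, "new-index")))
    # With a client instance es, the helper does all of this in one call:
    #   from elasticsearch import helpers
    #   helpers.reindex(es, "old-index", "new-index", query=...)
```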

 

Origin www.cnblogs.com/lit10050528/p/12122494.html