Notes and summary from using ES (Elasticsearch)

A recent project used ES for search. At first I built my own ES environment for testing; later we switched to the ES service provided by the company's cloud application platform.

Integrating with ES took quite a few twists and turns and I ran into many problems, which I have collected and recorded here:

1, ES ports 9200 and 9300

    9200 is the HTTP port. It exposes the ES RESTful interface and is mainly used for external communication.

    Nodes within an ES cluster communicate with each other over 9300.

    9300 uses TCP; the Java client jar (the transport client) also talks to the cluster over TCP on this port.

    When integrating with ES, please use port 9200; port 9300 is not recommended.

 

2, Differences between master, data and client nodes

  It is recommended to configure three or more nodes in the cluster as master nodes. These nodes are only responsible for acting as master and maintaining the state of the whole cluster.
  Then configure a number of data nodes according to the data volume. These nodes only store data and serve indexing and query requests, so if user requests are frequent the load on them will be relatively high.
  It is therefore also recommended to set up a group of client nodes (also referred to here as ingest nodes). These only handle user requests, forwarding them to provide load balancing and similar functions.

  master nodes: an ordinary server is enough (moderate CPU and memory consumption)
  data nodes: mainly consume disk and memory
  client | ingest nodes: an ordinary server is enough (if you run grouping or aggregation operations, give these nodes more memory as well)
  
  

 

3, The ES client jar versions must match the ES version in use

     When integrating with ES, make sure the client jar versions match the ES version. For example, I use ES 6.1.4, so the jars also need to be 6.1.4. The pom entries are as follows:

     <dependency>

            <groupId>org.elasticsearch</groupId>

            <artifactId>elasticsearch</artifactId>

            <version>6.1.4</version>

        </dependency>


        <dependency>

            <groupId>org.elasticsearch.client</groupId>

            <artifactId>transport</artifactId>

            <version>6.1.4</version>

        </dependency>


        <dependency>

            <groupId>org.elasticsearch.client</groupId>

            <artifactId>elasticsearch-rest-high-level-client</artifactId>

            <version>6.1.4</version>

        </dependency>

       If the project also uses the netty framework, make sure the netty version used by the project matches the netty version used by the ES client jars.

      Otherwise an exception may occur when the client is closed. For example, I hit an exception like "thread was not start", which caused the entire application to hang.

       The ES 6.1.4 jars use netty 4.1.13.Final, so reference the same version in the project, as follows:

       <dependency>

            <groupId>io.netty</groupId>

            <artifactId>netty-all</artifactId>

            <version>4.1.13.Final</version>

        </dependency>

 

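 4, Connecting to ES over TCP and over HTTP. Please use HTTP: TCP is mostly used for communication between ES nodes and is not recommended for external integration (and in my experience HTTP was noticeably faster than TCP)

 4.1 For TCP, use the class org.elasticsearch.transport.client.PreBuiltTransportClient on port 9300. It can connect to master nodes and client nodes

      It is initialized like this:

       // Note: clusterName is the cluster name, which may be omitted from ES 6.x onward. client.transport.sniff controls sniffing: setting it to true makes the client sniff the state of the whole cluster and add the IP addresses of the other machines to the client. The benefit is that you usually don't have to configure the IPs of all cluster nodes in the client by hand; it adds them automatically and also discovers machines that newly join the cluster.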

        Settings settings = Settings.builder().put("cluster.name",clusterName).put("client.transport.sniff", true).build();


       // esurl holds the ES connection info; I separate the nodes with commas, e.g. 1.1.1.2:9300

       // getHttpHosts builds one TransportAddress per ip and port and returns them as a TransportAddress array

       List<String> hostNames = Arrays.asList(esurl.split(","));
       client = new PreBuiltTransportClient(settings).addTransportAddresses(getHttpHosts(hostNames));

 

 

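4.2 For HTTP, use the class org.elasticsearch.client.RestHighLevelClient on port 9200. In my environment it could only connect to client nodes

     It is initialized like this:

      // esurl holds the ES connection info; I separate the nodes with commas, e.g. 1.1.1.2:9200

      // getHttpHosts builds one HttpHost per ip and port and returns them as an HttpHost array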

     RestClientBuilder clientBuilder = RestClient.builder(getHttpHosts(esurl));

     RestHighLevelClient client = new RestHighLevelClient(clientBuilder);
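
     The getHttpHosts helper is not shown in this post. As a rough idea of what it has to do, here is a minimal sketch that uses only the JDK: it splits the comma-separated esurl string into host/port pairs. The names EsHostParser, HostPort and the defaultPort parameter are my own assumptions for illustration. In the real helper each pair would then be wrapped in an HttpHost (for the REST client) or a TransportAddress (for the transport client).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the getHttpHosts helper, which the post does not show.
final class EsHostParser {

    /** A plain host/port pair; the real helper would wrap this in HttpHost or TransportAddress. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Parses "ip:port,ip:port"; entries without a port fall back to defaultPort. */
    static List<HostPort> parse(String esurl, int defaultPort) {
        List<HostPort> result = new ArrayList<>();
        for (String entry : esurl.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                result.add(new HostPort(trimmed, defaultPort));
            } else {
                String host = trimmed.substring(0, colon);
                int port = Integer.parseInt(trimmed.substring(colon + 1));
                result.add(new HostPort(host, port));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (HostPort hp : parse("1.1.1.2:9200, 1.1.1.3", 9200)) {
            System.out.println(hp.host + " -> " + hp.port);
        }
    }
}
```

     Keeping the parsing separate from the client classes makes it easy to reuse the same esurl format for both the 9200 and the 9300 client.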


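     In many cases, however, connecting to ES over HTTP requires credentials (user name and password). In that case connect like this:

     (Excerpted from: https://msd.misuland.com/pd/3255817963235711680)

     // esname is the user name, espassword is the password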

      String auth = Base64.encodeBase64String((esname + ":" + espassword).getBytes());

      RestClientBuilder clientBuilder = RestClient.builder(getHttpHosts(esurl));

      clientBuilder.setDefaultHeaders(new BasicHeader[] { new BasicHeader("Authorization", "Basic " + auth) });

      RestHighLevelClient client = new RestHighLevelClient(clientBuilder);
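
      The Base64.encodeBase64String call above comes from commons-codec. If you would rather not depend on it, the same header value can be built with the JDK's own java.util.Base64. The following is only a sketch; the class name BasicAuthHeader and the sample credentials are placeholders I made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: build the HTTP Basic "Authorization" header value with JDK classes only.
final class BasicAuthHeader {

    /** Returns "Basic " followed by base64("username:password"). */
    static String value(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // "user" and "pass" are placeholder credentials for illustration only.
        System.out.println(value("user", "pass")); // prints "Basic dXNlcjpwYXNz"
    }
}
```

      The returned string is what goes into the BasicHeader("Authorization", ...) shown above.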


 5, ES queries

      First build a SearchSourceBuilder, then run the query through the client to get a SearchResponse, and finally read the results from the SearchResponse. The code is as follows:

        // Build the query request object
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();

        // Set the query condition
        queryBuilder.must(QueryBuilders.matchPhrasePrefixQuery("id", id));
        sourceBuilder.query(queryBuilder);

        // Set the sort field and sort order
        sourceBuilder.sort("timestamp", SortOrder.DESC);

        // Set the start position and page size. Note that from() takes the offset of the first hit, not a page number.
        sourceBuilder.from(page);
        sourceBuilder.size(size);

        // Set the index and type to query
        SearchRequest request = new SearchRequest(INDEX_NAME);
        request.types(INDEX_TYPE);
        request.source(sourceBuilder);

       .....

        // Run the query. I use the HTTP client here; with the TCP client use
        // response = client.search(request).get(); instead (this turns it into a synchronous call)
        response = client.search(request);

        if (response == null || response.getHits().getTotalHits() == 0) {
            LOGGER.error("ES query returned no results");
        } else {
            // Convert the hits into result objects by deserializing each JSON source string into the target class
            for (SearchHit sh : response.getHits().getHits()) {
                Info info;
                try {
                    info = JSON.parseObject(sh.getSourceAsString(), Info.class);
                } catch (Exception e) {
                    LOGGER.error("Failed to deserialize ES result: {}", sh.getSourceAsString(), e);
                    continue;
                }
                list.add(info);
            }
        }


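6, ES query syntax

    When building the ES query request above, we used some query syntax to construct the query conditions, for example

    queryBuilder.must(QueryBuilders.matchPhrasePrefixQuery("id", id)); where must and matchPhrasePrefixQuery are pieces of that syntax.

    Plenty of introductions to this syntax are easy to find. Here are some articles I found very good:

     ES advanced query syntax: https://www.cnblogs.com/shoufeng/p/11103913.html

    Common ES query syntax : https://blog.csdn.net/qingmoruoxi/article/details/77221602

    ES learning series, part 1: a summary of common query scenarios : http://tech.dianwoda.com/2017/09/22/esxue-xi-xi-lie-zhi-chang-jian-de-cha-xun-chang-jing-zong-jie/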


 

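 7, ES reports that a field cannot be used for sorting

     In this case the field's mapping has to be changed so that it can be sorted on, mainly by setting its fielddata property to true. For example,

     to make the time field of an index sortable, send an HTTP request (on Linux you can use curl):
     url:    ip:9200/index/_mapping/type      using the POST method, where index is the index name and type is the type of the data in the index
     body: {"properties":{"time":{"type":"text","fielddata":true}}}

     See also this link, understanding ES fielddata : https://www.cnblogs.com/chenmz1995/p/10198967.html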


 

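8, ES query error: from + size > 10000

     The default ES result window is 10000, which is often not enough. In that case the value has to be raised.

     To set the maximum result window to 10000000, send an HTTP request (on Linux you can use curl):

     url:   ip:9200/index/_settings    using the PUT method

     body: { "index" : { "max_result_window" : 10000000}}

     With curl on Linux, run the following command (ES 6.x requires the Content-Type header):
     curl -XPUT http://ip:9200/index/_settings -H 'Content-Type: application/json' -d '{ "index" : { "max_result_window" : 10000000}}'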
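
     Since the limit applies to from + size, it can help to compute the offset from the page number in one place and check it before sending the request. Below is a minimal sketch using only the JDK. The class name EsPaging, the method names and the zero-based page index are assumptions of mine, not part of any ES API.

```java
// Sketch: translate a zero-based page index into the "from" offset and
// check it against index.max_result_window before querying.
final class EsPaging {

    /** Default value of index.max_result_window. */
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Offset of the first hit on the given zero-based page. */
    static int offset(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        return Math.multiplyExact(pageIndex, pageSize);
    }

    /** True when from + size does not exceed the configured result window. */
    static boolean fitsWindow(int from, int size, int maxResultWindow) {
        return (long) from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        // Page 499 with 20 hits per page ends exactly at hit 10000: still allowed.
        System.out.println(fitsWindow(offset(499, 20), 20, DEFAULT_MAX_RESULT_WINDOW)); // true
        // Page 500 would need hits 10001..10020: rejected with the default window.
        System.out.println(fitsWindow(offset(500, 20), 20, DEFAULT_MAX_RESULT_WINDOW)); // false
    }
}
```

     The value returned by offset is what would be passed to sourceBuilder.from(...), with the page size going to sourceBuilder.size(...).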

 

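9, Searching for a single Chinese character works but searching for a Chinese phrase fails; searching for an English word works but fails when it contains special characters

    This is caused by the default ES analyzer together with the query syntax we use.

    A single Chinese character can be found because the ES standard analyzer splits the content into individual characters.

    Phrases cannot be found because an exact-match lookup such as a term query is used. That is equivalent to looking for the term "端口" in the content, but the default analyzer has already split the content into "端" and "口", so only single-character queries match.
    There is still a way to find it: use a match phrase query (which analyzes the query text as well), as I do here:

    queryBuilder.must(QueryBuilders.matchPhrasePrefixQuery("content", "端口"));
    The following two links explain this quite clearly:

    Chinese queries fail in ES (see the answers below the question): https://elasticsearch.cn/question/2337

    ES learning series, part 1: a summary of common query scenarios (section 2) : http://tech.dianwoda.com/2017/09/22/esxue-xi-xi-lie-zhi-chang-jian-de-cha-xun-chang-jing-zong-jie/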
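
    To make the difference concrete, here is a toy model in plain Java. It is not the real analyzer or query engine, only an illustration under a simplifying assumption: the index stores one token per character, a term lookup compares the whole search string against single tokens, and a phrase lookup tokenizes the search string first and looks for its tokens in consecutive positions. The class and method names are made up, and the Chinese text is written with unicode escapes so the source stays ASCII ("\u7aef\u53e3" is "端口").

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: one token per character, as happens to Chinese text
// when no Chinese-aware analyzer is configured.
final class SingleCharTokenDemo {

    /** Splits text into one token per character (code point). */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    /** term-style lookup: the search string is NOT analyzed and must equal a single token. */
    static boolean termMatch(List<String> indexedTokens, String term) {
        return indexedTokens.contains(term);
    }

    /** phrase-style lookup: the search string is tokenized and its tokens must appear consecutively. */
    static boolean phraseMatch(List<String> indexedTokens, String query) {
        List<String> queryTokens = tokenize(query);
        return !queryTokens.isEmpty()
                && Collections.indexOfSubList(indexedTokens, queryTokens) >= 0;
    }

    public static void main(String[] args) {
        // Indexed content: "\u7aef\u53e3\u53f7" (duan kou hao, "port number").
        List<String> indexed = tokenize("\u7aef\u53e3\u53f7");

        System.out.println(termMatch(indexed, "\u7aef"));            // true: single character
        System.out.println(termMatch(indexed, "\u7aef\u53e3"));      // false: no such two-character token
        System.out.println(phraseMatch(indexed, "\u7aef\u53e3"));    // true: tokens found in order
    }
}
```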

 

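    The above records the ES problems I ran into. Installation and deployment are relatively simple, so I won't go into them here.

    Installation reference:  https://blog.csdn.net/weixin_38040473/article/details/81082968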

 


Origin www.cnblogs.com/widget90/p/11502342.html