Using Tencent Cloud Big Data Elasticsearch 8.8.1 for NLP + Vector Search + GAI

Introduction

Tencent Cloud Big Data Elasticsearch Service recently launched version 8.8.1. The core capability of this release is advanced search for the AI revolution: it ships the Elasticsearch Relevance Engine™ (ESRE™), a powerful AI-enhanced search engine that brings a new cutting-edge experience to search and analytics.

With vector databases springing up everywhere, you might see this release as just "one of them". But a few key points should give you a clearer picture of where it stands:

This is currently the only end-to-end search and analytics platform on China's public cloud that provides natural language processing, vectorization, and vector search, and that can be integrated with large language models:

Figure 1

1. Not every vector database can perform multi-way recall with blended ranking in a single API call!

2. Not every search engine can run aggregations on top of vector search results!

Figure 2

Of course, the focus of this article is not the product introduction but application and practice. The rest of this article shows how to create an Elasticsearch 8.8.1 cluster on Tencent Cloud, deploy and use NLP models, and combine vector search with a large language model.

1. Create an Elasticsearch 8.8.1 cluster

The build process is as simple as before: just select the corresponding version. One point worth emphasizing: because we will deploy various NLP and embedding models to the cluster, make sure to provision enough memory for model deployment.

Figure 3

Figure 4

2. Deploy the NLP model

Whether you are performing vector search or extracting information from text with NLP tasks such as named entity recognition, you need to run inference. The biggest difference in Tencent Cloud Elasticsearch 8.8.1 is that you no longer need to build a separate machine learning environment for data processing and inference: you can process data flexibly inside Elasticsearch by combining different processors in an ingest pipeline.

Figure 5. Processing and inference in the ingest pipeline

This also guarantees that queries and indexed documents are processed with the same model, which simplifies the cost of using, updating, and maintaining models.

Figure 6
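As a sketch of what such an ingest pipeline could look like: the structure below follows the ES 8.x inference processor, but the pipeline name, source field, target field, and model id are illustrative assumptions, not taken from the article.

```python
# Sketch: an ingest pipeline with an inference processor, so that documents
# are embedded by the in-cluster model at index time. Field names, target
# field, and model id below are illustrative assumptions.
def build_embedding_pipeline(model_id: str, source_field: str = "body_content") -> dict:
    """Build a pipeline definition for PUT _ingest/pipeline/<name>."""
    return {
        "description": "Embed text with a deployed NLP model at ingest time",
        "processors": [
            {
                "inference": {
                    "model_id": model_id,
                    # Map the document field onto the model's expected input field
                    "field_map": {source_field: "text_field"},
                    "target_field": "ml",
                }
            }
        ],
    }

pipeline = build_embedding_pipeline("sentence-transformers__msmarco-minilm-l-12-v3")
# Against a live cluster (elasticsearch-py 8.x) this would be roughly:
# es.ingest.put_pipeline(id="text-embedding", **pipeline)
print(pipeline["processors"][0]["inference"]["model_id"])
```

Documents indexed through this pipeline would then carry the embedding under the configured target field, which is what a later kNN query runs against.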

The deployment method is also very simple: a tool called eland is provided to upload and deploy models:

eland_import_hub_model --url https://es-7cu6zx9m.public.tencentelasticsearch.com:9200 --insecure -u elastic -p changeme --hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 --task-type text_embedding --start

When deploying, if you run eland_import_hub_model on your own computer (Internet access is required because the model is downloaded from Hugging Face), you need to use the public network endpoint of Tencent Cloud Elasticsearch:

Figure 7

Alternatively, you can purchase a CVM on Tencent Cloud and use the intranet access address:

eland_import_hub_model --url https://172.27.0.11:9200  --insecure -u elastic -p changeme --hub-model-id canIjoin/datafun --task-type ner --start

Note, however, that Hugging Face is sometimes unreachable from the CVM, or the connection times out, which can cause the model upload and deployment to fail.

In similarly restricted networks, or if you have trained your own model and do not want to publish it to Hugging Face, refer to the article "How to deploy a local transformer model to Elasticsearch" to upload and deploy a local NLP model.

If the model upload runs correctly, you will see output like the following:

eland_import_hub_model --url https://es-7cu6zx9m.public.tencentelasticsearch.com:9200  --insecure -u elastic -p changeme  --hub-model-id distilbert-base-uncased-finetuned-sst-2-english --task-type text_classification --start
2023-07-13 10:06:23,354 WARNING : NOTE: Redirects are currently not supported in Windows or MacOs.
2023-07-13 10:06:24,358 INFO : Establishing connection to Elasticsearch
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/elasticsearch/_sync/client/__init__.py:394: SecurityWarning: Connecting to 'https://es-7cu6zx9m.public.tencentelasticsearch.com:9200' using TLS with verify_certs=False is insecure
  _transport = transport_class(
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'es-7cu6zx9m.public.tencentelasticsearch.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
2023-07-13 10:06:24,535 INFO : Connected to cluster named 'es-7cu6zx9m' (version: 8.8.1)
2023-07-13 10:06:24,537 INFO : Loading HuggingFace transformer tokenizer and model 'distilbert-base-uncased-finetuned-sst-2-english'
Downloading pytorch_model.bin: 100%|████████████████████████████████████████████████████████████| 268M/268M [00:19<00:00, 13.6MB/s]
2023-07-13 10:06:48,795 INFO : Creating model with id 'distilbert-base-uncased-finetuned-sst-2-english'
2023-07-13 10:06:48,855 INFO : Uploading model definition
...  (the InsecureRequestWarning above repeats for each upload request)
100%|██████████████████████████████████████████████████████████████████████████████████████████| 64/64 [00:45<00:00,  1.42 parts/s]
2023-07-13 10:07:34,021 INFO : Uploading model vocabulary
2023-07-13 10:07:34,110 INFO : Starting model deployment
2023-07-13 10:07:41,163 INFO : Model successfully imported with id 'distilbert-base-uncased-finetuned-sst-2-english'
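After eland reports success, it is worth confirming that the deployment is actually running via the trained-model stats API. A small sketch follows, assuming the ES 8.x `_ml/trained_models/<model_id>/_stats` response shape and the elasticsearch-py client; the endpoint, credentials, and model id are the illustrative values used earlier.

```python
# Sketch: check a deployed model's state from the _stats response
# (field names per ES 8.x; hedged as an assumption, not verified here).
def deployment_state(stats_response: dict, model_id: str):
    """Return the deployment state ('starting'/'started') for model_id, or None."""
    for stats in stats_response.get("trained_model_stats", []):
        if stats.get("model_id") == model_id:
            return stats.get("deployment_stats", {}).get("state")
    return None

# Against a live cluster you would fetch the stats first, e.g.:
# from elasticsearch import Elasticsearch
# es = Elasticsearch("https://es-7cu6zx9m.public.tencentelasticsearch.com:9200",
#                    basic_auth=("elastic", "changeme"), verify_certs=False)
# stats = es.ml.get_trained_models_stats(
#     model_id="distilbert-base-uncased-finetuned-sst-2-english")
# print(deployment_state(stats.body, "distilbert-base-uncased-finetuned-sst-2-english"))

# Offline example with an abridged response of that shape:
sample = {"trained_model_stats": [
    {"model_id": "distilbert-base-uncased-finetuned-sst-2-english",
     "deployment_stats": {"state": "started"}}]}
print(deployment_state(sample, "distilbert-base-uncased-finetuned-sst-2-english"))
```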

3. Model management and testing

After the model is uploaded, you can directly manage and test the model on the Kibana interface of Tencent Cloud Elasticsearch Service:

Figure 8
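Besides the Kibana UI, a deployed model can also be exercised programmatically through the `_ml/trained_models/<model_id>/_infer` API. A minimal sketch, assuming the ES 8.x request and response shapes; the example texts and sample response are illustrative, not real cluster output.

```python
# Sketch: build an _infer request payload and read back the predictions
# (payload/response field names per ES 8.x; treated as assumptions here).
def format_infer_docs(texts, input_field: str = "text_field"):
    """Build the 'docs' payload for _ml/trained_models/<id>/_infer."""
    return [{input_field: t} for t in texts]

def predicted_values(infer_response: dict):
    """Extract predicted_value from each inference result."""
    return [r.get("predicted_value") for r in infer_response.get("inference_results", [])]

docs = format_infer_docs(["this release is great", "the upload failed again"])
# Against a live cluster (elasticsearch-py 8.x) this would be roughly:
# resp = es.ml.infer_trained_model(
#     model_id="distilbert-base-uncased-finetuned-sst-2-english", docs=docs)

# Offline example with an abridged text_classification response:
sample = {"inference_results": [
    {"predicted_value": "POSITIVE", "prediction_probability": 0.99},
    {"predicted_value": "NEGATIVE", "prediction_probability": 0.97}]}
print(predicted_values(sample))
```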

4. Implementing NLP + Vector Search + GAI with Elasticsearch in an application

Once model deployment and debugging are complete, we can integrate these capabilities into an application. For example, to build a question-answering system over papers, we can follow these steps:

Figure 9

Part of the core code:

# Excerpt from the Flask application. Helpers such as is_model_up_and_running(),
# infer_trained_model(), knn_blogs_embeddings(), q_and_a(), get_text(), and the
# es / executor / app_models objects and constants are defined elsewhere in the app.
from flask import redirect, render_template, request, url_for
from markupsafe import Markup
import openai
from openai.error import APIConnectionError, RateLimitError


async def nlp_blog_search():
    # Check whether the models are loaded in Elasticsearch
    global app_models
    is_model_up_and_running(INFER_MODEL_TEXT_EMBEDDINGS)
    is_model_up_and_running(INFER_MODEL_Q_AND_A)

    qa_model = app_models.get(INFER_MODEL_Q_AND_A) == 'started'
    index_name = INDEX_BLOG_SEARCH

    if not es.indices.exists(index=index_name):
        return render_template('nlp_blog_search.html', title='Blog search', te_model_up=False,
                               index_name=index_name, missing_index=True, qa_model_up=qa_model)

    if app_models.get(INFER_MODEL_TEXT_EMBEDDINGS) == 'started':
        form = SearchBlogsForm()

        # Check the request method
        if request.method == 'POST':
            if form.validate_on_submit():
                if 'filter_by_author' in request.form:
                    form.searchboxAuthor.data = request.form['filter_by_author']

                if form.searchboxBlogWindow.data is None or len(form.searchboxBlogWindow.data) == 0:
                    # Convert the query into an embedding
                    embeddings_response = infer_trained_model(
                        form.searchbox.data, INFER_MODEL_TEXT_EMBEDDINGS)
                    # Run the vector (kNN) / hybrid search
                    search_response = knn_blogs_embeddings(embeddings_response['predicted_value'],
                                                           form.searchboxAuthor.data)
                    cfg = {
                        "question_answering": {
                            "question": form.searchbox.data,
                            "max_answer_length": 30
                        }
                    }

                    hits_with_answers = search_response['hits']['hits']
                    # First-pass filtering with the QA model; materialize the results
                    # as a list because they are scanned repeatedly in the loop below
                    answers = list(executor.map(q_and_a, map(lambda hit: hit["_id"], hits_with_answers),
                                                map(lambda hit: form.searchbox.data, hits_with_answers),
                                                map(lambda hit: get_text(hit=hit), hits_with_answers)))

                    best_answer = None
                    for i in range(0, len(hits_with_answers)):
                        hit_with_answer = hits_with_answers[i]

                        matched_answer = next(
                            (obj['result'] for obj in answers if obj["_id"] == hit_with_answer["_id"]), None)

                        if matched_answer is not None:
                            hit_with_answer['answer'] = matched_answer
                            if (best_answer is None or
                                    ('prediction_probability' in matched_answer and
                                     matched_answer['prediction_probability'] > best_answer['prediction_probability'])):
                                best_answer = matched_answer

                            start_idx = matched_answer['start_offset']
                            end_idx = matched_answer['end_offset']

                            # Highlight the answer span inside the matched text window
                            text = hits_with_answers[i]['fields']['body_content_window'][0]
                            text_with_highlighted_answer = Markup(''.join([text[0:start_idx - 1],
                                                                           "<b>", text[start_idx - 1:end_idx],
                                                                           "</b>", text[end_idx:]]))
                            hits_with_answers[i]['fields']['body_content_window'][0] = text_with_highlighted_answer

                    # Hand the results to the large model for summarization
                    messages = blogs_convert_es_response_to_messages(search_response,
                                                                     form.searchbox.data)
                    # Send a request to the OpenAI API
                    try:
                        response_ai = openai.ChatCompletion.create(
                            engine="gpt-35-turbo",
                            temperature=0,
                            messages=messages
                        )
                        answer_openai = response_ai["choices"][0]["message"]["content"]
                    except RateLimitError as e:
                        print(e.error.message)
                        answer_openai = e.error.message
                    except APIConnectionError as e:
                        print(e.error.message)
                        answer_openai = e.error.message

                    return render_template('nlp_blog_search.html', title='Blog search', form=form,
                                           search_results=hits_with_answers,
                                           best_answer=best_answer, openai_answer=answer_openai,
                                           query=form.searchbox.data, te_model_up=True, qa_model_up=qa_model,
                                           missing_index=False)
                else:
                    search_response = q_and_a(
                        question=form.searchbox.data, full_text=form.searchboxBlogWindow.data)
                    return render_template('nlp_blog_search.html', title='Blog search', form=form,
                                           qa_results=search_response,
                                           query=form.searchbox.data, te_model_up=True, qa_model_up=qa_model,
                                           missing_index=False)
            else:
                return redirect(url_for('nlp_blog_search'))
        else:  # GET
            return render_template('nlp_blog_search.html', title='Blog Search', form=form, te_model_up=True,
                                   qa_model_up=qa_model, missing_index=False)
    else:
        return render_template('nlp_blog_search.html', title='Blog search', te_model_up=False, qa_model_up=qa_model,
                               model_name=INFER_MODEL_TEXT_EMBEDDINGS, missing_index=False)
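The `knn_blogs_embeddings` helper used above is not shown in the excerpt. A minimal sketch of what such a kNN query builder might look like follows; the index name, `dense_vector` field name, filter field, and `k`/`num_candidates` sizes are assumptions for illustration, not the article's actual implementation.

```python
# Sketch: build an ES 8.x kNN search body over a dense_vector field, with an
# optional author pre-filter (field names and sizes are illustrative).
def build_knn_query(query_vector, author=None) -> dict:
    """Build the request body for an ES 8.x kNN search."""
    knn = {
        "field": "text_embedding.predicted_value",
        "query_vector": query_vector,
        "k": 10,
        "num_candidates": 100,
    }
    if author:  # optional pre-filter on an author keyword field
        knn["filter"] = {"term": {"author": author}}
    return {"knn": knn, "fields": ["title", "body_content_window"], "_source": False}

body = build_knn_query([0.1, 0.2, 0.3], author="alice")
# Against a live cluster (elasticsearch-py 8.x) this would be roughly:
# search_response = es.search(index="blogs", **body)
print(sorted(body["knn"].keys()))
```

Filtering inside the `knn` clause (rather than post-filtering the hits) is what lets the author restriction narrow the vector search itself.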

With these integrations in place, the following effect can be achieved:

Figure 10

In other words, with Tencent Cloud Elasticsearch 8.8.1 the application only needs to call Elasticsearch's API to run vector search, hand the results to the QnA model to pick out the key passages, and then pass the content to the large model for summarization.

5. Summary

The latest 8.8.1 release of Tencent Cloud Big Data Elasticsearch Service introduces the Elasticsearch Relevance Engine™ (ESRE™), bringing advanced, AI-enhanced search capabilities. It supports natural language processing, vector search, and integration with large models in a single end-to-end search and analytics platform. With this service, you can easily create clusters, deploy NLP models, run search and inference tasks, and manage and test models from the Kibana interface, putting AI-driven advanced search within easy reach. Go try it out!

New Customer Experience Zone

Exclusive benefits for new Elasticsearch Service customers!

2-core 4 GB instance with a 20 GB SSD cloud disk

With the first-month discount, the early-adopter price is only 87 yuan! Smoothly experience a cluster on the cloud and easily get started with ES!



Follow Tencent Cloud Big Data Official Account

Invite you to explore the infinite possibilities of data



Origin blog.csdn.net/cloudbigdata/article/details/131843265