Elastic Announces Elasticsearch Relevance Engine™ — Powering Advanced Search for the AI Revolution

By Matt Riley

Today we're introducing the Elasticsearch Relevance Engine™ (ESRE™), a new capability for building highly relevant AI search applications. ESRE builds on Elastic's leadership in search and more than two years of machine learning research and development. It combines AI best practices with Elastic's text search, giving developers a full set of capabilities for integrating sophisticated retrieval algorithms and large language models (LLMs). Better still, ESRE is accessible through Elastic's simple, unified APIs, already trusted and widely used by the Elastic community, so developers can start improving search relevance immediately.

Elasticsearch Relevance Engine launch

The configurable features of the Elasticsearch Relevance Engine improve relevancy in the following ways:

  • Apply advanced relevance ranking features, including BM25f, a key component of hybrid search
  • Create, store, and search dense vectors using Elastic's vector database
  • Process text using various natural language processing (NLP) tasks and models
  • Let developers manage and use their own transformer models in Elastic, adapting to business-specific contexts
  • Integrate via API with third-party transformer models such as OpenAI's GPT-3 and GPT-4 to extract intuitive summaries from content retrieved from customer data stores aggregated in Elasticsearch clusters
  • Use Elastic's out-of-the-box Learned Sparse Encoder model to implement ML-based search without training or maintaining a model, providing highly relevant semantic search across a variety of domains
  • Integrate with third-party tools such as LangChain to help build complex data pipelines and generative AI applications
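To make the hybrid search item above concrete, here is a minimal sketch of a search request body that pairs a BM25 `match` query with approximate kNN vector retrieval. The index fields (`title`, `title_vector`) and the query vector are hypothetical placeholders, not part of the announcement:

```python
# Sketch of an Elasticsearch hybrid-search request body that combines
# traditional BM25 text scoring with approximate kNN vector retrieval.
# Field names ("title", "title_vector") and the query vector are
# illustrative assumptions.

def build_hybrid_query(text, query_vector, k=10, num_candidates=100):
    """Return a request body suitable for the _search endpoint."""
    return {
        "query": {          # lexical (BM25) side of the hybrid search
            "match": {"title": text}
        },
        "knn": {            # dense-vector side of the hybrid search
            "field": "title_vector",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        },
    }

body = build_hybrid_query("laptop sleeve", [0.12, -0.53, 0.97])
```

Such a body would be sent to an index's `_search` endpoint; the two result sets can then be blended by a scorer such as RRF.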

The evolution of search has always been driven by the need to increase relevance and continuously improve the way search applications interact. Highly relevant search results can lead to increased user engagement on search applications, with a significant impact on revenue and productivity. In the new world of LLM and generative AI, search can go further, understand user intent, and provide unprecedented specificity in responses.

Notably, each advance in search provides better relevance while addressing new challenges posed by emerging technologies and changing user behavior. Whether extending keyword search to provide semantic search or enabling new search modes for video and images, new technologies require unique tools to provide a better experience for search users. Likewise, today's AI world requires a new, highly scalable developer toolkit built on a broadly proven, customer-tested technology stack.

With the momentum of generative AI, the popularity of tools like ChatGPT, and growing awareness of the power of large language models, developers are eager to experiment with techniques to improve their applications. The Elasticsearch Relevance Engine brings new capabilities to the world of generative AI, ushering in the modern era with powerful tools that any development team can use immediately.

Elasticsearch Relevance Engine is now available on Elastic Cloud, the only managed Elasticsearch service that includes all the new features in this latest release. You can also download the Elastic Stack and our cloud orchestration products, Elastic Cloud Enterprise and Elastic Cloud on Kubernetes, to gain experience on self-managed clusters.

Want to learn more about Elasticsearch Relevance Engine™? Check out these tech blogs:

Overcoming the Limitations of Generative AI Models

The Elasticsearch Relevance Engine™ is well positioned to help developers rapidly address the following challenges of natural language search, including generative AI.

1) Enterprise data/context awareness: Models may lack sufficient internal knowledge of a specific domain, because their knowledge is limited to the dataset on which they were trained. To customize the content that LLMs generate, businesses need a way to feed models their proprietary data so they can provide more relevant, business-specific information.

2) Superior relevance: The Elasticsearch Relevance Engine makes it easy to integrate data from private sources: simply generate and store embeddings, then use semantic search to retrieve context. Embeddings are numerical representations of words, phrases, or documents that help LLMs understand the meaning of words and the relationships between them. These embeddings let transformer models operate at greater speed and scale. ESRE also lets developers bring their own transformer models into Elastic or integrate with third-party models.
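As a sketch of what "generate and store embeddings" implies on the index side, the mapping below declares a dense-vector field alongside the source text. The field name `content_vector` and the dimension 384 are illustrative assumptions; the dimension must match whatever embedding model you use:

```python
# Sketch of an index mapping for storing embeddings in a vector
# database. "content_vector" and dims=384 are hypothetical; the
# dimension must equal your embedding model's output size.

mapping = {
    "mappings": {
        "properties": {
            "content": {"type": "text"},          # original text, for BM25
            "content_vector": {                   # embedding of the text
                "type": "dense_vector",
                "dims": 384,
                "index": True,                    # enable kNN retrieval
                "similarity": "cosine",           # distance metric
            },
        }
    }
}
```

At ingest time each document would carry both the raw text and its embedding, so lexical and semantic retrieval can run against the same index.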

We also realized that the availability of late-interaction models allowed us to provide out-of-the-box functionality, without extensive training or fine-tuning on third-party datasets. Since not every development team has the resources or expertise to train and maintain machine learning models, or to weigh the trade-offs among scale, performance, and speed, the Elasticsearch Relevance Engine also provides Elastic Learned Sparse Encoder, a retrieval model built for cross-domain semantic search. The model pairs sparse vectors with traditional keyword-based BM25 search and provides an easy-to-use Reciprocal Rank Fusion (RRF) scorer for hybrid search. ESRE gives developers machine-learning-driven relevance and hybrid search technology on day one.
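The sparse vectors mentioned above can be pictured as token-to-weight maps, with relevance scored as a dot product over shared tokens. This toy sketch uses made-up weights; a real learned sparse encoder produces them from text:

```python
# Toy illustration of how a learned sparse representation scores a
# document: both query and document are token->weight maps, and the
# relevance score is the dot product over their shared tokens.
# All weights here are invented for illustration.

def sparse_dot(query_weights, doc_weights):
    """Dot product over tokens present in both sparse vectors."""
    return sum(
        w * doc_weights[t]
        for t, w in query_weights.items()
        if t in doc_weights
    )

query = {"laptop": 1.2, "computer": 0.7, "portable": 0.4}
doc = {"laptop": 0.9, "portable": 0.5, "bag": 0.3}
score = sparse_dot(query, doc)  # 1.2*0.9 + 0.4*0.5
```

Because the expanded tokens are interpretable terms rather than opaque dimensions, this style of model also composes naturally with BM25 scoring, which is what makes the hybrid pairing work.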

3) Privacy and security: Data privacy is central to how businesses use and securely transmit proprietary data across networks and components, even when building innovative search experiences.

Elastic provides native support for role-based and attribute-based access control to ensure that only those roles authorized to access data can see it, even for chat and question-answer applications. Elasticsearch can support your organization's need to keep certain documents accessible to privileged individuals, helping your organization maintain common privacy and access controls across all search applications.

When privacy is the paramount concern, keeping all data within your organization's network is a must. From deploying applications in isolated environments to enabling access to secure networks, ESRE provides the tools your organization needs to keep its data safe.

4) Scale and cost: Running large language models can be impractical for many businesses because of the volume of data involved and the computing power and memory required. Yet businesses that want to build their own generative AI applications, such as chatbots, need to combine LLMs with their private data.

The Elasticsearch Relevance Engine gives enterprises an engine for efficiently delivering relevance, using precise context windows that help reduce the data footprint without the associated hassle and expense.

5) Staleness: A model is frozen at the point in time when its training data was collected, so the content a generative AI model produces is only as current as its training. Integrating enterprise data is essential if LLMs are to deliver timely results.


6) Hallucinations: When answering questions or holding interactive dialogues, LLMs may concoct facts that sound believable and convincing but are actually untrue. This is another crucial reason why LLMs need to be combined with contextualized, tailored knowledge before the models are usable in a business setting.

The Elasticsearch Relevance Engine lets developers connect generative AI models to their own data stores through the models' context windows. The added search results can supply up-to-date information from private sources or specialized domains, so the model returns more factual answers when prompted rather than relying solely on its so-called "parametric" knowledge.
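The context-window pattern described above is essentially prompt assembly: retrieved passages are stuffed into the prompt alongside the user's question. A minimal sketch, with an illustrative template and made-up passage text:

```python
# Minimal sketch of retrieval-augmented generation: search results
# are placed into the model's context window together with the
# user's question. The template and passage text are illustrative.

def build_prompt(question, passages):
    """Assemble an LLM prompt from retrieved context passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

prompt = build_prompt(
    "What is our refund window?",
    ["Refunds are accepted within 30 days of purchase."],
)
```

The assembled prompt would then be sent to whichever language model the developer chooses, grounding the answer in private, up-to-date data.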


Leverage the Power of Vector Databases

The Elasticsearch Relevance Engine includes a resilient, production-grade vector database by design, giving developers a foundation for building rich semantic search applications. Using Elastic's platform, development teams can use dense vector retrieval to deliver more intuitive question answering that isn't limited by keywords or synonyms. They can build multimodal search from unstructured data such as images, and even model user profiles to personalize search results in product discovery, job search, or matchmaking applications. NLP transformer models also enable machine learning tasks such as sentiment analysis, named entity recognition, and text classification. Elastic's vector database lets developers create, store, and query vectors at the scale and performance real production applications demand.
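The math behind dense vector retrieval is easy to see in miniature: rank documents by cosine similarity to a query vector. The tiny 3-d vectors below are made up for illustration; a production vector database does this at scale with approximate-kNN indexes rather than a brute-force scan:

```python
# Brute-force nearest-neighbor sketch of dense vector retrieval:
# rank documents by cosine similarity to the query embedding.
# The 3-d vectors are invented toy data.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

docs = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.0, 1.0, 0.2],
    "doc3": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

Because similarity is computed in embedding space, documents can match a query that shares no keywords with them, which is what frees semantic search from synonym lists.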

Elasticsearch excels at highly relevant search retrieval. With ESRE, Elasticsearch provides generative AI with a context window connected to enterprise proprietary data, allowing developers to build more engaging and accurate search experiences. Search results are returned based on the user's original query, and developers can pass the data to a language model of their choice to provide answers with additional context. Elastic accelerates question answering and personalization by leveraging relevant context from your enterprise data stores, private and tailored to your business.


Superior out-of-the-box relevance for all developers

With the release of the Elasticsearch Relevance Engine, we are making Elastic's proprietary retrieval model readily available. The model is easy to download and works with all of our ingestion mechanisms, such as the Elastic web crawler, connectors, or APIs. Developers can use it with their searchable corpora out of the box, and it's small enough to fit in a laptop's RAM. Elastic Learned Sparse Encoder provides cross-domain semantic search for use cases such as knowledge bases, academic journals, legal discovery, and patent databases, delivering highly relevant search results without tuning or training.
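As a sketch of how a deployed sparse retrieval model might be queried, the request body below uses a `text_expansion` query. The token field name and the model id are illustrative assumptions and may differ by release; check your deployment for the actual values:

```python
# Sketch of a request body for querying with a deployed sparse
# retrieval model via a text_expansion query. The "ml.tokens" field
# and the model id are illustrative assumptions.

def build_text_expansion_query(text, model_id=".elser_model_1"):
    """Return a search request body that expands the query text
    into weighted tokens using the deployed sparse encoder."""
    return {
        "query": {
            "text_expansion": {
                "ml.tokens": {              # field holding sparse token weights
                    "model_id": model_id,   # deployed sparse encoder
                    "model_text": text,     # query text, expanded at search time
                }
            }
        }
    }

body = build_text_expansion_query("how do I reset my password")
```

Because the expansion happens at search time inside the cluster, the application only ever sends plain query text.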

Most real-world tests show that hybrid ranking techniques produce the most relevant search results. Until now, one key component has been missing: RRF. We now offer RRF for your application search needs so you can combine vector and text search capabilities.
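Reciprocal Rank Fusion itself is a small, rank-only formula: each document's fused score is the sum of 1/(k + rank) over every result list it appears in. A self-contained sketch, with invented document ids and the commonly used constant k=60 from the original RRF paper:

```python
# Reciprocal Rank Fusion sketch: merge a BM25 result list and a
# vector-search result list by summing 1/(k + rank) per document.
# Document ids are invented; k=60 is the conventional constant.

def rrf(result_lists, k=60):
    """Fuse ranked lists of doc ids into one list, best first."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["a", "b", "c"]      # lexical ranking
vector_hits = ["b", "d", "a"]    # semantic ranking
fused = rrf([bm25_hits, vector_hits])
```

Because RRF uses only ranks, it needs no score normalization between the BM25 and vector sides, which is exactly why it suits hybrid search.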

Machine learning is leading the way in enhancing the relevance of search results to semantic context, but cost, complexity, and resource requirements often make it difficult for developers to implement it effectively. Developers often need the support of specialized machine learning or data science teams to build highly relevant AI-driven searches. These teams spend a lot of time selecting appropriate models, training them on domain-specific datasets, and maintaining models as the data and its relationships change.


Learn how Go1 uses Elastic's vector database for scalable semantic search.

Developers without the support of a specialized team can now implement semantic search and benefit from AI-driven search relevance from the start, without the effort and expertise other alternatives require. Starting today, all customers have the building blocks for better relevance and more modern, smarter search.


Try it

Learn about these features and view more information.

Existing Elastic Cloud customers can access many of these capabilities directly from the Elastic Cloud console. Not yet using Elastic Cloud? Learn how to use Elasticsearch with LLMs and generative AI.

The release and timing of any features or functionality described in this blog post is at the sole discretion of Elastic. Any functionality or features not currently available may be delayed or not appear at all.
Elastic, Elasticsearch, Elasticsearch Relevance Engine, ESRE, Elastic Learned Sparse Encoder, and related marks are trademarks, logos, or registered trademarks of Elasticsearch NV in the US and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners.
