Introducing the Elastic Learned Sparse Encoder: Elastic's AI Model for Semantic Search

By: Aris Papadopoulos, Gilad Gal

Find meaning, not just words

We're excited to share that in 8.8, Elastic® provides semantic search out of the box. Semantic search is designed to search based on the intent or meaning of text, not on vocabulary matching or keyword queries. This is a qualitative leap forward from traditional lexical, term-based search, providing a breakthrough in relevance. Rather than simply matching query terms, it captures the relationships between words at a conceptual level, understands context, and surfaces relevant results based on meaning.

To remove the barriers to AI search, in 8.8 we're introducing a new semantic search model (currently in tech preview) trained and optimized by Elastic. Use it to immediately take advantage of the superior semantic relevance of vector search and hybrid search native to Elastic.

Introducing Elastic Learned Sparse Encoder, a new text expansion model for semantic search

Elastic has been investing in vector search and AI for the past three years, and in 8.0 released support for approximate nearest neighbor search (using HNSW in Lucene). Recognizing that the tools that enable semantic search are evolving rapidly, we provide third-party model deployment and management both programmatically and through the UI. With these capabilities combined, you can load a vector embedding model and perform vector search through the familiar search API.
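As a minimal sketch of what that looks like, the request below runs an approximate kNN search through the standard search API. It assumes a hypothetical index `my-index` with a `dense_vector` field `title_embedding` and an already-deployed third-party embedding model; the model ID shown is illustrative:

```
# Approximate kNN search; the query embedding is generated at search time
GET my-index/_search
{
  "knn": {
    "field": "title_embedding",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
        "model_text": "leadership courses for engineering managers"
      }
    }
  }
}
```

Here `query_vector_builder` asks the deployed model to embed the query text at search time, so the client never has to handle raw vectors itself.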

"Let's say an employee is looking for leadership courses. With vector search in Elastic Enterprise Search, we can better understand user intent and return courses tailored to their industry, organization, and role."

- Jon Ducrou, Senior Vice President of Engineering, Go1


The impact of vector search can be striking. But to achieve these results, organizations need significant expertise and effort far beyond typical software productization. This includes annotating a sufficient number of queries (typically on the order of tens of thousands) in the domain to be searched, retraining machine learning embedding models on that domain to achieve domain adaptation, and maintaining the models to prevent drift. At the same time, you may not want to rely on third-party models due to privacy, support, competitive, or licensing concerns. As a result, AI-powered search has remained beyond the reach of most users.

With this in mind, we're introducing the Elastic Learned Sparse Encoder in 8.8 as a technical preview. You can start using this new retrieval model with the click of a button in the Elastic UI for a wide variety of use cases, with no machine learning expertise or deployment effort required.

[Image: trained model management in the Machine Learning UI]
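If you prefer the API to the UI button, a sketch of the equivalent trained-models calls follows, using `.elser_model_1`, the model ID documented for the 8.8 technical preview:

```
# Download the ELSER model configuration into the cluster
PUT _ml/trained_models/.elser_model_1
{
  "input": {
    "field_names": ["text_field"]
  }
}

# Start a deployment so the model can serve inference requests
POST _ml/trained_models/.elser_model_1/deployment/_start?wait_for=started
```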

Advanced semantic search out of the box

Elastic Learned Sparse Encoder uses text expansion to inject meaning into, and improve the relevance of, simple search queries. It captures the semantic relationships between English words and, based on these relationships, expands the search query to include related terms that are not present in the query itself. This is more powerful than adding synonyms on top of lexical scoring (BM25), because it uses this deeper, language-scale knowledge to optimize relevance. Not only that, it also takes context into account, helping to disambiguate words that may have different interpretations in different sentences.

As a result, the model helps alleviate the vocabulary mismatch problem: even when a query's terms do not appear in a document, Elastic Learned Sparse Encoder will still return the relevant documents (if they exist).
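At query time, this expansion is expressed through the text_expansion query that ships with the model. A minimal sketch, assuming documents have been enriched into an `ml.tokens` field on a hypothetical index `my-index`:

```
# Semantic search: the model expands the query into weighted related terms
GET my-index/_search
{
  "query": {
    "text_expansion": {
      "ml.tokens": {
        "model_id": ".elser_model_1",
        "model_text": "How do I improve my leadership skills?"
      }
    }
  }
}
```

The expansion happens behind the scenes, so documents can match on related terms they never literally share with the query.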

In our comparisons, this novel retrieval model outperforms lexical search on 11 of 12 prominent relevance benchmarks, while a hybrid of the two performs best on all 12. In other words, you get strong performance across different domains out of the box, and in some cases the best performance comes from combining it with lexical search. If you've already put effort into fine-tuning lexical search for your domain, you can get an extra boost from hybrid scoring!


Why choose Elastic's Learned Sparse Encoder?

Best of all, you can use this new model out of the box, without domain adaptation; it is a sparse vector model that performs well out of domain, i.e., zero-shot (we explain this in detail below). Let's look at how these traits translate directly into value for your search application:

  • Our model is trained and architected in such a way that you don't need to fine-tune it on your data. As a cross-domain model, it outperforms dense vector models without domain-specific retraining. In other words, just hit "deploy" in the UI and it will deliver state-of-the-art semantic search on your data.
  • Our model outperforms SPLADE (Sparse Lexical and Expansion model), the previous cross-domain champion among sparse vector text expansion models, on the same benchmarks.
  • Plus, you don't have to worry about licensing, support, continuity, or scalability concerns beyond Elastic's license tiers. For example, SPLADE is licensed for non-commercial use only. Our model is available in our Platinum subscription tier.
  • As a sparse vector representation, it runs on Elasticsearch's Lucene-based inverted index. This means decades of optimizations can be leveraged to deliver the best possible performance; as a result, Elastic offers one of the most powerful and effortless hybrid search solutions on the market (see the mapping sketch after this list).
  • For the same reason, it is both more efficient and more interpretable: activations have fewer dimensions than dense representations, and they usually map directly to words, unlike opaque dense embeddings. In cases of vocabulary mismatch, this makes it clear exactly which terms, absent from the query, triggered a result.
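To make the sparse representation concrete, here is a minimal mapping sketch based on what the 8.8 documentation describes; the index name and the `body_text` source field are hypothetical. The `rank_features` field type stores the model's output as readable term-weight pairs in the inverted index:

```
# Index mapping: rank_features holds the sparse expansion terms and weights
PUT my-index
{
  "mappings": {
    "properties": {
      "ml.tokens": { "type": "rank_features" },
      "body_text": { "type": "text" }
    }
  }
}
```

An enriched document then carries a plain term-to-weight map in ml.tokens (illustratively, something like {"leader": 1.2, "course": 0.9}), so you can inspect exactly which expanded terms caused a match.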


Let's talk about performance and Elasticsearch as a vector database

Storing vectors with tens of thousands of dimensions and computing vector similarity over them might sound like a scale and latency challenge. However, sparse vectors compress very well, and the Elasticsearch (and Lucene) inverted index is a powerful technical approach for this use case. Vector similarity is also a less computationally intensive operation for Elastic, since Elasticsearch applies some neat inverted-index tricks under the hood. Overall, query performance and index size are excellent with our sparse retrieval model, requiring fewer resources than typical dense vector indexes.

That said, regardless of platform, vector search (sparse or dense) has an inherently larger memory footprint and time complexity than lexical search. Elastic is optimized as a vector database, with every possible advantage applied at all levels, from data structures to algorithms. So although learned sparse retrieval may require more resources than lexical search, depending on your application and data, the relevance gains it provides may well be worth the investment.

The Future: The Most Powerful Hybrid Search on the Market

In this first technical preview, we limit the input to 512 tokens, which roughly corresponds to the first 300-400 words of each field passed through the inference pipeline. This is sufficient for many use cases, and we are investigating ways to handle longer documents in future releases. For a successful early evaluation, we recommend using documents in which most of the information is contained in the first 300-400 words.
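That inference pipeline is a standard ingest pipeline with an inference processor. A minimal sketch, assuming a hypothetical source field `body_text` that we map onto the model's expected `text_field` input:

```
# Ingest pipeline: runs ELSER on each document and writes tokens to ml.tokens
PUT _ingest/pipeline/elser-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_1",
        "target_field": "ml",
        "field_map": { "body_text": "text_field" },
        "inference_config": {
          "text_expansion": { "results_field": "tokens" }
        }
      }
    }
  ]
}
```

Documents indexed (or reindexed) through this pipeline get their expansion written to ml.tokens, the field queried in the earlier search example.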

After evaluating different relevance models, we found that the best results came from an ensemble of different ranking methods. You can combine vector search (with or without the new retrieval model) with Elastic's lexical search through our streamlined search API. Linearly combining the normalized scores of each method gives excellent results. 
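One straightforward way to express such a combination in a single request is a boolean query with per-clause boosts. A minimal sketch with hypothetical index and field names; note that boosts weight raw, unnormalized scores, so this only approximates the linear combination of normalized scores described above:

```
# Hybrid search: semantic (text_expansion) and lexical (match) clauses combined
GET my-index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_1",
              "model_text": "leadership courses",
              "boost": 1.0
            }
          }
        },
        {
          "match": {
            "body_text": {
              "query": "leadership courses",
              "boost": 0.5
            }
          }
        }
      ]
    }
  }
}
```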

However, we wanted to push the boundaries and deliver the most powerful hybrid search on the market by eliminating the search-science effort of fine-tuning weights for your scores, data, and queries. To this end, we released Reciprocal Rank Fusion (RRF) in 8.8, initially available with third-party models in Elastic, and we are working on integrating our sparse retrieval model and lexical search via RRF in subsequent releases. This way, you'll be able to take advantage of Elastic's innovative hybrid search architecture, combining semantic, lexical, and multimedia search through the mature Elastic search API you've known and trusted for years.
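Because the RRF preview initially ships with third-party model support, the sketch below reuses the hypothetical dense-embedding setup from earlier to fuse lexical and vector rankings; all index, field, and model names are illustrative:

```
# RRF: fuse the lexical match ranking with the kNN vector ranking
GET my-index/_search
{
  "query": {
    "match": { "body_text": "leadership courses" }
  },
  "knn": {
    "field": "title_embedding",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
        "model_text": "leadership courses"
      }
    }
  },
  "rank": {
    "rrf": {
      "window_size": 50,
      "rank_constant": 20
    }
  }
}
```

RRF merges the two result lists by rank position rather than by score, which is what eliminates the need to normalize or hand-tune score weights.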

Finally, as we work toward a GA production-ready release, we are exploring strategies for handling long documents and overall optimizations to further improve performance.

Get started today with Elastic's AI-powered search

To try Elastic Learned Sparse Encoder, head to the Trained Models view in Machine Learning or Enterprise Search and start enriching your ingested data with semantically relevant terms at the click of a button. If you don't already have access to Elastic, you can request access to the required premium trial here.

To learn more about our investments and trajectory in vector search and AI, watch the ElasticON Global Spotlight with Matt Riley, General Manager of Enterprise Search.


For a more in-depth look at the architecture and training of the new model, read this blog from our machine learning experts.

To learn how to use the model for semantic and hybrid search, visit our API and requirements documentation.

The release and timing of any features or functionality described in this blog post is at the sole discretion of Elastic. Any features or functionality not currently available may not be delivered on time or at all.
