Elasticsearch - Fuzzy search not giving suggestions

Gabriel :

I am trying to implement a fuzzy/autocomplete search in Elasticsearch through NodeJS. I have indexed the data into an index named "artist". Here is an example of the data stored in ES.

{
  "hits": [{
    "_index": "artist",
    "_type": "_doc",
    "_id": "EyejqnAB2pHGVJHwV53Q",
    "_score": 1,
    "_source": {
      "kind": "song",
      "artistId": 111051,
      "artistName": "Eminem",
      "trackName": "Crack a Bottle (feat. Dr. Dre & 50 Cent)",
      "collectionName": "Relapse (Deluxe Version)",
      "collectionCensoredName": "Relapse (Deluxe Version)",
      "artistViewUrl": "https://music.apple.com/us/artist/eminem/111051?uo=4",
      "collectionViewUrl": "https://music.apple.com/us/album/crack-a-bottle-feat-dr-dre-50-cent-feat-dr-dre-50-cent/1440558626?i=1440558826&uo=4",
      "trackViewUrl": "https://music.apple.com/us/album/crack-a-bottle-feat-dr-dre-50-cent-feat-dr-dre-50-cent/1440558626?i=1440558826&uo=4",
      "previewUrl": "https://audio-ssl.itunes.apple.com/itunes-assets/AudioPreview128/v4/da/a5/c1/daa5c140-2c3d-1f74-40c3-b6e596e52b82/mzaf_7480202713407880256.plus.aac.p.m4a",
      "artworkUrl100": "https://is1-ssl.mzstatic.com/image/thumb/Music128/v4/c5/f8/fd/c5f8fdf6-d4c9-85c9-d169-c5d349a44f1c/source/100x100bb.jpg",
      "collectionPrice": 12.99,
      "releaseDate": "2009-02-02T12:00:00Z",
      "collectionExplicitness": "explicit",
      "trackExplicitness": "explicit",
      "discCount": 1,
      "discNumber": 1,
      "trackCount": 24,
      "trackNumber": 18,
      "country": "USA",
      "currency": "USD"
    }
  }]
}

In the document above, artistName has the value Eminem. The problem is that when I type 'e', nothing is shown, and the same happens for 'em', 'emi', and 'emin'. Only when I type 'emine' does it start returning results. Where am I going wrong?

Opster Elasticsearch Ninja :

There are multiple ways to implement autocomplete, and fuzzy search is not the right one. Fuzzy search matches terms that are within a small edit distance of the search term, so it is mainly used to find near-duplicate documents (de-duplication) and in spell checkers.
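To make the behaviour in the question concrete, here is a rough sketch of how edit-distance matching treats each partial input. It assumes the query used Elasticsearch's default fuzziness, AUTO (exact match for terms of 1-2 characters, one edit for 3-5, two edits for longer terms), and that the indexed term is the lowercased "eminem"; the question does not show the actual query, so treat both as assumptions. The function names are illustrative only, and the distance here is plain Levenshtein (Elasticsearch also counts a transposition as one edit by default, which makes no difference for these inputs).

```javascript
// Plain Levenshtein distance: insertions, deletions and substitutions.
function levenshtein(a, b) {
  // row[j] holds the distance between the first i chars of a and first j chars of b.
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // d[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // d[i-1][j]
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Edits allowed by fuzziness AUTO, based on the length of the search term.
function autoFuzziness(term) {
  if (term.length < 3) return 0;
  if (term.length < 6) return 1;
  return 2;
}

// Hypothetical helper: would a fuzzy search for `input` match `indexedTerm`?
function wouldFuzzyMatch(input, indexedTerm) {
  return levenshtein(input, indexedTerm) <= autoFuzziness(input);
}

for (const input of ['e', 'em', 'emi', 'emin', 'emine']) {
  console.log(input, wouldFuzzyMatch(input, 'eminem'));
}
// e false, em false, emi false, emin false, emine true
```

Under these assumptions, "emin" is two insertions away from "eminem" but only one edit is allowed, while "emine" is one insertion away, which matches what the question describes.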

In your case, I would suggest using the prefix query, provided your index isn't huge, and restricting the minimum input length to two characters. That is, don't search for e; show results only once the user has typed two or more characters, e.g. em, emi, or emin.

Working example

Index mapping

{
    "mappings": {
        "properties": {
            "artistName": {
                "type": "text"
            }
        }
    }
}

Index doc

{
   "artistName" : "Eminem"
}

{
   "artistName" : "Emiten"
}

Search query

{
    "query": {
        "prefix": {
            "artistName": {
                "value": "em"
            }
        }
    }
}

Search result

{
    "_index": "so-60558525-auto",
    "_type": "_doc",
    "_id": "1",
    "_score": 1.0,
    "_source": {
        "artistName": "Eminem"
    }
},
{
    "_index": "so-60558525-auto",
    "_type": "_doc",
    "_id": "2",
    "_score": 1.0,
    "_source": {
        "artistName": "Emiten"
    }
}
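Since the question mentions NodeJS, the prefix query and the two-character minimum could be wired together roughly as follows. This is a minimal sketch, not a definitive implementation: the function name and constant are hypothetical, and the input is lowercased on the assumption that artistName is a text field analyzed with the default standard analyzer (as in the mapping above), because the prefix query does not analyze its input and would otherwise miss "Em".

```javascript
// Minimum number of typed characters before a search is issued.
const MIN_PREFIX_LENGTH = 2;

// Hypothetical helper: builds the request body for the prefix query,
// or returns null when the input is too short to search on.
function buildArtistPrefixQuery(userInput) {
  // Prefix queries are not analyzed, so lowercase the input to line up
  // with the lowercased terms the standard analyzer stores in the index.
  const prefix = String(userInput).trim().toLowerCase();
  if (prefix.length < MIN_PREFIX_LENGTH) {
    return null;
  }
  return {
    query: {
      prefix: {
        artistName: { value: prefix },
      },
    },
  };
}

console.log(buildArtistPrefixQuery('e'));  // null
console.log(JSON.stringify(buildArtistPrefixQuery('Em')));
// {"query":{"prefix":{"artistName":{"value":"em"}}}}
```

The returned object is the same JSON as the search query above, so it can be sent as the search request body to the "artist" index with whichever client the application already uses; when the function returns null, the caller would simply skip the request.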

Important read

There are broadly four approaches to implementing autocomplete, and each has trade-offs. Be aware of them and weigh them against your functional and non-functional requirements (performance, maintenance, implementation difficulty).

Given how important and widespread autocomplete is in modern search engines, there is a detailed blog post covering all the approaches and their trade-offs.

