Getting full names from NER

Patrick Lightbody :

From reading through the docs and playing with the API, it looks like CoreNLP will tell me the NER tags per token, but it won't help me extract out full names from a sentence. For example:

Input: John Wayne and Mary have coffee
CoreNLP Output: (John,PERSON) (Wayne,PERSON) (and,O) (Mary,PERSON) (have,O) (coffee,O)
Desired Result: list of PERSON ==> [John Wayne, Mary]

Unless there is some flag I missed, I believe to do this I will need to parse the tokens and glue together successive tokens tagged PERSON.

Can someone confirm that this is indeed what I need to do? I mostly want to know if there is some flag or utility in CoreNLP that does something like this for me. Bonus points if someone has a utility (ideally Java, since I'm using the Java API) that does this and wants to share :)


PS: There was a very similar question here, which seems to suggest the answer is "roll your own", but it was never confirmed by anyone else.

Manos Nikolaidis :

You are probably looking for entity mentions instead of, or as well as, NER tags. For example, with the Simple API:

new Sentence("Jimi Hendrix was the greatest").nerTags()

new Sentence("Jimi Hendrix was the greatest").mentions()
[Jimi Hendrix]

The link above has an example with the traditional, non-simple API using a good old StanfordCoreNLP pipeline.
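If you would still rather roll your own, as the question proposes, here is a minimal sketch of gluing together successive tokens that share a tag. It is plain Java with no CoreNLP dependency: the class name NerMerger and the method mergeEntities are hypothetical helpers made up for illustration, and the sketch assumes you have already pulled the token texts and their NER tags out of the pipeline as two parallel lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of CoreNLP.
final class NerMerger {

    // Joins each run of consecutive tokens tagged `label` into one string.
    // Assumes `tokens` and `tags` are parallel lists of equal length.
    static List<String> mergeEntities(List<String> tokens, List<String> tags, String label) {
        if (tokens.size() != tags.size()) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> entities = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            if (label.equals(tags.get(i))) {
                // Still inside a run: append the token, space-separated.
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(tokens.get(i));
            } else if (current.length() > 0) {
                // The run just ended: emit it and reset the buffer.
                entities.add(current.toString());
                current.setLength(0);
            }
        }
        // Emit a run that reaches the end of the sentence.
        if (current.length() > 0) {
            entities.add(current.toString());
        }
        return entities;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("John", "Wayne", "and", "Mary", "have", "coffee");
        List<String> tags = List.of("PERSON", "PERSON", "O", "PERSON", "O", "O");
        System.out.println(mergeEntities(tokens, tags, "PERSON")); // prints [John Wayne, Mary]
    }
}
```

One caveat with this approach: two different people mentioned back to back with no untagged token between them would be merged into a single name, so the built-in entity mentions are probably the safer choice when they are available.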
