Method for extracting company names from text

A method for extracting company names from text combines heuristics, exception lists, and extensive corpus analysis. The method first locates a company name suffix (e.g., Company, Corporation) and then attempts to locate the beginning of the company name. The method works on both mixed-case text and capitalized text. Once a company name is identified, the method generates variations of the name for later extraction.
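The suffix-first idea can be illustrated with a minimal sketch. This is not the method's actual implementation. The suffix list, the exception list, the abbreviation table, and the function names `find_company_names` and `name_variations` are all assumptions made for illustration. The sketch handles mixed-case text only, and it relies on capitalization alone to find the start of a name, where the described method also draws on corpus analysis.

```python
# Hypothetical suffix list; the source names only "Company" and "Corporation".
SUFFIXES = {"Company", "Corporation", "Corp", "Inc"}

# Hypothetical exception list: capitalized words that should not begin a name.
STOP_WORDS = {"The", "A", "An", "In", "At", "By", "For", "With", "From", "And"}

# Hypothetical abbreviation table used to generate name variations.
ABBREVIATIONS = {"Corporation": "Corp.", "Company": "Co."}

PUNCTUATION = ".,;:()"


def find_company_names(text):
    """Find a suffix, then walk left over capitalized words to the name start."""
    tokens = text.split()
    names = []
    for i, token in enumerate(tokens):
        suffix = token.strip(PUNCTUATION)
        if suffix not in SUFFIXES:
            continue
        start = i
        while start > 0:
            prev = tokens[start - 1]
            # Stop at a lowercase word, an exception-list word,
            # or a word that ends the previous clause.
            if (not prev[0].isupper()
                    or prev in STOP_WORDS
                    or prev[-1] in PUNCTUATION):
                break
            start -= 1
        if start < i:  # require at least one word before the suffix
            names.append(" ".join(tokens[start:i] + [suffix]))
    return names


def name_variations(name):
    """Return the full name, the name without its suffix, and an abbreviated form."""
    words = name.split()
    base = " ".join(words[:-1])
    variations = {name, base}
    short = ABBREVIATIONS.get(words[-1])
    if short:
        variations.add(f"{base} {short}")
    return variations


text = "Yesterday the Acme Widget Corporation said it will buy Globex Company."
found = find_company_names(text)
```

In this sketch, `found` holds the two names in the sample sentence, and `name_variations(found[0])` yields the full name, the name without its suffix, and the "Corp." form. Names containing lowercase connectors or internal commas (for example a comma before "Inc.") are not handled here.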


Reposted from sharehua.iteye.com/blog/1754710