Alibaba AI breaks the visual dialogue record, showing a human-like ability to talk about pictures

Recently, at the second Visual Dialogue Challenge, Alibaba AI beat ten participating teams, including Microsoft and Seoul National University, to win the championship.

(Alibaba AI took the crown in the Visual Dialogue Challenge)

The Visual Dialogue Challenge was launched jointly by Georgia Tech, Facebook's artificial intelligence lab (FAIR) and other organizations at CVPR, the world's leading computer vision conference. It is currently one of the most authoritative competitions in the field of visual dialogue.

The contest requires each participating AI, after reading ten thousand pictures, to answer any question a human asks about the content of any of them. According to the competition results, Alibaba AI won the championship with an accuracy of 74.57%, improving on the previous edition's record by 16.82 percent. On the same data set, human accuracy was only 64.27 percent.

Traditional visual AI mainly detects and identifies targets, for example deciding whether a picture shows a cat. It is weak at understanding and reasoning about the logical relationships between targets in complex scenes, so it cannot answer a complicated question such as "What color clothes is the boy next to the cat wearing?", and it has difficulty turning picture information into language that humans can understand.

Alibaba AI's breakthrough is a proposed "recursive dialogue exploration model" that integrates three abilities: image recognition, relational reasoning and natural language understanding. By imitating the way humans think and learn, it makes efficient use of the annotated information in complex scenes. It can effectively identify the entities in a picture and the relationships between them, and infer the events the picture depicts. Through effective modeling of the context, it understands the questions humans ask and their true intent, and gives natural, accurate responses.

(In visual dialogue, the AI easily handles human questions; the AI is on the left, the human on the right)

Visual dialogue is an area of AI research that has risen rapidly in recent years. Its purpose is to teach machines to discuss visual content with humans in natural language. If visual recognition technology gives a machine the ability to see, then visual dialogue technology gives it the ability to understand and reason about the real visual world, which means AI's cognitive ability has reached a new level.

(Visual dialogue technology is expected to improve the efficiency of earthquake relief)

It is understood that this technology will be applied to human-computer interaction and many other scenarios in the future. Rescue robots searching for survivors in the ruins after an earthquake could act in a more timely and efficient way by combining command instructions with information about the scene. Visually impaired people could ask Alibaba AI questions to understand the content of photos online and become aware of their surroundings. Driverless vehicles would more accurately understand the intent behind the factors that affect their driving, giving passengers a better ride experience.

Origin yq.aliyun.com/articles/706614