When you type a question into a web forum or a search engine, you expect a reliable answer regardless of how the question is phrased. This expectation exposes one of the core problems of semantics in automatic natural language processing. Most search engines rely heavily on the words used in a question rather than on its meaning. As a result, they fail to recognize that two phrasings mean the same thing when they use different words.
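The limitation of word-based matching can be illustrated with a minimal sketch. It assumes a plain word-overlap (Jaccard) score, which is only a stand-in for word-based matching in general and not how any particular search engine works. The function names and the three example questions are invented for illustration.

```python
import re


def tokens(text):
    """Lowercase a question and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Word-overlap score: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical forum questions (not taken from any real data set).
q1 = "How do I fix a slow connection?"
q2 = "Why is my internet so sluggish?"   # same problem, different words
q3 = "How do I fix a slow cooker?"       # different problem, same words

same_meaning = jaccard(q1, q2)
same_words = jaccard(q1, q3)
print(f"same meaning, different words: {same_meaning:.2f}")
print(f"different meaning, same words: {same_words:.2f}")
```

The paraphrase shares no words with the first question and therefore scores lowest, while the unrelated question about a cooker scores high. A purely word-based measure ranks the two candidates in the wrong order.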
Shifting from expert knowledge to knowledge learned from data
Geraldine Damnati and Delphine Charlet, research engineers at Orange Labs and renowned language experts, study natural language semantics. They took first place at SemEval, an international semantic analysis competition. As Delphine explains, semantics is concerned with the meaning of words and texts. In the past, automatic natural language processing depended mostly on expert knowledge created by lexicographers and linguists.
Nowadays, such expert-built resources exist for many languages, but not for all. Thanks to AI techniques such as deep learning and statistical analysis, it is now possible to derive knowledge from huge quantities of text without a database that has already been interpreted by a human being. Nonetheless, understanding the exact meaning of a text remains the Holy Grail of AI.
Online forums are a valuable source of information, since they show genuine collective human intelligence at work. Nevertheless, forum content is still under-utilized. Knowledge bases make it easy to answer questions about how many, who or what, but much harder to answer questions about how or why. According to Geraldine, the question-and-answer model is fundamental to the AI field.
For the last ten years, the annual SemEval competition has brought together numerous teams from around the globe to tackle a range of semantic analysis tasks. The Community Question Answering task of the 2017 SemEval campaign dealt with the problem of recognizing similar questions in forums. In this case, the main challenge was to improve on the results returned by Google.
Delphine and the other members of the Deskin team won the competition with a solution that computes semantic similarity not only between words but also between texts that contain grammar and spelling errors. The team's approach uses machine learning to process the history of the forum in question and learn a representation of each word from the contexts in which it appears.
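The idea of learning word representations from context can be sketched as follows. This is not the team's actual system, whose details are not given here. It is a minimal illustration that assumes a classic co-occurrence approach: each word is represented by counts of the words that appear near it in past posts, and a question is represented by the sum of its word vectors. The toy forum history, the function names and the `window` parameter are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train_context_vectors(posts, window=2):
    """Represent each word by counts of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for post in posts:
        words = tokenize(post)
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def question_vector(question, vectors):
    """Sum the context vectors of the known words in a question."""
    total = Counter()
    for word in tokenize(question):
        total.update(vectors.get(word, Counter()))
    return total


def question_similarity(a, b, vectors):
    """Compare two questions through their summed context vectors."""
    return cosine(question_vector(a, vectors), question_vector(b, vectors))


# Hypothetical forum history, including one misspelling ("conection").
history = [
    "my internet connection is slow tonight",
    "my internet connection is sluggish tonight",
    "my internet conection is slow again",
    "my slow cooker stopped heating the food",
]
vectors = train_context_vectors(history)

typo_vs_word = cosine(vectors["conection"], vectors["connection"])
typo_vs_other = cosine(vectors["conection"], vectors["cooker"])
paraphrase = question_similarity("sluggish conection", "slow connection", vectors)
unrelated = question_similarity("sluggish conection", "slow cooker", vectors)
print(typo_vs_word > typo_vs_other, paraphrase > unrelated)
```

Because "conection" occurs in the same contexts as "connection", the two end up with similar vectors even though no spelling rule was written. For the same reason, the misspelled paraphrase scores closer to "slow connection" than to "slow cooker", although it shares no word with either. A real system would train on far more text and would likely use learned embeddings rather than raw counts.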
This raises the question of which fields stand to benefit most from the technology. The first natural applications are self-troubleshooting, customer care services and forums. The AI-powered tool can also be used as a semantic search engine.
Source: Orange Labs