To understand a text not only superficially but also in its deeper meaning, it is not enough to recognize single entities in isolation; the recognized entities have to be embedded into their contexts. Word embeddings based on Word2vec or co-occurrence analyses are statistical methods of text corpus analysis that are often used to automatically compute the contexts of terms and phrases.
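As a rough illustration of this kind of context computation, the following Python sketch counts window-based co-occurrences over a toy corpus; the corpus, the window size, and the example term are assumptions chosen for illustration, not taken from the paper. A Word2vec model would replace the raw counts with dense vectors, but the underlying notion of the "context of a term" is the same.

```python
# Minimal sketch of a window-based co-occurrence analysis, one of the two
# techniques mentioned above. Corpus, window size, and query term are
# illustrative assumptions.
from collections import Counter, defaultdict

corpus = [
    "the bank approved the loan",
    "the river bank was flooded",
    "the loan was approved by the bank",
]
window = 2  # symmetric context window (assumed value)

cooc = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooc[word][tokens[j]] += 1

# The most frequent co-occurring words approximate the "context" of a term.
print(cooc["bank"].most_common(3))
```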
Conversational systems, also known as dialogue systems, have become increasingly popular. They can perform a variety of tasks, e.g. in B2C areas such as sales and customer service. A significant amount of research has already been conducted on improving the underlying algorithms of the natural language understanding (NLU) component of dialogue systems. This paper presents an approach to generating training datasets for the NLU component from Linked Data resources. We analyze how differently designed training datasets impact the performance of the NLU component.
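One way such a generation step could look is sketched below: entity labels are retrieved from a Linked Data source via SPARQL and inserted into intent-utterance templates. The DBpedia endpoint, the query, the intent name, and the template phrasing are illustrative assumptions; the paper's actual pipeline and dataset designs may differ.

```python
# Hedged sketch: filling intent-utterance templates with entity labels pulled
# from a Linked Data source to obtain labelled NLU training examples.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")  # assumed example endpoint
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
        ?city a <http://dbpedia.org/ontology/City> ;
              rdfs:label ?label .
        FILTER (lang(?label) = "en")
    } LIMIT 20
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

templates = [
    "I want to book a trip to {e}",
    "Show me hotels in {e}",
]

# Each filled template becomes one labelled training example for the NLU component.
training_data = [
    {"intent": "travel_request", "text": t.format(e=row["label"]["value"])}
    for row in results["results"]["bindings"]
    for t in templates
]
print(training_data[:3])
```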
Ontology alignment plays a key role in achieving interoperability on the Semantic Web. Inspired by the success of word embedding techniques in several NLP tasks, we propose a new ontology alignment approach based on the combination of word embeddings and the radius measure. We tested our system on the OAEI conference track and then applied it to aligning ontologies in a real-world case study.
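The embedding-similarity core of such an approach can be sketched as follows: class labels from two ontologies are embedded as averaged word vectors and matched when their cosine similarity exceeds a threshold. The toy vectors, labels, and threshold below are assumptions, and the paper's radius measure is not reproduced here.

```python
# Minimal sketch of embedding-based label matching between two ontologies.
import numpy as np

# Hypothetical pre-trained word vectors (in practice loaded from word2vec/GloVe).
vectors = {
    "paper":      np.array([0.9, 0.1, 0.0]),
    "article":    np.array([0.85, 0.15, 0.05]),
    "conference": np.array([0.1, 0.9, 0.2]),
    "meeting":    np.array([0.15, 0.8, 0.25]),
    "accepted":   np.array([0.2, 0.3, 0.9]),
}

def embed(label):
    """Average the word vectors of a (possibly multi-word) class label."""
    words = [w for w in label.lower().split() if w in vectors]
    return np.mean([vectors[w] for w in words], axis=0) if words else None

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

onto1 = ["Accepted Paper", "Conference"]
onto2 = ["Article", "Meeting"]
threshold = 0.8  # assumed similarity cut-off

for c1 in onto1:
    for c2 in onto2:
        e1, e2 = embed(c1), embed(c2)
        if e1 is not None and e2 is not None and cosine(e1, e2) >= threshold:
            print(f"{c1}  =  {c2}  ({cosine(e1, e2):.2f})")
```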
Increasing digitization leads to a constantly growing amount of data in a wide variety of application domains. Data analytics, and in particular machine learning, plays a key role in gaining actionable insights from this data in many domains and real-world applications. However, configuring data analytics workflows that include heterogeneous data sources requires significant data science expertise, which hinders the wide adoption of existing data analytics frameworks by non-experts.