David Baehrens
In the era of Smart Data and the explosion of data volumes of all kinds, organizations seek to leverage such data - be it patent information, research literature, social media data, etc. - for competitive advantage and to help achieve their strategic aims. Searching, filtering and categorizing large data sets typically goes far beyond simple keyword search. Semantic technologies paired with machine learning approaches from artificial intelligence are a promising way to support more fine-grained analysis of such data.
The European Patent Office and Averbis recently entered into a collaboration on the pre-classification of incoming patent applications (use case 1) and the re-classification of existing classification schemes (use case 2). In this cooperation, various services are provided with the aim of automatically assigning patent applications to the right departments and automatically allocating new CPC codes to existing patents. The solution is based on complex linguistic and semantic analyses as well as statistically-based machine learning processes. Up to 250,000 incoming patent applications per year are to be classified into up to 1,500 categories. In this talk, we present both use cases together with some technical background on the applied language technologies.
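The abstract describes the technology only at a high level; purely as a minimal sketch of statistically-based text classification into a large category scheme, the following trains a linear classifier over TF-IDF features with scikit-learn. The documents and CPC-like codes are invented, and this is not Averbis' actual pipeline:

```python
# Illustrative sketch only: a linear text classifier of the kind used for
# routing documents into a large category scheme (e.g. CPC classes).
# The training data and codes below are invented for demonstration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: application texts with known classification codes.
documents = [
    "A rotor blade assembly for a wind turbine ...",
    "A pharmaceutical composition comprising an antibody ...",
    "A method for encoding video frames ...",
]
labels = ["F03D", "A61K", "H04N"]  # simplified CPC-like codes

pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
    LogisticRegression(max_iter=1000),
)
pipeline.fit(documents, labels)

# Pre-classification of an incoming application: pick the best category.
print(pipeline.predict(["An airfoil profile for turbine blades ..."])[0])
```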
Rüdiger Schütz
MON, a subsidiary of Müller Medien, has developed a framework of semantic representations comprising web services for a semantic web crawler, which can be used for automatic classification and clustering of web-site content, as well as web services for retrieving keyword clusters based on a system of facets and several thesauri covering the relationships between industry sectors. There are several implementations of the framework - from the automatic enrichment of SEO landing pages, the enhancement of the search experience for white and yellow pages online and improved search facilities for web portals, to tools for sales reps. The presentation will focus on one of the sales tools.
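As a generic illustration of the kind of content clustering and keyword-cluster retrieval mentioned above (not MON's implementation; the page texts are invented), a short TF-IDF/k-means sketch:

```python
# Generic illustration of clustering web-site content into keyword clusters.
# Not MON's framework; the page texts are invented examples.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

pages = [
    "Bakery fresh bread rolls cakes pastry",
    "Car repair garage engine tires brakes",
    "Wedding cakes pastry confectionery",
    "Tire service brakes inspection garage",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(pages)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# The top-weighted terms per cluster serve as a simple keyword cluster.
terms = vectorizer.get_feature_names_out()
for i, center in enumerate(km.cluster_centers_):
    top = center.argsort()[::-1][:3]
    print(f"cluster {i}:", [terms[j] for j in top])
```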
David Kuilman
The content model for the Elsevier Optimized Learning Suite is an extensible framework for the authoring, storage and delivery of content assets that are used for highly interactive, personalized learning experiences. The content standards are based on the W3C XML, HTML5, RDF, RDFa, JSON (JavaScript Object Notation) and JSON-LD (JSON for Linked Data) standards. An API supports the full workflow of content structuring, authoring, learning orchestration and deployment of learning objects into a product framework.
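A minimal sketch of what a JSON-LD description of a learning object could look like, expressed here as a Python dictionary; the vocabulary mapping, identifiers and property names are hypothetical, not Elsevier's actual content model:

```python
# Hypothetical JSON-LD description of a learning object; the vocabulary and
# property names are invented for illustration, not Elsevier's model.
import json

learning_object = {
    "@context": {
        "schema": "http://schema.org/",
        "title": "schema:name",
        "prerequisite": {"@id": "schema:coursePrerequisites", "@type": "@id"},
    },
    "@id": "urn:example:lo:cardio-101",
    "@type": "schema:LearningResource",
    "title": "Basics of Cardiac Physiology",
    "prerequisite": "urn:example:lo:anatomy-100",
}

# JSON-LD is plain JSON, so ordinary tooling can store and transmit it,
# while the @context maps keys onto RDF terms for linked-data consumers.
print(json.dumps(learning_object, indent=2))
```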
Heiner Oberkampf
The software environment currently found in the analytical community consists of a patchwork of incompatible software and proprietary, non-standardized file formats, further complicated by incomplete, inconsistent and potentially inaccurate metadata. To overcome these issues, the Allotrope Foundation is developing a comprehensive and innovative framework consisting of metadata dictionaries, data standards, and class libraries for managing analytical data throughout its life cycle. In this talk we describe how laboratory data and semantic metadata descriptions are brought together to ease the management of the vast amount of data that underpins almost every aspect of drug discovery and development.
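As a rough sketch of the general idea, assuming placeholder vocabulary URIs rather than the actual Allotrope dictionaries, semantic metadata for an analytical run can be expressed as RDF with rdflib:

```python
# Sketch of semantic metadata for an analytical result, expressed as RDF.
# The vocabulary below is a placeholder, not the real Allotrope dictionaries.
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

EX = Namespace("http://example.org/lab#")
g = Graph()
g.bind("ex", EX)

run = EX["run-2016-001"]
g.add((run, RDF.type, EX.ChromatographyRun))
g.add((run, EX.instrument, Literal("HPLC-07")))
g.add((run, EX.columnTemperature, Literal("40.0", datatype=XSD.decimal)))
g.add((run, EX.rawDataFile, Literal("run-2016-001.cdf")))

# Serialize; the same triples can later be queried with SPARQL across
# instruments and file formats, independent of the producing software.
print(g.serialize(format="turtle"))
```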
Florian Bauer
We strongly believe that semantic technologies and open data (i.e. in the form of open thesauri and ontologies) are key factors in connecting information and knowledge in the climate sector. Using open data to increase consistency in language and to allow links between related documents (across languages and platforms) gives those affected by climate change, as well as decision makers in the field of sustainable development, access to accurate and timely knowledge on climate-related issues.
Interactive session: see a demo and experience our Knowledgecafe first-hand!
Simon Dalferth
As a European public administration, the General Secretariat of the EU Council needs to provide information to EU member states efficiently and effectively. This requires a knowledge-based administration. Equally, the data that we make available is addressed to citizens, researchers, journalists and others and should be available in a reusable format. Thus we embarked on a pilot project to make the votes of the EU Council available as a first open data set. This project helps greatly in developing a strongly semantics-based approach to internal and external information provision.
Jan Benedictus
Engaging authors to write semantically rich content is a key success factor in various industries. It is especially relevant for organizations that re-use and re-purpose content and aim for multi-channel, dynamic and personalized publishing. We made it our mission to enable anyone to write semantically rich and structured content. Introducing Semantic Intelligence Tooling into the authoring process is key. We do so by providing authors with real-time feedback on content quality and by proactively suggesting tags, links and consistent language.
Jan Benedictus will show examples of how semantic intelligence is applied to make structured authoring easier. In other words: how authoring becomes efficient and fun.
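A toy sketch of the proactive tag-suggestion mechanism, assuming a hypothetical controlled vocabulary and simple substring matching (the real tooling is considerably more sophisticated):

```python
# Toy illustration of suggesting tags from a controlled vocabulary while an
# author types. The vocabulary and matching logic are invented examples.
CONTROLLED_VOCABULARY = {
    "myocardial infarction": ["heart attack", "MI"],
    "hypertension": ["high blood pressure"],
}

def suggest_tags(text: str) -> list[str]:
    """Return vocabulary terms whose label or a synonym occurs in the text."""
    lowered = text.lower()
    suggestions = []
    for term, synonyms in CONTROLLED_VOCABULARY.items():
        if term in lowered or any(s.lower() in lowered for s in synonyms):
            suggestions.append(term)
    return suggestions

print(suggest_tags("Patients with high blood pressure face elevated MI risk."))
# -> ['myocardial infarction', 'hypertension']
```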
Fabian Heinemann
A common problem in large, complex organizations is that information is distributed over various systems, which makes it hard to create overviews of business-relevant data. Typically, this issue is further complicated by the use of non-standardized strings to describe concepts (e.g. external cooperation partners or diseases are expressed in various ways). In the context of clinical cooperations, we developed an application to solve these issues. First, we integrated data from several relational databases with different table schemas. To become independent of the source systems, we converted the data to simple RDF triples containing the information from the different relational databases. Subsequently, this intermediate RDF was processed in Unified Views, a custom ETL (Extract Transform Load) tool specifically built for RDF ETL tasks. Within Unified Views, spelling variants were normalized and further information such as institutional or disease hierarchies was added. Moreover, the data was converted to a predefined RDF model. Finally, the data was loaded into an RDF data store (Virtuoso) and queried using SPARQL. The queries were wrapped in an intuitive user interface. Due to the semantic enrichment of the data during the ETL conversion, outstanding search and overview features could be provided to the end users.
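In miniature, the described pipeline might look as follows, with rdflib standing in for Unified Views and Virtuoso; the schema, rows and spelling-variant table are invented for illustration:

```python
# Miniature version of the described pipeline: relational rows -> RDF ->
# normalization -> SPARQL. Uses rdflib in place of Unified Views/Virtuoso;
# the schema, data and spelling-variant table are invented.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/clinical#")

# Step 1: rows as they might come from two source databases.
rows = [
    {"partner": "Univ. Hospital Freiburg", "disease": "NSCLC"},
    {"partner": "University Hospital Freiburg",
     "disease": "non-small cell lung cancer"},
]

# Step 2: normalize spelling variants to preferred labels during ETL.
NORMALIZE = {
    "Univ. Hospital Freiburg": "University Hospital Freiburg",
    "NSCLC": "non-small cell lung cancer",
}

g = Graph()
for i, row in enumerate(rows):
    coop = EX[f"cooperation/{i}"]
    g.add((coop, RDF.type, EX.Cooperation))
    g.add((coop, EX.partner,
           Literal(NORMALIZE.get(row["partner"], row["partner"]))))
    g.add((coop, EX.disease,
           Literal(NORMALIZE.get(row["disease"], row["disease"]))))

# Step 3: after normalization, one SPARQL query spans both source systems.
query = """
    PREFIX ex: <http://example.org/clinical#>
    SELECT (COUNT(?c) AS ?n) WHERE {
        ?c ex:disease "non-small cell lung cancer" .
    }
"""
for result in g.query(query):
    print("cooperations on NSCLC:", result.n)  # -> 2
```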
Eelco Kruizinga
The United Nations' Climate Technology Centre & Network (CTCN) provides technical assistance in response to requests submitted by developing countries via their nationally selected focal points, or National Designated Entities (NDEs). Upon receipt of such technical assistance requests, the CTCN needs to quickly mobilize its global network of climate technology experts to design and deliver a customized response tailored to local needs. The volume of requests as well as the number of experts and technical assistance responses ('solutions') is growing considerably, and it was therefore decided to explore if and how an automated process can support the filtering of technical assistance requests ('requirements') and the identification of experts and resources to fulfil them (a "Matchmaking Assistant"). Ideally, such an automated process would offer NDEs the opportunity to query the CTCN's knowledge management system using problem-oriented language to find expertise, case studies/good practice stories, relevant documents and signposts to other relevant knowledge sources. This would allow the CTCN to significantly scale up its technical assistance work to developing countries.
A demonstrator matchmaking assistant was developed to explore how existing tools such as the REEEP Climate Tagger (which identifies the most relevant concepts in unstructured text and is based on an expert-developed climate thesaurus) and the underlying PoolParty Semantic Suite technology can provide a solution to the above-mentioned challenge.
This talk outlines the matchmaking scenario supported by the demonstrator, its technical set-up, user feedback and lessons learned.
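A simplified sketch of the matchmaking idea: tag both requests and expert profiles with thesaurus concepts (in the demonstrator this is done by the Climate Tagger) and rank experts by concept overlap. The concepts, profiles and Jaccard scoring below are invented for illustration:

```python
# Simplified matchmaking illustration: rank experts by the overlap between
# the concepts tagged on a request and those on expert profiles. Concepts
# and profiles are invented; the demonstrator uses Climate Tagger/PoolParty.
def rank_experts(request_concepts: set[str],
                 experts: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Score each expert by Jaccard similarity of concept sets."""
    scored = []
    for name, concepts in experts.items():
        union = request_concepts | concepts
        score = len(request_concepts & concepts) / len(union) if union else 0.0
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)

request = {"solar energy", "rural electrification", "mini-grids"}
experts = {
    "Expert A": {"solar energy", "mini-grids", "energy storage"},
    "Expert B": {"coastal adaptation", "flood risk"},
}
print(rank_experts(request, experts))
# Expert A ranks first: 2 shared concepts out of 4 in the union (0.5).
```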
Aleksandar Kapisoda
Publications in peer-reviewed scientific journals are seen as a performance indicator reflecting the productivity of a research-based pharmaceutical company focused on innovation and new therapeutic concepts for unmet medical needs. A new corporate publication tracking system serves Boehringer Ingelheim employees and the Research leadership team as an important benchmarking and tracking tool for “Boehringer Ingelheim Papers” published in the global scientific community. Our semantic approach combines the results of automated literature database alerts with manually curated and enriched data for tracking, storing and visually analysing published articles in peer-reviewed scientific journals, based on semantic analysis of publications by Boehringer Ingelheim authors.
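As a hedged sketch of one step in such a workflow, assuming invented record fields and a deliberately simple affiliation-matching rule, alert results could be filtered for company papers before manual curation:

```python
# Invented sketch of one step in a publication-tracking workflow: keep alert
# records whose author affiliation matches the company, for later curation.
COMPANY_PATTERNS = ("boehringer ingelheim", "boehringer-ingelheim")

def is_company_paper(record: dict) -> bool:
    """True if any author affiliation mentions the company."""
    return any(
        pattern in affiliation.lower()
        for affiliation in record.get("affiliations", [])
        for pattern in COMPANY_PATTERNS
    )

alerts = [
    {"title": "Novel kinase inhibitor ...",
     "affiliations": ["Boehringer Ingelheim Pharma GmbH & Co. KG"]},
    {"title": "Unrelated epidemiology study",
     "affiliations": ["Some University"]},
]
tracked = [r for r in alerts if is_company_paper(r)]
print([r["title"] for r in tracked])  # -> ['Novel kinase inhibitor ...']
```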