Thesauri have long been a key element in uniform search and annotation across large collections, such as the media collections preserved at Sound & Vision and VIAA. This talk will outline the work that has been done in a joint project between Sound & Vision and VIAA.
Elisa Sabot, Joe Pairman
Talend is an innovative vendor of big data integration software. Its documentation was published on the web, but in a linear, book-type format. Customers often used Google to look for support, and when they clicked through to the documentation, the information and links on the page were often insufficient to answer their question.
Across industry sectors, understanding, assessing and reporting on regulatory compliance is both a priority and a challenge for many organizations. Data-related regulations, such as HIPAA, Basel and GDPR, require an understanding of the sources, flows and destinations of data.
Franz Inc. is collaborating with the Montefiore Health System, Intel, Cloudera, and Cisco to deploy a cognitive computing platform for healthcare. The platform is used in the medical domain for personalized medicine, translational research, predictive modeling, real-time decision support and, most importantly, computing the true cost of care.
I will present the fraud detection system, based on the Cognitum Platform, that we have implemented for a state in Brazil. The system is a rule-based system that analyzes a stream of bills in real time. It is currently helping the fraud analysts of the ministry of finance to recognize VAT fraud more quickly and effectively.
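A rule-based check over a stream of bills can be sketched as follows; the invoice fields, the two rules and the tolerance threshold are illustrative assumptions, not the actual rules deployed on the Cognitum Platform:

```python
from dataclasses import dataclass

# Hypothetical invoice record; field names are illustrative, not the real schema.
@dataclass
class Invoice:
    invoice_id: str
    net_amount: float
    vat_amount: float
    vat_rate: float  # e.g. 0.17 for a 17% rate

def vat_mismatch(inv: Invoice, tolerance: float = 0.01) -> bool:
    """Rule: flag invoices whose declared VAT deviates from rate * net."""
    return abs(inv.vat_amount - inv.net_amount * inv.vat_rate) > tolerance

def check_stream(invoices):
    """Apply the rules to each bill as it arrives; yield flagged invoice IDs."""
    seen = set()
    for inv in invoices:
        # Rule 1: a repeated invoice number is suspicious.
        if inv.invoice_id in seen:
            yield (inv.invoice_id, "duplicate")
        seen.add(inv.invoice_id)
        # Rule 2: arithmetic mismatch between net amount, rate and VAT.
        if vat_mismatch(inv):
            yield (inv.invoice_id, "vat_mismatch")

stream = [
    Invoice("A1", 100.0, 17.0, 0.17),  # consistent
    Invoice("A2", 200.0, 10.0, 0.17),  # VAT too low -> flagged
    Invoice("A1", 100.0, 17.0, 0.17),  # duplicate id -> flagged
]
flags = list(check_stream(stream))
print(flags)  # [('A2', 'vat_mismatch'), ('A1', 'duplicate')]
```

Because the rules are stateless apart from the `seen` set, they can be evaluated per bill as the stream arrives, which is what makes real-time flagging feasible.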
Atanas Kiryakov, Plamen Tarkalanov, Vladimir Alexiev
The euBusinessGraph project integrates European company and economic data from various data providers, including OpenCorporates (the largest open database of company info crawled from official registers), Norway's Bronnoysund Register Center (official register data), SpazioDati (rich IT data from official registers, additional databases, web crawl of company sites, tender info, etc), EventRegistry events, GLEI, Panama Leaks, etc.
The Dutch cadastre has taken the lead by developing the geospatial data platform of the future and releasing it as a beta in July 2016, on data.pdok.nl. This platform offers a semi-automated transformation from geospatial data (such as WFS and GML) to Linked Data (RDF) and, on top of that, APIs, including a SPARQL endpoint and a view, test and documentation environment.
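As a rough illustration of the kind of feature-to-RDF mapping such a platform automates, the sketch below turns one WFS/GML-style feature into N-Triples using the GeoSPARQL vocabulary; the base URI, field names and predicates are assumptions for illustration, not PDOK's actual data model:

```python
# GeoSPARQL and RDFS namespaces (these vocabularies are standard).
GEO = "http://www.opengis.net/ont/geosparql#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

def feature_to_ntriples(feature: dict, base: str = "http://data.pdok.nl/id/") -> list:
    """Turn one GML/WFS-style feature dict into N-Triples lines."""
    subj = f"<{base}{feature['id']}>"
    return [
        f'{subj} <{RDFS}label> "{feature["name"]}" .',
        # GeoSPARQL convention: attach the geometry as a WKT literal.
        f'{subj} <{GEO}asWKT> "{feature["wkt"]}"^^<{GEO}wktLiteral> .',
    ]

# Hypothetical input feature, as it might come out of a WFS response.
feature = {"id": "gemeente/0363", "name": "Amsterdam",
           "wkt": "POINT(4.9041 52.3676)"}
lines = feature_to_ntriples(feature)
print("\n".join(lines))
```

Once features are expressed as triples like these, they can be loaded into a triple store and queried through the SPARQL endpoint the abstract mentions.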
One of the key features of a successful online employment marketplace is the ability to match people with the most relevant job opportunities. Our business uses data about candidates, jobs and hirers to perform this task. One valuable data point in this process is the job title, which we discover in semi-structured form in a candidate’s employment history and in a hirer’s job advertisement.
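A minimal sketch of normalising free-text job titles to a canonical form, assuming a toy canonical list and modifier set (the abstract does not describe the marketplace's actual pipeline):

```python
import re

# Toy canonical title list; a production system would use a curated taxonomy.
CANONICAL = {"software engineer", "data scientist", "project manager"}

# Tokens that modify seniority but not the underlying role (illustrative).
MODIFIERS = {"senior", "junior", "lead", "principal", "sr", "jr"}

def normalise_title(raw: str) -> str:
    """Reduce a free-text job title to a candidate canonical form."""
    tokens = re.findall(r"[a-z]+", raw.lower())
    core = [t for t in tokens if t not in MODIFIERS]
    return " ".join(core)

def match_title(raw: str):
    """Return the canonical title if the normalised form is known, else None."""
    cand = normalise_title(raw)
    return cand if cand in CANONICAL else None

print(match_title("Senior Software Engineer"))  # software engineer
print(match_title("Sr. Data Scientist"))        # data scientist
print(match_title("Underwater Basket Weaver"))  # None
```

Mapping both the candidate's history and the hirer's advertisement to the same canonical title is what makes the two sides comparable for matching.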
Many organisations are adopting a form of Structured Content Authoring (SCA). They do so to increase content consistency, discoverability and the semantic quality of content in a multi-channel world.
Mary-Ann Grosset, Thierry Vebr, Jan-Anno Schuur
The talk and demonstration will highlight the development, at the Organisation for Economic Co-operation and Development (OECD), of “O.N.E Sight”, a fully semantic reading assistant that unleashes the power of triples and is the result of three years of capacity building, development and cross-functional teamwork.
We will outline the project approach, the learning curve the team went through, and the intellectual and technical challenges faced as we addressed issues linked to new ways of handling information: silos, traditional text indexation, lack of text fragmentation and semantic links, reconciliation of semantic and textual searches, representation issues and more.
We will describe the long march towards semantic annotation and the emphasis placed on the quality of the tagging. This will include: i) development, maintenance and use of the OECD central Taxonomies and Ontologies in the semantic analysis tools, ii) hazards of semantics (fuzziness, context, acronyms and disambiguation), iii) creation of a golden corpus, annotation quality testing and multi-view annotation graphs, and iv) development of tools to identify ‘knowledge nuggets’, such as socio-economic indicators, by tagging semantic relationships within texts. We will also describe the methodology used to develop these quality tagging applications, which consistently return high precision and recall (around 95%), ensuring results reliable enough for the tags to be used in a production environment.
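Precision and recall of the kind reported above are computed by comparing the tags an annotator produces against a golden corpus; the sketch below shows the standard calculation, with invented documents and tags for illustration:

```python
def precision_recall(predicted: set, gold: set):
    """Precision and recall of predicted tags against a golden corpus."""
    tp = len(predicted & gold)  # true positives: tags present in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical (document, tag) pairs; not real OECD annotations.
gold = {("doc1", "GDP"), ("doc1", "inflation"), ("doc2", "unemployment")}
pred = {("doc1", "GDP"), ("doc1", "inflation"), ("doc2", "taxation")}

p, r = precision_recall(pred, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.67
```

Tracking both numbers matters: a tagger can trivially reach high precision by tagging almost nothing, or high recall by tagging everything, so a production threshold like the 95% cited above has to hold for both.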