Chaired by: Hidir Aras (FIZ), Linda Andersson (TU Wien) and Allan Hanbury (TU Wien)


The PatentSemTech2019 workshop aims to establish long-term collaboration and a two-way communication channel between the IP industry and academia from relevant fields such as natural language processing (NLP), text and data mining (TDM) and semantic technologies (ST). The goal is to explore and transfer new knowledge, methods and technologies for the benefit of industrial applications, and to support research in applied sciences for the IP and neighbouring domains.

We invite contributions that show relevant use cases for patent text mining and analytics as well as new means for bootstrapping training data generation, e.g. for labelling domain-specific datasets from the IP domain.

PatentSemTech2019 is a full-day workshop held in conjunction with SEMANTiCS 2019, consisting of paper presentations, industrial talks, posters and demos, and an annotation session for experts.

Call for Submissions

Patent text mining includes research on coordinating multiple diverse information sources related to patents, for example data sources for patent law, regulations, patents, court litigation, scientific publications, etc.

Main Topics

We encourage submissions of high-quality research papers on all topics in the areas listed below.

Topics of interest include (but are not limited to):

  • Text mining and text retrieval with scientific-technical information, e.g. patents, legal data, bio-medical information, etc.
  • Terminology detection
  • Entity extraction
  • Machine learning methods applied to scientific-technical information for creating added value, e.g. embeddings for query expansion, terminology extraction, etc.
    • Patent classification
    • IPC/CPC class prediction
  • Applications and methods for linking semantic information to patent data from external knowledge sources
  • Methods and applications for mining and analysing large amounts of scientific-technical information (big data analytics)
  • Methods for technology analysis with patent information, e.g. patent landscaping, hotspot analysis, technology trend analysis, etc.
  • Semantic enrichment of patent text
  • Visual user interface concepts for exploring patent data and patent retrieval results

A few benchmark datasets and test collections are listed on the Data Resources page.


Keynote speaker

Mr Anthony Trippe is the Managing Director of Patinformatics LLC. Patinformatics is an advisory firm specializing in patent analytics and landscaping to support decision making for technology-based businesses. In addition to operating Patinformatics, Mr Trippe is an Adjunct Professor of IP Management and Markets at Illinois Institute of Technology, where he teaches a course on patent analysis and landscapes for strategic decision making. Mr Trippe has written or contributed to IP-related articles that have appeared in the Wall Street Journal, Forbes, The Washington Post and more than a dozen additional sources.


Dr Hidir Aras is a research assistant and project manager for text and data mining at FIZ Karlsruhe. His applied research interests include big data analytics, text and data mining, and semantic analysis of patent information. Hidir Aras joined FIZ Karlsruhe in 2012 and was previously a research associate and PhD student at the University of Bremen, where he received his PhD on "Semantic Interaction in Web-based Retrieval Systems". Before that, after completing his studies in business informatics at the University of Mannheim, he worked for several years at the European Media Laboratory GmbH in Heidelberg on various research projects related to geographical information systems, intelligent mobile assistance and the Semantic Web.

Dr Lei Zhang is currently a senior researcher at FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, Germany. Prior to that, he worked as a research associate at the Institute AIFB, Karlsruhe Institute of Technology (KIT), where he received his PhD with the highest distinction, "summa cum laude". His interdisciplinary work has been published in more than 20 top-level journals and conference proceedings and earned the Best Student Research Paper Award at the 15th International Semantic Web Conference (ISWC) in Kobe, Japan. In addition, he was involved in several EU research projects and led the development of a framework for cross-lingual and cross-media semantic annotation and search, which received the 2nd prize in the Semantic Web Challenge 2015 in Bethlehem, USA.

Dr Allan Hanbury is Professor for Data Intelligence at the TU Wien, Austria, and Faculty Member of the Complexity Science Hub, where he leads research and innovation to make sense of unstructured data. He is initiator of the Austrian ICT Lighthouse Project, Data Market Austria, which is creating a Data-Services Ecosystem in Austria. He was scientific coordinator of the EU-funded Khresmoi Integrated Project on medical and health information search and analysis, and is co-founder of contextflow, the spin-off company commercialising the radiology image search technology developed in the Khresmoi project. He also coordinated the EU-funded VISCERAL project on evaluation of algorithms on big data, and the EU-funded KConnect project on technology for analysing medical text. Most recently, he was Short Paper Co-Chair of the European Conference on Information Retrieval (ECIR) 2018; Programme Committee Member of the European Big Data Value Forum 2017, Versailles, France; Co-Organiser of the 3rd KEYSTONE COST Action Training School on Keyword Search in Big Linked Data, Vienna, 21-25 August 2017; Special Session Chair of the ACM International Conference on Multimedia Retrieval (ICMR) 2017; and Programme Committee Co-Chair for the European Conference on Information Retrieval (ECIR) 2015.

Ms Linda Andersson has conducted text mining research in close connection with the IP industry for the last 15 years, working on different aspects of text mining. In 2009, Ms Andersson finalized her Master's thesis, “A Vector Space Analysis of Swedish Patent Claims, Does Decompounding Help?”, which was based on a collaboration with the Swedish Patent and Registration Office. For her PhD thesis, “The Essence of Patent Text Mining,” she continued working closely with the text mining industry. Part of Ms Andersson’s work and research is developing real-world patent text mining applications using Natural Language Processing techniques. In her PhD research, Ms Andersson established a generic method for Natural Language Annotation Design for domain-specific text mining solutions for medical, legal and technical text. In 2018 she launched the product idea ‘Artificial Researcher in Science’, which received the Commercial Viability Award from the Austrian Angel Investors Association. Ms Andersson is the founder of the start-up Artificial Researcher-IT GmbH.

Dr Florina Piroi is a senior researcher at the TU Wien, IFS group, with experience in domain-specific search, search engine evaluation and running evaluation campaigns. She has been coordinating the CLEF-IP evaluation campaign and organising workshops where specific Information Retrieval methods for the Intellectual Property domain have been assessed. Dr Piroi was also on the Organisation Committee of the European Conference on Information Retrieval (ECIR) 2015. She has worked on Industry 4.0 projects that involved machine learning methods and algorithms. Within the ADmIRE project, she applied information extraction techniques and natural language processing tools to the texts of research articles. Since 2017 she has been coordinating the Innovation Training Course "Data Science and Deep Learning" at TU Wien. She received her PhD degree from the Johannes Kepler University, Austria, where her work concentrated on management and retrieval of mathematical knowledge and automatic theorem provers.

Dr Mihai Lupu is Studio Director of the Research Studio Data Science of the Research Studios Austria Forschungsgesellschaft. He is the Coordinator of the Lighthouse project of the Federal Ministry of Infrastructure, Innovation, and Technology, as well as Scientific Coordinator of the recently granted H2020 Safe-DEED Research and Innovation Action. Dr Lupu has over 10 years of experience in Search Technologies, Artificial Intelligence and Machine Learning, with over 100 publications in these fields. He is also Associate Editor of the World Patent Information Journal. Previously, Dr Lupu has been Area Chair of the Intl. Conference of Information and Knowledge Management (CIKM), programme chair of the Conference and Labs of the Evaluation Forum (CLEF) and the Information Retrieval Facility Conference (IRFC), and organiser of the Patent Information Retrieval (PaIR) series of workshops, as well as the Intl. Keystone Training School.

Thursday, September 12, 2019 - 09:00 to 17:00