August 30, 2018

Premium Sponsor MarkLogic recently released the Operational Data Hub (ODH), a revolutionary architecture pattern that enables organizations to ingest, curate, and access their data from across organizational silos. MarkLogic’s technical delivery architect (or “head consulting nerd,” as she calls herself on LinkedIn) took the time to fit us into her busy schedule. In this interview, Jen talks about MarkLogic, the constraints large organizations and enterprises face when integrating data from silos into their warehouses, and ODH, the remedy recently introduced by our premium sponsor.

What are the focal areas of MarkLogic’s business activities?

MarkLogic is the world’s best database for integrating data from silos. Organizations around the world rely on MarkLogic—an operational and transactional Enterprise NoSQL database platform—to integrate their most critical data and build innovative applications on a 360-degree view.

Over the years, MarkLogic has worked on some of the toughest data integration challenges faced by large enterprises in government, media, financial services, insurance, and manufacturing, among others. What we have found consistently is that organizations are held back by legacy technologies and approaches, such as inflexible relational databases and time-consuming ETL. These technologies and approaches not only eat up a large proportion of the organization’s resources, but they have also contributed to the accrual of significant amounts of technical debt. The traditional ways of doing enterprise data management are also impeding innovation, because they separate the data management activities needed to observe the business (such as data warehouses) from those that run the business. At MarkLogic, we focus on integrating data to meet both of these needs, via an architectural pattern we call the Operational Data Hub, which is powered by the MarkLogic Enterprise NoSQL database.

Over the past year you developed ODH, the Operational Data Hub. You are integrating different kinds of domain-related sources into one graph. Can you elaborate further on ODH?

The ODH is the missing piece of the data integration puzzle – and one which remediates a lot of that technical debt that organizations have accrued over the years. It’s an architecture pattern that has evolved based on our experiences with customers around the world – and is now supported by productized components and features like the MarkLogic Data Hub Framework and Smart Mastering, which make it even easier for customers to deploy a MarkLogic-based ODH to ingest, curate, and access their data from across organizational silos.

The ODH pattern was only able to emerge due to a technological shift in databases, specifically NoSQL and multi-model databases. Relational databases require extensive up-front modelling and conformance to rigid schemas. Every data item must have its place pre-allocated in the data model, or the data cannot be loaded. This makes designing a data model to support complex integration too expensive for many use cases. A more flexible schema, provided by modern multi-model databases like MarkLogic, enables data to be loaded with much lower schema-design costs while allowing for much broader data coverage.
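To make the contrast concrete, here is a minimal sketch in Python. It is not MarkLogic code: the standard-library `sqlite3` module stands in for a relational database with a fixed schema, and a plain dictionary stands in for a schema-flexible document store. The record fields (`id`, `name`, `loyalty_tier`) and both function names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Two hypothetical customer records from different silos. The second
# carries a field ("loyalty_tier") that the first system never modeled.
records = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace", "loyalty_tier": "gold"},
]


def load_relational(rows):
    """Load rows into a table whose columns are declared up front.

    Returns (loaded_ids, rejected_ids). A row with a field that has no
    pre-allocated column cannot be inserted and is rejected.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    loaded, rejected = [], []
    for row in rows:
        columns = ", ".join(row)
        placeholders = ", ".join("?" for _ in row)
        try:
            conn.execute(
                f"INSERT INTO customer ({columns}) VALUES ({placeholders})",
                list(row.values()),
            )
            loaded.append(row["id"])
        except sqlite3.OperationalError:
            # Raised because the table has no column for the extra field.
            rejected.append(row["id"])
    conn.close()
    return loaded, rejected


def load_documents(rows):
    """Load rows as self-describing documents keyed by id.

    No schema is declared, so every field is kept as it arrives.
    """
    return {row["id"]: dict(row) for row in rows}


if __name__ == "__main__":
    print(load_relational(records))
    print(load_documents(records))
```

In the relational stand-in, the second record is rejected until someone alters the table, which is the up-front modelling cost described above. In the document stand-in, both records load as they are, and the extra field can be curated later.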

The MarkLogic database has all the features needed to support the ODH pattern, including a flexible data model, sophisticated indexing, the ability to represent complex and evolving semantic relationships within and across data items, the ability to store data and metadata together to support robust data governance, the elasticity to scale to massive enterprise-wide data volumes, and robust security and encryption. And we at MarkLogic have a track record of deploying this architectural pattern at some of the largest institutions in the world.

I understand that ODH addresses demands your customers raised. Can you sketch such a use case to help us better understand the benefits of ODH?

The reality is that for any organization today, data is locked up in silos. A typical company might work exclusively with a mega vendor like SAP for their ERP systems. That is great. But then they acquire ten other companies, half of which do not use the same ERP system. Now, all their human resources, financial, or customer data is split across multiple systems. Other times, silos developed intentionally. For example, in financial services, banks historically separated out the investment, research, and retail arms. Now, however, regulators require banks to integrate data from those different lines of business. Regulations such as Know Your Customer require banks to have an integrated, comprehensive view of their customers.

Regardless of how they came to be, the consequence of data silos is that they prevent real-time analysis and decision-making. It used to be okay to piece together data from different sources, look at it, massage it, and find interesting facts. But that is too slow. Today, organizations need to be able to leverage data in real time as well as through traditional approaches such as data warehouses.

We describe a number of industry-specific use cases in our free eBook, “Introducing the Operational Data Hub,” which can be downloaded at

What is your business perspective on this new product? Or should we rather call it service?

The ODH itself is an architecture pattern that arose out of real-life data integration scenarios and challenges that we at MarkLogic have encountered at customer sites around the world. As part of our experience solving these data integration challenges, we’ve also developed and productized a number of tools to support critical data integration functionality, like our Smart Mastering feature, as well as a Data Hub Framework™, which makes it easy for organizations to build and deploy an ODH to meet their needs for data ingestion, curation, and data access. As for where we go from here, that continues to be driven by customer demand. Not surprisingly, this will cover things such as expansion of our cloud-centric offerings as they relate to our core capabilities. For instance, our MarkLogic query service offering that was released earlier this year is based on customers’ feedback around flexible bursting in the cloud. For the most part, that deals with scale and what are considered “non-functional” capabilities. However, with the ODH pattern and the related frameworks we’ve built as a result, we continue to get a lot of inbound requests related to the functional capabilities, all based on what we’ve delivered into production. In that respect, we see a very high ceiling for where we (and our customers) can take the pattern.

You have a workshop and a talk at the SEMANTiCS conference. What will be the topic of your talk, and what can we expect from the workshop?

Data integration is time-consuming and difficult, right? What if you could take unstructured data, enrich it, and combine it with structured and semantic data to build a working semantic search and discovery hub in under 3 hours? In this hands-on workshop, participants will build a semantic data hub combining structured, unstructured, and semantic data in a single application using PoolParty and the MarkLogic database. At the end of the workshop, participants will have a working semantic data hub and a good introduction to working with both technologies.

Last but not least, please describe MarkLogic’s territory in the overall development world of Semantic AI, as it is sketched at the conference.

No matter what new technology is introduced, be it cloud, blockchain, AI, etc., it’s not going to be useful unless it’s working with integrated, accurate, governed data. We see ourselves as the platform to provide that data.


