Analyst Chat

Analyst Chat #113: Data Catalogs and Metadata Management

Data catalogs and metadata management solutions help capture and manage data from all enterprise data sources to enable the use of that data and support data governance and data security initiatives. This interesting and growing market segment is the topic this week when Martin Kuppinger and Matthias sit down for the Analyst Chat podcast.

Welcome to the KuppingerCole Analyst Chat. I'm your host. My name is Matthias Reinwarth and I'm Lead Advisor and Senior Analyst with KuppingerCole Analysts. My guest today is Martin Kuppinger, he is Principal Analyst with KuppingerCole, and he's one of the founders of KuppingerCole Analysts. Hi, Martin. Good to see you.
Hi Matthias, pleasure to talk to you again.
Great to have you. And it's quite some tradition already. We have started this year with a series of podcast episodes covering just recently published Leadership Compass documents. And we've talked with the individual authors, the analysts that did the research in that area. And we will continue that tradition. And we want to look at the Leadership Compass that you just recently published. It's the Leadership Compass Data Catalogs and Metadata Management. First of all, what is that market segment? What are we talking about? What are the types of products that we're looking at?
Okay. I think that's a fair question. And as the title says, it looks at two different aspects which are closely related. There are other related aspects, but in the end we focused on these two, which are data catalogs and metadata management. Data catalogs are solutions that provide you with a catalog that shows you where which type of data resides. I think very few can also hold the data itself, but in virtually all implementations the solutions don't concentrate the data into a single store, only the metadata. So which type of data is held where? Is it PII? And a lot of other information. These data catalogs then help to understand where data sits and help people to find the data that they need for their business, either in using the data or in implementing data governance, data privacy and other types of solutions.
So it supports organizations in understanding where their intellectual property is in the form of data, where they can find it, and where they can protect it. And as you mentioned, it's business enabling, better understanding where things are, and it's cybersecurity plus governance, protecting the data where it is at an adequate level, right?
Yeah. And I think the challenge is very simple: most organizations have way more data than they think they have, and it's hard to find the data. So if you want to find data, you might assume, okay, we have a certain type of data here. It must be somewhere. But where is it? And then on the other side, as you mentioned, there's this governance and security aspect. And we've seen quite a strong uptake over the last few years in the context of regulations such as GDPR and CCPA, so the privacy regulations. Because under these regulations you are obliged to know where the data is, the personal data, in that case the PII, and to protect it. And that is something which adds to that. And on the other hand, when we look at our advisory and what we do at KuppingerCole, we get these inquiries: Can you support us in better protecting the data, better securing the data? We know that data is flowing from A to B to C, from a source to a data lake to an analytics program, to certain types of reports or other data sources. How do you get a grip on that without knowing where which data is? It is hard to do. By the way, that is also one reason why the vast majority of the solutions in the market not only support the catalog as a feature, but several of them add data governance as an integrated capability or a side product. And most of these solutions have something which is called data lineage. So data lineage is about the inheritance. Where does data come from? How does it flow? And this is, I think, a super important thing because that is what is very frequently lacking. Even if we can figure out, oh, this is a data source, we usually don't know: is it the original one, or is it just something that was derived? And we need something that helps us to do that better. This is where data catalogs and metadata management come in. And due to this link to privacy, to data governance, to security, it is a topic that I felt is super important for us, as the analysts for identity, privacy and cybersecurity, to cover.
I would fully agree, and you've mentioned the drivers being on the one hand the regulations and on the other hand organizations trying to better understand where their data is. Is this also reflected within the products? Have you seen that change in the development of the tools, of the services, of the products that you analyzed? Is there a change since the last time you looked at that market?
It's the first time we looked at that market. But clearly, we are observing change. And I think that there are a couple of major trends in these products. One has been adding the capability for data lineage, because just knowing data is here or there is one thing, knowing how it flows is the other. Having solutions for data governance and data privacy in place, or cooperating with specialists, is another one. We also see at least one major vendor, which is OneTrust, which actually entered this market segment coming from the privacy management side. So most vendors come from data management, from data analytics and from related areas. But at least one vendor really came in from the privacy end of this. The other thing is that a lot of vendors have really strong capabilities for, on one hand, search and discovery, so how can a user identify or find the data he or she needs? And the other is curation. So curation means: How can people rate data? How can people collaborate on data? And how can people find the expert for a certain type of data, for certain information? All these features that are really targeted at making better use of data are very common elements. That is also what we looked at in this Leadership Compass: how good are those technologies in search and discovery, in connecting to data sources, in data lineage, integration, collaboration features, etc. So these are the different directions the vendors take, and there are different histories they have. But the tendency is to look at both: the governance and security enablement, where other solutions then commonly add to or build on the information and knowledge you have about data, and using that knowledge for being better in that data economy, or however you'd like to name it.
Perfect. So we now understand what the market segment looks like, what the individual components are that are included, what are the capabilities? A Leadership Compass aims at providing the proper information for people deciding about solutions, about architectures to identify the right solution for them. Nevertheless, we also apply some first rating and identify leaders. You've mentioned one vendor already with OneTrust. What other vendors are there that would be relevant in that market that you would consider to be worthwhile mentioning as a summary to get a better picture for it? Apart from understanding what the software looks like, what the market looks like, what are the vendors?
Yeah. So I would actually say, more or less all vendors we've covered in the rating are relevant. We have some which have a somewhat different approach, they take a different angle on that. One of the big cloud service providers that we already have in the rating, while others are catching up, is AWS. AWS is coming in with a solution that is really more targeted at not only doing a data catalog, but also extracting, transforming and loading data into new sources with respect to the typical use cases on that platform. So they are not the data catalog for every use case, but for certain use cases, they are clearly the best one. But then we have the usual leaders, like Informatica, Collibra, Alation, OvalEdge and Precisely, some of them having been in this market of data management, in a broader sense, for a really long time. And so you can expect to find the common, the typical players. And when you go into the Vendors to Watch list, there are quite a number of additional players which are entering the market or which have metadata management and data catalogs more as a byproduct of other solutions. It's a fast emerging market, so we see more and more of these vendors entering it. And I believe it's just because we need something which helps us to deal better with the vast amount of data that exists in every organization.

Right, thanks. And for those of our audience who are interested in that market overview, in that rating, and also in the criteria that you applied in identifying the right products for that market segment, I would highly recommend that you go and find the Leadership Compass about that area. It is easily accessible through the initial subscription, where for 30 days, I assume, you can get access to these documents. The subscription is also really affordable, and it's a really powerful tool for understanding what this market segment, and many others, look like. So I would highly recommend that. Any final addition from your side, Martin, before we close down for today?
No, I think it's fine. Look at that Leadership Compass and look at the field, because we need to get better in understanding which data we have in order to make use of that data.
Right. So that's it for today. Thank you very much, Martin, for joining me today and looking forward to having you in a further episode.
Yeah. Thank you, Matthias.
Thank you. Bye bye.
