Welcome to the KuppingerCole Analyst Chat. I'm your host. My name is Matthias Reinwarth, I'm an analyst and advisor at KuppingerCole Analysts. And my guest today is Martin Kuppinger. He is principal analyst and one of the founders of KuppingerCole. Hi Martin.
Hi Matthias. Welcome and thank you for inviting me to this conversation.
Great to have you. And we have a really interesting topic today, and it is not only interesting, but it's also long when it comes to the actual title of this episode. So I'll try to read it out first. We will be talking about data management and data lineage as the foundation for big data governance, big data security and data security. That is our starting point. And we as identity and access management people have been talking about controlling access for 20 years or longer. What is so special about security and governance when we're talking about big data, Martin?
So it's interesting, because I would say this is one of the most widely ignored topics in the entire risk space. A lot of organizations have challenges, but few talk about it, and few really try to solve this challenge. But what makes it special? When we look at the entire data challenge, it starts with the fact that we have data in a variety of sources. This can be databases. This can be files. It can be whatever; data can be created on the fly by programs, et cetera. That data today frequently ends up in some sort of data lake, and then it ends up in analytics. And then there is some data extracted. And during this process, data is combined. It is analyzed in whichever way. There might be combinations of data which lead to far more sensitive data than the original data. And it's hard to keep track across the entire flow of the data, across the journey of the data.
And so implementing consistent data security across all layers, from the files and databases to the analytics and to the extracts, to the results, is extremely difficult. I would say it's close to impossible to do it today, more or less. I think there is some silver lining on the horizon, but there is still a way to go. The other element which adds to that is that we need to get a better grip on data. Most of what we do in identity management and access management today is focused on whether someone is allowed to access a system or not. Someone has entitlements in a system or an application, so he's allowed, or she's allowed, to perform certain actions or not. But at the end, what we want to protect is not the application or the system or the network. It is the data, it's our valuable corporate information. This is what it's about at the end of the day. Having said this, we need strong governance and we need strong security on data across all types of data. And so that is why we need to look at this topic.
Yeah, I think that's really important, because when you take existing data out of a source system and put it into, as you've mentioned, a data lake as a basis for executing all these new shiny approaches around "data is the new gold", where there is so much more information in there, we are actually stripping off the access control, because we need to have this information available. So controlling this access in such a platform, and making sure that the data is still only used in a way that is consistent with the original access control, that is a challenge that many organizations are currently facing. And, as you said, one they are currently not really managing to meet.
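As an illustration of the point about keeping use of the data consistent with the original access control, here is a minimal Python sketch. It assumes, purely for illustration, that each source system has a set of roles allowed to read it and that a combined data set in the lake should only be readable by roles allowed on every one of its sources. The source names, role names and the `allowed_roles` function are hypothetical and not taken from the conversation or from any product.

```python
# Hypothetical source systems and the roles allowed to read each of them.
SOURCE_ACL = {
    "hr_db.salaries": {"hr_manager", "auditor"},
    "crm.customers": {"sales", "auditor", "hr_manager"},
}


def allowed_roles(sources):
    """Return the roles that may read a data set combined from `sources`.

    Assumption: a role must be allowed on every source, so the result is
    the intersection of the source access lists. No sources means no access.
    """
    acls = [SOURCE_ACL[s] for s in sources]
    return set.intersection(*acls) if acls else set()


# A data set in the lake that combines both sources.
combined = allowed_roles(["hr_db.salaries", "crm.customers"])
print(sorted(combined))
```

Intersection is the most restrictive choice. As noted earlier in the conversation, combined data can be more sensitive than any single source, so a real policy might need to be stricter still.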
Yeah. And so this is factually the reason why we need to do things differently. And it also adds a level of complexity. When you look at data lakes, and then, following the data lakes, at the data models, up to complex ones where you have cubes and where you need to define who is allowed, so to speak, to access which slice of data and to combine which data, it's more than saying, okay, someone is allowed to perform that action on a certain system. So there's a level of complexity in how to control data. But I think this is where we need to start, and where I'd like to focus the conversation a little: how can we understand where which data resides? And this is not only about protecting our intellectual property and valuable data and avoiding data leakage; it's also a regulatory requirement. When you look at PII, personally identifiable information, as part of the data, it's essential to be able to understand where the data is in order to protect it adequately.
Exactly. And when we think of just this right to be forgotten, when you want to execute that and to comply with this requirement that's coming out of GDPR, then you really need to understand where, in which instances, this data exists throughout your whole organization. Just to make sure of that and to implement it requires a much deeper understanding of your data structures and your data flows than many organizations currently have.
Yeah. And that is why I think it's obvious that we must make this a key initiative in both our IAM and our data and analytics initiatives within the organization. It is an essential topic. I believe there can be no doubt about that. The question is, how do we do this? So what is the way to, I think "solve" is a big word, but to get better in big data and data governance and security? What is the place to start?
Is there a first silver lining on the horizon? Where are solutions that can tackle what is obviously a huge gap that many organizations are facing right now?
Yes, there is, and that's exactly where I believe the place to start is. The place to start, from my perspective, is knowing where data resides and which data resides there. And that is something which is part of a set of technologies which is today commonly referred to as metadata management. Metadata management is a technology which helps in discovering and extracting metadata, so data about data, data saying "this is that type of data", and putting this into a data catalog. So metadata management is very closely related to data catalogs. In fact, it is something which helps fill data catalogs, and it then gives us what is called data lineage. Data lineage is about having the lineage, the line of data, the data flow. So knowing where data flows. Metadata management and data catalogs are important well beyond governance and security, but they are what can enable a far better grip on our data from a data governance and data security perspective, because this helps us to know: where is which data?
Where is it produced? It helps us to learn about it, to discover it and, in effect, then to manage it. We see a number of vendors in that space, and we see some really mature, enterprise-level solutions in that space. I don't want to name vendors here, but this technology is available. And if we want to succeed in big data governance and security, I believe it's essential to look at metadata management and data catalogs and to start there. And this is something which then benefits a broad variety of other use cases, such as enabling better analytics: when you know where data resides, it is easier to analyze data. It will help in data quality, because you learn about where redundant data is held, et cetera. But from the perspective of today's talk, without that, I don't see a chance to succeed.
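To make the idea of a data catalog with lineage concrete, here is a minimal Python sketch. It is not how any particular metadata management product works. The `CatalogEntry` and `DataCatalog` classes, the data set names and the rule that a PII flag is inherited by everything derived from a flagged data set are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata about one data set (hypothetical, simplified structure)."""
    name: str
    location: str
    contains_pii: bool = False
    derived_from: list = field(default_factory=list)  # upstream data set names


class DataCatalog:
    def __init__(self):
        self.entries = {}

    def register(self, entry):
        self.entries[entry.name] = entry

    def downstream(self, name):
        """Data sets that directly or indirectly derive from `name`."""
        found, seen, queue = [], {name}, deque([name])
        while queue:
            current = queue.popleft()
            for entry in self.entries.values():
                if current in entry.derived_from and entry.name not in seen:
                    seen.add(entry.name)
                    found.append(entry.name)
                    queue.append(entry.name)
        return found

    def pii_locations(self):
        """Data sets flagged as PII plus everything derived from them."""
        names = set()
        for entry in self.entries.values():
            if entry.contains_pii:
                names.add(entry.name)
                names.update(self.downstream(entry.name))
        return sorted(names)


catalog = DataCatalog()
catalog.register(CatalogEntry("crm.customers", "crm-db", contains_pii=True))
catalog.register(CatalogEntry("erp.products", "erp-db"))
catalog.register(CatalogEntry("lake.customers_raw", "data-lake",
                              derived_from=["crm.customers"]))
catalog.register(CatalogEntry("analytics.churn_report", "bi-tool",
                              derived_from=["lake.customers_raw", "erp.products"]))

print(catalog.pii_locations())
# ['analytics.churn_report', 'crm.customers', 'lake.customers_raw']
```

Following the lineage downstream is what answers the question of where which data resides. For a right-to-be-forgotten request, a list like this would be the starting point for finding every copy of the data.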
So we need to add another building block, on the one hand for applying access control and access management in a global manner within an organization. And on the other hand, this is an enabler, as you've mentioned, for the already existing use cases, when we think of nice front ends that provide visualizations, or when you think of machine learning algorithms that are running on top of this consolidated data. So that is, or should only be, possible when and if you have such a mechanism in place. It will also be really needed to extend the way we think of data control.
Yes, it is an additional element in what we need. The good thing is, it's not that we need it only for the identity management, access-related, governance-related and security-related challenges we are facing. It is something we need anyway if we want to deal successfully with data. And data is in some way, though I don't like this term "data is the new gold", an essential element for capitalizing on what we can do with the data and for succeeding in digital transformation. And for that, we need metadata management, we need data catalogs, and we need that element. So commonly it will not be the identity or the security requirement which drives the metadata management and data catalog project, but it will massively benefit from it. And it will then also enable vendors of data security and database security solutions to build and to extend their current offerings by knowing where to connect and having one place where they can figure out all the systems and the places to connect to. There is still a long way to go, but this is the foundation. Yes, it's another element. The good thing is that it normally will not be owned by the IAM department. It will just be consumed.
Understood. And I think we've just put our first toe into that topic, and we will come back to and reiterate on that topic in further editions of this podcast. Is there already research available? You've mentioned a blog post that you did earlier on that topic. So if somebody just uses our search engine at KuppingerCole dot com and types in "metadata", that should be enough to get there. What else is there?
Yeah, I think it's best to enter "big data" and "big data security" and "big data governance". There are a couple of advisory notes on that subject. We have already had a Leadership Compass for database security out for a while, and in multiple editions over time. So there's more to come around that, because we believe that the future of access governance goes beyond system and application access. It must include data governance. So data governance is one of the key subjects of what we are looking at when it comes to governance, to a new and broadly defined access governance, because at the end, that is what we need to protect.
Normally I do a summary to sum up what we've been talking about. You did that already, so thank you very much, Martin, for being with me today. Thanks for your time. And thank you very much to the audience for listening. Bye-bye.
Bye-bye. Thank you for that conversation. Thank you.