1 Introduction
The amount of data is continuously growing. So is the number of places where data is kept, and so is the variety of databases and other technologies for storing and managing data.
At the same time, in an age where digital solutions build on data, from utilizing customer data for targeted management to the many varieties of AI (Artificial Intelligence) solutions that base their decisions on data, knowing which data resides where becomes imperative. Statements such as "data is the new gold" or "data is the new oil" emphasize the importance of data in the digital age. You can't use what you don't know – this is where Data Catalogs and Metadata Management come in, providing an overview of where data resides and insight into how data flows, as well as into the type and quality of data.
Utilizing data and "democratizing" data, i.e., making it available to more users for more use cases, is just one of today's challenges. With the increasing value of data, adequate protection moves to the center of attention. In addition, an ever-increasing number of regulations mandate that organizations protect certain types of data, such as PII (Personally Identifiable Information). You can't protect what you don't know – Data Catalogs and Metadata Management help in enforcing Data Governance and Data Security where it's needed.
While the focus in data utilization has been on business intelligence and analytics, the governance and security focus has been on technical measures. On the one hand, these include database security and big data security, which provide, e.g., encryption capabilities or protection against specific types of attacks such as SQL injection. On the other hand, there is the established field of Access Governance, which focuses on managing and restricting entitlements and on controlling access to data, but not on the data itself.
Data Catalogs and Metadata Management deliver the foundation for both better utilization of data and better protection and governance of data. Unfortunately, these technologies are not as widely used as they should be. This wastes part of the potential of the data that organizations own, and it increases the risk of malicious use of data, of data leakage and other data security issues, and of non-compliance with data-related regulations.

From our perspective, implementing feature-rich and comprehensive approaches for Metadata Management and the underlying Data Catalogs is a must for modern organizations, enabling both better utilization of data and a higher level of data governance and data security. Additionally, by understanding where data resides and how "good" that data is, organizations can optimize the creation, management, quality, and consumption of data, and thus increase the value of data while lowering the cost of redundant storage, management, and analytics.
1.1 Highlights
- Metadata Management is evolving into a core discipline within data management, data governance, and analytics, providing a unified perspective across all data sources.
- Data Catalogs are the central element of metadata management, delivering a repository of data and insights into the value of that data.
- Most solutions only handle metadata, but don't synchronize the data itself.
- The ability to scan a wide variety of sources, beyond traditional databases and data lakes, is essential for a comprehensive overview of where data resides.
- Data lineage is a key capability, analyzing and visualizing the flow of data between different databases, data lakes, analytics applications, etc.
- Many solutions provide integrated capabilities for data governance and data privacy, while others only integrate with specialized solutions.
- Metadata management and data catalogs are also the foundation for "data democratization", enabling users to better work with and consume data, but also to collaborate with others on it.
- Overall Leaders (in alphabetical order) are Alation, Collibra, Informatica, and OneTrust.
- Product Leaders (in alphabetical order) are Alation, Collibra, Global IDs, Informatica, OneTrust, OvalEdge, and Precisely.
- Innovation Leaders (in alphabetical order) are Alation, Collibra, Informatica, OneTrust, and OvalEdge.