Data Security and Governance (DSG) for Big Data and BI Environments
The growing relevance of Big Data and BI environments in supporting business decisions requires a broad set of data points to be collected throughout the lifetime of an application and its users’ interactions with it. Security and risk management leaders must ensure that the risks arising from the abundance and extensive use of this data are identified and that the right security and governance controls are employed to address them.
1 The Challenge
The massive collection of data by organizations to create data lakes and Big Data environments for better BI engagements raises urgent concerns about the appropriate and compliant use of that data. Big Data comprises large and complex sets of structured, unstructured and semi-structured data that can’t be managed and secured with the traditional database security tools and techniques available in the market. The abundance of data, combined with multiple distributed processing nodes, only adds to the problem of inconsistent security controls, resulting in data theft, security breaches and failed audits.
Business Intelligence platforms help to extract and interpret business data using interactive tools for effective and accurate decision making. These platforms not only use multiple data sources but also provide self-service data modelling and dynamic content sharing options, all of which only exacerbate the problem of understanding data flows and complying with privacy and data residency regulations. A lack of visibility into information flows, particularly for unstructured data, leads to inconsistent access policies.
Security has not been part of the strategy set by CIOs or CDOs and is therefore largely missing from the architecture designs of Big Data and BI platforms. Distributed platforms increase complexity: security practices are inconsistent across nodes, and the interactions between distributed nodes are generally not secured.
Fragmented Big Data environments make it difficult to understand data flows and to apply homogeneous security policies and access controls throughout the environment.
The primary drivers for Data Security & Governance (DSG) for Big Data and BI platforms remain:
- Securing data processing across the distributed data environment
- Ensuring data quality through input source filtering and data validation
- Complying with data residency and privacy regulations
- Achieving access governance to a granular data level
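As an illustration of the last driver, access governance at a granular data level can be thought of as a policy evaluated per field rather than per table or per file. The following is a minimal sketch only, not the mechanism of any particular Big Data or BI product: the names FieldPolicy, POLICIES and apply_field_policies, the role names and the sample fields are all hypothetical, and the deny-by-default masking is an assumed design choice.

```python
from dataclasses import dataclass

# Hypothetical policy model: every name below is illustrative and does not
# correspond to any specific Big Data or BI platform.
@dataclass(frozen=True)
class FieldPolicy:
    allowed_roles: frozenset  # roles that may see the raw value
    mask: str = "***"         # value shown to every other role

# Assumed example policies, keyed by field name.
POLICIES = {
    "email": FieldPolicy(frozenset({"privacy_officer"})),
    "country": FieldPolicy(frozenset({"analyst", "privacy_officer"})),
}

DEFAULT_MASK = "***"

def apply_field_policies(record: dict, role: str) -> dict:
    """Return a copy of `record` with each field masked or passed through.

    A field is returned unmasked only if a policy exists for it and the
    role is listed in that policy. Fields without a policy are masked
    (deny by default), so newly added fields are not exposed by accident.
    """
    result = {}
    for name, value in record.items():
        policy = POLICIES.get(name)
        if policy is not None and role in policy.allowed_roles:
            result[name] = value
        else:
            result[name] = policy.mask if policy is not None else DEFAULT_MASK
    return result
```

Calling apply_field_policies on a record with the role "analyst" would leave the country field readable and mask the email field along with any field that has no policy. Keeping the policy table separate from the data is what would allow the same rules to be applied consistently across nodes and data sources.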
While most Big Data and BI platforms offer proprietary access control mechanisms, these are limited in scope and remain fragmented. Most of these controls are inflexible and depend on the vendor’s technology. Existing IAM tools do not support Big Data and BI operations and therefore can’t be extended to address the security requirements of such environments. Further, access control for the vast amount of unstructured data is a major challenge for most organizations and security leaders, and one that most IAM vendors leave largely unaddressed today. With the massive collection of data from multiple sources, including third-party data streams, it becomes increasingly important for CIOs, CISOs and CDOs to implement effective data security and governance (DSG) for Big Data and BI platforms in order to gain the required visibility into, and an appropriate level of control over, the data flowing through enterprise systems, applications and databases. Adding to these challenges is the shortage of InfoSec professionals in the market; Big Data security skills in particular remain scarce.