Consumer permission is not compliance

GDPR and POPI compliance demands a restructuring of data management practices, along with deep data and process mapping.

Mervyn Mooi.

Europe’s General Data Protection Regulation (GDPR) has sparked a flurry of e-mails and notices from businesses and suppliers asking consumers to allow them to use their personal information for brand marketing and related purposes.

Companies have added opt-in notices to their sites and briefed their teams on GDPR and POPI compliance. Unfortunately for them, these measures are far from adequate for what is required to comply with data protection and privacy regulation.

Superficial GDPR and POPI compliance (such as getting consumer permission to send them information and taking broad steps to improve information security) is not true data governance, and many organisations fail to realise this.

Having policies in place or protecting information inside a system is not enough. Even data protected within an organisation can be misused or leaked by employees, whether deliberately or through an action as apparently innocent as passing on a sales lead or a job applicant’s CV to a colleague.

Effective governance and data protection still rest heavily on the discipline of the people handling the information. Therefore, when anyone in the company can access unprotected data and information, any governance mechanisms in place are at risk.

How stringently Europe will enforce GDPR remains to be seen. Although South African law is not yet fully equipped to handle individuals’ lawsuits against companies for failing to protect their personal information, it is only a matter of time before someone challenges an organisation over the protection of personal information. When that happens, the onus will be on the company to prove what measures it took to protect the information.

Contingency measures for protecting data should be in place in case the discipline of people falters. One such measure, which is pivotal to enabling and proving governance, is to map the rules, conditions, checks and standards (RCCSs) transcribed from the regulations or accords (including GDPR, which covers data privacy, through to POPI, King III, BCBS239, KYC and PCI) to the respective accountable and responsible people, to the data domains, and to the control points of the processes that handle the data/information within an organisation. These mappings need to be captured and maintained within a registry.
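As a rough illustration of what one entry in such a registry might hold, here is a minimal Python sketch. The class names, rule identifiers, roles and control-point names are all hypothetical and are not taken from any regulation or product; the point is only that each rule is tied to people, data domains and process control points, and that gaps in the mapping can be listed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RccsMapping:
    """One rule, condition, check or standard (RCCS) and what it is mapped to."""
    rccs_id: str            # hypothetical identifier for the rule
    source: str             # regulation or accord the rule was transcribed from
    accountable: str        # person or role accountable for the rule
    responsible: str        # person or role responsible for carrying it out
    data_domains: tuple     # data domains the rule applies to
    control_points: tuple   # process steps where the rule is enforced


@dataclass
class RccsRegistry:
    mappings: list = field(default_factory=list)

    def register(self, mapping: RccsMapping) -> None:
        self.mappings.append(mapping)

    def unmapped(self) -> list:
        """Return ids of rules that lack a data domain or a control point."""
        return [m.rccs_id for m in self.mappings
                if not m.data_domains or not m.control_points]


registry = RccsRegistry()
registry.register(RccsMapping(
    "GDPR-ERASURE", "GDPR", "Data Protection Officer", "CRM team lead",
    ("customer",), ("crm.delete_customer",)))
registry.register(RccsMapping(
    "POPI-CONSENT", "POPI", "Head of Marketing", "Campaign manager",
    ("customer",), ()))  # no control point mapped yet

gaps = registry.unmapped()
print(gaps)
```

In this sketch the second rule is reported as a gap because no process control point has been mapped to it yet.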

Building an effective and future-proof RCCS registry can be a lengthy process. However, creating and maintaining this registry is readily achieved within the practice of metadata management, which already records the mappings; these then simply need to be linked to the policies, procedures and guidelines drawn from the accords and regulations.

A registry typically evolves over time, mapping RCCSs to people, processes and data, and ultimately proving that all rules, policies and procedures are physically implemented across all processes where the data is handled.

Once the mapping registry is in place, it becomes easier to identify and prevent data breaches or information leakage. More importantly, it also allows the organisation to ensure its data management rules and handling thereof are fully aligned with legislation across the organisation.

An effective digital RCCS mapping registry allows the auditor and responsible parties to easily link processes and data to legislation and policies, or to drill down to individual data fields to track compliance throughout the data’s lifecycle/lineage.
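The drill-down described here could look something like the following Python sketch, which indexes mappings by data field so that an auditor can ask which rules and control points touch a given field. The field names, rule identifiers and control points are invented for illustration only.

```python
from collections import defaultdict

# (rule id, source regulation, data field, process control point)
# All values are hypothetical examples.
MAPPINGS = [
    ("GDPR-ERASURE", "GDPR", "customer.email", "crm.delete_customer"),
    ("POPI-CONSENT", "POPI", "customer.email", "marketing.send_campaign"),
    ("PCI-MASK-PAN", "PCI", "payment.card_number", "billing.store_card"),
]


def index_by_field(mappings):
    """Group (source, rule id, control point) tuples under each data field."""
    by_field = defaultdict(list)
    for rule_id, source, data_field, control_point in mappings:
        by_field[data_field].append((source, rule_id, control_point))
    return by_field


def rules_for_field(by_field, data_field):
    """Return the rules touching one field, sorted for stable reporting."""
    return sorted(by_field.get(data_field, []))


by_field = index_by_field(MAPPINGS)
email_rules = rules_for_field(by_field, "customer.email")
for source, rule_id, control_point in email_rules:
    print(f"{source}: {rule_id} enforced at {control_point}")
```

A field with no entries returns an empty list, which in practice would itself be worth flagging for review.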

But regardless of whether an organisation has all the measures and controls in place to ensure GDPR RCCSs are implemented, governance (including that for the protection of data/information) still needs to be proved through presentation or reporting.

In other words, a full data and process tracking (or lineage) and reporting capability needs to be in place, managed by a data governance organisational structure of people and regulated by a data governance framework which includes an engagement model that would be necessary between all responsible, accountable, consulted and informed parties.

For many, this could mean rebuilding their data management operating and system models from the ground up. Organisations should be taking steps now to put in place metadata management as the foundation for enabling compliance.

To build their ability to prove governance, organisations must prioritise this “governance” mapping exercise. Few companies have achieved this ‘sweet spot’ of data governance.

As the legislative environment changes and individuals begin challenging misuse of personal information, companies will increasingly be called on to show deep mapping and deep governance. Few, if any, do this today, but the implementation of GDPR serves as a useful reminder that this process should start now.

Blockchain in the compliance arsenal

By Mervyn Mooi

Blockchain technology may support some data management efforts, but it’s not a silver bullet for compliance.

Amid growing global interest in the potential for blockchain technologies to support data management, enterprises may be questioning blockchain’s role in compliance, particularly as the deadline looms for compliance with the European Union General Data Protection Regulation (GDPR).

For South African enterprises, compliance with the Protection of Personal Information (POPI) Act and alignment with the GDPR are a growing concern. Because GDPR and POPI are designed to foster best practice in data governance, it is in the best interests of any company to follow their guidelines for data quality, access, life cycle management and process management – no matter where in the world they are based.

At the same time, blockchain is attracting worldwide interest from a storage efficiency and optimisation point of view, and many companies are starting to wonder whether it can effectively support data management, security and compliance. One school of thought holds that, moving beyond crypto-currency, blockchain’s decentralised data management systems and ledgers present new opportunities for more secure, more efficient data storage and processing.

However, there are still questions around how blockchain will align with best practice in data management and whether it will effectively enhance data security.

Currently, blockchain technology for storing data may be beneficial for historic accounting and tracking/lineage purposes (as it is immutable), but there are numerous factors that limit blockchain’s ability to support GDPR/POPI and other compliance requirements.

Immutability pros and cons

Because public blockchains are immutable, once data is stored in a blockchain it cannot be changed or deleted. This supports auditing by keeping a clear record of the original and of every change made to the data. However, while blockchain stores the lineage of data in an economical way, it will not address data quality and integration issues.
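The audit property described above comes from each block carrying a hash of the block before it. The following is a deliberately simplified, single-process Python sketch of that idea; it leaves out distribution and consensus, which real blockchains depend on, and the record contents are invented.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, payload, prev_hash):
    """Hash the block contents together with the previous block's hash."""
    body = json.dumps({"index": index, "payload": payload, "prev": prev_hash},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain, payload):
    """Changes are never made in place; each one is added as a new block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({"index": index, "payload": payload, "prev": prev_hash,
                  "hash": block_hash(index, payload, prev_hash)})


def is_valid(chain):
    """Recompute every hash and link; any edit to history breaks the chain."""
    prev_hash = GENESIS
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["payload"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append(chain, {"customer": "C-1", "email": "old@example.com"})
append(chain, {"customer": "C-1", "email": "new@example.com"})  # change = new block
valid_before = is_valid(chain)

chain[0]["payload"]["email"] = "tampered@example.com"  # rewrite history in place
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

Note that the sketch stores whatever it is given, which is consistent with the point that the ledger records lineage but does nothing for data quality.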

It should also be noted that this same immutability could raise compliance issues around the GDPR’s right to be forgotten guidelines. These dictate the circumstances under which records should be deleted or purged.

In a public blockchain environment, this is not feasible. Indeed, in many cases, it would not be realistic or constructive to destroy all records, and this is an area where local enterprises would need to carefully consider how closely they want to align with GDPR, and whether encryption to put data beyond use would suffice to meet GDPR’s right to be forgotten guidelines.
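One way the "encryption to put data beyond use" idea is sometimes approached is to encrypt each data subject's records with a key held off-chain and to destroy that key on an erasure request. The Python sketch below illustrates the mechanics only. The `KeyVault` class and subject identifiers are hypothetical, and the XOR keystream built from SHA-256 is a toy stand-in for a vetted cipher, not something to use in production. Whether this approach satisfies the right to be forgotten remains the open question raised above.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts and decrypts (XOR is symmetric). No nonce, no authentication."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


class KeyVault:
    """Hypothetical off-chain store holding one key per data subject."""

    def __init__(self):
        self._keys = {}

    def create_key(self, subject_id: str) -> bytes:
        self._keys[subject_id] = secrets.token_bytes(32)
        return self._keys[subject_id]

    def get(self, subject_id: str):
        return self._keys.get(subject_id)

    def shred(self, subject_id: str) -> None:
        self._keys.pop(subject_id, None)  # the ciphertext stays on the ledger


def read(entry, vault):
    subject_id, ciphertext = entry
    key = vault.get(subject_id)
    return None if key is None else xor_cipher(key, ciphertext)


vault = KeyVault()
ledger = []  # stands in for the immutable chain
ledger.append(("subject-42", xor_cipher(vault.create_key("subject-42"), b"jane@example.com")))

before = read(ledger[0], vault)
vault.shred("subject-42")          # erasure request: destroy the key only
after = read(ledger[0], vault)
print(before, after)
```

The ledger entry is never modified; only the key is removed, after which the stored bytes can no longer be read through the vault.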

Publicly stored data concerns

In addition to the right to be forgotten issue, there is the challenge that data protection, privacy and accessibility are always at risk if data is stored in a public domain, such as the cloud or a blockchain environment. Therefore, enterprises considering the storage optimisation benefits of blockchain would also have to consider whether the core and confidential data is locally stored on private chains, and more importantly, whether those chains are subjected to security and access rules and whether the chain registries in the blockchain distributed environment are protected and subject to availability rules.

Blockchain environments also potentially present certain processing limitations: enterprises will have to consider whether blockchain will allow for parts of the chain stored for a particular business entity, such as a customer (or its versions), to be accessed and processed separately by different parties (data subjects) and/or processes.

Data quality question

The pros and cons of blockchain’s ability to support storage, management and security of data in the environment are just one side of the compliance coin: data quality is also a requirement of best practice data management. This is not a function of blockchain and therefore cannot be guaranteed by blockchain. Indeed, blockchain will store even unqualified data prior to its being cleansed and validated.

Enterprises will need to be aware of this, and consider how and where such data will be maintained. The issues of data integration and impact analysis also lie outside the blockchain domain.

IDC notes: “While the functions of the blockchain may be able to act independently of legacy systems, at some point blockchains will need to be integrated with systems of record,” and says there are therefore opportunities for “blockchain research and development projects, [to] help set standards, and develop solutions for management, integration, interoperability, and analysis of data in blockchain networks and applications”.

While blockchain is set to continue making waves as ‘the next big tech thing’, it remains to be seen whether this developing technology will have a significant role to play in compliance and overall data management in future.

Sub-second analytical BI time to value still a pipe dream

Internet search engines with instant query responses may have misled enterprises into believing all analytical queries should deliver split second answers.

With the advent of Big Data analytics hype and the rapid convenience of internet searches, enterprises might be forgiven for expecting to have all answers to all questions at their fingertips in near real time.

Unfortunately, getting trusted answers to complex questions is a lot more complicated and time consuming than simply typing a search query. Behind the scenes on any internet search, a great deal of preparation has already been done in order to serve up the appropriate answers. Google, for instance, dedicates vast amounts of high-end resources and all of its time to preparing the data necessary to answer a search query instantly. But even Google cannot answer broad questions or make forward-looking predictions. In cases where the data is known and trusted, the data has been prepared and rules have been applied, and the search parameters are limited, such as with a property website, almost instant answers are possible, but this is not true BI or analytics.

Within the enterprise, matters become a lot more complicated. When the end-user seeks an answer to a broad query – such as when a marketing firm wants to assess social media to find an affinity for a certain range of products over a 6-month period – a great deal of ‘churn’ must take place in the background to deliver answers. This is not a split-second process, and it may deliver only general trend insights rather than trusted, quality data that can serve as the basis for strategic decisions.

When end users wish to run a query and are given the power to process their own BI/analytics, lengthy churn must take place. Every time a query, report or instance of data access is converted into useful BI/analytical information for end-consumers, a great deal of preparation work has to be done along the way: identify data sources > access > verify > filter > pre-process > standardise > look up > match > merge > de-dup > integrate > apply rules > transform > pre-process > format > present > distribute/channel.
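To make the preparation chain above more concrete, here is a small Python sketch that runs a handful of those stages (verify, standardise, de-dup, present) in sequence over a few invented rows and records how long each stage takes. The stage functions and sample data are illustrative assumptions; a real chain would have many more stages and vastly more data.

```python
import time

RAW = [
    {"customer": " Alice ", "product": "Shoes", "amount": "120.50"},
    {"customer": "alice", "product": "shoes", "amount": "120.50"},  # duplicate once standardised
    {"customer": "Bob", "product": "Hats", "amount": "n/a"},       # fails verification
    {"customer": "Cara", "product": "Shoes", "amount": "80"},
]


def verify(rows):
    """Keep only rows whose amount can be read as a number."""
    def is_number(text):
        try:
            float(text)
            return True
        except ValueError:
            return False
    return [r for r in rows if is_number(r["amount"])]


def standardize(rows):
    """Trim and lower-case text fields; convert amounts to floats."""
    return [{"customer": r["customer"].strip().lower(),
             "product": r["product"].strip().lower(),
             "amount": float(r["amount"])} for r in rows]


def de_dup(rows):
    """Drop exact duplicates while preserving order."""
    seen, unique = set(), []
    for r in rows:
        key = (r["customer"], r["product"], r["amount"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def present(rows):
    """Summarise total amount per product."""
    totals = {}
    for r in rows:
        totals[r["product"]] = totals.get(r["product"], 0.0) + r["amount"]
    return totals


STAGES = [verify, standardize, de_dup, present]


def run_pipeline(data, stages):
    """Run each stage in order, recording how long each one takes."""
    timings = []
    for stage in stages:
        started = time.perf_counter()
        data = stage(data)
        timings.append((stage.__name__, time.perf_counter() - started))
    return data, timings


result, timings = run_pipeline(RAW, STAGES)
print(result)
for name, seconds in timings:
    print(f"{name}: {seconds:.6f}s")
```

Every stage must finish before the next can start, so the time to an answer is the sum of all stages, and it is paid again on each run unless the prepared output is kept.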

Because most queries have to traverse, link and process millions of rows of data and possibly trillions of words from within the data sources, this background churn could take hours, days or even longer.

A recent TDWI study found that organisations are dissatisfied with the time it takes for the chain of processes involved in BI, analytics and data warehousing to deliver valuable data and insights to business users. The organisations attributed this, in part, to ill-defined project objectives and scope, a lack of skilled personnel, data quality problems, slow development or an inability to access all relevant data.

The problem is that most business users are not BI experts and do not all have analytical minds, so the discover and report method may be iterative (therefore slow) and in many cases the outputs/results are not of the quality expected. The results may also be inaccurate as data quality rules may not have been applied, and data linking may not be correct, as it would be in a typical data warehouse where data has been qualified and pre-defined/derived. In a traditional situation, with a structured data warehouse where all the preparation is done in one place, and once only, and then shared many times, supported by quality data and predefined rules, it may be possible to get sub-second answers. But often even in this scenario, sub-second insights are not achieved, since time to insight also depends on properly designed data warehouses, server power and network bandwidth.
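The contrast between preparing once and sharing many times, versus redoing the work per query, can be sketched in a few lines of Python. The sales rows and function names are invented for illustration; the sketch shows only the shape of the trade-off, not real warehouse behaviour.

```python
from collections import defaultdict

# Hypothetical raw rows: (product, month, amount)
SALES = [
    ("shoes", "2018-01", 120.5),
    ("shoes", "2018-02", 80.0),
    ("hats", "2018-01", 35.0),
]


def answer_from_raw(sales, product):
    """Scan every raw row each time the question is asked."""
    return sum(amount for name, _month, amount in sales if name == product)


def prepare_totals(sales):
    """Do the scan once, up front, and keep the result for reuse."""
    totals = defaultdict(float)
    for name, _month, amount in sales:
        totals[name] += amount
    return dict(totals)


def answer_from_prepared(prepared, product):
    """A single lookup against data that was prepared in advance."""
    return prepared.get(product, 0.0)


prepared = prepare_totals(SALES)          # paid once
raw_answer = answer_from_raw(SALES, "shoes")
fast_answer = answer_from_prepared(prepared, "shoes")
print(raw_answer, fast_answer)
```

Both paths return the same figure; the difference is that the prepared path answers only the questions it was built for, which is why it suits predefined warehouse queries better than open-ended discovery.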

Users tend to confuse search and discover on flat raw data that’s already there, with information and insight generation at the next level. In more complex BI/Analytics, each time a query is run, all the preparation work has to be done from the beginning and the necessary churn can take a significant amount of time.

Therefore, demanding faster BI ‘time to value’ and expecting answers in sub-seconds could prove to be a costly mistake. While it is possible to gain some form of output in sub-seconds, these outputs will likely not be qualified, trusted insights that can deliver real strategic value to the enterprise.

By Mervyn Mooi, Director at Knowledge Integration Dynamics (KID)