Big data follows the BI evolution curve

Big data analysis in South Africa is still at an early stage of maturity and has yet to evolve in much the same way as BI did 20 years ago, says Knowledge Integration Dynamics.

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)

Big data analysis tools aren’t ‘magical insight machines’ spitting out answers to every business question. As is the case with all business intelligence tools, lengthy and complex processes must take place behind the scenes before actionable and relevant insights can be drawn from the vast and growing pool of structured and unstructured data in the world.


South African companies of all sizes have an appetite for big data analysis, but because the country’s big data analysis segment is relatively immature, they are still focused on their big data strategies and the complexity of actually getting the relevant data out of this massive pool of information. We find many enterprises currently looking at technologies and tools like Hadoop to help them collate and manage big data. There are still misconceptions around the tools and methodologies for effective big data analysis: companies are sometimes surprised to discover they are expecting too much, and that a great deal of ‘pre-work’, strategic planning and resourcing is necessary.

Much like BI in its early days, big data analysis starts as a relatively unstructured, ad hoc discovery process. Once patterns are established, models are developed and the process becomes a structured one.

And in the same way that BI tools depend on data quality and relationship linking, big data must be qualified in some form before it is used. The data needs to be profiled for flaws, which must then be cleansed (quality); it must be placed in a relevant context (relationships); and it must be timeous in the context of what is being searched or reported on. Methods must be devised to qualify much of the unstructured data, as a big question remains around how trusted and accurate information from the internet will be.
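As a rough illustration of what qualifying data can involve, the sketch below checks a single record against the three criteria named above: quality, relationships and timeliness. The field names (`customer_id`, `text`, `captured_on`), the 30-day freshness window and the function name are hypothetical choices made for the example, not part of any particular tool.

```python
from datetime import date, timedelta


def profile_record(record, known_customer_ids, today, max_age_days=30):
    """Return a list of issues found in one record (an empty list means it qualifies)."""
    issues = []

    # Quality: required fields must be present and non-blank.
    for field in ("customer_id", "text", "captured_on"):
        if not record.get(field):
            issues.append(f"missing {field}")

    # Relationships: the record must link to a customer that is already known.
    customer_id = record.get("customer_id")
    if customer_id and customer_id not in known_customer_ids:
        issues.append("unlinked customer_id")

    # Timeliness: stale records are flagged rather than silently used.
    captured_on = record.get("captured_on")
    if captured_on and (today - captured_on) > timedelta(days=max_age_days):
        issues.append("stale")

    return issues


if __name__ == "__main__":
    sample = {"customer_id": "C9", "text": "", "captured_on": date(2023, 1, 1)}
    print(profile_record(sample, {"C1"}, today=date(2024, 1, 15)))
```

In practice the rules would come from the business rather than being hard-coded, but the shape is the same: profile first, then cleanse or reject what fails.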

The reporting and application model that uses this structured and unstructured data must be addressed, and the models must be tried and tested. In the world of sentiment analysis and trends forecasting based on ever-changing unstructured data, automated models are not always the answer. Effective big data analysis also demands human intervention from highly skilled data scientists who have both business and technical experience. These skills are still scarce in South Africa, but we are finding a growing number of large enterprises retaining small teams of skilled data scientists to develop models and analyse reports.

As local big data analysis matures, we will find enterprises looking to strategise on their approaches, the questions they want to answer, what software and hardware to leverage and how to integrate new toolsets with their existing infrastructure. Some will even opt to leverage their existing BI toolsets to address their big data analysis needs.  BI and big data are already converging, and we can expect to see more of this taking place in years to come.


Fast data is old hat but customers now demand it in innovative ways

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)

People don’t just need fast data, which is really real-time data by another name. The term also implies that the data or information derived, received or consumed needs to be relevant and actionable. That means it must, for example, initiate or enforce a set of follow-up or completion tasks.



Fast data is the result of data and information throughput at high speed. Real-time data has always been an enabler for real-time action that allows companies to respond to customer, business and other operational situations and challenges – almost immediately.

Fast, actionable data is that which is handed to decision-makers or users at lightning speed. But it is the application of knowledge gleaned from the data that is paramount. Give your business-people piles of irrelevant data at light speed and they will only get bogged down. Data consumers need the right insights at the right time so that they can marshal resources effectively to meet demand.

The problem for some companies is that they are still grappling with big data. There are many more sources of data, there are more types of data, and many organisations are struggling to connect the data from beyond their private domains with that inside their domains. Big data fuels fast data, but it must do so in real time, after being clearly interpreted and prepared, so that decision-makers can take action. And it must all lead back to improving customer service.



Why focus on customer service? Because, as Roxana Strohmenger, director, Data Insights Innovation at Forrester Research, says in a guest blog: “Bad customer experiences are financially damaging to a company.” The damage goes beyond immediate wallet share to include loyalty, which has potentially significant long-term financial implications.

Retailers, for example, are using the Internet of Things (IoT) to improve customer service. That’s essentially big data massaged and served directly to customers. The International Data Corporation (IDC) 2014 US Services Consumer Survey found that 34% of respondents said they use social media for customer support more than once a month. Customer support personnel who cannot access customer data quickly cannot efficiently help those people. In a 2014 report Forrester states: “Companies struggle to deliver reproducible, effective and personalised customer service that meets customer expectations.”

The concern for many companies is that they don’t get it right in time to keep up with their competition. They could spend years trying to regain market share at enormous expense.

So fast data can help, but how do you achieve it? In reality it differs little from any previous data programme that feeds your business decision-makers. The need has always been for reliable data, available as soon as possible, that helps people to make informed decisions. Today we find ourselves in the customer era. The advent of digital consumer technologies has given consumers a strong voice, with the associated ability to hold widespread sway over company image, brand perceptions and other consumers’ product choices. They can effectively influence loyalty and wallet share, so their needs must be met properly and quickly. Companies need to know what these people think so they can determine what they want and how to give it to them.

All of this comes back to working with data. Data warehouses provision information to create business insight. Business intelligence (BI), using a defined BI vision, supporting framework and strategy, delivers the insights that companies seek. Larger companies have numerous databases, data stores, repositories – call them what you will, their data sits in different places, often in different technologies. Decision-makers need to have a reliable view into all of it to get a consistent single view of customers, or risk erroneous decisions.

Data warehousing, BI and integration must be achieved within a strategic framework that leads back to the business goals, in this case at least partly improved customer service, so that the effort is cost-effective and efficient and delivers a proper return on investment (ROI).

The following standard system development life-cycle process applies as much to the world of immediacy driven by digital technologies as it did before it:


  1. Audit what exists and fix what is broken
  2. Assess readiness and implement a roadmap to the desired outcomes
  3. Discovery – scope requirements and what resources are available to meet them
  4. Design the system – develop it or refine what exists
  5. Implement the system – develop, test and deploy
  6. Train – executives and administrators
  7. Project manage – business users must be involved from the beginning to improve ROI and aid adoption
  8. Maintain – this essentially maintains ROI

Fast data relies on task and delivery agility using these pillars, which are in fact age-old data disciplines that must be brought to bear in a world where there are new and larger sources of data. The trick is to work correctly with these new sources, employ proven methodologies, and roll these out for maximum effect for customer satisfaction.



Governance: still the biggest hurdle in the race to effective BI

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)

Whether you’re talking traditional big stack BI solutions or new visual analytics tools, it’s an unfortunate fact that enterprises still buy in to the candy-coated vision of BI without fully addressing the underlying factors that make BI successful, cost-effective and sustainable.

Information management is a double-edged sword. Well architected, governed and sustainable BI will deliver the kind of data business needs to make strategic decisions. But BI projects built on ungoverned, unqualified data / information and undermined by ‘rebel’ or shadow BI will deliver skewed and inaccurate information: and any enterprise basing its decisions on bad information is making a costly mistake. Too many organisations have been doing the latter, resulting in failed BI implementations and investment losses.

For more than a decade, we at Knowledge Integration Dynamics have been urging enterprises to formalise and architect their enterprise information management (EIM) competencies based on best-practice or industry standards, which follow an architected approach and are subjected to governance.


EIM is a complex environment that needs to be governed and which encompasses data warehousing, business intelligence (BI), traditional data management, enterprise information architecture (EIA), data integration (DI), data quality management (DQM), master data management (MDM), data management life cycle (DMLC), information life cycle management (ILM), records and content management (ECM), metadata management and security / privacy management.

Effective governance is an ongoing challenge, particularly in an environment in which business must move at an increasingly rapid pace and information changes all the time.

For example, tackling governance in the context of data quality starts with the matching and merging of historic data to ensure that design and storage conventions are aligned and that all data is accurate according to set rules and standards. It is not just a matter of plugging in a BI solution that would give you results: it may require up to a year of careful design and architecture to integrate data from various departments and sources in order to feed the BI system. The conventions across departments within a single organisation are often dissimilar, and all data has to be integrated and qualified. Even data as apparently straightforward as a customer’s ID number may be incorrect (with digits transposed, coded differently between source systems, or missing), so the organisation must decide which data source or integration rule to trust. Only then can it ensure that data warehouses comply with quality rules and with the legislative standards needed to build the foundation of the 360-degree view of the customer that executive management aspires to. But integrating the data and addressing data quality is only one area where effective governance must be applied.
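To make the ID number example concrete, here is a minimal sketch of one possible integration rule: normalise each source’s value, discard values that fail a checksum, and take the survivor from the most trusted source. It assumes a 13-digit ID protected by a Luhn check digit; that format, the source names (`crm`, `billing`, `web`) and the priority order are assumptions for illustration, and the sample number is made up to satisfy the checksum.

```python
def luhn_valid(number: str) -> bool:
    """Return True if a digit string passes the Luhn checksum."""
    if not number.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def normalise_id(raw):
    """Strip spaces, dashes and other non-digits so differently coded IDs compare equal."""
    return "".join(ch for ch in (raw or "") if ch.isdigit())


def pick_trusted_id(candidates, source_priority, length=13):
    """Return (source, id) from the highest-priority source holding a valid ID."""
    cleaned = {src: normalise_id(raw) for src, raw in candidates.items()}
    valid = {src: value for src, value in cleaned.items()
             if len(value) == length and luhn_valid(value)}
    for src in source_priority:
        if src in valid:
            return src, valid[src]
    return None, None  # no source can be trusted; route to a data steward


if __name__ == "__main__":
    candidates = {
        "crm": "8001015009807",        # two digits transposed
        "billing": "800101 5009 087",  # coded with spaces
        "web": "",                     # missing
    }
    print(pick_trusted_id(candidates, ["crm", "billing", "web"]))
```

A checksum catches most transpositions but not every error, which is one reason a rule like this sits alongside manual verification rather than replacing it.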

Many organisations wrongly assume that in data, nothing changes. But in reality, the organisation must cater for constant change. For example, when reporting in a bank, customer records could be dramatically incorrect if the data fails to reflect that certain customers have moved to new cities, or that bank branch hierarchies have changed. Therefore, linking and change tracking are crucial in ensuring data integrity and accurate current and historic reporting.
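One common way to track such changes is to keep dated versions of a record instead of overwriting it, so that both current and historic reports resolve correctly. The sketch below shows the idea for a customer’s city; the row layout (`valid_from`, `valid_to`), the function names and the sample cities are hypothetical.

```python
from datetime import date


def record_change(history, customer_id, city, effective):
    """Close the customer's current row if the city changed, then append the new one."""
    current = next(
        (row for row in history
         if row["customer_id"] == customer_id and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["city"] == city:
            return history  # nothing changed, keep the existing row open
        current["valid_to"] = effective  # close the old version
    history.append({"customer_id": customer_id, "city": city,
                    "valid_from": effective, "valid_to": None})
    return history


def city_as_of(history, customer_id, on):
    """Return the city that was valid for the customer on a given date."""
    for row in history:
        if (row["customer_id"] == customer_id
                and row["valid_from"] <= on
                and (row["valid_to"] is None or on < row["valid_to"])):
            return row["city"]
    return None


if __name__ == "__main__":
    history = []
    record_change(history, "C1", "Durban", date(2012, 1, 1))
    record_change(history, "C1", "Cape Town", date(2014, 6, 1))
    print(city_as_of(history, "C1", date(2013, 3, 1)))  # historic view
    print(city_as_of(history, "C1", date(2015, 1, 1)))  # current view
```

The same pattern applies to branch hierarchies: a report dated 2013 should roll up to the hierarchy that existed in 2013, not to today’s.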

And automation takes you only so far: you can automate to the nth degree, but you still require data stewards to carry out certain manual verifications to ensure that the data is correct and remains so. Organisations also need to know who is responsible and accountable for their data and be able to monitor and control the lifecycle process from one end to the other. The goals are to eliminate multiple versions of the truth (results), have a trail back to sources and ensure that only the trusted version of the truth is integrated into systems.

Another challenge in the way of effective information management is the existence of ‘rebel’ or shadow data systems. In most organisations, departments that are frustrated by slow delivery from IT, or that have unique data requirements, start working in silos, creating their own spreadsheets, duplicating data and processes, and not feeding all the data back into the central architecture. This undermines effective data governance and results in huge overall costs for the company. Instead, all users should follow the correct processes and table their requirements, and the BI system should be architected to cater for these new requirements. It all needs to come through the central architecture. In this way, the entire ecosystem can be governed effectively and data / information can be delivered from one place, which also makes managing it easier and more cost-effective.

The right information management processes also have to be put in place, and they must be sustainable. This is where many BI projects fail: an organisation builds a solution and it lasts only a year, because no supporting frameworks were put in place to make it sustainable. Organisations need to take a standards-based, architected approach to ensure EIM and governance are sustained and perpetuated.

New BI solutions and best practice models emerge continually, but they will not solve business and operational problems if they are implemented in an ungoverned environment, in much the same way that a beautiful luxury car may have all the features you need but will not perform as it should unless the driver is disciplined.


Knowledge Integration Dynamics, Mervyn Mooi, (011) 462-1277,

Big data best practices, and where to get started

Big data analytics is on the ‘to do’ list of every large enterprise, and a lot of smaller businesses too. But perceived high costs, complexity and the lack of a big data game plan have hampered adoption in many South African businesses.

By Mervyn Mooi, Director, The Knowledge Integration Dynamics Group

Big data as a buzzword gets thrown around a great deal these days. Experts talk about zettabytes of data and the potential goldmines of information residing in the wave of unstructured data circulating in social media, multimedia, electronic communications and more.

As a result, every business is aware of big data, but not all of them are using it yet. In South Africa, big data analytics adoption is lagging for a number of reasons: not least of them, the cost of big data solutions. In addition, enterprises are concerned about the complexity of implementing and managing big data solutions, and the potential disruptions these programmes could cause to daily operations.

It is important to note that all business decision makers have been using a form of big data analytics for years, whether they knew it or not. Traditional business decision making has always been based on a combination of structured, tabular reports and a certain amount of unstructured data – be that a phone call to consult a colleague or a number of documents or graphs – and the analytics took place at the discretion of the decision maker. What has changed is that the data has become digital; it has grown exponentially in volume and variety, and now analytics is performed within an automated system. To benefit from the new generation of advanced big data analytics, there are a number of key points enterprises should keep in mind:

  • Start with a standards-based approach. To benefit from the almost unlimited potential of big data analytics, enterprises must adopt an architected and standards-based approach to data / information management implementation, which includes business requirements-driven integration, data and process modelling, quality and reporting, to name a few competencies.



In the context of an organised approach, an enterprise first needs to determine where to begin on its big data journey. The Knowledge Integration Dynamics Group is assisting a number of large enterprises to implement their big data programmes, and we have formulated a number of preferred practices and recommendations that deliver almost instant benefits and result in sustainable and effective big data programmes.

  • Proof of concept unlocks big value. Key to success is to start with a proof of concept (or pilot project) in a department or business subject area that has the most business “punch” or is of the most importance to the organisation. In a medical aid company, for example, the claims department or business might be the biggest cost centre and receive the most focus. The proof of concept or pilot for this first subject area should not be a throwaway effort, but rather a solution that can later be quickly productionised, with relevant adjustments, and reused as a template (or “footprint”) for programmes across the enterprise.
  • Get the data, questions and outputs right. Enterprises should also ensure that they focus on only the most relevant data and know what outputs they want from it. They have to carefully select the data / information for analytics that will give the organisation the most value for the effort. Furthermore, the metrics and reports that the organisation generates and measures itself by must also be carefully selected and adapted to specific business purposes. And of course, the quality and trustworthiness of sourced data / information must be ensured before analytical models and reports are applied to it.
  • Get the right tools. In many cases, enterprises do not know how to apply the right tools and methodologies to achieve this. Vendors are moving to help them by bringing to market templated solutions that are becoming more flexible in what they offer, so allowing organisations to cherry pick the functionality, metrics and features they need. Alternatively, organisations can have custom solutions developed.
  • It’s a programme, not a project. While proofs of concept typically show immediate benefits, it is important for organisations to realise that the proof of concept is not the end of the journey; it is just the beginning. Implementing the solution across the enterprise requires strategic planning, adoption of a common architected approach (e.g. to eliminate data silos and wasted / overlapping resources), and effective change management and collaboration initiatives to overcome internal politics and potential resistance and ensure the programme delivers enterprise-wide benefits.



SA companies are finally on the MDM and DQ bandwagon

Data integration and data quality management have become important factors for many South African businesses, says Johann van der Walt, MDM practice manager at Knowledge Integration Dynamics (KID).

We have always maintained that solid data integration and data quality management are essential building blocks for master data management (MDM), and we’re finally seeing that customers believe this too. One of the primary drivers behind this is the desire for service-oriented architecture (SOA) solutions, for which MDM is a prerequisite if they are to be effective. SOA relies on core data such as products, customers, suppliers, locations and employees. On that foundation, companies develop the capacity for lean manufacturing, supplier collaboration, e-commerce and business intelligence (BI) programmes. Master data also informs transactional systems and analytics systems, so bad quality master data can significantly impact revenues and customer service as well as company strategies.

Taken in the context of a single piece of data, MDM simply means ensuring one central record of a customer’s name, a product ID or a street address, for example. But McKinsey found in 2013 that companies employing in excess of 1 000 people have, on average, around 200 terabytes of data. Getting even small percentages of that data wrong can have wide-ranging ramifications for operational and analytical systems, particularly as companies attempt to roll out customer loyalty programmes or new products, let alone develop new business strategies. It can also negatively impact business performance management and compliance reporting. In the operational context, transactional processing systems refer to the master data for order processing, for example.



MDM is not metadata, which refers to technical details about the data. Nor is it data quality. However, MDM must have good quality data in order to function correctly. These are not new concerns. Both MDM and good quality data have existed for as long as there have been multiple data systems operating in companies. Today, though, they are exacerbated concerns because of the volume of data, the complexity of data, the most acute demand for compliance in the history of business, and the proliferation of business systems such as CRM, ERP and analytics. Add to that the fact that many companies use multiple instances of these systems across their various operating companies, divisions and business units, sometimes extending to multiple geographies, across time zones and with language variations. These factors combine to create a melting pot of potential error with far-reaching consequences unless MDM is correctly implemented based on good quality data.

None of these concerns yet raises the issue of big data or the cloud. Without first ensuring MDM is properly and accurately implemented around the core systems, companies don’t have a snowball’s hope in hell of succeeding at any big data or cloud data initiatives. Big data adds successive layers of complexity depending on the scope of the data and variety of sources. Shuffling data into the cloud, too, introduces a complexity that the vast majority of businesses, especially those outside of the top 500, simply cannot cope with. With big data alone companies can expect to see an average growth of 60% of their stored data annually, according to IDC. That can be a frightening prospect for CIOs and their IT teams when they are still struggling to grapple with data feeding core systems.
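To see why that growth rate is daunting, the short sketch below compounds 60% annual growth over a few years. Starting from the 200 terabyte average cited earlier is an assumption made purely for illustration, since the two figures come from different sources; the function name is hypothetical.

```python
def projected_storage_tb(start_tb, annual_growth, years):
    """Return stored data (in TB) for year 0 through `years`, compounding annually."""
    return [round(start_tb * (1 + annual_growth) ** year, 1)
            for year in range(years + 1)]


if __name__ == "__main__":
    # Illustrative only: 200 TB starting point, 60% growth per year, three years out.
    for year, size in enumerate(projected_storage_tb(200, 0.60, 3)):
        print(f"year {year}: {size} TB")
```

At that rate the volume more than quadruples within three years, so any error rate in the master data is carried across a much larger estate.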

While MDM is no longer a buzzword and data quality is an issue as old as data itself, they are certainly crucial elements that South African companies are addressing today.