Answers in no time

Internet search engines with instant query responses may have misled enterprises into believing all analytical queries should deliver split-second answers.

With the advent of big data analytics hype and the rapid convenience of Internet searches, enterprises might be forgiven for expecting to have all answers to all questions at their fingertips in near real-time.


 

Unfortunately, getting trusted answers to complex questions is a lot more complicated and time-consuming than simply typing a search query. Behind the scenes on any Internet search, a great deal of preparation has already been done in order to serve up the appropriate answers.

Google, for instance, dedicates vast amounts of high-end resources, continuously, to preparing the data necessary to answer a search query instantly. But even Google cannot answer broad analytical questions or make forward-looking predictions.

In cases where the data is known and trusted, the data has been prepared, rules have been applied and the search parameters are limited – such as on a property Web site – almost instant answers are possible. But this is not true business intelligence (BI) or analytics.

Behind the scenes

Within the enterprise, matters become a lot more complicated. When the end-user seeks an answer to a broad query – such as when a marketing firm wants to assess social media to find an affinity for a certain range of products over a six-month period – a great deal of ‘churn’ must take place in the background to deliver answers. This is not a split-second process, and it may deliver only general trend insights rather than trusted, quality data that can serve as the basis for strategic decisions.

Most business users are not BI experts.

 

When end-users wish to run a query and are given the power to process their own BI/analytics, a lengthy churn must take place. Every time a query, report or instance of data access is converted into useful BI/analytical information for end-consumers, there is a whole lot of preparation work to be done along the way, ie: identify data sources > access > verify > filter > pre-process > standardise > look up > match > merge > de-dup > integrate > apply rules > transform > pre-process > format > present > distribute/channel.
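As a rough, hypothetical illustration of that churn, the sketch below strings a few of these preparation steps together in Python. The function names, fields and sample records are invented for the example and do not refer to any particular toolset.

```python
# Minimal sketch of a BI preparation pipeline (step and field names are illustrative).

def access(sources):
    # Pull raw rows from every identified source.
    return [row for source in sources for row in source]

def verify_and_filter(rows):
    # Drop rows that fail basic verification (here: missing customer id).
    return [r for r in rows if r.get("customer_id")]

def standardise(rows):
    # Standardise formats, e.g. lower-case and trim e-mail addresses.
    for r in rows:
        r["email"] = r.get("email", "").strip().lower()
    return rows

def dedupe_and_merge(rows):
    # Match on customer_id and keep one record per customer.
    latest = {}
    for r in rows:
        latest[r["customer_id"]] = r
    return list(latest.values())

def apply_rules_and_present(rows):
    # Apply a simple business rule and format the output for consumers.
    for r in rows:
        r["segment"] = "high value" if r.get("spend", 0) > 10000 else "standard"
        print(f'{r["customer_id"]}: {r["segment"]}')

crm = [{"customer_id": "C1", "email": "A@X.COM ", "spend": 15000}]
sales = [{"customer_id": "C1", "email": "a@x.com", "spend": 15000},
         {"customer_id": None, "email": "bad@row"}]

apply_rules_and_present(
    dedupe_and_merge(standardise(verify_and_filter(access([crm, sales])))))
```

Even in this toy form, most of the code is preparation rather than analysis, which is exactly where the time goes in real environments.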

Because most queries have to traverse, link and process millions of rows of data and possibly trillions of words from within the data sources, this background churn could take hours, days or even longer.

A recent TDWI study found organisations are dissatisfied with the time it takes for the chain of processes involved in BI, analytics and data warehousing to deliver valuable data and insights to business users. The organisations attributed this, in part, to ill-defined project objectives and scope, a lack of skilled personnel, data quality problems, slow development or an inability to access all relevant data.

The problem is most business users are not BI experts and do not all have analytical minds, so the ‘discover and report’ method may be iterative (therefore slow), and in many cases, the outputs/results are not of the quality expected. The results may also be inaccurate as data quality rules may not have been applied, and data linking may not be correct, as it would be in a typical data warehouse where data has been qualified and pre-defined/derived.

In a traditional situation, with a structured data warehouse where all the preparation is done in one place, and once only, and then shared many times, supported by quality data and predefined rules, it may be possible to get sub-second answers.

But, often, even in this scenario, sub-second insights are not achieved, since time to insight also depends on properly designed data warehouses, server power and network bandwidth.
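To make the "prepare once, share many times" idea concrete, here is a minimal sketch assuming a SQLite database with invented table and column names: a summary table is derived once, with a quality rule applied, and every subsequent query reads the small pre-derived table rather than re-processing the raw data.

```python
# Sketch: prepare once (summary table), then answer repeated queries cheaply.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("north", 120.0), ("north", 80.0), ("south", 200.0)])

# One-off preparation: quality rule and aggregation applied up front.
con.execute("""CREATE TABLE sales_summary AS
               SELECT region, SUM(amount) AS total
               FROM sales WHERE amount > 0 GROUP BY region""")

# Every subsequent business query hits the small, pre-derived summary table.
for region, total in con.execute("SELECT region, total FROM sales_summary"):
    print(region, total)
```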

Users tend to confuse search and discover on flat raw data that’s already there, with information and insight generation at the next level. In more complex BI/analytics, each time a query is run, all the preparation work has to be done from the beginning and the necessary churn can take a significant amount of time.

Therefore, demanding faster BI ‘time to value’ and expecting answers in sub-seconds could prove to be a costly mistake. While it is possible to gain some form of output in sub-seconds, these outputs will likely not be qualified, trusted insights that can deliver real strategic value to the enterprise.


Old business issues drive a spate of data modernisation programmes

 

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)

The continued evolution of all things is obviously also felt in the data warehousing and business intelligence fields, and it is apparent that many organisations are currently on a modernisation track.

But why now? Behind it all is the exponential growth and accumulation of data, and businesses are actively seeking to derive value, in the form of information and insights, from that data. They need this for marketing, sales and performance measurement purposes and to help them face other business challenges. All the business key performance indicators or buzzwords are there: wallet share, market growth, churn, return on investment (ROI), margin, survival, customer segments, competition, productivity, speed, agility, efficiency and more. Those are business factors and issues that require stringent management for organisational success.

Take a look at Amazon’s recommended lists and you’ll see how evident and crucial these indicators are. Or peek into a local bank’s, retailer’s or other financial institution’s rewards programmes.


Social media has captured the media limelight in terms of new data being gathered, explored and exploited. But it is not the only source. Mobility, cloud and other forms of big data, such as embedded devices and the Internet of Things, collectively offer a smorgasbord of potential that many companies are mining for gold, while others must entrench such capabilities in their IT and marketing teams if they are to remain in the game. Monetisation of the right data, information and functionality, at the right time, is paramount.

The tech vendors have been hard at work to crack the market and give their customers what they need to get the job done. One of the first things they did was come up with new concepts for working with data using the old technologies. They introduced tactical strategies such as centres of excellence, enterprise resource planning, application and information integration, sand-pitting and more. They also realised the need to bring the techies out of the IT cold room and put them in front of business-people, so that the business could get the reports it needed to be competitive, agile, efficient and all the other buzzwords. That had limited success.

In the meantime, the vendors were also developing modern, state-of-the-art technologies that people can actually use. The old process of having techies write reports that would be fed to business-people on a monthly basis was not efficient, not agile, not competitive and generally not at all what businesses needed. What they needed were tools that could hook into any source or system, that could be accessed and massaged by the business-people themselves, and that could be relied upon for off-the-shelf integration and reporting. Besides that, big data was proving to be complex and required a new and usable strategy that would be scalable and affordable to both the organisation and the man on the street.

Hadoop promised to help that along. Hadoop is a framework based on open source technology that can give other benefits such as better return on investment by using clusters of low cost servers. And it can chew through petabytes of information quickly. The key is integrating Hadoop into mainstream analytics applications.
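As a hedged sketch of what "integrating Hadoop into mainstream analytics" can look like in practice, the snippet below reads a columnar file from HDFS with PySpark and runs a familiar aggregation. The cluster path and column names are placeholders, and PySpark is only one of several common ways of working against a Hadoop cluster.

```python
# Sketch: querying data stored on a Hadoop cluster from a Python analytics job.
# Assumes a running Spark/Hadoop environment; the HDFS path is a placeholder.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("hadoop-analytics-sketch").getOrCreate()

# Read a file straight off HDFS and run an ordinary BI-style aggregation.
sales = spark.read.parquet("hdfs:///data/sales/2016/")
summary = (sales.groupBy("region")
                .agg(F.sum("amount").alias("total_sales")))
summary.show()

spark.stop()
```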

Columnar databases make clever use of the properties of the underlying storage technologies that enable compression economies and make searching through the data quicker and more efficient. There’s a lot of techie mumbo jumbo that makes it work but suffice to say that searching information puts the highest overhead on systems and networks so it’s a natural area to address first.
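A small, hypothetical illustration of why column orientation helps: the snippet below writes a few records as a compressed, column-oriented Parquet file and then reads back only the single column a query needs. It assumes the pyarrow library is installed, and the column names are invented for the example.

```python
# Sketch: column-oriented storage lets a query read (and decompress) only
# the columns it needs instead of every row in full.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "customer_id": ["C1", "C2", "C3", "C4"],
    "region": ["north", "north", "south", "south"],  # low-cardinality: compresses well
    "amount": [120.0, 80.0, 200.0, 50.0],
})

pq.write_table(table, "sales.parquet", compression="snappy")

# Only the 'amount' column is read from disk for this query.
amounts = pq.read_table("sales.parquet", columns=["amount"])
print(amounts.column("amount").to_pylist())
```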

NoSQL is also known as 'Not only SQL' because it provides storage and retrieval modelled not only on tables, common to relational databases, but also on columns, documents, key-value pairs, graphs, lists, URLs and more. Its designs are simpler, horizontal scaling is better – which improves the ability to add low-cost systems to improve performance – and it offers better control over availability.
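To make the difference in storage models tangible, the plain-Python sketch below shapes the same customer information for a document store, a key-value store and a column-family store. It is purely illustrative and not tied to any specific NoSQL product.

```python
# Sketch: the same customer shaped for three common NoSQL storage models.

# Document model: one self-contained, nested document per entity.
document_store = {
    "customer:C1": {
        "name": "Thandi",
        "orders": [{"id": "O1", "amount": 120.0}, {"id": "O2", "amount": 80.0}],
    }
}

# Key-value model: opaque values looked up by key; nothing is queried inside them.
key_value_store = {
    "customer:C1:name": "Thandi",
    "customer:C1:order_count": "2",
}

# Column-family model: rows keyed by id, with sparse named columns per row.
column_family_store = {
    "C1": {"name": "Thandi", "region": "north", "total_spend": 200.0},
    "C2": {"name": "Sipho", "region": "south"},  # missing columns are simply absent
}

for key in ("customer:C1:name", "customer:C1:order_count"):
    print(key, "->", key_value_store[key])
```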

Data appliances are just as the name suggests: plug and play, data warehousing in a box, systems, software and the whole caboodle. Just pop it in and: “Presto,” you’ve got a ton more capacity and capability. These technologies employ larger, massively parallel, and faster in-memory processing techniques.

Those technologies, and there are others like them, solve the original business issues mentioned upfront. They deliver the speed of analytics that companies need today; they give companies the means to gather, store and view data differently, in ways that can lead to new insights; they can grow or scale as the company's data demands change; their techies and business-people alike are more productive using the new tools; and they bring a whole raft of potential ROI benefits. ROI, let's face it, is becoming a bigger issue in environments where demands are always growing, never diminishing, and where financial directors are increasingly furrow-browed with an accumulation of nervous tics.

Large businesses aren’t about to rip out their existing investments – there’s the implicit ROI again – but will rather evolve what they have. The way organisations are working to change reporting and analytics, though, will have an impact on the skills that they require to sustain their environments. Technical and business tasks are being merged and that’s why there’s growing demand for so-called data scientists.

Data scientists are supposed to be the do-it-all guys, right from data sourcing and discovery to franchising insightful and sentiment-based intelligence. They are unlike traditional information analysts and data stewards or report writers, who had distinct roles and responsibilities in the data and information domains.


The wielder, not the axe, propels plunder aplenty

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)

Business intelligence is a fairly hot topic today – good news for me and my ilk – but that doesn’t mean everything about it is new and exciting. The rise and rise of BI has seen a maturation of the technologies, derived from a sweeping round of acquisitions and consolidations in the industry just a few years ago, that have created something of a standardisation of tools.


We have dashboards and scorecards, data warehouses and all the old Scandinavian-sounding LAPs: ROLAP, MOLAP, OLAP and possibly a Ragnar Lothbrok or two. And, like the Vikings knew, without some means to differentiate, everyone in the industry becomes a me-too, which means that’s what their customers ultimately get. And that makes it very hard to win battles.

 

Building new frameworks around tools to achieve some sense of differentiation achieves just that: only a sense of differentiation. In fact, even when it comes to measurements, most measures, indicators and references in BI today are calculated in a common manner across businesses. They typically use financial measures, such as monthly revenues, costs, interest and so on. The real difference, however, comes in preparing the data and the rules that are applied to the function.

 

A basic example that illustrates the point: let’s say the Vikings want to invade England and make off with some loot. Before they can embark on their journey of conquest they need to ascertain a few facts. Do they have enough men to defeat the forces in England? Do they have enough ships to get them there? Do they know how to navigate the ocean? Are their ships capable of safely crossing? Can they carry enough stores to see them through the campaign or will they need to raid settlements for food when they arrive? Would those settlements be available to them? How much booty are they likely to capture? Can they carry it all home? Will it be enough to warrant the cost of the expedition?

 

The simple answer was that the first time they set sail they had absolutely no idea, because they had no data. It was a massive risk of the kind that most organisations aim to avoid these days. So before they could even begin to analyse the pros and cons, they had to get at the raw data itself. And that is the same issue that most organisations have today. They need the raw data, but they do not need it, in the Viking context, from travellers and mystics, spirits and whispers carried on the wind. It must be good quality data derived from reliable sources and a good geographic cross-section. And it is in preparing their facts – checking that they are correct, that they come from reliable sources and that there has been no case of broken telephone – that businesses will truly make a difference. Information is king in war because it allows a much smaller force to figure out where to maximise its impact upon a potentially much larger enemy. The same is true in business today.

 

Before the Vikings could begin to loot and pillage they had to know where they could put ashore quickly to effect a surprise raid with overwhelming odds in their favour. In business you could say that you need to know the basic facts before you drill down for the nuggets that await.

 

The first Viking raids grew to become larger as the information the Vikings had about England grew. Pretty soon they had banded their tribes or groups together, shared their knowledge and were working toward a common goal: getting rich by looting England. In business, too, divisions, units or operating companies may individually gain knowledge that it makes sense to share with the rest to work toward the most sought-after plunder: the overall business strategy.

 

Because the tools and technologies supply common functionality and businesses or implementers can put them together in fairly standard approaches as they choose, the real differentiator for BI is the data itself and how the data is prepared – what rules are applied to it before it enters the BI systems. Preparation is king.

 

These rules are what ultimately differentiate between information based on wind-carried whispers and reliable reports from spies abroad. Which would you prefer with your feet on the deck?

 

Contact

 

Knowledge Integration Dynamics, Mervyn Mooi, (011) 462-1277, mervyn.mooi@kid.co.za

Thought Bubble, Jeanné Swart, 082-539-6835, jeanne@thoughtbubble.co.za

 

 

Big data: don’t adopt if you can’t derive value from it

Amid massive big data hype, KID warns that not every company is geared to benefit from costly big data projects yet.

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)


Big data has been a hot topic for some time now, and unfortunately, many big data projects still fail to deliver on the hype. Recent global studies are pointing out that it’s time for enterprises to move from big data implementations and spend, to actually acting on the insights gleaned from big data analytics.

 

But turning big data analytics into bottom-line benefits requires a number of things, including market maturity, the necessary skills, and processes geared to actioning insights. In South Africa, very few companies have these factors in place to allow them to benefit from significant big data projects. Despite the hype about the potential value to be derived from big data, in truth, value derivation is still in its infancy.

 

Locally, we find the early adopters have been major enterprises like banks, where big data tools are necessary for sifting through massive volumes of structured and unstructured data to uncover trends and run affinity analysis and sentiment analysis. But while they have the necessary advanced big data tools, we often find that these new technologies are delivering little more than a sense of confirmation, rather than the surprise findings and bottom line benefits they hoped for.

 

This may be due to processes that result in slow application of new insights, as well as to a dire shortage of the new data science skills that marry technical, analytics and strategic business know-how. Currently, the process of big data management is often disjointed from start to finish: companies may be asking the right questions and gaining insights, but unless these insights are delivered rapidly and companies actually use them effectively, the whole process is rendered ineffective. There is little point in having a multi-million rand big data infrastructure if the resulting insights aren't applied at the right time in the right places.

 

The challenge now is around the positioning, management and resourcing of big data as a discipline. Companies with large big data implementations must also face the challenges of integration, security and governance at scale. We also find there are many misconceptions about big data, what it is, and how it should be managed. There is an element of fear about tackling the ‘brave new world’ of technology, when in reality, big data might be seen as the evolution of BI.

 

Most commonly, we see companies feeling pressured to adopt big data tools and strategies when they aren’t ready, and are not positioned to benefit. As with many technologies, hype and ‘hard sell’ may convince companies to spend on big data projects when they are simply not equipped to use them. In South Africa, only the major enterprises, research organisations and perhaps players in highly competitive markets stand to benefit from big data investments. For most of the mid-market, there is little to be gained from being a big data early adopter. We are already seeing cheaper cloud-based big data solutions coming to market, and – as with any new technology – we can expect more of these to emerge in future. Within a year or two, big data solutions will become more competitively priced,  simpler, require fewer skilled resources to manage, and may then become more viable for small to mid-market companies. Until then, many may find that more effective use of their existing BI tools, and even simple online searches, meet their current needs for market insights and information.

 

Unless there is a compelling reason to embark on a major big data project now, the big data laggards stand to benefit in the long run. This is particularly true for those small and mid-size companies currently facing IT budget constraints. These companies should be rationalising, reducing duplication and waste, and looking to the technologies that support their business strategies, instead of constantly investing in new technology simply because it is the latest trend.

 

Mervyn Mooi, Knowledge Integration Dynamics, (011) 462 1277

Jeanne Swart, Thought Bubble, 082 539 6835

Data (Information) Governance: a safeguard in the digital economy

Global interest in Data Governance is growing, as organisations around the world embark on Digital Transformation and Big Data management to become more efficient and competitive. But while data is being used in myriad new ways, the rules for effective governance must prevail.

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)


The sheer volume and variety of data coming into play in the increasingly digital enterprise presents massive opportunities for organisations to analyse this data/information and apply the insights derived from it to achieve business growth and realise efficiencies. Digital transformation has made data management central to business operations and created a plethora of new data sources and challenges. New technology is enabling data management and analysis to be more widely applied, supporting organisations that increasingly view data as a strategic business asset that can be utilised to gain a competitive advantage.

To stay ahead, organisations have to be agile and quick in this regard, which has prompted some industry experts to take the view that data governance needs a new approach, with data discovery carried out first, before data governance rules are decided on and applied in an agile, scalable and iterative way.

While approaching data management, analysis and the associated data governance in an iterative way, using smaller packets of data, makes sense, the rules that are applied must still comply with legislation and best practice; as a prerequisite, these rules should therefore be formalised before any data project or data discovery is undertaken. Governance rules must be consistent and support the overall governance framework of the organisation throughout the lifecycle of each data asset, regardless of where and when the data is generated, processed, consumed and retired.

In an increasingly connected world, data is shared and analysed across multiple platforms all the time – by both organisations and individuals. Most of that data is being governed in some way, and where it is not, there is risk. Governed data is secure, applied correctly and of reliable quality, and – crucially – it helps mitigate both legal and operational risk. Poor quality data alone is a significant cause for concern among global CEOs: a recent Forbes Insights and KPMG study found that 45% of CEOs say their customer insight is hindered by a lack of quality data, and 56% say they have concerns about the quality of the data they base their strategic decisions on, while Gartner reports that the average financial impact of poor quality data could amount to around $9.7 million annually. On top of this, the potential costs of unsecured data or non-compliance could be significant. Fines, lawsuits, reputational damage and the loss of potential business from highly regulated business partners and customers are among the risks faced by any organisation failing to implement effective data governance frameworks, policies and processes.

Ungoverned data results in poor business decisions and exposes the organisation and its customers to risk. Internationally, data governance is taking top priority as organisations prepare for new legislation such as the EU's GDPR, formally known as the General Data Protection Regulation, which is set to come into effect next year, and as organisations such as Data Governance Australia launch a new draft Code of Practice setting benchmarks for the responsible collection, use, management and disclosure of data. South Africa, perhaps surprisingly, is at the forefront here with its POPI regulations and the wide implementation of other guidelines such as King III and Basel. New Chief Data Officer (CDO) roles are being introduced around the world.

Now more than ever before, every organisation has to have up-to-date data governance frameworks in place and, more importantly, have the rules articulated or mapped into its processes and data assets. Organisations must look from the bottom up, to ensure that the rules on the floor align with the compliance rules and regulations from the top. These rules and conditions must be formally mapped to the actual physical rules and technical conditions in place throughout the organisation. By doing this, the organisation can demonstrate that its data governance framework is real and articulated into its operations, across the physical business and technical processes, methodologies, access controls and data domains of the organisation, ICT included. This mapping process should ideally begin with a data governance maturity assessment upfront. Alongside this, the organisation should deploy dedicated data governance resources for sustained stewardship.
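As a simplified, hypothetical sketch of what "mapping the rules into processes and data assets" might look like, the snippet below records a few policy-level rules against the physical controls that implement them and flags any rule with no mapped control. The rule names, systems and controls are invented for the example.

```python
# Sketch: mapping top-down governance rules to the physical controls that
# implement them, so gaps can be reported (all names are illustrative only).
governance_rules = {
    "POPI-retention": "Personal data retained no longer than the approved period",
    "POPI-access": "Customer records accessible to authorised roles only",
    "BCBS-lineage": "Risk figures traceable back to source systems",
}

implemented_controls = {
    "POPI-retention": [("CRM", "nightly purge job on records past retention")],
    "POPI-access": [("CRM", "role-based access control"),
                    ("DataWarehouse", "column-level masking of ID numbers")],
    # "BCBS-lineage" has no mapped control yet.
}

for rule_id, description in governance_rules.items():
    controls = implemented_controls.get(rule_id, [])
    status = "OK" if controls else "GAP - no physical control mapped"
    print(f"{rule_id}: {status}")
    for system, control in controls:
        print(f"    {system}: {control}")
```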

Mapping the rules and conditions, and the due configuration of the relevant toolsets to enforce data governance, can be complex and lengthy work. But it is necessary in order to entrench data governance throughout the organisation. Formalised data governance mapping proves to the world where and how the organisation has implemented data governance, demonstrating that policies are entrenched throughout its processes, supporting audit and reducing both compliance risk and operational risk.

To support agility and speed of delivery iterations for data management and analyses initiatives and instances, data governance can be “sliced” specifically for the work at hand and also applied in iterative fashion, organically covering all data assets over time.

 

 

Risks en route to cloud

By Veemal Kalanjee, Managing Director at Infoflow – part of the KID group

Security in the cloud worries many companies, but security and risk management during migration should be of greater concern.


Security and control of data are commonly cited as being among the top concerns of South African CIOs and IT managers. There is a prevailing fear that business-critical applications and information hosted anywhere but on-premises are at greater risk of being lost or accessed by cyber criminals.

In fact, data hosted by a reputable cloud service provider is probably far safer than data hosted on-premises and secured by little more than a firewall.

What many businesses overlook, however, is the possibility that the real business risks and data security issues could occur before the data has actually moved to the cloud, or during the migration to the cloud.

When planning a move to the cloud, rushing the process poses risks. Poor selection of the cloud service provider, failure to ensure data quality and security, and overlooking critical integration issues can present risks both to data security and to business continuity.

Large local companies have failed to achieve ambitious plans to rapidly move all their infrastructure and applications to the cloud due to an ‘eat the elephant whole’ approach, which can prove counter-productive and risky. To support progress to the cloud while mitigating risk, cloud migrations should be approached in small chunks instead, as this allows for sufficient evaluation and troubleshooting throughout the process.

Look before leaping

Before taking the plunge, companies must carefully evaluate their proposed cloud service and environment, and strategically assess what data and applications will be moved.

Cloud migrations should be approached in small chunks

Businesses must consider questions around what cloud they are moving to, and where it is hosted. For example, if the data will be hosted in the US, issues such as bandwidth and line speed come into play: companies must consider the business continuity risks of poor connections and distant service providers.
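A rough, back-of-the-envelope sketch of why line speed matters when data is hosted far away: the snippet below estimates how long an initial migration or a large restore would take over a few typical connection speeds. The data volume and speeds are illustrative assumptions, and protocol overhead and latency are ignored.

```python
# Sketch: estimated transfer time for an assumed 2 TB data set over
# different line speeds (illustrative figures only).
data_tb = 2
data_bits = data_tb * 1024**4 * 8          # terabytes -> bits

for label, mbps in [("10 Mbps", 10), ("100 Mbps", 100), ("1 Gbps", 1000)]:
    seconds = data_bits / (mbps * 1_000_000)
    print(f"{label}: ~{seconds / 3600:.1f} hours ({seconds / 86400:.1f} days)")
```

At 100 Mbps the assumed 2 TB takes roughly two days of uninterrupted transfer, which is exactly the kind of continuity consideration the planning phase should surface.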

They must also carefully assess the service provider’s continuity and disaster recovery plans, the levels of security and assurances they offer, and what recourse the customer will have in the event of data being lost or compromised or the service provider going out of business. Moving to the cloud demands a broader understanding of security technologies and risk among all project team members than was needed previously, in non-cloud environments.

In addition, when considering a move to the public cloud, one aspect that cannot be mitigated is that what was once an exclusive-use environment for the company, in its non-cloud form, is now a multi-tenant shared environment, which potentially brings its own security risks.

It is up to the company to perform a comprehensive due diligence analysis on the cloud vendor to ensure the multitude of security risks are adequately addressed through preventative security measures put in place by the vendor.

Data on the move

Once a suitable cloud vendor has been identified, the data to be migrated must be assessed, its quality must be assured, and the data must be effectively secured.

The recommended first step is to identify the data to be migrated, considering, for example:
* Are there inactive customers on this database?
* Should the company retain that data, archiving it on-premises, and move only active customers to the cloud?

Once the data to be migrated has been identified, the company must review the quality of this data, identifying and addressing anomalies and duplicates before moving to the next phase of the cloud migration. Since poor quality data can undermine business success, the process of improving data quality ahead of a cloud migration can actually improve business operations, and so help mitigate overall business risk.
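As a simple sketch of that identification and quality review, the snippet below uses pandas to split inactive customers out of a made-up extract, then flags duplicates and rows with missing key fields before anything is moved. The column names, dates and the 18-month activity threshold are assumptions for the example.

```python
# Sketch: pre-migration data review - split out inactive customers,
# then flag duplicates and anomalies in the records to be migrated.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": ["C1", "C2", "C2", "C3", "C4"],
    "email":       ["a@x.com", "b@x.com", "b@x.com", "c@x.com", None],
    "last_order":  ["2014-11-01", "2016-02-10", "2016-02-10",
                    "2016-08-15", "2016-09-30"],
})
customers["last_order"] = pd.to_datetime(customers["last_order"])

# 1. Identify what to migrate: active in the last 18 months; archive the rest on-premises.
cutoff = pd.Timestamp("2016-12-31") - pd.DateOffset(months=18)
to_migrate = customers[customers["last_order"] >= cutoff]
to_archive = customers[customers["last_order"] < cutoff]

# 2. Quality review on the migration set: duplicates and missing key fields.
duplicates = to_migrate[to_migrate.duplicated(subset=["customer_id"], keep=False)]
missing_email = to_migrate[to_migrate["email"].isna()]

print(f"to migrate: {len(to_migrate)}, to archive: {len(to_archive)}")
print(f"duplicate ids: {duplicates['customer_id'].unique().tolist()}")
print(f"rows missing e-mail: {len(missing_email)}")
```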

Moving data from the company’s internal network to an external network can present a number of risks.

Adequate levels of data encryption and/or masking must be applied and a secure transport layer implemented to ensure the data remains secure, wherever it is.
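A minimal sketch of field-level masking before records leave the internal network, assuming a keyed hash is acceptable for the use case. Real deployments would pair this with proper key management and encryption in transit such as TLS, and the field names here are purely illustrative.

```python
# Sketch: mask direct identifiers before records leave the internal network.
# A keyed (salted) SHA-256 hash is used purely as an illustration; production
# systems would add managed keys and a secure transport layer.
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-managed-secret"   # placeholder, never hard-code

def mask(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

record = {"customer_id": "C1", "id_number": "8001015009087", "spend": 15000}
masked = {**record,
          "id_number": mask(record["id_number"])}   # identifier masked, analytic fields kept

print(masked)
```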

In the move to the cloud, the question of access must also be considered – both for individual users and for enterprise applications. It is important to consider all points of integration to mitigate business continuity issues. In many cloud migrations, companies tend to overlook points that haven’t been documented and integrated, presenting business continuity challenges. A robust cloud integration solution simplifies this task.

The risk of business processes failing should also be considered during the migration to the cloud. Companies must allocate sufficient time for testing – running systems in parallel for a period to ensure they all function as expected.

While there are risks in moving to the cloud, when the process is approached strategically and cautiously, there are many potential benefits to the migration process. Done well, the process can result in better quality data, a more strategic approach to data management and security, and more streamlined business processes.

Big data: over-hyped and under-utilised


By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)

 

The spectre of big data analytics is driving businesses everywhere to reevaluate their strategies and consider massive investments to monetise their data. But many are missing the point – big data is available to virtually everyone without significant investment, and is being under-utilised within the enterprise right now.

 

Too many enterprises hold the mistaken belief that to get value from big data, they must invest heavily in infrastructure and software solutions that will allow them to gather practically all the internal and external, structured and unstructured data that exists, store it in massive data reservoirs and then embark on lengthy data analytics processes to arrive at insights.

 

This belief holds them back from fully capitalising on the big data they already have access to. Budget constraints and perceived complexity are limiting their use of data beyond the walls of their own enterprises. This need not be the case.

Big data has been hyped to a point where it has become daunting to many, yet in reality it is just the next level of the BI, fact-finding and business logic that has existed for years.  Big data practice simply delivers quicker value to end-users through enablement factors such as the internet, the cloud and the availability of feature-rich tools.


Big data at its most basic

 

Many of these tools are affordable and scalable to a single user anywhere on the planet. For example, a consumer with a concern about his health might use his smartphone to go online and research high cholesterol symptoms and treatment. He uses a search engine to distill the massive volumes of big data that exist on the subject, he assesses the information, and makes an informed decision to consult a doctor based on that information. This is big data analytics methodology and analytics tools in use in their simplest form.

 

On a larger scale, a car dealer might assess his sales figures and expand his insight by following social media opinions about the car models he sells, studying industry forecasts and trends, and reading columns about buyer priorities. By bringing additional, external inputs into his data, he positions himself to offer better deals or models more likely to sell.

In these cases, the value of the data analysis comes from distilling only the relevant data from multiple sources to support decision-making.
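A small, hypothetical sketch of the car-dealer example: internal sales figures are joined with an external sentiment score per model, and only models with strong public sentiment but weak sales are flagged for attention. The figures and column names are invented for illustration.

```python
# Sketch: blending an internal source (sales) with an external one (sentiment)
# and distilling just the rows that support a decision.
import pandas as pd

sales = pd.DataFrame({
    "model": ["hatch-a", "sedan-b", "suv-c"],
    "units_sold": [42, 15, 60],
})
# Imagined external feed, e.g. aggregated social media sentiment per model.
sentiment = pd.DataFrame({
    "model": ["hatch-a", "sedan-b", "suv-c"],
    "sentiment_score": [0.2, 0.8, 0.6],     # -1 (negative) to +1 (positive)
})

combined = sales.merge(sentiment, on="model")

# Decision rule: strong public sentiment but weak sales suggests an opportunity.
opportunities = combined[(combined["sentiment_score"] > 0.5) &
                         (combined["units_sold"] < 30)]
print(opportunities)
```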

 

Big data as broader BI

 

In large enterprises, large amounts of data already exist – often in siloes within the BI, CRM, customer service centre and sales divisions. This data, supplemented with external data from quality research, social media sentiment analysis, surveys and other sources, becomes big data that can be harnessed to deliver more advanced insights for a competitive edge. Big data is not as big as it sounds, and organisations do not need to invest millions to start benefiting from it. They just need to start looking outside the organisation and bringing in information that is relevant to the business case they want to address.

For many, this will be the extent of their big data analytics needs, and it is achievable with the technologies, skills and data they already have access to. Big data practice is accommodating of less skilled analysts and is not pitched only at experienced BI practitioners or data scientists. Nor should it be the task of IT.

 

In fact, big data practice should be the preserve of business managers, who are best placed to determine what questions should be asked, what external factors impact on business, what information will be relevant, and what steps should be taken once insights are obtained from data analysis. Business managers, who are the data stewards and subject matter experts, will require certain technology tools to analyse the data, but these BI tools are typically user friendly and little training is needed to master them.

 

Big data moves for big business

 

For major enterprises that see potential long-term business value in a big data investment, a simple way to assess that value is to outsource big data analysis before taking the plunge. This will allow the enterprise to determine whether the investment will deliver on its promise.

 

Whether outsourced or implemented internally, enterprises must determine at the outset what their objectives for big data projects are, to ensure that they deliver on expectations. Big Data practice is agile and can be applied to any data to deliver any insight.  It is not enough for enterprises to vaguely seek to ‘monetise’ data.

 

This term, which is merely a new spin on ‘data franchising’, remains meaningless without clear business objectives for the big data analysis exercise. To be effective, data analytics must be applied in a strategic way to achieve specific business outcomes.