Data is a fundamental element of every business and therefore fundamental to our architecture. Data is our record of the current state of the business and the history of what has happened, and it is the base that enables us to predict what may happen in the future.
However, data does nothing on its own. To realise value from data we have to understand it and act on it. Typically this requires something other than the data, such as a computer program, a query, or a user (machine or person).
Perhaps one of the biggest and most complex challenges comes with managing data. Data is inert; it is not self-organizing or even self-understanding. So how do we manage the data, and how do we organize it and attach meaning so that it can easily be used by the business, a computer program, and so on?
The DIKW pyramid provides a simple visualisation of the value chain growing from data to wisdom:
- Data is the base with the least amount of perceived usefulness.
- Information has higher value than data.
- Knowledge has higher value than information.
- Wisdom has the highest perceived value of all.
To move up the value chain, data requires something else, such as a program, a machine, or even a person, to add understanding to it so that it becomes information.
When information is organized and classified, the value chain expands from data and information to knowledge.
At the top of the data value chain is wisdom. Wisdom comes from inert data, the fundamental raw material of the modern digital age, combined with a series of progressive traits such as:
- perspective
- context
- understanding
- learning
- the ability to reason
With the advent of cognitive computing and artificial intelligence, these traits can now be attributed to both a person and a machine.
The IBM AI Ladder loosely parallels the DIKW pyramid in that the AI Ladder represents a progressive movement towards value creation within an enterprise.
Increased value can be gained from completing activities at each step of the AI Ladder, and the higher the ladder is climbed, the higher the levels of value that can potentially be realized.
The AI Ladder contains four discrete levels: Collect, Organize, Analyze, and Infuse.
The first rung is Collect, a primitive action that serves as the first step towards making data actionable and helps drive automation, insights, optimization, and decision-making. Collect is the ability to attach to a data source, whether transient or persistent, real or virtual, while remaining agnostic as to its actual location or its originating (underlying) technology. In linking to the DIKW pyramid, we could say that data lies below the first rung, recognizing the inert nature of data.
The AI Ladder progresses through the rungs to Infuse, a state of capability that means an enterprise has taken artificial intelligence beyond a science project. Infusion means that advanced analytical models have been interwoven into the essential fabric of an application or system, thereby driving new or improved business capabilities.
Collect – Making Data Simple and Accessible
The first rung of the AI Ladder is Collect and is how an enterprise can formally incorporate data into any analytic process. Properties of data include:
- Structured, semi-structured, unstructured
- Proprietary or open
- In the cloud or on-premises
- Any combination of the above
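As a rough illustration of attaching to data sources while staying agnostic about where they live, the following Python sketch defines a small common interface and two interchangeable sources. The names `DataSource`, `InMemorySource`, `CsvSource`, and `collect`, along with the sample records, are hypothetical and are not part of any IBM offering.

```python
import csv
import io
from typing import Iterable, Iterator, Protocol


class DataSource(Protocol):
    """Hypothetical interface: anything that can yield records as dicts."""

    def records(self) -> Iterator[dict]: ...


class InMemorySource:
    """A transient source: rows that already live in the running process."""

    def __init__(self, rows: Iterable[dict]) -> None:
        self._rows = list(rows)

    def records(self) -> Iterator[dict]:
        return iter(self._rows)


class CsvSource:
    """A persistent-style source: CSV text, as might be read from a file or object store."""

    def __init__(self, text: str) -> None:
        self._text = text

    def records(self) -> Iterator[dict]:
        return iter(csv.DictReader(io.StringIO(self._text)))


def collect(sources: Iterable[DataSource]) -> list[dict]:
    """Gather records from every source without caring where each one lives."""
    collected: list[dict] = []
    for source in sources:
        collected.extend(source.records())
    return collected


# Made-up sample data for illustration only.
rows = collect([
    InMemorySource([{"customer": "a1", "region": "EU"}]),
    CsvSource("customer,region\nb2,US\n"),
])
```

The caller of `collect` never needs to know whether a record came from memory or from CSV text; a cloud or on-premises source could be added by implementing the same `records` method.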
Organize – Trusted, Governed Analytics
The second rung of the AI Ladder is Organize and is about how an enterprise can make data known, discoverable, usable, and reusable. The ability to organize is a prerequisite to becoming data-centric. Additionally, data of inferior quality, or data that could mislead a machine or end user, can be governed so that any use is adequately controlled. Ideally, the outcome of Organize is a body of data that is appropriately curated and offers the highest value to an enterprise. Organize allows data to be:
- Secured (e.g. through policy-based enforcement)
- A source of truth and utility
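One way to picture policy-based enforcement over curated data is a small catalog in which each entry carries a classification and a quality flag. The sketch below is only an assumption about how such a check might look; `CatalogEntry`, `POLICY`, `can_use`, `discover`, and the role and classification labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record describing one governed data set."""

    name: str
    classification: str  # assumed labels: "public" or "restricted"
    quality_checked: bool = False
    tags: set[str] = field(default_factory=set)


# Assumed policy: which roles may use data of each classification.
POLICY = {
    "public": {"analyst", "data_steward"},
    "restricted": {"data_steward"},
}


def can_use(entry: CatalogEntry, role: str) -> bool:
    """Allow use only when the role is permitted and the data passed quality checks."""
    allowed_roles = POLICY.get(entry.classification, set())
    return role in allowed_roles and entry.quality_checked


def discover(catalog: list[CatalogEntry], tag: str) -> list[str]:
    """Make data discoverable: return the names of data sets carrying a tag."""
    return [entry.name for entry in catalog if tag in entry.tags]


# Made-up catalog for illustration only.
catalog = [
    CatalogEntry("sales_2023", "public", quality_checked=True, tags={"sales"}),
    CatalogEntry("customer_pii", "restricted", quality_checked=True, tags={"customer"}),
    CatalogEntry("legacy_dump", "public", quality_checked=False, tags={"sales"}),
]
```

Here `legacy_dump` remains discoverable by tag, but `can_use` refuses it for every role until its quality has been checked, which mirrors the idea that inferior data can be governed rather than hidden.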
Analyze – Insights On-Demand
The third rung of the AI Ladder is Analyze and is about how an organization approaches becoming a data-driven enterprise. Analytics can be human-centered or machine-centered. In this regard, the initials AI can be interpreted as Augmented Intelligence when used in a human-centered context and Artificial Intelligence when used in a machine-centered context. Analyze covers a span of techniques and capabilities, from basic reporting and business intelligence to deep learning. Through data, Analyze makes it possible to:
- Determine what has happened
- Determine what is happening
- Determine what might happen
- Compare against expectations
- Automate and optimize decisions
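The first few items in this list can be illustrated with a deliberately simple Python sketch: a descriptive summary (what has happened), a naive trend forecast (what might happen), and a comparison against an expectation. The figures, the function names, and the forecasting rule (last value plus average step) are assumptions chosen for brevity, not a recommended analytic method.

```python
from statistics import mean

# Made-up monthly sales history and target, for illustration only.
monthly_sales = [100.0, 110.0, 120.0, 130.0]
expected_next = 150.0


def describe(history: list[float]) -> float:
    """What has happened: the average of the observed values."""
    return mean(history)


def forecast_next(history: list[float]) -> float:
    """What might happen: last value plus the average month-to-month step."""
    steps = [later - earlier for earlier, later in zip(history, history[1:])]
    return history[-1] + mean(steps)


def gap_to_expectation(predicted: float, expected: float) -> float:
    """Compare against expectations: positive means ahead of target."""
    return predicted - expected


average = describe(monthly_sales)
predicted = forecast_next(monthly_sales)
gap = gap_to_expectation(predicted, expected_next)
```

A real Analyze capability would replace `forecast_next` with anything from a statistical model to deep learning; the shape of the questions stays the same.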
Infuse – Operationalize AI with Trust and Transparency
The fourth rung of the AI Ladder is Infuse and is about how an enterprise can use AI as a real-world capability. Operationalizing AI means that models can be adequately managed, so that an inadequately performing model can be rapidly identified and replaced with another model or by some other means. Transparency implies that advanced analytics and AI are not a dark art and that all outcomes can be explained. Trust implies that fairness, in all its forms, is upheld wherever a model is used. Infuse allows data to be:
- Used for automation and optimization
- Part of a causal loop of action and feedback
- Exercised in a deployed model
- Used for developing insights and decision-making
- Beneficial to the data-driven organization
- Applied by the data-centric enterprise
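The idea of rapidly identifying and replacing an inadequately performing model, as part of a loop of action and feedback, can be sketched as follows. This is a minimal illustration under stated assumptions: `MonitoredModel`, the accuracy threshold, the window size, and the two toy models are all hypothetical, and a production system would track far richer metrics.

```python
from collections import deque
from typing import Callable

Model = Callable[[float], int]  # assumed shape: one numeric input, one class label


class MonitoredModel:
    """Wraps a deployed model and swaps in a fallback if recent accuracy drops."""

    def __init__(self, primary: Model, fallback: Model,
                 min_accuracy: float = 0.7, window: int = 5) -> None:
        self.active = primary
        self.fallback = fallback
        self.min_accuracy = min_accuracy
        self.outcomes: deque[bool] = deque(maxlen=window)
        self.replaced = False

    def predict(self, x: float) -> int:
        return self.active(x)

    def record_outcome(self, x: float, actual: int) -> None:
        """Feedback step: note whether the active model was right, then re-check it."""
        self.outcomes.append(self.active(x) == actual)
        window_full = len(self.outcomes) == self.outcomes.maxlen
        accuracy = sum(self.outcomes) / len(self.outcomes)
        if window_full and accuracy < self.min_accuracy and not self.replaced:
            self.active = self.fallback  # replace the underperforming model
            self.replaced = True
            self.outcomes.clear()


def always_one(x: float) -> int:
    """Toy underperforming model: predicts class 1 regardless of input."""
    return 1


def threshold(x: float) -> int:
    """Toy fallback model: predicts class 1 only above 0.5."""
    return int(x > 0.5)


model = MonitoredModel(always_one, threshold)
# Made-up observations of (input, actual label).
for x, actual in [(0.1, 0), (0.9, 1), (0.2, 0), (0.3, 0), (0.8, 1)]:
    model.record_outcome(x, actual)
```

After the five observations, the primary model has been right only twice, so the wrapper switches to the fallback. Keeping the outcome history also supports transparency, because the reason for the replacement can be explained from recorded data.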
Data as a differentiator
Data needs to be treated as a corporate asset. Data has the power to transform any organization, add monetary value, and enable the workforce to accomplish extraordinary things. Data-driven cultures can realize higher business returns.
While a dog house can be built without much planning, you cannot build a modern skyscraper with the same approach. The scale of preserved data across a complex hybrid cloud or multi-cloud topology requires discipline, even for an organization that embraces agile and adaptive philosophies.
Data can and should be used to drive analytical insights. But what considerations and planning activities are required to enable the generation of insights, the ability to take action, and the courage to make decisions? Although the planning and implementation activities to maximize the usefulness of your data can require some deep thinking, organizations can become data-centric and data-driven in a short time.
More so than ever, businesses need to move rapidly. Organizations must respond to changing needs as quickly as possible or risk becoming irrelevant. This applies to both private and public organizations, irrespective of size.
Data and the related analytics are key to differentiation, but traditional approaches are often ad hoc, naive, complex, difficult, and brittle. This can result in delays, business challenges, lost opportunities, and the rise of unauthorized projects.
Making data enabled and active
There are five key tenets to making data enabled and active:
- Developing a data strategy
- Developing a data architecture
- Developing a data topology for analytics
- Developing an approach to unified governance
- Developing an approach to maximizing the accessibility of data consumption
If data is an enabler, then analytics can be considered one of the core capabilities that is being enabled.
Analytics can be a complex and involved discipline that encompasses a broad and diverse set of tools, methods, and techniques. One end of the IBM AI Ladder is enabled through data in a static format such as a pre-built report; the other end is enabled through deep learning and advanced artificial intelligence. Between these two ends, the enablement methods include diagnostic analytics, machine learning, statistics, qualitative analysis, cognitive analysis, and more. A robot, a software interface, or a human may need to apply multiple techniques within a single task or across the role that they perform in driving insight, taking action, monitoring, and making decisions.
The drivers of change within a business can be regarded as stochastic. Whether foreseen or randomly determined, each change is likely to require new data, data that an organization has not previously anticipated. Increased data volumes, more data sources, higher rates of data ingestion, and greater variety in the types of data should simply be treated as givens.
While users are likely to have access to terabytes, petabytes, or even exabytes of data from data streams, IoT sensors, transactional systems, and so on, if the data is not properly incorporated, managed, controlled, enriched, governed, measured, and deployed, then the data may not only become useless but may also become a liability.
The activities to properly handle data and to pursue the AI Ladder can be shown in the three solution areas of IBM Data and AI offerings:
- Hybrid Data Management
  - Collect all types of data, structured and unstructured
  - Include all open sources of data
  - Single platform with a common application layer
  - Write once and deploy anywhere
- Unified Governance and Integration
  - Satisfy all matters of finding, cataloging and masking data
  - Integrate fluid data sets
  - Deliver built-in compliance
  - Leverage advanced machine learning capabilities
- Data Science and Business Analytics
  - Deliver descriptive, prescriptive and predictive insights across all types of data
  - Enable advanced analytics and data science methods