Background:
Advanced analytics, in particular analytics that takes advantage of Machine Learning and Artificial Intelligence (ML/AI), have become established mainstream business initiatives. Several published reports and surveys confirm the rapid growth in ML and AI projects at enterprises of all sizes and in all industries. A Gartner global survey of CIOs found that AI implementations grew by 270% in the prior four-year period. According to Forbes, 93% of executives expect to get some value from AI investments. Algorithmia’s “State of 2020 Machine Learning” survey found that budgets for ML initiatives are growing by 25% annually, with Banking, Manufacturing and IT industries having the largest growth.
Businesses are leveraging ML and AI for many different capabilities – at their core, these technologies allow businesses to uncover deeper insights, make better business predictions, and take actions on these predictions. Some of the business use cases for ML/AI that we have most commonly seen in our work with clients are:
- generating customer insights
- reducing churn
- improving customer experience
- recommendation engines
- fraud detection
- demand forecasting
- supply chain optimization
- internal process automation to reduce costs
Across industries, ML and AI not only provide competitive advantages but have become must-have capabilities for remaining viable. Due to the rapid decline in the cost of ML/AI platforms and technologies, the ROI for ML/AI initiatives has reached a level that makes them practical for an increasing number of businesses.
Challenges:
Despite this tremendous growth, many businesses have faced significant challenges in fulfilling the high expectations of ML/AI and actually realizing business value. In the Forbes report, 65% of executives reported that they are not yet seeing the expected value from their AI investments. In a TransUnion survey of finance, risk and marketing executives, 76% indicated that one of their biggest challenges was the data cleansing and prep work required to derive the expected value. Based on our experience with multiple ML engagements across a range of industries, one of the most significant challenges on ML projects is the poor quality of the data. In fact, 80% of time on ML/AI projects is spent on data understanding and preparation – cleaning poor quality data, determining how to fill data gaps, blending data from different sources, standardizing data definitions across various data sets, and other data prep activities.
Figure 1 below shows the steps and timeline of an ideal ML/AI project, versus what typically happens during an actual project that has a limited budget and timeline.
Figure 1 – Ideal vs Typical ML/AI Project Timeline
This illustrates the lack of maturity in Enterprise Data Management that is common in most organizations. Enterprise Data Management (EDM) is the discipline that strives to continually increase the overall data maturity of an organization. This includes the capabilities of data governance, master data management (MDM), data quality, metadata management, data engineering, data security and data risk management. EDM maturity is important for general reporting needs, and even more so for ML/AI needs.
In many ML/AI projects, poor input data leads to less insightful ML/AI models, which result in limited business value! Some of the key impacts of this are highlighted below.
- Even though data scientists have tools to detect data discrepancies, poor data necessitates several iterations to deliver higher-performing models, challenging the business case for future investments in ML
- Early investments in ML could result in models with limited economic value due to the prevalent data issues, or an inability to scale the ML models’ benefits due to data quality and data governance concerns
- The long-term ROI potential of deriving ML/AI value elevates the importance of leveraging capabilities like DataOps and MLOps to ease deployment and operationalization of these models
Real-world Examples:
Figures 2 and 3 below show some real-world examples of how poor master data, data quality, and metadata lead to outcomes that do not deliver the full potential of ML.
Figure 2 – Master data issues limit business value of ML outcomes
Figure 3 – Data quality (DQ) & metadata issues limit the business value of ML outcomes
As seen above, it is critically important that Enterprise Data Management be an integral part of ML/AI initiatives; mature EDM will help to ensure that the overall quality of the input data fed into the models is sufficiently high.
Bottom line – Enterprise Data Management and analytics initiatives must work together in order to deliver the full business value and expectations of ML/AI!
Recommendations to Address these Challenges:
Inspired Intellect makes four high level recommendations to help address the challenges described above. These are shown in Figure 4.
Figure 4 – High level recommendations to address EDM challenges for ML/AI initiatives
1. Develop and maintain an effective data governance and MDM program, ensuring ownership of all data assets. Specifically,
- standardize definitions for key enterprise master and reference data such as customer, supplier, product, vendor, location, etc. – since they will typically play a significant part in most ML use cases
- prioritize data assets that are significant for high value ML/analytics use cases
- prioritize active governance that helps to proactively enhance data quality at the source systems, before data is fed to ML/AI and other use cases
- include passive governance that helps to standardize data and address existing data quality issues prior to conducting ML/advanced analytics use cases
2. Implement a continuous data quality improvement initiative which includes the use of intelligent tools. Specifically,
- prioritize the critical master, reference, and transactional data assets that provide the greatest value for ML
- develop and automate data quality business rules
- use intelligent tools like ML/AI anomaly detection to discover data quality issues and provide recommendations and automated fixes
- analyze ML/AI models to determine which data assets are most influential on the models’ predictions to consequently help rationalize your data quality initiatives
- in effect, the two preceding items (anomaly detection and analysis of model influence) use ML/AI outcomes within the EDM process to improve business value outcomes from the ML use cases – a virtuous circle
- develop data quality metrics that drive the enterprise towards higher quality
3. Develop an enterprise data and features catalog which includes
- a business glossary with standard enterprise definitions and metrics, prioritizing those needed for ML/AI initiatives
- a features catalog to standardize and govern features that are developed during the ML/AI process
- a metadata catalog with both technical information and data lineage
- a user-friendly way for both business and technical data consumers to access, share and collaborate on this information
4. Incentivize data owners and ML business users based on governance and data quality metrics, as well as business value of the ML insights derived
- Enterprise data quality metrics should be part of the overall incentives for data owners as well as ML and other data consumers, which will drive behavior of the enterprise towards higher data quality
- Business users should be incentivized based on the actual value that ML initiatives are bringing to the business, which will help promote valuable ML/AI initiatives (enabled by high quality data) and weed out low value ML/AI initiatives that could be hampered by poor quality data
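To make recommendation 2 a little more concrete, the sketch below shows one possible way to automate data quality business rules, compute a simple quality metric, and flag suspicious values. It is only an illustration under stated assumptions: the records, field names, rules, and the functions `pass_rates` and `flag_outliers` are all hypothetical, and the outlier check uses a basic statistical method (a modified z-score based on the median) as a stand-in for the ML/AI anomaly detection tools mentioned above.

```python
from statistics import median

# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"customer_id": "C001", "email": "ann@example.com", "monthly_spend": 120.0},
    {"customer_id": "C002", "email": "", "monthly_spend": 95.0},
    {"customer_id": "C003", "email": "bob@example.com", "monthly_spend": 110.0},
    {"customer_id": "C004", "email": "cy@example", "monthly_spend": 105.0},
    {"customer_id": "C005", "email": "dee@example.com", "monthly_spend": 9800.0},
    {"customer_id": "C006", "email": "eli@example.com", "monthly_spend": 101.0},
]


def email_well_formed(record):
    """Very rough format check: one '@' part followed by a dotted domain."""
    email = record.get("email", "")
    return "@" in email and "." in email.split("@")[-1]


# Each business rule is a name plus a predicate that returns True on a pass.
rules = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "email_well_formed": email_well_formed,
    "spend_non_negative": lambda r: r.get("monthly_spend", 0) >= 0,
}


def pass_rates(records, rules):
    """Return the fraction of records that pass each rule (a simple DQ metric)."""
    return {
        name: sum(1 for r in records if check(r)) / len(records)
        for name, check in rules.items()
    }


def flag_outliers(values, threshold=3.5):
    """Return indexes of values whose modified z-score exceeds the threshold.

    The score is 0.6745 * |x - median| / MAD, where MAD is the median
    absolute deviation. If MAD is zero, nothing is flagged.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


if __name__ == "__main__":
    for rule_name, rate in pass_rates(records, rules).items():
        print(f"{rule_name}: {rate:.0%} of records pass")
    spend = [r["monthly_spend"] for r in records]
    for i in flag_outliers(spend):
        print(f"Review {records[i]['customer_id']}: unusual monthly_spend {spend[i]}")
```

Pass rates like these could be tracked over time as the data quality metrics described in recommendations 2 and 4. A median-based score is used here because, on small samples, a single extreme value inflates the standard deviation enough to hide itself from an ordinary z-score check.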
Conclusion
As ML/AI initiatives continue to grow in importance to business execution, they must be accompanied by strong Enterprise Data Management in order to increase their delivered value. By following the recommendations above, ML/AI project teams can focus on developing the best models rather than dealing with data quality issues. Each of these recommendations requires prioritization, investment, and strategic focus driven by the Chief Data Officer (CDO) or equivalent C-suite executive, along with an integration partner like Inspired Intellect that can drive the organizational, business and technical workstreams. When executed well, EDM initiatives will enable higher business value from ML/AI investments.
In our next blog, we will discuss how several EDM technologies and tools work hand-in-hand with ML/AI tools to help automate and streamline both these processes.
Inspired Intellect is an end-to-end service provider of data management, analytics and application development. We engage through a portfolio of offerings ranging from strategic advisory and design, to development and deployment, through to sustained operations and managed services.
Learn how Inspired Intellect’s EDM and ML/AI strategy and solutions can help bring greater value to your analytics initiatives by contacting us at marketing@inspiredintellect-us.com.