Leveraging SAP’s Enterprise Data Management Tools to Enable ML/AI Success

Background

In our previous blog post, “Master Your ML/AI Success with Enterprise Data Management”, we outlined the need for Enterprise Data Management (EDM) and ML/AI initiatives to work together in order to deliver the full business value and expectations of ML/AI. We made a set of high-level recommendations to increase EDM maturity and in turn enable higher value from ML/AI initiatives. A graphical summary of these recommendations is shown below:

 

Figure 1 – High level recommendations to address EDM challenges for ML/AI initiatives

 

In this post, we will present a specific instantiation of technology for bringing those concepts to life. There are countless examples that could be shown, but for the purposes of this post, we will present a solution within the SAP toolset. The end result is an implementation environment where the EDM technologies work hand-in-hand with ML/AI tools to help automate and streamline both these processes.

SAP’s preferred platform for ML/AI is SAP Data Intelligence (DI). When it comes to EDM, SAP has a vast suite of tools that store, transfer, process, harness, and visualize data. We will focus on four tools that we believe have the most significant impact on ML/AI initiatives implemented on DI. These are SAP Master Data Governance (MDG), the Metadata Explorer component of SAP Data Intelligence (DI), and, to a smaller extent, SAP Information Steward (IS). In addition, SAP Data Warehouse Cloud (DWC) can be used to bring all the mastered and cleansed data together and to store and visualize the ML outputs.

Architecture

As with any other enterprise data solution, the challenge is to integrate a set of tools effectively to deliver the needed value, without the cost overhead of moving and storing data in multiple places or the added infrastructure, usage and support costs. For enterprises that run on SAP systems, a high-level architecture and descriptions of the tools that would achieve these benefits are shown below.

 

Figure 2 – High-level MDG/DI architecture and data flow

 

1. SAP MDG (Master Data Governance) with MDI (Master Data Integration)

SAP MDG and MDI go hand in hand. MDI is provided with the SAP Cloud Platform. It enables communication across various SAP applications by establishing the One Domain Model (ODM), which gives a consistent view of master data across end-to-end scenarios.

SAP MDG is available in S/4HANA-based and ERP-based versions. This tool helps ensure high-quality, trusted master data for both initial and ongoing purposes. It can become a key part of the enterprise MDM and data governance program. Both active and passive governance are supported. Based on business needs, certain domains are prioritized out of the box in MDG. MDG provides capabilities such as Consolidation, Mass Processing and Central Governance, coupled with governance workflows for Create-Read-Update-Delete (CRUD) processes.
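To make the idea behind Consolidation more concrete, here is a minimal Python sketch of the general concept: records that refer to the same business partner are grouped and merged into a single "golden" record. This is not MDG's API or its matching algorithm. The field names (`name`, `city`, `phone`), the match key, and the survivorship rule (the first non-empty value wins) are all assumptions made for illustration.

```python
from collections import defaultdict

def match_key(record):
    """Illustrative match key: alphanumeric lowercased name plus lowercased city."""
    name = "".join(ch for ch in record["name"].lower() if ch.isalnum())
    city = record["city"].strip().lower()
    return (name, city)

def consolidate(records):
    """Group records by match key and merge each group into one golden record."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec)

    golden = []
    for group in groups.values():
        merged = {}
        for rec in group:
            for field, value in rec.items():
                # Assumed survivorship rule: keep the first non-empty value seen.
                if field not in merged or (not merged[field] and value):
                    merged[field] = value
        golden.append(merged)
    return golden

# Hypothetical sample data: the first two rows describe the same partner.
partners = [
    {"name": "Acme Corp.", "city": "Chicago", "phone": ""},
    {"name": "ACME Corp", "city": "chicago ", "phone": "312-555-0100"},
    {"name": "Globex", "city": "Boston", "phone": "617-555-0199"},
]

for row in consolidate(partners):
    print(row)
```

In this sketch the two Acme rows collapse into one record that keeps the first spelling of the name and picks up the phone number from the second row. A production tool would typically use fuzzy matching and configurable survivorship rules rather than an exact key.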

SAP has recently announced SAP MDG, cloud edition. While it is not a replacement for MDG on S/4HANA, MDG cloud edition is planned to come with core MDG capabilities like Consolidation, Centralization and Data Quality Management to centrally manage core attributes of Business Partner data. This is a useful “very quick start” option for customers who have never used MDG, but it can also help customers already using MDG on S/4HANA to build out their landscape into a federated MDG approach that better balances centralized and decentralized master data.

 

2. Data Intelligence (with Metadata Explorer component)

SAP IS and MDG are the pathways that make enriched, trusted data available to Data Intelligence, which is used to actually build the ML/AI models. We can reuse SAP IS rules and metadata terms directly in SAP DI. This is achieved in DI by utilizing its data integration, orchestration, and streaming capabilities. DI’s Metadata Explorer component also facilitates the flow of business rules, metadata, glossaries, catalogs, and definitions to tools like IS (on-prem) to ensure consistency and governance of data. Metadata Explorer is geared towards the discovery, movement and preparation of data assets that are spread across diverse and disparate enterprise systems, including cloud-based ones.

 

3. Information Steward (IS)

Information Steward is an optional tool, useful for profiling data, especially for on-prem situations. The data quality effort can be initiated by creating the required Data Quality business rules, followed by profiling the data and running Information Steward to assess data quality. This would be the first step towards initial data cleansing, and thereby data remediation, using a passive governance approach via quality dashboards and reports. (Many of these features are also available in MDG and DI.) SAP IS helps an enterprise address general data quality issues, prior to using specialized tools like SAP MDG to address master data issues. It can be an optional part of any ongoing data quality improvement initiative for an enterprise.
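The "define rules, then profile" workflow described here can be sketched in plain Python. This illustrates the concept only; it is not Information Steward's rule language or API. The rule names, the sample fields (`customer_id`, `country`, `email`), and the scoring (percentage of records passing each rule) are assumptions.

```python
import re

# Each rule maps a name to a predicate that returns True when a record passes.
# Rule names and fields are hypothetical.
RULES = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "country_is_two_uppercase_letters": lambda r: bool(
        re.fullmatch(r"[A-Z]{2}", r.get("country") or "")
    ),
    "email_has_valid_shape": lambda r: bool(
        re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", r.get("email") or "")
    ),
}

def profile(records, rules=RULES):
    """Return, per rule, the pass rate (0-100) and the indexes of failing records."""
    report = {}
    for name, check in rules.items():
        failures = [i for i, rec in enumerate(records) if not check(rec)]
        passed = len(records) - len(failures)
        rate = 100.0 * passed / len(records) if records else 0.0
        report[name] = {"pass_rate": rate, "failing_rows": failures}
    return report

# Hypothetical sample data with a few deliberate defects.
customers = [
    {"customer_id": "C001", "country": "US", "email": "a@example.com"},
    {"customer_id": "", "country": "usa", "email": "b@example"},
    {"customer_id": "C003", "country": "DE", "email": None},
    {"customer_id": "C004", "country": "FR", "email": "d@example.org"},
]

for rule, result in profile(customers).items():
    print(rule, result)
```

A report like this is the kind of input a quality dashboard would summarize: the pass rate gives a score per rule, and the failing row indexes point stewards to the records that need remediation.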

 

4. Data Warehouse Cloud (DWC)

Data Warehouse Cloud is used in this architecture to bring all the mastered and cleansed data together into the cloud, perform any other data preparation or transformations needed, and to model the data into the format needed by the ML models in DI. DWC is also used to store the results of the ML models, and to create visualizations of these results for data consumers.

 

Figure 3 – Summary of Functionality of SAP tools used for EDM

 

While there are some overlaps in functionality between these tools, Data Intelligence is more focused on the automation aspects of these capabilities. DI is primarily intended as an ML platform, and therefore has functionality such as the ability to create data models and organize the data in a format that facilitates the ML/AI process (ML Data Manager). This architecture allows for capitalizing on the EDM strengths of MDG and IS. This is also consistent with the strategic direction of SAP, that is, providing a comprehensive “Business Transformation as a Service” approach, leading with cloud services. Together, these tools complement one another (for hybrid on-prem plus cloud scenarios) and work hand in hand to make trusted data available to ML/AI.

Conclusion

In summary, the SAP ecosystem has several EDM tools that can help address the data quality and data prep challenges of the ML/AI process. SAP tools like MDG and the DI Metadata Explorer component have features and integration capabilities that can be leveraged during, or even before, the ML/AI use cases are underway. If used in conjunction with the general EDM maturity recommendations summarized above, these tools will help to deliver the full business value and expectations of ML/AI use cases.

In our next post, we will continue our discussion on EDM tools, some of their newer features, how they have evolved, and how ML/AI has been part of their own evolution. As a reminder, if you missed the first post in this series, you can find it here: “Master Your ML/AI Success with Enterprise Data Management”.

 

 


Inspired Intellect is an end-to-end service provider of data management, analytics and application development. We engage through a portfolio of offerings ranging from strategic advisory and design, to development and deployment, through to sustained operations and managed services.

 

Learn how Inspired Intellect’s EDM and ML/AI strategy and solutions can help bring greater value to your analytics initiatives by contacting us at marketing@inspiredintellect-us.com.

Master Your ML/AI Success With Enterprise Data Management

Background:

Advanced analytics, in particular analytics that take advantage of Machine Learning and Artificial Intelligence (ML/AI), have become established mainstream business initiatives. Several published reports and surveys confirm the rapid growth in ML and AI projects at enterprises of all sizes and in all industries. A Gartner global survey of CIOs found that AI implementations grew by 270% in the prior four-year period. According to Forbes, 93% of executives expect to get some value from AI investments. Algorithmia’s “State of 2020 Machine Learning” survey found that budgets for ML initiatives are growing by 25% annually, with the Banking, Manufacturing and IT industries having the largest growth.

 

Businesses are leveraging ML and AI for many different capabilities – at their core, these technologies allow businesses to uncover deeper insights, make better business predictions, and take actions on these predictions. Some of the business use cases for ML/AI that we have most commonly seen in our work with clients are:

  • generating customer insights
  • reducing churn
  • improving customer experience
  • recommendation engines
  • fraud detection
  • demand forecasting
  • supply chain optimization
  • internal process automation to reduce costs

Across industries, ML and AI not only provide competitive advantages but have become must-have capabilities for remaining viable and competitive. Due to the rapid decline in the cost of ML/AI platforms and technologies, the ROI for ML/AI initiatives has reached a level that puts them within reach of an increasing number of businesses.

 

Challenges:

Despite this tremendous growth, many businesses have faced significant challenges in fulfilling the high expectations of ML/AI and actually realizing business value. In the Forbes report, 65% of executives reported that they are not yet seeing the expected value from their AI investments. In a TransUnion survey of finance, risk and marketing executives, 76% indicated that one of their biggest challenges was the data cleansing and prep work required to derive the expected value. Based on our experience with multiple ML engagements across a range of industries, one of the most significant challenges on ML projects is the poor quality of the data. In fact, 80% of time on ML/AI projects is spent on data understanding and preparation – cleaning poor quality data, determining how to fill data gaps, blending data from different sources, standardizing data definitions across various data sets, and other data prep activities.

 

Figure 1 below shows the steps and timeline of an ideal ML/AI project, versus what typically happens during an actual project that has a limited budget and timeline.

 

Figure 1 – Ideal vs Typical ML/AI Project Timeline

 

This illustrates the lack of maturity in Enterprise Data Management, which is quite common across organizations. Enterprise Data Management (EDM) is the discipline that strives to continually increase the overall data maturity of an organization. This includes the capabilities of data governance, master data management (MDM), data quality, metadata management, data engineering, data security and data risk management. EDM maturity is important not only for general reporting needs, but particularly for ML/AI needs as well.

 

In many ML/AI projects, poor input data leads to less insightful ML/AI models, which result in limited business value! Some of the key impacts of this are highlighted below.

  • Even though data scientists have tools to detect data discrepancies, poor data necessitates several iterations to deliver higher-performing models, challenging the business case for future investments in ML
  • Early investments in ML could result in models with limited economic value due to the prevalent data issues, or an inability to scale the ML models’ benefits due to data quality and data governance concerns
  • The long-term ROI potential of deriving ML/AI value elevates the importance of leveraging capabilities like Data-Ops and ML-Ops to ease deployment and operationalization of these models

 

Real-world Examples:

Shown below in Figures 2 and 3 are some real-world examples of how poor master data, data quality, and metadata lead to outcomes that do not deliver the full potential of ML.

 

Figure 2 – Master data issues limit business value of ML outcomes

 

 

Figure 3 – Data quality (DQ) & metadata issues limit the business value of ML outcomes

 

As seen above, it is critically important that Enterprise Data Management be an integral part of ML/AI initiatives; mature EDM will help to ensure that the overall quality of the input data fed into the models is sufficiently high.

 

Bottom line – Enterprise Data Management and analytics initiatives must work together in order to deliver the full business value and expectations of ML/AI!

 

Recommendations to Address these Challenges:

Inspired Intellect makes four high level recommendations to help address the challenges described above. These are shown in Figure 4.

 

Figure 4 – High level recommendations to address EDM challenges for ML/AI initiatives

 

1. Develop and maintain an effective data governance and MDM program, ensuring ownership of all data assets. Specifically,

  • address standardization of definitions for key enterprise master and reference data such as customer, supplier, product, vendor, location, etc., since they will typically play a significant part in most ML use cases
  • prioritize data assets that are significant for high value ML/analytics use cases
  • prioritize active governance that helps to proactively enhance data quality at the source systems, before data is fed to ML/AI and other use cases
  • include passive governance that helps to standardize data and address existing data quality issues prior to conducting ML/advanced analytics use cases

2. Implement a continuous data quality improvement initiative which includes the use of intelligent tools. Specifically,

  • prioritize the critical master, reference, and transactional data assets that provide the greatest value for ML
  • develop and automate data quality business rules
  • use intelligent tools like ML/AI anomaly detection to discover data quality issues and provide recommendations and automated fixes
  • analyze ML/AI models to determine which data assets are most influential on the models’ predictions to consequently help rationalize your data quality initiatives
  • in effect, the two preceding items (ML/AI anomaly detection and analysis of model influence) use ML/AI outcomes within the EDM process to improve the business value of the ML use cases, creating a virtuous circle
  • develop data quality metrics that drive the enterprise towards higher quality
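As a hedged illustration of the anomaly-detection recommendation in this list, the sketch below flags suspicious numeric values using the modified z-score based on the median absolute deviation (MAD). This is a simple statistical technique, not a trained ML model, and is shown only to make the idea concrete. The 3.5 threshold is a commonly used convention, and the sample order quantities are invented for the example.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return indexes of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so this method cannot rank outliers.
        return []
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]

# Hypothetical order quantities; the value 500 is a likely data entry error.
order_quantities = [10, 12, 11, 13, 12, 11, 500, 12]
print(mad_outliers(order_quantities))  # prints [6]
```

The median-based score is used here instead of a mean and standard deviation because a single extreme value inflates the standard deviation and can hide itself. Flagged indexes would then feed a review queue or an automated correction rule.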

3. Develop an enterprise data and features catalog which includes

  • a business glossary with standard enterprise definitions and metrics, prioritizing those needed for ML/AI initiatives
  • a features catalog to standardize and govern features that are developed during the ML/AI process
  • a metadata catalog with both technical information and data lineage
  • a user-friendly way for both business and technical data consumers to access, share and collaborate on this information
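The sketch below shows one way the catalog elements listed above might relate to one another, using Python dataclasses. All class names, field names, and sample entries are hypothetical; a real catalog tool would define its own schema and storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class GlossaryTerm:
    name: str
    definition: str
    owner: str

@dataclass
class DatasetEntry:
    name: str
    source_system: str
    columns: Dict[str, str]  # column name -> technical type
    upstream: List[str] = field(default_factory=list)  # lineage: source dataset names

@dataclass
class FeatureEntry:
    name: str
    description: str
    derived_from: List[str]  # "dataset.column" references
    glossary_terms: List[str] = field(default_factory=list)

@dataclass
class Catalog:
    glossary: Dict[str, GlossaryTerm] = field(default_factory=dict)
    datasets: Dict[str, DatasetEntry] = field(default_factory=dict)
    features: Dict[str, FeatureEntry] = field(default_factory=dict)

    def lineage(self, dataset_name):
        """Return all upstream datasets of a dataset, nearest first."""
        seen, queue = [], list(self.datasets[dataset_name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self.datasets:
                queue.extend(self.datasets[current].upstream)
        return seen

    def features_using(self, dataset_name):
        """Return names of features derived from any column of the dataset."""
        prefix = dataset_name + "."
        return [
            f.name for f in self.features.values()
            if any(ref.startswith(prefix) for ref in f.derived_from)
        ]

# Hypothetical entries to show how the pieces connect.
catalog = Catalog()
catalog.glossary["Customer Tenure"] = GlossaryTerm(
    "Customer Tenure", "Months since the customer's first order", "Sales Ops"
)
catalog.datasets["crm_customers"] = DatasetEntry("crm_customers", "CRM", {"id": "string"})
catalog.datasets["customer_master"] = DatasetEntry(
    "customer_master", "MDM", {"id": "string", "first_order": "date"},
    upstream=["crm_customers"],
)
catalog.features["tenure_months"] = FeatureEntry(
    "tenure_months", "Customer tenure in months",
    derived_from=["customer_master.first_order"],
    glossary_terms=["Customer Tenure"],
)

print(catalog.lineage("customer_master"))
print(catalog.features_using("customer_master"))
```

Keeping features, datasets, and glossary terms in one linked structure is what lets a data consumer answer questions such as "which ML features are affected if this source table changes?".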

4. Incentivize data owners and ML business users based on governance and data quality metrics, as well as business value of the ML insights derived

  • Enterprise data quality metrics should be part of the overall incentives for data owners as well as ML and other data consumers, which will drive behavior of the enterprise towards higher data quality
  • Business users should be incentivized based on the actual value that ML initiatives are bringing to the business, which will help promote valuable ML/AI initiatives (enabled by high quality data) and weed out low value ML/AI initiatives that could be hampered by poor quality data

Conclusion

As ML/AI initiatives continue to grow in importance to business execution, they must be accompanied by strong Enterprise Data Management in order to increase their delivered value. By following the recommendations above, ML/AI project teams can focus on developing the best models rather than dealing with data quality issues. Each of these recommendations requires prioritization, investment, and strategic focus driven by the Chief Data Officer (CDO) or equivalent C-suite executive, along with an integration partner like Inspired Intellect that can drive the organizational, business and technical workstreams. When executed well, EDM initiatives will enable higher business value from ML/AI investments.

 

In our next blog, we will discuss how several EDM technologies and tools work hand-in-hand with ML/AI tools to help automate and streamline both these processes.

 

