Data Governance and Automatic Data Lineage: Surviving Regulatory Requirements
The growing volume of data and the growing number of regulations require organizations to prove to regulators what data their systems contain and the path that data has taken. It is no longer enough to know and master the data; it is also necessary to describe its journey with complete transparency. What does this traceability requirement imply for the strategies organizations need to implement?
We have key insights from two experts: Frédéric Fourquet, Product Marketing Manager of Data Intelligence at MEGA International, and Ernie Ostic, Senior Vice President of Products at MANTA Software.
Trace the complex journey of data with automatic data lineage.
Ernie Ostic: Data lineage is, above all, a flow of transformations. It makes it possible to trace the technical genealogy of data by giving a precise overview of the path it has traveled through computer systems. This approach provides a complete view of the data life cycle, from collection through use to destruction. Although automated lineage is necessary for complex technology portfolios, it is also about connecting with business users beyond the technical aspects, especially during the data discovery phase and the modeling of underlying processes.
Frédéric Fourquet: The challenge of the lineage process is to understand what is happening along the data journey: where the data comes from, where it goes, who collects it, who uses it, who reuses it, and so on. Data is not static; it is embedded in processes and requires an in-depth, dynamic view. Technical lineage makes it possible to know precisely what happened in the system and what processing the data has been through since it was set up. Retracing this history gives a deeper knowledge of the data.
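The questions "where does it come from" and "where does it go" can be read as two traversals of a lineage graph. The following is a minimal sketch of that idea, not a description of any vendor's implementation: the dataset names and the DOWNSTREAM mapping are hypothetical, and a real lineage tool would derive the edges from scanned systems rather than a hand-written dictionary.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed as its values.
DOWNSTREAM = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["warehouse.dim_customer"],
    "erp.orders": ["warehouse.fact_orders"],
    "warehouse.dim_customer": ["reports.churn_dashboard"],
    "warehouse.fact_orders": ["reports.churn_dashboard"],
}


def invert(graph):
    """Flip every edge so the graph can be walked from target back to source."""
    inverted = {}
    for source, targets in graph.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return inverted


def reachable(graph, start):
    """Breadth-first search: every node reachable from start by following edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


UPSTREAM = invert(DOWNSTREAM)

# "Where does it come from?" -> walk the inverted graph from the report.
origins = reachable(UPSTREAM, "reports.churn_dashboard")

# "Where does it go?" -> walk the forward graph from the source system.
impact = reachable(DOWNSTREAM, "crm.customers")
```

Here origins holds every dataset that contributes to the report, and impact holds every dataset affected by a change to the CRM source.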
Data governance: proving the origin and destination of data
Ernie: With the increasing number of data regulations - notably GDPR in Europe - the challenge is to prove to the regulator how each piece of data was obtained. Failure to comply with this requirement can put organizations in any industry at risk. Since data management is a long-term transformation process, it is crucial to trace the history of each piece of data, including its origin and the processing applied to it. It is no longer just a question of providing the processed data to the regulator; it is also necessary to demonstrate its lineage, i.e., the data's genealogy in the system. Given the exponential growth in data volumes, automation is essential.
Frédéric: Modeling the data life cycle in an automated way avoids enormous manual effort - effort that may even become impossible beyond a certain volume of data. Automation is essential: for example, when a company has several hundred critical data items to process, the Data Office can only process 10 to 15 of them manually per year. Automating the lineage phase frees up time to focus on data governance and regulatory compliance work.
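The arithmetic behind this point is worth spelling out. The sketch below assumes a backlog of 300 critical data items, which is only an illustrative stand-in for "several hundred"; the 10 and 15 items per year are the manual rates quoted above.

```python
import math


def years_to_document(critical_items, items_per_year):
    """Whole years needed to work through the backlog at a fixed manual rate."""
    return math.ceil(critical_items / items_per_year)


CRITICAL_ITEMS = 300  # assumption: illustrative value for "several hundred"

slow = years_to_document(CRITICAL_ITEMS, 10)
fast = years_to_document(CRITICAL_ITEMS, 15)
print(f"Manual lineage would take between {fast} and {slow} years")
```

Under that assumption, manual documentation would take 20 to 30 years, long before which the systems themselves would have changed.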
Ensure compliance: the truth is in the code.
Ernie: Automation ensures dynamic adaptation over time, across different versions, processing periods, and so on. The truth of the data is in the code. It is written somewhere in the technical process, and the lineage is there to prove it and to give a clear view of it. This is the case with COBOL programs, for example, whose secrets must be revealed by describing the lineage through a detailed scan of the systems, seen through the lens of their evolution over time. Finding the path of the data can thus help analyze what COBOL programs are doing in a system.
Frédéric: The regulator needs to understand the data at both the business and technical levels. Stakeholders, particularly in the banking industry, know the risk and their exposure to non-compliance fines. With data lineage and data governance, the processes are described objectively, and everyone who has processed the data is identified. It is no longer necessary to investigate who wrote the code and who has the process in mind; everything is immediately available to the Chief Data Officer and the regulator.
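To make "the truth is in the code" concrete, here is a deliberately simplified sketch of reading field-level lineage out of source code. It only recognizes COBOL MOVE statements of the form MOVE source TO target(s) terminated by a period, using a regular expression; the sample program and field names are hypothetical. A production scanner would need a real parser that handles copybooks, COMPUTE, conditional logic, and file I/O, so this should be read as an illustration of the principle only.

```python
import re

# Toy pattern: "MOVE <source> TO <target> [<target> ...]." terminated by a period.
MOVE_RE = re.compile(
    r"\bMOVE\s+([A-Z0-9-]+)\s+TO\s+([A-Z0-9-]+(?:\s+[A-Z0-9-]+)*)\s*\.",
    re.IGNORECASE,
)


def extract_move_edges(source):
    """Return (source_field, target_field) pairs for each MOVE statement found."""
    edges = []
    for match in MOVE_RE.finditer(source):
        src = match.group(1).upper()
        # A single MOVE can list several receiving fields.
        for dst in match.group(2).split():
            edges.append((src, dst.upper()))
    return edges


# Hypothetical fragment of a COBOL procedure division.
SAMPLE = """
    MOVE CUST-ID TO OUT-CUST-ID.
    MOVE IN-BALANCE TO WS-BALANCE RPT-BALANCE.
    COMPUTE WS-TOTAL = WS-BALANCE + WS-FEES.
"""

edges = extract_move_edges(SAMPLE)
```

The COMPUTE line is ignored by this toy pattern, which is exactly the kind of gap that makes hand-rolled scanning unreliable at scale.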
Artificial Intelligence: leveraging data insights
Frédéric: While the Data Steward collects and models data for the data catalog, the role of the Data Scientist is to design algorithms that make recommendations to create or improve a service or product. This is possible, for example, thanks to customer behavior modeling based on data provided upstream. It is therefore valuable to know the data life cycle (lineage) from the design phase of an artificial intelligence project onward, in order to select the best data sources and get the best possible results. It is also essential in the production phase to unlock the full potential of the AI.
Ernie: To have reliable artificial intelligence, stable, good-quality data is mandatory. The ability to detect changes over time and set up alerts by topic would be of great interest. The most important thing is to help Data Scientists by working on the technical life cycle of data and by bringing value through smart tags, reminders about quality, or other data-related issues. This notification feature is another step toward detecting new data through progressive lineages.
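One way to picture change detection over time is to compare two lineage snapshots and raise alerts only for the datasets a team cares about. The sketch below assumes lineage is stored as a set of (source, target) edges; the snapshot contents, the watched dataset, and the alert format are all hypothetical.

```python
def diff_lineage(previous, current):
    """Compare two snapshots of lineage edges, each a set of (source, target)."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


def build_alerts(changes, watched):
    """Keep only the changes that touch a watched dataset."""
    alerts = []
    for kind, edges in changes.items():
        for source, target in edges:
            if source in watched or target in watched:
                alerts.append(f"{kind}: {source} -> {target}")
    return alerts


# Hypothetical snapshots taken at two different scan dates.
last_week = {
    ("crm.customers", "features.customer"),
    ("web.clicks", "features.customer"),
}
today = {
    ("crm.customers", "features.customer"),
    ("mobile.events", "features.customer"),
}

changes = diff_lineage(last_week, today)
alerts = build_alerts(changes, watched={"features.customer"})
```

A Data Scientist whose model reads features.customer would be told that a new source has appeared and an old one has gone, before retraining on the changed inputs.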
Meet the experts
Frédéric Fourquet: Data Governance Product Marketing Manager, Frédéric started his career in 1997. Before joining MEGA as Product Marketing Manager for Data Intelligence, he was Product Director in Artificial Intelligence & Data Intelligence for Banking & Insurance - Strategy, Marketing, Business Development, Consulting & Partnerships (4 years), Consulting Director in Regulations, Compliance, Data Governance & Data Innovation (16 years), and Sales/Presales Executive and Consultant in US/EMEA Software Companies (4 years).
Ernie Ostic: Ernie is the SVP of Products at MANTA, focusing on lineage and metadata integration solutions. He has over forty years of experience in data integration, including twenty-plus years at IBM, working in various roles with responsibilities in product management and technical sales support. For most of the past decade, Ernie has been guiding information governance and helping architect custom lineage solutions. Earlier in his career, Ernie built decision support systems with fourth-generation languages and data access middleware. Ernie blogs on open metadata, data lineage, and overall metadata management and governance. He is a graduate of Boston College.
MEGA International and MANTA recently formed an important business and technological partnership to combine their expertise. This innovative collaboration enables companies to achieve regulatory compliance and provides them with access to highly accurate business data insights.