Modernizing Bayer’s Data Landscape with Databricks

120+

delivered data products

15,000+

tables ingested every hour

Streamlining Data Architecture

with a scalable, self-service cloud platform

About the Client

Bayer is a life science company with over 150 years of history and core competencies in health care and agriculture. It is managed through three divisions: Pharmaceuticals (prescription products), Consumer Health (non-prescription products), and Crop Science (crop protection), and is represented by around 250 consolidated companies in 83 countries. Bayer AG employs approximately 100,000 people worldwide, with around 45% of its workforce based in Europe.

The Challenge

Bayer was unable to leverage the full potential of its data in some of its most strategic data domains (supply chain and logistics, R&D, procurement, sales, and a dozen more). Data was often scattered across different SAP implementations, non-SAP ERPs, and other tools in its IT ecosystem. The data was siloed and invisible to other departments, and there was no centralized, robust, and scalable data platform providing company-wide governance over data quality and access.

The Solution

We led the design and integration of a Databricks Data Lakehouse platform at Bayer, streamlining data workflows, accelerating reporting, and contributing to the modernization of the overall IT environment. 

Our custom-built solution enabled the processing of over 15,000 SAP/SAP BW tables every hour, making both SAP and non-SAP data available in one place. With our agile DevOps team, we delivered over 120 data products across key domains, including supply chain, procurement, HR, and compliance. These are now available to the whole company.

As the platform matured, we addressed topics such as data governance, security, and cost control, improving data quality, integrating data catalogs, and ensuring secure access in a multi-cloud environment. 

We also promoted decentralized data ownership and reusability, ensuring accountability and real business value in a complex, multi-vendor environment. 

Benefits

Integrated Data Landscape

Building one Databricks platform for seamless access to SAP and non-SAP data sources.

Efficient Development of Data and AI Use Cases

Accelerating development and deployment of data products and AI applications.

Decentralized Data Quality Ownership

Placing responsibility for data quality closer to the origin of business data.

Centralized Data Governance and Security

Managing governance and security centrally and dynamically across the platform.

Enhanced Data Exchange and Reusability

Promoting data sharing across different business domains.

Scalable Solutions for the Future

Building scalable data platforms to support long-term business growth and adaptability.

Streamlining Data Architecture

Optimizing the data landscape by removing outdated technologies.

Need to Know More?

Key contacts

Jan Císař, DataSentics

Vertical AI Solutions PreSales Lead

Fanny Frafer, DataSentics

Vertical AI Solutions France Lead