120+
delivered data products
15,000+
tables ingested every hour
Streamlining Data Architecture
with a scalable, self-service cloud platform
About the Client
Bayer is a life science company with over 150 years of history and core competencies in the fields of health care and agriculture. It is managed through three divisions: Pharmaceuticals (prescription products), Consumer Health (non-prescription products), and Crop Science (crop protection), and is represented by around 250 consolidated companies in 83 countries. Bayer AG employs approximately 100,000 people worldwide, with around 45% of its workforce based in Europe.
The Challenge
Bayer was unable to leverage the full potential of its data in some of its most strategic data domains (Supply Chain and Logistics, R&D, Procurement, Sales, and a dozen more). Data was often scattered across different SAP implementations, non-SAP ERPs, and other tools in the IT ecosystem. It was siloed and invisible to other departments, and there was no centralized, robust, and scalable data platform to provide company-wide governance over data quality and access.

The Solution
We led the design and integration of a Databricks Data Lakehouse platform at Bayer, streamlining data workflows, accelerating reporting, and contributing to the modernization of the overall IT environment.
Our custom-built solution enabled the processing of over 15,000 SAP/SAP BW tables every hour, making both SAP and non-SAP data available in one place. With our agile DevOps team, we delivered over 120 data products across key domains, including supply chain, procurement, HR, and compliance. These data products are now available to the entire company.
As the platform matured, we addressed topics such as data governance, security, and cost control: we improved data quality, integrated data catalogs, and ensured secure access in a multi-cloud environment.
We also promoted decentralized data ownership and reusability, ensuring accountability and real business value in a complex, multi-vendor environment.
Benefits
Integrated Data Landscape
Building one Databricks platform for seamless access to SAP and non-SAP data sources.
Efficient Development of Data and AI Use Cases
Accelerating development and deployment of data products and AI applications.
Decentralized Data Quality Ownership
Responsibility for data quality lies closer to the origin of business data.
Centralized Data Governance and Security
Governance and security are managed centrally and dynamically across the platform.
Enhanced Data Exchange and Reusability
Promoting data sharing across different business domains.
Scalable Solutions for the Future
Building scalable data platforms to support long-term business growth and adaptability.
Streamlining Data Architecture
Optimizing the data landscape by removing outdated technologies.