Real-Time Data Driving M&S Marketing Success

Challenges
Marks & Spencer’s platform struggled with inconsistent data governance and batch-driven reporting. As a result, teams faced manual reconciliation, unreliable segmentation, and limited visibility into the full consent and membership lifecycle.
Outcome
The platform reliably supported 6,000–10,000 membership transactions per day while maintaining validated compliance and masking controls. The MVP was successfully delivered within three months, meeting programme launch timelines.
Solution
Databricks Modernisation
Summary: Real-Time Data Driving M&S Marketing Success
M&S Baby Club's offering required a scalable and compliant analytics platform capable of preserving the full consent history with enforced masking at scale. We engineered a layered Azure + Databricks architecture with deterministic Type 2 state modelling, operating at 6,000–10,000 daily transactions.
Client Problem
When Marks & Spencer launched Baby Club as an MVP loyalty programme for parents, the business goal was clear: deliver personalised engagement while enrolling customers into Spark.
The technical reality was less straightforward.
The platform needed to capture sensitive customer and baby attributes, support a two-year membership lifecycle, and preserve every consent transition (opt-in, opt-out, and re-join) without ambiguity.
However:
Reporting operated on stale, batch-driven data.
Expiry rules and mandatory fields were inconsistently enforced.
Masking requirements risked breaking downstream analytics.
Operationally, reporting teams reconciled manually, governance risk increased, and CRM segmentation lacked a reliable historical context.
Root Cause Analysis
Cloudaeon's engineers analysed the platform to trace these issues to their root cause. The problem was not tooling but the way data and governance were managed.
Customer interactions were processed as daily facts instead of structured state transitions. Consent history lacked Type 2 modelling discipline (effective dates, surrogate keys, current flags). Governance controls were treated as secondary rather than embedded in the conform layer. Without deterministic modelling, compliance and analytics inevitably conflict.
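To make the Type 2 discipline concrete, the following is a minimal sketch in plain Python rather than the Databricks implementation used on the project. The row layout, the status values and the `apply_transition` helper are illustrative assumptions. Each consent change closes the current row and appends a new one, so no transition is overwritten.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from itertools import count
from typing import List, Optional


@dataclass(frozen=True)
class ConsentRow:
    surrogate_key: int                 # unique per row version, not per member
    member_id: str
    consent_status: str                # assumed values: "opt_in", "opt_out"
    effective_from: datetime
    effective_to: Optional[datetime]   # None while the row is current
    is_current: bool


# Hypothetical surrogate key generator; a real platform would use its own.
_key_seq = count(1)


def apply_transition(history: List[ConsentRow], member_id: str,
                     status: str, at: datetime) -> List[ConsentRow]:
    """Return the history with one consent transition applied.

    Assumes events arrive in time order per member.
    """
    current = next(
        (r for r in history if r.member_id == member_id and r.is_current),
        None,
    )
    # A repeated status is not a state change, so no new row is written.
    if current is not None and current.consent_status == status:
        return history

    # Close the current row instead of updating it in place.
    updated = [
        replace(r, effective_to=at, is_current=False) if r is current else r
        for r in history
    ]
    # Open a new current row with a fresh surrogate key.
    updated.append(
        ConsentRow(next(_key_seq), member_id, status, at, None, True)
    )
    return updated
```

Applying an opt-in, an opt-out and a re-join for the same member yields three rows with contiguous effective dates, of which only the last is flagged current.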
Solution Architecture
We implemented a three-layer pattern: Ingestion → Conform → Analytics, with governance embedded right from the beginning.
Ingestion Layer
Customer interaction events ingested via Kafka for high-throughput, ordered streaming.
Raw events landed immutably in Azure Blob Storage for replay and auditability.
Conform Layer (Databricks)
Azure Databricks enforced validation, standardisation and identity resolution.
Implemented Type 2 modelling for membership and consent history, preserving every state transition.
Masking logic embedded as deterministic transformation: upon opt-out, personal attributes were replaced with placeholders while retaining analytical integrity.
Governance was not an afterthought but was encoded into state transformation rules.
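The masking rule in the conform layer can be pictured as a pure function of the record. This is a hedged sketch only: the field names, the `MASKED` placeholder and the `mask_if_opted_out` helper are assumptions for illustration, not the project's actual schema or code.

```python
from typing import Any, Dict

# Hypothetical list of personal attributes; the real schema is not published.
PERSONAL_FIELDS = ("first_name", "email", "baby_name", "baby_due_date")
PLACEHOLDER = "MASKED"


def mask_if_opted_out(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record with personal fields masked after opt-out.

    The same input always produces the same output, and non-personal
    columns (keys, status, dates) are left intact for analytics.
    """
    masked = dict(record)  # never mutate the caller's record
    if masked.get("consent_status") != "opt_out":
        return masked
    for field in PERSONAL_FIELDS:
        if field in masked:
            masked[field] = PLACEHOLDER
    return masked
```

Because the member key and consent status survive masking, counts and lifecycle reporting still work on opted-out members while their personal attributes are no longer exposed.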
Analytics Layer
Curated datasets published to Azure Synapse for enterprise reporting.
Azure Data Factory formalised promotion into analytics.
Apache Airflow orchestrated end-to-end workflows with dependency control and publish gating.
How We Delivered (Step-by-Step Engineering)
Cloudaeon did not attempt to implement the solution all at once. Instead, we took a step-by-step engineering approach so that each issue was addressed in turn.
We began by defining explicit contract boundaries between the raw, conform and analytics layers. Membership lifecycle rules, including mandatory fields and expiry logic, were enforced centrally in Databricks.
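As an illustration of what centrally enforced lifecycle rules might look like, here is a small Python sketch. The mandatory field list is hypothetical, and the rule that membership expires exactly two years after the join date is an assumption drawn from the two-year lifecycle mentioned earlier, not a confirmed business rule.

```python
from datetime import date
from typing import Any, Dict, List

# Hypothetical mandatory fields for a membership record.
MANDATORY_FIELDS = ("member_id", "joined_on", "consent_status")
MEMBERSHIP_YEARS = 2  # assumed: expiry falls two years after joining


def missing_mandatory(record: Dict[str, Any]) -> List[str]:
    """List mandatory fields that are absent, None or empty."""
    return [f for f in MANDATORY_FIELDS if record.get(f) in (None, "")]


def expiry_date(joined_on: date) -> date:
    """Return the assumed expiry date for a membership."""
    target_year = joined_on.year + MEMBERSHIP_YEARS
    try:
        return joined_on.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year;
        # rolling back to 28 February is an assumed convention.
        return joined_on.replace(year=target_year, day=28)


def is_active(joined_on: date, as_of: date) -> bool:
    """A membership is active strictly before its expiry date."""
    return as_of < expiry_date(joined_on)
```

Keeping these rules in one place means every pipeline evaluates expiry and completeness the same way, including edge cases such as a leap-day join date.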
Kafka decoupled ingestion from compute. Databricks ensured deterministic Type 2 behaviour. Synapse was well aligned with enterprise reporting standards.
Airflow DAGs enforced validation-before-publish logic. ADF pipelines standardised dataset promotion.
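The validation-before-publish idea can be sketched without any orchestrator. The example below is plain Python, not Airflow or ADF code, and the check names, row fields and `run_publish_gate` helper are assumptions for illustration. In an Airflow DAG the same effect would typically come from a validation task placed upstream of the publish task.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Check = Callable[[List[Row]], bool]


def one_current_row_per_member(rows: List[Row]) -> bool:
    """Type 2 invariant: a member has at most one current row."""
    current = [r["member_id"] for r in rows if r["is_current"]]
    return len(current) == len(set(current))


def opted_out_rows_are_masked(rows: List[Row]) -> bool:
    """Masking invariant: opt-out rows carry the placeholder, not the email."""
    return all(
        r["email"] == "MASKED"
        for r in rows
        if r["consent_status"] == "opt_out"
    )


# Hypothetical set of gate checks, keyed by name for reporting.
CHECKS: Dict[str, Check] = {
    "one_current_row_per_member": one_current_row_per_member,
    "opted_out_rows_are_masked": opted_out_rows_are_masked,
}


def run_publish_gate(rows: List[Row], checks: Dict[str, Check],
                     publish: Callable[[List[Row]], None]
                     ) -> Tuple[bool, List[str]]:
    """Publish only if every check passes; otherwise report the failures."""
    failed = [name for name, check in checks.items() if not check(rows)]
    if failed:
        return False, failed   # nothing reaches the analytics layer
    publish(rows)
    return True, []
```

A batch containing an unmasked opt-out row is rejected as a whole, so downstream reporting never sees a partially compliant dataset.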
We executed 8–9 end-to-end scenarios, including opt-in/opt-out sequences, masking verification and expiry edge cases, thereby proving compliance through testable state transitions.
Technology Stack:
Kafka
Azure Databricks
Azure Blob Storage
Apache Airflow
Azure Data Factory
Azure Synapse (Synapse Server)
Outcomes
Sustained 6,000–10,000 membership transactions per day.
8–9 validated end-to-end test scenarios, including masking compliance.
MVP was deployed within three months, aligned to programme launch constraints.
POD & Managed Ops Transition
The engagement evolved through structured phases:
Solution: Established the ingestion-to-analytics architecture with embedded consent history and masking.
POD: Dedicated squad managed schema evolution, lifecycle rule changes, and campaign extensions without breaking governance guarantees.
Managed Ops: Operationalised monitoring for pipeline health, publish integrity and masking validation, ensuring long-term reliability as Baby Club scaled.
