
Cut Retail Data Onboarding Time by 60% Using Prophecy Low-Code Pipelines

Retail companies today are drowning in supplier data. Every new vendor brings their own format, their own quirks, and their own headaches. For one of our clients, a major logistics leader, managing data from over 10 suppliers had become a full-time nightmare for their engineering team.

The numbers were stark: each new supplier took weeks to onboard, engineers were writing the same transformation logic repeatedly, and pipeline failures were becoming increasingly common. With retail seasons demanding rapid supplier additions, something had to change.

In this insight, we'll walk through how we transformed their data onboarding process using Prophecy's low-code platform on Databricks, cutting development time by 60% whilst making the entire process accessible to non-Spark experts.

Author

Nikhil Mohod

I'm a Data Engineer with 8 years of experience specialising in the Azure data ecosystem. I design and implement scalable data pipelines, lakes and ETL/ELT solutions using tools like ADF, Airflow, Databricks, Synapse and SQL Server. Focused on building high-quality, secure, and optimised cloud data architecture.



Introduction


The retail landscape has shifted dramatically. Companies need to onboard new suppliers faster than ever to stay competitive, especially during peak seasons or supply chain disruptions. Traditional hand-coded approaches simply can't keep pace.


The business impact is significant: delayed supplier onboarding means missed revenue opportunities, manual data processes increase operational costs, and inconsistent data formats create reporting nightmares that affect critical business decisions. For our client, slow onboarding was directly impacting their ability to expand their supplier network and respond to market demands.


Solution


We took a fundamentally different approach to solving their supplier onboarding challenge. Instead of writing custom Spark code for each new supplier, we built a standardised, reusable framework using Prophecy's low-code platform running on Databricks.


The solution combined three key components:


  • Prophecy Platform: Visual pipeline designer with AI-powered development assistance.

  • Databricks Runtime: Scalable compute engine with Delta Lake storage.

  • Unity Catalog: Centralised governance and metadata management.


This architecture allowed us to maintain enterprise-grade performance whilst dramatically simplifying the development process for non-experts.


Tools & Technologies Used


  • Prophecy: Visual low-code pipeline builder with AI-powered development accelerator.

  • Databricks: Compute engine, Delta Lake, Unity Catalog, job scheduling via Workflows.

  • Git Integration: Source control through GitHub for CI/CD workflows.

  • Unity Catalog: Metadata management, RBAC, and governance.

  • Microsoft Fabric: Downstream consumption and Power BI visualisation.


Step-by-Step Implementation Walkthrough


Step 1: Designing Visual Pipelines Without Code


When a new supplier like "Acme Traders" needed onboarding, our team opened Prophecy's visual designer instead of firing up an IDE. We simply dragged components like Read Dataset, Filter Rows, and Join Datasets onto the canvas.



Each component handled real Spark operations behind the scenes, but the team could focus entirely on business logic rather than syntax. For example, filtering active products became as simple as configuring a visual filter component rather than writing complex PySpark code.


The visual approach proved especially valuable for business analysts who understood the data requirements but weren't comfortable writing Spark transformations.


Step 2: Leveraging AI-Powered Development Assistance


Prophecy's AI Co-Pilot became our secret weapon for rapid development. When mapping columns from supplier feeds, the AI would automatically suggest transformation logic based on the schema patterns it recognised.


Key AI features that accelerated our work:


  • Natural language to code: Describing transformations in plain English generated matching PySpark automatically.



  • Schema-aware suggestions: Column mappings and data type conversions were recommended based on input schemas.

  • Auto-generated unit tests: Synthetic test data and validation logic were created for each transformation.

  • Real-time feedback: Unused variables and type mismatches were flagged instantly, like having a smart IDE built in.


Step 3: Building Modular, Reusable Components


Rather than building monolithic pipelines, we created modular transformation blocks for common operations like data cleansing and field standardisation. These modules lived in a shared Git repository, allowing reuse across different supplier onboarding projects.


For instance, our "cleanse_product_feed" module worked perfectly for Acme Traders and could be immediately applied to other suppliers with similar data patterns. This modular approach eliminated duplicate code and ensured consistent processing logic across all suppliers.


Step 4: Seamless Databricks Execution and Scaling


Once pipelines were designed visually in Prophecy, the platform compiled everything into clean PySpark code and executed it on Databricks clusters. We configured job parameters for node types, autoscaling rules, and cluster pools to optimise performance.


Multiple supplier ingestions ran in parallel using Databricks Workflows, with built-in retry logic to handle unreliable source files. Whether processing 10 GB or 10 TB of supplier data, Databricks handled the scaling automatically.
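A job definition of this shape, expressed as the Python dict you would submit to the Databricks Jobs API, can be sketched as follows. The job name, node type, and all values are hypothetical placeholders, not the client's actual configuration.

```python
# Hypothetical Databricks Jobs settings for one supplier ingestion task.
job_settings = {
    "name": "ingest_acme_traders",
    "max_concurrent_runs": 4,  # allow several supplier feeds to run in parallel
    "tasks": [
        {
            "task_key": "ingest",
            "max_retries": 3,                    # retry flaky source-file reads
            "min_retry_interval_millis": 60_000,  # wait a minute between retries
            "new_cluster": {
                "node_type_id": "Standard_DS3_v2",
                "autoscale": {"min_workers": 2, "max_workers": 8},
            },
        }
    ],
}
```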


Step 5: Governance and Monitoring Through Unity Catalog


Every dataset produced was automatically registered in Unity Catalog, where we applied granular access controls through workspace groups. Schema evolution rules handled changes in supplier data formats gracefully, preventing pipeline breaks when suppliers updated their feeds.


Databricks job logs combined with Prophecy's lineage tracking gave us complete visibility into pipeline execution. This proved invaluable during audits and when troubleshooting data quality issues.


Impact


The transformation delivered concrete, measurable improvements:


  • 60% reduction in pipeline development time: Visual design and AI assistance eliminated weeks of manual coding for each new supplier.


  • 70% reduction in hand-written PySpark code: Auto-generated, modular code improved consistency whilst reducing maintenance overhead.


  • Improved pipeline reliability: Auto-generated tests, schema validation, and modular structure created more robust, production-ready pipelines.


  • Faster team onboarding: Business analysts could now contribute to pipeline development without learning complex Spark programming.


  • Enhanced scalability: The standardised framework could handle new suppliers with minimal custom development effort.


Conclusion


Our experience with Prophecy and Databricks proved that low-code doesn't mean low capability. By combining visual development with enterprise-grade execution, we created a supplier onboarding process that was both fast and robust.


The key to success was recognising that not every data engineering challenge requires hand coded solutions. Sometimes, the smartest approach is to abstract away complexity whilst maintaining the power and flexibility your business demands.


For organisations struggling with similar supplier onboarding challenges, this approach offers a clear path forward: standardise common patterns, leverage AI for acceleration, and let visual tools handle the complexity whilst your team focuses on business value. Want to see how this could work for you? Talk to an expert now.

Don't forget to download or share this with your colleagues and help your organisation navigate these trends.
