Unlocking Excel’s ease of use at Enterprise Scale with Sigma and Databricks

To be competitive in a Data and AI powered market, businesses can no longer afford to rely on outdated, spreadsheet based processes plagued by inefficiencies, errors and scalability challenges. This whitepaper explores how integrating Sigma on Databricks delivers a modern analytics architecture that overcomes these limitations, providing real-time insights, enhanced governance and optimised performance at scale.
By leveraging Databricks’ cloud-native data platform and Sigma’s intuitive, Excel-like UI, organisations can streamline data operations, automate workflows and drive business intelligence with unparalleled performance. This approach ensures live data access, enhanced security and dynamic reporting capabilities, allowing business users to make faster, data driven decisions without IT bottlenecks.
We also highlight a real world case study with ENVU, demonstrating how Cloudaeon transformed their data management with an automated KPI dashboard. This resulted in a 99% reduction in manual data processing, seamless integration across business functions and real-time performance insights that empowered leadership with better decision making.
The adoption of Databricks and Sigma is not just an upgrade - it’s a fundamental shift in how businesses handle data. Organisations that embrace this modern approach will gain a competitive edge through future-proofed operations, without the onboarding challenges of other business intelligence platforms.
Introduction
Data is the driving force behind business success and organisations can no longer afford to rely on outdated, inefficient spreadsheet based processes. Traditional spreadsheets introduce critical challenges like stale data, manual errors, lack of scalability and governance issues that slow down decision making and compromise trust.
A survey conducted by Sapio Research revealed that while 92% of people manipulate their spreadsheet data to make it understandable, 40% of them struggle to make sense of the data in these sheets.
Businesses need a solution that not only enhances data accessibility but also ensures real-time insights, collaboration and enterprise grade controls. Sigma, when integrated with Databricks, delivers a revolutionary approach to data analytics by combining the familiarity of spreadsheets with the scalability and power of cloud based data processing.
This whitepaper explores how this architecture eliminates the limitations of legacy tools, optimises performance and strengthens governance, enabling organisations to access the full potential of their data.
With Sigma and Databricks, businesses gain the agility, accuracy and efficiency needed to thrive in today’s data economy.
Chapter 1: From traditional spreadsheets to Sigma
1.1 Challenges using traditional spreadsheets
Out of date data
A major issue with exporting data to a conventional spreadsheet is that it becomes outdated the moment it’s generated, leading to two key challenges:
Teams end up relying on separate, disconnected datasets, with some being more up to date than others.
Teams must either manually rebuild or update recurring reports, or risk making important decisions based on outdated and potentially inaccurate information.
Cell based errors and selection mistakes
Traditional spreadsheets require users to perform calculations at the individual cell level, making it easy to exclude or include incorrect data accidentally. This is particularly problematic when handling large datasets, where small mistakes can have significant consequences.
Manual data handling and lack of automation
Since calculations and formulas are applied at the cell level rather than to entire datasets, users must manually input and adjust values, increasing the likelihood of errors. This makes it difficult to maintain consistency, especially when working with dynamic data that frequently updates.
Scalability challenges
Traditional spreadsheets struggle when handling large datasets with millions of rows. The manual nature of cell based calculations makes them inefficient and error prone as data volume increases. Performance issues such as lagging, crashing and slow processing are common when spreadsheets reach their limits.
Formula inconsistencies
Because each cell can contain a unique formula, errors can arise when formulas are not uniformly applied across a dataset. Users must double check calculations, which becomes increasingly difficult as spreadsheets grow. This inconsistency can lead to unreliable data analysis and incorrect reporting.
Data integrity risks
Since formulas and calculations depend on individual cell selections, there is a high risk of miscalculations due to human error. Omitting key data, misapplying formulas, or failing to update calculations when new data is added can compromise the accuracy of insights drawn from the spreadsheet.
Sigma addresses these issues by applying formulas at the column level rather than at the individual cell level, ensuring consistency and reducing errors when working with large datasets.
1.2 Spreadsheet challenges solved by Sigma
From stale to live data
Exporting data to traditional spreadsheets quickly becomes outdated, causing:
Disconnected, inconsistent datasets across teams
Manual report updates, increasing the risk of outdated decisions
Sigma solves this by querying live data from your cloud warehouse, ensuring a single source of truth.
Data updates automatically in real-time, eliminating manual exports. Reports update continuously, saving time and ensuring accuracy, while easing collaboration and preventing version control issues.
Ensure calculation accuracy with columns
Sigma’s table interface resembles a spreadsheet but operates differently. Unlike traditional spreadsheets that calculate in individual cells, Sigma applies commands at the column level, ensuring consistency across all rows, even with new data.
1.3 Why does this matter?
Manual cell based calculations increase the risk of selection errors. Sigma prevents such errors by applying formulas to entire columns, ensuring accuracy. Simply add a column and enter a formula; no manual adjustments are needed.
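The difference between cell level and column level formulas can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Sigma's implementation: the `add_column` helper, the `sales` rows and the column names are all hypothetical.

```python
from typing import Callable

Row = dict[str, float]


def add_column(rows: list[Row], name: str, formula: Callable[[Row], float]) -> list[Row]:
    """Return new rows where `name` is computed by the same formula for every row."""
    return [{**row, name: formula(row)} for row in rows]


def profit(row: Row) -> float:
    # One formula, defined once for the whole column.
    return row["revenue"] - row["cost"]


# Hypothetical sales data; the column names are illustrative only.
sales = [
    {"revenue": 120.0, "cost": 80.0},
    {"revenue": 200.0, "cost": 150.0},
]

# A newly arrived row is covered by the same formula, with no range to extend
# and no cell to copy the formula into.
sales.append({"revenue": 50.0, "cost": 20.0})

with_profit = add_column(sales, "profit", profit)
print([row["profit"] for row in with_profit])  # [40.0, 50.0, 30.0]
```

Because the formula belongs to the column rather than to individual cells, there is no selection step in which a row could be accidentally left out.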
Gain deeper insights with groupings
Sigma’s groups feature simplifies comparative analytics by organising data based on shared values in a column. Grouping consolidates rows into single cells, automatically structuring related data and enabling quick aggregate calculations.
For example, when analysing a retail chain’s profitability, you can group profit by month and then by store region to assess performance over time. Unlike flat spreadsheets that rely on pivot tables, Sigma allows multiple levels of grouping for deeper insights. And if you prefer pivot tables, Sigma supports those too.
Get a high level view with aggregates, totals and summaries
Aggregate functions in Sigma summarise data by calculating values across grouped rows, such as summing a column or counting entries per group. These results adjust dynamically based on the selected grouping.
Sigma also offers a summary bar, displaying calculations for an entire column, and group totals, which reflect a column’s total value at the top. Both can be referenced in formulas for high-level analysis.
For example, in a retail profitability analysis, aggregates, totals and summaries help compare store regions over time, simplifying cohort analysis for deeper insights.
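The retail example can be made concrete with a small sketch. It assumes a handful of made-up rows and a hypothetical `group_sum` helper, and it only mirrors the concepts of nested groupings, group aggregates and a column summary; it does not represent how Sigma computes them.

```python
from collections import defaultdict

# Hypothetical retail rows; the values are invented for illustration.
rows = [
    {"month": "2024-01", "region": "North", "profit": 100.0},
    {"month": "2024-01", "region": "South", "profit": 60.0},
    {"month": "2024-01", "region": "North", "profit": 40.0},
    {"month": "2024-02", "region": "North", "profit": 90.0},
]


def group_sum(rows, keys, value):
    """Sum `value` for each distinct combination of the grouping `keys`."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[key] for key in keys)] += row[value]
    return dict(totals)


# First level grouping: profit by month.
by_month = group_sum(rows, ["month"], "profit")

# Nested grouping: profit by month, then by store region.
by_month_region = group_sum(rows, ["month", "region"], "profit")

# Column summary: one figure for the entire profit column.
total_profit = sum(row["profit"] for row in rows)

print(by_month)
print(by_month_region)
print(total_profit)
```

The same aggregate definition is reused at every grouping level, which is why the results adjust automatically when the grouping changes.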
Keep using your favourite spreadsheet formulas
With over 60% of business professionals using Excel for data analysis, Sigma offers a familiar experience with greater flexibility. It supports many of the same formulas, making the transition seamless for spreadsheet users.
Creating calculations is easy: just add a column and enter a formula in the formula bar. Unlike traditional spreadsheets, the formula applies to every row automatically and can be edited anytime, ensuring consistency and accuracy.
1.4 What does Sigma offer?
Quick self service data access
Seamless access to all relevant data
Sigma allows you to join, explore and analyse data from across your organisation without relying on data teams or requesting access from multiple sources. Unlike traditional spreadsheets that require manual data exports and merging, Sigma connects directly to all your data sources in real-time.
Lightning fast insights at scale
By pushing all queries and calculations to the cloud data warehouse, Sigma takes full advantage of near-unlimited computing power. This means you can analyse massive datasets, including hundreds of billions of rows, without performance limitations. An Excel worksheet, by contrast, is capped at 1,048,576 rows, often forcing users to work with summarised or reduced data.
Effortless collaboration and reusability
With Sigma, you can build reports, dashboards and data tables once and they’ll always reflect the latest data. Easily share, duplicate and repurpose analysis without disrupting others’ work, ensuring a seamless and collaborative data experience.
Chapter 2: Databricks to Sigma architecture
Introduction
Businesses dealing with large volumes of data sets need capable data platforms like Databricks.
Databricks’ data lake architecture is designed to handle large volumes of data, making it suitable for dynamic, high-volume environments. It can comfortably accommodate businesses with high frequency engagement metrics and large volumes of multimedia content.
Combining Databricks and Sigma is the best approach for handling large data sets and top notch analytics.
Sigma recommends adopting the Databricks Medallion Architecture as a data design framework for optimised results, because Sigma’s performance depends on data being properly structured in Databricks for downstream use.

This approach establishes different data quality tiers based on how data moves through the platform. Also known as ‘multi hop architecture’, it categorises data into three levels:
Bronze - (Raw or newly ingested data)
Data can be stored in both Delta and Non-Delta tables or objects. Non-Delta objects provide file format flexibility without enforcing a predefined schema. Meanwhile, Delta tables support schema evolution, enabling the storing of structured, semi-structured and unstructured data within the Delta protocol.
Silver - (Transformed, cleaned and normalised data)
Data is typically stored in Delta tables. Delta tables provide data protection at this stage by ensuring ACID transaction guarantees and maintaining consistency and reliability.
Gold - (Curated business level data)
Data should always be stored in Delta tables. This ensures that the data benefits from the Delta Lake protocol, including ACID transactions, enhanced reliability, and optimised processing performance.
By following this architecture, you can streamline your data model, creating a clear and consistent structure for all data sources before analysis and visualisation in Sigma.
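The three tiers can be pictured as successive refinement steps. The sketch below uses plain Python lists in place of Delta tables, so it shows only the flow of data between tiers and none of the Delta Lake guarantees; the records, field names and the `to_silver` and `to_gold` functions are hypothetical.

```python
# Bronze: raw records exactly as ingested (strings, duplicates, bad values).
bronze = [
    {"order_id": "1", "region": " north ", "amount": "100.50"},
    {"order_id": "1", "region": " north ", "amount": "100.50"},  # duplicate
    {"order_id": "2", "region": "South", "amount": "n/a"},       # unparseable amount
    {"order_id": "3", "region": "south", "amount": "40"},
]


def to_silver(records):
    """Clean, type and de-duplicate bronze records."""
    seen_ids = set()
    silver = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        order_id = int(record["order_id"])
        if order_id in seen_ids:
            continue  # skip duplicate orders
        seen_ids.add(order_id)
        silver.append(
            {
                "order_id": order_id,
                "region": record["region"].strip().title(),
                "amount": amount,
            }
        )
    return silver


def to_gold(silver):
    """Curate a business level view: total amount per region."""
    totals = {}
    for record in silver:
        totals[record["region"]] = totals.get(record["region"], 0.0) + record["amount"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'North': 100.5, 'South': 40.0}
```

In this picture, Sigma would read from the gold tier, where the data is already typed, de-duplicated and shaped for reporting.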
Chapter 3: Optimised Performance
Instantly familiar to users
Great product design balances innovation with user expectations.
Business users have spent years working with spreadsheets or mastering complex BI tools, but as data grows and cloud computing takes over, these traditional tools struggle to keep up. The need for fast, intuitive data access remains and users don’t want to learn yet another tool or wait for developers to deliver the insights they need.
Sigma bridges this gap.
With its familiar spreadsheet like interface, users can seamlessly analyse, visualise and collaborate, all while harnessing the power and scale of the cloud. It’s built for the way business works today.

This maximises productivity and ROI by enabling users to create value immediately, rather than first having to be trained on yet another tool.
Optimised SQL and secure data access with Sigma
When you access data in Sigma, it automatically generates optimised SQL tailored to your cloud data warehouse (CDW), ensuring fast and efficient queries.
While most users don’t need to worry about query details, Sigma provides full transparency for data engineers, allowing complete visibility into every query sent to your CDW.

Sigma leverages customer administered connections and the robust security of your CDW, including role based access controls. For added protection, Sigma also supports dedicated connections via Private Links.
Because Sigma queries live data, the insights you receive are always up to date, ensuring you never have to make critical business decisions based on stale information.

High performance data processing at scale
Sigma delivers lightning-fast performance, effortlessly processing billions of rows in seconds. Unlike other systems that simply scale up warehouse compute instances to meet demand, Sigma takes a smarter, more efficient approach.
The specifics of how Sigma optimises query performance depend on business needs, but every customer automatically benefits from its intelligent query execution.
Sigma’s architecture is designed to maximise speed and efficiency ensuring users get rapid, reliable insights without unnecessary computing costs.
Massive scale: Optimised query processing

Here’s how Sigma intelligently handles queries:
Instant Cache Lookup
When you query data, Sigma first checks your local browser cache for the fastest possible response. If the cache is unavailable or expired, the query proceeds to the next step.
Efficient Browser Calculations
Once data is cached, any manipulations such as adding a column to calculate profit are performed locally in your browser. This avoids unnecessary CDW queries and runs instantly, as seen in your workbook’s query history.
Warehouse Results Caching
If multiple users query the same workbook, Sigma checks the CDW’s cache. Since CDWs typically store query results for 24 hours, repeat queries can be served instantly without reprocessing the data.
Precomputed Materialisation for Performance
Some customers choose to pre-calculate and store complex data using materialisation. Sigma intelligently leverages this data, significantly improving report speed and efficiency.
Direct CDW Query as a Last Resort
If none of the above applies, Sigma queries the CDW to retrieve fresh data.
By optimising query execution at every step, Sigma ensures high performance, minimising unnecessary warehouse usage while delivering real-time insights at scale.
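The lookup order described above can be sketched as a simple fallback chain. This is an illustrative model only, not Sigma's actual engine: the `QueryResolver` class and `fake_warehouse` function are hypothetical, local browser calculations are not modelled, and a single object stands in for caches that in practice live in different places.

```python
import time

# Per the text above, CDWs typically keep query results for 24 hours.
WAREHOUSE_CACHE_TTL_SECONDS = 24 * 60 * 60


class QueryResolver:
    """Hypothetical resolver mirroring the lookup order described above."""

    def __init__(self, run_on_warehouse, clock=time.time):
        self.browser_cache = {}    # sql -> result
        self.warehouse_cache = {}  # sql -> (stored_at, result)
        self.materialised = {}     # sql -> precomputed result
        self.run_on_warehouse = run_on_warehouse
        self.clock = clock

    def resolve(self, sql):
        # Fastest path: the local browser cache.
        if sql in self.browser_cache:
            return "browser cache", self.browser_cache[sql]

        entry = self.warehouse_cache.get(sql)
        if entry is not None and self.clock() - entry[0] < WAREHOUSE_CACHE_TTL_SECONDS:
            # A recent identical query is still held in the warehouse cache.
            source, result = "warehouse cache", entry[1]
        elif sql in self.materialised:
            # A precomputed (materialised) result exists.
            source, result = "materialisation", self.materialised[sql]
        else:
            # Last resort: run the query on the warehouse for fresh data.
            source, result = "warehouse query", self.run_on_warehouse(sql)
            self.warehouse_cache[sql] = (self.clock(), result)

        self.browser_cache[sql] = result
        return source, result


warehouse_calls = []


def fake_warehouse(sql):
    """Stand-in for a real warehouse call; records each query it receives."""
    warehouse_calls.append(sql)
    return [("North", 140.0), ("South", 60.0)]


resolver = QueryResolver(fake_warehouse)
query = "SELECT region, SUM(profit) FROM sales GROUP BY region"
first_source, _ = resolver.resolve(query)
second_source, _ = resolver.resolve(query)
print(first_source, "->", second_source)  # warehouse query -> browser cache
```

Only the first request reaches the warehouse; the repeat is answered from cache, which is the effect the steps above aim for.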
Sigma Alpha Query maximises your CDW investment by efficiently handling massive datasets. Not only do users experience a significant boost in performance, but Sigma Alpha Query also continuously reduces the load on your CDW. With Alpha Query, as you add more Sigma users, you'll typically see your cost per user decrease.

Multiple performance levers
Sigma provides unmatched flexibility, allowing each customer to tailor their setup for optimal performance. To build a high performing system, it's important to consider key factors called the ‘performance levers.’
If your system isn't meeting expectations, these levers serve as areas to explore and optimise. Sigma categorises these levers into three groups, each addressing different aspects of performance to ensure seamless data analysis at scale.
Databricks CDW levers
Liquid Clustering
Hive Style Partitioning
Z-Order Indexing
Warehouse Size
Data modeling levers
Avoid Selecting All Columns
Use Tables Not Views
Materialised Models (for complex joins)
Incremental Materialisation (large datasets)
Denormalised Model
Sigma levers
Filter Clause in Inner Selects
Default Limit Clause
Materialised Datasets (for complex joins)
Drive Filters from Dimensions (not Facts)
Default Filters
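Several of these levers come down to the shape of the SQL that reaches the warehouse. The sketch below combines three of them (naming columns explicitly, filtering inside the inner select, and applying a default limit) in a hypothetical `build_query` helper. It is not the SQL Sigma generates, the table and column names are invented, and identifiers are assumed to come from trusted configuration rather than user input.

```python
def build_query(table, columns, filters, limit=1000):
    """Build a parameterised query that applies three of the levers above."""
    if not columns:
        # Lever: avoid selecting all columns.
        raise ValueError("name the required columns explicitly")
    column_list = ", ".join(columns)

    # Lever: put the filter in the inner select so less data flows outward.
    where = ""
    if filters:
        where = " WHERE " + " AND ".join(f"{column} = ?" for column in filters)
    inner = f"SELECT {column_list} FROM {table}{where}"

    # Lever: always apply a default limit to the outer query.
    sql = f"SELECT {column_list} FROM ({inner}) AS filtered LIMIT {int(limit)}"
    return sql, list(filters.values())


sql, params = build_query(
    table="gold.sales",                 # hypothetical table name
    columns=["region", "profit"],
    filters={"month": "2024-01"},
    limit=500,
)
print(sql)
print(params)
```

Values are passed as parameters rather than interpolated into the SQL string, which keeps the example safe to adapt.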
Monitoring
Sigma offers three built-in, free features to help you track system performance and manage costs efficiently, including:
Audit Logs for Transparency
Warehouse Usage Templates
Sigma ensures visibility, cost control and peak performance for your data operations.
Below is an example of the Cost Per Query template.

Collaboration
Traditional ways of sharing analytics, such as emailing spreadsheets and PDFs or meeting in conference rooms, are slow and inefficient. Sigma revolutionises collaboration with Live Edit, enabling multiple users to work on workbook drafts simultaneously.
With Live Edit, all editors share a single live draft, seeing and contributing to updates in real-time, no matter where they are. Sigma even highlights who is working on what, displaying usernames on selected elements.

This seamless, interactive approach keeps everyone aligned, eliminates version confusion and accelerates decision making - true collaboration at its best.
Cloud Reliability
The cloud has proven its reliability across businesses of all sizes, making on-premise data centres a thing of the past. While there have been occasional outages, they haven't slowed the rapid adoption of cloud services.

Sigma partners with all major cloud providers like Azure and GCP, as well as leading data platforms like Databricks. It ensures that the systems are operational, redundant, and scalable to meet daily customer demands.
Chapter 4: Enhanced governance and audit control
Overview
Sigma has made substantial investments in this area, implementing industry leading security practices to protect your data. It includes features such as immutable hosts, strict container validation and advanced threat detection to ensure robust protection at every level.
To provide transparency, Sigma’s Trust Center offers customers access to the latest security reports and certifications, including ISO certifications and SOC 2 reports.
Sigma has dedicated significant time and resources to achieving compliance with the following standards:

In addition to a secure architecture, Sigma ensures that customer data never leaves the cloud data warehouse (CDW), maintaining security, governance and compliance. Users interact within a familiar interface, eliminating the need for Excel exports, which enhances security while reducing spreadsheet related errors.
Sigma temporarily retains essential information to support user functionality. However, customer data is only held briefly and is promptly flushed from disk or cache once delivery is complete.
Extended storage applies only to the workbook and system configuration details. Items within the ‘Extended Storage’ red box persist as long as the Sigma instance remains active or until users delete them, while all other data remains short lived.

1. Authentication
Users accessing Sigma via a browser can do so using several methods:
Basic authentication: email and password
Sigma supports regular and guest users
Multi-factor authentication is available
SSO/SAML Single Sign-on using any compatible identity management provider
The SAML protocol enables access to Sigma using a single set of login credentials.
It works by passing authentication information between the identity provider (e.g. OneLogin, Auth0, Okta, Azure, Google SAML, Ping) and Sigma.
Sigma supports any SAML provider that uses SAML 2.0
SSO/OAuth
The "Open Authorisation Protocol" provides secure delegated access to applications with tokens, not passwords.
Sigma also supports the key pair authentication method (public key + private key) for Snowflake connections.
2. Authorisation
Authorisation is the act of permitting someone (or an account) to have access to something.
In Sigma, this refers to (at a high level):
Content
Workbooks | |
Folders | A storage container for Sigma content. |
Teams | A group of users who require common permissions |
Workspaces | These are used to categorise and share folders and documents. Then they can be shared with users and/or teams using permission grants. |
Data
Whether the connection to the CDW, and its related databases, schemas, tables and so forth, is accessible.
Row-level security
Column-level security
Feature(s)
For example, not allowing export of data. Sigma's role based access system provides granular control over what a user is permitted to do.
In Sigma, users are assigned an "Account Type", either Lite, Essential or Pro.
A custom option is also available and permissions of any account type can be adjusted, within the parameters of the license type.
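Row-level and column-level security can be illustrated with a short sketch. It is a conceptual model only and says nothing about how Sigma or the CDW enforce these rules; the `apply_security` function, the user record and the rule definitions are hypothetical.

```python
def apply_security(rows, user, row_rule, hidden_columns):
    """Return only the rows and columns this user is permitted to see."""
    hidden = hidden_columns.get(user["team"], set())
    visible = []
    for row in rows:
        if not row_rule(user, row):
            continue  # row-level security: drop rows the user may not see
        # Column-level security: strip columns hidden from the user's team.
        visible.append({key: value for key, value in row.items() if key not in hidden})
    return visible


def same_region(user, row):
    """Hypothetical row rule: users see only their own region."""
    return row["region"] == user["region"]


# Hypothetical data and permissions.
rows = [
    {"region": "North", "revenue": 120.0, "salary_cost": 30.0},
    {"region": "South", "revenue": 200.0, "salary_cost": 55.0},
]
analyst = {"name": "analyst", "team": "sales", "region": "North"}
hidden_columns = {"sales": {"salary_cost"}}

result = apply_security(rows, analyst, same_region, hidden_columns)
print(result)  # [{'region': 'North', 'revenue': 120.0}]
```

The analyst sees one of the two rows and two of the three columns, while the underlying data is left unchanged.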
Chapter 5: ENVU Case Study
Case Study: Transforming ENVU with automated KPI dashboard
Client overview
ENVU, a newly established entity spun off from Bayer, required a robust, scalable and automated data management system to transition into full independence within three years. Existing manual processes created inefficiencies, data inconsistencies and a lack of real-time visibility.
Challenges
Fragmented data: Each region tracked sales, inventory and approvals using separate Excel sheets, not connected to the central sheet.
Manual processes: Frequent errors and delays in updating regional and central trackers.
Poor visibility: Leadership lacked real-time insights for informed decision making.
Solution
Cloudaeon designed and implemented a fully automated data workflow to enable real-time KPI dashboards, delivering:
End to end automation: Eliminated manual data updates, ensuring real-time accuracy.
Scalable architecture: Replaced outdated Excel trackers with an integrated data platform.
Live performance insights: Empowered leadership with instant access to key metrics.
Bi-directional data sync: Ensured seamless updates across all regional and central trackers.
Technology stack
Azure Data Factory (ADF): Automated data extraction from SharePoint Excel files.
Databricks: Processed and transformed complex datasets for real-time insights.
Sigma: Provided a dynamic visualisation layer with advanced KPI dashboards and reports.
Write-back functionality: Enabled direct data updates, eliminating redundancies.
Business impact
99% reduction in manual data processing
97% elimination of manual Excel tracking
95% reduction in FTE cost
Instant, real-time decision making with accurate insights
Seamless data consistency across all regional and group operations
Summary
Cloudaeon revolutionised ENVU’s data processing, enabling unmatched efficiency, accuracy, and performance. This transformation positions ENVU for long term success, ensuring a seamless transition to full independence and empowering leadership with unparalleled real-time insights.
ENVU now operates with confidence and efficiency with a modern approach to data.
Conclusion & Recommendations
The limitations of traditional spreadsheet based processes (manual errors, stale data, lack of scalability and governance risks) make them unsuitable for today’s enterprises.
As businesses scale, the need for real-time insights, security and performance becomes critical. The integration of Databricks and Sigma offers a modern, scalable alternative, combining cloud powered data processing with the familiar ease of use of Excel.
By moving Excel based workflows off the desktop and into a Databricks/Sigma architecture, organisations can:
Eliminate manual errors: Automate calculations and ensure data consistency with column level processing in Sigma.
Enable real-time decisions: Query live data directly from Databricks, ensuring reports and dashboards always reflect the latest insights.
Improve collaboration and governance: Centralise data access, enforce security policies and eliminate spreadsheet version control issues.
Enhance scalability and performance: Seamlessly analyse billions of rows without hitting performance limits.
Reduce operational costs: Avoid inefficiencies caused by data duplication, fragmented reporting and spreadsheet induced bottlenecks.
Why Cloudaeon is your ideal partner for modernisation
Transitioning from static, error prone Excel workflows to a powerful, enterprise scale analytics solution requires expertise, precision and a deep understanding of both cloud platforms and business intelligence. Cloudaeon is uniquely positioned to lead this transformation, bringing a proven track record in Databricks architecture, data engineering and analytics modernisation.
As Sigma’s first partner aligned to Databricks integration, we specialise in seamlessly migrating spreadsheet driven processes into a scalable, cloud native environment, ensuring minimal disruption, maximum efficiency and rapid adoption.
How Cloudaeon enables a seamless transition
End to end modernisation strategy: We assess your existing Excel based workflows to identify inefficiencies and design a Databricks/Sigma solution that optimises performance, security and scale.
Seamless Databricks integration: As a leading Databricks partner, we ingest, transform and structure your data to enable high performance, real-time analytics with enterprise governance.
Sigma implementation & adoption: We bridge the gap between traditional Excel users and cloud based analytics, ensuring intuitive, spreadsheet like functionality while removing the limitations of outdated tools.
Security & Governance at scale: Our governance frameworks establish role-based access, automated workflows and compliance controls, safeguarding your data while enhancing collaboration.
Ongoing optimisation & support: Cloudaeon provides continuous monitoring, performance tuning and user enablement, ensuring long term success with scalable analytics that grow with your business.
Unlock the full potential of your data
Moving Excel off the desktop and into Databricks and Sigma isn’t just an upgrade - it’s a business transformation. Cloudaeon ensures your organisation maximises efficiency, eliminates data silos and gains real-time insights for faster, smarter decisions.
With our deep expertise in Databricks and Sigma, we empower businesses to embrace modern analytics with confidence, making spreadsheets obsolete and turning data into a competitive advantage.