Re-Architecting Alarm Management for Global Manufacturing at Scale

Challenges
As Cabot scaled globally, its alarm management systems became fragmented, with siloed data, limited analytics, and no enterprise-wide standardisation. Legacy tools nearing end-of-support and high-cost subscription platforms increased reliability risks and operational costs. The core issue was architectural: a lack of centralisation, governance gaps, and poor scalability, which limited visibility, control, and cross-plant performance analysis.
Outcome
Improved reliability with ~25–35% fewer failed jobs and ~30% lower MTTR, while optimising infrastructure to achieve ~15–25% cost savings. Enabled teams to shift from operational support to higher-value initiatives and innovation.
Cabot Corporation, established in 1882, is an American speciality chemicals and performance materials company headquartered in Boston. It operates 40+ manufacturing plants globally, spanning Asia Pacific, Europe, the Middle East & Africa, and North and South America. With over 140 years of innovation in reinforcing carbons, battery materials, engineered elastomer composites, conductive compounds and aerogels, Cabot’s operations are deeply rooted in precision, safety and process reliability.
This case study explores how Cabot transitioned from a fragmented, tool-dependent alarm management setup to a centralised, enterprise-grade platform. Faced with rising costs, limited visibility and growing architectural constraints, the organisation moved beyond incremental fixes to implement a unified solution designed for standardisation and scalability.
Client Problem
As Cabot’s manufacturing operations expanded across geographies, its alarm management ecosystem struggled to keep pace with enterprise-level demands. What was once a set of functional, plant-level solutions began to create challenges at scale.
Tooling Constraints
DeltaV Analyze, a third-party tool, was nearing the end of its support. This introduced long-term reliability and sustainability risks.
AgileOps operated as a third-party, subscription-based platform with high recurring licensing costs.
The existing tools were not designed for enterprise-wide standardisation or analytics at this scale.
Critical Functional Gaps
The system lacked any capability for cross-plant benchmarking, which prevented comparative performance analysis across sites.
Alarms could not be structured across equipment hierarchy, unit, asset or area, limiting contextual and operational analysis.
There was no centralised platform for alarm data analysis.
The absence of enterprise-wide standardisation resulted in inconsistent alarm management practices across sites.
Over time, these gaps moved beyond technical inconvenience and began directly impacting operations. Dependency on tooling nearing end-of-support emerged as a sustained long-term risk, while subscription-based licensing models increased costs across plants. At this stage, the challenge was no longer about the tools; the real issue was architectural.
Root Cause Analysis
Given these challenges, our experts conducted a thorough root cause analysis and identified the following:
Architecture-level limitations: At the architectural level, alarm data remained isolated, with no centralised ingestion or storage framework to unify it. This resulted in siloed systems that operated independently, preventing any form of enterprise-wide analytics or consolidated visibility across sites.
Governance gaps: From a governance perspective, there was no alignment with ISA 18.2 standards across plants, resulting in a lack of repeatable, auditable alarm management processes. In addition, the absence of consistent KPI definitions across sites created confusion in how alarm performance was measured and managed.
Performance bottlenecks: The legacy systems in place were not designed to support enterprise-scale analytics, making it difficult to handle growing data volumes and complexity. As a result, there was no capability to perform comparative analysis across plants.
Reliability issues: Reliance on third-party platforms created dependencies that limited control, while the overall architecture lacked the flexibility required to scale and adapt to future needs.
Cost drivers: The cost structure of the existing setup added another layer of strain. AgileOps carried high licensing costs under a subscription-based pricing model. In addition, the fragmented nature of the systems introduced operational inefficiencies, further increasing the total cost of ownership.
Solution Architecture
To address these challenges, the solution focused on a fundamental architectural shift. The aim was to move away from isolated plant-level systems toward a unified enterprise platform.
The approach was to build a centralised, in-house alarm and event management platform using Azure Data Explorer (ADX) and RTI.
Data flow:
DeltaV → OPC A&E (Pro+) → AspenTech Inmation → Event Hub → Azure Data Explorer (ADX) → Analytics Layer → Microsoft Fabric (KQL DB) → Power BI (Direct Query)
This architecture was designed not just to replace tools, but to enable enterprise-wide capabilities:
Centralised alarm data across all plants
Standardised ISA 18.2 metrics
Automated reporting
Equipment hierarchy-based analysis
Cross-plant benchmarking
Scalable and future-proof architecture
Elimination of dependency on DeltaV Analyze and AgileOps
In-house dashboard replacing third-party tools
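To make the idea of hierarchy-based analysis concrete, the following is a minimal sketch of how a normalised alarm record might carry its equipment context. It is illustrative only: the `AlarmEvent` class, its field names and the sample tags are assumptions, not the schema used on the Cabot platform.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AlarmEvent:
    """Hypothetical normalised alarm record; every field name is an assumption."""
    plant: str
    area: str
    unit: str
    asset: str
    priority: str
    timestamp: datetime


def count_by_level(events, level):
    """Count alarms grouped by one hierarchy level, e.g. 'plant', 'area' or 'unit'."""
    return Counter(getattr(event, level) for event in events)


# Made-up sample data, purely for illustration.
events = [
    AlarmEvent("Plant-A", "Reactor", "U-100", "PT-101", "High", datetime(2024, 1, 1, 8, 0)),
    AlarmEvent("Plant-A", "Reactor", "U-100", "TT-102", "Low", datetime(2024, 1, 1, 8, 5)),
    AlarmEvent("Plant-B", "Utilities", "U-200", "FT-201", "High", datetime(2024, 1, 1, 9, 0)),
]

by_plant = count_by_level(events, "plant")
by_area = count_by_level(events, "area")
```

Because every record carries the same hierarchy fields, the same grouping function can serve a plant-level view and an area-level view without plant-specific logic, which is the property that cross-plant benchmarking depends on.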
Analytics & Dashboard Layer with Power BI
The aim was to make insights accessible and actionable across the enterprise. The Alarm and Events Monitoring Dashboard was built using Direct Query from Microsoft Fabric (KQL DB). At the enterprise level, it gives a complete view of alarm performance across all plants. It shows average alarms per day and per hour, alarm categories, and total alarms and events. This helps teams track trends and compare performance across sites. A bookmark view is also available, where all data can be seen in a table format for deeper analysis and access to detailed values.
At the plant level, the dashboard provides a quick snapshot of performance using primary and secondary KPIs. It also breaks down alarms by level, category and node. This makes it easier to spot patterns and identify areas that need attention. At a more detailed level, the dashboard looks at equipment-specific performance. It shows alarms by area and priority, along with operator actions by node. It also highlights the top 10 alarm sources and contributors. This helps teams quickly identify root causes and focus on the most critical issues.
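The top-contributor view can be approximated with a simple frequency ranking. The sketch below is not the dashboard's actual query; the function name `top_alarm_sources` and the sample tags are hypothetical, and the share calculation is one plausible way to express each source's contribution.

```python
from collections import Counter


def top_alarm_sources(sources, n=10):
    """Return (source, count, share_percent) for the n most frequent alarm sources."""
    counts = Counter(sources)
    total = sum(counts.values())
    return [
        (source, count, round(100 * count / total, 1))
        for source, count in counts.most_common(n)
    ]


# Made-up sample: ten alarms raised by three hypothetical tags.
sample_sources = ["PT-101"] * 5 + ["TT-102"] * 3 + ["FT-201"] * 2
top_two = top_alarm_sources(sample_sources, n=2)
```

Reporting the share alongside the raw count shows how concentrated the alarm load is, which helps decide whether fixing a handful of sources would materially reduce the total.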
How We Delivered
Once the architecture was defined, execution focused on systematically replacing legacy dependencies while ensuring continuity and validation across all plants.
Platform Changes
A centralised ADX-based alarm management platform was built to unify data across all plants.
DeltaV Analyze was replaced to eliminate dependency on a tool nearing end-of-support.
AgileOps was replaced to remove reliance on a high-cost, subscription-based third-party platform.
Microsoft Fabric was integrated to enable analytics consumption through a unified layer.
Tooling Decisions
Each component in the architecture was selected for a clearly defined role:
Azure Data Explorer (ADX) was used to enable scalable event analytics across large volumes of alarm data.
RTI was used as the integration layer to connect systems across the architecture.
AspenTech Inmation was used to orchestrate data flow from source systems.
Event Hub was used to support real-time streaming ingestion of alarm data.
Power BI was used to deliver visualisation and reporting capabilities.
Automation Introduced
To ensure consistency and scalability, automation was embedded across the pipeline:
Alarm data ingestion from DeltaV systems was fully automated to eliminate manual intervention.
KPI generation aligned with ISA 18.2 standards was automated to ensure consistency across plants.
Enterprise-level reporting was automated to improve efficiency and reduce manual effort.
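As an illustration of automated KPI generation, the sketch below derives average alarms per day and per hour for each plant from raw event timestamps and then ranks the plants. It assumes events arrive as `(plant, timestamp)` pairs and, as a simplification, averages only over days on which a plant recorded at least one alarm. The function name and sample data are hypothetical, and no ISA 18.2 target values are applied because the source does not state the thresholds used.

```python
from collections import defaultdict
from datetime import datetime


def alarm_rate_kpis(events):
    """Compute per-plant average alarms per day and per hour.

    events: iterable of (plant, timestamp) pairs.
    Simplification: days with zero alarms are not counted in the average.
    """
    daily = defaultdict(lambda: defaultdict(int))
    for plant, timestamp in events:
        daily[plant][timestamp.date()] += 1

    kpis = {}
    for plant, days in daily.items():
        avg_per_day = sum(days.values()) / len(days)
        kpis[plant] = {"avg_per_day": avg_per_day, "avg_per_hour": avg_per_day / 24}
    return kpis


# Made-up sample: Plant-A has 3 alarms on day one and 1 on day two; Plant-B has 6 on day one.
sample = (
    [("Plant-A", datetime(2024, 1, 1, hour)) for hour in (8, 9, 10)]
    + [("Plant-A", datetime(2024, 1, 2, 8))]
    + [("Plant-B", datetime(2024, 1, 1, hour)) for hour in range(6)]
)
kpis = alarm_rate_kpis(sample)

# Rank plants from lowest to highest alarm load for a simple cross-plant comparison.
ranking = sorted(kpis, key=lambda plant: kpis[plant]["avg_per_day"])
```

Applying one definition to every plant is what makes the comparison meaningful: if each site computed its own daily average differently, the ranking would reflect the method rather than the alarm load.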
Testing & Validation
The solution was rigorously validated to ensure both accuracy and reliability:
End-to-end data flow across all plants was validated to confirm complete and accurate data movement.
KPIs such as alarms per day, alarms per hour and category distribution were validated for correctness.
Cross-plant benchmarking capabilities were validated to ensure consistent comparisons across sites.
Dashboard outputs were validated for visual accuracy, drill-down functionality and tabular data consistency.
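One way to check end-to-end data movement is to reconcile alarm counts at the source against counts in the central platform. The sketch below shows that idea under stated assumptions: the per-plant counts are supplied as plain dictionaries, and `reconcile_counts` and its `tolerance` parameter are hypothetical names, not part of the delivered solution.

```python
def reconcile_counts(source_counts, platform_counts, tolerance=0):
    """Return plants whose platform count differs from the source count by more than tolerance."""
    mismatches = {}
    for plant in sorted(set(source_counts) | set(platform_counts)):
        expected = source_counts.get(plant, 0)
        actual = platform_counts.get(plant, 0)
        if abs(expected - actual) > tolerance:
            mismatches[plant] = {"expected": expected, "actual": actual}
    return mismatches


# Made-up counts for one validation window.
source = {"Plant-A": 100, "Plant-B": 50}
platform = {"Plant-A": 100, "Plant-B": 48}

mismatches = reconcile_counts(source, platform)
within_tolerance = reconcile_counts(source, platform, tolerance=2)
```

An empty result means every plant reconciled within the allowed tolerance; any entry points to a plant whose ingestion path needs investigation before its KPIs can be trusted.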
Technology Stack
DeltaV
OPC A&E (Pro+)
AspenTech Inmation
Event Hub
Azure Data Explorer (ADX)
RTI
Microsoft Fabric (KQL DB)
Power BI
Outcome
Improved platform reliability with ~25–35% reduction in failed jobs.
Reduced mean time to resolution (MTTR) by ~30% through proactive monitoring and structured incident management.
Optimised infrastructure and workload usage, delivering ~15–25% cost savings without impacting performance.
Enabled internal teams to shift focus from operational support to higher-value initiatives and innovation.
POD & Managed Ops Transition
Cloudaeon’s engagement with Cabot Corporation followed a structured transition model: Solution → POD → Managed Operations
Solution Phase: Built a unified, centralised alarm management platform and established an ISA 18.2-aligned framework.
POD Phase: Executed a structured rollout across multiple plants with standardised ingestion and analytics implementation.
Managed Ops Phase: Transitioned to continuous monitoring and support, with ongoing optimisation of analytics and reporting and sustained governance and standardisation.
Conclusion
This transformation highlights the importance of combining strong architectural thinking with deep execution expertise. By leading the shift from fragmented, plant-level systems to a centralised, ISA 18.2-aligned platform, Cloudaeon enabled Cabot to not only address immediate challenges around cost, risk, and visibility, but also establish a scalable and governed system for alarm management across its global operations. It is this ability to move beyond tools and design for enterprise-wide standardisation and long-term sustainability that defines the real impact of Cloudaeon’s approach.
