
Introduction
Have you ever gotten into a car and been overwhelmed with an onslaught of warnings, beeps and flashes? Have you ever received so many email notifications that you simply close the application?
This is a realistic scenario when online DGA monitors have overly “cautious” alarm settings and there is no formal process for adjusting them. When one considers an alarm level and the warning associated with it, it is easy to overlook the fact that the alarm is just the start of an investigative and reactive procedure.
Effective alarm settings and strategies can and do save transformers, providing essential early warning and breathing space so engineers can make better decisions about what to do with the asset.
However, we all remember the story about the “boy who cried wolf” – if an engineer or an operator is inundated with alarms, the real alarm will be missed among the noise.
In this article we will unpack a few aspects related to alarm management and aim to provide some inspiration regarding how end users should set their corresponding policies and strategies.
Alarms In the Online DGA Context
In the context of online DGA monitors there are three fundamental components of an alarm management process:
- The configuration of individual alarms.
- The establishment of a formal response process.
- The establishment and configuration of new alarm settings on the monitor.
When first establishing settings on a monitor for a particular transformer:
- Each gas on each monitor needs specific alarms set for gas levels, optionally supplemented by rate-of-change thresholds.
- The gas level thresholds should be contextualized based on the age and application of the transformer fleet.
- IEEE C57.104 and IEC 60599 (and related guidelines) provide a foundation for setting these thresholds, but they are often unsuitable for transformers with a tendency to produce CO, CO2, H2 and C2H6 under fairly normal loading conditions!
- There must be a documented and auditable process for making changes to alarm settings and verifying these settings.
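As a rough illustration of the points above, per-gas settings and an auditable change record could be held in a small data structure. The Python sketch below is only one possible arrangement: the class names, field names and ppm values are hypothetical and are not taken from any monitor, standard or vendor tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class GasAlarmSetting:
    """Caution/alarm levels for one gas on one monitor (values are hypothetical)."""
    gas: str
    caution_ppm: float
    alarm_ppm: float
    roc_ppm_per_day: Optional[float] = None  # optional rate-of-change threshold


@dataclass
class MonitorAlarmConfig:
    """Alarm settings for one transformer, with a record of every change."""
    transformer_id: str
    settings: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def update(self, setting: GasAlarmSetting, changed_by: str, reason: str) -> None:
        # Basic verification step: caution must sit below alarm.
        if setting.caution_ppm >= setting.alarm_ppm:
            raise ValueError("caution level must be below alarm level")
        previous = self.settings.get(setting.gas)
        self.settings[setting.gas] = setting
        # Keep who/when/why so that the change can be audited later.
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "who": changed_by,
            "gas": setting.gas,
            "old": previous,
            "new": setting,
            "reason": reason,
        })
```

In use, every call to `update` leaves an entry in `audit_log`, so a later review can see which thresholds were changed, by whom and why.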
The Alarm Response Process
Equally crucial is the creation of a structured response process. This process should categorise alarms by their severity and priority, laying out clear, actionable steps for various scenarios. Whether it involves planning transformer testing or taking supplementary oil samples, each action must define WHAT is to be done and WHO is responsible.
Response Process:
- Classify alarms by severity.
- Decide who should receive which alarms.
- Define actions for each alarm scenario.
Alarm Classification:
- Operational Alarms: E.g. notify operators of a severe gassing alarm so they can de-load or de-energize the transformer. This could be for a severe increase in acetylene and ethylene, which can indicate the transformer is at serious risk.
- Asset Health Alarms: E.g. notify engineers of a gassing event that warrants investigation. This could include increases in gases such as carbon dioxide, carbon monoxide and ethane, which are usually used for supplementary diagnostics.
However, each scenario may warrant a different strategy. To illustrate, consider the classical power plant control room. In such environments, alarms are linked to the number of transformers and operators focus on power generation; here, even the slightest gassing alarm may be of interest. Conversely, in a transmission system operator (TSO) control room, where redundancy is common, only the most critical alarms warrant attention. Meanwhile, dedicated asset health control rooms, manned by asset managers more than operators, are designed for this very purpose.
Operational Contexts:
- Power Plant: Focus on key transformer alarms.
- TSO Control Room: Limit alarms to critical cases.
- Asset Health Room: Frequent alarms for comprehensive monitoring.
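One way to picture the routing described above is a minimum severity per control room, below which an alarm is not forwarded. The Python sketch below assumes a three-step severity scale and a floor for each room; both are hypothetical choices made for illustration, and a real policy would be set by the end user.

```python
from enum import IntEnum


class Severity(IntEnum):
    CAUTION = 1
    ALARM = 2
    CRITICAL = 3


# Hypothetical minimum severity that each room wants to see.
MIN_SEVERITY = {
    "power_plant": Severity.ALARM,            # key transformer alarms
    "tso_control_room": Severity.CRITICAL,    # critical cases only
    "asset_health_room": Severity.CAUTION,    # everything, for comprehensive monitoring
}


def recipients(severity: Severity) -> list:
    """Return the rooms (sorted by name) that should receive an alarm of this severity."""
    return sorted(room for room, floor in MIN_SEVERITY.items() if severity >= floor)
```

With these assumed floors, a caution reaches only the asset health room, while a critical alarm reaches all three rooms.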
Lastly, the intricacies of multi-gas online DGA require nuanced strategies. Not all gases indicate severe conditions; the severity is contingent on both the gas type and quantity. Moreover, service alarms, which address self-diagnostic issues, must be distinguished from primary asset health alarms. Maintaining independent thresholds for caution and alarm states preserves the integrity of the system, while each alarm triggers specific, actionable responses, ensuring a proactive approach to asset management.
Learning from other industry practices
Turning to more general industrial alarm guidance outside of the DGA world, the EEMUA 191 guidelines are a cornerstone for alarm settings across industrial sectors. They advocate a tiered priority system, recommending four distinct levels: Critical, High, Medium, and Low. The configuration must be precise, with critical alarms capped at fewer than 20 per control room. Any more and an operator is overwhelmed (as a rough guide, a maximum of 6 alarms in total per hour per operator). Meanwhile, the distribution of the other priorities (High at 5%, Medium at 15%, and Low at 80%) ensures a balanced and manageable alarm system. This has implications for the DGA ecosystem, which are discussed in the next section.
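The figures quoted above lend themselves to a simple self-check of a configured alarm list. The Python sketch below uses only the numbers given in this article (fewer than 20 critical alarms, a 5/15/80 split of the remaining priorities, and roughly 6 alarms per hour per operator). The function names and the 5-percentage-point tolerance are assumptions made for illustration and are not part of EEMUA 191.

```python
from collections import Counter

# Figures as quoted in this article.
TARGET_SHARE = {"High": 0.05, "Medium": 0.15, "Low": 0.80}
MAX_CRITICAL = 20          # critical alarms should stay below this count
MAX_ALARMS_PER_HOUR = 6    # rough guide, per operator


def review_configuration(priorities, tolerance=0.05):
    """Return a list of findings for a list of configured alarm priorities."""
    counts = Counter(priorities)
    findings = []
    if counts["Critical"] >= MAX_CRITICAL:
        findings.append(f"Critical: {counts['Critical']} configured, should be fewer than {MAX_CRITICAL}")
    non_critical = sum(counts[p] for p in TARGET_SHARE)
    for priority, target in TARGET_SHARE.items():
        share = counts[priority] / non_critical if non_critical else 0.0
        # 'tolerance' is an assumed allowance, not a value from the guideline.
        if abs(share - target) > tolerance:
            findings.append(f"{priority}: {share:.0%} of non-critical alarms, target {target:.0%}")
    return findings


def alarm_rate_ok(n_alarms, hours, operators=1):
    """Check the observed alarm rate against the rough per-operator guide."""
    return n_alarms / (hours * operators) <= MAX_ALARMS_PER_HOUR
```

An empty list from `review_configuration` means the configured priorities fall within the assumed tolerance of the quoted split.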
Multi-Gas DGA Strategy
An effective monitoring and alarming strategy for multi-gas online DGA requires an appreciation of a few key points:
- Not all the gases represent a severe condition; that is, severity depends on both the gas and its quantity. Just because IEC and IEEE values may exist for all gases does not mean alarms should be set for all gases.
- Service alarms (where the monitor flags self-diagnostic aspects) must not be viewed in the same manner as DGA gas or moisture alarms. A distinction between primary asset health alarms (the transformer) and monitor maintenance alarms (the monitoring device) must therefore be made.
- Keeping caution (high) and alarm (high-high) thresholds independent preserves the severity of an alarm. This benefit is lost if too many alarms are set or if the difference between a caution and an alarm threshold is too small.
- On the device side each gas can be mapped to a caution or alarm, but on the asset management side, each individual gas caution or alarm must be mapped to a specific action. For example, a moisture alarm could trigger an action for moisture trending to estimate paper wetness, whereas an acetylene alarm could trigger both a SCADA alarm and the possibility of preventative de-energization for further investigation.
Table I below provides an example of how each gas can be given independent treatment in the priority and severity.
If using the above philosophy, then increases in CO2 would not cause an upgrade from CAUTION to ALARM state, but increases in the more severe gases would. This is one example of how the alarm settings approach is much more than individual concentration thresholds.
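The idea that some gases may raise a CAUTION but never an ALARM can be sketched by giving each gas a ceiling state. In the Python example below, the threshold values and the choice of gases are hypothetical placeholders and do not reproduce Table I; only the principle (CO2 capped at CAUTION, more severe gases allowed to reach ALARM) comes from the text.

```python
STATES = ["NORMAL", "CAUTION", "ALARM"]  # ordered from least to most severe

# Hypothetical policy: (caution_ppm, alarm_ppm, highest state this gas may raise)
GAS_POLICY = {
    "C2H2": (2.0, 5.0, "ALARM"),
    "H2": (100.0, 200.0, "ALARM"),
    "CO2": (5000.0, 8000.0, "CAUTION"),  # never upgrades the monitor to ALARM
}


def gas_state(gas, ppm):
    """State raised by a single gas reading, limited by that gas's ceiling."""
    caution, alarm, ceiling = GAS_POLICY[gas]
    if ppm >= alarm:
        raw = "ALARM"
    elif ppm >= caution:
        raw = "CAUTION"
    else:
        raw = "NORMAL"
    # Take the less severe of the raw state and the ceiling.
    return min(raw, ceiling, key=STATES.index)


def monitor_state(readings):
    """Overall state is the most severe state raised by any gas."""
    return max((gas_state(g, v) for g, v in readings.items()),
               key=STATES.index, default="NORMAL")
```

With these placeholder values, a very high CO2 reading alone leaves the monitor in CAUTION, whereas an acetylene reading above its alarm level moves it to ALARM.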
Alarm settings themselves
Caution and alarm thresholds must be set based on a minimum of 5-14 days of data so that the immediate levels and trends of the gases are established. Rate-of-change thresholds are best for gassing transformers where a slow and steady production is already known. In other cases, concentration thresholds are normally sufficient. Values in IEC 60599, IEEE C57.104 and related standards are mainly designed from a retrospective monitoring perspective, i.e. receiving a six-monthly or yearly update on DGA levels from lab DGA. In addition, those levels are designed to benchmark a unit, not to provide a specific alarm level.
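To make "establishing the immediate levels and trends" concrete, the Python sketch below summarises a baseline period as a typical level (median) and a trend (least-squares slope in ppm per day). The choice of median and straight-line fit is an assumption made for illustration, as are the function name and the requirement that the data span at least 5 days; the article does not prescribe a specific method.

```python
from statistics import mean, median

MIN_DAYS = 5  # lower end of the 5-14 day baseline period mentioned above


def baseline(days, ppm):
    """Summarise a baseline period as a level and a linear trend.

    days: measurement times in days (e.g. 0.0, 0.5, 1.0, ...)
    ppm:  gas concentration at each time
    """
    if len(days) != len(ppm):
        raise ValueError("days and ppm must have the same length")
    if not days or max(days) - min(days) < MIN_DAYS:
        raise ValueError(f"need at least {MIN_DAYS} days of data")
    d_bar, p_bar = mean(days), mean(ppm)
    sxx = sum((d - d_bar) ** 2 for d in days)
    sxy = sum((d - d_bar) * (p - p_bar) for d, p in zip(days, ppm))
    return {
        "level_ppm": median(ppm),           # typical level over the period
        "trend_ppm_per_day": sxy / sxx,     # least-squares slope
    }
```

The resulting level and trend can then inform where concentration thresholds, and where relevant rate-of-change thresholds, are placed for that particular unit.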
Example: Accumulation of CO & CO2 over time
In the below example, from an online DGA monitor, we see a scenario that has become typical in recent years. Power transformers fitted with a bag in the conservator, and running at temperatures that are high but still within standards, show a steady long-term production of CO and CO2 whilst the other key fault gases remain stable. In this case the oil is an inhibited mineral oil. Whatever caution and alarm levels are set for CO, say between 300 and 500 ppm, they can be expected to be reached. The same situation applies to CO2. This is why using the CO2:CO ratio would be a better choice in this particular scenario, or alternatively setting a rate-of-change alarm that is higher than the current gradient.
If using legacy IEEE C57.104 condition 1 levels for CO (350 ppm) and CO2 (2500 ppm), this unit would already be on alarm for CO2 and would be predicted to reach the CO alarm within months.
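The reasoning in this example can be reproduced with a few lines of arithmetic. The Python sketch below assumes a steady linear gradient; the function names and the 1.5 margin on the rate-of-change setting are illustrative assumptions, and any readings passed in are the user's own (the values in the usage note are hypothetical and are not read from the example plot).

```python
import math


def days_until(current_ppm, gradient_ppm_per_day, threshold_ppm):
    """Days until a fixed threshold is reached, assuming a steady linear gradient."""
    if current_ppm >= threshold_ppm:
        return 0.0            # already at or above the threshold
    if gradient_ppm_per_day <= 0:
        return math.inf       # flat or falling: never reached
    return (threshold_ppm - current_ppm) / gradient_ppm_per_day


def co2_co_ratio(co2_ppm, co_ppm):
    """CO2:CO ratio, which stays meaningful even when both gases creep upwards."""
    if co_ppm <= 0:
        raise ValueError("CO must be positive to form a ratio")
    return co2_ppm / co_ppm


def roc_alarm_setting(current_gradient_ppm_per_day, margin=1.5):
    """Rate-of-change alarm placed above the known gradient (margin is an assumption)."""
    return current_gradient_ppm_per_day * margin
```

For instance, with a hypothetical CO reading of 300 ppm rising at 0.5 ppm per day, `days_until(300, 0.5, 350)` gives 100 days to the 350 ppm level, while `roc_alarm_setting(0.5)` suggests a rate-of-change alarm at 0.75 ppm per day that would stay quiet unless the gassing accelerates.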
Final Thoughts
When employing online monitors, it is often not effective to allow alarms to persist. This is also why the concept of caution (high) and alarm (high-high) is important: cautions are perhaps tools for a constant reminder, but when the state jumps to alarm, action must occur. The purpose of a persisting alarm is to catch the attention of the responsible person so that they peruse the data, make a diagnosis and plan next steps. The alarm settings should be revised thereafter. Quite often the most common causes of nuisance alarms in online DGA are CO, CO2, and TDCG (Total Dissolved Combustible Gas), which includes CO in its calculation. This does not mean these metrics are not useful in diagnostics. On the contrary, the CO2:CO ratio is invaluable for understanding paper involvement in faults. However, leaving a monitor on perpetual alarm due to a general up-tick of CO and CO2 can unfortunately lead to too many alarms and a resulting deafness to alarms.
An exciting prospect is that Machine Learning (ML) models can be trained to take over the responsibility of alarm threshold management – this is an area of current research and the authors look forward to sharing progress in future articles.

Carl Wolmarans is the Analytics Expert for GE VERNOVA’s Monitoring & Diagnostics Product Line, focussing on transformer monitoring. He is an active member of IEC TC 10, TC 14 and CIGRE, and a recipient of the IEC 1906 Award.

Richard Luke is the Senior Product Manager for the KELMAN family of monitoring products in GE VERNOVA’s Monitoring & Diagnostics Product Line with 15 years’ experience within the product family.