Testing Emergency Stop Systems

This entry is part of 11 in the series Emergency Stop

I’ve had a number of questions from readers about testing emergency stop systems, and particularly about the frequency of testing. I addressed the types of tests that might be needed in another article covering Checking Emergency Stop Systems. This article focuses on the frequency of testing rather than the types of tests.

The Problem

Emergency stop systems are considered to be “complementary protective measures” in key machinery safety standards like ISO 12100 [1] and CSA Z432 [2]; this makes emergency stop systems the backup to the primary safeguards. Complementary protective measures are intended to permit “avoiding or limiting the harm” that may result from an emergent situation. By definition, this is a situation that has not been foreseen by the machine builder, or is the result of another failure. This could be a failure of another safeguarding system, or a failure in the machine that is not controlled by other means. For example, a workpiece shatters due to a material flaw, and the broken pieces damage the machine, creating new, uncontrolled failure conditions in the machine.

Emergency stop systems are manually triggered, and usually infrequently used. The lack of use means that functional testing of the system doesn’t happen in the normal course of operation of the machinery. Some types of faults may occur and remain undetected until the system is actually used, e.g., contact blocks falling off the back of the operator device. Failure at that point may be catastrophic, since by implication the primary safeguards have already failed, and thus the failure of the backup eliminates the possibility of avoiding or limiting harm.

To understand the testing requirements, it’s important to understand the risk and reliability requirements that drive the design of emergency stop systems, and then get into the test frequency question.

Requirements

In the past, there were no explicit requirements for emergency stop system reliability. Details like the colour of the operator device, or the way the stop function worked, were defined in ISO 13850 [3], NFPA 79 [4], and IEC 60204-1 [5]. In the soon-to-be-published 3rd edition of ISO 13850, a new provision requiring emergency stop systems to meet at least PLc will be added [6], but until publication, it is up to the designer to determine the required safety integrity level, either PL or SIL.

To determine the requirements for any safety function, the key is to start at the risk assessment. The risk assessment process requires that the designer understand the stage in the life cycle of the machine, the task(s) that will be done, and the specific hazards that a worker may be exposed to while conducting the task. This can become quite complex when considering maintenance and service tasks, and also applies to foreseeable failure modes of the machinery or the process. The scoring or ranking of risk can be accomplished using any suitable risk scoring tool that meets the minimum requirements in [1]. There are some good examples given in ISO/TR 14121-2 [7] if you are looking for some guidance. There are many good engineering textbooks available as well. Have a look at our Book List for some suggestions if you want a deeper dive.

Reliability

Once the initial unmitigated risk is understood, risk control measures can be specified. Wherever the control system is used as part of the risk control measure, a safety function must be specified. Specification of the safety function includes the Performance Level (PL), architectural category (B, 1-4), Mean Time to Dangerous Failure (MTTFd), and Diagnostic Coverage (DC) [6], or Safety Integrity Level (SIL), and Hardware Fault Tolerance (HFT), as described in IEC 62061 [8], as a minimum. If you are unfamiliar with these terms, see the definitions at the end of the article.

Referring to Figure 1, the “Risk Graph” [6, Annex A], we can reasonably state that for most machinery, a failure mode or emergent condition is likely to cause an injury that requires more than basic first aid, so selecting “S2” is the first step. In these situations, and particularly where the failure modes are not well understood, the highest level of severity of injury, S2, is selected because we don’t have enough information to expect that the injuries would only be minor. As soon as we make this selection, it is no longer possible to select any combination of Frequency or Probability parameters that will result in anything lower than PLc.

It’s important to understand that Figure 1 is not a risk assessment tool, but rather a decision tree used to select an appropriate PL based on the relevant risk parameters. Those parameters are:

Table 1 – Risk Parameters

Severity of injury | Frequency and/or exposure to hazard | Possibility of avoiding hazard or limiting harm
S1 – slight (normally reversible injury) | F1 – seldom-to-less-often and/or exposure time is short | P1 – possible under specific conditions
S2 – serious (normally irreversible injury or death) | F2 – frequent-to-continuous and/or exposure time is long | P2 – scarcely possible

Decision tree used to determine PL based on risk parameters.
Figure 1 – “Risk Graph” for determining PL
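The decision tree can also be written as a lookup table, which makes the point about S2 easy to check. This is a minimal sketch, assuming the mapping follows the risk graph in [6, Annex A]; the names RISK_GRAPH and required_pl are illustrative and do not come from the standard.

```python
# Risk graph written as a lookup table.
# Keys are (severity, frequency/exposure, possibility of avoidance).
# Assumption: the mapping mirrors the risk graph in ISO 13849-1, Annex A.
RISK_GRAPH = {
    ("S1", "F1", "P1"): "a",
    ("S1", "F1", "P2"): "b",
    ("S1", "F2", "P1"): "b",
    ("S1", "F2", "P2"): "c",
    ("S2", "F1", "P1"): "c",
    ("S2", "F1", "P2"): "d",
    ("S2", "F2", "P1"): "d",
    ("S2", "F2", "P2"): "e",
}


def required_pl(severity: str, frequency: str, avoidance: str) -> str:
    """Return the required performance level for one hazard scenario."""
    return RISK_GRAPH[(severity, frequency, avoidance)]


# Collect every outcome that is reachable once S2 has been selected.
s2_results = sorted({pl for (s, _, _), pl in RISK_GRAPH.items() if s == "S2"})
print(s2_results)  # ['c', 'd', 'e']
```

Whatever is chosen for the F and P parameters, the S2 branch never ends below PLc.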

PLc can be accomplished using any of three architectures: Category 1, 2, or 3. If you are unsure about what these architectures represent, have a look at my series covering this topic.

Category 1 is single channel, and does not include any diagnostics. A single fault can cause the loss of the safety function (i.e., the machine still runs even though the e-stop button is pressed). Using Category 1, the reliability of the design is based on the use of highly reliable components and well-tried safety principles. This approach can fail to danger.

Category 2 adds some diagnostic capability to the basic single channel configuration, and does not require the use of “well-tried” components. This approach can also fail to danger.

Category 3 architecture adds a redundant channel, and includes diagnostic coverage. Category 3 is not subject to failure due to single faults and is called “single-fault tolerant”. This approach is less likely to fail to danger, but still can in the presence of multiple, undetected, faults.

A key concept in reliability is the “fault”. This can be any kind of defect in hardware or software that results in unwanted behaviour or a failure. Faults are further broken down into dangerous and safe faults, meaning those that result in a dangerous outcome, and those that do not. Finally, each of these classes is broken down into detectable and undetectable faults. I’m not going to get into the mathematical treatment of these classes, but my point is this: there are undetectable dangerous faults. These are faults that cannot be detected by built-in diagnostics. As designers, we try to design the control system so that the undetectable dangerous faults are extremely rare; ideally, the probability should be much less than once in the lifetime of the machine.

What is the lifetime of the machine? The standards writers have settled on a default lifetime of 20 years, thus the answer is that undetectable dangerous failures should happen much less than once in twenty years of 24/7/365 operation. So why does this matter? Each architectural category has different requirements for testing. The test rates are driven by the “Demand Rate”. The Demand Rate is defined in [6]. “SRP/CS” stands for “Safety Related Part of the Control System” in the definition:

3.1.30
demand rate (rd) – frequency of demands for a safety-related action of the SRP/CS

Each time the emergency stop button is pressed, a “demand” is put on the system. Looking at the “Simplified Procedure for estimating PL”, [6, 4.5.4], we find that the standard makes the following assumptions:

  • mission time, 20 years (see Clause 10);
  • constant failure rates within the mission time;
  • for category 2, demand rate <= 1/100 test rate;
  • for category 2, MTTFd,TE larger than half of MTTFd,L.

NOTE When blocks of each channel cannot be separated, the following can be applied: MTTFd of the summarized test channel (TE, OTE) larger than half MTTFd of the summarized functional channel (I, L, O).

So what does all that mean? The 20-year mission time is the assumed lifetime of the machinery. This number underpins the rest of the calculations in the standard, and is based on the idea that few modern control systems last longer than 20 years without being replaced or rebuilt. The constant failure rate points at the idea that systems used in the field will have components and controls that are not subject to infant mortality, nor are they old enough to start to fail due to age, but rather that the system is operating in the flat portion of the standardized failure rate “bathtub curve”, [9]. See Figure 2. Components that are subject to infant mortality failed at the factory and were removed from the supply chain. Those failing from “wear-out” are expected to reach that point after 20 years. If this is not the case, then the maintenance instructions for the system should include preventative maintenance tasks that require replacing critical components before they reach the predicted MTTFd.
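To put a number on the constant failure rate assumption, here is a minimal sketch for a single channel with no diagnostics. It assumes the usual exponential model, where the failure rate is 1/MTTFd; the function name and the example values are mine, not the standard's.

```python
import math


def probability_of_dangerous_failure(mttfd_years, mission_time_years=20):
    """P(at least one dangerous failure) for one channel, constant failure rate.

    Assumption: exponential model, failure rate = 1 / MTTFd, no diagnostics.
    """
    failure_rate = 1.0 / mttfd_years
    return 1.0 - math.exp(-failure_rate * mission_time_years)


for mttfd in (30, 100):
    print(mttfd, round(probability_of_dangerous_failure(mttfd), 2))
# 30 0.49
# 100 0.18
```

Even at the top of the HIGH range, a single undiagnosed channel has a real chance of failing dangerously within 20 years, which is why testing matters.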

Diagram of a standardized bathtub-shaped failure rate curve.
Figure 2 – Weibull Bathtub Curve [9]

For systems using Category 2 architecture, the automatic diagnostic test rate must be at least 100x the demand rate. Keep in mind that this test rate is normally accomplished automatically in the design of the controls, and is only related to the detectable safe or dangerous faults. Undetectable faults must have a probability of less than once in 20 years, and should be detected by the “proof test”. More on that a bit later.

Finally, the MTTFd of the diagnostic (test) channel must be more than half that of the functional channel.
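The two Category 2 assumptions from the list above can be checked with a few lines of code. This is a sketch only: the function name, the parameter names and the example figures are hypothetical, and both rates must be expressed in the same unit.

```python
def category2_assumptions_met(demand_rate, test_rate, mttfd_test, mttfd_functional):
    """Check the two Category 2 assumptions of the simplified procedure.

    demand_rate and test_rate share one unit (e.g. per year);
    the MTTFd values are in years.
    """
    test_rate_ok = demand_rate <= test_rate / 100       # demand rate <= 1/100 test rate
    mttfd_ok = mttfd_test > mttfd_functional / 2         # MTTFd,TE > half of MTTFd,L
    return test_rate_ok and mttfd_ok


# Hypothetical machine: diagnostics run once per shift, 3 shifts x 365 days.
tests_per_year = 3 * 365
print(category2_assumptions_met(12, tests_per_year, 40, 60))  # False: 12 > 10.95
print(category2_assumptions_met(2, tests_per_year, 40, 60))   # True
```

In the first case the button is pressed monthly, which is too often for a once-per-shift diagnostic test to satisfy the 100:1 ratio.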

Category 1 has no diagnostics, so there is no guidance in [6] to help us out with these systems. Category 3 is single fault tolerant, so as long as we don’t have multiple undetected faults we can count on the system to function and to alert us when a single fault occurs; remember that the automatic tests may not be able to detect every fault. This is where the “proof test” comes in. What is a proof test? To find a definition for proof test, we have to look at IEC 61508-4 [10]:

3.8.5
proof test
periodic test performed to detect failures in a safety-related system so that, if necessary, the system can be restored to an “as new” condition or as close as practical to this condition

NOTE – The effectiveness of the proof test will be dependent upon how close to the “as new” condition the system is restored. For the proof test to be fully effective, it will be necessary to detect 100 % of all dangerous failures. Although in practice 100 % is not easily achieved for other than low-complexity E/E/PE safety-related systems, this should be the target. As a minimum, all the safety functions which are executed are checked according to the E/E/PES safety requirements specification. If separate channels are used, these tests are done for each channel separately.

The 20-year life cycle assumption used in the standards also applies to proof testing. Machine controls are assumed to get at least one proof test in their lifetime. The proof test should be designed to detect faults that the automatic diagnostics cannot detect. Proof tests are also conducted after major rebuilds and repairs to ensure that the system operates correctly.

If you know the architecture of the emergency stop control system, you can determine the test rate based on the demand rate. It would be considerably easier if the standards just gave us some minimum test rates for the various architectures. One standard, ISO 14119 [11], on interlocks, does just that. Admittedly, this standard does not include emergency stop functions within its scope, as its focus is on interlocks, but since interlocking systems are more critical than the complementary protective measures that back them up, it would be reasonable to apply these same rules. Looking at the clause on Assessment of Faults, [11, 8.2], we find this guidance:

For applications using interlocking devices with automatic monitoring to achieve the necessary diagnostic coverage for the required safety performance, a functional test (see IEC 60204-1:2005, 9.4.2.4) can be carried out every time the device changes its state, e.g. at every access. If, in such a case, there is only infrequent access, the interlocking device shall be used with additional measures, because between consecutive functional tests the probability of occurrence of an undetected fault is increased.

When a manual functional test is necessary to detect a possible accumulation of faults, it shall be made within the following test intervals:

  • at least every month for PL e with Category 3 or Category 4 (according to ISO 13849-1) or SIL 3 with HFT (hardware fault tolerance) = 1 (according to IEC 62061);
  • at least every 12 months for PL d with Category 3 (according to ISO 13849-1) or SIL 2 with HFT (hardware fault tolerance) = 1 (according to IEC 62061).

NOTE It is recommended that the control system of a machine demands these tests at the required intervals e.g. by visual display unit or signal lamp. The control system should monitor the tests and stop the machine if the test is omitted or fails.

In the preceding, HFT=1 is equivalent to saying that the system is single-fault tolerant.
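The two intervals quoted above are simple enough to express as a lookup. The sketch below covers only the PL and Category wording of the quoted clause; the function name is illustrative, and it returns None wherever the quoted text gives no figure.

```python
from typing import Optional


def manual_test_interval_months(pl: str, category: int) -> Optional[int]:
    """Longest interval between manual functional tests, in months.

    Covers only the two cases in the quoted text; returns None otherwise.
    """
    if pl == "e" and category in (3, 4):
        return 1       # at least every month
    if pl == "d" and category == 3:
        return 12      # at least every 12 months
    return None        # no interval given in the quoted clause


print(manual_test_interval_months("e", 4))  # 1
print(manual_test_interval_months("d", 3))  # 12
print(manual_test_interval_months("c", 1))  # None
```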

This leaves us then with recommended test frequencies for Category 2 and 3 architectures in PLc, PLd, and PLe, or for SIL 2 and 3 with HFT=1. We still don’t have a test frequency for PLc, Category 1 systems. There is no explicit guidance for these systems in the standards. How can we determine a test rate for these systems?

My approach would be to start by examining the MTTFd values for all of the subsystems and components. [6] requires that the system have a HIGH MTTFd value, meaning 30 years <= MTTFd <= 100 years [6, Table 5]. If this is the case, then the once-in-20-years proof test is theoretically enough. If the system is constructed, for example, as shown in Figure 3 below, then each of the four components would have to have an MTTFd > 120 years for the channel to reach 30 years. See [6, Annex C] for this calculation.

Basic Stop/Start Circuit
Figure 3 – Basic Stop/Start Circuit

PB1 – Emergency Stop Button

PB2 – Power “ON” Button

MCR – Master Control Relay

MOV – Surge Suppressor on MCR Coil

M1 – Machine prime-mover (motor)

Note that the fuses are not included, since they can only fail to safety, and assuming that they were specified correctly in the original design, are not subject to the same cyclical aging effects as the other components.

M1 is not included, since it is the controlled portion of the machine and is not part of the control system.
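The 120-year figure for the basic stop/start circuit comes from combining the component values in series. Here is a minimal sketch of that arithmetic, assuming the parts count approach, in which the channel failure rate is the sum of the component failure rates; the function name is illustrative.

```python
def series_mttfd(component_mttfd_years):
    """Channel MTTFd from component MTTFd values: 1/MTTFd = sum(1/MTTFd_i)."""
    return 1.0 / sum(1.0 / m for m in component_mttfd_years)


# PB1, PB2, MCR and MOV, each assumed to have an MTTFd of 120 years.
channel = series_mttfd([120, 120, 120, 120])
print(round(channel, 1))  # 30.0
```

Four components at 120 years each only just reach the bottom of the HIGH range, and one weak component pulls the whole channel below it.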

If a review of the components in the system shows that any single component falls below the target MTTFd, then I would consider replacing the system with a higher category design. Since most of these components will be unlikely to have MTTFd values on the spec sheet, you will likely have to convert from total life values (B10). This is outside the scope of this article, but you can find guidance in [6, Annex C]. More frequent testing, i.e., more than once in 20 years, is always acceptable.
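For readers who want a feel for the conversion, here is a hedged sketch. It assumes the relationship MTTFd = B10d / (0.1 x n_op), where n_op is the mean number of operations per year. It also assumes you already have a B10d value; where only B10 is published, a common assumption is that half of the failures are dangerous, giving B10d = 2 x B10. The relay data below is hypothetical.

```python
def mttfd_from_b10d(b10d_cycles, days_per_year, hours_per_day, cycle_time_s):
    """MTTFd in years from B10d, using MTTFd = B10d / (0.1 * n_op)."""
    # n_op: mean number of operations per year
    n_op = days_per_year * hours_per_day * 3600 / cycle_time_s
    return b10d_cycles / (0.1 * n_op)


# Hypothetical relay: B10d = 400,000 cycles, switched once an hour,
# 16 hours a day, 220 days a year.
print(round(mttfd_from_b10d(400_000, 220, 16, 3600)))  # 1136
```

The result depends heavily on how often the component actually switches, so use realistic duty figures for your own machine.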

Where manual testing is required as part of the design for any category of system, and particularly in Category 1 or 2 systems, the control system should alert the user to the requirement and not permit the machine to operate until the test is completed. This will help to ensure that the requisite tests are properly completed.

Need more information? Leave a comment below, or send me an email with the details of your application!

Definitions

3.1.9 [8]
functional safety

part of the overall safety relating to the EUC and the EUC control system which depends on the correct functioning of the E/E/PE safety-related systems, other technology safety-related systems and external risk reduction facilities

3.2.6 [8]
electrical/electronic/programmable electronic (E/E/PE)
based on electrical (E) and/or electronic (E) and/or programmable electronic (PE) technology

NOTE – The term is intended to cover any and all devices or systems operating on electrical principles.
EXAMPLE Electrical/electronic/programmable electronic devices include

  • electromechanical devices (electrical);
  • solid-state non-programmable electronic devices (electronic);
  • electronic devices based on computer technology (programmable electronic); see 3.2.5

3.5.1 [8]
safety function
function to be implemented by an E/E/PE safety-related system, other technology safety related system or external risk reduction facilities, which is intended to achieve or maintain a safe state for the EUC, in respect of a specific hazardous event (see 3.4.1)

3.5.2 [8]
safety integrity
probability of a safety-related system satisfactorily performing the required safety functions under all the stated conditions within a stated period of time

NOTE 1 – The higher the level of safety integrity of the safety-related systems, the lower the probability that the safety-related systems will fail to carry out the required safety functions.
NOTE 2 – There are four levels of safety integrity for systems (see 3.5.6).

3.5.6 [8]
safety integrity level (SIL)
discrete level (one out of a possible four) for specifying the safety integrity requirements of the safety functions to be allocated to the E/E/PE safety-related systems, where safety integrity level 4 has the highest level of safety integrity and safety integrity level 1 has the lowest

NOTE – The target failure measures (see 3.5.13) for the four safety integrity levels are specified in tables 2 and 3 of IEC 61508-1.

3.6.3 [8]
fault tolerance
ability of a functional unit to continue to perform a required function in the presence of faults or errors

NOTE – The definition in IEV 191-15-05 refers only to sub-item faults. See the note for the term fault in 3.6.1.
[ISO/IEC 2382-14-04-061]

3.1.1 [6]
safety–related part of a control system (SRP/CS)
part of a control system that responds to safety-related input signals and generates safety-related output signals

NOTE 1 The combined safety-related parts of a control system start at the point where the safety-related input signals are initiated (including, for example, the actuating cam and the roller of the position switch) and end at the output of the power control elements (including, for example, the main contacts of a contactor).
NOTE 2 If monitoring systems are used for diagnostics, they are also considered as SRP/CS.

3.1.2 [6]
category
classification of the safety-related parts of a control system in respect of their resistance to faults and their subsequent behaviour in the fault condition, and which is achieved by the structural arrangement of the parts, fault detection and/or by their reliability

3.1.3 [6]
fault
state of an item characterized by the inability to perform a required function, excluding the inability during preventive maintenance or other planned actions, or due to lack of external resources

NOTE 1 A fault is often the result of a failure of the item itself, but may exist without prior failure.
[IEC 60050-191:1990, 05-01]
NOTE 2 In this part of ISO 13849, “fault” means random fault.

3.1.4 [6]
failure
termination of the ability of an item to perform a required function

NOTE 1 After a failure, the item has a fault.
NOTE 2 “Failure” is an event, as distinguished from “fault”, which is a state.
NOTE 3 The concept as defined does not apply to items consisting of software only.
[IEC 60050–191:1990, 04-01]
NOTE 4 Failures which only affect the availability of the process under control are outside of the scope of this part of ISO 13849.

3.1.5 [6]
dangerous failure
failure which has the potential to put the SRP/CS in a hazardous or fail-to-function state

NOTE 1 Whether or not the potential is realized can depend on the channel architecture of the system; in redundant systems a dangerous hardware failure is less likely to lead to the overall dangerous or fail-to-function state.
NOTE 2 Adapted from IEC 61508-4:1998, definition 3.6.7.

3.1.20 [6]
safety function
function of the machine whose failure can result in an immediate increase of the risk(s)
[ISO 12100-1:2003, 3.28]

3.1.21 [6]
monitoring
safety function which ensures that a protective measure is initiated if the ability of a component or an element to perform its function is diminished or if the process conditions are changed in such a way that a decrease of the amount of risk reduction is generated

3.1.22 [6]
programmable electronic system (PES)
system for control, protection or monitoring dependent for its operation on one or more programmable electronic devices, including all elements of the system such as power supplies, sensors and other input devices, contactors and other output devices

NOTE Adapted from IEC 61508-4:1998, definition 3.3.2.

3.1.23 [6]
performance level (PL)
discrete level used to specify the ability of safety-related parts of control systems to perform a safety function under foreseeable conditions

NOTE See 4.5.1.

3.1.25 [6]
mean time to dangerous failure (MTTFd)
expectation of the mean time to dangerous failure

NOTE Adapted from IEC 62061:2005, definition 3.2.34.

3.1.26 [6]
diagnostic coverage (DC)
measure of the effectiveness of diagnostics, which may be determined as the ratio between the failure rate of detected dangerous failures and the failure rate of total dangerous failures

NOTE 1 Diagnostic coverage can exist for the whole or parts of a safety-related system. For example, diagnostic coverage could exist for sensors and/or logic system and/or final elements.
NOTE 2 Adapted from IEC 61508-4:1998, definition 3.8.6.

3.1.33 [6]
safety integrity level (SIL)
discrete level (one out of a possible four) for specifying the safety integrity requirements of the safety functions to be allocated to the E/E/PE safety-related systems, where safety integrity level 4 has the highest level of safety integrity and safety integrity level 1 has the lowest

[IEC 61508-4:1998, 3.5.6]

Acknowledgements

Thanks to my colleagues Derek Jones and Jonathan Johnson, both from Rockwell Automation, and members of ISO TC199. Their suggestion to reference ISO 14119 clause 8.2 was the seed for this article.

I’d also like to acknowledge Ronald Sykes, Howard Touski, Mirela Moga, Michael Roland, and Grant Rider for asking the questions that led to this article.

References

[1] Safety of machinery – General principles for design – Risk assessment and risk reduction. ISO 12100. International Organization for Standardization (ISO). Geneva. 2010.

[2] Safeguarding of Machinery. CSA Z432. Canadian Standards Association. Toronto. 2004.

[3] Safety of machinery – Emergency stop – Principles for design. ISO 13850. International Organization for Standardization (ISO). Geneva. 2006.

[4] Electrical Standard for Industrial Machinery. NFPA 79. National Fire Protection Association (NFPA). Batterymarch Park. 2015.

[5] Safety of machinery – Electrical equipment of machines – Part 1: General requirements. IEC 60204-1. International Electrotechnical Commission (IEC). Geneva. 2009.

[6] Safety of machinery – Safety-related parts of control systems – Part 1: General principles for design. ISO 13849-1. International Organization for Standardization (ISO). Geneva. 2006.

[7] Safety of machinery – Risk assessment – Part 2: Practical guidance and examples of methods. ISO/TR 14121-2. International Organization for Standardization (ISO). Geneva. 2012.

[8] Safety of machinery – Functional safety of safety-related electrical, electronic and programmable electronic control systems. IEC 62061. International Electrotechnical Commission (IEC). Geneva. 2005.

[9] D. J. Wilkins (2002, November). “The Bathtub Curve and Product Failure Behavior. Part One – The Bathtub Curve, Infant Mortality and Burn-in”. Reliability Hotline [Online]. Available: http://www.weibull.com/hotwire/issue21/hottopics21.htm. [Accessed: 26-Apr-2015].

[10] Functional safety of electrical/electronic/programmable electronic safety-related systems – Part 4: Definitions and abbreviations. IEC 61508-4. International Electrotechnical Commission (IEC). Geneva. 1998.

[11] Safety of machinery – Interlocking devices associated with guards – Principles for design and selection. ISO 14119. International Organization for Standardization (ISO). Geneva. 2013.

Sources for Standards

CANADA

Canadian Standards Association sells CSA, ISO and IEC standards to the Canadian Market.

USA

NSSN: National Standards Search Engine powered by ANSI offers standards from most US Standards Development Organizations. They also sell ISO and IEC standards into the US market.

International

International Organization for Standardization (ISO).

International Electrotechnical Commission (IEC).

Emergency Stop – What’s so confusing about that?

This entry is part 1 of 11 in the series Emergency Stop

I get a lot of calls and emails asking about emergency stops. This is one of those deceptively simple concepts that has managed to get very complicated over time. Not every machine needs or can benefit from an emergency stop. In some cases, it may lead to an unreasonable expectation of safety from the user, which can lead to injury if they don’t understand the hazards involved. Some product-specific standards mandate the requirement for emergency stop, such as CSA Z434-03, where robot controllers are required to provide emergency stop functionality and work cells integrating robots are also required to have emergency stop capability.

Defining Emergency Stop

Old, non-compliant, E-Stop Button
This OLD button is definitely non-compliant.

So what is an Emergency Stop, or e-stop, and when do you need to have one? Let’s look at a few definitions taken from CSA Z432-04:

Emergency situation — an immediately hazardous situation that needs to be ended or averted quickly in order to prevent injury or damage.

Emergency stop — a function that is intended to avert harm or to reduce existing hazards to persons, machinery, or work in progress.

Emergency stop button — a red mushroom-headed button that, when activated, will immediately start the emergency stop sequence.

and one more:

6.2.3.5.3 Complementary protective measures
Following the risk assessment, the measures in this clause either shall be applied to the machine or shall be dealt with in the information for use.

Protective measures that are neither inherently safe design measures, nor safeguarding (implementation of guards and/or protective devices), nor information for use may have to be implemented as required by the intended use and the reasonably foreseeable misuse of the machine. Such measures shall include, but not be limited to,

a) emergency stop;

b) means of rescue of trapped persons; and

c) means of energy isolation and dissipation.

Modern, non-compliant e-stop button.
This more modern button is non-compliant due to the RED background and spring-return button.

So, an e-stop is a system that is intended for use in Emergency conditions to try to limit or avert harm to someone or something. It isn’t a safeguard, but is considered to be a Complementary Protective Measure. In terms of the Hierarchy of Controls, emergency stop systems fall into the same level as Personal Protective Equipment like safety glasses, safety boots and hearing protection. So far so good.

Is an Emergency Stop Required?

Depending on the regulations and the standards you choose to read, machinery may not be required to have an Emergency Stop. Quoting from CSA Z432-04:

6.2.5.2.1 Components and elements to achieve the emergency stop function
If, following a risk assessment, it is determined that in order to achieve adequate risk reduction under emergency circumstances a machine must be fitted with components and elements necessary to achieve an emergency stop function so that actual or impending emergency situations can be controlled, the following requirements shall apply:

a) The actuators shall be clearly identifiable, clearly visible, and readily accessible.

b) The hazardous process shall be stopped as quickly as possible without creating additional hazards.
If this is not possible or the risk cannot be adequately reduced, this may indicate that an emergency stop function may not be the best solution (i.e., other solutions should be sought). (Bolding added for emphasis – DN)

c) The emergency stop control shall trigger or permit the triggering of certain safeguard movements where necessary.

Later in CSA Z432-04 we find clause 7.17.1.2:

Each operator control station, including pendants, capable of initiating machine motion shall have a manually initiated emergency stop device.

To my knowledge, this is the only general level machinery standard that makes this requirement. Product family standards often make specific requirements, based on the opinion of the Technical Committee responsible for the standard and their knowledge of the specific type of machinery covered by their document.

Note: For more detailed provisions on the electrical design requirements, see NFPA 79 or IEC 60204-1.

This more modern button is non-compliant due to the RED background.

If you read Ontario’s Industrial Establishments regulation (Regulation 851), you will find that the only requirement for an emergency stop is that it is properly identified and located “within easy reach” of the operator. What does “properly identified” mean? In Canada, the USA and Internationally, a RED operator device on a YELLOW background, with or without any text behind it, is recognized as EMERGENCY STOP or EMERGENCY OFF, in the case of disconnecting switches or control switches. I’ve scattered some examples of different compliant and non-compliant e-stop devices through this article.

The EU Machinery Directive, 2006/42/EC, and Emergency Stop

Interestingly, the European Union has taken what looks like an opposing view of the need for emergency stop systems. Quoting from Annex I of the Machinery Directive:

1.2.4.3. Emergency stop
Machinery must be fitted with one or more emergency stop devices to enable actual or impending danger to be averted.

Notice the words “…actual or impending danger…” This harmonizes with the definition of Complementary Protective Measures, in that they are intended to allow a user to “avert or limit harm” from a hazard. Clearly, the direction from the European perspective is that ALL machines need to have an emergency stop. Or do they? The same clause goes on to say:

The following exceptions apply:

  • machinery in which an emergency stop device would not lessen the risk, either because it would not reduce the stopping time or because it would not enable the special measures required to deal with the risk to be taken,
  • portable hand-held and/or hand-guided machinery.

From these two bullets it becomes clear that, just as in the Canadian and US regulations, machines only need emergency stops WHEN THEY CAN REDUCE THE RISK. This is hugely important, and often overlooked. If the risks cannot be controlled effectively with an emergency stop, or if the risk would be increased or new risks would be introduced by the action of an e-stop system, then it should not be included in the design.

Carrying on with the same clause:

The device must:

  • have clearly identifiable, clearly visible and quickly accessible control devices,
  • stop the hazardous process as quickly as possible, without creating additional risks,
  • where necessary, trigger or permit the triggering of certain safeguard movements.

Once again, this is consistent with the general requirements found in the Canadian and US regulations. The directive goes on to define the functionality of the system in more detail:

Once active operation of the emergency stop device has ceased following a stop command, that command must be sustained by engagement of the emergency stop device until that engagement is specifically overridden; it must not be possible to engage the device without triggering a stop command; it must be possible to disengage the device only by an appropriate operation, and disengaging the device must not restart the machinery but only permit restarting.

The emergency stop function must be available and operational at all times, regardless of the operating mode.

Emergency stop devices must be a back-up to other safeguarding measures and not a substitute for them.

The first sentence of the first paragraph above is the one that requires e-stop devices to latch in the activated position. The last part of that sentence is even more important: “…disengaging the device must not restart the machinery but only permit restarting.” That phrase requires that every emergency stop system have a second discrete action to reset the emergency stop system. Pulling out the e-stop button and having power come back immediately is not OK. Once that button has been reset, a second action, such as pushing a “POWER ON” or “RESET” button to restore control power, is needed.

Point of Clarification: I had a question come from a reader asking if combining the e-stop function and the reset function was acceptable. It can be, but only if:

  • The risk assessment for the machinery does not indicate any hazards that might preclude this approach; and
  • The device is designed with the following characteristics:
      • The device must latch in the activated position;
      • The device must have a “neutral” position where the machine’s emergency stop system can be reset, or where the machine can be enabled to run;
      • The reset position must be distinct from the previous two positions, and the device must spring-return to the neutral position.

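The second paragraph of the quoted text, which requires the emergency stop function to be available in every operating mode, harmonizes with the requirements of the Canadian and US standards.

Finally, the last paragraph harmonizes with the idea of “Complementary Protective Measures” as described in CSA Z432.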
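The latch-and-reset behaviour required by the directive can be sketched as a small state model. This is only an illustration: the class name, attributes and method names are hypothetical, and a real emergency stop circuit is built from safety-rated hardware, not application code like this.

```python
from dataclasses import dataclass


@dataclass
class EStopCircuit:
    """Toy model of the latch-and-reset behaviour; not a real safety controller."""
    button_latched: bool = False   # e-stop device held in the actuated position
    control_power: bool = True     # control power available to the machine
    running: bool = True           # machine in normal operation

    def press_estop(self) -> None:
        # Engaging the device always triggers a stop command and latches.
        self.button_latched = True
        self.control_power = False
        self.running = False

    def release_estop(self) -> None:
        # Disengaging only PERMITS a restart; it restores nothing by itself.
        self.button_latched = False

    def press_reset(self) -> None:
        # The second, deliberate action: ignored while the device is latched.
        if not self.button_latched:
            self.control_power = True

    def press_start(self) -> None:
        # A start command needs control power and a released e-stop device.
        if self.control_power and not self.button_latched:
            self.running = True


machine = EStopCircuit()
machine.press_estop()
machine.press_reset()      # no effect: device still latched
machine.release_estop()    # still no control power
machine.press_reset()      # control power restored, machine still stopped
machine.press_start()      # operator deliberately restarts
```

The point of the model is that no single action takes the machine from "stopped by e-stop" back to "running": releasing the device, restoring control power and starting the machine are three separate steps.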

How Many and Where?

Where? “Within easy reach”. Consider the locations where you EXPECT an operator to be. Besides the main control console, these could include feed hoppers, consumables feeders, finished goods exit points… you get the idea. Anywhere you can reasonably expect an operator to be under normal circumstances is a reasonable place to put an e-stop device. “Easy Reach” I interpret as within the arm-span of an adult (presuming the equipment is not intended for use by children). This translates to 500-600 mm either side of the center line of most work stations.

How do you know if you need an emergency stop? Start with a stop/start analysis. Identify all the normal starting and stopping modes that you anticipate on the equipment. Consider all of the different operating modes that you are providing, such as Automatic, Manual, Teach, Setting, etc. Identify all of the matching stop conditions in the same modes, and ensure that all start functions have a matching stop function.
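One way to keep a stop/start analysis honest is to record it as data and check it mechanically. The sketch below assumes a made-up inventory of operating modes and function names; the structure and the `unmatched_starts` helper are illustrative, not taken from any standard.

```python
# Hypothetical inventory: operating mode -> start and stop functions found in it.
functions = {
    "Automatic": {"starts": {"cycle", "conveyor"}, "stops": {"cycle", "conveyor"}},
    "Manual":    {"starts": {"jog"},               "stops": {"jog"}},
    "Teach":     {"starts": {"jog", "playback"},   "stops": {"jog"}},
}


def unmatched_starts(inventory):
    """Return {mode: sorted start functions with no matching stop function}."""
    gaps = {}
    for mode, funcs in inventory.items():
        missing = sorted(funcs["starts"] - funcs["stops"])
        if missing:
            gaps[mode] = missing
    return gaps


print(unmatched_starts(functions))  # prints {'Teach': ['playback']}
```

Any mode that shows up in the result has a start function without a matching stop, which is exactly the gap the analysis is meant to find.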

Do a risk assessment. This is a basic requirement in most jurisdictions today.

As you determine your risk control measures (following the hierarchy of controls), look at what risks you might control with an Emergency Stop. Remember that e-stops fall below safeguards in the hierarchy, so you must use a safeguarding technique if possible; you can’t just default down to an emergency stop. IF the e-stop can provide you with additional risk reduction, then use it, but first reduce the risks in other ways.

The Stop Function and Control Reliability Requirements

Finally, once you determine the need for an emergency stop system, you need to consider the system’s functionality and controls architecture. NFPA 79 is the reference standard for Canada and the USA, and you can find very similar requirements in IEC 60204-1 if you are working in an international market. EN 60204-1 applies in the EU market for industrial machines.

Functional Stop Categories

NFPA 79 calls out three basic categories of stop. Note that these are NOT reliability categories, but are functional categories. Reliability is not addressed in these sections. Quoting from the standard:

9.2.2 Stop Functions. The three categories of stop functions shall be as follows:

(1) Category 0 is an uncontrolled stop by immediately removing power to the machine actuators.

(2) Category 1 is a controlled stop with power to the machine actuators available to achieve the stop then remove power when the stop is achieved.

(3) Category 2 is a controlled stop with power left available to the machine actuators.
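To make the difference between the three categories concrete, here is a minimal sketch that models each one as an ordered list of steps. The enum, the function and the step labels are my own shorthand for the quoted definitions, not wording from NFPA 79.

```python
from enum import Enum


class StopCategory(Enum):
    CAT_0 = 0  # uncontrolled: power removed immediately
    CAT_1 = 1  # controlled: power removed once the stop is achieved
    CAT_2 = 2  # controlled: power left available


def stop_sequence(category: StopCategory) -> list[str]:
    """Return the ordered steps for a stop of the given category (illustrative labels only)."""
    if category is StopCategory.CAT_0:
        return ["remove actuator power"]
    if category is StopCategory.CAT_1:
        return ["decelerate under power", "confirm stop", "remove actuator power"]
    return ["decelerate under power", "confirm stop", "hold with power available"]


for category in StopCategory:
    print(category.name, "->", stop_sequence(category))
```

Seen this way, the categories differ only in whether and when actuator power is removed, which is why they describe function and say nothing about reliability.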

This E-Stop button is CORRECT. Note the Push-Pull-Twist operator and the YELLOW background.

A bit later, the standard says:

9.2.5.3 Stop.
9.2.5.3.1 Each machine shall be equipped with a Category 0 stop.

9.2.5.3.2 Category 0, Category 1, and/or Category 2 stops shall be provided where indicated by an analysis of the risk assessment and the functional requirements of the machine. Category 0 and Category 1 stops shall be operational regardless of operating modes, and Category 0 shall take priority. Stop function shall operate by de-energizing that relevant circuit and shall override related start functions.

Note that 9.2.5.3.1 does NOT mean that every machine must have an e-stop. It simply says that every machine must have a way to stop the machine that is equivalent to “pulling the plug”. The main disconnect on the control panel can be used for this function if sized and rated appropriately. For cord-connected equipment, the plug and socket used to provide power to the equipment can also serve this function. The question of HOW to effect the Category 0 stop depends on WHEN it will be used, i.e., is it being used for a safety-related function? What risks must be reduced, or what hazards must be controlled by the stop function?

You’ll also note that that pesky “risk assessment” pops up again in 9.2.5.3.2. You just can’t get away from it…

Control Reliability

Disconnect with E-Stop Colours indicates that this device is intended to be used for EMERGENCY SWITCHING OFF.

Once you know what functional category of stop you need, and what degree of risk reduction you are expecting from the emergency stop system, you can determine the degree of reliability required. In Canada, CSA Z432 gives us these categories: SIMPLE, SINGLE CHANNEL, SINGLE CHANNEL MONITORED and CONTROL RELIABLE. These categories are being replaced slowly by Performance Levels (PL) as defined in ISO 13849-1:2006.

The short answer is that the greater the risk reduction required, the higher the degree of reliability required. In many cases, a SINGLE CHANNEL or SINGLE CHANNEL MONITORED solution may be acceptable, particularly when there are more reliable safeguards in place. On the other hand, you may require CONTROL RELIABLE designs if the e-stop is the primary risk reduction for some risks or specific tasks.

To add to the confusion, ISO 13849-1 appears to exclude complementary protective measures from its scope in Table 8 — Some International Standards applicable to typical machine safety functions and certain of their characteristics. At the very bottom of this table, Complementary Protective Measures are listed, but they appear to be excluded from the standard. I can say that there is nothing wrong with applying the techniques in ISO 13849-1 to the reliability analysis of a complementary protective measure that uses the control system, so do this if it makes sense in your application.

ISO 13849-1:2006 Table 8

Extra points go to any reader who noticed that the ‘electrical hazard’ warning label immediately above the disconnect handle in the above photo is a) upside down, and b) using a non-standard lightning flash. Cheap hazard warning labels, like this one, are often as good as none at all. I’ll be writing more on hazard warnings in future posts.

Use of Emergency Stop as part of a Lockout Procedure or HECP

One last note: Emergency stop systems (with the exception of emergency switching off devices, such as disconnect switches used for e-stop) CANNOT be used for energy isolation in a Hazardous Energy Control Procedure (a.k.a. Lockout). Devices for this purpose must physically separate the energy source from the down-stream components. See CSA Z460 for more on that subject.

Read our Article on Using E-Stops in HECP.

Pneumatic E-Stop/Isolation device.

Standards Referenced in this post:

CSA Z432-04, Safeguarding of Machinery

NFPA 79-07, Electrical Standard for Industrial Machinery

IEC 60204-1:09, Safety of machinery – Electrical equipment of machines – Part 1: General requirements

ISO 13849-1:2006, Safety of machinery – Safety-related parts of control systems – Part 1: General principles for design

See also

ISO 13850:06, Safety of machinery – Emergency stop – Principles for design

Checking Emergency Stop Systems

This entry is part 2 of 11 in the series Emergency Stop

This short article discusses ways to test emergency stop systems on machines.

A while back I wrote about the basic design requirements for Emergency Stop systems. I’ve had several people contact me wanting to know about checking and testing emergency stops, so here are my thoughts on this process.

Figure 1 below, excerpted from the 1996 edition of ISO 13850, Safety of machinery — Emergency stop — Principles for design, shows the emergency stop function graphically. As you can see, the initiating factor in this function is a person becoming aware of the need for an emergency stop. This is NOT an automatic function and is NOT a safety or safeguarding function.

ISO 13850 1996 Figure 1 – Emergency Stop Function

I mention this because many people are confused about this point. Emergency stop systems are considered to be ‘complementary protective measures’, meaning that their functions complement the safeguarding systems, but cannot be considered to be safeguards in and of themselves. This is significant. Safeguarding systems are required to act automatically to protect an exposed person. Think about how an interlocked gate or a light curtain acts to stop hazardous motion BEFORE the person can reach it. Emergency stop is normally used AFTER the person is already involved with the hazard, and the next step is normally to call 911.

All of that is important from the perspective of control reliability. The control reliability requirements for emergency stop systems are often different from those for the safeguarding systems because they are a backup system. Determination of the reliability requirements is based on the risk assessment and on an analysis of the circumstances where you, as the designer, anticipate that emergency stop may be helpful in reducing or avoiding injury or machinery damage. Frequently, these systems have lower control reliability requirements than do safeguarding systems.

Before you begin any testing, understand what effects the testing will have on the machinery. Emergency stops can be partially tested with the machinery at rest. Depending on the function of the machinery and the difficulty in recovering from an emergency stop condition, you may need to adjust your approach to these tests. Start by reviewing the emergency stop functional description in the manual. Here’s an example taken from a real machine manual:

Emergency Stop (E-Stop) Button

Figure 2.1 Emergency Stop (E-Stop) Button

A red emergency stop (E-Stop) button is a safety device which allows the operator to stop the machine in an emergency. At any time during operation, press the E-Stop button to disconnect actuator power and stop all connected machines in the production line. Figure 2.1 shows the emergency stop button.

There is one E-Stop button on the pneumatic panel.

NOTE: After pressing the E-Stop button, the entire production line from spreader-feeder to stacker shuts down. When the E-Stop button is reset, all machines in the production line will need to be restarted.

DANGER: These devices do not disconnect main electrical power from the machine. See “Electrical Disconnect” on page 21.

As you can see, the general function of the button is described, and some warnings are given about what does and doesn’t happen when the button is pressed.

Now, if the emergency stop system has been designed properly and the machine is operating normally, pressing the emergency stop button while the machine is in mid-cycle should result in the machinery coming to a fast and graceful stop. Here is what ISO 13850 has to say about this condition:

4.1.3 The emergency stop function shall be so designed that, after actuation of the emergency stop actuator, hazardous movements and operations of the machine are stopped in an appropriate manner, without creating additional hazards and without any further intervention by any person, according to the risk assessment.
An “appropriate manner” can include

  • choice of an optimal deceleration rate,
  • selection of the stop category (see 4.1.4), and
  • employment of a predetermined shutdown sequence.

The emergency stop function shall be so designed that a decision to use the emergency stop device does not require the machine operator to consider the resultant effects.

The intention of this function is to bring the machinery to a halt as quickly as possible without damaging the machine. However, if the braking systems fail, e.g. the servo drive fails to decelerate the tooling as it should, then dropping power and potentially damaging the machinery is acceptable.

In many systems, pressing the e-stop button or otherwise activating the emergency stop system will result in a fault or an error being displayed on the machine’s operator display. This can be used as an indication that the control system ‘knows’ that the system has been activated.

ISO 13850 requires that emergency stop systems exhibit the following key behaviours:

  • It must override all other control functions, and no start functions are permitted (intended, unintended or unexpected) until the emergency stop has been reset;
  • Use of the emergency stop cannot impair the operation of any functions of the machine intended for the release of trapped persons;
  • It is not permitted to affect the function of any other safety critical systems or devices.

Tests

Once the emergency stop device has been activated, control power is normally lost. Pressing any START function on the control panel, except POWER ON or RESET, should have no effect. If any aspect of the machine starts, count this as a FAILED test.

If resetting the emergency stop device results in control power being re-applied, count this as a FAILED test.

Pressing POWER ON or RESET before the activated emergency stop device has been reset (i.e. the e-stop button has been pulled out to the ‘operate’ position), should have no effect. If you can turn the power back on before you reset the emergency stop device, count this as a FAILED test.

Once the emergency stop device has been reset, pressing POWER ON or RESET should result in the control power being restored. This is acceptable. The machine should not restart. If the machine restarts normal operation, count this as a FAILED test.

Once control power is back on, you may have a number of faults to clear. When all the faults have been cleared, pressing the START button should result in the machine restarting. This is acceptable behaviour.

If you break the machine while testing the emergency stop system, count this as a FAILED test.

Test all emergency stop devices. A wiring error or other problems may not be apparent until the emergency stop device is tested. Push all buttons, pull all pull cords, activate all emergency stop devices. If any fail to create the emergency stop condition, count this as a FAILED test.

If, having conducted all of these tests, no failures have been detected, consider the system to have passed basic functional testing. Depending on the complexity of the system and the criticality of the emergency stop function, additional testing may be required. It may be necessary to develop some functional tests that are conducted while various EMI signals are present, for example.
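The pass/fail rules above can be written down as a simple checklist evaluator, which is handy when you have many devices to test and want consistent records. This is a minimal sketch: the `Observations` fields, the check descriptions and the example devices are all hypothetical, and the tester still has to make the observations at the machine.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """What a tester saw for one e-stop device (field names are hypothetical)."""
    device_caused_stop: bool             # actuating the device created the e-stop condition
    start_worked_while_latched: bool     # a START function had an effect while the device was latched
    power_on_worked_while_latched: bool  # POWER ON/RESET worked before the device was reset
    device_reset_restored_power: bool    # resetting the device alone re-applied control power
    power_on_restarted_machine: bool     # POWER ON/RESET restarted normal operation
    machine_damaged: bool                # the test broke the machine


def failed_checks(obs: Observations) -> list[str]:
    """Return the descriptions of every FAILED check; an empty list is a pass."""
    checks = {
        "device did not create the e-stop condition": not obs.device_caused_stop,
        "START had an effect while the device was latched": obs.start_worked_while_latched,
        "POWER ON/RESET worked before the device was reset": obs.power_on_worked_while_latched,
        "resetting the device re-applied control power": obs.device_reset_restored_power,
        "POWER ON/RESET restarted the machine": obs.power_on_restarted_machine,
        "machine was damaged by the test": obs.machine_damaged,
    }
    return [description for description, failed in checks.items() if failed]


# Hypothetical results for two devices on one machine.
results = {
    "console button": Observations(True, False, False, False, False, False),
    "infeed pull cord": Observations(True, False, False, True, False, False),
}
for device, obs in results.items():
    print(device, "->", failed_checks(obs) or "PASSED")
```

In this made-up example the pull cord fails because resetting the device alone brought control power back, which is the second test described above.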

If you have any questions regarding testing of emergency stop devices, please email me!