Busting Emergency Stop Myths

This entry is part 3 of 13 in the series Emer­gency Stop

There are a num­ber of myths that have grown up around emer­gency stops over the years. These myths can lead to injury or death, so it’s time for a lit­tle Myth Bust­ing here on the MS101 blog!


What does ‘emergency’ mean?

Consider for a moment the roots of the word ‘emergency’. This word comes from the word ‘emergent’, meaning a situation that is developing or emerging in the moment. Emergency stop systems are intended to help the user deal with potentially hazardous conditions that are emerging in the moment. These conditions have probably arisen because the designers of the machinery failed to consider all the foreseeable uses of the equipment, or because someone has chosen to misuse the equipment in a way that was not intended by the designers. The key function of an Emergency Stop system is to provide the user with a backup to the primary safeguards. These systems are referred to as “Complementary Protective Measures” and are intended to give the user a chance to “avert or limit harm” in a hazardous situation. With that in mind, let’s look at four myths I hear about regularly.


Myth #1 – The Emergency Stop Is A Safety Device

A Fitz Water Wheel and Belt Dri­ve, Cred­it: Har­ry Matthews & http://www.old-engine.com

Ear­ly in the Indus­tri­al Rev­o­lu­tion machine builders real­ized that users of their machin­ery need­ed a way to quick­ly stop a machine when some­thing went wrong. At that time, over­head line-shafts were dri­ven by large cen­tral pow­er sources like water­wheels, steam engines or large elec­tric motors. Machin­ery was cou­pled to the cen­tral shafts with pul­leys, clutch­es and belts which trans­mit­ted the pow­er to the machin­ery.


Line Shaft in the Mt. Wilson Observatory Machine Shop
Pho­to: Lar­ry Evans & www.oldengine.org

These cen­tral engines pow­ered an entire fac­to­ry, so they were much larg­er than an indi­vid­ual motor sized for a mod­ern machine. In addi­tion, they could not be eas­i­ly stopped, since stop­ping the cen­tral pow­er source would mean stop­ping the entire fac­to­ry – not a wel­come choice. Emer­gency stop devices were born in this envi­ron­ment.

Learn more about Line Shafts at Harry’s Old Engines.

See pho­tos and video of a work­ing line shaft machine shop. 

Due to their early use as a safety device, some have incorrectly considered emergency stop systems to be safeguarding devices. Modern standards make the difference very clear. The easiest way to understand the current meaning of the term “EMERGENCY STOP” is to begin by looking at the international standards published by IEC [1] and ISO [2].

emergency stop [3]
emer­gency stop func­tion

func­tion that is intend­ed to

—   avert aris­ing, or reduce exist­ing, haz­ards to per­sons, dam­age to machin­ery or to work in progress,

—   be ini­ti­at­ed by a sin­gle human action


Haz­ards, for the pur­pos­es of this Inter­na­tion­al Stan­dard, are those which can arise from

—   func­tion­al irreg­u­lar­i­ties (e.g. machin­ery mal­func­tion, unac­cept­able prop­er­ties of the mate­r­i­al processed, human error),

—   nor­mal oper­a­tion.

It is impor­tant to under­stand that an emer­gency stop func­tion is “ini­ti­at­ed by a sin­gle human action”. This means that it is not auto­mat­ic, and there­fore can­not be con­sid­ered to be a risk con­trol mea­sure for oper­a­tors or bystanders. Emer­gency stop may pro­vide the abil­i­ty to avoid or reduce harm, by pro­vid­ing a means to stop the equip­ment once some­thing has already gone wrong. Your next actions will usu­al­ly be to call 911 and admin­is­ter first aid.

Safeguarding systems act automatically to prevent a person from becoming involved with the hazard in the first place. This is a reduction in the probability of a hazardous situation arising, and may also involve a reduction in the severity of injury by controlling the hazard (e.g., slowing or stopping rotating machinery before it can be reached). This constitutes a risk control measure and can be shown to reduce the risk of injury to an exposed person.

Emer­gency stop is reac­tive; safe­guard­ing sys­tems are proac­tive.

In Canada, CSA defines emergency stop as a ‘Complementary Protective Measure’ in CSA Z432-04 [5]:

Safeguards (guards, protective devices) shall be used to protect persons from the hazards that cannot reasonably be avoided or sufficiently limited by inherently safe design. Complementary protective measures involving additional equipment (e.g., emergency stop equipment) may have to be taken.

Complementary protective measures

Following the risk assessment, the measures in this clause either shall be applied to the machine or shall be dealt with in the information for use.
Protective measures that are neither inherently safe design measures, nor safeguarding (implementation of guards and/or protective devices), nor information for use may have to be implemented as required by the intended use and the reasonably foreseeable misuse of the machine. Such measures shall include, but not be limited to,

(a) emer­gency stop;
(b) means of res­cue of trapped per­sons; and
(c) means of energy isolation and dissipation.

In the USA, three standards apply: ANSI B11, ANSI B11.19–2003, and NFPA 79:

ANSI B11-2008

3.80 stop: Imme­di­ate or con­trolled ces­sa­tion of machine motion or oth­er haz­ardous sit­u­a­tions. There are many terms used to describe the dif­fer­ent kinds of stops, includ­ing user- or sup­pli­er-spe­cif­ic terms, the oper­a­tion and func­tion of which is deter­mined by the indi­vid­ual design. Def­i­n­i­tions of some of the more com­mon­ly used “stop” ter­mi­nol­o­gy include:

3.80.2 emer­gency stop: The stop­ping of a machine tool, man­u­al­ly ini­ti­at­ed, for emer­gency pur­pos­es;

7.6 Emergency stop

Elec­tri­cal, pneu­mat­ic and hydraulic emer­gency stops shall con­form to require­ments in the ANSI B11 machine-spe­cif­ic stan­dard or NFPA 79.
Infor­ma­tive Note 1: An emer­gency stop is not a safe­guard­ing device. See also, B11.19.
Infor­ma­tive Note 2: For addi­tion­al infor­ma­tion, see ISO 13850 and IEC 60204–1.

ANSI B11.19–2003

12.9 Stop and emergency stop devices

Stop and emer­gency stop devices are not safe­guard­ing devices. They are com­ple­men­tary to the guards, safe­guard­ing device, aware­ness bar­ri­ers, sig­nals and signs, safe­guard­ing meth­ods and safe­guard­ing pro­ce­dures in claus­es 7 through 11.

Stop and emer­gency stop devices shall meet the require­ments of ANSI / NFPA 79.


Emer­gency stop devices include but are not lim­it­ed to, but­tons, rope-pulls, and cable-pulls.

A safe­guard­ing device detects or pre­vents inad­ver­tent access to a haz­ard, typ­i­cal­ly with­out overt action by the indi­vid­ual or oth­ers. Since an indi­vid­ual must actu­ate an emer­gency stop device to issue the stop com­mand, usu­al­ly in reac­tion to an event or haz­ardous sit­u­a­tion, it nei­ther detects nor pre­vents expo­sure to the haz­ard.

If an emer­gency stop device is to be inter­faced into the con­trol sys­tem, it should not reduce the lev­el of per­for­mance of the safe­ty func­tion (see sec­tion 6.1 and Annex C).

NFPA 79 deals with the electrical aspects of the emergency stop function, which are not directly relevant to this article, so I haven’t quoted from that document here.

As you can clear­ly see, the essen­tial def­i­n­i­tions of these devices in the US and Cana­da match very close­ly, although the US does not specif­i­cal­ly use the term ‘com­ple­men­tary pro­tec­tive mea­sures’.

Myth #2 – Cycle Stop And Emergency Stop Are Equivalent

Emer­gency stop sys­tems act pri­mar­i­ly by remov­ing pow­er from the prime movers in a machine, ensur­ing that pow­er is removed and the equip­ment brought to a stand­still as quick­ly as pos­si­ble, regard­less of the por­tion of the oper­at­ing cycle that the machine is in. After an emer­gency stop, the machine is inop­er­a­ble until the emer­gency stop sys­tem is reset. In some cas­es, emer­gency stop­ping the machine may dam­age the equip­ment due to the forces involved in halt­ing the process quick­ly.

Cycle stop is a con­trol sys­tem com­mand func­tion that is used to bring the machine cycle to a grace­ful stop at the end of the cur­rent cycle. The machine is still ful­ly oper­a­ble and may still be in auto­mat­ic mode at the com­ple­tion of this stop.

Again, refer­ring to ANSI B11-2008:

3.80.1 con­trolled stop: The stop­ping of machine motion while retain­ing pow­er to the machine actu­a­tors dur­ing the stop­ping process. Also referred to as Cat­e­go­ry 1 or 2 stop (see also NFPA 79: 2007, 9.2.2);

3.80.2 emer­gency stop: The stop­ping of a machine tool, man­u­al­ly ini­ti­at­ed, for emer­gency pur­pos­es;

Myth #3 – Emergency Stop Systems Can Be Used For Energy Isolation

Fifteen to twenty years ago it was not uncommon to see emergency stop buttons fitted with locking devices. The locking device allowed a person to prevent the resetting of the emergency stop device. This was done as part of a “lockout procedure”. Lockout is one aspect of hazardous energy control procedures (HECP). HECPs recognize that live work needs to be done from time to time, and that normal safeguards may be bypassed or disconnected temporarily, to allow diagnostics and testing to be carried out. This process is detailed in two current standards, CSA Z460 and ANSI Z244.1. Note that these locking devices are still available for sale, and can be used as part of an HECP to prevent the emergency stop system or other controls from being reset until the machine is ready for testing. They cannot be used to isolate an energy source.

No cur­rent stan­dard allows for the use of con­trol devices such as push but­tons or selec­tor switch­es to be used as ener­gy iso­la­tion devices.

CSA Z460-05 specif­i­cal­ly pro­hibits this use in their def­i­n­i­tion of ‘ener­gy iso­la­tion devices’:

Energy-isolating device: a mechanical device that physically prevents the transmission or release of energy, including but not limited to the following: a manually operated electrical circuit breaker; a disconnect switch; a manually operated switch by which the conductors of a circuit can be disconnected from all ungrounded supply conductors; a line valve; a block; and other devices used to block or isolate energy (push-button selector switches and other control-type devices are not energy-isolating devices). [4]

Similar requirements are found in ANSI Z244.1 [6] and in ISO 13850 [3].

Myth #4 — All Machines are Required to have an Emergency Stop

Some machine design­ers believe that all machines are required to have an emer­gency stop. This is sim­ply not true. A read­er point­ed out to me that CSA Z432-04, clause, does make this require­ment. To my knowl­edge this is the only gen­er­al lev­el (i.e., not machine spe­cif­ic) stan­dard that makes this require­ment. I stand cor­rect­ed! Hav­ing said that, the rest of my com­ments on this top­ic still stand. Clause lim­its the appli­ca­tion of this require­ment:

Each oper­a­tor con­trol sta­tion, includ­ing pen­dants, capa­ble of ini­ti­at­ing machine motion shall have a man­u­al­ly ini­ti­at­ed emer­gency stop device.

Emer­gency stop sys­tems may be use­ful where they can pro­vide a back-up to oth­er safe­guard­ing sys­tems. To under­stand where to use an emer­gency stop, a start-stop analy­sis must be car­ried out as part of the design process. This analy­sis will help the design­er devel­op a clear under­stand­ing of the nor­mal start and stop con­di­tions for the machine. The analy­sis also needs to include fail­ure modes for all of the stop func­tions. It is here that the emer­gency stop can be help­ful. If remov­ing pow­er will cause the haz­ard to cease in a short time, or if the haz­ard can be quick­ly con­tained in some way, then emer­gency stop is a valid choice. If the haz­ard will remain for a con­sid­er­able time fol­low­ing removal of pow­er, then emer­gency stop will have no effect and is use­less for avoid­ing or lim­it­ing harm.

For exam­ple, con­sid­er an oven. If the burn­er stop con­trol failed, and assum­ing that the only haz­ard we are con­cerned with is the hot sur­faces inside the oven, then using an emer­gency stop to turn the burn­ers off only results in the start of the nat­ur­al cool­ing cycle of the oven. In some cas­es that could take hours or days, so the emer­gency stop has no val­ue. It might be use­ful for con­trol­ling oth­er haz­ards, such as fire, that might be relat­ed to the same fail­ure. With­out a full analy­sis of the fail­ure modes of the con­trol sys­tem, a sound deci­sion can­not be made.

Sim­ple machines like drill press­es and table saws are sel­dom fit­ted with emer­gency stop sys­tems. These machines, which can be very dan­ger­ous, could def­i­nite­ly ben­e­fit from hav­ing an emer­gency stop. They are some­times fit­ted with a dis­con­nect­ing device with a red and yel­low han­dle that can be used for ‘emer­gency switch­ing off’. This dif­fers from emer­gency stop because the machine, and the haz­ard, will typ­i­cal­ly re-start imme­di­ate­ly when the emer­gency switch­ing off device is turned back on. This is not per­mit­ted with emer­gency stop, where reset­ting the emer­gency stop device only per­mits the restart­ing of the machine through oth­er con­trols. Reset of the emer­gency stop device is not per­mit­ted to reap­ply pow­er to the machine on its own.
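The difference between the two reset behaviours can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class and method names are hypothetical, and the "switching off" case assumes a maintained run command that is still present when power returns. It is not taken from any standard or real control system.

```python
from dataclasses import dataclass


@dataclass
class EmergencyStopMachine:
    """Toy model: a latching emergency stop (all names are hypothetical)."""
    estop_latched: bool = False
    running: bool = False

    def press_estop(self) -> None:
        # A single human action stops the machine and latches the stop command.
        self.estop_latched = True
        self.running = False

    def reset_estop(self) -> None:
        # Resetting only releases the latch; it must not restart motion.
        self.estop_latched = False

    def press_start(self) -> bool:
        # Motion can only begin through the normal start control,
        # and only once the emergency stop device has been reset.
        if not self.estop_latched:
            self.running = True
        return self.running


@dataclass
class SwitchedOffMachine:
    """Toy model: 'emergency switching off' with a maintained run command."""
    power_on: bool = True
    run_command_held: bool = True  # assumption: e.g. a run switch left in ON

    @property
    def running(self) -> bool:
        # The hazard returns as soon as power does.
        return self.power_on and self.run_command_held
```

In the first model the machine stays stopped after `reset_estop()` until `press_start()` is called; in the second, setting `power_on` back to `True` is enough to restart the hazard.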

These requirements are detailed in ISO 13850 [3], CSA Z432 [5] and other standards.

Design Considerations

Emergency Stop is a control that is often designed in with little thought and then used for a variety of things it was never intended to accomplish. The four myths discussed in this article are the tip of the iceberg.

Con­sid­er these ques­tions when think­ing about the design and use of emer­gency stop sys­tems:

  1. Have all the intend­ed uses and fore­see­able mis­us­es of the equip­ment been con­sid­ered?
  2. What do I expect the emer­gency stop sys­tem to do for the user of the machine? (The answer to this should be in the risk assess­ment.)
  3. How much risk reduc­tion am I expect­ing to achieve with the emer­gency stop?
  4. How reli­able does the emer­gency stop sys­tem need to be?
  5. Am I expect­ing the emer­gency stop to be used for oth­er pur­pos­es, like ‘Pow­er Off’, ener­gy iso­la­tion, or reg­u­lar stop­ping of the machine? (The answer to this should be ‘NO’.)

Tak­ing the time to assess the design require­ments before design­ing the sys­tem can help ensure that the machine con­trols are designed to pro­vide the func­tion­al­i­ty that the user needs, and the risk reduc­tion that is required. The answers lie in the five ques­tions above.

Have any of these myths affect­ed you?

Got any more myths about e-stops you’d like to share?

I real­ly appre­ci­ate hear­ing from my read­ers! Leave a com­ment or email it to us and we’ll con­sid­er adding it to this arti­cle, with cred­it of course!



  1. IEC – International Electrotechnical Commission.
  2. ISO – International Organization for Standardization.
  3. Safety of machinery – Emergency stop – Principles for design, ISO 13850, 2006, ISO, Geneva, Switzerland.
  4. Control of Hazardous Energy – Lockout and Other Methods, CSA Z460, 2005, Canadian Standards Association, Toronto, Canada.
  5. Safeguarding of Machinery, CSA Z432-04, Canadian Standards Association, Toronto, Canada.
  6. Control of Hazardous Energy – Lockout/Tagout and Alternative Methods, ANSI/ASSE Z244.1, 2003, American National Standards Institute / American Society of Safety Engineers, Des Plaines, IL, USA.
  7. American National Standard for Machine Tools – Performance Criteria for Safeguarding, ANSI B11.19–2003, American National Standards Institute, Des Plaines, IL, USA.
  8. General Safety Requirements Common to ANSI B11 Machines, ANSI B11-2008, American National Standards Institute, Des Plaines, IL, USA.
  9. Electrical Standard for Industrial Machinery, NFPA 79–2007, NFPA, 1 Batterymarch Park, Quincy, MA 02169–7471, USA.


Copyright secured by Digiprove © 2011–2013
Acknowl­edge­ments: See cita­tions in the arti­cle.
Some Rights Reserved

Interlock Architectures – Pt. 3: Category 2

This entry is part 3 of 8 in the series Cir­cuit Archi­tec­tures Explored

This article explores the requirements for safety-related control systems meeting ISO 13849–1 Category 2 requirements. “Gotcha!” points in the definition are highlighted to help designers avoid these common pitfalls.

In the first two posts in this series, we looked at Cat­e­go­ry B, the Basic cat­e­go­ry of sys­tem archi­tec­ture, and then moved on to look at Cat­e­go­ry 1. Cat­e­go­ry B under­pins Cat­e­gories 2, 3 and 4. In this post we’ll look more deeply into Cat­e­go­ry 2.

Let’s start by look­ing at the def­i­n­i­tion for Cat­e­go­ry 2, tak­en from ISO 13849–1:2007. Remem­ber that in these excerpts, SRP/CS stands for Safe­ty Relat­ed Parts of Con­trol Sys­tems.


6.2.5 Category 2

For cat­e­go­ry 2, the same require­ments as those accord­ing to 6.2.3 for cat­e­go­ry B shall apply. “Well–tried safe­ty prin­ci­ples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies.

SRP/CS of cat­e­go­ry 2 shall be designed so that their function(s) are checked at suit­able inter­vals by the machine con­trol sys­tem. The check of the safe­ty function(s) shall be per­formed

  • at the machine start-up, and
  • pri­or to the ini­ti­a­tion of any haz­ardous sit­u­a­tion, e.g. start of a new cycle, start of oth­er move­ments, and/or
  • peri­od­i­cal­ly dur­ing oper­a­tion if the risk assess­ment and the kind of oper­a­tion shows that it is nec­es­sary.

The ini­ti­a­tion of this check may be auto­mat­ic. Any check of the safe­ty function(s) shall either

  • allow oper­a­tion if no faults have been detect­ed, or
  • gen­er­ate an out­put which ini­ti­ates appro­pri­ate con­trol action, if a fault is detect­ed.

When­ev­er pos­si­ble this out­put shall ini­ti­ate a safe state. This safe state shall be main­tained until the fault is cleared. When it is not pos­si­ble to ini­ti­ate a safe state (e.g. weld­ing of the con­tact in the final switch­ing device) the out­put shall pro­vide a warn­ing of the haz­ard.

For the des­ig­nat­ed archi­tec­ture of cat­e­go­ry 2, as shown in Fig­ure 10, the cal­cu­la­tion of MTTFd and DCavg should take into account only the blocks of the func­tion­al chan­nel (i.e. I, L and O in Fig­ure 10) and not the blocks of the test­ing chan­nel (i.e. TE and OTE in Fig­ure 10).

The diag­nos­tic cov­er­age (DCavg) of the total SRP/CS includ­ing fault-detec­tion shall be low. The MTTFd of each chan­nel shall be low-to-high, depend­ing on the required per­for­mance lev­el (PLr). Mea­sures against CCF shall be applied (see Annex F).

The check itself shall not lead to a haz­ardous sit­u­a­tion (e.g. due to an increase in response time). The check­ing equip­ment may be inte­gral with, or sep­a­rate from, the safe­ty-relat­ed part(s) pro­vid­ing the safe­ty func­tion.

The max­i­mum PL achiev­able with cat­e­go­ry 2 is PL = d.

NOTE 1 In some cas­es cat­e­go­ry 2 is not applic­a­ble because the check­ing of the safe­ty func­tion can­not be applied to all com­po­nents.

NOTE 2 Cat­e­go­ry 2 sys­tem behav­iour allows that

  • the occur­rence of a fault can lead to the loss of the safe­ty func­tion between checks,
  • the loss of safe­ty func­tion is detect­ed by the check.

NOTE 3 The prin­ci­ple that sup­ports the valid­i­ty of a cat­e­go­ry 2 func­tion is that the adopt­ed tech­ni­cal pro­vi­sions, and, for exam­ple, the choice of check­ing fre­quen­cy can decrease the prob­a­bil­i­ty of occur­rence of a dan­ger­ous sit­u­a­tion.

Fig­ure 1 — Cat­e­go­ry 2 Block dia­gram [1, Fig.10]

Breaking it down

Let’s start by taking apart the definition a piece at a time and looking at what each part means. I’ll also show a simple circuit that can meet the requirements.

Category B & Well-tried Safety Principles

The first para­graph speaks to the build­ing block approach tak­en in the stan­dard:

For cat­e­go­ry 2, the same require­ments as those accord­ing to 6.2.3 for cat­e­go­ry B shall apply. “Well–tried safe­ty prin­ci­ples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies.

Sys­tems meet­ing Cat­e­go­ry 2 are required to meet all of the same require­ments as Cat­e­go­ry B, as far as the com­po­nents are con­cerned. Oth­er require­ments for the cir­cuits are dif­fer­ent, and we will look at those in a bit.

Self-Testing required

Cat­e­go­ry 2 brings in the idea of diag­nos­tics. If cor­rect­ly spec­i­fied com­po­nents have been select­ed (Cat­e­go­ry B), and are applied fol­low­ing ‘well-tried safe­ty prin­ci­ples’, then adding a diag­nos­tic com­po­nent to the sys­tem should allow the sys­tem to detect some faults and there­fore achieve a cer­tain degree of ‘fault-tol­er­ance’ or the abil­i­ty to func­tion cor­rect­ly even when some aspect of the sys­tem has failed.

Let’s look at the text:

SRP/CS of Cat­e­go­ry 2 shall be designed so that their function(s) are checked at suit­able inter­vals by the machine con­trol sys­tem. The check of the safe­ty function(s) shall be per­formed

  • at the machine start-up, and
  • pri­or to the ini­ti­a­tion of any haz­ardous sit­u­a­tion, e.g. start of a new cycle, start of oth­er move­ments, and/or
  • peri­od­i­cal­ly dur­ing oper­a­tion if the risk assess­ment and the kind of oper­a­tion shows that it is nec­es­sary.

The ini­ti­a­tion of this check may be auto­mat­ic. Any check of the safe­ty function(s) shall either

  • allow oper­a­tion if no faults have been detect­ed, or
  • gen­er­ate an out­put which ini­ti­ates appro­pri­ate con­trol action, if a fault is detect­ed.

When­ev­er pos­si­ble this out­put shall ini­ti­ate a safe state. This safe state shall be main­tained until the fault is cleared. When it is not pos­si­ble to ini­ti­ate a safe state (e.g. weld­ing of the con­tact in the final switch­ing device) the out­put shall pro­vide a warn­ing of the haz­ard.

Peri­od­ic check­ing is required. The checks must hap­pen at least each time there is a demand placed on the sys­tem, i.e. a guard door is opened and closed, or an emer­gency stop but­ton is pressed and reset. In addi­tion the integri­ty of the SRP/CS must be test­ed at the start of a cycle or haz­ardous peri­od, and poten­tial­ly peri­od­i­cal­ly dur­ing oper­a­tion if the risk assess­ment indi­cates that this is nec­es­sary. The test­ing fre­quen­cy must be at least 100x the demand rate [1, 4.5.4], e.g., a light cur­tain on a part load­ing work sta­tion that is inter­rupt­ed every 30 s dur­ing nor­mal oper­a­tion requires a min­i­mum test rate of once every 0.3 s, or 200x per minute or more.
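The 100x rule is simple arithmetic, and a short Python sketch makes it easy to check for other demand rates. The function name and the default ratio parameter are illustrative assumptions, not terms from the standard.

```python
def min_test_rate(demand_interval_s: float, ratio: float = 100.0):
    """Return (longest allowed test interval in seconds, minimum tests per minute).

    demand_interval_s: time between demands on the safety function, in seconds.
    ratio: required test-to-demand ratio (100 per the clause cited above).
    """
    if demand_interval_s <= 0 or ratio <= 0:
        raise ValueError("demand interval and ratio must be positive")
    max_test_interval_s = demand_interval_s / ratio
    min_tests_per_minute = 60.0 * ratio / demand_interval_s
    return max_test_interval_s, min_tests_per_minute


# The light curtain example: one interruption every 30 s.
interval_s, per_minute = min_test_rate(30.0)
print(f"test at least every {interval_s:.2f} s ({per_minute:.0f} tests per minute)")
```

For the 30 s light curtain example this reproduces the figures in the text: a test at least every 0.3 s, or 200 tests per minute.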

The test­ing does not have to be auto­mat­ic, although in prac­tice it usu­al­ly is. As long as the sys­tem integri­ty is good, then the out­put is allowed to remain on, and the machin­ery or process can run.

Watch Out!

Notice that the words ‘when­ev­er pos­si­ble’ are used in the last para­graph in this part of the def­i­n­i­tion where the stan­dard speaks about ini­ti­a­tion of a safe state. This word­ing alludes to the fact that these sys­tems are still prone to faults that can lead to the loss of the safe­ty func­tion, and so can­not be called tru­ly ‘fault-tol­er­ant’. Loss of the safe­ty func­tion must be detect­ed by the mon­i­tor­ing sys­tem and a safe state ini­ti­at­ed. This requires care­ful thought, since the safe­ty sys­tem com­po­nents may have to inter­act with the process con­trol sys­tem to ini­ti­ate and main­tain the safe state in the event that the safe­ty sys­tem itself has failed. Also note that it is not pos­si­ble to use fault exclu­sions in Cat­e­go­ry 2 archi­tec­ture, because the sys­tem is not fault tol­er­ant.

All of this leads to an interesting question: If the system is hardwired through the operating channel, and all the components used in that channel meet Category B requirements, can the diagnostic component be provided by monitoring the system with a standard PLC? The answer to this is YES. Test equipment (called TE in Fig. 1) is specifically excluded, and Category 2 DOES NOT require the use of well-tried components, only well-tried safety principles.

Final­ly, for the faults that can be detect­ed by the mon­i­tor­ing sys­tem, detec­tion of a fault must ini­ti­ate a safe state. This means that on the next demand on the sys­tem, i.e. the next time the guard is opened or the emer­gency stop is pressed, the machine must go into a safe con­di­tion. Gen­er­al­ly, detec­tion of a fault should pre­vent the sub­se­quent reset of the sys­tem until the fault is cleared or repaired.
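The behaviour described above (check, latch the fault, refuse reset until the fault is cleared) can be sketched as a small Python class. This is a minimal illustration only: the class name, method names and the idea of passing the check in as a function are assumptions made for the example, not a pattern prescribed by ISO 13849-1, and real monitoring would be done in hardware or in the machine control system.

```python
from typing import Callable


class Cat2Monitor:
    """Toy model of a Category 2 style check with a latched fault (hypothetical)."""

    def __init__(self, check: Callable[[], bool]) -> None:
        self._check = check          # returns True when no fault is detected
        self.fault_latched = False

    def run_check(self) -> bool:
        """Call at start-up and before each hazardous cycle.

        Returns True if operation may proceed.
        """
        if not self._check():
            self.fault_latched = True    # initiate and hold the safe state
        return not self.fault_latched

    def request_reset(self) -> bool:
        """Reset succeeds only if the check now passes, i.e. the fault is cleared."""
        if self._check():
            self.fault_latched = False
        return not self.fault_latched
```

Note that once a fault has been latched, `run_check()` keeps returning `False` even if the check passes again; only an explicit `request_reset()` with the fault cleared releases the safe state.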

Test­ing is not per­mit­ted to intro­duce any new haz­ards or to slow the sys­tem down. The tests must occur ‘on-the-fly’ and with­out intro­duc­ing any delay in the sys­tem com­pared to how it would have oper­at­ed with­out the test­ing incor­po­rat­ed. Test equip­ment can be inte­grat­ed into the safe­ty sys­tem or be exter­nal to it.

One more ‘gotcha’

Note 1 in the def­i­n­i­tion high­lights a sig­nif­i­cant pit­fall for many design­ers: if all of the com­po­nents in the func­tion­al chan­nel of the sys­tem can­not be checked, you can­not claim con­for­mi­ty to Cat­e­go­ry 2. If you look back at Fig. 1, you will see that the dashed “m” lines con­nect all three func­tion­al blocks to the TE, indi­cat­ing that all three must be includ­ed in the mon­i­tor­ing chan­nel. A sys­tem that oth­er­wise would meet the archi­tec­tur­al require­ments for Cat­e­go­ry 2 must be down­grad­ed to Cat­e­go­ry 1 in cas­es where all the com­po­nents in the func­tion­al chan­nel can­not be test­ed. This is a major point and one which many design­ers miss when devel­op­ing their sys­tems.

Calculation of MTTFd

The next paragraph deals with the calculation of the mean time to dangerous failure of the system, or MTTFd.

For the des­ig­nat­ed archi­tec­ture of cat­e­go­ry 2, as shown in Fig­ure 10, the cal­cu­la­tion of MTTFd and DCavg should take into account only the blocks of the func­tion­al chan­nel (i.e. I, L and O in Fig­ure 10) and not the blocks of the test­ing chan­nel (i.e. TE and OTE in Fig­ure 10).

Cal­cu­la­tion of the fail­ure rate focus­es on the func­tion­al chan­nel, not on the mon­i­tor­ing sys­tem, mean­ing that the fail­ure rate of the mon­i­tor­ing sys­tem is ignored when ana­lyz­ing sys­tems using this archi­tec­ture. The MTTFd of each com­po­nent in the func­tion­al chan­nel is cal­cu­lat­ed and then the MTTFd of the total chan­nel is cal­cu­lat­ed.
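As a rough illustration of combining component values into a channel value, here is a Python sketch using the parts count approach, where the reciprocals of the component MTTFd values are summed. The component values are made up for the example, and the band limits (3, 10 and 30 years) are my reading of the standard's MTTFd ranges, so verify them against your edition before relying on them.

```python
def channel_mttfd(component_mttfd_years):
    """Parts count estimate: 1 / MTTFd_channel = sum(1 / MTTFd_i)."""
    values = list(component_mttfd_years)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one positive MTTFd value")
    return 1.0 / sum(1.0 / v for v in values)


def mttfd_band(years: float) -> str:
    """Classify a channel MTTFd (assumed band limits: 3, 10 and 30 years)."""
    if years < 3:
        return "below low"
    if years < 10:
        return "low"
    if years < 30:
        return "medium"
    return "high"


# Hypothetical functional channel: input device, logic, output device.
functional_channel = [100.0, 100.0, 50.0]   # years, illustrative only
total = channel_mttfd(functional_channel)
print(f"channel MTTFd = {total:.1f} years ({mttfd_band(total)})")
```

Only the functional channel components appear in the list; the test equipment is left out, as the quoted paragraph requires.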

The Diag­nos­tic Cov­er­age (DCavg) is also cal­cu­lat­ed based exclu­sive­ly on the com­po­nents in the func­tion­al chan­nel, so when deter­min­ing what per­cent­age of the faults can be detect­ed by the mon­i­tor­ing equip­ment, only faults in the func­tion­al chan­nel are con­sid­ered.
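The averaging can be sketched the same way. The version below weights each component's diagnostic coverage by the reciprocal of its MTTFd, which is how I understand the standard's DCavg formula; the component figures are invented for the example and the band limits (60 %, 90 %, 99 %) should be checked against the standard.

```python
def dc_avg(components):
    """Average diagnostic coverage, weighted by 1 / MTTFd.

    components: iterable of (dc_fraction, mttfd_years) pairs for the
    functional channel only.
    """
    items = list(components)
    if not items or any(m <= 0 for _, m in items):
        raise ValueError("need at least one component with positive MTTFd")
    numerator = sum(dc / mttfd for dc, mttfd in items)
    denominator = sum(1.0 / mttfd for _, mttfd in items)
    return numerator / denominator


def dc_band(dc: float) -> str:
    """Classify DCavg (assumed band limits: 0.60, 0.90 and 0.99)."""
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


# Hypothetical functional channel: (DC, MTTFd in years), illustrative only.
channel = [(0.90, 100.0), (0.90, 100.0), (0.60, 50.0)]
avg = dc_avg(channel)
print(f"DCavg = {avg:.0%} ({dc_band(avg)})")
```

Because of the weighting, the least reliable component (lowest MTTFd) has the most influence on the average, so poor coverage of that component drags DCavg down quickly.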

This high­lights the fact that a fail­ure of the mon­i­tor­ing sys­tem can­not be detect­ed, so a sin­gle fail­ure in the mon­i­tor­ing sys­tem that results in the sys­tem fail­ing to detect a sub­se­quent nor­mal­ly detectable fail­ure in the func­tion­al chan­nel will result in the loss of the safe­ty func­tion.

Summing Up

The next para­graph sums up the lim­its of this par­tic­u­lar archi­tec­ture:

The diag­nos­tic cov­er­age (DCavg) of the total SRP/CS includ­ing fault-detec­tion shall be low. The MTTFd of each chan­nel shall be low-to-high, depend­ing on the required per­for­mance lev­el (PLr). Mea­sures against CCF shall be applied (see Annex F).

The first sentence reflects back to the previous paragraph on diagnostic coverage, telling you, as the designer, that you cannot claim anything more than LOW diagnostic coverage when using this architecture.

This rais­es an inter­est­ing ques­tion, since Fig­ure 5 in the stan­dard shows columns for both DCavg = LOW and DCavg=MED. My best advice to you as a user of the stan­dard is to abide by the text, mean­ing that you can­not claim high­er than LOW for DCavg in this archi­tec­ture. This con­flict will be addressed by future revi­sions of the stan­dard.

Another problem raised by this sentence is the inclusion of the phrase “the total SRP/CS including fault-detection”, since the previous paragraph explicitly tells you that the assessment of DCavg ‘should’ only include the functional channel, while this sentence appears to include the fault-detection (testing) channel as well. In standards writing, sentences including the word ‘shall’ are clearly mandatory, while those including the word ‘should’ indicate a condition which is advised but not required. Hopefully this confusion will be clarified in the next edition of the standard.

MTTFd in the func­tion­al chan­nel can be any­where in the range from LOW to HIGH depend­ing on the com­po­nents select­ed and the way they are applied in the design. The require­ment will be dri­ven by the desired PL of the sys­tem, so a PLd sys­tem will require HIGH MTTFd com­po­nents in the func­tion­al chan­nel, while the same archi­tec­ture used for a PLb sys­tem would require only LOW MTTFd com­po­nents.

Finally, applicable measures against Common Cause Failures (CCF) must be used. Some of the measures given in Table F.1 in Annex F of the standard cannot be applied, such as Channel Separation, since you cannot separate a single channel. Other CCF measures can and must be applied, so you must score at least the minimum 65 on the CCF table in Annex F to claim compliance with Category 2 requirements.
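Scoring against the CCF table is a simple tally, sketched below in Python. The measure names and point values here are illustrative placeholders that I have assumed for the example; take the real measures and points from Table F.1 of your edition of the standard. Only the 65-point threshold comes from the text above.

```python
CCF_MIN_SCORE = 65  # minimum score quoted in the article


def ccf_score(measures: dict[str, tuple[int, bool]]) -> int:
    """Sum the points of every measure that is actually applied.

    measures maps a measure name to (points, applied).
    """
    return sum(points for points, applied in measures.values() if applied)


# Placeholder values for illustration only; consult Table F.1 for real ones.
single_channel_design = {
    "separation of signal paths": (15, False),   # not possible with one channel
    "diversity": (20, False),
    "overvoltage / overcurrent protection": (15, True),
    "well-tried components": (5, True),
    "failure mode analysis in design": (5, True),
    "designer training": (5, True),
    "EMC and contamination": (25, True),
    "other environmental influences": (10, True),
}

score = ccf_score(single_channel_design)
print(f"CCF score {score}: {'pass' if score >= CCF_MIN_SCORE else 'fail'}")
```

With these placeholder points, a single-channel design that cannot claim separation or diversity has to apply every remaining measure to reach the threshold, which is why the CCF review deserves attention early in the design.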

Example Circuit

Here’s an example of what a simple Category 2 circuit constructed from discrete components might look like. Note that PB1 and PB2 could just as easily be interlock switches on guard doors as push buttons on a control panel. For the sake of simplicity, I did not illustrate surge suppression on the relays, but you should include MOVs or RC suppressors across all relay coils. All relays are assumed to be of ‘force-guided’ design and to meet the requirements for well-tried components.

Fig­ure 2 — Exam­ple Cat­e­go­ry 2 cir­cuit from dis­crete com­po­nents

How the cir­cuit works:

  1. The machine is stopped with pow­er off. CR1, CR2, and M are off. CR3 is off until the reset but­ton is pressed, since the NC mon­i­tor­ing con­tacts on CR1, CR2 and M are all closed, but the NO reset push but­ton con­tact is open.
  2. The reset push button, PB3, is pressed. If CR1, CR2 and M are all off, their normally closed contacts will be closed, so pressing PB3 will result in CR3 turning on.
  3. CR3 clos­es its con­tacts, ener­giz­ing CR1 and CR2 which seal their con­tact cir­cuits in and de-ener­gize CR3. The time delays inher­ent in relays per­mit this to work.
  4. With CR1 and CR2 closed and CR3 held off because its coil cir­cuit opened when CR1 and CR2 turned on, M ener­gizes and motion can start.

In this circuit the monitoring function is provided by CR3. If any of CR1, CR2 or M were to weld closed, CR3 could not energize, so a single fault is detected and the machine is prevented from re-starting. Pressing either PB1 or PB2 will stop the machine, and because CR1 and CR2 are redundant, the stop still works if one of them fails. If CR3 fails with welded contacts, the M rung is held open because CR3 has not de-energized. If it fails with an open coil, the reset function will not work. Both failure modes therefore prevent the machine from starting with a failed monitoring system, provided a “force-guided” type of relay is used for CR3. If CR1 or CR2 fails with an open coil, M cannot energize because of the redundant contacts on the M rung.
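The reset sequence and the fault behaviour described above can be checked with a small step-by-step model. This is only a sketch: the rung logic is my reading of the written description rather than a transcription of the figure, each call to `step()` stands in for one relay operating delay, and the class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Cat2Circuit:
    """Discrete-step model of the relay logic; one step() is one relay delay."""
    cr1: bool = False
    cr2: bool = False
    cr3: bool = False
    m: bool = False
    welded: frozenset = frozenset()  # relays stuck in the energized position

    def __post_init__(self):
        # A welded relay is already in the energized position at power-up.
        for name in self.welded:
            setattr(self, name, True)

    def step(self, pb1_closed=True, pb2_closed=True, pb3_pressed=False):
        stop_chain = pb1_closed and pb2_closed  # NC stop buttons in series
        # Coils are evaluated from the contact positions left by the last step.
        coils = {
            # Reset rung: PB3 in series with NC contacts of CR1, CR2 and M.
            "cr3": pb3_pressed and not (self.cr1 or self.cr2 or self.m),
            # CR1/CR2 rungs: picked up by CR3, then sealed in by their own contact.
            "cr1": stop_chain and (self.cr3 or self.cr1),
            "cr2": stop_chain and (self.cr3 or self.cr2),
            # M rung: CR1 and CR2 NO contacts plus the CR3 NC contact in series.
            "m": self.cr1 and self.cr2 and not self.cr3,
        }
        for name, energized in coils.items():
            # Force-guided contacts: a welded relay never returns to rest.
            setattr(self, name, energized or name in self.welded)


def press_reset(circuit, steps=5):
    """Hold PB3 for a few relay delays, then release it."""
    for _ in range(steps):
        circuit.step(pb3_pressed=True)
    circuit.step(pb3_pressed=False)
    return circuit


healthy = press_reset(Cat2Circuit())
faulty = press_reset(Cat2Circuit(welded=frozenset({"cr1"})))
print("healthy M:", healthy.m, "| CR1 welded M:", faulty.m)
```

In this model the healthy circuit ends with M energized and CR3 dropped out, while a welded CR1, CR2 or CR3 leaves M off, which matches the behaviour described in the text. It says nothing about PB1, PB2 or PB3 faults, which the real circuit cannot detect either.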

This cir­cuit can­not detect a fail­ure in PB1, PB2, or PB3. Test­ing is con­duct­ed each time the cir­cuit is reset. This cir­cuit does not meet the 100x test rate require­ment, and so can­not be said to meet Cat­e­go­ry 2 require­ments.
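To put a number on the test-rate shortfall, the check below compares how often the safety function is tested with how often it is demanded. It is a minimal sketch; the function name and the example rates are assumptions chosen purely for illustration.

```python
def meets_test_rate(tests_per_hour, demands_per_hour, required_ratio=100):
    """True if the function is tested at least required_ratio times per demand."""
    if demands_per_hour <= 0:
        raise ValueError("demand rate must be positive")
    return tests_per_hour / demands_per_hour >= required_ratio


# Testing only at reset gives roughly one test per demand (ratio of about 1).
reset_only = meets_test_rate(tests_per_hour=4, demands_per_hour=4)

# A hypothetical automatic test once per second against the same demand rate
# gives a ratio of 900.
automatic = meets_test_rate(tests_per_hour=3600, demands_per_hour=4)

print(reset_only, automatic)
```

A circuit that is only tested when it is reset can never get much beyond a ratio of one, no matter how often the machine is stopped.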

If M is a motor starter rather than the motor itself, it will need to be duplicated for redundancy and a monitoring contact added to the CR3 rung.

In cal­cu­lat­ing MTTFd, PB1, PB2, CR1, CR2, CR3 and M must be includ­ed. CR3 is includ­ed because it has a func­tion­al con­tact in the M rung and is there­fore part of the func­tion­al chan­nel of the cir­cuit as well as being part of the OT and OTE chan­nels.
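For a single functional channel, the parts count method adds the failure rates of the components, so 1/MTTFd for the channel is the sum of 1/MTTFd for each part. The sketch below applies that to the components listed above. The individual MTTFd values are invented placeholders, not manufacturer data, and the band limits (LOW from 3 to under 10 years, MEDIUM from 10 to under 30, HIGH from 30 up) are as I understand them from ISO 13849-1.

```python
def channel_mttfd(component_mttfd_years):
    """Parts count method: channel failure rate is the sum of component rates."""
    return 1.0 / sum(1.0 / years for years in component_mttfd_years.values())


def mttfd_band(years):
    """Classify a channel MTTFd value (in years) into the standard's bands."""
    if years < 3:
        return "below LOW"
    if years < 10:
        return "LOW"
    if years < 30:
        return "MEDIUM"
    return "HIGH"


# Placeholder values in years; use the manufacturer's data for a real design.
components = {
    "PB1": 200.0,
    "PB2": 200.0,
    "CR1": 150.0,
    "CR2": 150.0,
    "CR3": 150.0,
    "M": 100.0,
}

total = channel_mttfd(components)
print(round(total, 1), mttfd_band(total))
```

Note how six components that are each comfortably HIGH on their own combine into a channel that is only MEDIUM with these placeholder numbers. Every part you add to the functional channel pulls the result down.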

Watch for the next install­ment in this series where we’ll explore Cat­e­go­ry 3, the first of the ‘fault tol­er­ant’ archi­tec­tures!

Five things most machine builders do incorrectly

Five things that most machine builders fail to do. With a Sixth Bonus fail­ure!

The Top Five errors I see machine builders make on a depress­ing­ly reg­u­lar basis:

1) Poor or Absent Risk Assessment

Risk assessments are fundamental to safe machine design and liability limitation, and are required by law in the EU. They are included in all of the modern North American machinery safety standards as well.

Machine builders fre­quent­ly have trou­ble with the risk assess­ment process, usu­al­ly because they fail to under­stand the process or because they fail to devote enough resources to get­ting it done.

If risk assessment is built into your design process, it becomes the norm for how you do business. Time and resources will automatically be devoted to the process, and since it’s part of how you do things it will become relatively painless. Where people go wrong is in making it a ‘big deal’ one-time event. Also, getting it done early in the design process, and iterating as the design progresses, means that you have time to react to the findings, and you can complete any necessary changes at more cost-effective points in the design and build process. The worst time to do risk assessment is at the point where the machine is on the shop floor ready to start production. Costs for modification are then much higher than during design and construction.

Poor­ly done, risk assess­ments become a lia­bil­i­ty defense lawyer’s worst night­mare and a plaintiff’s lawyer’s dream. Short­chang­ing the risk assess­ment process ensures that you will lose, either now or lat­er.

Fight this prob­lem by: learn­ing how to con­duct a risk assess­ment, using qual­i­ty risk assess­ment soft­ware tools, and build­ing risk assess­ment into your stan­dard design process/practice in your orga­ni­za­tion.

2) Failure to be Aware of Regulations & Use Design Standards

This one is a mys­tery to me.

Every mar­ket has prod­uct safe­ty leg­is­la­tion, sup­port­ed by reg­u­la­tions. Grant­ed, the scope and qual­i­ty of these reg­u­la­tions varies wide­ly, but if you want to sell a prod­uct in a mar­ket, it doesn’t take a lot of effort to find out what reg­u­la­tions may apply.

Design stan­dards have been in exis­tence for a long time. Most pur­chase orders, at least for cus­tom machin­ery, con­tain lists of stan­dards that the equip­ment is required to meet at Fac­to­ry Accep­tance Test­ing (FAT).

Why machine builders fail to grasp that using these stan­dards can actu­al­ly give them a com­pet­i­tive edge, as well as help­ing them to meet reg­u­la­to­ry require­ments, I don’t know. If you do, please either com­ment on this sto­ry or send me an email. I’d love to hear your thoughts on this!

Fight this prob­lem by: Doing some research. Under­stand the mar­ket envi­ron­ment in which you sell your prod­ucts. If you aren’t sure how to do this, use a con­sul­tant to assist you. Buy the stan­dards, espe­cial­ly if your client calls them out in their spec­i­fi­ca­tions. Read and apply them to your designs.

One great resource for infor­ma­tion on reg­u­la­to­ry envi­ron­ments and stan­dards appli­ca­tions is the IEEE Prod­uct Safe­ty Engi­neer­ing Soci­ety and the EMC-PSTC List­serv that they main­tain.

3) Fixed Guard Design

Fixed guarding design is driven by at least two factors: a) preventing people from accessing hazards, and b) allowing raw materials and products into and out of the machinery.

Design­ers fre­quent­ly go wrong by select­ing a fixed guard where a mov­able guard is nec­es­sary to per­mit fre­quent access (say more than once per shift). This is some­times done in an effort to avoid hav­ing to add inter­locks to the con­trol sys­tems. Fre­quent­ly the guard will be removed and replaced a cou­ple of times, and then the screws will be left off, and even­tu­al­ly the guard itself will be left off, leav­ing the user with an unguard­ed haz­ard.

The other common fault with fixed guards relates to the second factor I mentioned: getting raw materials and products in and out of the machine. There are limits on the size of openings that can be left in guards, dependent on the distance from the opening to the hazards behind the guard and the size of the opening itself. Often the only factor considered is the size of the item that needs to enter or exit the machinery.

Both of these faults often occur because the guard­ing is not designed, but is allowed to hap­pen dur­ing machine build. The size and shape of the guards is then often dri­ven by con­ve­nience in fab­ri­ca­tion rather than by thought­ful design and appli­ca­tion of the min­i­mum code require­ments.

Fight this prob­lem by: Design­ing the guards on your prod­uct rather than allow­ing them to hap­pen, based on the out­come of the risk assess­ment and the lim­its defined in the stan­dards. Tables for guard open­ings and safe­ty dis­tances are avail­able in North Amer­i­can, EU and Inter­na­tion­al stan­dards.

4) Movable Guard Interlocking

Movable guards themselves are usually reasonably well done. Note that I am not talking about self-adjusting guards like those found on a table saw, for instance. I am talking about guard doors, gates, and covers.

The prob­lem usu­al­ly comes with the design of the inter­lock that is required to go with the mov­able guard. The first part of the prob­lem goes back to my #1 mis­take: Risk Assess­ment. No risk assess­ment means that you can­not rea­son­ably hope to get the reli­a­bil­i­ty require­ments right for the inter­lock­ing sys­tem. Next, there are small but sig­nif­i­cant dif­fer­ences in how the Cana­di­an, US, EU and Inter­na­tion­al stan­dards han­dle con­trol reli­a­bil­i­ty, and the biggest dif­fer­ences occur in the high­er reli­a­bil­i­ty clas­si­fi­ca­tions.

In the USA, the stan­dards speak of con­trol reli­able cir­cuits (see ANSI RIA R15.06–1999, 4.5.5). This require­ment is writ­ten in such a way that a sin­gle inter­lock­ing device, installed with dual chan­nel elec­tri­cal cir­cuits and suit­ably select­ed com­po­nents will meet the require­ments. No sin­gle ELECTRICAL com­po­nent fail­ure will lead to the loss of the safe­ty func­tion, but a sin­gle mechan­i­cal fault could.

In Cana­da, the machin­ery and robot­ics stan­dards speak of con­trol reli­able sys­tems (see CSA Z432, 8.2.5), not cir­cuits as in the US stan­dards. This require­ment is writ­ten in such a way that TWO electro­mechan­i­cal inter­lock­ing devices are required, one in each elec­tri­cal chan­nel of the inter­lock­ing sys­tem. This per­mits the sys­tem to detect mechan­i­cal fail­ures such as bro­ken or miss­ing keys, and if dif­fer­ent types of inter­lock­ing devices are cho­sen, may also per­mit detec­tion of efforts to bypass the inter­lock. Most sin­gle mechan­i­cal faults and elec­tri­cal faults will be detect­ed.

In the EU and Internationally, control reliability is much more highly developed. Here, the application of ISO 13849, IEC 62061 or IEC 61508 has taken control reliability to higher levels than anything seen to date in North America. Under these standards, the required Performance Level (PLr) or Safety Integrity Level (SIL) must be known. This is based on the outcome of, you guessed it, the Risk Assessment. No risk assessment, or a poor risk assessment, dooms the designer to likely failure. Significant skill is required to handle the analysis and design of safety related parts of control systems under these standards.

Fight this prob­lem by: Get­ting the train­ing you need to prop­er­ly apply these stan­dards and then using them in your designs.

5) Safety Distances

Safety distances crop up anywhere you don’t have a physical barrier keeping the user away from the hazard. Whether it’s an opening in a fixed guard, a movable guard like a guard door or gate, or a presence-sensing safeguarding device like a light curtain, safety distances have to be considered in the machine design. The easier it is for the user to come in contact with the hazard, the more safety distance matters.

Stopping performance of the machinery must be tested to validate the safety distances used. Failure to get the safety distance right means that your guards will give your users a false sense of security, and will expose them to injury. This will also expose your company to significant liability when someone gets hurt, because they will. It’s only a matter of time.

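Fight this problem by: Testing the stopping performance of your machinery and using the measured results to validate the safety distance for every safeguarding device.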
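As a worked example of why measured stopping time matters, the sketch below uses the general form S = (K × T) + C from ISO 13855 for a vertically mounted light curtain with a detection capability of 40 mm or less. The constants are as I understand them from that standard, and the function name is made up for this example. Check the current edition, and the equivalent North American formulas, before relying on any figure.

```python
def light_curtain_distance_mm(stop_time_s, resolution_mm):
    """Minimum distance S = K*T + C for a vertical light curtain (d <= 40 mm)."""
    if not 0 < resolution_mm <= 40:
        raise ValueError("this sketch only covers detection capabilities up to 40 mm")
    if stop_time_s <= 0:
        raise ValueError("stopping time must be positive")
    c = max(8 * (resolution_mm - 14), 0)      # intrusion allowance in mm
    s = 2000 * stop_time_s + c                # K = 2000 mm/s approach speed
    if s > 500:
        # Beyond 500 mm the lower approach speed may be used,
        # but the result may not drop below 500 mm.
        s = max(1600 * stop_time_s + c, 500)
    return max(s, 100)                        # never less than 100 mm


# Overall stopping time T must come from a stop-time measurement on the machine.
for t in (0.1, 0.2, 0.3):
    print(t, "s ->", round(light_curtain_distance_mm(t, 14)), "mm")
```

In this sketch, doubling the measured stopping time from 0.1 s to 0.2 s doubles the required distance for a 14 mm curtain, which is why a guessed or catalogue stopping time is not good enough.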

6) Validation

OK, so this list should real­ly be SIX things. Just con­sid­er this to be a bonus for read­ing this far!

Designs, and par­tic­u­lar­ly safe­ty crit­i­cal designs, must be test­ed. Let me say it again:

Safe­ty Crit­i­cal Designs MUST Be Test­ed.

Whatever theory you are working under, whether it’s North American, European, International or something else, you cannot afford to miss the validation step. Without validation you have no evidence that your system works at all, let alone that it works correctly.

Fight this prob­lem by: TESTING YOUR DESIGNS.

A wise man once said: “If you think safe­ty is expen­sive, try hav­ing an acci­dent.” The gen­tle­man was involved in inves­ti­gat­ing the crash of a Siko­rsky S-92 heli­copter off the coast of New­found­land. 17 peo­ple died as a result of the fail­ure of two tita­ni­um studs that held an oil fil­ter onto the main gear­box, and the fact that the heli­copter failed the ‘1/2-hour gear­box run-dry test’ that is required for all new heli­copter designs. This was a clear case of fail­ure in the risk assess­ment process com­pli­cat­ed by fail­ure in the test process.