Testing Emergency Stop Systems

This entry is part 11 of 14 in the series Emer­gency Stop

I’ve had a number of questions from readers regarding testing of emergency stop systems, particularly about the frequency of testing. I addressed the types of tests that might be needed in another article, Checking Emergency Stop Systems. This article focuses on the frequency of testing rather than the types of tests.

The Problem

Emergency stop systems are considered to be “complementary protective measures” in key machinery safety standards like ISO 12100 [1] and CSA Z432 [2]; this makes emergency stop systems the backup to the primary safeguards. Complementary protective measures are intended to permit “avoiding or limiting the harm” that may result from an emergent situation. By definition, this is a situation that has not been foreseen by the machine builder, or is the result of another failure. This could be a failure of another safeguarding system, or a failure in the machine that is not controlled by other means; e.g., a workpiece shatters due to a material flaw, and the broken pieces damage the machine, creating new, uncontrolled failure conditions in the machine.

Emergency stop systems are manually triggered, and usually infrequently used. The lack of use means that functional testing of the system doesn’t happen in the normal course of operation of the machinery. Some types of faults may occur and remain undetected until the system is actually used, e.g., contact blocks falling off the back of the operator device. Failure at that point may be catastrophic, since by implication the primary safeguards have already failed, and thus the failure of the backup eliminates the possibility of avoiding or limiting harm.

To under­stand the test­ing require­ments, it’s import­ant to under­stand the risk and reli­ab­il­ity require­ments that drive the design of emer­gency stop sys­tems, and then get into the test fre­quency ques­tion.

Requirements

In the past, there were no explicit requirements for emergency stop system reliability. Details like the colour of the operator device, or the way the stop function worked, were defined in ISO 13850 [3], NFPA 79 [4], and IEC 60204-1 [5]. In the soon-to-be published 3rd edition of ISO 13850, a new provision requiring emergency stop systems to meet at least PLc will be added [6], but until publication, it is up to the designer to determine the safety integrity level, either PL or SIL, required.

To determine the requirements for any safety function, the key is to start at the risk assessment. The risk assessment process requires that the designer understand the stage in the life cycle of the machine, the task(s) that will be done, and the specific hazards that a worker may be exposed to while conducting the task. This can become quite complex when considering maintenance and service tasks, and also applies to foreseeable failure modes of the machinery or the process. The scoring or ranking of risk can be accomplished using any suitable risk scoring tool that meets the minimum requirements in [1]. There are some good examples given in ISO/TR 14121-2 [7] if you are looking for some guidance. There are many good engineering textbooks available as well. Have a look at our Book List for some suggestions if you want a deeper dive.

Reliability

Once the initial unmitigated risk is understood, risk control measures can be specified. Wherever the control system is used as part of the risk control measure, a safety function must be specified. Specification of the safety function includes, as a minimum, the Performance Level (PL), architectural category (B, 1–4), Mean Time to Dangerous Failure (MTTFd), and Diagnostic Coverage (DC) [6], or the Safety Integrity Level (SIL) and Hardware Fault Tolerance (HFT), as described in IEC 62061 [8]. If you are unfamiliar with these terms, see the definitions at the end of the article.

Referring to Figure 1, the “Risk Graph” [6, Annex A], we can reasonably state that for most machinery, a failure mode or emergent condition is likely to create conditions where the severity of injury is likely to require more than basic first aid, so selecting “S2” is the first step. In these situations, and particularly where the failure modes are not well understood, the highest level of severity of injury, S2, is selected because we don’t have enough information to expect that the injuries would only be minor. As soon as we make this selection, it is no longer possible to select any combination of Frequency or Probability parameters that will result in anything lower than PLc.

It’s import­ant to under­stand that Fig­ure 1 is not a risk assess­ment tool, but rather a decision tree used to select an appro­pri­ate PL based on the rel­ev­ant risk para­met­ers. Those para­met­ers are:

Table 1 – Risk Parameters

Severity of injury:
  • S1 – slight (normally reversible injury)
  • S2 – serious (normally irreversible injury or death)

Frequency and/or exposure to hazard:
  • F1 – seldom-to-less-often and/or exposure time is short
  • F2 – frequent-to-continuous and/or exposure time is long

Possibility of avoiding hazard or limiting harm:
  • P1 – possible under specific conditions
  • P2 – scarcely possible

Figure 1 – “Risk Graph” for determining PL (decision tree used to determine PL based on risk parameters)
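
The decision tree in Figure 1 can also be written down as a small lookup table. The sketch below assumes the risk graph as it appears in Annex A of the 2006 edition of [6]; the names RISK_GRAPH and required_pl are made up for this illustration, so check the mapping against your own copy of the standard before relying on it.

```python
# Sketch of the ISO 13849-1:2006 Annex A risk graph as a lookup table.
# Keys are (severity, frequency/exposure, possibility of avoidance);
# values are the required Performance Level (PLr).
# RISK_GRAPH and required_pl are illustrative names, not from the standard.
RISK_GRAPH = {
    ("S1", "F1", "P1"): "a",
    ("S1", "F1", "P2"): "b",
    ("S1", "F2", "P1"): "b",
    ("S1", "F2", "P2"): "c",
    ("S2", "F1", "P1"): "c",
    ("S2", "F1", "P2"): "d",
    ("S2", "F2", "P1"): "d",
    ("S2", "F2", "P2"): "e",
}


def required_pl(severity: str, frequency: str, avoidance: str) -> str:
    """Return the required PL for one path through the risk graph."""
    return RISK_GRAPH[(severity, frequency, avoidance)]


# Every path that starts with S2 ends at PLc or higher.
s2_levels = sorted(pl for (s, _f, _p), pl in RISK_GRAPH.items() if s == "S2")
print(s2_levels)  # ['c', 'd', 'd', 'e']
```

Listing the S2 branches this way shows why PLc is the floor once serious injury is assumed: no combination of F and P on that side of the graph gives PLa or PLb.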

PLc can be accom­plished using any of three archi­tec­tures: Cat­egory 1, 2, or 3. If you are unsure about what these archi­tec­tures rep­res­ent, have a look at my series cov­er­ing this top­ic.

Cat­egory 1 is single chan­nel, and does not include any dia­gnostics. A single fault can cause the loss of the safety func­tion (i.e., the machine still runs even though the e-stop but­ton is pressed). Using Cat­egory 1, the reli­ab­il­ity of the design is based on the use of highly reli­able com­pon­ents and well-tried safety prin­ciples. This approach can fail to danger.

Cat­egory 2 adds some dia­gnost­ic cap­ab­il­ity to the basic single chan­nel con­fig­ur­a­tion and does not require the use of “well-tried” com­pon­ents. This approach can also fail to danger.

Cat­egory 3 archi­tec­ture adds a redund­ant chan­nel, and includes dia­gnost­ic cov­er­age. Cat­egory 3 is not sub­ject to fail­ure due to single faults and is called “single-fault tol­er­ant”. This approach is less likely to fail to danger, but still can in the pres­ence of mul­tiple, undetec­ted, faults.

A key concept in reliability is the “fault”. This can be any kind of defect in hardware or software that results in unwanted behaviour or a failure. Faults are further broken down into dangerous and safe faults, meaning those that result in a dangerous outcome, and those that do not. Finally, each of these classes is broken down into detectable and undetectable faults. I’m not going to get into the mathematical treatment of these classes, but my point is this: there are undetectable dangerous faults. These are faults that cannot be detected by built-in diagnostics. As designers, we try to design the control system so that undetectable dangerous faults are extremely rare; ideally, they should occur much less than once in the lifetime of the machine.

What is the life­time of the machine? The stand­ards writers have settled on a default life­time of 20 years, thus the answer is that undetect­able dan­ger­ous fail­ures should hap­pen much less than once in twenty years of 24/7/365 oper­a­tion. So why does this mat­ter? Each archi­tec­tur­al cat­egory has dif­fer­ent require­ments for test­ing. The test rates are driv­en by the “Demand Rate”. The Demand Rate is defined in [6]. “SRP/CS” stands for “Safety Related Part of the Con­trol Sys­tem” in the defin­i­tion:

3.1.30
demand rate (rd) – fre­quency of demands for a safety-related action of the SRP/CS

Each time the emer­gency stop but­ton is pressed, a “demand” is put on the sys­tem. Look­ing at the “Sim­pli­fied Pro­ced­ure for estim­at­ing PL”, [6, 4.5.4], we find that the stand­ard makes the fol­low­ing assump­tions:

  • mis­sion time, 20 years (see Clause 10);
  • con­stant fail­ure rates with­in the mis­sion time;
  • for category 2, demand rate <= 1/100 test rate;
  • for category 2, MTTFD,TE larger than half of MTTFD,L.

NOTE When blocks of each chan­nel can­not be sep­ar­ated, the fol­low­ing can be applied: MTTFD of the sum­mar­ized test chan­nel (TE, OTE) lar­ger than half MTTFD of the sum­mar­ized func­tion­al chan­nel (I, L, O).

So what does all that mean? The 20-year mis­sion time is the assumed life­time of the machinery. This num­ber under­pins the rest of the cal­cu­la­tions in the stand­ard and is based on the idea that few mod­ern con­trol sys­tems last longer than 20 years without being replaced or rebuilt. The con­stant fail­ure rate points at the idea that sys­tems used in the field will have com­pon­ents and con­trols that are not sub­ject to infant mor­tal­ity, nor are they old enough to start to fail due to age, but rather that the sys­tem is oper­at­ing in the flat por­tion of the stand­ard­ized fail­ure rate “bathtub curve”, [9]. See Fig­ure 2. Com­pon­ents that are sub­ject to infant mor­tal­ity failed at the fact­ory and were removed from the sup­ply chain. Those fail­ing from “wear-out” are expec­ted to reach that point after 20 years. If this is not the case, then the main­ten­ance instruc­tions for the sys­tem should include pre­vent­at­ive main­ten­ance tasks that require repla­cing crit­ic­al com­pon­ents before they reach the pre­dicted MTTFd.

Figure 2 – Weibull Bathtub Curve [9] (diagram of a standardized bathtub-shaped failure rate curve)

For systems using Category 2 architecture, the automatic diagnostic test rate must be at least 100x the demand rate. Keep in mind that this test rate is normally accomplished automatically in the design of the controls, and is only related to the detectable safe or dangerous faults. Undetectable faults must have a probability of less than once in 20 years, and should be detected by the “proof test”. More on that a bit later.

Finally, the MTTFD of the test (diagnostic) channel must be more than half that of the functional channel.
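
The two Category 2 assumptions quoted from [6, 4.5.4] are simple enough to check numerically. This is only a sketch: the function name, the parameter names, and the example numbers are invented for illustration, and both rates are assumed to be expressed per year.

```python
def category2_assumptions(demands_per_year, tests_per_year,
                          mttfd_test_years, mttfd_logic_years):
    """Check the two Category 2 assumptions of the simplified procedure."""
    return {
        # demand rate <= 1/100 test rate
        "test_rate_ok": demands_per_year <= tests_per_year / 100,
        # MTTFD of the test channel larger than half that of the functional channel
        "mttfd_ok": mttfd_test_years > mttfd_logic_years / 2,
    }


# Hypothetical machine: e-stop pressed about once a month, diagnostics run hourly.
result = category2_assumptions(
    demands_per_year=12,
    tests_per_year=24 * 365,
    mttfd_test_years=40,
    mttfd_logic_years=60,
)
print(result)  # {'test_rate_ok': True, 'mttfd_ok': True}
```

If either entry comes back False, the simplified procedure's assumptions do not hold for that design and the PL estimate built on them should not be trusted.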

Category 1 has no diagnostics, so there is no guidance in [6] to help us out with these systems. Category 3 is single-fault tolerant, so as long as we don’t have multiple undetected faults we can count on the system to function and to alert us when a single fault occurs; remember that the automatic tests may not be able to detect every fault. This is where the “proof test” comes in. What is a proof test? To find a definition for the proof test, we have to look at IEC 61508-4 [10]:

3.8.5
proof test
peri­od­ic test per­formed to detect fail­ures in a safety-related sys­tem so that, if neces­sary, the sys­tem can be restored to an “as new” con­di­tion or as close as prac­tic­al to this con­di­tion

NOTE – The effect­ive­ness of the proof test will be depend­ent upon how close to the “as new” con­di­tion the sys­tem is restored. For the proof test to be fully effect­ive, it will be neces­sary to detect 100% of all dan­ger­ous fail­ures. Although in prac­tice 100% is not eas­ily achieved for oth­er than low-com­plex­ity E/E/PE safety-related sys­tems, this should be the tar­get. As a min­im­um, all the safety func­tions which are executed are checked accord­ing to the E/E/PES safety require­ments spe­cific­a­tion. If sep­ar­ate chan­nels are used, these tests are done for each chan­nel sep­ar­ately.

The 20-year life cycle assump­tion used in the stand­ards also applies to proof test­ing. Machine con­trols are assumed to get at least one proof test in their life­time. The proof test should be designed to detect faults that the auto­mat­ic dia­gnostics can­not detect. Proof tests are also con­duc­ted after major rebuilds and repairs to ensure that the sys­tem oper­ates cor­rectly.

If you know the architecture of the emergency stop control system, you can determine the test rate based on the demand rate. It would be considerably easier if the standards just gave us some minimum test rates for the various architectures. One standard, ISO 14119 [11] on interlocks, does just that. Admittedly, this standard does not include emergency stop functions within its scope, as its focus is on interlocks, but since interlocking systems are more critical than the complementary protective measures that back them up, it would be reasonable to apply these same rules. Looking at the clause on Assessment of Faults, [11, 8.2], we find this guidance:

For applications using interlocking devices with automatic monitoring to achieve the necessary diagnostic coverage for the required safety performance, a functional test (see IEC 60204-1:2005, 9.4.2.4) can be carried out every time the device changes its state, e.g. at every access. If, in such a case, there is only infrequent access, the interlocking device shall be used with additional measures, because between consecutive functional tests the probability of occurrence of an undetected fault is increased.

When a manu­al func­tion­al test is neces­sary to detect a pos­sible accu­mu­la­tion of faults, it shall be made with­in the fol­low­ing test inter­vals:

  • at least every month for PLe with Category 3 or Category 4 (according to ISO 13849-1) or SIL 3 with HFT (hardware fault tolerance) = 1 (according to IEC 62061);
  • at least every 12 months for PLd with Category 3 (according to ISO 13849-1) or SIL 2 with HFT (hardware fault tolerance) = 1 (according to IEC 62061).

NOTE It is recom­men­ded that the con­trol sys­tem of a machine demands these tests at the required inter­vals e.g. by visu­al dis­play unit or sig­nal lamp. The con­trol sys­tem should mon­it­or the tests and stop the machine if the test is omit­ted or fails.

In the pre­ced­ing, HFT=1 is equi­val­ent to say­ing that the sys­tem is single-fault tol­er­ant.
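
As a rough illustration of how a control system or maintenance scheduler might track these intervals, here is a minimal sketch. The table only encodes the two intervals quoted above from ISO 14119; treating “every month” as 31 days and “every 12 months” as 365 days is my own simplification, and the names are hypothetical.

```python
from datetime import date, timedelta

# Maximum manual functional test intervals, keyed by (PL, Category).
# The day counts are an assumption of this sketch, not wording from the standard.
MAX_MANUAL_TEST_INTERVAL = {
    ("e", 3): timedelta(days=31),
    ("e", 4): timedelta(days=31),
    ("d", 3): timedelta(days=365),
}


def is_manual_test_overdue(pl, category, last_test, today):
    """Return True if the manual functional test interval has been exceeded."""
    interval = MAX_MANUAL_TEST_INTERVAL[(pl, category)]
    return today - last_test > interval


print(is_manual_test_overdue("e", 3, date(2015, 1, 1), date(2015, 2, 15)))  # True
print(is_manual_test_overdue("d", 3, date(2015, 1, 1), date(2015, 6, 1)))   # False
```

A combination that is not in the table raises a KeyError on purpose, since the quoted clause gives no interval for it.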

This leaves us then with recom­men­ded test fre­quen­cies for Cat­egory 2 and 3 archi­tec­tures in PLc, PLd, and PLe, or for SIL 2 and 3 with HFT=1. We still don’t have a test fre­quency for PLc, Cat­egory 1 sys­tems. There is no expli­cit guid­ance for these sys­tems in the stand­ards. How can we determ­ine a test rate for these sys­tems?

My approach would be to start by examining the MTTFd values for all of the subsystems and components. [6] requires that the system has a HIGH MTTFd value, meaning 30 years <= MTTFd <= 100 years [6, Table 5]. If this is the case, then the once-in-20-years proof test is theoretically enough. If the system is constructed, for example, as shown in Figure 3 below, then each of the four components in the channel (PB1, PB2, MCR, and MOV) would have to have an MTTFd of at least 120 years for the channel as a whole to reach 30 years. See [6, Annex C] for this calculation.

Figure 3 – Basic Stop/Start Circuit

PB1 – Emer­gency Stop But­ton

PB2 – Power “ON” But­ton

MCR – Mas­ter Con­trol Relay

MOV – Surge Sup­press­or on MCR Coil

M1 – Machine prime mover (motor)

Note that the fuses are not included, since they can only fail to safety, and assum­ing that they were spe­cified cor­rectly in the ori­gin­al design, are not sub­ject to the same cyc­lic­al aging effects as the oth­er com­pon­ents.

M1 is not included since it is the con­trolled por­tion of the machine and is not part of the con­trol sys­tem.
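
The 120-year figure for the basic stop/start circuit can be checked with a quick calculation. This sketch assumes the usual reciprocal-sum (“parts count”) combination for a single channel, 1/MTTFd = sum of 1/MTTFd,i, and assumes all four components have the same MTTFd; the function names are mine.

```python
from fractions import Fraction


def channel_mttfd(component_mttfd_years):
    """Combine component MTTFd values (years) for a single channel.

    Uses the reciprocal sum 1/MTTFd = sum(1/MTTFd_i); exact fractions avoid
    floating-point rounding right at the 30-year boundary.
    """
    total_rate = sum(1 / Fraction(m) for m in component_mttfd_years)
    return 1 / total_rate


def is_high_mttfd(mttfd_years):
    """HIGH band as quoted in the text: 30 years <= MTTFd <= 100 years."""
    return 30 <= mttfd_years <= 100


# PB1, PB2, MCR and MOV, each assumed to have MTTFd = 120 years.
channel = channel_mttfd([120, 120, 120, 120])
print(float(channel), is_high_mttfd(channel))  # 30.0 True
```

With four components at 100 years each, the same calculation gives 25 years, which falls short of the HIGH band.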

If a review of the com­pon­ents of the sys­tem shows that any single com­pon­ent falls below the tar­get MTTFD, then I would con­sider repla­cing the sys­tem with a high­er cat­egory design. Since most of these com­pon­ents will be unlikely to have MTTFD val­ues on the spec sheet, you will likely have to con­vert from total life val­ues (B10). This is out­side the scope of this art­icle, but you can find guid­ance in [6, Annex C]. More fre­quent test­ing, i.e., more than once in 20 years, is always accept­able.
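
For readers who want a feel for the B10 conversion, here is a minimal sketch based on the relationships given in [6, Annex C]: MTTFd = B10d / (0.1 x nop), where nop is the mean number of operations per year. The component values below are hypothetical, and the B10d = 2 x B10 shortcut assumes that half of all failures are dangerous, so confirm both against the component data and your edition of the standard.

```python
def operations_per_year(days_per_year, hours_per_day, cycle_time_s):
    """Mean number of operations per year (nop) from the duty cycle."""
    return days_per_year * hours_per_day * 3600 / cycle_time_s


def mttfd_from_b10d(b10d_cycles, n_op_per_year):
    """MTTFd in years from B10d (cycles) and operations per year."""
    return b10d_cycles / (0.1 * n_op_per_year)


# Hypothetical contactor: B10 = 500,000 cycles on the spec sheet.
# Assumption: 50% of failures are dangerous, so B10d = 2 x B10.
b10d = 2 * 500_000

# Hypothetical duty: 250 days/year, 16 hours/day, one operation per hour.
n_op = operations_per_year(250, 16, 3600)

print(n_op)                                   # 4000.0
print(round(mttfd_from_b10d(b10d, n_op), 1))  # 2500.0
```

An emergency stop button that is rarely pressed will have a very low nop, which is part of why the calculated MTTFd for these devices can look reassuringly large even though the device is seldom exercised.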

Where manu­al test­ing is required as part of the design for any cat­egory of sys­tem, and par­tic­u­larly in Cat­egory 1 or 2 sys­tems, the con­trol sys­tem should alert the user to the require­ment and not per­mit the machine to oper­ate until the test is com­pleted. This will help to ensure that the requis­ite tests are prop­erly com­pleted.

Need more inform­a­tion? Leave a com­ment below, or send me an email with the details of your applic­a­tion!

Definitions

3.1.9 [10]
func­tion­al safety
part of the over­all safety relat­ing to the EUC and the EUC con­trol sys­tem which depends on the cor­rect func­tion­ing of the E/E/PE safety-related sys­tems, oth­er tech­no­logy safety-related sys­tems and extern­al risk reduc­tion facil­it­ies
3.2.6 [10]
electrical/electronic/programmable elec­tron­ic (E/E/PE)
based on elec­tric­al (E) and/or elec­tron­ic (E) and/or pro­gram­mable elec­tron­ic (PE) tech­no­logy

NOTE – The term is inten­ded to cov­er any and all devices or sys­tems oper­at­ing on elec­tric­al prin­ciples.

EXAMPLE Electrical/electronic/programmable elec­tron­ic devices include

  • elec­tromech­an­ic­al devices (elec­tric­al);
  • sol­id-state non-pro­gram­mable elec­tron­ic devices (elec­tron­ic);
  • elec­tron­ic devices based on com­puter tech­no­logy (pro­gram­mable elec­tron­ic); see 3.2.5
3.5.1 [10]
safety func­tion
func­tion to be imple­men­ted by an E/E/PE safety-related sys­tem, oth­er tech­no­logy safety-related sys­tem or extern­al risk reduc­tion facil­it­ies, which is inten­ded to achieve or main­tain a safe state for the EUC, in respect of a spe­cif­ic haz­ard­ous event (see 3.4.1)
3.5.2 [10]
safety integ­rity
prob­ab­il­ity of a safety-related sys­tem sat­is­fact­or­ily per­form­ing the required safety func­tions under all the stated con­di­tions with­in a stated peri­od of time
NOTE 1 – The high­er the level of safety integ­rity of the safety-related sys­tems, the lower the prob­ab­il­ity that the safety-related sys­tems will fail to carry out the required safety func­tions.
NOTE 2 – There are four levels of safety integ­rity for sys­tems (see 3.5.6).
3.5.6 [10]
safety integ­rity level (SIL)
dis­crete level (one out of a pos­sible four) for spe­cify­ing the safety integ­rity require­ments of the safety func­tions to be alloc­ated to the E/E/PE safety-related sys­tems, where safety integ­rity level 4 has the highest level of safety integ­rity and safety integ­rity level 1 has the low­est
NOTE – The target failure measures (see 3.5.13) for the four safety integrity levels are specified in tables 2 and 3 of IEC 61508-1.
3.6.3 [10]
fault tol­er­ance
abil­ity of a func­tion­al unit to con­tin­ue to per­form a required func­tion in the pres­ence of faults or errors
NOTE – The definition in IEV 191-15-05 refers only to sub-item faults. See the note for the term fault in 3.6.1.
[ISO/IEC 2382-14-04-061]
3.1.1 [6]
safety-related part of a control system (SRP/CS)
part of a con­trol sys­tem that responds to safety-related input sig­nals and gen­er­ates safety-related out­put sig­nals
NOTE 1 The com­bined safety-related parts of a con­trol sys­tem start at the point where the safety-related input sig­nals are ini­ti­ated (includ­ing, for example, the actu­at­ing cam and the roller of the pos­i­tion switch) and end at the out­put of the power con­trol ele­ments (includ­ing, for example, the main con­tacts of a con­tact­or).
NOTE 2 If mon­it­or­ing sys­tems are used for dia­gnostics, they are also con­sidered as SRP/CS.
3.1.2 [6]
cat­egory
clas­si­fic­a­tion of the safety-related parts of a con­trol sys­tem in respect of their res­ist­ance to faults and their sub­sequent beha­viour in the fault con­di­tion, and which is achieved by the struc­tur­al arrange­ment of the parts, fault detec­tion and/or by their reli­ab­il­ity
3.1.3 [6]
fault
state of an item char­ac­ter­ized by the inab­il­ity to per­form a required func­tion, exclud­ing the inab­il­ity dur­ing pre­vent­ive main­ten­ance or oth­er planned actions, or due to lack of extern­al resources

NOTE 1 A fault is often the res­ult of a fail­ure of the item itself, but may exist without pri­or fail­ure.
[IEC 60050-191:1990, 05-01]

NOTE 2 In this part of ISO 13849, “fault” means ran­dom fault.

3.1.4 [6]
fail­ure
ter­min­a­tion of the abil­ity of an item to per­form a required func­tion

NOTE 1 After a fail­ure, the item has a fault.

NOTE 2 “Fail­ure” is an event, as dis­tin­guished from “fault”, which is a state.

NOTE 3 The concept as defined does not apply to items con­sist­ing of soft­ware only.
[IEC 60050-191:1990, 04-01]

NOTE 4 Fail­ures which only affect the avail­ab­il­ity of the pro­cess under con­trol are out­side of the scope of this part of ISO 13849.

3.1.5 [6]
dan­ger­ous fail­ure
fail­ure which has the poten­tial to put the SRP/CS in a haz­ard­ous or fail-to-func­tion state

NOTE 1 Wheth­er or not the poten­tial is real­ized can depend on the chan­nel archi­tec­ture of the sys­tem; in redund­ant sys­tems, a dan­ger­ous hard­ware fail­ure is less likely to lead to the over­all dan­ger­ous or fail-to-func­tion state.

NOTE 2 Adapted from IEC 61508-4:1998, definition 3.6.7.

3.1.20 [6]
safety func­tion
func­tion of the machine whose fail­ure can res­ult in an imme­di­ate increase of the risk(s)
[ISO 12100-1:2003, 3.28]
3.1.21 [6]
mon­it­or­ing
safety func­tion which ensures that a pro­tect­ive meas­ure is ini­ti­ated if the abil­ity of a com­pon­ent or an ele­ment to per­form its func­tion is dimin­ished or if the pro­cess con­di­tions are changed in such a way that a decrease of the amount of risk reduc­tion is gen­er­ated
3.1.22 [6]
pro­gram­mable elec­tron­ic sys­tem (PES)
sys­tem for con­trol, pro­tec­tion or mon­it­or­ing depend­ent for its oper­a­tion on one or more pro­gram­mable elec­tron­ic devices, includ­ing all ele­ments of the sys­tem such as power sup­plies, sensors and oth­er input devices, con­tact­ors and oth­er out­put devices

NOTE Adapted from IEC 61508-4:1998, definition 3.3.2.

3.1.23 [6]
per­form­ance level (PL)
dis­crete level used to spe­cify the abil­ity of safety-related parts of con­trol sys­tems to per­form a safety func­tion under fore­see­able con­di­tions

NOTE See 4.5.1.

3.1.25 [6]
mean time to dan­ger­ous fail­ure (MTTFd)
expect­a­tion of the mean time to dan­ger­ous fail­ure

NOTE Adap­ted from IEC 62061:2005, defin­i­tion 3.2.34.

3.1.26 [6]
dia­gnost­ic cov­er­age (DC)
meas­ure of the effect­ive­ness of dia­gnostics, which may be determ­ined as the ratio between the fail­ure rate of detec­ted dan­ger­ous fail­ures and the fail­ure rate of total dan­ger­ous fail­ures

NOTE 1 Dia­gnost­ic cov­er­age can exist for the whole or parts of a safety-related sys­tem. For example, dia­gnost­ic cov­er­age could exist for sensors and/or logic sys­tem and/or final ele­ments.

NOTE 2 Adapted from IEC 61508-4:1998, definition 3.8.6.

3.1.33 [6]
safety integ­rity level (SIL)
dis­crete level (one out of a pos­sible four) for spe­cify­ing the safety integ­rity require­ments of the safety func­tions to be alloc­ated to the E/E/PE safety-related sys­tems, where safety integ­rity level 4 has the highest level of safety integ­rity and safety integ­rity level 1 has the low­est
[IEC 61508-4:1998, 3.5.6]

Acknowledgements

Thanks to my col­leagues Derek Jones and Jonath­an John­son, both from Rock­well Auto­ma­tion, and mem­bers of ISO TC199. Their sug­ges­tion to ref­er­ence ISO 14119 clause 8.2 was the seed for this art­icle.

I’d also like to acknowledge Ronald Sykes, Howard Touski, Mirela Moga, Michael Roland, and Grant Rider for asking the questions that led to this article.

References

[1] Safety of machinery – General principles for design – Risk assessment and risk reduction. ISO 12100. International Organization for Standardization (ISO). Geneva. 2010.

[2] Safeguarding of Machinery. CSA Z432. Canadian Standards Association. Toronto. 2004.

[3] Safety of machinery – Emergency stop – Principles for design. ISO 13850. International Organization for Standardization (ISO). Geneva. 2006.

[4] Electrical Standard for Industrial Machinery. NFPA 79. National Fire Protection Association (NFPA). Batterymarch Park. 2015.

[5] Safety of machinery – Electrical equipment of machines – Part 1: General requirements. IEC 60204-1. International Electrotechnical Commission (IEC). Geneva. 2009.

[6] Safety of machinery – Safety-related parts of control systems – Part 1: General principles for design. ISO 13849-1. International Organization for Standardization (ISO). Geneva. 2006.

[7] Safety of machinery – Risk assessment – Part 2: Practical guidance and examples of methods. ISO/TR 14121-2. International Organization for Standardization (ISO). Geneva. 2012.

[8] Safety of machinery – Functional safety of safety-related electrical, electronic and programmable electronic control systems. IEC 62061. International Electrotechnical Commission (IEC). Geneva. 2005.

[9] D. J. Wilkins (2002, November). “The Bathtub Curve and Product Failure Behavior. Part One – The Bathtub Curve, Infant Mortality and Burn-in”. Reliability Hotline [Online]. Available: http://www.weibull.com/hotwire/issue21/hottopics21.htm. [Accessed: 26-Apr-2015].

[10] Functional safety of electrical/electronic/programmable electronic safety-related systems – Part 4: Definitions and abbreviations. IEC 61508-4. International Electrotechnical Commission (IEC). Geneva. 1998.

[11] Safety of machinery – Interlocking devices associated with guards – Principles for design and selection. ISO 14119. International Organization for Standardization (ISO). Geneva. 2013.

Sources for Standards

CANADA

Cana­dian Stand­ards Asso­ci­ation sells CSA, ISO and IEC stand­ards to the Cana­dian Mar­ket.

USA

ANSI offers stand­ards from most US Stand­ards Devel­op­ment Organ­iz­a­tions. They also sell ISO and IEC stand­ards into the US mar­ket.


International

Inter­na­tion­al Organ­iz­a­tion for Stand­ard­iz­a­tion (ISO).

Inter­na­tion­al Elec­tro­tech­nic­al Com­mis­sion (IEC).

Europe

Each EU mem­ber state has their own stand­ards body. For reas­ons unknown to me, each stand­ards body can set their own pri­cing for the doc­u­ments they sell. All offer Eng­lish lan­guage cop­ies, in addi­tion to cop­ies in the offi­cial language(s) of the mem­ber state. My best advice is to shop around a bit. Prices can vary by as much as 10:1.

Brit­ish Stand­ards Insti­tute (BSi) $$$

Dan­ish Stand­ards (DS) $

Esto­ni­an Stand­ards (EVS) $

Ger­man stand­ards (DIN) – Beuth Ver­lag GmbH


Author: Doug Nix

Doug Nix is Managing Director and Principal Consultant at Compliance InSight Consulting, Inc. (http://www.complianceinsight.ca) in Kitchener, Ontario, and is Lead Author and Senior Editor of the Machinery Safety 101 blog. Doug's work includes teaching machinery risk assessment techniques privately and through Conestoga College Institute of Technology and Advanced Learning in Kitchener, Ontario, as well as providing technical services and training programs to clients related to risk assessment, industrial machinery safety, safety-related control system integration and reliability, laser safety and regulatory conformity. For more see Doug's LinkedIn profile.