Testing Emergency Stop Systems

This entry is part 11 of 13 in the series Emer­gency Stop

I’ve had a number of questions from readers regarding testing of emergency stop systems, particularly about the frequency of testing. I addressed the types of tests that might be needed in another article, Checking Emergency Stop Systems. This article focuses on the frequency of testing rather than the types of tests.

The Problem

Emergency stop systems are considered to be “complementary protective measures” in key machinery safety standards like ISO 12100 [1] and CSA Z432 [2]; this makes emergency stop systems the backup to the primary safeguards. Complementary protective measures are intended to permit “avoiding or limiting the harm” that may result from an emergent situation. By definition, this is a situation that has not been foreseen by the machine builder, or is the result of another failure. This could be a failure of another safeguarding system, or a failure in the machine that is not controlled by other means. For example, a workpiece shatters due to a material flaw, and the broken pieces damage the machine, creating new, uncontrolled failure conditions in the machine.

Emergency stop systems are manually triggered, and usually used infrequently. The lack of use means that functional testing of the system doesn’t happen in the normal course of operation of the machinery. Some types of faults may occur and remain undetected until the system is actually used, e.g., contact blocks falling off the back of the operator device. Failure at that point may be catastrophic, since by implication the primary safeguards have already failed, and thus the failure of the backup eliminates the possibility of avoiding or limiting harm.

To under­stand the test­ing require­ments, it’s impor­tant to under­stand the risk and reli­a­bil­i­ty require­ments that dri­ve the design of emer­gency stop sys­tems, and then get into the test fre­quen­cy ques­tion.

Requirements

In the past, there were no explic­it require­ments for emer­gency stop sys­tem reli­a­bil­i­ty. Details like the colour of the oper­a­tor device, or the way the stop func­tion worked were defined in ISO 13850 [3], NFPA 79 [4], and IEC 60204–1 [5]. In the soon-to-be pub­lished 3rd edi­tion of ISO 13850, a new pro­vi­sion requir­ing emer­gency stop sys­tems to meet at least PLc will be added [6], but until pub­li­ca­tion, it is up to the design­er to deter­mine the safe­ty integri­ty lev­el, either PL or SIL, required. To deter­mine the require­ments for any safe­ty func­tion, the key is to start at the risk assess­ment. The risk assess­ment process requires that the design­er under­stand the stage in the life cycle of the machine, the task(s) that will be done, and the spe­cif­ic haz­ards that a work­er may be exposed to while con­duct­ing the task. This can become quite com­plex when con­sid­er­ing main­te­nance and ser­vice tasks, and also applies to fore­see­able fail­ure modes of the machin­ery or the process. The scor­ing or rank­ing of risk can be accom­plished using any suit­able risk scor­ing tool that meets the min­i­mum require­ments in [1]. There are some good exam­ples giv­en in ISO/TR 14121–2 [7] if you are look­ing for some guid­ance. There are many good engi­neer­ing text­books avail­able as well. Have a look at our Book List for some sug­ges­tions if you want a deep­er dive.

Reliability

Once the ini­tial unmit­i­gat­ed risk is under­stood, risk con­trol mea­sures can be spec­i­fied. Wher­ev­er the con­trol sys­tem is used as part of the risk con­trol mea­sure, a safe­ty func­tion must be spec­i­fied. Spec­i­fi­ca­tion of the safe­ty func­tion includes the Per­for­mance Lev­el (PL), archi­tec­tur­al cat­e­go­ry (B, 1–4), Mean Time to Dan­ger­ous Fail­ure (MTTFd), and Diag­nos­tic Cov­er­age (DC) [6], or Safe­ty Integri­ty Lev­el (SIL), and Hard­ware Fault Tol­er­ance (HFT), as described in IEC 62061 [8], as a min­i­mum. If you are unfa­mil­iar with these terms, see the def­i­n­i­tions at the end of the arti­cle.

Referring to Figure 1, the “Risk Graph” [6, Annex A], we can reasonably state that for most machinery, a failure mode or emergent condition is likely to create conditions where the severity of injury is likely to require more than basic first aid, so selecting “S2” is the first step. In these situations, and particularly where the failure modes are not well understood, the highest level of severity of injury, S2, is selected because we don’t have enough information to expect that the injuries would only be minor. As soon as we make this selection, it is no longer possible to select any combination of Frequency or Probability parameters that will result in anything lower than PLc.

It’s impor­tant to under­stand that Fig­ure 1 is not a risk assess­ment tool, but rather a deci­sion tree used to select an appro­pri­ate PL based on the rel­e­vant risk para­me­ters. Those para­me­ters are:

Table 1 — Risk Parameters

| Severity of injury | Frequency and/or exposure to hazard | Possibility of avoiding hazard or limiting harm |
|---|---|---|
| S1 — slight (normally reversible injury) | F1 — seldom-to-less-often and/or exposure time is short | P1 — possible under specific conditions |
| S2 — serious (normally irreversible injury or death) | F2 — frequent-to-continuous and/or exposure time is long | P2 — scarcely possible |

Decision tree used to determine PL based on risk parameters.
Fig­ure 1 — “Risk Graph” for deter­min­ing PL
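The decision tree in Figure 1 can be sketched as a simple lookup table. This is only an illustration: the mapping below is my transcription of the commonly published form of the risk graph in [6, Annex A], and the names `RISK_GRAPH`, `required_pl` and `lowest_pl_for` are hypothetical. Check the values against your own copy of the standard before relying on them.

```python
# Sketch of the Figure 1 risk graph as a lookup table.
# ASSUMPTION: the S/F/P -> PL mapping is transcribed from the commonly
# published ISO 13849-1 risk graph; verify it against the standard.
RISK_GRAPH = {
    ("S1", "F1", "P1"): "a",
    ("S1", "F1", "P2"): "b",
    ("S1", "F2", "P1"): "b",
    ("S1", "F2", "P2"): "c",
    ("S2", "F1", "P1"): "c",
    ("S2", "F1", "P2"): "d",
    ("S2", "F2", "P1"): "d",
    ("S2", "F2", "P2"): "e",
}


def required_pl(severity, frequency, possibility):
    """Return the required performance level letter for one S/F/P choice."""
    key = (severity, frequency, possibility)
    if key not in RISK_GRAPH:
        raise ValueError(f"unknown risk parameters: {key}")
    return RISK_GRAPH[key]


def lowest_pl_for(severity):
    """Return the lowest PL reachable once the severity has been chosen."""
    return min(pl for (s, _, _), pl in RISK_GRAPH.items() if s == severity)


if __name__ == "__main__":
    # With S2 selected, no F/P combination gives less than PLc.
    print("Lowest PL with S2:", lowest_pl_for("S2"))
```

Calling `lowest_pl_for("S2")` shows the point made above: once S2 is selected, the remaining parameters can only raise the result from PLc.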

PLc can be accom­plished using any of three archi­tec­tures: Cat­e­go­ry 1, 2, or 3. If you are unsure about what these archi­tec­tures rep­re­sent, have a look at my series cov­er­ing this top­ic.

Cat­e­go­ry 1 is sin­gle chan­nel, and does not include any diag­nos­tics. A sin­gle fault can cause the loss of the safe­ty func­tion (i.e., the machine still runs even though the e-stop but­ton is pressed). Using Cat­e­go­ry 1, the reli­a­bil­i­ty of the design is based on the use of high­ly reli­able com­po­nents and well-tried safe­ty prin­ci­ples. This approach can fail to dan­ger.

Cat­e­go­ry 2 adds some diag­nos­tic capa­bil­i­ty to the basic sin­gle chan­nel con­fig­u­ra­tion and does not require the use of “well-tried” com­po­nents. This approach can also fail to dan­ger.

Cat­e­go­ry 3 archi­tec­ture adds a redun­dant chan­nel, and includes diag­nos­tic cov­er­age. Cat­e­go­ry 3 is not sub­ject to fail­ure due to sin­gle faults and is called “sin­gle-fault tol­er­ant”. This approach is less like­ly to fail to dan­ger, but still can in the pres­ence of mul­ti­ple, unde­tect­ed, faults.

A key concept in reliability is the “fault”. This can be any kind of defect in hardware or software that results in unwanted behaviour or a failure. Faults are further broken down into dangerous and safe faults, meaning those that result in a dangerous outcome, and those that do not. Finally, each of these classes is broken down into detectable and undetectable faults. I’m not going to get into the mathematical treatment of these classes, but my point is this: there are undetectable dangerous faults. These are faults that cannot be detected by built-in diagnostics. As designers, we try to design the control system so that the undetectable dangerous faults are extremely rare; ideally, they should occur much less than once in the lifetime of the machine.

What is the life­time of the machine? The stan­dards writ­ers have set­tled on a default life­time of 20 years, thus the answer is that unde­tectable dan­ger­ous fail­ures should hap­pen much less than once in twen­ty years of 24/7/365 oper­a­tion. So why does this mat­ter? Each archi­tec­tur­al cat­e­go­ry has dif­fer­ent require­ments for test­ing. The test rates are dri­ven by the “Demand Rate”. The Demand Rate is defined in [6]. “SRP/CS” stands for “Safe­ty Relat­ed Part of the Con­trol Sys­tem” in the def­i­n­i­tion:

3.1.30
demand rate (rd) — fre­quen­cy of demands for a safe­ty-relat­ed action of the SRP/CS

Each time the emer­gency stop but­ton is pressed, a “demand” is put on the sys­tem. Look­ing at the “Sim­pli­fied Pro­ce­dure for esti­mat­ing PL”, [6, 4.5.4], we find that the stan­dard makes the fol­low­ing assump­tions:

  • mis­sion time, 20 years (see Clause 10);
  • con­stant fail­ure rates with­in the mis­sion time;
  • for cat­e­go­ry 2, demand rate <= 1/100 test rate;
  • for category 2, MTTFD,TE larger than half of MTTFD,L.

NOTE When blocks of each chan­nel can­not be sep­a­rat­ed, the fol­low­ing can be applied: MTTFD of the sum­ma­rized test chan­nel (TE, OTE) larg­er than half MTTFD of the sum­ma­rized func­tion­al chan­nel (I, L, O).

So what does all that mean? The 20-year mis­sion time is the assumed life­time of the machin­ery. This num­ber under­pins the rest of the cal­cu­la­tions in the stan­dard and is based on the idea that few mod­ern con­trol sys­tems last longer than 20 years with­out being replaced or rebuilt. The con­stant fail­ure rate points at the idea that sys­tems used in the field will have com­po­nents and con­trols that are not sub­ject to infant mor­tal­i­ty, nor are they old enough to start to fail due to age, but rather that the sys­tem is oper­at­ing in the flat por­tion of the stan­dard­ized fail­ure rate “bath­tub curve”, [9]. See Fig­ure 2. Com­po­nents that are sub­ject to infant mor­tal­i­ty failed at the fac­to­ry and were removed from the sup­ply chain. Those fail­ing from “wear-out” are expect­ed to reach that point after 20 years. If this is not the case, then the main­te­nance instruc­tions for the sys­tem should include pre­ven­ta­tive main­te­nance tasks that require replac­ing crit­i­cal com­po­nents before they reach the pre­dict­ed MTTFd.
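To get a feel for what a constant failure rate implies over the 20-year mission time, here is a minimal sketch. It assumes the usual exponential model that goes with a constant failure rate, with the rate taken as 1/MTTFd; the function name `prob_dangerous_failure` is hypothetical and the MTTFd values in the example are illustrative only.

```python
import math

MISSION_TIME_YEARS = 20.0  # default mission time assumed in the text


def prob_dangerous_failure(mttfd_years, t_years=MISSION_TIME_YEARS):
    """Probability of at least one dangerous failure by time t.

    ASSUMPTION: constant failure rate (flat part of the bathtub curve),
    so time to failure is exponentially distributed with rate 1/MTTFd.
    """
    if mttfd_years <= 0:
        raise ValueError("MTTFd must be positive")
    return 1.0 - math.exp(-t_years / mttfd_years)


if __name__ == "__main__":
    for mttfd in (30.0, 100.0):
        p = prob_dangerous_failure(mttfd)
        print(f"MTTFd = {mttfd:5.0f} years -> P(failure within 20 years) = {p:.2f}")
```

Under this model, even a channel at the top of the HIGH range still has a noticeable chance of a dangerous failure within the mission time, which is part of why diagnostics and proof testing matter.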

Diagram of a standardized bathtub-shaped failure rate curve.
Figure 2 — Weibull Bathtub Curve [9]

For systems using Category 2 architecture, the automatic diagnostic test rate must be at least 100x the demand rate. Keep in mind that this test rate is normally accomplished automatically in the design of the controls, and is only related to the detectable safe or dangerous faults. Undetectable faults must have a probability of less than once in 20 years, and should be detected by the “proof test”. More on that a bit later.

Finally, the MTTFD of the test (diagnostic) channel must be larger than half that of the functional channel.
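The two Category 2 assumptions quoted from [6, 4.5.4] can be checked with a few lines of code. This is a sketch only: the function and parameter names are hypothetical, both rates must be expressed in the same unit (for example, events per year), and the numbers in the example are made up for illustration.

```python
def category2_assumptions(demand_rate, test_rate, mttfd_test, mttfd_functional):
    """Check the two Category 2 assumptions of the simplified procedure.

    demand_rate, test_rate : same unit, e.g. events per year
    mttfd_test             : MTTFD of the test channel (TE), in years
    mttfd_functional       : MTTFD of the functional channel (L), in years
    """
    return {
        # demand rate <= 1/100 test rate
        "test_rate_ok": demand_rate <= test_rate / 100.0,
        # MTTFD,TE larger than half of MTTFD,L
        "mttfd_ok": mttfd_test > mttfd_functional / 2.0,
    }


if __name__ == "__main__":
    # Illustrative values: e-stop pressed about once a month,
    # automatic test run once per hour of continuous operation.
    result = category2_assumptions(
        demand_rate=12.0,
        test_rate=24.0 * 365.0,
        mttfd_test=20.0,
        mttfd_functional=35.0,
    )
    print(result)
```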

Cat­e­go­ry 1 has no diag­nos­tics, so there is no guid­ance in [6] to help us out with these sys­tems. Cat­e­go­ry 3 is sin­gle fault tol­er­ant, so as long as we don’t have mul­ti­ple unde­tect­ed faults we can count on the sys­tem to func­tion and to alert us when a sin­gle fault occurs; remem­ber that the auto­mat­ic tests may not be able to detect every fault. This is where the “proof test” comes in. What is a proof test? To find a def­i­n­i­tion for the proof test, we have to look at IEC 61508–4 [10]:

3.8.5
proof test
peri­od­ic test per­formed to detect fail­ures in a safe­ty-relat­ed sys­tem so that, if nec­es­sary, the sys­tem can be restored to an “as new” con­di­tion or as close as prac­ti­cal to this con­di­tion

NOTE — The effec­tive­ness of the proof test will be depen­dent upon how close to the “as new” con­di­tion the sys­tem is restored. For the proof test to be ful­ly effec­tive, it will be nec­es­sary to detect 100% of all dan­ger­ous fail­ures. Although in prac­tice 100% is not eas­i­ly achieved for oth­er than low-com­plex­i­ty E/E/PE safe­ty-relat­ed sys­tems, this should be the tar­get. As a min­i­mum, all the safe­ty func­tions which are exe­cut­ed are checked accord­ing to the E/E/PES safe­ty require­ments spec­i­fi­ca­tion. If sep­a­rate chan­nels are used, these tests are done for each chan­nel sep­a­rate­ly.

The 20-year life cycle assump­tion used in the stan­dards also applies to proof test­ing. Machine con­trols are assumed to get at least one proof test in their life­time. The proof test should be designed to detect faults that the auto­mat­ic diag­nos­tics can­not detect. Proof tests are also con­duct­ed after major rebuilds and repairs to ensure that the sys­tem oper­ates cor­rect­ly.

If you know the architecture of the emergency stop control system, you can determine the test rate based on the demand rate. It would be considerably easier if the standards just gave us some minimum test rates for the various architectures. One standard, ISO 14119 [11] on interlocks, does just that. Admittedly, this standard does not include emergency stop functions within its scope, as its focus is on interlocks, but since interlocking systems are more critical than the complementary protective measures that back them up, it would be reasonable to apply these same rules. Looking at the clause on Assessment of Faults, [11, 8.2], we find this guidance:

For appli­ca­tions using inter­lock­ing devices with auto­mat­ic mon­i­tor­ing to achieve the nec­es­sary diag­nos­tic cov­er­age for the required safe­ty per­for­mance, a func­tion­al test (see IEC 60204–1:2005, 9.4.2.4) can be car­ried out every time the device changes its state, e.g. at every access. If, in such a case, there is only infre­quent access, the inter­lock­ing device shall be used with addi­tion­al mea­sures, because between con­sec­u­tive func­tion­al tests the prob­a­bil­i­ty of occur­rence of an unde­tect­ed fault is increased.

When a man­u­al func­tion­al test is nec­es­sary to detect a pos­si­ble accu­mu­la­tion of faults, it shall be made with­in the fol­low­ing test inter­vals:

  • at least every month for PLe with Cat­e­go­ry 3 or Cat­e­go­ry 4 (accord­ing to ISO 13849–1) or SIL 3 with HFT (hard­ware fault tol­er­ance) = 1 (accord­ing to IEC 62061);
  • at least every 12 months for PLd with Cat­e­go­ry 3 (accord­ing to ISO 13849–1) or SIL 2 with HFT (hard­ware fault tol­er­ance) = 1 (accord­ing to IEC 62061).

NOTE It is rec­om­mend­ed that the con­trol sys­tem of a machine demands these tests at the required inter­vals e.g. by visu­al dis­play unit or sig­nal lamp. The con­trol sys­tem should mon­i­tor the tests and stop the machine if the test is omit­ted or fails.

In the pre­ced­ing, HFT=1 is equiv­a­lent to say­ing that the sys­tem is sin­gle-fault tol­er­ant.
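The intervals quoted above are easy to capture as a small lookup. The sketch below encodes only the two cases in the quoted text and returns `None` for everything else; the function name and its parameters are hypothetical, and applying these interlock intervals to emergency stop functions is this article's suggestion, not a requirement of the standard.

```python
def max_manual_test_interval_months(pl=None, category=None, sil=None, hft=None):
    """Return the longest manual functional test interval, in months.

    Encodes only the two cases quoted from [11, 8.2]. Returns None when
    the quoted text gives no explicit interval for the combination.
    """
    if (pl == "e" and category in (3, 4)) or (sil == 3 and hft == 1):
        return 1
    if (pl == "d" and category == 3) or (sil == 2 and hft == 1):
        return 12
    return None


if __name__ == "__main__":
    print("PLe, Cat. 4:", max_manual_test_interval_months(pl="e", category=4))
    print("PLd, Cat. 3:", max_manual_test_interval_months(pl="d", category=3))
    print("PLc, Cat. 1:", max_manual_test_interval_months(pl="c", category=1))
```

Returning `None` rather than a guessed default keeps the gap visible: combinations such as PLc with Category 1 have no interval in the quoted clause.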

This leaves us then with rec­om­mend­ed test fre­quen­cies for Cat­e­go­ry 2 and 3 archi­tec­tures in PLc, PLd, and PLe, or for SIL 2 and 3 with HFT=1. We still don’t have a test fre­quen­cy for PLc, Cat­e­go­ry 1 sys­tems. There is no explic­it guid­ance for these sys­tems in the stan­dards. How can we deter­mine a test rate for these sys­tems?

My approach would be to start by examining the MTTFd values for all of the subsystems and components. [6] requires that the system has a HIGH MTTFd value, meaning 30 years <= MTTFd <= 100 years [6, Table 5]. If this is the case, then the once-in-20-years proof test is theoretically enough. If the system is constructed, for example, as shown in Figure 3 below, then each component would have to have an MTTFd > 120 years. See [6, Annex C] for this calculation.

Basic Stop/Start Circuit
Figure 3 — Basic Stop/Start Circuit

PB1 — Emer­gency Stop But­ton

PB2 — Pow­er “ON” But­ton

MCR — Mas­ter Con­trol Relay

MOV — Surge Sup­pres­sor on MCR Coil

M1 — Machine prime mover (motor)

Note that the fus­es are not includ­ed, since they can only fail to safe­ty, and assum­ing that they were spec­i­fied cor­rect­ly in the orig­i­nal design, are not sub­ject to the same cycli­cal aging effects as the oth­er com­po­nents.

M1 is not includ­ed since it is the con­trolled por­tion of the machine and is not part of the con­trol sys­tem.
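The 120-year figure follows from treating the four remaining components (PB1, PB2, MCR and MOV) as a single series channel, where the reciprocal of the channel MTTFd is the sum of the reciprocals of the component values. The sketch below assumes that simplified parts count relationship; the function names are hypothetical.

```python
def channel_mttfd(component_mttfd_years):
    """Combine component MTTFd values for a single series channel.

    ASSUMPTION: simplified parts count relationship,
    1 / MTTFd(channel) = sum(1 / MTTFd(component)).
    """
    values = list(component_mttfd_years)
    if not values or any(m <= 0 for m in values):
        raise ValueError("need one or more positive MTTFd values")
    return 1.0 / sum(1.0 / m for m in values)


def required_component_mttfd(target_channel_mttfd, n_components):
    """MTTFd each of n identical components needs to reach the target."""
    return target_channel_mttfd * n_components


if __name__ == "__main__":
    # PB1, PB2, MCR and MOV; fuses and M1 are excluded as explained above.
    per_component = required_component_mttfd(30.0, 4)
    print("Required per component:", per_component, "years")
    print("Channel check:", round(channel_mttfd([per_component] * 4), 1), "years")
```

In practice the components will not have identical values, so `channel_mttfd` would be called with the actual figures for each part.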

If a review of the com­po­nents of the sys­tem shows that any sin­gle com­po­nent falls below the tar­get MTTFD, then I would con­sid­er replac­ing the sys­tem with a high­er cat­e­go­ry design. Since most of these com­po­nents will be unlike­ly to have MTTFD val­ues on the spec sheet, you will like­ly have to con­vert from total life val­ues (B10). This is out­side the scope of this arti­cle, but you can find guid­ance in [6, Annex C]. More fre­quent test­ing, i.e., more than once in 20 years, is always accept­able.

Where man­u­al test­ing is required as part of the design for any cat­e­go­ry of sys­tem, and par­tic­u­lar­ly in Cat­e­go­ry 1 or 2 sys­tems, the con­trol sys­tem should alert the user to the require­ment and not per­mit the machine to oper­ate until the test is com­plet­ed. This will help to ensure that the req­ui­site tests are prop­er­ly com­plet­ed.
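The logic of “remind the user and block operation until the test is done” can be sketched as follows. This only illustrates the decision logic; a real implementation belongs in the machine’s safety-related control system, not in a script. The class name, method names and the 30-day interval are all hypothetical.

```python
from datetime import date, timedelta


class ManualTestMonitor:
    """Tracks a manual functional test and blocks start when it is overdue."""

    def __init__(self, interval_days):
        self.interval = timedelta(days=interval_days)
        self.last_passed = None  # no passed test on record yet

    def record_test(self, when, passed):
        # A failed test clears the record so that operation stays blocked.
        self.last_passed = when if passed else None

    def test_due(self, today):
        if self.last_passed is None:
            return True
        return today - self.last_passed > self.interval

    def start_permitted(self, today):
        return not self.test_due(today)


if __name__ == "__main__":
    monitor = ManualTestMonitor(interval_days=30)
    monitor.record_test(date(2024, 1, 1), passed=True)
    print("Start on 15 Jan:", monitor.start_permitted(date(2024, 1, 15)))
    print("Start on 1 Mar:", monitor.start_permitted(date(2024, 3, 1)))
```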

Need more infor­ma­tion? Leave a com­ment below, or send me an email with the details of your appli­ca­tion!

Definitions

3.1.9 [10]
func­tion­al safe­ty
part of the over­all safe­ty relat­ing to the EUC and the EUC con­trol sys­tem which depends on the cor­rect func­tion­ing of the E/E/PE safe­ty-relat­ed sys­tems, oth­er tech­nol­o­gy safe­ty-relat­ed sys­tems and exter­nal risk reduc­tion facil­i­ties
3.2.6 [10]
electrical/electronic/programmable elec­tron­ic (E/E/PE)
based on elec­tri­cal (E) and/or elec­tron­ic (E) and/or pro­gram­ma­ble elec­tron­ic (PE) tech­nol­o­gy

NOTE — The term is intend­ed to cov­er any and all devices or sys­tems oper­at­ing on elec­tri­cal prin­ci­ples.

EXAMPLE Electrical/electronic/programmable elec­tron­ic devices include

  • electro­mechan­i­cal devices (elec­tri­cal);
  • sol­id-state non-pro­gram­ma­ble elec­tron­ic devices (elec­tron­ic);
  • elec­tron­ic devices based on com­put­er tech­nol­o­gy (pro­gram­ma­ble elec­tron­ic); see 3.2.5
3.5.1 [10]
safe­ty func­tion
func­tion to be imple­ment­ed by an E/E/PE safe­ty-relat­ed sys­tem, oth­er tech­nol­o­gy safe­ty-relat­ed sys­tem or exter­nal risk reduc­tion facil­i­ties, which is intend­ed to achieve or main­tain a safe state for the EUC, in respect of a spe­cif­ic haz­ardous event (see 3.4.1)
3.5.2 [10]
safe­ty integri­ty
prob­a­bil­i­ty of a safe­ty-relat­ed sys­tem sat­is­fac­to­ri­ly per­form­ing the required safe­ty func­tions under all the stat­ed con­di­tions with­in a stat­ed peri­od of time
NOTE 1 — The high­er the lev­el of safe­ty integri­ty of the safe­ty-relat­ed sys­tems, the low­er the prob­a­bil­i­ty that the safe­ty-relat­ed sys­tems will fail to car­ry out the required safe­ty func­tions.
NOTE 2 — There are four lev­els of safe­ty integri­ty for sys­tems (see 3.5.6).
3.5.6 [10]
safe­ty integri­ty lev­el (SIL)
dis­crete lev­el (one out of a pos­si­ble four) for spec­i­fy­ing the safe­ty integri­ty require­ments of the safe­ty func­tions to be allo­cat­ed to the E/E/PE safe­ty-relat­ed sys­tems, where safe­ty integri­ty lev­el 4 has the high­est lev­el of safe­ty integri­ty and safe­ty integri­ty lev­el 1 has the low­est
NOTE — The tar­get fail­ure mea­sures (see 3.5.13) for the four safe­ty integri­ty lev­els are spec­i­fied in tables 2 and 3 of IEC 61508–1.
3.6.3 [10]
fault tol­er­ance
abil­i­ty of a func­tion­al unit to con­tin­ue to per­form a required func­tion in the pres­ence of faults or errors
NOTE — The def­i­n­i­tion in IEV 191–15-05 refers only to sub-item faults. See the note for the term fault in 3.6.1.
[ISO/IEC 2382–14-04–061]
3.1.1 [6]
safety–related part of a con­trol sys­tem (SRP/CS)
part of a con­trol sys­tem that responds to safe­ty-relat­ed input sig­nals and gen­er­ates safe­ty-relat­ed out­put sig­nals
NOTE 1 The com­bined safe­ty-relat­ed parts of a con­trol sys­tem start at the point where the safe­ty-relat­ed input sig­nals are ini­ti­at­ed (includ­ing, for exam­ple, the actu­at­ing cam and the roller of the posi­tion switch) and end at the out­put of the pow­er con­trol ele­ments (includ­ing, for exam­ple, the main con­tacts of a con­tac­tor).
NOTE 2 If mon­i­tor­ing sys­tems are used for diag­nos­tics, they are also con­sid­ered as SRP/CS.
3.1.2 [6]
cat­e­go­ry
clas­si­fi­ca­tion of the safe­ty-relat­ed parts of a con­trol sys­tem in respect of their resis­tance to faults and their sub­se­quent behav­iour in the fault con­di­tion, and which is achieved by the struc­tur­al arrange­ment of the parts, fault detec­tion and/or by their reli­a­bil­i­ty
3.1.3 [6]
fault
state of an item char­ac­ter­ized by the inabil­i­ty to per­form a required func­tion, exclud­ing the inabil­i­ty dur­ing pre­ven­tive main­te­nance or oth­er planned actions, or due to lack of exter­nal resources

NOTE 1 A fault is often the result of a fail­ure of the item itself, but may exist with­out pri­or fail­ure.
[IEC 60050–191:1990, 05–01]

NOTE 2 In this part of ISO 13849, “fault” means ran­dom fault.

3.1.4 [6]
fail­ure
ter­mi­na­tion of the abil­i­ty of an item to per­form a required func­tion

NOTE 1 After a fail­ure, the item has a fault.

NOTE 2 “Fail­ure” is an event, as dis­tin­guished from “fault”, which is a state.

NOTE 3 The con­cept as defined does not apply to items con­sist­ing of soft­ware only.
[IEC 60050–191:1990, 04–01]

NOTE 4 Fail­ures which only affect the avail­abil­i­ty of the process under con­trol are out­side of the scope of this part of ISO 13849.

3.1.5 [6]
dan­ger­ous fail­ure
fail­ure which has the poten­tial to put the SRP/CS in a haz­ardous or fail-to-func­tion state

NOTE 1 Whether or not the poten­tial is real­ized can depend on the chan­nel archi­tec­ture of the sys­tem; in redun­dant sys­tems, a dan­ger­ous hard­ware fail­ure is less like­ly to lead to the over­all dan­ger­ous or fail-to-func­tion state.

NOTE 2 Adapt­ed from IEC 61508–4:1998, def­i­n­i­tion 3.6.7.

3.1.20 [6]
safe­ty func­tion
func­tion of the machine whose fail­ure can result in an imme­di­ate increase of the risk(s)
[ISO 12100–1:2003, 3.28]
3.1.21 [6]
mon­i­tor­ing
safe­ty func­tion which ensures that a pro­tec­tive mea­sure is ini­ti­at­ed if the abil­i­ty of a com­po­nent or an ele­ment to per­form its func­tion is dimin­ished or if the process con­di­tions are changed in such a way that a decrease of the amount of risk reduc­tion is gen­er­at­ed
3.1.22 [6]
pro­gram­ma­ble elec­tron­ic sys­tem (PES)
sys­tem for con­trol, pro­tec­tion or mon­i­tor­ing depen­dent for its oper­a­tion on one or more pro­gram­ma­ble elec­tron­ic devices, includ­ing all ele­ments of the sys­tem such as pow­er sup­plies, sen­sors and oth­er input devices, con­tac­tors and oth­er out­put devices

NOTE Adapt­ed from IEC 61508–4:1998, def­i­n­i­tion 3.3.2.

3.1.23 [6]
per­for­mance lev­el (PL)
dis­crete lev­el used to spec­i­fy the abil­i­ty of safe­ty-relat­ed parts of con­trol sys­tems to per­form a safe­ty func­tion under fore­see­able con­di­tions

NOTE See 4.5.1.

3.1.25 [6]
mean time to dan­ger­ous fail­ure (MTTFd)
expec­ta­tion of the mean time to dan­ger­ous fail­ure

NOTE Adapt­ed from IEC 62061:2005, def­i­n­i­tion 3.2.34.

3.1.26 [6]
diag­nos­tic cov­er­age (DC)
mea­sure of the effec­tive­ness of diag­nos­tics, which may be deter­mined as the ratio between the fail­ure rate of detect­ed dan­ger­ous fail­ures and the fail­ure rate of total dan­ger­ous fail­ures

NOTE 1 Diag­nos­tic cov­er­age can exist for the whole or parts of a safe­ty-relat­ed sys­tem. For exam­ple, diag­nos­tic cov­er­age could exist for sen­sors and/or log­ic sys­tem and/or final ele­ments.

NOTE 2 Adapt­ed from IEC 61508–4:1998, def­i­n­i­tion 3.8.6.

3.1.33 [6]
safe­ty integri­ty lev­el (SIL)
dis­crete lev­el (one out of a pos­si­ble four) for spec­i­fy­ing the safe­ty integri­ty require­ments of the safe­ty func­tions to be allo­cat­ed to the E/E/PE safe­ty-relat­ed sys­tems, where safe­ty integri­ty lev­el 4 has the high­est lev­el of safe­ty integri­ty and safe­ty integri­ty lev­el 1 has the low­est
[IEC 61508–4:1998, 3.5.6]

Acknowledgements

Thanks to my col­leagues Derek Jones and Jonathan John­son, both from Rock­well Automa­tion, and mem­bers of ISO TC199. Their sug­ges­tion to ref­er­ence ISO 14119 clause 8.2 was the seed for this arti­cle.

I’d also like to acknowledge Ronald Sykes, Howard Touski, Mirela Moga, Michael Roland, and Grant Rider for asking the questions that led to this article.

References

[1] Safety of machinery – General principles for design – Risk assessment and risk reduction. ISO 12100. International Organization for Standardization (ISO). Geneva. 2010.

[2] Safeguarding of Machinery. CSA Z432. Canadian Standards Association. Toronto. 2004.

[3] Safety of machinery – Emergency stop – Principles for design. ISO 13850. International Organization for Standardization (ISO). Geneva. 2006.

[4] Electrical Standard for Industrial Machinery. NFPA 79. National Fire Protection Association (NFPA). Batterymarch Park. 2015.

[5] Safety of machinery – Electrical equipment of machines – Part 1: General requirements. IEC 60204-1. International Electrotechnical Commission (IEC). Geneva. 2009.

[6] Safety of machinery – Safety-related parts of control systems – Part 1: General principles for design. ISO 13849-1. International Organization for Standardization (ISO). Geneva. 2006.

[7] Safety of machinery – Risk assessment – Part 2: Practical guidance and examples of methods. ISO/TR 14121-2. International Organization for Standardization (ISO). Geneva. 2012.

[8] Safety of machinery – Functional safety of safety-related electrical, electronic and programmable electronic control systems. IEC 62061. International Electrotechnical Commission (IEC). Geneva. 2005.

[9] D. J. Wilkins (2002, November). “The Bathtub Curve and Product Failure Behavior. Part One – The Bathtub Curve, Infant Mortality and Burn-in”. Reliability Hotline [Online]. Available: http://www.weibull.com/hotwire/issue21/hottopics21.htm. [Accessed: 26-Apr-2015].

[10] Functional safety of electrical/electronic/programmable electronic safety-related systems – Part 4: Definitions and abbreviations. IEC 61508-4. International Electrotechnical Commission (IEC). Geneva. 1998.

[11] Safety of machinery – Interlocking devices associated with guards – Principles for design and selection. ISO 14119. International Organization for Standardization (ISO). Geneva. 2013.

Sources for Standards

CANADA

Cana­di­an Stan­dards Asso­ci­a­tion sells CSA, ISO and IEC stan­dards to the Cana­di­an Mar­ket.

USA

ANSI offers stan­dards from most US Stan­dards Devel­op­ment Orga­ni­za­tions. They also sell ISO and IEC stan­dards into the US mar­ket.


International

Inter­na­tion­al Orga­ni­za­tion for Stan­dard­iza­tion (ISO).

Inter­na­tion­al Elec­trotech­ni­cal Com­mis­sion (IEC).

Europe

Each EU mem­ber state has their own stan­dards body. For rea­sons unknown to me, each stan­dards body can set their own pric­ing for the doc­u­ments they sell. All offer Eng­lish lan­guage copies, in addi­tion to copies in the offi­cial language(s) of the mem­ber state. My best advice is to shop around a bit. Prices can vary by as much as 10:1.

British Stan­dards Insti­tute (BSi) $$$

Dan­ish Stan­dards (DS) $

Eston­ian Stan­dards (EVS) $

Ger­man stan­dards (DIN) — Beuth Ver­lag GmbH


Author: Doug Nix

Doug Nix is Managing Director and Principal Consultant at Compliance InSight Consulting, Inc. (http://www.complianceinsight.ca) in Kitchener, Ontario, and is Lead Author and Senior Editor of the Machinery Safety 101 blog. Doug's work includes teaching machinery risk assessment techniques privately and through Conestoga College Institute of Technology and Advanced Learning in Kitchener, Ontario, as well as providing technical services and training programs to clients related to risk assessment, industrial machinery safety, safety-related control system integration and reliability, laser safety and regulatory conformity. For more see Doug's LinkedIn profile.