
Interlock Architectures – Pt. 5: Category 4 — Control Reliable

This entry is part 5 of 8 in the series Cir­cuit Archi­tec­tures Explored

Ed. note: I’ve made a few updates to this art­icle since it was first pub­lished in 2011, with the most recent today, 6-Dec-18. – DN

The most reliable of the five system architectures, Category 4 is the only architecture that uses multiple-fault tolerant techniques to help ensure that component failures do not result in an unacceptable exposure to risk. This installment in the series on system architectures examines Category 4 in detail. The definitions and requirements discussed in this article come from ISO 13849-1, Edition 3 (2015) [1] and ISO 13849-2, Edition 2 (2012) [2].

As with pre­ced­ing art­icles in this series, I’ll be build­ing on con­cepts dis­cussed in those art­icles. If you need more inform­a­tion, you should have a look at the pre­vi­ous art­icles to see if I’ve answered your ques­tions there.

The Definition

The Cat­egory 4 defin­i­tion builds on both Cat­egory B and Cat­egory 3. As you read, recall that “SRP/CS” stands for “Safety-Related Parts of the Con­trol Sys­tem”. Here is the com­plete defin­i­tion:

6.2.7 Cat­egory 4
For cat­egory 4, the same require­ments as those accord­ing to 6.2.3 for cat­egory B shall apply. “Well-tried safety prin­ciples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies.

SRP/CS of cat­egory 4 shall be designed such that

  • a single fault in any of these safety-related parts does not lead to a loss of the safety func­tion, and
  • the single fault is detec­ted at or before the next demand upon the safety func­tions, e.g. imme­di­ately, at switch on, or at end of a machine oper­at­ing cycle,

but if this detec­tion is not pos­sible, then an accu­mu­la­tion of undetec­ted faults shall not lead to the loss of the safety func­tion.

The dia­gnost­ic cov­er­age (DCavg) of the total SRP/CS shall be high, includ­ing the accu­mu­la­tion of faults. The MTTFD of each of the redund­ant chan­nels shall be high. Meas­ures against CCF shall be applied (see Annex F).

NOTE 1

Cat­egory 4 sys­tem beha­viour is char­ac­ter­ized by

  • con­tin­ued per­form­ance of the safety func­tion in the pres­ence of a single fault,
  • detec­tion of faults in time to pre­vent the loss of the safety func­tion,
  • the accu­mu­la­tion of undetec­ted faults is taken into account.

NOTE 2

The dif­fer­ence between cat­egory 3 and cat­egory 4 is a high­er DCavg in cat­egory 4 and a required MTTFD of each chan­nel of “high” only. In prac­tice, the con­sid­er­a­tion of a fault com­bin­a­tion of two faults may be suf­fi­cient. [1, 6.2.7]


Breaking it down

For cat­egory 4, the same require­ments as those accord­ing to 6.2.3 for cat­egory B shall apply. “Well-tried safety prin­ciples” accord­ing to 6.2.4 shall also be fol­lowed.

The first two sen­tences give the basic require­ment for all the cat­egor­ies from 2 through 4. Sound com­pon­ent selec­tion based on the applic­a­tion require­ments for voltage, cur­rent, switch­ing cap­ab­il­ity and life­time must be con­sidered. In addi­tion, using well-tried safety prin­ciples, such as switch­ing the +V rail side of the coil cir­cuit for con­trol com­pon­ents is required. If you aren’t sure about what con­sti­tutes a “well-tried safety prin­ciple”, see the art­icle on Cat­egory 2 where this is dis­cussed. Don’t con­fuse “well-tried safety prin­ciples” with “well-tried com­pon­ents”. There is no require­ment in Cat­egory 4 for the use of well-tried com­pon­ents, although you can use them for addi­tion­al reli­ab­il­ity if the design require­ments war­rant.

In addi­tion, the fol­low­ing applies.
SRP/CS of cat­egory 4 shall be designed such that

  • a single fault in any of these safety-related parts does not lead to a loss of the safety func­tion, and
  • the single fault is detec­ted at or before the next demand upon the safety func­tions, e.g. imme­di­ately, at switch on, or at end of a machine oper­at­ing cycle,

but if this detec­tion is not pos­sible, then an accu­mu­la­tion of undetec­ted faults shall not lead to the loss of the safety func­tion.

This is the big one. This paragraph, including its two bullets, defines the fundamental performance requirements for this category. In Category 4, no single fault can lead to the loss of the safety function, testing must be able to detect failures, and an accumulation of undetected faults cannot eventually lead to the loss of the safety function. The requirement regarding undetected faults means that faults that would fall into what IEC 61508 calls “λdu” (i.e., dangerous undetectable faults) must be eliminated by design if they cannot be detected by the diagnostics, or the diagnostics need to be improved so that all dangerous undetectable faults become dangerous detectable faults, or “λdd” faults. This increase in the diagnostic capability of the system is the fundamental difference between Category 3 and Category 4. Note that the next paragraph supports this.

The dia­gnost­ic cov­er­age (DCavg) of the total SRP/CS shall be high, includ­ing the accu­mu­la­tion of faults. The MTTFD of each of the redund­ant chan­nels shall be high. Meas­ures against CCF shall be applied (see Annex F).

In Cat­egory 3, DCavg is required to be “at least low,” [1, 6.2.6], mean­ing 60 – 90% [1, Table 5]. So we go from 60 – 90% in Cat­egory 3 to 99% or more in Cat­egory 4.
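To make the DCavg numbers concrete, here is a minimal Python sketch. It assumes the weighted-average formula as I understand it from Annex E of ISO 13849-1, where each part's DC is weighted by its dangerous failure rate (1/MTTFD), together with the DC ranges mentioned above. The function names and the example values are hypothetical, so check the formula and ranges against your copy of the standard.

```python
from typing import Iterable, Tuple


def dc_avg(parts: Iterable[Tuple[float, float]]) -> float:
    """Average diagnostic coverage from (DC, MTTFD in years) pairs.

    Each part's DC is weighted by 1/MTTFD, so parts that fail
    more often have more influence on the result.
    """
    pairs = list(parts)
    weighted = sum(dc / mttfd for dc, mttfd in pairs)
    total_rate = sum(1.0 / mttfd for _, mttfd in pairs)
    return weighted / total_rate


def dc_range(dc: float) -> str:
    """Map a DC value (0.0 to 1.0) onto the named ranges."""
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Hypothetical parts: (DC, MTTFD in years)
    example = [(0.99, 50.0), (0.99, 100.0), (0.90, 40.0)]
    value = dc_avg(example)
    print(f"DCavg = {value:.3f} ({dc_range(value)})")
```

With these made-up values, two parts at 99% and one part at 90% give a DCavg of about 94.9%, which is only “medium”. A single poorly monitored part can be enough to miss the Category 4 requirement.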

These three sen­tences give the design­er the cri­ter­ia for dia­gnost­ic cov­er­age, chan­nel fail­ure rates and com­mon cause fail­ure pro­tec­tion. As you can see, the abil­ity to dia­gnose fail­ures auto­mat­ic­ally is a crit­ic­al part of the design, as is the use of highly reli­able com­pon­ents, lead­ing to highly reli­able chan­nels. The strongest CCF pro­tec­tion you can include in the design is also needed, although the “passing score” of 65 remains unchanged (see Annex F in ISO 13849 – 1 for more details on scor­ing your design).
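The CCF scoring can be sketched as a simple checklist. The measure names and point values below are my recollection of Table F.1 in Annex F and are assumptions for illustration only; verify them against the standard before relying on them. The one firm number from the text above is the passing score of 65. Each measure is treated as all-or-nothing.

```python
from typing import Dict, Set

# Assumed point values, recalled from ISO 13849-1 Annex F (verify before use).
CCF_MEASURES: Dict[str, int] = {
    "separation": 15,               # separation/segregation of signal paths
    "diversity": 20,                # different technologies or principles
    "overvoltage_protection": 15,   # overvoltage, overpressure, overcurrent
    "well_tried_components": 5,
    "assessment_analysis": 5,       # e.g. results of an FMEA considered
    "competence_training": 5,
    "emc": 25,                      # contamination and EMC
    "environmental": 10,            # temperature, shock, vibration, humidity
}
PASSING_SCORE = 65


def ccf_score(applied: Set[str]) -> int:
    """Sum the points for the measures that are fully applied."""
    unknown = applied - set(CCF_MEASURES)
    if unknown:
        raise ValueError(f"unknown measures: {sorted(unknown)}")
    return sum(CCF_MEASURES[name] for name in applied)


def ccf_ok(applied: Set[str]) -> bool:
    """True if the design reaches the passing score."""
    return ccf_score(applied) >= PASSING_SCORE


if __name__ == "__main__":
    design = {"separation", "overvoltage_protection", "emc", "environmental"}
    print(ccf_score(design), ccf_ok(design))
```

In this hypothetical design the four measures add up to exactly 65, so it just passes. Dropping any one of them would put the design below the threshold.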

NOTE 1

Cat­egory 4 sys­tem beha­viour is char­ac­ter­ized by

  • con­tin­ued per­form­ance of the safety func­tion in the pres­ence of a single fault,
  • detec­tion of faults in time to pre­vent the loss of the safety func­tion,
  • the accu­mu­la­tion of undetec­ted faults is taken into account.

NOTE 2

The dif­fer­ence between cat­egory 3 and cat­egory 4 is a high­er DCavg in cat­egory 4 and a required MTTFD of each chan­nel of “high” only. In prac­tice, the con­sid­er­a­tion of a fault com­bin­a­tion of two faults may be suf­fi­cient.

Note 1 expands on the first paragraph in the definition, further clarifying the performance requirements by explicit statements. Notice that nowhere is there a requirement that single faults or accumulation of single faults be prevented, only that they be detected by the diagnostic system. Prevention of single faults is nearly impossible since components do fail. Understanding first which components are critical to the safety function, and second what kinds of faults each component is likely to have, is fundamental to being able to design a diagnostic system that can detect the faults.

The category relies on redundancy to ensure that the complete loss of one channel will not cause the loss of the safety function, but this is only useful if the common cause failures have been properly dealt with. Otherwise, a single event could wipe out both channels simultaneously, causing the loss of the safety function and possibly resulting in an injury or fatality.
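The behaviour described here can be illustrated with a toy model. This is a minimal sketch, not a real safety implementation: the `Channel` class, the `stuck_closed` fault and the `evaluate` function are hypothetical names, and a real system would also latch the detected fault and prevent a restart.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Channel:
    """One redundant channel of a guard interlock (toy model)."""
    stuck_closed: bool = False  # dangerous fault: output can no longer turn off

    def output_on(self, guard_closed: bool) -> bool:
        # A healthy channel follows the guard; a faulty one stays on.
        return True if self.stuck_closed else guard_closed


def evaluate(ch1: Channel, ch2: Channel, guard_closed: bool) -> Tuple[bool, bool]:
    """Return (run_permitted, fault_detected) for the two channels."""
    o1 = ch1.output_on(guard_closed)
    o2 = ch2.output_on(guard_closed)
    fault_detected = o1 != o2           # cross-monitoring: channels must agree
    run_permitted = o1 and o2 and not fault_detected
    return run_permitted, fault_detected


if __name__ == "__main__":
    # Single fault, guard open: safety function kept, fault detected.
    print(evaluate(Channel(stuck_closed=True), Channel(), guard_closed=False))
    # Same fault in both channels (a common cause failure), guard open:
    # the machine is allowed to run and nothing is detected.
    print(evaluate(Channel(stuck_closed=True), Channel(stuck_closed=True), guard_closed=False))
```

The second case is the reason the CCF measures matter: comparing the two channels cannot reveal a fault that affects both of them in the same way.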

Also notice that mul­tiple single faults are per­mit­ted, as long as the accu­mu­la­tion does not res­ult in the loss of the safety func­tion. ISO 13849 allows for “fault exclu­sion”, a concept that is not used in the North Amer­ic­an stand­ards.

The final sentence from Note 2 suggests that consideration of two concurrent faults may be enough, but be careful. You need to look closely at the fault lists to see if there are any groups of high probability faults that are likely to occur concurrently. If there are, you need to assess these combinations of faults, whether there are 5 or 50 to be evaluated.
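One way to keep track of the combinations is to generate them from the fault list rather than by hand. The sketch below is only an illustration: the fault names and likelihood labels are hypothetical, and deciding which faults count as high probability is an engineering judgement the code cannot make for you.

```python
from itertools import combinations
from math import comb
from typing import Dict, List, Tuple

# Hypothetical fault list: name -> likelihood label (illustrative values only).
FAULTS: Dict[str, str] = {
    "contactor K1 welded": "likely",
    "contactor K2 welded": "likely",
    "input wiring short to +V": "likely",
    "interlock key broken": "unlikely",
    "monitoring input open circuit": "unlikely",
}


def pairs_to_assess(faults: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of faults in which both are rated 'likely'."""
    likely = sorted(name for name, rating in faults.items() if rating == "likely")
    return list(combinations(likely, 2))


if __name__ == "__main__":
    print(comb(len(FAULTS), 2), "possible pairs in total")
    for first, second in pairs_to_assess(FAULTS):
        print("assess together:", first, "+", second)
```

Five faults give 10 possible pairs, and 50 faults would give 1225, so filtering on likelihood before assessing pairs keeps the work manageable.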

Fault Exclusion

Fault exclu­sion involves assess­ing the types of faults that can occur in each com­pon­ent in the crit­ic­al path of the sys­tem. The decision to exclude cer­tain kinds of faults is always a tech­nic­al com­prom­ise between the the­or­et­ic­al improb­ab­il­ity of the fault, the expert­ise of the designer(s) and engin­eers involved and the spe­cif­ic tech­nic­al require­ments of the applic­a­tion. Whenev­er the decision is made to exclude a par­tic­u­lar type of fault, the decision and the pro­cess used to make it must be doc­u­mented in the Reli­ab­il­ity Report included in the design file. Sec­tion 7.3 of ISO 13849 – 1 provides guid­ance on fault exclu­sion.

In the sec­tion dis­cuss­ing Cat­egory 1, the stand­ard has this to say about fault exclu­sion, and the dif­fer­ence between “well-tried com­pon­ents” and “fault exclu­sion”:

It is import­ant that a clear dis­tinc­tion between “well-tried com­pon­ent” and “fault exclu­sion” (see Clause 7) be made. The qual­i­fic­a­tion of a com­pon­ent as being well-tried depends on its applic­a­tion. For example, a pos­i­tion switch with pos­it­ive open­ing con­tacts could be con­sidered as being well-tried for a machine tool, while at the same time as being inap­pro­pri­ate for applic­a­tion in a food industry — in the milk industry, for instance, this switch would be des­troyed by the milk acid after a few months. A fault exclu­sion can lead to a very high PL, but the appro­pri­ate meas­ures to allow this fault exclu­sion should be applied dur­ing the whole life­time of the device. In order to ensure this, addi­tion­al meas­ures out­side the con­trol sys­tem may be neces­sary. In the case of a pos­i­tion switch, some examples of these kinds of meas­ures are

  • means to secure the fix­ing of the switch after its adjust­ment,
  • means to secure the fix­ing of the cam,
  • means to ensure the trans­verse sta­bil­ity of the cam,
  • means to avoid over-travel of the pos­i­tion switch, e.g. adequate mount­ing strength of the shock absorber and any align­ment devices, and
  • means to pro­tect it against dam­age from out­side.

To assist the design­er, ISO 13849 – 2 provides lists of typ­ic­al faults and the allow­able exclu­sions in Annex D.5. As an example, let’s con­sider the typ­ic­al situ­ation where a robust guard inter­lock­ing device has been selec­ted. The decision has been made to use redund­ant elec­tric­al cir­cuits to the switch­ing com­pon­ents in the inter­lock, so elec­tric­al faults can be detec­ted. But what about mech­an­ic­al fail­ures? A fault list is needed:

Interlock Mechanical Fault List

| # | Fault Description | Result | Likelihood |
|---|---|---|---|
| 1 | Key breaks off | Control system cannot determine guard position. Complete failure of system through a single fault. | Unlikely |
| 2 | Screws mounting key to guard fail | Control system cannot determine guard position. Complete failure of system through a single fault. | Unlikely |
| 3 | Screws mounting interlock device to guard fail | Control system cannot determine guard position. Complete failure of system through a single fault. | Unlikely |
| 4 | Key and interlock device misaligned. | Guard cannot close, preventing machine from operating. | Very likely |
| 5 | Key and interlock device misaligned. Key and / or interlock device damaged. | Guard may not close, or the key may jam in the interlock device once closed. Machine is inoperable if the interlock cannot be completed, or the guard cannot be opened if the key jams in the device. | Likely |
| 6 | Screws mounting key to guard removed by user. | Interlock can now be bypassed by fixing the key into the interlocking device. Control system can no longer sense the position of the guard. | Likely |
| 7 | Screws mounting interlock device to guard removed by user | Probably combined with the preceding condition. Control system can no longer sense the position of the guard. | Unlikely, but could happen. |

There may be more fail­ure modes, but for the pur­pose of this dis­cus­sion, let’s lim­it them to this list.

Look­ing at Fault 1, there are a num­ber of things that could res­ult in a broken key. They include mis­align­ment of the key and the inter­lock device, lack of main­ten­ance on the guard and the inter­lock­ing hard­ware, or inten­tion­al dam­age by a user. Unless the hard­ware is excep­tion­ally robust, includ­ing the design of the guard and any align­ment fea­tures incor­por­ated in the guard­ing, devel­op­ing a sound rationale for exclud­ing this fault will be very dif­fi­cult.

Fault 2 con­siders the mech­an­ic­al fail­ure of the mount­ing screws for the inter­lock key. Screws are con­sidered to be well-tried com­pon­ents (see Annex A.5), so you can con­sider them for fault exclu­sion. You can improve their reli­ab­il­ity by using thread lock­ing adhes­ives when installing the screws to pre­vent them from vibrat­ing loose, and “tamper-proof” style screw heads to deter unau­thor­ized remov­al. The inclu­sion of these meth­ods will sup­port any decision to exclude these faults. This goes to address­ing faults 3, 6 and 7 as well.

Faults 4 & 5 occur frequently and are often caused by poor device selection (e.g. an interlock device intended for straight-line sliding-gate applications is chosen for a hinged gate), or by poor guard design (e.g. the guard is poorly guided by the retention mechanism and can be closed in a misaligned condition). The rationale for prevention of these faults will need to include discussion of design features that will prevent these conditions.

Exclud­ing any oth­er kind of fault fol­lows the same pro­cess: Devel­op the fault list, assess each fault against the rel­ev­ant Annex from ISO 13849 – 2, determ­ine if there are pre­vent­at­ive meas­ures that can be designed into the product and wheth­er these provide suf­fi­cient risk reduc­tion to allow the exclu­sion of the fault from con­sid­er­a­tion.
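The process above lends itself to a simple record per fault, so that no exclusion slips into the design file without its rationale. This is a minimal sketch under my own assumptions: the `FaultRecord` fields and the `undocumented_exclusions` check are hypothetical and are not a format defined by ISO 13849-1 or ISO 13849-2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FaultRecord:
    """One row of a fault list, with the exclusion decision (hypothetical format)."""
    number: int
    description: str
    likelihood: str
    excluded: bool = False
    rationale: str = ""                       # why the exclusion is justified
    measures: List[str] = field(default_factory=list)  # preventative measures


def undocumented_exclusions(records: List[FaultRecord]) -> List[int]:
    """Return the numbers of excluded faults lacking a rationale or measures."""
    return [
        r.number
        for r in records
        if r.excluded and not (r.rationale.strip() and r.measures)
    ]


if __name__ == "__main__":
    fault_list = [
        FaultRecord(1, "Key breaks off", "Unlikely"),
        FaultRecord(
            2, "Screws mounting key to guard fail", "Unlikely",
            excluded=True,
            rationale="Screws treated as well-tried components",
            measures=["thread locking adhesive", "tamper-resistant screw heads"],
        ),
        FaultRecord(3, "Screws mounting interlock device to guard fail",
                    "Unlikely", excluded=True),  # no rationale recorded yet
    ]
    print(undocumented_exclusions(fault_list))
```

Here fault 3 is flagged because it has been marked as excluded without a rationale or any supporting measures, which is exactly the gap a reviewer would look for in the design file.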

DCavg and MTTFD requirements

NOTE 2 The dif­fer­ence between cat­egory 3 and cat­egory 4 is a high­er DCavg in cat­egory 4 and a required MTTFD of each chan­nel of “high” only.

The first sentence in Note 2 clarifies the two main differences from a design standpoint, aside from the additional fault tolerance requirements: better diagnostics are required, and the MTTFD of the individual components, and therefore of each channel, must be much higher.
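To show why component MTTFD drives channel MTTFD, here is a minimal sketch. It assumes the parts count approach as I understand it from Annex D of ISO 13849-1 (the channel failure rate is the sum of the component failure rates) and the usual ranges of low (3 to under 10 years), medium (10 to under 30 years) and high (30 years or more). The function names and component values are hypothetical.

```python
from typing import Iterable


def channel_mttfd(component_mttfd_years: Iterable[float]) -> float:
    """Channel MTTFD in years from its components, by summing failure rates."""
    total_rate = sum(1.0 / years for years in component_mttfd_years)
    return 1.0 / total_rate


def mttfd_range(years: float) -> str:
    """Map a channel MTTFD in years onto the named ranges."""
    if years < 3:
        return "not acceptable"
    if years < 10:
        return "low"
    if years < 30:
        return "medium"
    return "high"


if __name__ == "__main__":
    good_channel = [200.0, 150.0, 100.0]   # hypothetical component values
    weak_channel = good_channel + [20.0]   # same channel plus one weak part
    for name, parts in (("good", good_channel), ("weak", weak_channel)):
        years = channel_mttfd(parts)
        print(f"{name}: {years:.1f} years ({mttfd_range(years)})")
```

With these made-up values the first channel works out to about 46 years, which is “high”. Adding one 20-year component drops the channel to about 14 years, which is only “medium” and would not satisfy Category 4.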

The Block Diagram

The block dia­gram for Cat­egory 4 is almost identic­al to Cat­egory 3 and was updated by Cor­ri­gendum 1 to the dia­gram shown below. The text from the cor­ri­gendum that accom­pan­ies the dia­gram has this to say about the change:

Replace the drawing showing the designated architecture for category 4 with the following drawing. This corrects the arrowed lines labeled “m” between L1 and O1, and L2 and O2, by changing them from dashed to solid lines, representing higher diagnostic coverage.

I’ve high­lighted this area using red ovals on Fig­ure 12 to make it easi­er to see.

ISO 13849 – 1 Fig­ure 12 – Cat­egory 4 Block Dia­gram

Here is Figure 11 for comparison. Notice that the “m” lines are solid in Figure 12 and dashed in Figure 11. Subtle, but significant! There are no other differences between the diagrams.

ISO 13849-1 Figure 11

I went looking for a circuit diagram to support the block diagram but wasn’t able to find one from a commercial source that I could share with you. Considering that the primary differences are in the reliability of the components chosen and in the way the testing is done, this isn’t too surprising. The basic physical construction of the two categories can be virtually identical.

Applications

The fol­low­ing is not from the stand­ards – this is my per­son­al opin­ion, based on more than 20 years of prac­tice.

In the past, many manufacturers decided to apply Category 4 architecture because they believed that it was “the best”, without really understanding the design implications. With the change in the harmonization of EN 954-1 [3] and ISO 13849-1 under the EU machinery directive that came into force on 29-Dec-2011, and considering the great difficulty that many manufacturers have had in properly implementing EN 954-1, I can easily imagine manufacturers reasoning that because they already have Category 4 SRP/CS on their systems, they can now claim PLe SRP/CS system performance. This is a bad decision for a lot of reasons:

  1. ISO 13849-1 PLe, Category 4 systems should be reserved for very dangerous machinery where the technical effort and expense involved is warranted by the risk assessment. Attempting to apply this level of design to machinery where PLb is more suitable based on a risk assessment is a waste of design time and effort and a needless expense. The product family standards for very dangerous machines, such as EN 201 [4] for plastic injection moulding machines, EN 692 [5], [6] for Mechanical Power Presses or EN 693 [7], [8] for Hydraulic Power Presses, will explicitly specify the PL required for these machines.
  2. Man­u­fac­tur­ers have fre­quently claimed EN 954 – 1 Cat­egory 4 per­form­ance based on the rat­ing of the safety relay alone, without under­stand­ing that the rest of the SRP/CS must be con­sidered, and clearly, this is wrong. The SRP/CS must be eval­u­ated as a com­plete sys­tem.

This lack of under­stand­ing endangers the users, the main­ten­ance per­son­nel, the own­ers and the man­u­fac­tur­ers. If they con­tin­ue this approach and an injury occurs, it is my opin­ion that the courts will have more than enough evid­ence in the defendant’s pub­lished doc­u­ments to cause some ser­i­ous leg­al grief.

As designers involved with the safety of our company’s products or with our co-workers’ safety, I believe that we owe it to everyone who uses our products to be educated and to correctly apply these concepts. The fact that you have read all of the posts leading up to this one is evidence that you are working on getting educated.

Always con­duct a risk assess­ment and use the out­come from that work to guide your selec­tion of safe­guard­ing meas­ures, com­ple­ment­ary pro­tect­ive meas­ures and the per­form­ance of the SRP/CS that ties those sys­tems togeth­er. Choose per­form­ance levels that make sense based on the required risk reduc­tion and ensure that the design cri­ter­ia are met by val­id­at­ing the sys­tem once built.

As always, I wel­come your com­ments and ques­tions! Please feel free to com­ment below. I will respond to all your com­ments.

References

[1]     Safety of machinery — Safety-related parts of con­trol sys­tems — Part 1: Gen­er­al prin­ciples for design. ISO 13849 – 1, 2015

[2]     Safety of machinery — Safety-related parts of con­trol sys­tems — Part 2: Val­id­a­tion. ISO 13849 – 2, 2012.

[3]     Safety of Machinery – Safety Related Parts of Con­trol Sys­tems – Part 1: Gen­er­al Prin­ciples for Design. CEN European Com­mit­tee for Stand­ard­iz­a­tion. EN 954 – 1, 1996.

[4]     Plastics and rub­ber machines – Injec­tion mould­ing machines – Safety require­ments. CEN European Com­mit­tee for Stand­ard­iz­a­tion. EN 201, 2009.

[5]     Machine tools. Mech­an­ic­al presses. Safety. CEN European Com­mit­tee for Stand­ard­iz­a­tion. EN 692:2005+A1:2009. (with­drawn)

[6]      Machine tools safety. Presses. Gen­er­al safety require­ments. CEN European Com­mit­tee for Stand­ard­iz­a­tion. EN ISO 16092 – 1, 2018.

[7]     Machine tools. Safety. Hydraul­ic presses. CEN European Com­mit­tee for Stand­ard­iz­a­tion. EN 693:2001+A2:2011. (with­drawn)

[8]     Machine tools safety. Presses. Safety requirements for hydraulic presses. CEN European Committee for Standardization. EN ISO 16092-3, 2018.
