Interlock Architectures – Pt. 4: Category 3 – Control Reliable

This entry is part 4 of 8 in the series Circuit Architectures Explored

Category 3 system architecture is the first category that could be considered similar to “Control Reliable” circuits or systems as defined in the North American standards. It is not the same as Control Reliable, but we’ll get to that in a subsequent post. If you haven’t read the first three posts in this series, you may want to go back and review them, as the concepts in those articles are the basis for the discussion in this post.

So what is “Control Reliable” any­way? This term was coined by the ANSI RIA R15.06 tech­nic­al com­mit­tee when they were devel­op­ing their defin­i­tions for con­trol sys­tem reli­ab­il­ity, first pub­lished in the 1999 edi­tion of the stand­ard. No men­tion of the concept of con­trol reli­ab­il­ity appears in the 1994 edi­tion of CSA Z434 or the pre­ced­ing edi­tion of RIA R15.06.

Essentially, the term “Control Reliable” means that the con­trol sys­tem is designed with some degree of fault tol­er­ance. Depending on the defin­i­tions that you read, this could be single- or multiple-​fault-​tolerance.

There are a number of design techniques that can be used to increase the fault tolerance of a control system. The older approaches, such as those given in ANSI RIA R15.06-1999, CSA Z434-03 or EN 954-1:95, rely primarily on the structure or architecture of the circuit, and the characteristics of the components selected for use. ISO 13849-1 uses the same basic architectures defined by EN 954-1:95, and extends them to include diagnostic coverage, common cause failure resistance and an understanding of the failure rate of the components to determine the degree of fault tolerance and reliability provided by the design.

OK, enough back­ground for now! Let’s look at the defin­i­tion for Category 3 sys­tems. Remember that “SRP/​CS” means “Safety Related Parts of the Control System”.

Definition

6.2.6 Category 3

For cat­egory 3, the same require­ments as those accord­ing to 6.2.3 for cat­egory B shall apply. “Well-​tried safety prin­ciples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies. SRP/​CS of cat­egory 3 shall be designed so that a single fault in any of these parts does not lead to the loss of the safety func­tion. Whenever reas­on­ably prac­tic­able, the single fault shall be detec­ted at or before the next demand upon the safety func­tion.

The dia­gnost­ic cov­er­age (DCavg) of the total SRP/​CS includ­ing fault-​detection shall be low. The MTTFd of each of the redund­ant chan­nels shall be low-​to-​high, depend­ing on the PLr. Measures against CCF shall be applied (see Annex F).

NOTE 1 The require­ment of single-​fault detec­tion does not mean that all faults will be detec­ted. Consequently, the accu­mu­la­tion of undetec­ted faults can lead to an unin­ten­ded out­put and a haz­ard­ous situ­ation at the machine. Typical examples of prac­tic­able meas­ures for fault detec­tion are use of the feed­back of mech­an­ic­ally guided relay con­tacts and mon­it­or­ing of redund­ant elec­tric­al out­puts.

NOTE 2 If neces­sary because of tech­no­logy and applic­a­tion, type-​C stand­ard makers need to give fur­ther details on the detec­tion of faults.

NOTE 3 Category 3 sys­tem beha­viour allows that

  • when the single fault occurs the safety func­tion is always per­formed,
  • some but not all faults will be detec­ted,
  • accu­mu­la­tion of undetec­ted faults can lead to the loss of the safety func­tion.

NOTE 4 The tech­no­logy used will influ­ence the pos­sib­il­it­ies for the imple­ment­a­tion of fault detec­tion.


Breaking it down

Let’s take the defin­i­tion apart and look at the com­pon­ents that make it up.

For cat­egory 3, the same require­ments as those accord­ing to 6.2.3 for cat­egory B shall apply. “Well-​tried safety prin­ciples” accord­ing to 6.2.4 shall also be fol­lowed.

The first couple of lines remind the design­er of two key points:

  • The com­pon­ents selec­ted must be suit­able for the applic­a­tion, i.e. cor­rectly spe­cified for voltage, cur­rent, envir­on­ment­al con­di­tions, etc.; and
  • “well-tried safety principles” must be used in the design.

It’s important to note here that we are talking about “well-tried safety principles” and NOT “well-tried components”. The requirement to use components designed for safety applications comes from other standards, like EN 1088 and ISO 13850. The requirements from these standards, such as the use of “direct-drive” contacts, improve the fault tolerance of the component, and so benefit the design in the end. These improvements are generally reflected in the B10d or MTTFd of the component. They are also points that inspectors commonly look for, because they are easy to spot in the field: “safety-rated components” often use red or yellow caps to identify them clearly in the control panel.

In addi­tion, the fol­low­ing applies. SRP/​CS of cat­egory 3 shall be designed so that a single fault in any of these parts does not lead to the loss of the safety func­tion.

This sen­tence makes the require­ment for single-​fault tol­er­ance. This means that the fail­ure of any single com­pon­ent in the func­tion­al chan­nel can­not res­ult in the loss of the safety func­tion. To meet this require­ment, redund­ancy is needed. With redund­ant sys­tems, one com­plete chan­nel can fail without los­ing the abil­ity to stop the machinery. It is pos­sible to lose the func­tion of the mon­it­or­ing sys­tem from a single com­pon­ent fail­ure, but as long as the sys­tem con­tin­ues to provide the safety func­tion this may be accept­able. The sys­tem should not per­mit itself to be reset if the mon­it­or­ing sys­tem is not work­ing.

One more “gotcha” from this sentence: In order to meet the requirement that any single component failure can be detected, the design will require two separate sensors to detect the position of a gate, for example. This permits the system to detect a failure in either sensor, including mechanical failures like broken keys or attempts to defeat the safety system. You can clearly see this in both the block diagram, which does not show any monitoring connection to the input devices, and in the circuit diagram. Both of these diagrams are shown later in this post. The only way out of the requirement to have redundant sensors is to select a gate switch that is robust enough that mechanical faults can reasonably be excluded. I’ll get into fault exclusions later in this article.

Whenever reas­on­ably prac­tic­able, the single fault shall be detec­ted at or before the next demand upon the safety func­tion.

This sentence can be a bit sticky. The phrase “Whenever reasonably practicable” means that your design needs to be able to detect single faults unless it would be “unreasonable” to do so. What constitutes an unreasonable degree of effort? This is for you to decide. I will say that if there is a commercial off-the-shelf (COTS) component available that will do the job, and you choose not to use it, you will have a difficult time convincing a court that you took every reasonably practicable means to detect the fault.

Following the comma, the rest of the sen­tence provides the design­er with the basic require­ment for the test sys­tem: it must be able to detect a single com­pon­ent fail­ure at the moment of demand (this is usu­ally how it’s done, since this is typ­ic­ally the simplest way) or before it occurs, which can hap­pen if your test equip­ment has a means to detect a change in some crit­ic­al char­ac­ter­ist­ic of the mon­itored component(s).

 The dia­gnost­ic cov­er­age (DCavg) of the total SRP/​CS includ­ing fault-​detection shall be low.

This sen­tence tells you that your design must meet the require­ments for LOW Diagnostic Coverage. To get to LOW DCavg, we need to look first at Table 6:

ISO 13849-1:06 Table 6: Diagnostic Coverage (DC)

Denotation   Range
None         DC < 60%
Low          60% <= DC < 90%
Medium       90% <= DC < 99%
High         99% <= DC

NOTE 1 For SRP/CS consisting of several parts an average value DCavg for DC is used in Figure 5, Clause 6 and E.2.

NOTE 2 The choice of the DC ranges is based on the key val­ues 60 %, 90 % and 99 % also estab­lished in oth­er stand­ards (e.g. IEC 61508) deal­ing with dia­gnost­ic cov­er­age of tests. Investigations show that (1 – DC) rather than DC itself is a char­ac­ter­ist­ic meas­ure for the effect­ive­ness of the test. (1 – DC) for the key val­ues 60 %, 90 % and 99 % forms a kind of log­ar­ithmic scale fit­ting to the log­ar­ithmic PL-​scale. A DC-​value less than 60 % has only slight effect on the reli­ab­il­ity of the tested sys­tem and is there­fore called “none”. A DC-​value great­er than 99 % for com­plex sys­tems is very hard to achieve. To be prac­tic­able, the num­ber of ranges was restric­ted to four. The indic­ated bor­ders of this table are assumed with­in an accur­acy of 5 %.

Based on Table 6, the DCavg must be at least 60% and less than 90%, all components considered. To score this, go to Annex E and look at Table E.1. Using the measures in Table E.1, estimate the DC of each part of the design, then combine those values into DCavg. If you end up in the desired range, you can move on. If not, the design will require modification to bring it into this range.
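To make the scoring step a little more concrete, here is a minimal Python sketch. It assumes you have already estimated a DC value for each part from Annex E, and it combines them using the MTTFd-weighted average described in Annex E before classifying the result against Table 6. The function names and the part values are made up for illustration; they do not come from the standard or from any real design.

```python
def dc_avg(parts):
    """Combine per-part DC values into DCavg.

    parts: iterable of (mttfd_years, dc) pairs, with dc as a fraction (0.0 to 1.0).
    Each DC is weighted by 1/MTTFd, so less reliable parts count for more.
    """
    parts = list(parts)
    numerator = sum(dc / mttfd for mttfd, dc in parts)
    denominator = sum(1.0 / mttfd for mttfd, _ in parts)
    return numerator / denominator


def dc_range(dc):
    """Classify a DC value using the ranges in Table 6."""
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


# Hypothetical parts: (MTTFd in years, estimated DC)
example_parts = [(50.0, 0.99), (30.0, 0.60), (40.0, 0.90)]
example_dc_avg = dc_avg(example_parts)
example_range = dc_range(example_dc_avg)
print(f"DCavg = {example_dc_avg:.1%} ({example_range})")
```

Note how the part with the poorest diagnostics and the shortest MTTFd drags the average down more than the others. That is usually the first place to look if the design falls short of the required range.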

The MTTFd of each of the redund­ant chan­nels shall be low-​to-​high, depend­ing on the PLr.

This sentence reminds you that your component selections matter. Depending on the PLr you are trying to achieve, you will need to choose components with suitable MTTFd ratings. Remember that just because you are using a Category 3 architecture, you have not automatically achieved the highest levels of reliability. If you refer to Figure 5 in the standard, you can see that a Category 3 architecture can meet a range of PLs, all the way from PLa through PLe!

ISO 13849-1 Figure 5

If you want, or need, to know the numer­ic bound­ar­ies of each of the bands in the dia­gram above, look at Annex K of the stand­ard. The full numer­ic rep­res­ent­a­tion of Figure 5 is provided in that Annex.
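For electromechanical parts, the channel MTTFd usually starts from the manufacturer's B10d figure. The sketch below is a rough illustration of that calculation, using the relationship given in Annex C of the standard (MTTFd = B10d / (0.1 × nop)) and the low, medium and high MTTFd bands of 3, 10 and 30 years. The B10d value and the duty cycle are invented numbers, and the function names are my own, so treat this as a worked example rather than a design tool.

```python
def operations_per_year(days_per_year, hours_per_day, cycle_time_s):
    """Mean number of annual operations (nop) for a part actuated once per cycle."""
    return days_per_year * hours_per_day * 3600.0 / cycle_time_s


def mttfd_from_b10d(b10d_cycles, nop):
    """MTTFd in years for a wear-limited component, from its B10d value."""
    return b10d_cycles / (0.1 * nop)


def mttfd_range(mttfd_years):
    """Classify a channel MTTFd into the bands used by the standard."""
    if mttfd_years < 3:
        return "below range"
    if mttfd_years < 10:
        return "low"
    if mttfd_years < 30:
        return "medium"
    return "high"  # the 2006 edition treats anything above 100 years as 100


# Hypothetical gate switch: B10d of 2,000,000 cycles,
# gate opened every 30 s, 16 h/day, 220 days/year.
example_nop = operations_per_year(220, 16, 30)
example_mttfd = mttfd_from_b10d(2_000_000, example_nop)
print(f"nop = {example_nop:.0f}/year, MTTFd = {example_mttfd:.1f} years "
      f"({mttfd_range(example_mttfd)})")
```

Try shortening the cycle time in the example: the same switch drops into a lower band quite quickly, which is why the duty cycle matters as much as the component data sheet.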

Measures against CCF shall be applied (see Annex F).

For your design to qualify as a Category 3 architecture, CCF measures are required. I’ve discussed Common Cause Failures elsewhere on the blog, but as a reminder, a Common Cause Failure is one where a single event, like a lightning strike on the power line, or a cable being cut, results in the failure of the system. This is not the same as a Common Mode Failure, where similar or different components fail in the same way. For instance, if both output contactors were to weld closed, either simultaneously or at different times, due to overloading because they were undersized, this could be considered to be a Common Mode Failure. If they both weld closed due to a lightning strike, that is a Common Cause Failure.

Annex F provides a checklist that is used to score the CCF of the design. The design must score at least 65 points to be considered to meet the minimum level of CCF protection, and more is better of course! Score your design and see where you come out. Less than 65 and you need to do more. 65 or more and you are good to go.
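As a rough illustration of how the checklist scoring works, here is a small Python sketch. The measure names are my own shorthand and the point values are as I recall them from Table F.1 of the 2006 edition, so check them against your copy of the standard before relying on them. Each measure is all-or-nothing: you either claim the full points or none.

```python
# Point values as recalled from ISO 13849-1:2006 Table F.1 (verify before use).
# The keys are shorthand labels, not the wording used in the standard.
CCF_MEASURES = {
    "separation": 15,              # physical separation between signal paths
    "diversity": 20,               # different technologies or designs
    "overvoltage_protection": 15,  # protection against over-voltage, over-current, etc.
    "well_tried_components": 5,
    "failure_analysis": 5,         # e.g. FMEA results considered in the design
    "competence_training": 5,
    "emc_contamination": 25,       # EMC, or filtration of the pressure medium
    "other_environment": 10,       # temperature, shock, vibration, humidity
}

CCF_MINIMUM = 65


def ccf_score(fulfilled):
    """Sum the points for the measures that are completely fulfilled."""
    return sum(CCF_MEASURES[name] for name in fulfilled)


def ccf_ok(fulfilled):
    """True when the design reaches the minimum CCF score."""
    return ccf_score(fulfilled) >= CCF_MINIMUM


# Hypothetical design claiming four of the measures.
example_measures = {
    "separation",
    "overvoltage_protection",
    "emc_contamination",
    "other_environment",
}
print(ccf_score(example_measures), ccf_ok(example_measures))
```

In this made-up case the design lands exactly on 65 points, which passes with nothing to spare. Dropping any one of the claimed measures would put it below the minimum.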

The Notes

The notes giv­en in the defin­i­tion are also import­ant. Note 1 reminds the design­er that not all faults will be detec­ted, and an accu­mu­la­tion of undetec­ted faults can lead to the loss of the safety func­tion. Be aware that it is up to you as the design­er to min­im­ize the kinds of fail­ures that can accu­mu­late undetec­ted.

Note 2 speaks to the pos­sib­il­ity that a Type-​C product stand­ard, like EN 201 for injec­tion mould­ing machines for example, may impose a min­im­um PLr on the design. Make sure that you get a copy of any Type-​C stand­ard that is rel­ev­ant for your product and mar­ket. Note that the des­ig­na­tion “Type-​C” comes from ISO. If you go look­ing for this ter­min­o­logy in ANSI or CSA stand­ards, you won’t find it used because the concept doesn’t exist in the same way in these National stand­ards.

Note 3 gives you the basic per­form­ance para­met­ers for the design. If your design can do these things, then you’re halfway there.

Finally, Note 4 is a reminder that different kinds of technology have greater or lesser capability to detect failures. More sophisticated technology may be required to achieve the PL you need.

The Block Diagram

Let’s have a look at the func­tion­al block dia­gram for this Category.

ISO 13849-1 Figure 11

By looking at the diagram you can clearly see the two independent channels and the cross-monitoring connection between the channels. Input devices are not monitored, but output devices are monitored. This is another significant reason for requiring two physically separate input devices to sense the guard position, or whatever other safeguarding device is integrated into the system. The only way that a failure in the input devices can be detected is if one channel changes state and one does not.
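The behaviour described above can be sketched in a few lines of Python. This is a simplified model of my own, not the internal logic of any particular safety relay: it assumes two input channels that should always agree, a feedback loop from the output contactors that must be closed before a reset is accepted, and a discrepancy flag that only clears once both channels have been seen open.

```python
from dataclasses import dataclass


@dataclass
class DualChannelMonitor:
    """Toy model of a two-channel input stage with cross monitoring."""

    outputs_on: bool = False
    discrepancy: bool = False

    def update(self, ch1_closed: bool, ch2_closed: bool) -> bool:
        """Evaluate the inputs; returns True while the outputs stay energized."""
        if ch1_closed != ch2_closed:
            # One channel changed state and the other did not: a detected fault.
            self.discrepancy = True
        elif not ch1_closed:
            # Both channels open together: a clean demand clears the flag.
            self.discrepancy = False
        if not (ch1_closed and ch2_closed):
            # Either channel alone is enough to perform the safety function.
            self.outputs_on = False
        return self.outputs_on

    def reset(self, ch1_closed: bool, ch2_closed: bool, feedback_closed: bool) -> bool:
        """Accept a reset only if inputs agree, no fault is latched, and the
        contactor feedback loop confirms both contactors have dropped out."""
        if ch1_closed and ch2_closed and feedback_closed and not self.discrepancy:
            self.outputs_on = True
        return self.outputs_on


relay = DualChannelMonitor()
relay.reset(True, True, True)                # normal start: outputs energize
stopped = not relay.update(False, True)      # gate opens, switch 2 stuck closed
blocked = not relay.reset(True, True, True)  # gate closed again: reset refused
print(stopped, blocked)
```

The example shows the two halves of the Category 3 requirement: the single fault (a stuck switch) does not prevent the stop, and the fault is detected at the demand because the relay refuses to reset until both channels have opened together.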

If you want to learn more about applying the block diagramming method to your design, there is a good explanation of the method in the SISTEMA Cookbook 1, published by the IFA in Germany. An English version of the document is available from the IFA web site.

Circuit Diagram

By now you prob­ably get the idea that there are as many ways to con­fig­ure a Category 3 cir­cuit as there are applic­a­tions. Below is a typ­ic­al cir­cuit dia­gram bor­rowed from Rockwell Allen-​Bradley, show­ing the applic­a­tion of typ­ic­al safety relays in a com­plete sys­tem that includes the emer­gency stop sys­tem, a gate inter­lock and a safety mat. You can meet the require­ments for Category 3 archi­tec­ture in oth­er ways, so don’t feel that you must use a COTS safety relay. It just may be the most straight­for­ward way in many cases.

This is not a plug for A-​B products. Neither Machinery Safety 101, nor I, have any rela­tion­ship with Rockwell Allen-​Bradley.

From Rockwell Automation pub­lic­a­tion SAFETY-​WD001A-​EN-​P – June 2011, p.6.

If you’re inter­ested in obtain­ing the source doc­u­ment con­tain­ing this dia­gram, you can down­load it dir­ectly from the Rockwell Automation web site.

Emergency Stop Subsystem

The emergency stop circuit uses the 440R-S12R2 relay on the left side of the diagram. This particular system uses Category 3 architecture in the e-stop system, which may be more than is required. A risk assessment and a start-stop analysis are required to determine what performance level is needed for this subsystem.

Gate Interlock Subsystem

The gate inter­lock cir­cuit is loc­ated in the cen­ter of the dia­gram, and uses the 440R-​D22R2 relay. As you can see, there are two phys­ic­ally sep­ar­ate gate inter­lock switches. Only one con­tact from each switch is used, so one switch is con­nec­ted to Channel 1, and the oth­er to Channel 2. Notice that there is no oth­er mon­it­or­ing of these devices (i.e. no second con­nec­tion to either switch). The sec­ond­ary con­tacts on these switches could be con­nec­ted to the PLC for annun­ci­ation pur­poses. This would allow the PLC to dis­play the open/​closed status of the gate on the machine HMI.

The out­put con­tact­ors, K3 and K4, are mon­itored by the reset loop con­nec­ted to S34 and the +V rail.

One more inter­est­ing point – did you notice that there is a “zone e-​stop” included in the gate inter­lock? If you look imme­di­ately below the cent­ral safety relay and a little to the left you will find an emer­gency stop device. This device is wired in series with the gate inter­lock, so activ­at­ing it will drop out K3 and K4 but not dis­turb the oper­a­tion of the rest of the machine. The safety relay can’t dis­tin­guish between the e-​stop but­ton and the gate inter­locks, so if annun­ci­ation is needed, you may want to use a third con­tact on the e-​stop device to con­nect to a PLC input for this pur­pose.

Safety Mat Subsystem

The safety mat subsystem is located on the right side of the diagram and uses a second 440R-D22R2 relay. Safety mats can be either single or dual channel in design. The mat shown in this drawing is a dual-channel type. Stepping on the mat causes the conductive layers in the mat to touch, shorting Channel 1 to Channel 2. This creates an input fault that will be detected by the 440R relay. The fault condition will cause the output of the relay to open, stopping the machine.

Safety mats can be dam­aged reas­on­ably eas­ily, and the cir­cuit design shown will detect shorts or opens with­in the mat and will pre­vent the haz­ard­ous motion from start­ing or con­tinu­ing.
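The mat logic described here can be summarized in a short Python sketch. It is only an illustration of the idea, under my own assumptions: the relay is taken to see three things (continuity of each channel and whether the channels are shorted together), and the names used are hypothetical rather than anything from the Rockwell documentation.

```python
from enum import Enum


class MatStatus(Enum):
    CLEAR = "clear"
    STOP = "stepped on or shorted"
    FAULT = "open circuit"


def evaluate_mat(ch1_continuous: bool, ch2_continuous: bool, cross_short: bool) -> MatStatus:
    """Classify the state of a dual-channel mat as seen by the monitoring relay."""
    if cross_short:
        # A person on the mat and a damaged, shorted mat look the same: stop.
        return MatStatus.STOP
    if not (ch1_continuous and ch2_continuous):
        # A broken conductor or loose connection in either channel.
        return MatStatus.FAULT
    return MatStatus.CLEAR


def run_permitted(status: MatStatus) -> bool:
    """Hazardous motion is only allowed when the mat is healthy and unoccupied."""
    return status is MatStatus.CLEAR


print(run_permitted(evaluate_mat(True, True, False)))   # healthy, nobody on the mat
print(run_permitted(evaluate_mat(True, True, True)))    # stepped on
print(run_permitted(evaluate_mat(True, False, False)))  # open circuit in channel 2
```

The useful property is that every abnormal condition, whether it is a person or a damaged mat, ends up in a state where motion is not permitted.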

The output contactors, K5 and K6, are monitored by the relay reset loop connected to S34 and the +V rail.

This cir­cuit also includes a con­ven­tion­al start-​stop cir­cuit that doesn’t rely on the safety relay.

One more thing – just like the gate inter­lock cir­cuit, this cir­cuit also includes a “zone e-​stop”. Look below and to the left of the safety mat relay. As with the gate inter­lock, press­ing this but­ton will drop out K5 and K6, stop­ping the same motions pro­tec­ted by the safety mat. Since the relay can’t tell the dif­fer­ence between the e-​stop but­ton and the mat being activ­ated, you may want to use the same approach and add a third con­tact to the e-​stop but­ton, con­nect­ing it to the PLC for annun­ci­ation.

Component Selection

The com­pon­ents used in the cir­cuit are crit­ic­al to the final PL rat­ing of the design. The final PL of the design depends on the MTTFd of the com­pon­ents used in each chan­nel. No know­ledge of the intern­al con­struc­tion of the safety relays is needed, because the relays come with a PL rat­ing from the man­u­fac­turer. They can be treated as a sub­sys­tem unto them­selves. The selec­tion of the input and out­put devices is then the sig­ni­fic­ant factor. Component data sheets can be down­loaded from the Rockwell site if you want to dig a bit deep­er.

What did you think about this art­icle? What ques­tions came to mind that weren’t answered for you? I look for­ward to hear­ing your thoughts and ques­tions!


Reader Question: Multiple E-​Stops and Resets

This entry is part 7 of 12 in the series Emergency Stop

I had an interesting question come in from a reader today that is relevant to many situations:

“When you have multiple E-Stop buttons I have often gotten into an argument that says you can have a reset beside each one. I was taught that you were required to have a single point of reset. Who is correct?”

— Michael Barb, Sr. Electrical Engineer

The Short Answer

There is noth­ing in the EU, US or Canadian reg­u­la­tions that would for­bid hav­ing mul­tiple reset but­tons. However, you must under­stand the over­lap­ping require­ments for emer­gency stop and pre­ven­tion of unex­pec­ted start-​up.

The Long Answer

First I need to define two dif­fer­ent types of reset for clar­ity:

  1. Emergency Stop Device Reset: Each e-​stop device, i.e. but­ton, pull cord, foot switch, etc., is required to latch in the activ­ated state and must be indi­vidu­ally reset. Resetting the e-​stop device is NOT per­mit­ted to re-​start the machinery, only to per­mit restart­ing. (NFPA 79, CSA Z432, ISO 14118).
  2. Restarting the machine is a sep­ar­ate delib­er­ate action from reset­ting the emer­gency stop device(s).

ANSI B11-​2008 provides some dir­ect guid­ance on this top­ic:

7.2.2 Zones

A machine or an assembly of machines may be divided into several control zones (e.g., for emergency stopping, stopping as a result of safeguarding devices, start-up, isolation or energy dissipation). The machine and controls in different zones shall be defined and identified. Controls for machines in zones can be local for each machine, across several machines in a zone, or globally for machines across zones. The control requirements shall be based on the operational requirements and on the risk assessment. The interfaces between zones, including synchronization and independent operation, shall be designed such that no function in one zone creates a hazard(s) / hazardous situation in another zone.

CSA Z432-​04 has sim­il­ar word­ing:

6.2.1.8.4

When zones can be determ­ined, their delim­it­a­tions shall be evid­ent (includ­ing the effect of the asso­ci­ated emer­gency stop device). This shall also apply to the effect of isol­a­tion and energy dis­sip­a­tion.

Let’s take a case with a single e-​stop but­ton first. The same require­ments apply for all e-​stop devices. The require­ments include:

  1. Button must be in ‘easy-​reach’ of the nor­mal oper­at­or pos­i­tion. I con­sider ‘easy-​reach’ to be the range I can touch while sit­ting or stand­ing at the nor­mal oper­at­or pos­i­tion. This pos­i­tion is not neces­sar­ily in front of the con­trol pan­el. This is the pos­i­tion where the oper­at­or is expec­ted to be while car­ry­ing out the tasks expec­ted of them when the machine is oper­at­ing. This is the require­ment that drives hav­ing mul­tiple but­tons in most cases.
  2. E-​stop devices can­not be loc­ated so that the oper­at­or must reach over or past a haz­ard to activ­ate them.
  3. The but­ton must latch in the oper­ated pos­i­tion.
  4. The button must be robust enough to handle the mechanical and electrical stresses that will be placed on it when used, i.e., rugged buttons are required.
  5. When the e-stop device is reset, i.e., returned to the ‘RUN’ position, the machine is NOT permitted to restart. It is only PERMITTED to restart. It must be restarted through another deliberate action, like pressing a ‘Power On’ button.
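The difference between resetting an e-stop device and restarting the machine is easy to blur, so here is a minimal Python sketch of the behaviour described in the list above. It is a toy model under my own assumptions (the class and device names are hypothetical), not a description of any particular control system: every device latches when actuated, resetting a device never starts the machine, and a separate start action only succeeds once every device has been reset.

```python
class EStopChain:
    """Toy model of several latching e-stop devices on one machine."""

    def __init__(self, device_ids):
        self.latched = {device: False for device in device_ids}
        self.running = False

    def actuate(self, device):
        """Pressing any e-stop latches that device and stops the machine."""
        self.latched[device] = True
        self.running = False

    def reset_device(self, device):
        """Releasing the device only removes its latch; it does NOT restart."""
        self.latched[device] = False

    def start(self):
        """Separate deliberate action; refused while any device is latched."""
        if not any(self.latched.values()):
            self.running = True
        return self.running


chain = EStopChain(["operator station", "infeed", "outfeed"])
chain.start()                          # machine running
chain.actuate("infeed")
chain.actuate("outfeed")
chain.reset_device("infeed")
still_stopped = not chain.start()      # 'outfeed' is still latched
chain.reset_device("outfeed")
idle_after_reset = not chain.running   # resetting alone did not restart
restarted = chain.start()              # deliberate restart succeeds
print(still_stopped, idle_after_reset, restarted)
```

Where the start action is physically located, and how many of them there are, is the real subject of the reader's question. The model only shows that the two actions have to remain separate.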

So what do you do with the ‘POWER ON’ or safety circuit reset button? The first question to ask is: ‘What happens when I reset this circuit, applying power to the control circuits?’

Case A: If it is impossible to see the entire machine from the loc­a­tion of the reset but­ton, then I would recom­mend a single reset but­ton loc­ated at the HMI or main con­sole. The oper­at­or must check to make sure the machine is clear before re-​applying power. Where the machine is too big to be com­pletely vis­ible from the main oper­at­or con­sole, then I would also recom­mend:

  • warn­ing horn, 
  • warn­ing lights, and 
  • a start-​up delay that is long enough to allow a per­son to get clear of the machine before it starts mov­ing.

Case B: If the machine is simply ‘enabled’ at this point, but no motion occurs, then mul­tiple ‘reset’ or ‘power on’ but­tons may be accept­able, depend­ing on the out­come of the risk assess­ment and start/​stop ana­lys­is. Having said that, the oper­at­or will likely have to return to a main con­sole to reset the machine and restart oper­a­tion, and chances are there is only one HMI screen on the machine, so there may not be any advant­age to hav­ing mul­tiple reset but­tons.

I would recom­mend doing two things to get a good handle on this: Conduct a detailed risk assess­ment and include all nor­mal oper­a­tions and all main­ten­ance oper­a­tions. Then con­duct a start/​stop ana­lys­is to look at all of the start­ing and stop­ping con­di­tions that you can reas­on­ably fore­see. Combine the res­ults of these two ana­lyses to find the start­ing and stop­ping con­di­tions with the highest risk, and then determ­ine if hav­ing mul­tiple reset but­tons will con­trib­ute to the risk or not. You may also want to look at the con­trol reli­ab­il­ity require­ments for the emer­gency stop sys­tem based on the out­come of the risk assess­ment and the start/​stop ana­lys­is.

In a case where there are multiple emergency stop devices, locations are important. There must be one at each normal workstation, within ‘easy reach’, to meet the regulatory requirements in most jurisdictions. You may also want some inside the machine if it is possible to gain full body access inside the machinery, e.g., inside a robot work cell. Make sure that the buttons or other devices are located so that a person exposed to the hazard(s) inside the machine is not required to reach over or past the hazard to get to the button.

Michael, I hope that settles the argu­ment!