Interlock Architectures — Pt. 1: What do those categories really mean?

This entry is part 1 of 8 in the series Circuit Architectures Explored

It all started with EN 954–1

In 1995 CEN pub­lished an impor­tant stan­dard for machine builders — EN 954–1, “Safety of Machinery — Safety Related Parts of Control Systems — Part 1: General Principles for Design”. This stan­dard set the stage for defin­ing con­trol reli­a­bil­ity in machin­ery safe­guard­ing sys­tems, intro­duc­ing the Reliability cat­e­gories that have become ubiq­ui­tous. So what do these cat­e­gories mean, and how are they applied under the lat­est machin­ery stan­dard, ISO 13849–1?

Circuit Categories

The cat­e­gories are used to describe sys­tem archi­tec­tures for safety related con­trol sys­tems. Each archi­tec­ture car­ries with it a range of reli­a­bil­ity per­for­mance that can be related to the degree of risk reduc­tion you are expect­ing to achieve with the sys­tem. These archi­tec­tures can be applied equally to elec­tri­cal, elec­tronic, pneu­matic, hydraulic or mechan­i­cal con­trol systems.

Historical Circuits

Early elec­tri­cal ‘master-​​control-​​relay’ cir­cuits used a sim­ple archi­tec­ture with a sin­gle con­tac­tor, or some­times two, and a sin­gle chan­nel style of archi­tec­ture to main­tain the con­tac­tor coil cir­cuit once the START or POWER ON but­ton (PB2 in Fig. 1) had been pressed. Power to the out­put ele­ments of the machine con­trols was sup­plied via con­tacts on the con­tac­tor, which is why it was called the Master Control Relay or ‘MCR’. The POWER OFF but­ton (PB1 in Fig. 1) could be labeled that way, or you could make the same cir­cuit into an Emergency Stop by sim­ply replac­ing the oper­a­tor with a red mushroom-​​head push but­ton. These devices were usu­ally spring-​​return, so to restore power, all that was needed was to push the POWER ON but­ton again (Fig.1).

Figure 1 — Basic Stop/​Start Circuit

Typically, the com­po­nents used in these cir­cuits were spec­i­fied to meet the cir­cuit con­di­tions, but not more. Controls man­u­fac­tur­ers brought out over-​​dimensioned ver­sions, such as Allen-Bradley’s Bulletin 700-​​PK con­tac­tor which had 20 A rated con­tacts instead of the stan­dard Bulletin 700’s 10 A contacts.

When inter­locked guards began to show up, they were inte­grated into the orig­i­nal MCR cir­cuit by adding a basic con­trol relay (CR1 in Fig. 2) whose coil was con­trolled by the inter­lock switch(es) (LS1 in Fig. 2), and whose out­put con­tacts were in series with the coil cir­cuit of the MCR con­tac­tor. Opening the guard inter­lock would open the MCR coil cir­cuit and drop power to the machine con­trols. Very simple.

Figure 2 — Old-​​School Start/​Stop Circuit with Guard Relay
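The logic of the circuits in Figures 1 and 2 can be sketched in a few lines of Python. This is only an illustrative model and not a design tool: the function name `scan_mcr` and its parameters are assumptions made for this example, while PB1, PB2, LS1, CR1 and MCR are the device tags from the figures.

```python
def scan_mcr(mcr_sealed, stop_pressed, start_pressed, guard_closed=True):
    """Evaluate the MCR coil rung once and return the new MCR state.

    mcr_sealed    -- True if the MCR contactor is currently energized
    stop_pressed  -- PB1 (normally closed) is being pressed
    start_pressed -- PB2 (normally open) is being pressed
    guard_closed  -- LS1 is closed, so the CR1 coil is energized
    """
    cr1 = guard_closed                     # CR1 follows the interlock switch
    seal_in = start_pressed or mcr_sealed  # PB2 in parallel with the MCR holding contact
    return cr1 and (not stop_pressed) and seal_in


# Walk through a typical sequence: start, run, open the guard, close it again.
mcr = False
mcr = scan_mcr(mcr, stop_pressed=False, start_pressed=True)   # POWER ON pressed
running_after_start = mcr
mcr = scan_mcr(mcr, stop_pressed=False, start_pressed=False)  # button released, seal-in holds
running_after_release = mcr
mcr = scan_mcr(mcr, False, False, guard_closed=False)         # guard opened, CR1 drops out
running_guard_open = mcr
mcr = scan_mcr(mcr, False, False, guard_closed=True)          # guard closed again
running_after_reclose = mcr                                   # stays off until POWER ON is pressed
```

Because the holding contact is lost as soon as the coil drops out, closing the guard again does not restart the machine; someone has to press POWER ON. The model assumes every contact does what its coil or actuator tells it to do, which is exactly the assumption that fails in a single channel circuit.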

‘Ice-cube’ style plug-in relays were often chosen for CR1. These devices did not have ‘force-guided’ contacts in them, so it was possible to have one contact in the relay fail while the other continued to operate properly.

LS1 could be any kind of switch. Frequently a ‘micro-switch’ style of limit switch was chosen. These snap-action switches could fail shorted internally, or weld closed, and the actuator would continue to work normally even though the switch itself had failed. These switches are also ridiculously easy to bypass. All that is required is a piece of tape or an elastic band and the switch is no longer doing its job.

Photo 1 — Micro-​​Switch style limit switch used as a cover inter­lock switch in a piece of indus­trial laun­dry equipment

The prob­lem with these cir­cuits is that they can fail in a num­ber of ways that aren’t obvi­ous to the user, with the result being that the inter­lock might not work as expected, or the Emergency Stop might fail just when you need it most.

Modern Circuits

Category B

These orig­i­nal cir­cuits are the basis for what became known as ‘Category B’ (‘B’ for ‘Basic’) cir­cuits. Here’s the def­i­n­i­tion from the stan­dard. Note that I am tak­ing this excerpt from ISO 13849–1: 2007 (Edition 2). “SRP/​CS” stands for “Safety Related Parts of Control Systems”:

6.2.3 Category B
The SRP/​CS shall, as a min­i­mum, be designed, con­structed, selected, assem­bled and com­bined in accor­dance with the rel­e­vant stan­dards and using basic safety prin­ci­ples for the spe­cific appli­ca­tion to withstand

  • the expected oper­at­ing stresses, e.g. the reli­a­bil­ity with respect to break­ing capac­ity and frequency,
  • the influ­ence of the processed mate­r­ial, e.g. deter­gents in a wash­ing machine, and
  • other rel­e­vant exter­nal influ­ences, e.g. mechan­i­cal vibra­tion, elec­tro­mag­netic inter­fer­ence, power sup­ply inter­rup­tions or disturbances.

There is no diag­nos­tic cov­er­age (DCavg = none) within cat­e­gory B sys­tems and the MTTFd of each chan­nel can be low to medium. In such struc­tures (nor­mally single-​​channel sys­tems), the con­sid­er­a­tion of CCF is not relevant.

The max­i­mum PL achiev­able with cat­e­gory B is PL = b.

NOTE When a fault occurs it can lead to the loss of the safety function.

Specific require­ments for elec­tro­mag­netic com­pat­i­bil­ity are found in the rel­e­vant prod­uct stan­dards, e.g. IEC 61800–3 for power drive sys­tems. For func­tional safety of SRP/​CS in par­tic­u­lar, the immu­nity require­ments are rel­e­vant. If no prod­uct stan­dard exists, at least the immu­nity require­ments of IEC 61000−6−2 should be followed.

The stan­dard also pro­vides us with a nice block dia­gram of what a single-​​channel sys­tem might look like:

ISO 13849–1 Category B Designated Architecture

If you look at this block dia­gram and the Start/​Stop Circuit with Guard Relay above, you can see how this basic cir­cuit trans­lates into a sin­gle chan­nel archi­tec­ture, since from the con­trol inputs to the con­trolled load you have a sin­gle chan­nel. Even the guard loop is a sin­gle chan­nel. A fail­ure in any com­po­nent in the chan­nel can result in loss of con­trol of the load.

Let’s look at each part of this requirement in more detail, since each of the subsequent Categories builds upon these BASIC requirements.

The SRP/​CS shall, as a min­i­mum, be designed, con­structed, selected, assem­bled and com­bined in accor­dance with the rel­e­vant stan­dards and using basic safety prin­ci­ples for the spe­cific application…

Basic Safety Principles

We have to go to ISO 13849–2 to get a def­i­n­i­tion of what Basic Safety Principles might include. Looking at Annex A.2 of the stan­dard we find:

Table A.1 — Basic Safety Principles

Basic Safety Principles | Remarks
Use of suitable materials and adequate manufacturing | Selection of material, manufacturing methods and treatment in relation to, e.g. stress, durability, elasticity, friction, wear, corrosion, temperature.
Correct dimensioning and shaping | Consider e.g. stress, strain, fatigue, surface roughness, tolerances, sticking, manufacturing.
Proper selection, combination, arrangements, assembly and installation of components/systems | Apply manufacturer’s application notes, e.g. catalogue sheets, installation instructions, specifications, and use of good engineering practice in similar components/systems.
Use of de-energisation principle | The safe state is obtained by release of energy. See primary action for stopping in EN 292–2:1991 (ISO/TR 12100–2:1992), 3.7.1. Energy is supplied for starting the movement of a mechanism. See primary action for starting in EN 292–2:1991 (ISO/TR 12100–2:1992), 3.7.1. Consider different modes, e.g. operation mode, maintenance mode. This principle shall not be used in special applications, e.g. to keep energy for clamping devices.
Proper fastening | For the application of screw locking consider manufacturer’s application notes. Overloading can be avoided by applying adequate torque loading technology.
Limitation of the generation and/or transmission of force and similar parameters | Examples are break pin, break plate, torque limiting clutch.
Limitation of range of environmental parameters | Examples of parameters are temperature, humidity, pollution at the installation place. See clause 8 and consider manufacturer’s application notes.
Limitation of speed and similar parameters | Consider e.g. the speed, acceleration, deceleration required by the application.
Proper reaction time | Consider e.g. spring tiredness, friction, lubrication, temperature, inertia during acceleration and deceleration, combination of tolerances.
Protection against unexpected start-up | Consider unexpected start-up caused by stored energy and after power “supply” restoration for different modes as operation mode, maintenance mode etc. Special equipment for release of stored energy may be necessary. Special applications, e.g. to keep energy for clamping devices or ensure a position, need to be considered separately.
Simplification | Reduce the number of components in the safety-related system.
Separation | Separation of safety-related functions from other functions.
Proper lubrication |
Proper prevention of the ingress of fluids and dust | Consider IP rating [see EN 60529 (IEC 60529)]

As you can see, the basic safety prin­ci­ples are pretty basic — select com­po­nents appro­pri­ately for the appli­ca­tion, con­sider the oper­at­ing con­di­tions for the com­po­nents, fol­low manufacturer’s data, and use de-​​energization to cre­ate the stop func­tion. That way, a loss of power results in the sys­tem fail­ing into a safe state, as does an open relay coil or set of burnt contacts.

“…the expected operating stresses, e.g. the reliability with respect to breaking capacity and frequency,”

Specify your com­po­nents cor­rectly with regard to volt­age, cur­rent, break­ing capac­ity, tem­per­a­ture, humid­ity, dust,…

“…other relevant external influences, e.g. mechanical vibration, electromagnetic interference, power supply interruptions or disturbances.”

“Specific requirements for electromagnetic compatibility are found in the relevant product standards, e.g. IEC 61800–3 for power drive systems. For functional safety of SRP/CS in particular, the immunity requirements are relevant. If no product standard exists, at least the immunity requirements of IEC 61000–6–2 should be followed.”

Probably the biggest ‘gotcha’ in this point is “electromagnetic interference”. This is important enough that the standard devotes a paragraph to it specifically. I added the bold text to highlight the idea of ‘functional safety’. You can find other information in other posts on this blog on that topic. If your product is destined for the European Union (EU), then you will almost certainly be doing some EMC testing, unless your product is a ‘fixed installation’. If it’s going to almost any other market, you probably are not undertaking this testing. So how do you know if your design meets these criteria? Unless you test, you don’t. You can make some educated guesses based on sound engineering practices, but after that you can only hope.

Diagnostic Coverage

“…There is no diagnostic coverage (DCavg = none) within category B systems…”

Category B sys­tems are fun­da­men­tally sin­gle chan­nel. A sin­gle fault in the sys­tem will lead to the loss of the safety func­tion. This sen­tence refers to the con­cept of “diag­nos­tic cov­er­age” that was intro­duced in ISO 13849–1:2007, but what this means in prac­tice is that there is no mon­i­tor­ing or feed­back from any crit­i­cal ele­ments. Remember our basic MCR cir­cuit? If the MCR con­tac­tor welded closed, the only diag­nos­tic was the fail­ure of the machine to stop when the emer­gency stop but­ton was pressed.
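To make “DCavg = none” a little more concrete, here is a minimal sketch of how diagnostic coverage can be expressed as a number. It assumes the usual definition of DC as the fraction of dangerous failures that the diagnostics detect, and the band limits (below 60 % is “none”, 60 % to under 90 % is “low”, 90 % to under 99 % is “medium”, 99 % and up is “high”) are quoted from ISO 13849–1 as I read it, not from the excerpt above. The function names and the failure rate used are hypothetical.

```python
def diagnostic_coverage(detected_dangerous_rate, total_dangerous_rate):
    """Return DC as a fraction: detected dangerous failures / all dangerous failures."""
    if total_dangerous_rate <= 0:
        raise ValueError("total dangerous failure rate must be positive")
    return detected_dangerous_rate / total_dangerous_rate


def dc_band(dc):
    """Map a DC fraction (0.0 to 1.0) to its band name."""
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


# The basic MCR circuit has no monitoring, so no dangerous failure is detected.
mcr_dc = diagnostic_coverage(detected_dangerous_rate=0.0, total_dangerous_rate=2.0e-6)
mcr_band = dc_band(mcr_dc)
```

With nothing watching the contactor, the detected rate is zero and the circuit lands in the “none” band no matter how good the components are.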

Component Failure Rates

“…the MTTFd of each channel can be low to medium.”

This part of the statement refers to another new concept from ISO 13849–1:2007, “MTTFd”. Standing for “Mean Time to Dangerous Failure”, it describes how long, on average, a component or channel is expected to operate before a dangerous failure occurs, and it is expressed in years. Calculating MTTFd is a significant part of implementing the new standard. From the perspective of understanding Category B, what this means is that you do not need to use high-reliability components in these systems.
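As a rough illustration of how a channel MTTFd comes together, the sketch below combines component values using the parts-count idea (failure rates, 1/MTTFd, add up along a single channel) and then sorts the result into the bands used by ISO 13849–1: low is 3 to under 10 years, medium is 10 to under 30 years, and high is 30 to 100 years. The band limits come from the standard and not from the excerpt quoted above, and the component values are invented purely for the example.

```python
def channel_mttfd(component_mttfd_years):
    """Combine component MTTFd values (in years) for one single channel.

    Parts-count approach: the dangerous failure rates (1/MTTFd) are summed.
    """
    return 1.0 / sum(1.0 / m for m in component_mttfd_years)


def mttfd_band(years):
    """Classify a channel MTTFd in years as low, medium or high."""
    if years < 3:
        return "below low"
    if years < 10:
        return "low"
    if years < 30:
        return "medium"
    return "high"


# Hypothetical MTTFd values for LS1, CR1 and the MCR contactor, in years.
example = channel_mttfd([50.0, 40.0, 30.0])
example_band = mttfd_band(example)
```

Note how three components that would each rate “high” on their own combine into a channel that is only “medium”: every part added to a single channel pulls the channel MTTFd down.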

Common Cause Failures

“In such structures (normally single-channel systems), the consideration of CCF is not relevant.”

CCF is another new concept from ISO 13849–1:2007, and stands for “Common Cause Failure”. I’m not going to get into this in any detail here, but suffice it to say that channel separation (impossible in a single channel architecture) and other design techniques are used to reduce the likelihood of CCF in higher reliability systems.

Performance Levels

“The maximum PL achievable with category B is PL = b.”

PL stands for “Performance Level”, divided into five degrees from ‘a’ to ‘e’. PLa is equal to an average probability of dangerous failure per hour of ≥ 10⁻⁵ to < 10⁻⁴, or one failure in 10,000 to 100,000 hours of operation. PLb is equal to ≥ 3 × 10⁻⁶ to < 10⁻⁵ failures per hour, or one failure in 100,000 to about 333,000 hours of operation. This sounds like a lot of hours, but when dealing with probabilities, these numbers are actually pretty low.

If you consider an operation running a single shift in Canada where the normal working year is 50 weeks and the normal workday is 7.5 hours, a working year is

7.5 h/d x 5 d/w x 50 w/a = 1875 hours/a

Taking the failure rates per hour above yields:

PLa = one failure in 5.3 years of operation to one failure in 53.3 years

PLb = one failure in 53.3 years of operation to one failure in 177.8 years

If we go to an operation running three shifts in Canada, a working year is:

7.5 h/shift x 3 shifts/d x 5 d/w x 50 w/a = 5625 hours/a

Taking the failure rates per hour above yields:

PLa = one failure in 1.8 years of operation to one failure in 17.8 years

PLb = one failure in 17.8 years of operation to one failure in 59.3 years

Now you should be start­ing to get an idea about where this is going. It’s impor­tant to remem­ber that prob­a­bil­i­ties are just that — the fail­ure could hap­pen in the first hour of oper­a­tion or at any time after that, or never. These fig­ures give you some way to gauge the rel­a­tive reli­a­bil­ity of the design, and ARE NOT any sort of guarantee.
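If you want to run the same conversion for your own shift pattern, a short script does the arithmetic. This is only a sketch: the function and constant names are made up for the example, and it takes the PLa band as 10⁻⁵ to 10⁻⁴ and the PLb band as 3 × 10⁻⁶ to 10⁻⁵ dangerous failures per hour.

```python
HOURS_SINGLE_SHIFT = 7.5 * 5 * 50       # 1875 h/a
HOURS_THREE_SHIFTS = 7.5 * 3 * 5 * 50   # 5625 h/a

# Average probability of dangerous failure per hour: (lower bound, upper bound)
PL_BANDS = {
    "a": (1e-5, 1e-4),
    "b": (3e-6, 1e-5),
}


def years_between_failures(failures_per_hour, hours_per_year):
    """Convert a failure rate per hour into mean working years per failure."""
    return 1.0 / (failures_per_hour * hours_per_year)


def pl_range_in_years(pl, hours_per_year):
    """Return (shortest, longest) mean years per failure for a PL band."""
    low_rate, high_rate = PL_BANDS[pl]
    return (years_between_failures(high_rate, hours_per_year),
            years_between_failures(low_rate, hours_per_year))


for pl in PL_BANDS:
    for label, hours in (("single shift", HOURS_SINGLE_SHIFT),
                         ("three shifts", HOURS_THREE_SHIFTS)):
        shortest, longest = pl_range_in_years(pl, hours)
        print(f"PL{pl}, {label}: {shortest:.1f} to {longest:.1f} years")
```

The highest failure rate in a band gives the shortest interval, which is why the bounds swap places inside `pl_range_in_years`. Change the hours-per-year constants to match your own operation.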

Add to your Library

If you are work­ing on imple­ment­ing these design stan­dards in your prod­ucts, you need to buy copies of the stan­dards for your library.

  • ISO 13849–1:2006 Safety of machin­ery — Safety-​​related parts of con­trol sys­tems — Part 1: General prin­ci­ples for design
  • ISO 13849–2:2003 Safety of machin­ery — Safety-​​related parts of con­trol sys­tems — Part 2: Validation
  • ISO TR 13849–100:2000 Safety of machin­ery — Safety-​​related parts of con­trol sys­tems — Part 100: Guidelines for the use and appli­ca­tion of ISO 13849–1

If you are working in the EU, or are working on CE Marking your product, you should hold the harmonized version of this standard, available from one of the CEN resellers:

EN ISO 13849–1:2008 Safety of machin­ery — Safety-​​related parts of con­trol sys­tems — Part 1: General prin­ci­ples for design

Watch for the next post in this series where I will look at Category 1 requirements!

Interlock Architectures – Pt. 2: Category 1

This entry is part 2 of 8 in the series Circuit Architectures Explored

In Part 1 of this series we explored Category B, the Basic Category that under­pins all of the other Categories.

This post builds on Part 1 by tak­ing a look at Category 1. Let’s start by explor­ing the dif­fer­ence as defined in ISO 13849–1. Remember that “SRP/​CS” stands for “Safety Related Parts of Control Systems”.

SRP/​CS of cat­e­gory 1 shall be designed and con­structed using well-​​tried com­po­nents and well-​​tried safety prin­ci­ples (see ISO 13849–2).

Well-​​Tried Components

So what, exactly, is a “Well-Tried Component”? Let’s go back to the standard for that:

A “well-​​tried com­po­nent” for a safety-​​related appli­ca­tion is a com­po­nent which has been either

a) widely used in the past with suc­cess­ful results in sim­i­lar appli­ca­tions, or
b) made and ver­i­fied using prin­ci­ples which demon­strate its suit­abil­ity and reli­a­bil­ity for safety-​​related applications.

Newly devel­oped com­po­nents and safety prin­ci­ples may be con­sid­ered as equiv­a­lent to “well-​​tried” if they ful­fil the con­di­tions of b).

The deci­sion to accept a par­tic­u­lar com­po­nent as being “well-​​tried” depends on the application.

NOTE 1 Complex elec­tronic com­po­nents (e.g. PLC, micro­proces­sor, application-​​specific inte­grated cir­cuit) can­not be con­sid­ered as equiv­a­lent to “well tried”.

Let’s look at what this all means by referencing ISO 13849–2:

Table A.3 — Well-​​Tried Components

Well-Tried Components | Conditions for “well-tried” | Standard or specification
Screw | All factors influencing the screw connection and the application are to be considered. See Table A.2 “List of well-tried safety principles”. | Mechanical jointing such as screws, nuts, washers, rivets, pins, bolts etc. are standardised.
Spring | See Table A.2 “Use of a well-tried spring”. | Technical specifications for spring steels and other special applications are given in ISO 4960.
Cam | All factors influencing the cam arrangement (e.g. part of an interlocking device) are to be considered. See Table A.2 “List of well-tried safety principles”. | See EN 1088 (ISO 14119) (Interlocking devices).
Break-pin | All factors influencing the application are to be considered. See Table A.2 “List of well-tried safety principles”. |

OK, so now we have a few ideas about what might con­sti­tute a ‘well-​​tried com­po­nent’. Unfortunately, you will notice that ‘con­tac­tor’ or ‘relay’ or ‘limit switch’ appear nowhere on the list. This is a chal­lenge, but one that can be over­come. The key to deal­ing with this is to look at how the com­po­nents that you are choos­ing to use are con­structed. If they use these com­po­nents and tech­niques, you are on your way to con­sid­er­ing them to be well-​​tried.

Another approach is to let the com­po­nent man­u­fac­turer worry about the details of the con­struc­tion of the device, and sim­ply ensure that com­po­nents selected for use in the SRP/​CS are ‘safety rated’ by the man­u­fac­turer. This can work in 80–90% of cases, with a small per­cent­age of com­po­nents, such as large motor starters, some servo and step­per dri­ves and other sim­i­lar com­po­nents unavail­able with a safety rat­ing. It’s worth not­ing that many drive man­u­fac­tur­ers are start­ing to pro­duce dri­ves with built-​​in safety com­po­nents that are intended to be inte­grated into your SRP/​CS.

Exclusion of Complex Electronics

Note 1 from the first part of the def­i­n­i­tion is very impor­tant. So impor­tant that I’m going to repeat it here:

NOTE 1 Complex elec­tronic com­po­nents (e.g. PLC, micro­proces­sor, application-​​specific inte­grated cir­cuit) can­not be con­sid­ered as equiv­a­lent to “well tried”.

This little note is what prevents any safety system that incorporates a standard PLC from being considered anything more than Category B, regardless of redundancy and component selections for all other components. It’s also important to realize that this definition is only considering the hardware; no mention of software is made here, and software is not dealt with until later in the standard.

Well-​​Tried Safety Principles

Let’s have a look at what ‘Well-​​Tried Safety Principles’ might be.

Table A.2 — Well-​​Tried Safety Principles

Well-tried Safety Principles | Remarks
Use of carefully selected materials and manufacturing | Selection of suitable material, adequate manufacturing methods and treatments related to the application.
Use of components with oriented failure mode | The predominant failure mode of a component is known in advance and always the same, see EN 292–2:1991, (ISO/TR 12100–2:1992), 3.7.4.
Over-dimensioning/safety factor | The safety factors are given in standards or by good experience in safety-related applications.
Safe position | The moving part of the component is held in one of the possible positions by mechanical means (friction only is not enough). Force is needed for changing the position.
Increased OFF force | A safe position/state is obtained by an increased OFF force in relation to ON force.
Careful selection, combination, arrangement, assembly and installation of components/system related to the application |
Careful selection of fastening related to the application | Avoid relying only on friction.
Positive mechanical action | Dependent operation (e.g. parallel operation) between parts is obtained by positive mechanical link(s). Springs and similar “flexible” elements should not be part of the link(s) [see EN 292–2:1991 (ISO/TR 12100–2:1992), 3.5].
Multiple parts | Reducing the effect of faults by multiplying parts, e.g. where a fault of one spring (of many springs) does not lead to a dangerous condition.
Use of well-tried spring (see also Table A.3) | A well-tried spring requires: use of carefully selected materials, manufacturing methods (e.g. presetting and cycling before use) and treatments (e.g. rolling and shot-peening); sufficient guidance of the spring; and sufficient safety factor for fatigue stress (i.e. with high probability a fracture will not occur). Well-tried pressure coil springs may also be designed by: use of carefully selected materials, manufacturing methods (e.g. presetting and cycling before use) and treatments (e.g. rolling and shot-peening); sufficient guidance of the spring; clearance between the turns less than the wire diameter when unloaded; and sufficient force after a fracture(s) is maintained (i.e. a fracture(s) will not lead to a dangerous condition).
Limited range of force and similar parameters | Decide the necessary limitation in relation to the experience and application. Examples for limitations are break pin, break plate, torque limiting clutch.
Limited range of speed and similar parameters | Decide the necessary limitation in relation to the experience and application. Examples for limitations are centrifugal governor; safe monitoring of speed or limited displacement.
Limited range of environmental parameters | Decide the necessary limitations. Examples on parameters are temperature, humidity, pollution at the installation. See clause 8 and consider manufacturer’s application notes.
Limited range of reaction time, limited hysteresis | Decide the necessary limitations. Consider e.g. spring tiredness, friction, lubrication, temperature, inertia during acceleration and deceleration, combination of tolerances.

Use of Positive-​​Mode Operation

The use of these prin­ci­ples in the com­po­nents, as well as in the over­all design of the safe­guards is impor­tant. In devel­op­ing a sys­tem that uses ‘pos­i­tive mode oper­a­tion’, the mechan­i­cal link­age that oper­ates the elec­tri­cal con­tacts or the fluid-​​power valve that con­trols the prime-mover(s) (i.e. motors, cylin­ders, etc.), must act to directly drive the con­trol ele­ment (con­tacts or valve spool) to the safe state. Springs can be used to return the sys­tem to the run state or dan­ger­ous state, since a fail­ure of the spring will result in the inter­lock device stay­ing in the safe state (fail-​​safe or fail-​​to-​​safety).

CSA Z432 pro­vides us with a nice dia­gram that illus­trates the idea of “positive-​​action” or “positive-​​mode” operation:

CSA Z432-​​04 Fig B.10 — Positive Mode Operation

In Figure B.10, opening the guard door forces the roller to follow the cam attached to the door, driving the switch contacts apart and opening the interlock. Even if the contacts were to weld, they would still be driven apart, since the mechanical advantage provided by the width of the door and the cam is more than enough to force the contacts apart.

Here’s an exam­ple of a ‘neg­a­tive mode’ operation:

CSA Z432-​​04 Fig B.11 — Negative Mode operation

In Figure B.11, the inter­lock switch relies on a spring to enter the safe state when the door is opened. If the spring in the inter­lock device fails, the sys­tem fails-​​to-​​danger. Also note that this design is very easy to defeat. A ‘zip-​​tie’ or some tape is all that would be required to keep the inter­lock in the ‘RUN’ condition.
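The difference between the two arrangements can be reduced to a toy model. The sketch below is a deliberate simplification with made-up function names: it treats a positive-mode switch as opening whenever the door opens, because the door and cam supply the force, and a negative-mode switch as opening only if its spring is intact and the contacts are free to move.

```python
def contacts_open_positive_mode(door_open, contacts_welded=False, spring_broken=False):
    """Positive mode: the cam on the door drives the contacts apart directly.

    The fault flags are accepted but ignored on purpose: in this model the
    door itself supplies the opening force, so neither fault matters.
    """
    return door_open


def contacts_open_negative_mode(door_open, contacts_welded=False, spring_broken=False):
    """Negative mode: only the spring opens the contacts when the door moves away."""
    spring_can_open = (not spring_broken) and (not contacts_welded)
    return door_open and spring_can_open


# Door opened while the switch has a broken spring.
positive_result = contacts_open_positive_mode(door_open=True, spring_broken=True)
negative_result = contacts_open_negative_mode(door_open=True, spring_broken=True)
```

In the positive-mode case the interlock still opens; in the negative-mode case the contacts stay closed with the guard open, which is the fail-to-danger condition described above.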

You should now have a better idea of what is meant when you read about positive-mode and negative-mode operation. We’ll talk about defeat resistance in another article.

Reliability

Combining what you’ve learned so far, you can see that cor­rectly spec­i­fied com­po­nents, com­bined with over-​​dimensioning and imple­men­ta­tion of design lim­its along with the use of well-​​tried safety prin­ci­ples will go a long way to improv­ing the reli­a­bil­ity of the con­trol sys­tem. The next part of the def­i­n­i­tion of Category 1 speaks to some addi­tional requirements:

The MTTFd of each chan­nel shall be high.

The max­i­mum PL achiev­able with cat­e­gory 1 is PL = c.

NOTE 2 There is no diag­nos­tic cov­er­age (DCavg = none) within cat­e­gory 1 sys­tems. In such struc­tures (single-​​channel sys­tems) the con­sid­er­a­tion of CCF is not relevant.

NOTE 3 When a fault occurs it can lead to the loss of the safety func­tion. However, the MTTFd of each chan­nel in cat­e­gory 1 is higher than in cat­e­gory B. Consequently, the loss of the safety func­tion is less likely.

We now know that the control reliability is better with a Category 1 system than with a B, since the maximum achievable PL has gone from ‘b’ to ‘c’. PLc is equal to ≥ 10⁻⁶ to < 3 × 10⁻⁶ failures per hour. This is a pretty good result for simply improving the components used in the system!

To get a han­dle on what PLc means, let’s look at our sin­gle and three shift exam­ples again. If we take a Canadian oper­a­tion with a sin­gle shift per day, and a 50 week work­ing year we get:

7.5 h/​shift x 5 d/​w x 50 w/​a = 1875 h/​a

In this case, PLc is equivalent to one failure in 177.8 years of operation to one failure in 533.3 years of operation.

Looking at three shifts per day in the same oper­a­tion gives us:

7.5 h/​shift x 3 shifts/​d x 5 d/​w x 50 w/​a = 5625 h/​a

In this case, PLc is equivalent to one failure in 59.3 years of operation to one failure in 177.8 years of operation.

Remember that these are prob­a­bil­i­ties, not guar­an­tees. A fail­ure could hap­pen in the first hour of oper­a­tion, the last hour of oper­a­tion or never. These fig­ures sim­ply pro­vide a way for you as the designer to gauge the rel­a­tive reli­a­bil­ity of the system.
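Going the other way, from a failure rate to a Performance Level, is a simple lookup. The sketch below covers only the three bands quoted so far in this series (a, b and c) and the maximum PL stated in the definitions of Categories B and 1. The names are hypothetical and the table is not a substitute for the full method in the standard.

```python
# Upper bound of each PL band, in dangerous failures per hour, best band first.
PL_UPPER_BOUNDS = [("c", 3e-6), ("b", 1e-5), ("a", 1e-4)]
PL_C_LOWER_BOUND = 1e-6

# Maximum PL achievable, from the definitions of Category B and Category 1.
MAX_PL_BY_CATEGORY = {"B": "b", "1": "c"}


def pl_from_rate(failures_per_hour):
    """Return 'a', 'b' or 'c' for a rate inside those bands, otherwise None."""
    if failures_per_hour < PL_C_LOWER_BOUND:
        return None  # better than PLc, outside the bands covered here
    for letter, upper in PL_UPPER_BOUNDS:
        if failures_per_hour < upper:
            return letter
    return None      # worse than PLa


def category_can_reach(category, required_pl):
    """True if the category's maximum PL is at least the required PL."""
    # Single lowercase letters compare in alphabetical order: 'a' < 'b' < 'c'.
    return MAX_PL_BY_CATEGORY[category] >= required_pl


rate_pl = pl_from_rate(2e-6)
b_reaches_c = category_can_reach("B", "c")
cat1_reaches_c = category_can_reach("1", "c")
```

A circuit whose calculated rate falls in the PLc band still cannot claim PLc if it is built as Category B; the category caps what the numbers are allowed to deliver.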

Well-​​Tried Components ver­sus Fault Exclusions

The stan­dard goes on to out­line some key dis­tinc­tions between ‘well-​​tried com­po­nent’ and ‘fault exclu­sion’. We’ll talk more about fault exclu­sions later in the series.

It is impor­tant that a clear dis­tinc­tion between “well-​​tried com­po­nent” and “fault exclu­sion” (see Clause 7) be made. The qual­i­fi­ca­tion of a com­po­nent as being well-​​tried depends on its appli­ca­tion. For exam­ple, a posi­tion switch with pos­i­tive open­ing con­tacts could be con­sid­ered as being well-​​tried for a machine tool, while at the same time as being inap­pro­pri­ate for appli­ca­tion in a food indus­try — in the milk indus­try, for instance, this switch would be destroyed by the milk acid after a few months. A fault exclu­sion can lead to a very high PL, but the appro­pri­ate mea­sures to allow this fault exclu­sion should be applied dur­ing the whole life­time of the device. In order to ensure this, addi­tional mea­sures out­side the con­trol sys­tem may be nec­es­sary. In the case of a posi­tion switch, some exam­ples of these kinds of mea­sures are

  • means to secure the fix­ing of the switch after its adjustment,
  • means to secure the fix­ing of the cam,
  • means to ensure the trans­verse sta­bil­ity of the cam,
  • means to avoid over travel of the posi­tion switch, e.g. ade­quate mount­ing strength of the shock absorber and any align­ment devices, and
  • means to pro­tect it against dam­age from outside.

System Block Diagram

Finally, here is the block diagram for Category 1, which looks the same as that for Category B, since only the components used in the system have changed, and not the architecture.

ISO 13849–1 Figure 9 — Category 1 Block Diagram

 

Next Installment

Watch for the next part of this series, “Interlock Architectures – Pt. 3: Category 2”, where we expand on the first two categories by adding some diagnostic coverage to improve reliability.

Interlock Architectures – Pt. 3: Category 2

This entry is part 3 of 8 in the series Circuit Architectures Explored

In the first two posts in this series, we looked at Category B, the Basic cat­e­gory of sys­tem archi­tec­ture, and then moved on to look at Category 1. Category B under­pins Categories 2, 3 and 4. In this post we’ll look more deeply into Category 2.

Let’s start by look­ing at the def­i­n­i­tion for Category 2, taken from ISO 13849–1:2007. Remember that in these excerpts, SRP/​CS stands for Safety Related Parts of Control Systems.

Definition

6.2.5 Category 2

For cat­e­gory 2, the same require­ments as those accord­ing to 6.2.3 for cat­e­gory B shall apply. “Well–tried safety prin­ci­ples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies.

SRP/​CS of cat­e­gory 2 shall be designed so that their function(s) are checked at suit­able inter­vals by the machine con­trol sys­tem. The check of the safety function(s) shall be performed

  • at the machine start-​​up, and
  • prior to the ini­ti­a­tion of any haz­ardous sit­u­a­tion, e.g. start of a new cycle, start of other move­ments, and/​or
  • peri­od­i­cally dur­ing oper­a­tion if the risk assess­ment and the kind of oper­a­tion shows that it is necessary.

The ini­ti­a­tion of this check may be auto­matic. Any check of the safety function(s) shall either

  • allow oper­a­tion if no faults have been detected, or
  • gen­er­ate an out­put which ini­ti­ates appro­pri­ate con­trol action, if a fault is detected.

Whenever pos­si­ble this out­put shall ini­ti­ate a safe state. This safe state shall be main­tained until the fault is cleared. When it is not pos­si­ble to ini­ti­ate a safe state (e.g. weld­ing of the con­tact in the final switch­ing device) the out­put shall pro­vide a warn­ing of the haz­ard.

For the des­ig­nated archi­tec­ture of cat­e­gory 2, as shown in Figure 10, the cal­cu­la­tion of MTTFd and DCavg should take into account only the blocks of the func­tional chan­nel (i.e. I, L and O in Figure 10) and not the blocks of the test­ing chan­nel (i.e. TE and OTE in Figure 10).

The diag­nos­tic cov­er­age (DCavg) of the total SRP/​CS includ­ing fault-​​detection shall be low. The MTTFd of each chan­nel shall be low-​​to-​​high, depend­ing on the required per­for­mance level (PLr). Measures against CCF shall be applied (see Annex F).

The check itself shall not lead to a haz­ardous sit­u­a­tion (e.g. due to an increase in response time). The check­ing equip­ment may be inte­gral with, or sep­a­rate from, the safety-​​related part(s) pro­vid­ing the safety function.

The max­i­mum PL achiev­able with cat­e­gory 2 is PL = d.

NOTE 1 In some cases cat­e­gory 2 is not applic­a­ble because the check­ing of the safety func­tion can­not be applied to all components.

NOTE 2 Category 2 sys­tem behav­iour allows that

  • the occur­rence of a fault can lead to the loss of the safety func­tion between checks,
  • the loss of safety func­tion is detected by the check.

NOTE 3 The prin­ci­ple that sup­ports the valid­ity of a cat­e­gory 2 func­tion is that the adopted tech­ni­cal pro­vi­sions, and, for exam­ple, the choice of check­ing fre­quency can decrease the prob­a­bil­ity of occur­rence of a dan­ger­ous situation.

ISO 13849-1 Figure 10: Category 2 block diagram

Breaking it down

Let’s start by taking apart the definition a piece at a time and looking at what each part means. I’ll also show a simple circuit that can meet the requirements.

Category B & Well-​​tried Components

The first para­graph speaks to the build­ing block approach taken in the standard:

For cat­e­gory 2, the same require­ments as those accord­ing to 6.2.3 for cat­e­gory B shall apply. “Well–tried safety prin­ci­ples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies.

Systems meet­ing Category 2 are required to meet all of the same require­ments as Category B, as far as the com­po­nents are con­cerned. Other require­ments for the cir­cuits are dif­fer­ent, and we will look at those in a bit.

Self-​​Testing required

Category 2 brings in the idea of diag­nos­tics. If cor­rectly spec­i­fied com­po­nents have been selected (Category B), and those com­po­nents can be con­sid­ered ‘well-​​tried’ and are applied fol­low­ing ‘well-​​tried safety prin­ci­ples’, then adding a diag­nos­tic com­po­nent to the sys­tem should allow the sys­tem to detect some faults and there­fore achieve a cer­tain degree of ‘fault-​​tolerance’ or the abil­ity to func­tion cor­rectly even when some aspect of the sys­tem has failed.

Let’s look at the text:

SRP/​CS of Category 2 shall be designed so that their function(s) are checked at suit­able inter­vals by the machine con­trol sys­tem. The check of the safety function(s) shall be performed

  • at the machine start-​​up, and
  • prior to the ini­ti­a­tion of any haz­ardous sit­u­a­tion, e.g. start of a new cycle, start of other move­ments, and/​or
  • peri­od­i­cally dur­ing oper­a­tion if the risk assess­ment and the kind of oper­a­tion shows that it is necessary.

The ini­ti­a­tion of this check may be auto­matic. Any check of the safety function(s) shall either

  • allow oper­a­tion if no faults have been detected, or
  • gen­er­ate an out­put which ini­ti­ates appro­pri­ate con­trol action, if a fault is detected.

Whenever pos­si­ble this out­put shall ini­ti­ate a safe state. This safe state shall be main­tained until the fault is cleared. When it is not pos­si­ble to ini­ti­ate a safe state (e.g. weld­ing of the con­tact in the final switch­ing device) the out­put shall pro­vide a warn­ing of the hazard.

Periodic check­ing is required. The checks must hap­pen at least each time there is a demand placed on the sys­tem, i.e. a guard door is opened and closed, or an emer­gency stop but­ton is pressed and reset. In addi­tion the integrity of the SRP/​CS must be tested at the start of a cycle or haz­ardous period, and poten­tially peri­od­i­cally dur­ing oper­a­tion if the risk assess­ment indi­cates that this is necessary.

The test­ing does not have to be auto­matic, although in prac­tice it usu­ally is. As long as the sys­tem integrity is good, then the out­put is allowed to remain on, and the machin­ery or process can run.

Watch Out!

Notice that the words ‘when­ever pos­si­ble’ are used in the last para­graph in this part of the def­i­n­i­tion where the stan­dard speaks about ini­ti­a­tion of a safe state. This word­ing alludes to the fact that these sys­tems are still prone to faults that can lead to the loss of the safety func­tion, and so can­not be called truly ‘fault-​​tolerant’. Loss of the safety func­tion must be detected by the mon­i­tor­ing sys­tem and a safe state ini­ti­ated. This requires care­ful thought, since the safety sys­tem com­po­nents may have to inter­act with the process con­trol sys­tem to ini­ti­ate and main­tain the safe state in the event that the safety sys­tem itself has failed.

All of this leads to an interesting question: If the system is hardwired through the operating channel, and all the components used in that channel meet Category B requirements, can the diagnostic component be provided by monitoring the system with a standard PLC?

Unfortunately, the answer to this is NO. This is true because ALL of the com­po­nents must meet the well-​​tried require­ment, and since pro­gram­ma­ble elec­tron­ics are specif­i­cally excluded from being con­sid­ered well-​​tried, this approach can­not be used. Some North American stan­dards are writ­ten so that this approach could be applied, but under the International and EU require­ments it is not acceptable.

Finally, for the faults that can be detected by the mon­i­tor­ing sys­tem, detec­tion of a fault must ini­ti­ate a safe state. This means that on the next demand on the sys­tem, i.e. the next time the guard is opened or the emer­gency stop is pressed, the machine must go into a safe con­di­tion. Generally, detec­tion of a fault should pre­vent the sub­se­quent reset of the sys­tem until the fault is cleared or repaired.
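The check-and-latch behaviour described above can be sketched in a few lines of Python. This is only an illustration of the logic, not a safety-rated implementation: the names `Action` and `CheckLatch` and the `safe_state_possible` flag are hypothetical, and a real Category 2 system would implement this in suitable hardware or a safety-rated controller.

```python
from enum import Enum


class Action(Enum):
    ALLOW_OPERATION = "allow operation"
    SAFE_STATE = "initiate and hold safe state"
    WARN = "warn of the hazard"


class CheckLatch:
    """Hypothetical model of the outcome of each Category 2 check."""

    def __init__(self):
        self.fault_latched = False

    def check(self, fault_detected, safe_state_possible=True):
        # A detected fault is latched so that later "good" checks
        # cannot release the machine before the fault is cleared.
        if fault_detected:
            self.fault_latched = True
        if not self.fault_latched:
            return Action.ALLOW_OPERATION
        # If a safe state cannot be reached (e.g. a welded final
        # switching device), the fallback is a warning.
        return Action.SAFE_STATE if safe_state_possible else Action.WARN

    def clear(self):
        """Call only after the fault has actually been repaired."""
        self.fault_latched = False
```

The point of the latch is that a check passing after a fault has been seen does not re-enable the machine; only an explicit `clear()` after repair does.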

Testing is not per­mit­ted to intro­duce any new haz­ards or to slow the sys­tem down. The tests must occur ‘on-​​the-​​fly’ and with­out intro­duc­ing any delay in the sys­tem com­pared to how it would have oper­ated with­out the test­ing incor­po­rated. Test equip­ment can be inte­grated into the safety sys­tem or be exter­nal to it.

One more ‘gotcha’

Note 1 in the def­i­n­i­tion high­lights a sig­nif­i­cant pit­fall for many design­ers: if all of the com­po­nents in the func­tional chan­nel of the sys­tem can­not be checked, you can­not claim con­for­mity to Category 2. A sys­tem that oth­er­wise would meet the archi­tec­tural require­ments for Category 2 must be down­graded to Category 1 in cases where all the com­po­nents in the func­tional chan­nel can­not be tested. This is a major point and one which many design­ers miss when devel­op­ing their systems.

Calculation of MTTFd

The next paragraph deals with the calculation of the system’s mean time to dangerous failure, or MTTFd.

For the des­ig­nated archi­tec­ture of cat­e­gory 2, as shown in Figure 10, the cal­cu­la­tion of MTTFd and DCavg should take into account only the blocks of the func­tional chan­nel (i.e. I, L and O in Figure 10) and not the blocks of the test­ing chan­nel (i.e. TE and OTE in Figure 10).

Calculation of the fail­ure rate focuses on the func­tional chan­nel, not on the mon­i­tor­ing sys­tem, mean­ing that the fail­ure rate of the mon­i­tor­ing sys­tem is ignored when ana­lyz­ing sys­tems using this archi­tec­ture. The MTTFd of each com­po­nent in the func­tional chan­nel is cal­cu­lated and then the MTTFd of the total chan­nel is calculated.
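As a rough sketch of that two-step calculation, the snippet below combines component values using the parts count method (the reciprocal of the channel MTTFd is the sum of the reciprocals of the component MTTFd values) and then classifies the result using the LOW (3 to under 10 years), MEDIUM (10 to under 30 years) and HIGH (30 years and up) bands. The function names and the component values are hypothetical, chosen only for illustration; real values come from manufacturer data or the methods in the standard's annexes.

```python
def channel_mttfd(component_mttfd_years):
    """Combine component MTTFd values (in years) for one series channel."""
    return 1.0 / sum(1.0 / m for m in component_mttfd_years)


def classify_mttfd(years):
    """Map a channel MTTFd in years onto the LOW/MEDIUM/HIGH bands."""
    if years < 3:
        return "below LOW"
    if years < 10:
        return "LOW"
    if years < 30:
        return "MEDIUM"
    return "HIGH"


# Hypothetical values for an input device, a logic relay and a contactor.
parts = [150.0, 150.0, 60.0]
total = channel_mttfd(parts)
print(f"{total:.1f} years -> {classify_mttfd(total)}")
```

With these made-up numbers the channel works out to about 33 years. Note how the weakest component dominates: the 60-year contactor pulls the whole channel down to barely HIGH.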

The Diagnostic Coverage (DCavg) is also cal­cu­lated based exclu­sively on the com­po­nents in the func­tional chan­nel, so when deter­min­ing what per­cent­age of the faults can be detected by the mon­i­tor­ing equip­ment, only faults in the func­tional chan­nel are considered.
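To make this concrete, here is a minimal sketch of the averaging, assuming the usual weighting in which each component's DC is weighted by the reciprocal of its MTTFd, so that less reliable parts count for more. The function names and the sample figures are hypothetical. The bands used are NONE (below 60 %), LOW (60 % to under 90 %), MEDIUM (90 % to under 99 %) and HIGH (99 % and up).

```python
def dc_avg(components):
    """components: list of (mttfd_years, dc) pairs, dc as a fraction 0..1.

    Only functional-channel components (I, L, O) belong in this list.
    """
    numerator = sum(dc / mttfd for mttfd, dc in components)
    denominator = sum(1.0 / mttfd for mttfd, _ in components)
    return numerator / denominator


def classify_dc(dc):
    """Map an average DC fraction onto the NONE/LOW/MEDIUM/HIGH bands."""
    if dc < 0.60:
        return "NONE"
    if dc < 0.90:
        return "LOW"
    if dc < 0.99:
        return "MEDIUM"
    return "HIGH"


# Hypothetical functional channel: two well-monitored relays, one weaker part.
channel = [(150.0, 0.90), (150.0, 0.90), (60.0, 0.60)]
print(f"DCavg = {dc_avg(channel):.1%} -> {classify_dc(dc_avg(channel))}")
```

In this example the poorly covered, less reliable component drags the average down to roughly 73 %, which lands in the LOW band even though two of the three parts have 90 % coverage.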

This high­lights the fact that a fail­ure of the mon­i­tor­ing sys­tem can­not be detected, so a sin­gle fail­ure in the mon­i­tor­ing sys­tem that results in the sys­tem fail­ing to detect a sub­se­quent nor­mally detectable fail­ure in the func­tional chan­nel will result in the loss of the safety function.

Summing Up

The next para­graph sums up the lim­its of this par­tic­u­lar architecture:

The diag­nos­tic cov­er­age (DCavg) of the total SRP/​CS includ­ing fault-​​detection shall be low. The MTTFd of each chan­nel shall be low-​​to-​​high, depend­ing on the required per­for­mance level (PLr). Measures against CCF shall be applied (see Annex F).

The first sentence reflects back to the previous paragraph on diagnostic coverage, telling you, as the designer, that you cannot claim anything more than LOW diagnostic coverage when using this architecture.

This raises an interesting question, since Figure 5 in the standard shows Category 2 columns for both DCavg = LOW and DCavg = MED. My best advice to you as a user of the standard is to abide by the text, meaning that you cannot claim higher than LOW for DCavg in this architecture.

Another problem raised by this sentence is the inclusion of the phrase “the total SRP/CS including fault-detection”, since the previous paragraph explicitly tells you that the assessment of DCavg ‘should’ only include the functional channel, while this sentence appears to include the fault-detection parts as well. In standards writing, sentences including the word ‘shall’ are clearly mandatory, while those including the word ‘should’ indicate a condition which is advised but not required. Hopefully this confusion will be clarified in the next edition of the standard.

Failure rates in the functional channel can be anywhere in the range from LOW to HIGH depending on the components selected and the way they are applied in the design. The requirement will be driven by the desired PL of the system, so a PLd system will require HIGH MTTFd components in the functional channel, while the same architecture used for a PLb system would require only LOW MTTFd components.

Finally, applicable measures against Common Cause Failures (CCF) must be used. Some of the measures given in Table F.1 in Annex F of the standard cannot be applied, such as Channel Separation, since you cannot separate a single channel. Other CCF measures can and must be applied, so you must score at least the minimum 65 on the CCF table in Annex F to claim compliance with Category 2 requirements.
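The scoring itself is simple addition, as the sketch below shows. Treat the measure names and point values here as assumptions: they reflect my reading of Table F.1 and must be checked against your copy of the standard before use. Only the pass mark of 65 comes from the text above.

```python
# Assumed point values per CCF measure; verify against Table F.1.
CCF_POINTS = {
    "separation": 15,
    "diversity": 20,
    "overvoltage_protection": 15,
    "well_tried_components": 5,
    "failure_analysis": 5,
    "designer_training": 5,
    "emc_and_contamination": 25,
    "other_environmental": 10,
}

CCF_MINIMUM = 65


def ccf_score(applied_measures):
    """Sum the points for the measures actually applied in the design."""
    return sum(CCF_POINTS[name] for name in applied_measures)


def ccf_passes(applied_measures):
    return ccf_score(applied_measures) >= CCF_MINIMUM


# A single-channel design cannot claim channel separation.
single_channel = [name for name in CCF_POINTS if name != "separation"]
print(ccf_score(single_channel), ccf_passes(single_channel))
```

Under these assumed values, giving up the separation points still leaves plenty of margin above 65, but the margin disappears quickly if other measures are also skipped.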

Example Circuit

Here’s an example of what a simple Category 2 circuit constructed from discrete components might look like. Note that PB1 and PB2 could just as easily be interlock switches on guard doors as push buttons on a control panel. For the sake of simplicity, I did not illustrate surge suppression on the relays, but you should include MOVs or RC suppressors across all relay coils. All relays are assumed to be of ‘force-guided’ construction and to meet the requirements for well-tried components.

Example Category 2 circuit from discrete components

Here is how the cir­cuit works:

  1. The machine is stopped with power off. CR1, CR2, and M are off. CR3 is off until the reset but­ton is pressed, since the NC mon­i­tor­ing con­tacts on CR1, CR2 and M are all closed, but the NO reset push but­ton con­tact is open.
  2. The reset push button, PB3, is pressed. If CR1, CR2 and M are all off, their normally closed contacts will be closed, so pressing PB3 will result in CR3 turning on.
  3. CR3 closes its con­tacts, ener­giz­ing CR1 and CR2 which seal their con­tact cir­cuits in and de-​​energize CR3. The time delays inher­ent in relays per­mit this to work.
  4. With CR1 and CR2 closed and CR3 held off because its coil cir­cuit opened when CR1 and CR2 turned on, M ener­gizes and motion can start.

In this circuit the monitoring function is provided by CR3. If any of CR1, CR2 or M were to weld closed, CR3 could not energize, so the single fault is detected and the machine is prevented from restarting. Pressing either PB1 or PB2 stops the machine, and because CR1 and CR2 are redundant, the stop still works if one of them has failed. If CR3 fails to de-energize, the M rung is held open, preventing the machine from starting with a failed monitoring system. If CR1 or CR2 fails with an open coil, then M cannot energize because of the redundant contacts on the M rung.

This cir­cuit can­not detect a fail­ure in PB1, PB2, or PB3. Testing is con­ducted each time the cir­cuit is reset.
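If it helps to see the sequence in action, here is a small behavioural model of the circuit as described in the steps above. It is a sketch based on the written description only, not on the schematic: the class name `Cat2Circuit` and the `welded` and `cr3_stuck_on` fault flags are hypothetical, relay timing is ignored, and PB1 and PB2 are assumed to be closed during a reset.

```python
from dataclasses import dataclass, field


@dataclass
class Cat2Circuit:
    """True means the coil is energised."""
    cr1: bool = False
    cr2: bool = False
    cr3: bool = False
    m: bool = False
    welded: set = field(default_factory=set)  # devices stuck energised
    cr3_stuck_on: bool = False  # monitoring relay fails to drop out

    def _update_m(self):
        # M rung: CR1 and CR2 contacts in series with a CR3 contact
        # that is closed only when CR3 is de-energised.
        if "M" not in self.welded:
            self.m = self.cr1 and self.cr2 and not self.cr3

    def stop(self):
        """PB1 or PB2 opens: CR1 and CR2 drop out unless welded."""
        if "CR1" not in self.welded:
            self.cr1 = False
        if "CR2" not in self.welded:
            self.cr2 = False
        self._update_m()

    def reset(self):
        """Press PB3. Returns True only if the reset ran and M energised."""
        # CR3 picks up only through the NC contacts of CR1, CR2 and M.
        if self.cr1 or self.cr2 or self.m:
            return False
        self.cr3 = True
        self.cr1 = self.cr2 = True  # CR3 energises CR1/CR2, which seal in
        self.cr3 = self.cr3_stuck_on  # CR1/CR2 open the CR3 coil circuit
        self._update_m()
        return self.m


circuit = Cat2Circuit()
circuit.reset()               # normal start: M energises
circuit.welded.add("CR1")     # CR1 welds while running
circuit.stop()                # CR2 still drops out, so M de-energises
print(circuit.m, circuit.reset())
```

The last line shows the two behaviours that matter: the machine stopped despite the welded relay, and the next reset is refused because CR3 cannot pick up through the welded relay's open monitoring contact.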

If M is a motor starter rather than the motor itself, it will need to be dupli­cated for redun­dancy and a mon­i­tor­ing con­tact added to the CR3 rung unless a rea­son­able case for fault exclu­sion can be made.

In calculating MTTFd, PB1, PB2, CR1, CR2, CR3 and M must be included. CR3 is included because it has a functional contact in the M rung and is therefore part of the functional channel of the circuit as well as being part of the TE and OTE channels.

Watch for the next install­ment in this series where we’ll explore Category 3, the first of the ‘fault tol­er­ant’ architectures!
