Category Archives: Functional Safety - Page 3

Missing MTTFd data

What the heck is MTTFd???

When you first start to work through ISO 13849–1, the first thing that will smack you in the head is the plethora of new acronyms. The first one you’ll run into is ‘PL’, of course, since the entire pur­pose of the stan­dard is to aid the designer in deter­min­ing the reli­a­bil­ity Performance Level of the con­trol sys­tem. Shortly after that you’ll find your­self face to face with MTTFd.

MTTFd, or the Mean Time To Failure (dangerous), is the expected time, in years, before a component used in the system being analyzed fails dangerously. It is the reciprocal of the component’s dangerous failure rate, and it differs from the component’s overall failure figure because it counts only the failures that result in a dangerous failure mode, or that may lead to a hazard.

So how do you get this data?

Obtaining MTTFd data for a component should be easy for a designer. Component manufacturers who market components intended for safety applications should provide this data in the component specifications, but there are thousands, perhaps millions, of different components being marketed today for use in safety systems. Most of the major manufacturers already provide either MTTFd or B10d, a figure from which MTTFd can be derived, but for many components this data is simply not available.

Here are some ran­domly cho­sen exam­ples of manufacturer’s spec­i­fi­ca­tion sheets that give this data:

Allen-​​Bradley Trojan™ T15 Interlock Switch

Pilz PNOZ X2 (pdf data sheet)

Telemecanique XPS MP Safety Controller (pdf data sheet)

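B10d is the number of cycles until 10% of the components being tested fail in a dangerous way. Where the data sheet gives only B10 (the number of cycles until 10% of the components fail in any way), B10d can be estimated from it, and MTTFd can then be derived from B10d and the number of operations per year in your application. The same data gives T10d, the application-dependent operating time after which 10% of the components are expected to have failed dangerously. Check out Annex C of the standard if you want to see how this can be done.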
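
As a rough illustration of the Annex C arithmetic, here is a minimal Python sketch. It assumes the usual relationships MTTFd = B10d / (0.1 × nop) and T10d = B10d / nop, where nop is the mean number of operations per year. The function names and the example figures (B10d, operating days, hours and cycle time) are made up for illustration and do not come from any data sheet.

```python
SECONDS_PER_HOUR = 3600


def operations_per_year(days_per_year, hours_per_day, cycle_time_s):
    """Mean number of operations per year (nop) for the application."""
    return days_per_year * hours_per_day * SECONDS_PER_HOUR / cycle_time_s


def mttfd_from_b10d(b10d_cycles, n_op):
    """MTTFd in years, assuming MTTFd = B10d / (0.1 * nop)."""
    return b10d_cycles / (0.1 * n_op)


def t10d_years(b10d_cycles, n_op):
    """Operating time in years until 10% of components fail dangerously."""
    return b10d_cycles / n_op


# Hypothetical application: 240 days/year, 16 h/day, one operation per minute
n_op = operations_per_year(days_per_year=240, hours_per_day=16, cycle_time_s=60)
b10d = 2_000_000  # hypothetical B10d value, in cycles

mttfd = mttfd_from_b10d(b10d, n_op)
t10d = t10d_years(b10d, n_op)
print(f"nop = {n_op:.0f} operations/year")
print(f"MTTFd = {mttfd:.1f} years, T10d = {t10d:.1f} years")
```

Note how strongly the result depends on the duty cycle: halving the cycle time halves both MTTFd and T10d for the same component.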

But what do you do if the manufacturer of your favourite contactor doesn’t provide ANY failure data? Some major manufacturers still don’t provide any failure rate data at all, while others provide only expected lifetimes under specific operating conditions, and some provide only EN 954–1:95 data. I think this last case is one of the reasons for the EC Machinery Working Group’s decision late last year to extend the transition period to ISO 13849–1:07.

Now what?

Unless you work for a large orga­ni­za­tion, insti­tut­ing a life test­ing pro­gram is not likely to be an option, since you either need a pro­tracted period of time with a few com­po­nents in test, or thou­sands of sam­ples for a short time.

The stan­dard pro­vides the option to use 10 years as a default where no other data is avail­able. 10 years sounds like a long time at first blush, par­tic­u­larly if the planned life­time of the sys­tem involved is 20 years. Typical MTTFd val­ues for high-​​reliability com­po­nents are in the hun­dreds of years, so by com­par­i­son, 10 years is almost noth­ing. Tables are also pro­vided for some kinds of com­po­nents, but the tables are nec­es­sar­ily lim­ited in size, so not every com­po­nent will be listed.

Your only option is to use the data in the stan­dard, or pick up some of the other pub­li­ca­tions that include com­po­nent fail­ure data, like MIL-​​HDBK-​​217F, IEC/​TR 62380 (based on UTE 80810 & RDF 2000), NPRD 95 or IEC 61709 (based on Siemens SN 29500 doc­u­ments). Some of these doc­u­ments may be dif­fi­cult or impos­si­ble to obtain.

The result of this lack of objec­tive data from the com­po­nent man­u­fac­tur­ers is:

  • Conservative results based on the min­i­mum default MTTFd;
  • Potential over-​​design of safety related controls;
  • Increased man­u­fac­tur­ing costs for machine builders;

The reasons for this situation vary by manufacturer, but ultimately it comes down to the cost of life testing components multiplied by the number of components built by each manufacturer. Typical life tests require load simulators and switching for thousands of components, as well as data logging to trap failures and record relevant data. In the case of fluid power components (pneumatics and hydraulics), this becomes increasingly complex. For many component manufacturers, the cost of the life testing is prohibitive, even though this data is badly needed by their users.

Will we see an improve­ment in the future? The largest con­trols com­po­nent man­u­fac­tur­ers are very likely to pro­vide this data as they have it avail­able, mean­ing as they com­plete test­ing. New designs are much more likely to come with this data ini­tially, while it may be a long time before some of the old stan­dard com­po­nents get time in the life test cell. Until then, lots of com­po­nents will be assigned ’10 years’.

A big thank you to Wouter Leusden for the idea for this post!


IEC/​TR 62061–1 Reviewed

This entry is part 2 of 2 in the series IEC/​TR 62061–1

Why You Need to Spend More Cash on Yet Another Document

Standards orga­ni­za­tions pub­lish doc­u­ments in a fairly con­tin­u­ous stream, so for those of us tasked with stay­ing cur­rent with a large num­ber of stan­dards (say, more than 10), the pub­li­ca­tion of another new stan­dard or Technical Report isn’t news — it’s busi­ness as usual. The ques­tion is always: Do we really need to add this to the library?

For those who are new to this busi­ness, hav­ing to pay for crit­i­cal design infor­ma­tion is a new expe­ri­ence. Finding out that it can cost hun­dreds, if not thou­sands, to build the library you need can be overwhelming.

This review aims to help you decide if you need IEC/​TR 62061–1 in your library.

The Problem

As a machine builder or a man­u­fac­turer build­ing a prod­uct designed to be inte­grated into machin­ery, how do you choose between ISO 13849–1 and IEC 62061?

IEC/TR 62061–1 attempts to provide guidance on how to make this choice.

History

When CENELEC pub­lished EN 954–1 in 1995, machine builders were intro­duced to a whole new world of con­trol reli­a­bil­ity require­ments. Prior to its pub­li­ca­tion, most machines were built with very sim­ple inter­locks, and no spe­cific stan­dards for inter­lock­ing devices existed. In the years since then, the EN 954–1 Categories have become well known and are applied inside and out­side the EU.

In the inter­ven­ing years, IEC pub­lished IEC 61508. This seven-​​part stan­dard intro­duced the idea of ‘Safety Integrity  Levels’ or SILs. This stan­dard is aimed at process con­trol sys­tems and could be used for com­plex machin­ery as well.

Why the Confusion?

In 2005, IEC published a machinery sector specific standard based on IEC 61508, called IEC 62061. This standard offered a simplified application of the IEC 61508 methodology intended for machine builders. The key problem with this standard is that it did not provide a means to deal with pneumatic or hydraulic control elements, which are covered by ISO 13849–1.

ISO adopted EN 954–1 and reis­sued it as ISO 13849–1 in 1999. This edi­tion of the stan­dard was vir­tu­ally iden­ti­cal to the stan­dard it replaced from a tech­ni­cal require­ments per­spec­tive. EN 954–1/ISO 13849–1 did not pro­vide any means to esti­mate the integrity of the safety related con­trols, but did define cir­cuit archi­tec­tures (Categories B, 1–4) and spoke to the selec­tion of com­po­nents, intro­duc­ing the con­cepts of ‘well-​​tried safety prin­ci­ples’ and ‘well-​​tried com­po­nents’. A sec­ond prob­lem had long existed in addi­tion to this — EN 954–2, Validation, was never pub­lished by CENELEC except as a com­mit­tee draft, so a key ele­ment in the appli­ca­tion of the stan­dard had been miss­ing for five years at the point where ISO 13849–1 Edition 1 was published.

The first cut at guid­ing users in choos­ing an appro­pri­ate stan­dard came with the pub­li­ca­tion of IEC 62061 Edition 1.  Published in 2005, Edition 1 included a table that attempted to pro­vide users with some guid­ance on how to choose between ISO 13849–1 or IEC 62061.

…and then came 2007…

In 2007, ISO published the Second Edition of ISO 13849–1, and brought a whole new twist to the discussion by introducing ‘Performance Levels’ or PLs. PLs can be loosely equated to SILs. Note that MTTFd, the component parameter used in ISO 13849–1, is stated in years, while both PLs and SILs are defined as bands of average probability of dangerous failure per hour. The same table included in IEC 62061 was included in this edition of ISO 13849–1.

Table 1: Recommended application of IEC 62061 and ISO 13849–1 (under revision)
(from the Second Edition, 2007)

  | Technology implementing the safety related control function(s) | ISO 13849–1 (under revision) | IEC 62061
--|--|--|--
A | Non electrical, e.g. hydraulics | X | Not covered
B | Electromechanical, e.g. relays, or non-complex electronics | Restricted to designated architectures (see Note 1) and up to PL=e | All architectures and up to SIL 3
C | Complex electronics, e.g. programmable | Restricted to designated architectures (see Note 1) and up to PL=d | All architectures and up to SIL 3
D | A combined with B | Restricted to designated architectures (see Note 1) and up to PL=e | X (see Note 3)
E | C combined with B | Restricted to designated architectures (see Note 1) and up to PL=d | All architectures and up to SIL 3
F | C combined with A, or C combined with A and B | X (see Note 2) | X (see Note 3)

“X” indicates that this item is dealt with by the standard shown in the column heading.

NOTE 1 Designated archi­tec­tures are defined in Annex B of EN ISO 13849–1(rev.) to give a sim­pli­fied approach for quan­tifi­ca­tion of per­for­mance level.

NOTE 2 For com­plex elec­tron­ics: Use of des­ig­nated archi­tec­tures accord­ing to EN ISO 13849–1(rev.) up to PL=d or any archi­tec­ture accord­ing to IEC 62061.

NOTE 3 For non-​​electrical tech­nol­ogy use parts accord­ing to EN ISO 13849–1(rev.) as subsystems.

So how is a machine builder to choose the ‘cor­rect’ stan­dard, if both stan­dards are applic­a­ble and both are cor­rect? Furthermore, how do you assess the reli­a­bil­ity of the safety-​​related con­trols when inte­grat­ing equip­ment from var­i­ous sup­pli­ers, some of whom rate their equip­ment in PLs and some in SILs? Why are two stan­dards address­ing the same topic required? Will ISO 13849–1 and IEC 62061 ever be merged?

The Technical Report

In July this year the IEC published a Technical Report that discusses the selection and application of these two key control reliability standards for machine builders. This guide has long been needed, and precedes an event planned by IEC to bring machine builders and standards writers face-to-face to discuss these same issues.

The guide, titled IEC/​TR 62061–1 — Technical Report — Guidance on the appli­ca­tion of ISO 13849–1 and IEC 62061 in the design of safety-​​related con­trol sys­tems for machin­ery pro­vides direct guid­ance on how to select between these two standards.


Merger

In the intro­duc­tion to the report the TC makes it clear that the stan­dards will be merged, although they don’t pro­vide any kind of a time line for the merger. Quoting from the introduction:

It is intended that this Technical Report be incor­po­rated into both IEC 62061 and ISO 13849–1 by means of cor­ri­genda that ref­er­ence the pub­lished ver­sion of this doc­u­ment. These cor­ri­genda will also remove the infor­ma­tion given in Table 1, Recommended appli­ca­tion of IEC 62061 and ISO 13849–1, pro­vided in the com­mon intro­duc­tion to both stan­dards, which is now rec­og­nized as being out of date. Subsequently, it is intended to merge ISO 13849–1 and IEC 62061 by means of a JWG of ISO/​TC 199 and IEC/​TC 44.

The last sentence of the paragraph above is the key statement regarding the eventual merger of the two documents. If you’re not familiar with the standards acronyms, a ‘JWG’ is a Joint Working Group, and a TC is a Technical Committee. TCs are formed from volunteer experts from industry and academia supported by their organizations. So a JWG formed from two TCs just means that a joint committee has been formed to work out the details of the merger. Eventually.

The other key point in this para­graph relates to the replace­ment of Table 1. In the interim, IEC/​TR 62061–1 will be incor­po­rated into both stan­dards, replac­ing Table 1.

Eventually the con­fu­sion will be cleared up because only one stan­dard will exist in the machin­ery sec­tor, but until then, machine builders will need to fig­ure out which stan­dard best fits their products.

Comparing PLs and SILs

The Technical Report does a good job of discussing the differences between PL and SIL, including providing an explanation of how to convert one to the other, which is very useful if you are trying to integrate a SIL-rated device into a PL analysis or vice-versa.

Selecting a Standard

Clause 2.5 gives some solid advice on select­ing between the two stan­dards based on the tech­nolo­gies employed in the design and your own com­fort level in using the ana­lyt­i­cal tech­niques in the two standards.

Another key point is that EITHER stan­dard can be used to ana­lyze com­plex OR sim­ple con­trol sys­tems. Some fans of IEC 62061 have been known to put ISO 13849–1 down as use­ful exclu­sively for sim­ple hard­wired con­trol sys­tems. Clause 3.3 makes it clear that this is not the case. Pick the one you like or know the best and go with that. As an addi­tional thought, con­sider which stan­dard your com­peti­tors are using, and also which your cus­tomers are using. For exam­ple, if your cus­tomers use ISO 13849–1 pri­mar­ily, qual­i­fy­ing your prod­uct under IEC 62061 might seem like a good idea, but may drive your cus­tomers to a com­peti­tor who makes their life eas­ier by using ISO 13849–1. If your com­peti­tors are using a dif­fer­ent stan­dard, try to under­stand the choice before climb­ing on the band­wagon. There may be a com­pet­i­tive advan­tage lurk­ing in being different.

Risk Assessment

Clause 4 speaks directly to the indis­pens­able need to con­duct a method­i­cal risk assess­ment, and to use that to guide the design of the controls.

In my prac­tice, many clients decide that they would pre­fer to choose a con­trol reli­a­bil­ity level that they feel will be more than good enough for any of their designs, and then to ‘stan­dard­ize’ on that design for all their prod­ucts, thereby elim­i­nat­ing the need to thought­fully decide on the appro­pri­ate design for the appli­ca­tion. In other cases, end-​​users may choose to use a ‘stan­dard’ design through­out their facil­ity to assist main­te­nance per­son­nel by lim­it­ing their need to become tech­ni­cally famil­iar with a vari­ety of designs. This is done to speed trou­bleshoot­ing and reduce down time and spares stocks.

The problem with this approach is that some managers believe it can eliminate the need to conduct risk assessments, which they see as an expensive and futile exercise. This is emphatically NOT the case. Risk assessments address much more than the selection of control reliability requirements and need to be done to ensure that all hazards that cannot be eliminated or substituted are safeguarded. A missing or badly done risk assessment may invalidate your claim to a CE mark, or be the landmine that ends a liability case, with you on the losing end.

Safety Requirement Specification (SRS)

Each safety func­tion needs to be defined in detail in a Safety Requirement Specification (SRS). A reli­a­bil­ity assess­ment needs to be com­pleted for each safety func­tion defined in the SRS. This point is dis­cussed in detail in IEC 62061, but is not dealt with in any detail in ISO 13849–1, so IEC/​TR 62061–1 once again bridges the gap by pro­vid­ing an impor­tant detail that is miss­ing in one of the two standards.

If you are unfa­mil­iar with the con­cept of an SRS, each safety func­tion needs to be described with a cer­tain min­i­mum amount of infor­ma­tion, including:

  • The name of the safety function;
  • A description of the function;
  • The required level of performance based on the risk assessment and according to either ISO 13849–1 (PLr a to e) or the required safety integrity according to IEC 62061 (SIL 1 to 3).
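
To make the minimum content of an SRS entry concrete, here is a small Python sketch of one record per safety function. The class name, field names and the example function are my own assumptions for illustration; neither standard prescribes a data format, and a real SRS carries much more detail than this.

```python
from dataclasses import dataclass
from typing import Optional

VALID_PLR = {"a", "b", "c", "d", "e"}  # ISO 13849-1 required performance levels
VALID_SIL = {1, 2, 3}                  # IEC 62061 safety integrity levels


@dataclass(frozen=True)
class SafetyFunction:
    """Minimum SRS entry: name, description, and one required level."""
    name: str
    description: str
    plr: Optional[str] = None  # set when analyzing under ISO 13849-1
    sil: Optional[int] = None  # set when analyzing under IEC 62061

    def __post_init__(self):
        # Exactly one rating system is used for a given analysis.
        if (self.plr is None) == (self.sil is None):
            raise ValueError("specify exactly one of plr or sil")
        if self.plr is not None and self.plr not in VALID_PLR:
            raise ValueError(f"PLr must be one of {sorted(VALID_PLR)}")
        if self.sil is not None and self.sil not in VALID_SIL:
            raise ValueError(f"SIL must be one of {sorted(VALID_SIL)}")


# Hypothetical example entry
guard_door = SafetyFunction(
    name="Guard door interlock",
    description="Opening the guard door stops hazardous motion.",
    plr="d",
)
print(guard_door)
```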

Once the safety func­tions are defined and ana­lyzed, each safety func­tion must be imple­mented by a con­trol cir­cuit. The selected PL will drive the design to one or two of the defined ISO 13849–1 archi­tec­tures, and then the com­po­nent selec­tions and other design details will drive the final fail­ure rate and PL. Alternatively, the SRS will drive the selec­tion of IEC 62061 archi­tec­ture (1oo1, 1oo2, 2oo2, etc.) and the rest of the design details will lead to the final fail­ure rate and SIL.

Table 1 in the Technical Report com­pares the levels.

Table 1: Relationship between PLs and SILs based on the average probability of dangerous failure per hour

Performance Level (PL) | Average probability of a dangerous failure per hour (1/h) | Safety integrity level (SIL)
--|--|--
a | >= 10^-5 to < 10^-4 | No special safety requirements
b | >= 3 x 10^-6 to < 10^-5 | 1
c | >= 10^-6 to < 3 x 10^-6 | 1
d | >= 10^-7 to < 10^-6 | 2
e | >= 10^-8 to < 10^-7 | 3

This table com­bines ISO 13849–1 2007, Tables 3 & 4. No sim­i­lar tables exist in IEC 62061 2005.
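
The table above is easy to turn into a lookup. The following Python sketch is only an illustration of how the bands work, with function and variable names of my own choosing. It returns the PL and the corresponding SIL for a given average probability of dangerous failure per hour.

```python
# (PL, lower bound inclusive, upper bound exclusive, SIL) taken from the table above
PL_BANDS = [
    ("a", 1e-5, 1e-4, None),  # no special safety requirements
    ("b", 3e-6, 1e-5, 1),
    ("c", 1e-6, 3e-6, 1),
    ("d", 1e-7, 1e-6, 2),
    ("e", 1e-8, 1e-7, 3),
]


def classify_pfhd(pfhd):
    """Return (PL, SIL) for an average probability of dangerous failure per hour."""
    for pl, lower, upper, sil in PL_BANDS:
        if lower <= pfhd < upper:
            return pl, sil
    raise ValueError(f"{pfhd:g} per hour is outside the range covered by the table")


print(classify_pfhd(5e-7))  # falls in the PL d band
print(classify_pfhd(2e-5))  # falls in the PL a band, no SIL equivalent
```

Notice that PL b and PL c both map to SIL 1, so converting from SIL back to PL is not one-to-one without the underlying failure probability.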

Combining Equipment with PLs and SILs

Section 7 of the report speaks to the chal­lenge of inte­grat­ing equip­ment with rat­ings in a mix of PLs and SILs. Until the stan­dards merge and a sin­gle sys­tem for describ­ing reli­a­bil­ity cat­e­gories is agreed on, this prob­lem will be with us.

When designing systems using either system the designer has to determine the approximate rate of dangerous failures. In ISO 13849–1, MTTFd is the component-level parameter, while in IEC 62061, PFHd is the subsystem-level parameter. MTTFd does not consider diagnostics or architecture, only how long, in years, the component is expected to run before failing dangerously, while PFHd does include diagnostics and architecture, and it speaks to the probability of dangerous system failure per hour. To compare these values, ISO 13849–1 Annex K describes the relationship between MTTFd and PFHd for different architectures.
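
For the simplest case only, a single channel with no diagnostics (Category B or 1), the conversion is just a change of units: the dangerous failure rate is the reciprocal of MTTFd, divided by the 8760 hours in a year. The Python sketch below shows that arithmetic under that assumption. It is not a substitute for the Annex K figures, which are needed as soon as diagnostics or redundancy are involved, and the function name is mine.

```python
HOURS_PER_YEAR = 8760


def pfhd_single_channel_no_diagnostics(mttfd_years):
    """Approximate PFHd (1/h) for one channel without diagnostics.

    Assumes a constant dangerous failure rate of 1/MTTFd per year.
    """
    if mttfd_years <= 0:
        raise ValueError("MTTFd must be positive")
    return 1.0 / (mttfd_years * HOURS_PER_YEAR)


for years in (3, 10, 30, 100):
    print(f"MTTFd = {years:>3} years -> PFHd ~ {pfhd_single_channel_no_diagnostics(years):.2e} per hour")
```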

In the design process only one method can be used, so where equip­ment with dif­fer­ent rat­ings must be com­bined the fail­ure rates must be con­verted to either MTTFd or to PFHd, depend­ing on the sys­tem being used to com­plete the analy­sis. Mixing require­ments within the design of a sub­sys­tem is not per­mit­ted (See Clause 7.3.3).

Fault Exclusions

Fault exclusions are permitted under both standards, with some limitations. Under IEC 62061 they can be used up to SIL 2; no fault exclusions are permitted in SIL 3. Under ISO 13849–1, properly justified fault exclusions can be used up to PLe. “Properly justified” fault exclusions are those that can be shown to be valid through the lifetime of the SRP/CS.

In gen­eral, fault exclu­sions for mechan­i­cal fail­ures of electro­mechan­i­cal devices such as inter­lock devices or emer­gency stop devices are not per­mit­ted, with a few excep­tions given in ISO 13849–2, (See Clauses 7.2.2.4 and 7.2.2.5).

This approach is con­sis­tent with the cur­rent approach taken in Canada, as described in CSA Z432 & Z434. Fault exclu­sions are gen­er­ally not per­mit­ted under ANSI standards.

Worked Examples

Section 8 of the Technical Report gives a cou­ple of worked exam­ples, one done under ISO 13849–1, and one under IEC 62061. For some­one look­ing for a good exam­ple of what a prop­erly com­pleted analy­sis should look like, this sec­tion is the gold at the end of the rain­bow. Section 8.2 pro­vides a good, clear exam­ple of the appli­ca­tion of the stan­dards along with a nice, sim­ple exam­ple of what a safety require­ment spec­i­fi­ca­tion might look like.

Understanding the Differences

One area where pro­po­nents of the two stan­dards often dis­agree is on the ‘accu­racy’ of the ana­lyt­i­cal pro­ce­dures given in the two stan­dards. The Technical Report pro­vides a detailed expla­na­tion of why the two tech­niques pro­vide slightly dif­fer­ent results and pro­vides the ratio­nale explain­ing why this vari­a­tion should be con­sid­ered acceptable.

To Buy or Not to Buy…

At the end of the day, the ques­tion that needs to be answered is whether to buy this doc­u­ment or not. If you use either of these stan­dards, I strongly rec­om­mend that you spend the money to get this Technical Report, if for noth­ing more than the worked exam­ples. Until the two stan­dards are merged, and that could be a few years, you will need to be able to effec­tively apply these approaches to PL and SIL rated equip­ment. This Technical Report will be an invalu­able aid.

It also pro­vides some guid­ance on the direc­tion that the new merged stan­dard will take. Some old argu­ments can be set­tled, or at least re-​​directed, by this document.

Finally, since the TR is to be incor­po­rated in both stan­dards and con­tains mate­r­ial replac­ing that in the cur­rent edi­tions of the stan­dard, you must buy a copy to remain current.

For all of these rea­sons, I would spend the money to acquire this doc­u­ment, read and apply it.



Interlock Architectures – Pt. 3: Category 2

This entry is part 3 of 8 in the series Circuit Architectures Explored

In the first two posts in this series, we looked at Category B, the Basic cat­e­gory of sys­tem archi­tec­ture, and then moved on to look at Category 1. Category B under­pins Categories 2, 3 and 4. In this post we’ll look more deeply into Category 2.

Let’s start by look­ing at the def­i­n­i­tion for Category 2, taken from ISO 13849–1:2007. Remember that in these excerpts, SRP/​CS stands for Safety Related Parts of Control Systems.

Definition

6.2.5 Category 2

For cat­e­gory 2, the same require­ments as those accord­ing to 6.2.3 for cat­e­gory B shall apply. “Well–tried safety prin­ci­ples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies.

SRP/​CS of cat­e­gory 2 shall be designed so that their function(s) are checked at suit­able inter­vals by the machine con­trol sys­tem. The check of the safety function(s) shall be performed

  • at the machine start-​​up, and
  • prior to the ini­ti­a­tion of any haz­ardous sit­u­a­tion, e.g. start of a new cycle, start of other move­ments, and/​or
  • peri­od­i­cally dur­ing oper­a­tion if the risk assess­ment and the kind of oper­a­tion shows that it is necessary.

The ini­ti­a­tion of this check may be auto­matic. Any check of the safety function(s) shall either

  • allow oper­a­tion if no faults have been detected, or
  • gen­er­ate an out­put which ini­ti­ates appro­pri­ate con­trol action, if a fault is detected.

Whenever pos­si­ble this out­put shall ini­ti­ate a safe state. This safe state shall be main­tained until the fault is cleared. When it is not pos­si­ble to ini­ti­ate a safe state (e.g. weld­ing of the con­tact in the final switch­ing device) the out­put shall pro­vide a warn­ing of the haz­ard.

For the des­ig­nated archi­tec­ture of cat­e­gory 2, as shown in Figure 10, the cal­cu­la­tion of MTTFd and DCavg should take into account only the blocks of the func­tional chan­nel (i.e. I, L and O in Figure 10) and not the blocks of the test­ing chan­nel (i.e. TE and OTE in Figure 10).

The diag­nos­tic cov­er­age (DCavg) of the total SRP/​CS includ­ing fault-​​detection shall be low. The MTTFd of each chan­nel shall be low-​​to-​​high, depend­ing on the required per­for­mance level (PLr). Measures against CCF shall be applied (see Annex F).

The check itself shall not lead to a haz­ardous sit­u­a­tion (e.g. due to an increase in response time). The check­ing equip­ment may be inte­gral with, or sep­a­rate from, the safety-​​related part(s) pro­vid­ing the safety function.

The max­i­mum PL achiev­able with cat­e­gory 2 is PL = d.

NOTE 1 In some cases cat­e­gory 2 is not applic­a­ble because the check­ing of the safety func­tion can­not be applied to all components.

NOTE 2 Category 2 sys­tem behav­iour allows that

  • the occur­rence of a fault can lead to the loss of the safety func­tion between checks,
  • the loss of safety func­tion is detected by the check.

NOTE 3 The prin­ci­ple that sup­ports the valid­ity of a cat­e­gory 2 func­tion is that the adopted tech­ni­cal pro­vi­sions, and, for exam­ple, the choice of check­ing fre­quency can decrease the prob­a­bil­ity of occur­rence of a dan­ger­ous situation.

[Figure: ISO 13849–1 Figure 10, Category 2 block diagram]

Breaking it down

Let’s start by taking apart the definition a piece at a time and looking at what each part means. I’ll also show a simple circuit that can meet the requirements.

Category B & Well-​​tried Components

The first para­graph speaks to the build­ing block approach taken in the standard:

For cat­e­gory 2, the same require­ments as those accord­ing to 6.2.3 for cat­e­gory B shall apply. “Well–tried safety prin­ci­ples” accord­ing to 6.2.4 shall also be fol­lowed. In addi­tion, the fol­low­ing applies.

Systems meet­ing Category 2 are required to meet all of the same require­ments as Category B, as far as the com­po­nents are con­cerned. Other require­ments for the cir­cuits are dif­fer­ent, and we will look at those in a bit.

Self-​​Testing required

Category 2 brings in the idea of diag­nos­tics. If cor­rectly spec­i­fied com­po­nents have been selected (Category B), and those com­po­nents can be con­sid­ered ‘well-​​tried’ and are applied fol­low­ing ‘well-​​tried safety prin­ci­ples’, then adding a diag­nos­tic com­po­nent to the sys­tem should allow the sys­tem to detect some faults and there­fore achieve a cer­tain degree of ‘fault-​​tolerance’ or the abil­ity to func­tion cor­rectly even when some aspect of the sys­tem has failed.

Let’s look at the text:

SRP/​CS of Category 2 shall be designed so that their function(s) are checked at suit­able inter­vals by the machine con­trol sys­tem. The check of the safety function(s) shall be performed

  • at the machine start-​​up, and
  • prior to the ini­ti­a­tion of any haz­ardous sit­u­a­tion, e.g. start of a new cycle, start of other move­ments, and/​or
  • peri­od­i­cally dur­ing oper­a­tion if the risk assess­ment and the kind of oper­a­tion shows that it is necessary.

The ini­ti­a­tion of this check may be auto­matic. Any check of the safety function(s) shall either

  • allow oper­a­tion if no faults have been detected, or
  • gen­er­ate an out­put which ini­ti­ates appro­pri­ate con­trol action, if a fault is detected.

Whenever pos­si­ble this out­put shall ini­ti­ate a safe state. This safe state shall be main­tained until the fault is cleared. When it is not pos­si­ble to ini­ti­ate a safe state (e.g. weld­ing of the con­tact in the final switch­ing device) the out­put shall pro­vide a warn­ing of the hazard.

Periodic check­ing is required. The checks must hap­pen at least each time there is a demand placed on the sys­tem, i.e. a guard door is opened and closed, or an emer­gency stop but­ton is pressed and reset. In addi­tion the integrity of the SRP/​CS must be tested at the start of a cycle or haz­ardous period, and poten­tially peri­od­i­cally dur­ing oper­a­tion if the risk assess­ment indi­cates that this is necessary.

The test­ing does not have to be auto­matic, although in prac­tice it usu­ally is. As long as the sys­tem integrity is good, then the out­put is allowed to remain on, and the machin­ery or process can run.
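
The checking behaviour described above can be sketched as a small piece of logic: run the check at start-up and before each hazardous cycle, latch any detected fault, and refuse to run until the fault has been cleared and a fresh check passes. The Python below is purely an illustration of that sequencing. The class and method names are invented, `run_check` stands in for whatever the testing channel actually does, and ordinary software like this is in no way a safety-rated implementation.

```python
from typing import Callable


class Category2Monitor:
    """Illustrative sequencing of Category 2 checks (not safety-rated code)."""

    def __init__(self, run_check: Callable[[], bool]):
        self._run_check = run_check  # returns True when no fault is detected
        self.fault_latched = False

    def _check(self) -> bool:
        # A detected fault is latched; operation stays blocked until cleared.
        if not self._run_check():
            self.fault_latched = True
        return not self.fault_latched

    def start_up(self) -> bool:
        """Check at machine start-up; True means operation is allowed."""
        return self._check()

    def request_cycle_start(self) -> bool:
        """Check before initiating a hazardous cycle."""
        return self._check()

    def clear_fault(self) -> bool:
        """Reset is only accepted if a fresh check passes."""
        if self._run_check():
            self.fault_latched = False
        return not self.fault_latched


# Simulated check results: pass, fail, pass, pass
results = iter([True, False, True, True])
monitor = Category2Monitor(lambda: next(results))

print(monitor.start_up())             # check passes, operation allowed
print(monitor.request_cycle_start())  # fault detected, safe state
print(monitor.request_cycle_start())  # check passes, but fault is still latched
print(monitor.clear_fault())          # fault cleared after a passing check
```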

Watch Out!

Notice that the words ‘when­ever pos­si­ble’ are used in the last para­graph in this part of the def­i­n­i­tion where the stan­dard speaks about ini­ti­a­tion of a safe state. This word­ing alludes to the fact that these sys­tems are still prone to faults that can lead to the loss of the safety func­tion, and so can­not be called truly ‘fault-​​tolerant’. Loss of the safety func­tion must be detected by the mon­i­tor­ing sys­tem and a safe state ini­ti­ated. This requires care­ful thought, since the safety sys­tem com­po­nents may have to inter­act with the process con­trol sys­tem to ini­ti­ate and main­tain the safe state in the event that the safety sys­tem itself has failed.

All of this leads to an interesting question: If the system is hardwired through the operating channel, and all the components used in that channel meet Category B requirements, can the diagnostic component be provided by monitoring the system with a standard PLC?

Unfortunately, the answer to this is NO. This is true because ALL of the com­po­nents must meet the well-​​tried require­ment, and since pro­gram­ma­ble elec­tron­ics are specif­i­cally excluded from being con­sid­ered well-​​tried, this approach can­not be used. Some North American stan­dards are writ­ten so that this approach could be applied, but under the International and EU require­ments it is not acceptable.

Finally, for the faults that can be detected by the mon­i­tor­ing sys­tem, detec­tion of a fault must ini­ti­ate a safe state. This means that on the next demand on the sys­tem, i.e. the next time the guard is opened or the emer­gency stop is pressed, the machine must go into a safe con­di­tion. Generally, detec­tion of a fault should pre­vent the sub­se­quent reset of the sys­tem until the fault is cleared or repaired.

Testing is not per­mit­ted to intro­duce any new haz­ards or to slow the sys­tem down. The tests must occur ‘on-​​the-​​fly’ and with­out intro­duc­ing any delay in the sys­tem com­pared to how it would have oper­ated with­out the test­ing incor­po­rated. Test equip­ment can be inte­grated into the safety sys­tem or be exter­nal to it.

One more ‘gotcha’

Note 1 in the def­i­n­i­tion high­lights a sig­nif­i­cant pit­fall for many design­ers: if all of the com­po­nents in the func­tional chan­nel of the sys­tem can­not be checked, you can­not claim con­for­mity to Category 2. A sys­tem that oth­er­wise would meet the archi­tec­tural require­ments for Category 2 must be down­graded to Category 1 in cases where all the com­po­nents in the func­tional chan­nel can­not be tested. This is a major point and one which many design­ers miss when devel­op­ing their systems.

Calculation of MTTFd

The next paragraph deals with the calculation of the mean time to dangerous failure of the system, or MTTFd.

For the des­ig­nated archi­tec­ture of cat­e­gory 2, as shown in Figure 10, the cal­cu­la­tion of MTTFd and DCavg should take into account only the blocks of the func­tional chan­nel (i.e. I, L and O in Figure 10) and not the blocks of the test­ing chan­nel (i.e. TE and OTE in Figure 10).

Calculation of the fail­ure rate focuses on the func­tional chan­nel, not on the mon­i­tor­ing sys­tem, mean­ing that the fail­ure rate of the mon­i­tor­ing sys­tem is ignored when ana­lyz­ing sys­tems using this archi­tec­ture. The MTTFd of each com­po­nent in the func­tional chan­nel is cal­cu­lated and then the MTTFd of the total chan­nel is calculated.
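
A minimal Python sketch of that two-step calculation follows, assuming the simplified parts count approach in which the reciprocals of the component MTTFd values are summed to give the reciprocal of the channel MTTFd. The component values are invented for illustration, and the LOW/MEDIUM/HIGH bands used are 3 to <10, 10 to <30 and 30 years or more.

```python
def channel_mttfd(component_mttfd_years):
    """Channel MTTFd in years: 1/MTTFd = sum of 1/MTTFd_i over the components."""
    if not component_mttfd_years:
        raise ValueError("at least one component is required")
    return 1.0 / sum(1.0 / m for m in component_mttfd_years)


def mttfd_band(mttfd_years):
    """Classify a channel MTTFd into the bands used by ISO 13849-1."""
    if mttfd_years < 3:
        return "below LOW (not acceptable)"
    if mttfd_years < 10:
        return "LOW"
    if mttfd_years < 30:
        return "MEDIUM"
    return "HIGH"


# Hypothetical functional channel: input (I), logic (L) and output (O) blocks.
# The testing channel (TE, OTE) is deliberately left out of the calculation.
functional_channel = {"I": 150.0, "L": 300.0, "O": 100.0}

total = channel_mttfd(list(functional_channel.values()))
print(f"Channel MTTFd = {total:.1f} years ({mttfd_band(total)})")
```

The channel always comes out worse than its weakest component, which is why one component stuck on the 10 year default can drag down an otherwise good design.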

The Diagnostic Coverage (DCavg) is also cal­cu­lated based exclu­sively on the com­po­nents in the func­tional chan­nel, so when deter­min­ing what per­cent­age of the faults can be detected by the mon­i­tor­ing equip­ment, only faults in the func­tional chan­nel are considered.
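
Here is a short Python sketch of the DCavg calculation, assuming the usual weighted average in which each component’s DC is weighted by the reciprocal of its MTTFd, so that less reliable components count for more. The component values are hypothetical, and the bands used are: below 60% none, 60% to <90% LOW, 90% to <99% MEDIUM, 99% and above HIGH.

```python
def dc_avg(components):
    """Average diagnostic coverage for a list of (mttfd_years, dc) pairs.

    DCavg = sum(DC_i / MTTFd_i) / sum(1 / MTTFd_i), with DC as a fraction 0..1.
    """
    numerator = sum(dc / mttfd for mttfd, dc in components)
    denominator = sum(1.0 / mttfd for mttfd, _ in components)
    return numerator / denominator


def dc_band(dc):
    """Classify a DC fraction into the bands used by ISO 13849-1."""
    if dc < 0.60:
        return "NONE"
    if dc < 0.90:
        return "LOW"
    if dc < 0.99:
        return "MEDIUM"
    return "HIGH"


# Hypothetical functional channel only: (MTTFd in years, DC) for I, L and O
functional_channel = [(150.0, 0.90), (300.0, 0.60), (100.0, 0.60)]

average = dc_avg(functional_channel)
print(f"DCavg = {average:.0%} ({dc_band(average)})")
```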

This high­lights the fact that a fail­ure of the mon­i­tor­ing sys­tem can­not be detected, so a sin­gle fail­ure in the mon­i­tor­ing sys­tem that results in the sys­tem fail­ing to detect a sub­se­quent nor­mally detectable fail­ure in the func­tional chan­nel will result in the loss of the safety function.

Summing Up

The next para­graph sums up the lim­its of this par­tic­u­lar architecture:

The diag­nos­tic cov­er­age (DCavg) of the total SRP/​CS includ­ing fault-​​detection shall be low. The MTTFd of each chan­nel shall be low-​​to-​​high, depend­ing on the required per­for­mance level (PLr). Measures against CCF shall be applied (see Annex F).

The first sentence reflects back to the previous paragraph on diagnostic coverage, telling you, as the designer, that you cannot make a claim to anything more than LOW diagnostic coverage when using this architecture.

This raises an inter­est­ing ques­tion, since Figure 5 in the stan­dard shows columns for both DCavg = LOW and DCavg=MED. My best advice to you as a user of the stan­dard is to abide by the text, mean­ing that you can­not claim higher than LOW for DCavg in this architecture.

Another problem raised by this sentence is the inclusion of the phrase “the total SRP/CS including fault-detection”, since the previous paragraph explicitly tells you that the assessment of DCavg ‘should’ only include the functional channel, while this sentence appears to include the testing channel as well. In standards writing, sentences including the word ‘shall’ are clearly mandatory, while those including the word ‘should’ indicate a condition which is advised but not required. Hopefully this confusion will be clarified in the next edition of the standard.

The MTTFd of the functional channel can be anywhere in the range from LOW to HIGH depending on the components selected and the way they are applied in the design. The requirement will be driven by the desired PL of the system, so a PLd system will require HIGH MTTFd components in the functional channel, while the same architecture used for a PLb system would require only LOW MTTFd components.

Finally, applicable measures against Common Cause Failures (CCF) must be used. Some of the measures given in Table F.1 in Annex F of the standard cannot be applied, such as Channel Separation, since you cannot separate a single channel. Other CCF measures can and must be applied, so you must score at least the minimum 65 on the CCF table in Annex F to claim compliance with Category 2 requirements.
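Here is a small Python sketch of how that scoring works. The measure names are my own shorthand and the point values are my reading of Table F.1, so check both against your copy of the standard before relying on them. Each measure is all-or-nothing: you either claim the full points or none.

```python
# Point values per measure, as I read Table F.1 (verify against the standard).
CCF_MEASURES = {
    "separation": 15,             # separation/segregation of signal paths
    "diversity": 20,              # different technologies or principles
    "overvoltage_protection": 15, # protection against overvoltage, overpressure, etc.
    "well_tried_components": 5,
    "failure_analysis": 5,        # FMEA results used in the design
    "designer_training": 5,       # competence/training of designers
    "emc_contamination": 25,      # EMC or fluid contamination protection
    "other_environment": 10,      # temperature, shock, vibration, humidity
}
REQUIRED_SCORE = 65


def ccf_score(applied):
    """Add up the points for the measures that are fully applied."""
    unknown = set(applied) - set(CCF_MEASURES)
    if unknown:
        raise ValueError(f"unknown measures: {sorted(unknown)}")
    return sum(CCF_MEASURES[name] for name in set(applied))


# Single channel design: separation cannot be claimed, everything else is.
applied = [name for name in CCF_MEASURES if name != "separation"]
score = ccf_score(applied)
print(f"CCF score: {score} ({'pass' if score >= REQUIRED_SCORE else 'fail'})")
```

With these values, losing the separation points still leaves plenty of room to reach 65, but the margin disappears quickly if you also have to drop one of the larger items.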

Example Circuit

Here’s an example of what a simple Category 2 circuit constructed from discrete components might look like. Note that PB1 and PB2 could just as easily be interlock switches on guard doors as push buttons on a control panel. For the sake of simplicity, I did not illustrate surge suppression on the relays, but you should include MOVs or RC suppressors across all relay coils. All relays are assumed to be of ‘force-guided’ construction and to meet the requirements for well-tried components.

Example Category 2 circuit from discrete components


Here is how the cir­cuit works:

  1. The machine is stopped with power off. CR1, CR2, and M are off. CR3 is off until the reset but­ton is pressed, since the NC mon­i­tor­ing con­tacts on CR1, CR2 and M are all closed, but the NO reset push but­ton con­tact is open.
  2. The reset push button, PB3, is pressed. If CR1, CR2 and M are all off, their normally closed contacts will be closed, so pressing PB3 will result in CR3 turning on.
  3. CR3 closes its con­tacts, ener­giz­ing CR1 and CR2 which seal their con­tact cir­cuits in and de-​​energize CR3. The time delays inher­ent in relays per­mit this to work.
  4. With CR1 and CR2 closed and CR3 held off because its coil cir­cuit opened when CR1 and CR2 turned on, M ener­gizes and motion can start.

In this circuit the monitoring function is provided by CR3. If any of CR1, CR2 or M were to weld closed, CR3 could not energize, and so a single fault is detected and the machine is prevented from re-starting. If either PB1 or PB2 is pressed, the machine will stop, since CR1 and CR2 are redundant. If CR3 fails to de-energize, the M rung is held open, preventing the machine from starting with a failed monitoring system. If CR1 or CR2 fails with an open coil, then M cannot energize because of the redundant contacts on the M rung.
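If you'd like to play with this behaviour, the sequence can be sketched as a simple scan-by-scan simulation in Python. The rung logic here is my own reconstruction from the description above, not a transcription of the drawing, and I've assumed that PB1 and PB2 are normally closed contacts in series ahead of both the CR1 and CR2 coils. Each scan uses the contact states from the previous scan, which stands in for the relay time delays mentioned in step 3.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    cr1: bool = False
    cr2: bool = False
    cr3: bool = False
    m: bool = False
    welded: frozenset = frozenset()  # relays whose contacts are stuck closed

    def __post_init__(self):
        for name in self.welded:
            setattr(self, name, True)

    def scan(self, pb1_closed=True, pb2_closed=True, pb3_pressed=False):
        """Evaluate every rung once, using the previous contact states."""
        stop_ok = pb1_closed and pb2_closed
        new = {
            # CR3: reset button in series with NC contacts of CR1, CR2, M.
            "cr3": pb3_pressed and not (self.cr1 or self.cr2 or self.m),
            # CR1/CR2: picked up by CR3, then sealed in by their own contact.
            "cr1": stop_ok and (self.cr3 or self.cr1),
            "cr2": stop_ok and (self.cr3 or self.cr2),
            # M: needs both CR1 and CR2 on and CR3 dropped out.
            "m": self.cr1 and self.cr2 and not self.cr3,
        }
        for name, value in new.items():
            setattr(self, name, value or name in self.welded)

    def settle(self, scans=6, **inputs):
        """Run enough scans for the relays to reach a steady state."""
        for _ in range(scans):
            self.scan(**inputs)
        return self


healthy = Circuit()
healthy.settle(pb3_pressed=True).settle()  # press, then release reset
print("Healthy circuit, M running:", healthy.m)

faulty = Circuit(welded=frozenset({"cr1"}))
faulty.settle(pb3_pressed=True).settle()
print("CR1 welded, M running:", faulty.m)
```

A model like this is no substitute for a proper fault analysis of the real schematic, but it is a quick way to check that a proposed monitoring rung actually blocks a restart for each fault you care about.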

This cir­cuit can­not detect a fail­ure in PB1, PB2, or PB3. Testing is con­ducted each time the cir­cuit is reset.

If M is a motor starter rather than the motor itself, it will need to be dupli­cated for redun­dancy and a mon­i­tor­ing con­tact added to the CR3 rung unless a rea­son­able case for fault exclu­sion can be made.

In calculating MTTFd, PB1, PB2, CR1, CR2, CR3 and M must be included. CR3 is included because it has a functional contact in the M rung and is therefore part of the functional channel of the circuit as well as being part of the TE and OTE channels.
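Since push buttons and relays are usually specified with a B10d value rather than an MTTFd, each of these components normally needs to be converted first. The Python sketch below uses the relationship from Annex C of the standard, MTTFd = B10d / (0.1 × nop), where nop is the number of operations per year. The B10d figures and the duty cycle are placeholders I made up for illustration, so substitute the manufacturer's data and your own application numbers.

```python
def operations_per_year(days_per_year, hours_per_day, cycle_time_s):
    """n_op: mean number of operations per year for the application."""
    return days_per_year * hours_per_day * 3600.0 / cycle_time_s


def mttfd_from_b10d(b10d_cycles, days_per_year, hours_per_day, cycle_time_s):
    """Convert a B10d value (cycles) into MTTFd (years)."""
    n_op = operations_per_year(days_per_year, hours_per_day, cycle_time_s)
    return b10d_cycles / (0.1 * n_op)


# Placeholder B10d values for the components named above (not real data).
b10d = {
    "PB1": 100_000,
    "PB2": 100_000,
    "CR1": 2_000_000,
    "CR2": 2_000_000,
    "CR3": 2_000_000,
    "M": 2_000_000,
}

# Assumed duty: 220 days/year, 8 hours/day, one operation every 10 minutes.
for name, cycles in b10d.items():
    years = mttfd_from_b10d(cycles, 220, 8, 600)
    print(f"{name}: MTTFd = {years:.1f} years")
```

The result depends heavily on how often the component is operated, so the same relay can land in a different MTTFd band on a fast-cycling machine than on one that is reset a few times per shift.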


Watch for the next install­ment in this series where we’ll explore Category 3, the first of the ‘fault tol­er­ant’ architectures!
