ISO 13849–1 Analysis — Part 6: CCF — Common Cause Failures

This entry is part 6 of 9 in the series How to do a 13849–1 analy­sis

What is a Common Cause Failure?

There are two sim­i­lar-sound­ing terms that peo­ple often get con­fused: Com­mon Cause Fail­ure (CCF) and Com­mon Mode Fail­ure. While these two types of fail­ures sound sim­i­lar, they are dif­fer­ent. A Com­mon Cause Fail­ure is a fail­ure in a sys­tem where two or more por­tions of the sys­tem fail at the same time from a sin­gle com­mon cause. An exam­ple could be a light­ning strike that caus­es a con­tac­tor to weld and simul­ta­ne­ous­ly takes out the safe­ty relay proces­sor that con­trols the con­tac­tor. Com­mon cause fail­ures are there­fore two dif­fer­ent man­ners of fail­ure in two dif­fer­ent com­po­nents, but with a sin­gle cause.

Com­mon Mode Fail­ure is where two com­po­nents or por­tions of a sys­tem fail in the same way, at the same time. For exam­ple, two inter­pos­ing relays both fail with weld­ed con­tacts at the same time. The fail­ures could be caused by the same cause or from dif­fer­ent caus­es, but the way the com­po­nents fail is the same.

Com­mon-cause fail­ure includes com­mon mode fail­ure, since a com­mon cause can result in a com­mon man­ner of fail­ure in iden­ti­cal devices used in a sys­tem.

Here are the for­mal def­i­n­i­tions of these terms:

3.1.6 com­mon cause fail­ure CCF

fail­ures of dif­fer­ent items, result­ing from a sin­gle event, where these fail­ures are not con­se­quences of each oth­er

Note 1 to entry: Com­mon cause fail­ures should not be con­fused with com­mon mode fail­ures (see ISO 12100:2010, 3.36). [SOURCE: IEC 60050?191-am1:1999, 04–23.] [1]


3.36 com­mon mode fail­ures

fail­ures of items char­ac­ter­ized by the same fault mode

NOTE Com­mon mode fail­ures should not be con­fused with com­mon cause fail­ures, as the com­mon mode fail­ures can result from dif­fer­ent caus­es. [lEV 191–04-24] [3]

The “com­mon mode” fail­ure def­i­n­i­tion uses the phrase “fault mode”, so let’s look at that as well:

fail­ure mode
DEPRECATED: fault mode
man­ner in which fail­ure occurs

Note 1 to entry: A fail­ure mode may be defined by the func­tion lost or oth­er state tran­si­tion that occurred. [IEV 192–03-17] [17]

As you can see, “fault mode” is no longer used, in favour of the more com­mon “fail­ure mode”, so it is pos­si­ble to re-write the com­mon-mode fail­ure def­i­n­i­tion to read, “fail­ures of items char­ac­terised by the same man­ner of fail­ure.”

Random, Systematic and Common Cause Failures

Why do we need to care about this? There are three man­ners in which fail­ures occur: ran­dom fail­ures, sys­tem­at­ic fail­ures, and com­mon cause fail­ures. When devel­op­ing safe­ty relat­ed con­trols, we need to con­sid­er all three and mit­i­gate them as much as pos­si­ble.

Ran­dom fail­ures do not fol­low any pat­tern, occur­ring ran­dom­ly over time, and are often brought on by over-stress­ing the com­po­nent, or from man­u­fac­tur­ing flaws. Ran­dom fail­ures can increase due to envi­ron­men­tal or process-relat­ed stress­es, like cor­ro­sion, EMI, nor­mal wear-and-tear, or oth­er over-stress­ing of the com­po­nent or sub­sys­tem. Ran­dom fail­ures are often mit­i­gat­ed through selec­tion of high-reli­a­bil­i­ty com­po­nents [18].

Sys­tem­at­ic fail­ures include com­mon-cause fail­ures, and occur because some human behav­iour occurred that was not caught by pro­ce­dur­al means. These fail­ures are due to design, spec­i­fi­ca­tion, oper­at­ing, main­te­nance, and instal­la­tion errors. When we look at sys­tem­at­ic errors, we are look­ing for things like train­ing of the sys­tem design­ers, or qual­i­ty assur­ance pro­ce­dures used to val­i­date the way the sys­tem oper­ates. Sys­tem­at­ic fail­ures are non-ran­dom and com­plex, mak­ing them dif­fi­cult to analyse sta­tis­ti­cal­ly. Sys­tem­at­ic errors are a sig­nif­i­cant source of com­mon-cause fail­ures because they can affect redun­dant devices, and because they are often deter­min­is­tic, occur­ring when­ev­er a set of cir­cum­stances exist.

Sys­tem­at­ic fail­ures include many types of errors, such as:

  • Man­u­fac­tur­ing defects, e.g., soft­ware and hard­ware errors built into the device by the man­u­fac­tur­er.
  • Spec­i­fi­ca­tion mis­takes, e.g. incor­rect design basis and inac­cu­rate soft­ware spec­i­fi­ca­tion.
  • Imple­men­ta­tion errors, e.g., improp­er instal­la­tion, incor­rect pro­gram­ming, inter­face prob­lems, and not fol­low­ing the safe­ty man­u­al for the devices used to realise the safe­ty func­tion.
  • Oper­a­tion and main­te­nance, e.g., poor inspec­tion, incom­plete test­ing and improp­er bypass­ing [18].

Diverse redun­dan­cy is com­mon­ly used to mit­i­gate sys­tem­at­ic fail­ures, since dif­fer­ences in com­po­nent or sub­sys­tem design tend to cre­ate non-over­lap­ping sys­tem­at­ic fail­ures, reduc­ing the like­li­hood of a com­mon error cre­at­ing a com­mon-mode fail­ure. Errors in spec­i­fi­ca­tion, imple­men­ta­tion, oper­a­tion and main­te­nance are not affect­ed by diver­si­ty.

Fig 1 below shows the results of a small study done by the UK’s Health and Safe­ty Exec­u­tive in 1994 [19] that sup­ports the idea that sys­tem­at­ic fail­ures are a sig­nif­i­cant con­trib­u­tor to safe­ty sys­tem fail­ures. The study includ­ed only 34 sys­tems (n=34), so the results can­not be con­sid­ered con­clu­sive. How­ev­er, there were some star­tling results. As you can see, errors in the spec­i­fi­ca­tion of the safe­ty func­tions (Safe­ty Require­ment Spec­i­fi­ca­tion) result­ed in about 44% of the sys­tem fail­ures in the study. Based on this small sam­ple, sys­tem­at­ic fail­ures appear to be a sig­ni­fi­cate source of fail­ures.

Pie chart illustrating the proportion of failures in each phase of the life cycle of a machine, based on data taken from HSE Report HSG238.
Fig­ure 1 — HSG 238 Pri­ma­ry Caus­es of Fail­ure by Life Cycle Stage

Handling CCF in ISO 13849–1

Now that we under­stand WHAT Com­mon-Cause Fail­ure is, and WHY it’s impor­tant, we can talk about HOW it is han­dled in ISO 13849–1. Since ISO 13849–1 is intend­ed to be a sim­pli­fied func­tion­al safe­ty stan­dard, CCF analy­sis is lim­it­ed to a check­list in Annex F, Table F.1. Note that Annex F is infor­ma­tive, mean­ing that it is guid­ance mate­r­i­al to help you apply the stan­dard. Since this is the case, you could use any oth­er means suit­able for assess­ing CCF mit­i­ga­tion, like those in IEC 61508, or in oth­er stan­dards.

Table F.1 is set up with a series of mit­i­ga­tion mea­sures which are grouped togeth­er in relat­ed cat­e­gories. Each group is pro­vid­ed with a score that can be claimed if you have imple­ment­ed the mit­i­ga­tions in that group. ALL OF THE MEASURES in each group must be ful­filled in order to claim the points for that cat­e­go­ry. Here’s an exam­ple:

A portion of ISO 13849-1 Table F.1.
ISO 13849–1:2015, Table F.1 Excerpt

In order to claim the 20 points avail­able for the use of sep­a­ra­tion or seg­re­ga­tion in the sys­tem design, there must be a sep­a­ra­tion between the sig­nal paths. Sev­er­al exam­ples of this are giv­en for clar­i­ty.

Table F.1 lists six groups of mit­i­ga­tion mea­sures. In order to claim ade­quate CCF mit­i­ga­tion, a min­i­mum score of 65 points must be achieved. Only Cat­e­go­ry 2, 3 and 4 archi­tec­tures are required to meet the CCF require­ments in order to claim the PL, but with­out meet­ing the CCF require­ment you can­not claim the PL, regard­less of whether the design meets the oth­er cri­te­ria or not.

One final note on CCF: If you are try­ing to review an exist­ing con­trol sys­tem, say in an exist­ing machine, or in a machine designed by a third par­ty where you have no way to deter­mine the expe­ri­ence and train­ing of the design­ers or the capa­bil­i­ty of the company’s change man­age­ment process, then you can­not ade­quate­ly assess CCF [8]. This fact is recog­nised in CSA Z432-16 [20], chap­ter 8. [20] allows the review­er to sim­ply ver­i­fy that the archi­tec­tur­al require­ments, exclu­sive of any prob­a­bilis­tic require­ments, have been met. This is par­tic­u­lar­ly use­ful for engi­neers review­ing machin­ery under Ontario’s Pre-Start Health and Safe­ty require­ments [21], who are fre­quent­ly work­ing with less-than-com­plete design doc­u­men­ta­tion.

In case you missed the first part of the series, you can read it here. In the next arti­cle in this series, I’m going to review the process flow for sys­tem analy­sis as cur­rent­ly out­lined in ISO 13849–1. Watch for it!

Book List

Here are some books that I think you may find help­ful on this jour­ney:

[0]     B. Main, Risk Assess­ment: Basics and Bench­marks, 1st ed. Ann Arbor, MI USA: DSE, 2004.

[0.1]  D. Smith and K. Simp­son, Safe­ty crit­i­cal sys­tems hand­book. Ams­ter­dam: Else­vier/But­ter­worth-Heine­mann, 2011.

[0.2]  Elec­tro­mag­net­ic Com­pat­i­bil­i­ty for Func­tion­al Safe­ty, 1st ed. Steve­nage, UK: The Insti­tu­tion of Engi­neer­ing and Tech­nol­o­gy, 2008.

[0.3]  Overview of tech­niques and mea­sures relat­ed to EMC for Func­tion­al Safe­ty, 1st ed. Steve­nage, UK: Overview of tech­niques and mea­sures relat­ed to EMC for Func­tion­al Safe­ty, 2013.


Note: This ref­er­ence list starts in Part 1 of the series, so “miss­ing” ref­er­ences may show in oth­er parts of the series. The com­plete ref­er­ence list is includ­ed in the last post of the series.

[1]     Safe­ty of machin­ery — Safe­ty-relat­ed parts of con­trol sys­tems — Part 1: Gen­er­al prin­ci­ples for design. 3rd Edi­tion. ISO Stan­dard 13849–1. 2015.

[2]     Safe­ty of machin­ery — Safe­ty-relat­ed parts of con­trol sys­tems — Part 2: Val­i­da­tion. 2nd Edi­tion. ISO Stan­dard 13849–2. 2012.

[3]      Safe­ty of machin­ery — Gen­er­al prin­ci­ples for design — Risk assess­ment and risk reduc­tion. ISO Stan­dard 12100. 2010.

[8]     S. Joce­lyn, J. Bau­doin, Y. Chin­ni­ah, and P. Char­p­en­tier, “Fea­si­bil­i­ty study and uncer­tain­ties in the val­i­da­tion of an exist­ing safe­ty-relat­ed con­trol cir­cuit with the ISO 13849–1:2006 design stan­dard,” Reliab. Eng. Syst. Saf., vol. 121, pp. 104–112, Jan. 2014.

[17]      “fail­ure mode”, 192–03-17, Inter­na­tion­al Elec­trotech­ni­cal Vocab­u­lary. IEC Inter­na­tion­al Elec­trotech­ni­cal Com­mis­sion, Gene­va, 2015.

[18]      M. Gen­tile and A. E. Sum­mers, “Com­mon Cause Fail­ure: How Do You Man­age Them?,” Process Saf. Prog., vol. 25, no. 4, pp. 331–338, 2006.

[19]     Out of Control—Why con­trol sys­tems go wrong and how to pre­vent fail­ure, 2nd ed. Rich­mond, Sur­rey, UK: HSE Health and Safe­ty Exec­u­tive, 2003.

[20]     Safe­guard­ing of Machin­ery. 3rd Edi­tion. CSA Stan­dard Z432. 2016.

[21]     O. Reg. 851, INDUSTRIAL ESTABLISHMENTS. Ontario, Cana­da, 1990.

31-Dec-2011 — Are YOU ready?

This entry is part 8 of 8 in the series Cir­cuit Archi­tec­tures Explored

31-Decem­ber-2011 marks a key mile­stone for machine builders mar­ket­ing their prod­ucts in the Euro­pean Union, the EEA and many of the Can­di­date States. Func­tion­al Safe­ty takes a pos­i­tive step for­ward with the manda­to­ry appli­ca­tion of EN ISO 13849–1 and -2. As of 1-Jan­u­ary-2012, the safe­ty-relat­ed parts of the con­trol sys­tems on all machin­ery bear­ing a CE Mark will be required to meet these stan­dards.

This change start­ed six years ago, when these stan­dards were first har­mo­nized under the Machin­ery Direc­tive. The EC Machin­ery Com­mit­tee gave machine builders an addi­tion­al three years to make the tran­si­tion to these stan­dards, after much oppo­si­tion to the orig­i­nal manda­to­ry imple­men­ta­tion date of 31-Dec-08 was announced.

If you aren’t aware of these stan­dards, or if you aren’t famil­iar with the con­cept of func­tion­al safe­ty, you need to get up to speed, and fast.

Under EN 954–1:1995 and the 1st Edi­tion of ISO 13849–1, pub­lished in 1999, a design­er need­ed to select a design Cat­e­go­ry or archi­tec­ture, that would pro­vide the degree of fault tol­er­ance and reli­a­bil­i­ty need­ed based on the out­come of the risk assess­ment for the machin­ery. The Cat­e­gories, B, 1–4, remain unchanged in the 2nd Edi­tion. I’ve talked about the Cat­e­gories in detail in oth­er posts, so I won’t spend any time on them here.

The 2nd Edi­tion brings Mean Time to Fail­ure into the pic­ture, along with Diag­nos­tic Cov­er­age and Com­mon Cause Fail­ures. These new con­cepts require design­ers to use more ana­lyt­i­cal tech­niques in devel­op­ing their designs, and also require addi­tion­al doc­u­men­ta­tion (as usu­al!).

One of the main fail­ings with EN 954–1 was Val­i­da­tion. This top­ic was sup­posed to have been cov­ered by EN 954–2, but this stan­dard was nev­er pub­lished. This has led machine builders to make design deci­sions with­out keep­ing the nec­es­sary design doc­u­men­ta­tion trail, and fur­ther­more, to skip the Val­i­da­tion step entire­ly in many cas­es.

The miss­ing Val­i­da­tion stan­dard was final­ly pub­lished in 2003 as ISO 13849–2:2003, and sub­se­quent­ly adopt­ed and har­mo­nized in 2009 as EN ISO 13849–2:2003. While no manda­to­ry imple­men­ta­tion date for this stan­dard is giv­en in the cur­rent list of stan­dards har­mo­nized under 2006/42/EC-Machin­ery, use of Part 1 of the stan­dard man­dates use of Part 2, so this stan­dard is effec­tive­ly manda­to­ry at the same time.

Part 2 brings a num­ber of key annex­es that are nec­es­sary for the imple­men­ta­tion of Part 1, and also out­lines the com­plete doc­u­men­ta­tion trail need­ed for val­i­da­tion, and coin­ci­den­tal­ly, audit. Noti­fied bpdies will be look­ing for this infor­ma­tion when eval­u­at­ing the con­tent of Tech­ni­cal Files used in CE Mark­ing.

From a North Amer­i­can per­spec­tive, these two stan­dards gain access through ANSI’s adop­tion of ISO 10218 for Indus­tri­al Robots. Part 1 of this stan­dard, cov­er­ing the robot itself, was adopt­ed last year. Part 2 of the stan­dard will be adopt­ed in 2012, and RIA R15.06 will be with­drawn. At the same time, CSA will be adopt­ing the ISO stan­dards and with­draw­ing CSA Z434.

These changes will final­ly bring North Amer­i­ca, the Inter­na­tion­al Com­mu­ni­ty and the EU onto the same foot­ing when it comes to Func­tion­al Safe­ty in indus­tri­al machin­ery appli­ca­tions. The days of “SIMPLE, SINGLE CHANNEL, SINGLE CHANNEL-MONITORED and CONTROL RELIABLE” are num­bered.

Are you ready?

Com­pli­ance InSight Con­sult­ing will be offer­ing a series of train­ing events in 2012 on this top­ic. For more infor­ma­tion, con­tact Doug Nix.

Inconsistencies in ISO 13849–1:2006

This entry is part 7 of 8 in the series Cir­cuit Archi­tec­tures Explored

I’ve writ­ten quite a bit recent­ly on the top­ic of cir­cuit archi­tec­tures under ISO 13849–1, and one of my read­ers noticed an incon­sis­ten­cy between the text of the stan­dard and Fig­ure 5, the dia­gram that shows how the cat­e­gories can span one or more Per­for­mance Lev­els.

ISO 13849-1 Figure 5
ISO 13849–1, Fig­ure 5: Rela­tion­ship between Cat­e­gories, DC, MTTFd and PL

If you look at Cat­e­go­ry 2 in Fig­ure 5, you will notice that there are TWO bands, one for DCavg LOW and one for DCavg MED. How­ev­er, read­ing the text of the def­i­n­i­tion for Cat­e­go­ry 2 gives (§6.2.5):

The diag­nos­tic cov­er­age (DCavg) of the total SRP/CS includ­ing fault-detec­tion shall be low.

This leaves some con­fu­sion, because it appears from the dia­gram that there are two options for this archi­tec­ture. This is backed up by the data in Annex K that under­lies the dia­gram.

The same con­fu­sion exists in the text describ­ing Cat­e­go­ry 3, with Fig­ure 5 show­ing two bands, one for DCavg LOW and one for DCavg MED.

I con­tact­ed the ISO TC199 Sec­re­tari­at, the peo­ple respon­si­ble for the con­tent of ISO 13849–1, and point­ed out this appar­ent con­flict. They respond­ed that they would pass the com­ment on to the TC for res­o­lu­tion, and would con­tact me if they need­ed addi­tion­al infor­ma­tion. As of this writ­ing, I have not heard more.

So what should you do if you are try­ing to design to this stan­dard? My advice is to fol­low Fig­ure 5. If you can achieve a DCavg MED in your design, it is com­plete­ly rea­son­able to claim a high­er PL. Refer to the data in Annex K to see where your design falls once you have com­plet­ed the MTTFd cal­cu­la­tions.

Thanks to Richard Har­ris and Dou­glas Flo­rence, both mem­bers of the ISO 13849 and IEC 62061 Group on LinkedIn for bring­ing this to my atten­tion!

If you are inter­est­ed in con­tact­ing the TC199 Sec­re­tari­at, you can email the Sec­re­tary, Mr. Stephen Kennedy. More details on ISO TC199 can be found on the Tech­ni­cal Com­mit­tee page on the ISO web Site.