
ISO 13849 – 1 Analysis — Part 5: Diagnostic Coverage (DC)

A graph showing the theoretical "bathtub curve" for product failure rate over the lifetime of the product.
This entry is part 5 of 9 in the series How to do a 13849 – 1 ana­lys­is

What is Diagnostic Coverage?

Under­stand­ing Dia­gnost­ic Cov­er­age (DC) as it is used in ISO 13849 – 1 [1] is crit­ic­al to ana­lys­ing the design of any safety func­tion assessed using this stand­ard. In case you missed a pre­vi­ous part of the series, you can read it here.

In the last instalment of this series discussing MTTFD, I brought up the fact that everything fails eventually, and so everything has a natural failure rate. The bathtub curve shown at the top of this post shows a typical failure rate curve for most products. Failure rates tell you how often components or systems fail, and the reciprocal of a constant failure rate gives the mean time it takes for a failure to occur. Failure rates are expressed in many ways, MTTFD and PFHd being the ways relevant to this discussion of ISO 13849 analysis. MTTFD is given in years, and PFHd is given as a probability per hour (1/h). As a reminder, PFHd stands for “Probability of dangerous Failure per Hour”.
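
To make the units concrete, here is a minimal sketch of converting an MTTFD in years into a dangerous failure rate per hour. It assumes a constant failure rate (the flat bottom of the bathtub curve) and 8760 hours per year. The function name and the 100-year figure are illustrative only, and the result is the failure rate of a single channel, not the PFHd of a complete safety function.

```python
HOURS_PER_YEAR = 8760  # assumption: 365 days x 24 h, leap years ignored


def dangerous_failure_rate_per_hour(mttfd_years: float) -> float:
    """Return the dangerous failure rate (1/h) for a given MTTFD in years.

    Assumes a constant failure rate, so rate = 1 / MTTFD.
    """
    if mttfd_years <= 0:
        raise ValueError("MTTFD must be positive")
    return 1.0 / (mttfd_years * HOURS_PER_YEAR)


# Hypothetical channel with an MTTFD of 100 years
rate = dangerous_failure_rate_per_hour(100.0)
print(f"lambda_d is roughly {rate:.2e} per hour")
```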

Three of the standard architectures, Categories 2, 3 and 4, include automatic diagnostic functions. As soon as we add diagnostics to the system, we need to know which faults the diagnostics can detect and what share of all the dangerous failures those detectable failures represent. Diagnostic Coverage (DC) is the ratio of the dangerous failures that can be detected to the total dangerous failures that could occur, expressed as a percentage. Some failures do not result in a dangerous failure, and those are excluded from DC because we don’t need to worry about them: if they occur, the system will not fail into a dangerous state.

Here’s the form­al defin­i­tion from [1]:

3.1.26 dia­gnost­ic cov­er­age (DC)

meas­ure of the effect­ive­ness of dia­gnostics, which may be determ­ined as the ratio between the fail­ure rate of detec­ted dan­ger­ous fail­ures and the fail­ure rate of total dan­ger­ous fail­ures

Note 1 to entry: Dia­gnost­ic cov­er­age can exist for the whole or parts of a safety-related sys­tem. For example, dia­gnost­ic cov­er­age could exist for sensors and/or logic sys­tem and/or final ele­ments. [SOURCE: IEC 61508 – 4:1998, 3.8.6, mod­i­fied.]

That brings up two oth­er related defin­i­tions that need to be kept in mind [1]:

3.1.4 fail­ure

ter­min­a­tion of the abil­ity of an item to per­form a required func­tion

Note 1 to entry: After a fail­ure, the item has a fault.

Note 2 to entry: “Fail­ure” is an event, as dis­tin­guished from “fault”, which is a state.

Note 3 to entry: The concept as defined does not apply to items con­sist­ing of soft­ware only.

Note 4 to entry: Fail­ures which only affect the avail­ab­il­ity of the pro­cess under con­trol are out­side of the scope of this part of ISO 13849. [SOURCE: IEC 60050 – 191:1990, 04 – 01.]

and the most import­ant one [1]:

3.1.5 dan­ger­ous fail­ure

fail­ure which has the poten­tial to put the SRP/CS in a haz­ard­ous or fail-to-func­tion state

Note 1 to entry: Whether or not the potential is realized can depend on the channel architecture of the system; in redundant systems a dangerous hardware failure is less likely to lead to the overall dangerous or fail-to-function state.

Note 2 to entry: [SOURCE: IEC 61508 – 4, 3.6.7, mod­i­fied.]

Just as a remind­er, SRP/CS stands for “safety-related parts of con­trol sys­tems”.

Failure Math

Failure Rate Data Sources

To do any calculations, we need data, and this is true for failure rates as well. ISO 13849-1 provides some tables in the annexes that list some common types of components and their associated failure rates, and there are more failure rate tables in ISO 13849-2. A word of caution here: Do not mix sources of failure rate data, because the conditions under which another source derived its data won’t match the conditions behind the data in ISO 13849. There are a few good sources of failure rate data out there, for example, MIL-HDBK-217, Reliability Prediction of Electronic Equipment [15], as well as the database maintained by Exida. In any case, use a single source for your failure rate data.

Failure Rate Variables

IEC 61508 [7] defines a num­ber of vari­ables related to fail­ure rates. The lower­case Greek let­ter lambda, \lambda, is used to denote fail­ures.

The com­mon vari­able des­ig­na­tions used are:

\lambda = fail­ures
\lambda_{(t)} = fail­ure rate
\lambda_s = “safe” fail­ures
\lambda_d = “dan­ger­ous” fail­ures
\lambda_{dd} = detect­able “dan­ger­ous” fail­ures
\lambda_{du} = undetect­able “dan­ger­ous” fail­ures

Calculating DC

Of these vari­ables, we only need to con­cern ourselves with \lambda_d, \lambda_{dd} and \lambda_{du}. To under­stand how these vari­ables are used, we can express their rela­tion­ship as

\lambda_d=\lambda_{dd}+\lambda_{du}

Fol­low­ing on that idea, the Dia­gnost­ic Cov­er­age can be expressed as a per­cent­age like this:

DC\%=\frac{\lambda_{dd}}{\lambda_d}\times 100
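
The two relationships above can be sketched in a few lines of Python. The function name and the example failure rates are hypothetical and are not taken from ISO 13849-1 or any data source.

```python
def diagnostic_coverage_percent(lambda_dd: float, lambda_du: float) -> float:
    """Return DC% = lambda_dd / lambda_d * 100, where lambda_d = lambda_dd + lambda_du.

    Both arguments must use the same unit (for example, failures per hour).
    """
    if lambda_dd < 0 or lambda_du < 0:
        raise ValueError("failure rates must be non-negative")
    lambda_d = lambda_dd + lambda_du  # total dangerous failure rate
    if lambda_d == 0:
        raise ValueError("total dangerous failure rate is zero; DC is undefined")
    return lambda_dd / lambda_d * 100.0


# Hypothetical rates in failures per hour
example_dc = diagnostic_coverage_percent(lambda_dd=9.0e-7, lambda_du=1.0e-7)
print(f"DC = {example_dc:.1f} %")
```

Because DC is a ratio, the unit of the failure rates cancels out; only the split between detectable and undetectable dangerous failures matters.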

Determining DC%

If you want to actu­ally cal­cu­late DC%, you have some work ahead of you. Rather than going into the details here, I am going to refer you hard­core types to IEC 61508 – 2, Func­tion­al safety of electrical/electronic/programmable elec­tron­ic safety-related sys­tems – Part 2: Require­ments for electrical/electronic/programmable elec­tron­ic safety-related sys­tems. This stand­ard goes into some depth on how to determ­ine fail­ure rates and how to cal­cu­late the “Safe Fail­ure Frac­tion,” a num­ber which is related to DC but is not the same.
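
To show why the Safe Failure Fraction is related to DC but is not the same number, here is a small sketch. It assumes the usual IEC 61508 form, SFF = (λs + λdd) / (λs + λdd + λdu), and uses made-up failure rates; the function names are illustrative only.

```python
def safe_failure_fraction_percent(lambda_s: float, lambda_dd: float, lambda_du: float) -> float:
    """SFF%: safe plus detected dangerous failures over all failures."""
    total = lambda_s + lambda_dd + lambda_du
    if total <= 0:
        raise ValueError("total failure rate must be positive")
    return (lambda_s + lambda_dd) / total * 100.0


def dc_percent(lambda_dd: float, lambda_du: float) -> float:
    """DC%: detected dangerous failures over all dangerous failures."""
    lambda_d = lambda_dd + lambda_du
    if lambda_d <= 0:
        raise ValueError("total dangerous failure rate must be positive")
    return lambda_dd / lambda_d * 100.0


# Hypothetical rates in failures per hour
lambda_s, lambda_dd, lambda_du = 5.0e-7, 4.0e-7, 1.0e-7
sff = safe_failure_fraction_percent(lambda_s, lambda_dd, lambda_du)
dc = dc_percent(lambda_dd, lambda_du)
print(f"SFF = {sff:.0f} %, DC = {dc:.0f} %")
```

With these numbers the SFF works out to about 90% while the DC is only about 80%, because the safe failures count toward SFF but are left out of DC entirely.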

For every­one else, the good news is that you can use the table in Annex E to estim­ate the DC%. It’s worth not­ing here that Annex E is “Inform­at­ive.” In stand­ards-speak, this means that the inform­a­tion in the annex is not part of the “norm­at­ive” text, which means that it is simply inform­a­tion to help you use the norm­at­ive part of the stand­ard. The design must con­form to the require­ments in the norm­at­ive text if you want to claim con­form­ity to the stand­ard. The fact that [1, Annex E] is inform­at­ive gives you the option to cal­cu­late the DC% value rather than select­ing it from Table E.1. Using the cal­cu­lated value would not viol­ate the require­ments in the norm­at­ive text.

If you are using IFA SISTEMA [16] to do the cal­cu­la­tions for you, you will find that the soft­ware lim­its you to select­ing a single DC meas­ure from Table E.1, and this prin­ciple applies if you are doing the cal­cu­la­tions by hand too. Only one item from Table E.1 can be selec­ted for a giv­en safety func­tion.

Ranking DC

Once you have determined the DC for a safety function, you need to compare the DC value against [1, Table 5] to see if the DC is sufficient for the PLr you are trying to achieve. Table 5 bins the DC results into four ranges. Just as binning the PFHd values into five ranges helps to prevent precision bias in estimating the probability of failure of the complete system or safety function, the ranges in Table 5 help to prevent precision bias in the calculated or selected DC values.

ISO 13849 – 1, Table 5 Dia­gnost­ic cov­er­age (DC)

If the DC value was high enough for the PLr, then you are done with this part of the work. If not, you will need to go back to your design and add addi­tion­al dia­gnost­ic fea­tures so that you can either select a high­er cov­er­age from [1, Table E.1] or cal­cu­late a high­er value using [14].
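
As a quick illustration of the binning, the sketch below maps a DC percentage onto the four ranges. The boundaries (60%, 90% and 99%) are my reading of [1, Table 5]; check them against your copy of the standard before relying on them. The function name is hypothetical.

```python
def dc_range(dc_percent: float) -> str:
    """Return the DC range name for a DC value given in percent.

    Assumed boundaries: None < 60 <= Low < 90 <= Medium < 99 <= High.
    """
    if not 0.0 <= dc_percent <= 100.0:
        raise ValueError("DC must be between 0 and 100 percent")
    if dc_percent < 60.0:
        return "None"
    if dc_percent < 90.0:
        return "Low"
    if dc_percent < 99.0:
        return "Medium"
    return "High"


for value in (45.0, 60.0, 89.9, 90.0, 99.0):
    print(f"DC = {value:5.1f} % -> {dc_range(value)}")
```

Note how 89.9% and 60% land in the same range: carrying extra decimal places in the DC estimate does not change the outcome, which is the point of binning.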

Multiple safety functions

When you have mul­tiple safety func­tions that make up a com­plete safety sys­tem, for example, an emer­gency stop func­tion and a guard inter­lock­ing func­tion, the DC val­ues need to be aver­aged to determ­ine the over­all DC for the com­plete sys­tem. [1, Annex E] provides you with a meth­od to do this in Equa­tion E.1.

DC_{avg}=\frac{\frac{DC_1}{MTTF_{D1}}+\frac{DC_2}{MTTF_{D2}}+\dots+\frac{DC_N}{MTTF_{DN}}}{\frac{1}{MTTF_{D1}}+\frac{1}{MTTF_{D2}}+\dots+\frac{1}{MTTF_{DN}}}

ISO 13849-1:2015, Equation E.1

Plug in the val­ues for MTTFD and DC for each safety func­tion, and cal­cu­late the res­ult­ing DCavg value for the com­plete sys­tem.
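
Here is a minimal sketch of that calculation, assuming Equation E.1 has the form of a weighted average in which each DC value is weighted by 1/MTTFD. The function name and the example values are hypothetical. As I read Annex E, the sum in the standard runs over the individual parts that make up the SRP/CS, so each (MTTFD, DC) pair below stands for one such item.

```python
from typing import Iterable, Tuple


def dc_avg_percent(items: Iterable[Tuple[float, float]]) -> float:
    """Return DCavg in percent for (mttfd_years, dc_percent) pairs.

    Assumed form: DCavg = sum(DC_i / MTTFD_i) / sum(1 / MTTFD_i).
    """
    pairs = list(items)  # allow generators; we iterate twice
    if not pairs:
        raise ValueError("at least one (MTTFD, DC) pair is required")
    if any(mttfd <= 0 for mttfd, _ in pairs):
        raise ValueError("every MTTFD must be positive")
    numerator = sum(dc / mttfd for mttfd, dc in pairs)
    denominator = sum(1.0 / mttfd for mttfd, _ in pairs)
    return numerator / denominator


# Hypothetical values: (MTTFD in years, DC in percent)
example_items = [(50.0, 99.0), (25.0, 60.0)]
dc_avg = dc_avg_percent(example_items)
print(f"DCavg = {dc_avg:.1f} %")
```

The 1/MTTFD weighting means the item with the shorter MTTFD pulls the average toward its own DC, so poor diagnostics on the least reliable item hurt the most. In this example the result is about 73%, well below the simple mean of 79.5%.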

That’s it for this art­icle. The next part will cov­er Com­mon Cause Fail­ures (CCF). Look for it on 20-Mar-17!

In case you missed the first part of the series, you can read it here.

Book List

Here are some books that I think you may find help­ful on this jour­ney:

[0]     B. Main, Risk Assess­ment: Basics and Bench­marks, 1st ed. Ann Arbor, MI USA: DSE, 2004.

[0.1]  D. Smith and K. Simpson, Safety crit­ic­al sys­tems hand­book, 3rd Ed. Ams­ter­dam: Elsevi­er­/But­ter­worth-Heine­mann, 2011.

[0.2]  Elec­tro­mag­net­ic Com­pat­ib­il­ity for Func­tion­al Safety, 1st ed. Steven­age, UK: The Insti­tu­tion of Engin­eer­ing and Tech­no­logy, 2008.

[0.3]  Overview of techniques and measures related to EMC for Functional Safety, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2013.

References

Note: This ref­er­ence list starts in Part 1 of the series, so “miss­ing” ref­er­ences may show in oth­er parts of the series. Included in the last post of the series is the com­plete ref­er­ence list.

[1]     Safety of machinery — Safety-related parts of con­trol sys­tems — Part 1: Gen­er­al prin­ciples for design. 3rd Edi­tion. ISO Stand­ard 13849 – 1. 2015.

[7]     Func­tion­al safety of electrical/electronic/programmable elec­tron­ic safety-related sys­tems. 7 parts. IEC Stand­ard 61508. Edi­tion 2. 2010.

[14]   Func­tion­al safety of electrical/electronic/programmable elec­tron­ic safety-related sys­tems – Part 2: Require­ments for electrical/electronic/programmable elec­tron­ic safety-related sys­tems. IEC Stand­ard 61508 – 2. 2010.

[15]     Reli­ab­il­ity Pre­dic­tion of Elec­tron­ic Equip­ment. Mil­it­ary Hand­book MIL-HDBK-217F. 1991.

[16]     “IFA – Prac­tic­al aids: Soft­ware-Assist­ent SISTEMA: Safety Integ­rity – Soft­ware Tool for the Eval­u­ation of Machine Applic­a­tions”, Dguv.de, 2017. [Online]. Avail­able: http://www.dguv.de/ifa/praxishilfen/practical-solutions-machine-safety/software-sistema/index.jsp. [Accessed: 30- Jan- 2017].
