ISO 13849 – 1 Analysis — Part 5: Diagnostic Coverage (DC)

A graph showing the theoretical "bathtub curve" for product failure rate over the lifetime of the product.
This entry is part 5 of 9 in the series How to do a 13849 – 1 ana­lys­is

What is Diagnostic Coverage?

Understanding Diagnostic Coverage (DC) as it is used in ISO 13849 – 1 [1] is crit­ic­al to ana­lys­ing the design of any safety func­tion assessed using this stand­ard. In case you missed a pre­vi­ous part of the series, you can read it here.

In the last instal­ment of this series dis­cuss­ing MTTFD, I brought up the fact that everything fails even­tu­ally, and so everything has a nat­ur­al fail­ure rate. The bathtub curve shown at the top of this post shows a typ­ic­al fail­ure rate curve for most products. Failure rates tell you the aver­age time (or some­times the mean time) it takes for com­pon­ents or sys­tems to fail. Failure rates are expressed in many ways, MTTFD and PFHd being the ways rel­ev­ant to this dis­cus­sion of ISO 13849 ana­lys­is. MTTFis giv­en in years, and PFHd is giv­en in frac­tion­al hours (1/​h). As a remind­er, PFHd stands for “Probability of dan­ger­ous Failure per Hour”.

Three of the stand­ard archi­tec­tures include auto­mat­ic dia­gnost­ic func­tions, Categories 2, 3 and 4. As soon as we add dia­gnostics to the sys­tem, we need to know what faults the dia­gnostics can detect and how many of the dan­ger­ous fail­ures rel­at­ive to the total num­ber of fail­ures that rep­res­ents. Diagnostic Coverage (DC) rep­res­ents the ratio of dan­ger­ous fail­ures that can be detec­ted to the total dan­ger­ous fail­ures that could occur, expressed as a per­cent­age. There will be some fail­ures that do not res­ult in a dan­ger­ous fail­ure, and those fail­ures are excluded from DC because we don’t need to worry about them – if they occur, the sys­tem will not fail into a dan­ger­ous state.

Here’s the form­al defin­i­tion from [1]:

3.1.26 dia­gnost­ic cov­er­age (DC)

meas­ure of the effect­ive­ness of dia­gnostics, which may be determ­ined as the ratio between the fail­ure rate of detec­ted dan­ger­ous fail­ures and the fail­ure rate of total dan­ger­ous fail­ures

Note 1 to entry: Diagnostic cov­er­age can exist for the whole or parts of a safety-​related sys­tem. For example, dia­gnost­ic cov­er­age could exist for sensors and/​or logic sys­tem and/​or final ele­ments. [SOURCE: IEC 61508 – 4:1998, 3.8.6, mod­i­fied.]

That brings up two oth­er related defin­i­tions that need to be kept in mind [1]:

3.1.4 fail­ure

ter­min­a­tion of the abil­ity of an item to per­form a required func­tion

Note 1 to entry: After a fail­ure, the item has a fault.

Note 2 to entry: “Failure” is an event, as dis­tin­guished from “fault”, which is a state.

Note 3 to entry: The concept as defined does not apply to items con­sist­ing of soft­ware only.

Note 4 to entry: Failures which only affect the avail­ab­il­ity of the pro­cess under con­trol are out­side of the scope of this part of ISO 13849. [SOURCE: IEC 60050 – 191:1990, 04 – 01.]

and the most import­ant one [1]:

3.1.5 dan­ger­ous fail­ure

fail­ure which has the poten­tial to put the SRP/​CS in a haz­ard­ous or fail-​to-​function state

Note 1 to entry: Whether or not the poten­tial is real­ized can depend on the chan­nel archi­tec­ture of the sys­tem; in redund­ant sys­tems a dan­ger­ous hard­ware fail­ure is less likely to lead to the over­all dan­ger­ous or fail-​to- func­tion state.

Note 2 to entry: [SOURCE: IEC 61508 – 4, 3.6.7, mod­i­fied.]

Just as a remind­er, SRP/​CS stands for “safety-​related parts of con­trol sys­tems”.

Failure Math

Failure Rate Data Sources

To do any cal­cu­la­tions, we need data, and this is true for fail­ure rates as well. ISO 13849 – 1 provides some tables in the annexes that list some com­mon types of com­pon­ents and their asso­ci­ated fail­ure rates, and there are more fail­ure rate tables in ISO 13849 – 2. A word of cau­tion here: Do not mix sources of fail­ure rate data, as the con­di­tions under which that data is true won’t match the data in ISO 13849. There are a few good sources of fail­ure rate data out there, for example, MIL-​HDBK-​217, Reliability Prediction of Electronic Equipment [15], as well as the data­base main­tained by Exida. In any case, use a single source for your fail­ure rate data.

Failure Rate Variables

IEC 61508 [7] defines a num­ber of vari­ables related to fail­ure rates. The lower­case Greek let­ter lambda, \lambda, is used to denote fail­ures.

The com­mon vari­able des­ig­na­tions used are:

\lambda = fail­ures
\lambda_{(t)} = fail­ure rate
\lambda_s = “safe” fail­ures
\lambda_d = “dan­ger­ous” fail­ures
\lambda_{dd} = detect­able “dan­ger­ous” fail­ures
\lambda_{du} = undetect­able “dan­ger­ous” fail­ures

Calculating DC

Of these vari­ables, we only need to con­cern ourselves with \lambda_d, \lambda_{dd} and \lambda_{du}. To under­stand how these vari­ables are used, we can express their rela­tion­ship as

\lambda_d=\lambda_{dd}+\lambda_{du}

Following on that idea, the Diagnostic Coverage can be expressed as a per­cent­age like this:

DC\%=\frac{\lambda_{dd}}{\lambda_d}\times 100

Determining DC%

If you want to actu­ally cal­cu­late DC%, you have some work ahead of you. Rather than going into the details here, I am going to refer you hard­core types to IEC 61508 – 2, Functional safety of electrical/​electronic/​programmable elec­tron­ic safety-​related sys­tems – Part 2: Requirements for electrical/​electronic/​programmable elec­tron­ic safety-​related sys­tems. This stand­ard goes into some depth on how to determ­ine fail­ure rates and how to cal­cu­late the “Safe Failure Fraction,” a num­ber which is related to DC but is not the same.

For every­one else, the good news is that you can use the table in Annex E to estim­ate the DC%. It’s worth not­ing here that Annex E is “Informative.” In standards-​speak, this means that the inform­a­tion in the annex is not part of the “norm­at­ive” text, which means that it is simply inform­a­tion to help you use the norm­at­ive part of the stand­ard. The design must con­form to the require­ments in the norm­at­ive text if you want to claim con­form­ity to the stand­ard. The fact that [1, Annex E] is inform­at­ive gives you the option to cal­cu­late the DC% value rather than select­ing it from Table E.1. Using the cal­cu­lated value would not viol­ate the require­ments in the norm­at­ive text.

If you are using IFA SISTEMA [16] to do the cal­cu­la­tions for you, you will find that the soft­ware lim­its you to select­ing a single DC meas­ure from Table E.1, and this prin­ciple applies if you are doing the cal­cu­la­tions by hand too. Only one item from Table E.1 can be selec­ted for a giv­en safety func­tion.

Ranking DC

Once you have determ­ined the DC for a safety func­tion, you need to com­pare the DC value against [1, Table 5] to see if the DC is suf­fi­cient for the PLr you are try­ing to achieve. Table 5 bins the DC res­ults into four ranges. Just like bin­ning the PFHd val­ues into five ranges helps to pre­vent pre­ci­sion bias in estim­at­ing the prob­ab­il­ity of fail­ure of the com­plete sys­tem or safety func­tion, the ranges in Table 5 helps to pre­vent pre­ci­sion bias in the cal­cu­lated or selec­ted DC val­ues.

ISO 13849-1, Table 5 Diagnostic coverage (DC)
ISO 13849 – 1, Table 5 Diagnostic cov­er­age (DC)

If the DC value was high enough for the PLr, then you are done with this part of the work. If not, you will need to go back to your design and add addi­tion­al dia­gnost­ic fea­tures so that you can either select a high­er cov­er­age from [1, Table E.1] or cal­cu­late a high­er value using [14].

Multiple safety functions

When you have mul­tiple safety func­tions that make up a com­plete safety sys­tem, for example, an emer­gency stop func­tion and a guard inter­lock­ing func­tion, the DC val­ues need to be aver­aged to determ­ine the over­all DC for the com­plete sys­tem. [1, Annex E] provides you with a meth­od to do this in Equation E.1.

Equation for averaging the DC values of multiple safety functions
ISO 13849 – 1-​2015 Equation E.1

Plug in the val­ues for MTTFD and DC for each safety func­tion, and cal­cu­late the res­ult­ing DCavg value for the com­plete sys­tem.

That’s it for this art­icle. The next part will cov­er Common Cause Failures (CCF). Look for it on 20-​Mar-​17!

In case you missed the first part of the series, you can read it here.

Book List

Here are some books that I think you may find help­ful on this jour­ney:

[0]     B. Main, Risk Assessment: Basics and Benchmarks, 1st ed. Ann Arbor, MI USA: DSE, 2004.

[0.1]  D. Smith and K. Simpson, Safety crit­ic­al sys­tems hand­book, 3rd Ed. Amsterdam: Elsevier/​Butterworth-​Heinemann, 2011.

[0.2]  Electromagnetic Compatibility for Functional Safety, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2008.

[0.3]  Overview of tech­niques and meas­ures related to EMC for Functional Safety, 1st ed. Stevenage, UK: Overview of tech­niques and meas­ures related to EMC for Functional Safety, 2013.

References

Note: This ref­er­ence list starts in Part 1 of the series, so “miss­ing” ref­er­ences may show in oth­er parts of the series. Included in the last post of the series is the com­plete ref­er­ence list.

[1]     Safety of machinery — Safety-​related parts of con­trol sys­tems — Part 1: General prin­ciples for design. 3rd Edition. ISO Standard 13849 – 1. 2015.

[7]     Functional safety of electrical/​electronic/​programmable elec­tron­ic safety-​related sys­tems. 7 parts. IEC Standard 61508. Edition 2. 2010.

[14]   Functional safety of electrical/​electronic/​programmable elec­tron­ic safety-​related sys­tems – Part 2: Requirements for electrical/​electronic/​programmable elec­tron­ic safety-​related sys­tems. IEC Standard 61508 – 2. 2010.

[15]     Reliability Prediction of Electronic Equipment. Military Handbook MIL-​HDBK-​217F. 1991.

[16]     “IFA – Practical aids: Software-​Assistent SISTEMA: Safety Integrity – Software Tool for the Evaluation of Machine Applications”, Dguv​.de, 2017. [Online]. Available: http://​www​.dguv​.de/​i​f​a​/​p​r​a​x​i​s​h​i​l​f​e​n​/​p​r​a​c​t​i​c​a​l​-​s​o​l​u​t​i​o​n​s​-​m​a​c​h​i​n​e​-​s​a​f​e​t​y​/​s​o​f​t​w​a​r​e​-​s​i​s​t​e​m​a​/​i​n​d​e​x​.​jsp. [Accessed: 30- Jan- 2017].

Series NavigationISO 13849 – 1 Analysis — Part 4: MTTFD – Mean Time to Dangerous FailureISO 13849 – 1 Analysis — Part 6: CCF — Common Cause Failures

Author: Doug Nix

+DougNix is Managing Director and Principal Consultant at Compliance InSight Consulting, Inc. (http://www.complianceinsight.ca) in Kitchener, Ontario, and is Lead Author and Managing Editor of the Machinery Safety 101 blog.

Doug's work includes teaching machinery risk assessment techniques privately and through Conestoga College Institute of Technology and Advanced Learning in Kitchener, Ontario, as well as providing technical services and training programs to clients related to risk assessment, industrial machinery safety, safety-related control system integration and reliability, laser safety and regulatory conformity.

Follow me on Academia.edu//a.academia-assets.com/javascripts/social.js