ISO 13849-1 Analysis — Part 8: Fault Exclusion

Post updated 2019-07-24. Ed.

Fault Consideration & Fault Exclusion

ISO 13849-1, Clause 7 [1, 7] discusses the need for fault consideration and fault exclusion. Fault consideration is examining the components and sub-systems used in the safety-related part of the control system (SRP/CS) and making a list of all possible faults. This is a non-trivial exercise!

Thinking back to some of the earlier articles in this series where I mentioned the different types of faults, you may recall that there are detectable and undetectable faults, and there are safe and dangerous faults, leading us to four kinds of faults:

  • Safe undetectable faults
  • Dangerous undetectable faults
  • Safe detectable faults
  • Dangerous detectable faults
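
If it helps to see the two-by-two classification written out, here is a minimal Python sketch. The names `FaultClass` and `classify` are my own illustration and do not come from the standard.

```python
from enum import Enum


class FaultClass(Enum):
    """The four kinds of faults from the list above."""
    SAFE_UNDETECTABLE = "safe undetectable"
    DANGEROUS_UNDETECTABLE = "dangerous undetectable"
    SAFE_DETECTABLE = "safe detectable"
    DANGEROUS_DETECTABLE = "dangerous detectable"


def classify(dangerous: bool, detectable: bool) -> FaultClass:
    """Map the two yes/no questions onto one of the four fault classes."""
    if dangerous:
        if detectable:
            return FaultClass.DANGEROUS_DETECTABLE
        return FaultClass.DANGEROUS_UNDETECTABLE
    if detectable:
        return FaultClass.SAFE_DETECTABLE
    return FaultClass.SAFE_UNDETECTABLE
```

Each entry in a fault list then carries exactly one of these four labels, which makes it easy to pull out the dangerous undetectable faults for closer attention.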

For systems with no diagnostics, i.e., Categories B and 1, faults need to be eliminated using inherently safe design techniques, particularly through the use of well-tried components and over-dimensioning. Care needs to be taken when classifying components as “well-tried” versus using a fault exclusion, as components that might normally be considered “well-tried” might not meet those requirements in every application. [2, Annex A], Validation tools for mechanical systems, discusses the concepts of “Basic Safety Principles,” “Well-Tried Safety Principles,” and “Well-Tried Components.” [2, Annex A] also provides examples of faults and relevant fault exclusion criteria. There are similar annexes that cover pneumatic systems [2, Annex B], hydraulic systems [2, Annex C], and electrical systems [2, Annex D].

For systems where diagnostics are part of the design, i.e., Categories 2, 3, and 4, the fault lists are the starting point for evaluating the diagnostic coverage (DC) of the test systems. Depending on the architecture, certain levels of DC are required to meet the relevant PL, see [1, Fig. 5]. The fault lists are also inputs to the hardware and software designs, because they identify the faults that the diagnostics need to detect. The diagnostics must detect the dangerous faults that you have classified as detectable, and the resulting DC must be high enough to meet the PLr for the safety function.
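
To make the link between a fault list and DC concrete, here is a minimal Python sketch. It assumes you have already estimated a dangerous failure rate for each fault in the list and decided which faults the diagnostics detect. The function names and the example numbers are hypothetical. DC is taken as the ratio of the detected dangerous failure rate to the total dangerous failure rate, and the band limits (60 %, 90 %, 99 %) are the DC ranges used in ISO 13849-1 [1].

```python
def diagnostic_coverage(lambda_dd: float, lambda_du: float) -> float:
    """DC = detected dangerous failure rate / total dangerous failure rate."""
    total = lambda_dd + lambda_du
    if total <= 0:
        raise ValueError("total dangerous failure rate must be positive")
    return lambda_dd / total


def dc_band(dc: float) -> str:
    """Return the ISO 13849-1 DC range that a DC value falls into."""
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


# Hypothetical fault list: (fault, dangerous failure rate, detected by diagnostics?)
# The rates are in arbitrary but consistent units and are made up for illustration.
fault_list = [
    ("contactor fails to open", 40.0, True),
    ("valve spool sticks", 55.0, True),
    ("short circuit in wiring", 5.0, False),
]

lambda_dd = sum(rate for _, rate, detected in fault_list if detected)
lambda_du = sum(rate for _, rate, detected in fault_list if not detected)
dc = diagnostic_coverage(lambda_dd, lambda_du)
band = dc_band(dc)
```

With these made-up numbers, 95 of 100 units of dangerous failure rate are detected, so the DC falls in the medium range. Real calculations need failure rate data you can defend.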

The fault lists and exclusions are also used in the Validation portion of this process. At the start of the Validation process flowchart [2, Fig. 1], you can see how the fault lists and the criteria used for fault exclusion are used as inputs to the validation plan.

Figure: Start of ISO 13849-2 Fig. 1, showing the first few stages in the ISO 13849-2 validation process.

Faults that can be excluded do not need to be validated, saving time and effort during the system verification and validation (V & V). How is this done?

Fault Consideration

The first step is to develop a list of potential faults that could occur, based on the components and subsystems included in the SRP/CS. ISO 13849-2 [2] includes lists of typical faults for various technologies. For example, [2, Table A.4] is the fault list for mechanical components.

Mechanical fault list from ISO 13849-2
Table A.4 – Faults and fault exclusions – Mechanical devices, components and elements
(e.g. cam, follower, chain, clutch, brake, shaft, screw, pin, guide, bearing)

[2] contains tables similar to Table A.4 for:

  • Pressure-coil springs
  • Directional control valves
  • Stop (shut-off) valves/non-return (check) valves/quick-action venting valves/shuttle valves, etc.
  • Flow valves
  • Pressure valves
  • Pipework
  • Hose assemblies
  • Connectors
  • Pressure transmitters and pressure medium transducers
  • Compressed air treatment — Filters
  • Compressed-air treatment — Oilers
  • Compressed air treatment — Silencers
  • Accumulators and pressure vessels
  • Sensors
  • Fluidic Information processing — Logical elements
  • etc.

As you can see, many different types of faults need to be considered. Remember that I did not give you all the different fault lists – this post would be a mile long if I did that! You need to develop a fault list for your system and then consider each fault’s impact on the system’s operation. If you have components or subsystems not listed in the tables, you must develop your own fault lists for those items. Failure Modes and Effects Analysis (FMEA) can be used to develop fault lists for these components [23], [24].
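
A first pass at this can be as simple as sorting your component types into those with a ready-made fault table in [2] and those that need their own analysis. The sketch below is only an illustration: the lookup contains just the two tables named on this page, the component type names are my own labels, and `split_by_fault_list_source` is a hypothetical helper.

```python
# Illustrative, incomplete lookup: component type -> fault table in ISO 13849-2.
# Only the two tables named in this post are listed; extend it from your own
# copy of the standard.
STANDARD_FAULT_TABLES = {
    "mechanical component": "Table A.4",
    "hydraulic hose assembly": "Table C.8",
}


def split_by_fault_list_source(component_types):
    """Separate components with a standard fault table from those needing an FMEA."""
    covered = {}
    needs_fmea = []
    for component_type in component_types:
        table = STANDARD_FAULT_TABLES.get(component_type)
        if table is None:
            needs_fmea.append(component_type)
        else:
            covered[component_type] = table
    return covered, needs_fmea


covered, needs_fmea = split_by_fault_list_source(
    ["mechanical component", "custom torque sensor", "hydraulic hose assembly"]
)
```

Here the hypothetical “custom torque sensor” ends up in `needs_fmea`, which is the cue to develop a fault list for it yourself.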

When considering the faults to be included in the list, there are a few things that should be considered [1, 7.2]:

  • If other faults develop as a consequence of the first fault, the first fault and all of the faults that follow from it can be grouped together as a single fault.
  • Two or more separate faults with a common cause can be considered as a single fault.
  • The simultaneous occurrence of two or more faults with separate causes is considered highly unlikely and does not need to be considered.

Examples

#1 – Voltage Regulator

A voltage regulator fails in a system power supply so that the 24 Vdc output rises to an unregulated 36 Vdc (the internal power supply bus voltage), and after some time has passed, two sensors fail. All three failures can be grouped and considered as a single fault because they originate in a single failure in the voltage regulator.

#2 – Lightning Strike

If a lightning strike occurs on the power line and the resulting surge voltage on the 400 V mains causes an interposing contactor and the motor drive it controls to fail to danger, these failures may be grouped and considered as one. Again, a single event causes all of the subsequent failures.

#3 – Pneumatic System Lubrication

3a – A pneumatic lubricator runs out of lubricant and is not refilled, depriving downstream pneumatic components of lubrication.

3b – The spool on the system dump valve sticks open because it is not cycled often enough.

These two failures have different causes, so there is no need to consider them occurring simultaneously: the probability of both happening concurrently is extremely small. One caution: these two faults MAY share a common cause, namely poor maintenance. If that is true, treat them as two faults with a common cause and group them as a single fault.
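
The grouping logic in these examples can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `ObservedFailure` record and the root-cause labels are hypothetical, and deciding what the root cause really is remains an engineering judgment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedFailure:
    item: str        # what failed
    root_cause: str  # the analyst's judgment of the originating cause


def group_by_root_cause(failures):
    """Group failures that share a root cause so each group counts as one fault."""
    groups = defaultdict(list)
    for failure in failures:
        groups[failure.root_cause].append(failure.item)
    return dict(groups)


failures = [
    # Example #1: everything traces back to the voltage regulator.
    ObservedFailure("24 Vdc output rises to 36 Vdc", "voltage regulator failure"),
    ObservedFailure("sensor 1 fails", "voltage regulator failure"),
    ObservedFailure("sensor 2 fails", "voltage regulator failure"),
    # Example #3: two failures judged to have separate causes.
    ObservedFailure("downstream components lose lubrication", "lubricator not refilled"),
    ObservedFailure("dump valve spool sticks open", "valve not cycled often enough"),
]

single_faults = group_by_root_cause(failures)
```

The five failures collapse into three single faults. If you decide that 3a and 3b both come from poor maintenance and label them that way, they collapse into one, leaving two single faults.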

Fault Exclusion

Once you have your well-considered fault lists together, the next question is, “Can any of the listed faults be excluded?” This is a tricky question! There are a few points to consider:

  • Does the system architecture allow for fault exclusion?
  • Is the fault technically improbable, even if it is possible?
  • Does experience show that the fault is unlikely to occur?*
  • Do the technical requirements related to the application and the hazard support fault exclusion?

* BE CAREFUL with this one!

Whenever faults are excluded, a detailed justification must be included in the system design documentation. Simply deciding that the fault can be excluded is NOT ENOUGH! Consider the risk a person will be exposed to if the fault occurs. If the severity is very high, i.e., severe permanent injury or death, you may not want to exclude the fault even if you think you could. Careful consideration of the resulting injury scenario is needed.

Personal experience alone is seldom considered an adequate basis for a fault exclusion, which is why I added the asterisk (*) above. Look for good statistical data to support any fault exclusion decision.
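
One way to keep yourself honest is to record every exclusion in a structure that makes a missing justification obvious. The following Python sketch is my own illustration, not a format required by the standard. The `FaultExclusion` record, its field names, and the wording of the findings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FaultExclusion:
    fault: str
    justification: str = ""                        # documented technical argument
    evidence: list = field(default_factory=list)   # e.g. data sources, table references
    high_severity: bool = False                    # severe permanent injury or death possible

    def review_findings(self):
        """Return a list of reasons this exclusion needs more work (empty if none)."""
        findings = []
        if not self.justification.strip():
            findings.append("no documented justification")
        if not self.evidence:
            findings.append("no supporting data cited")
        if self.high_severity:
            findings.append("high severity: reconsider excluding this fault")
        return findings


undocumented = FaultExclusion(fault="shaft breakage")
documented = FaultExclusion(
    fault="shaft breakage",
    justification="Shaft over-dimensioned for the maximum load case.",
    evidence=["design calculation (hypothetical reference)"],
)
```

An empty findings list does not prove the exclusion is valid. It only shows that the minimum paperwork exists, and the engineering argument still has to stand up on its own.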

There is much more information on fault exclusion in IEC 61508-2 [14], and there is good information in some of the books listed below [0.1], [0.2], and [0.3]. If you know of additional resources you would like to share, please post the information in the comments!

The final part of this series, Part 9, is the complete reference list for the series.


Definitions

3.1.3
fault
state of an item characterized by the inability to perform a required function, excluding the inability during preventive maintenance or other planned actions, or due to lack of external resources
Note 1 to entry: A fault is often the result of a failure of the item itself, but may exist without prior failure.
Note 2 to entry: In this part of ISO 13849, “fault” means random fault. [SOURCE: IEC 60050-191:1990, 05-01.]

Book List

Here are some books that I think you may find helpful on this journey:

[0]     B. Main, Risk Assessment: Basics and Benchmarks, 1st ed. Ann Arbor, MI USA: DSE, 2004.

[0.1]  D. Smith and K. Simpson, Safety critical systems handbook. Amsterdam: Elsevier/Butterworth-Heinemann, 2011.

[0.2]  Electromagnetic Compatibility for Functional Safety, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2008.

[0.3] Overview of techniques and measures related to EMC for Functional Safety, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2013.

[0.4] Code of practice for electromagnetic resilience, 1st ed. Stevenage, UK: IET Standards TC4.3 EMC, 2017.

[0.5] Code of Practice: Competence for Safety Related Systems Practitioners, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2016.


References

Note: This reference list starts in Part 1 of the series, so “missing” references may show in other parts of the series. Included in the last post of the series is the complete reference list.

[1]     Safety of machinery — Safety-related parts of control systems — Part 1: General principles for design, 3rd Ed. ISO 13849-1. 2015.

[2]     Safety of machinery — Safety-related parts of control systems — Part 2: Validation. 2nd Ed. ISO 13849-2. 2012.

[3]     Safety of machinery — General principles for design — Risk assessment and risk reduction, ISO 12100. 2010.

[4]     Safeguarding of Machinery, 2nd Ed. CSA Z432. 2004.

[5]     Risk Assessment and Risk Reduction — A Guideline to Estimate, Evaluate and Reduce Risks Associated with Machine Tools. ANSI Technical Report B11.TR3. 2000.

[6]    Safety of machinery — Emergency stop function — Principles for design, ISO 13850. 2015.

[7]     Functional safety of electrical/electronic/programmable electronic safety-related systems, seven parts. IEC 61508. Ed. 2. 2010.

[8]     S. Jocelyn, J. Baudoin, Y. Chinniah, and P. Charpentier, “Feasibility study and uncertainties in the validation of an existing safety-related control circuit with the ISO 13849-1:2006 design standard,” Reliab. Eng. Syst. Saf., vol. 121, pp. 104-112, Jan. 2014.

[9]     Guidance on the application of ISO 13849-1 and IEC 62061 in the design of safety-related control systems for machinery, IEC/TR 62061-1. 2010.

[10]     Safety of machinery — Functional safety of safety-related electrical, electronic and programmable electronic control systems, IEC 62061. 2005.

[11]     Guidance on the application of ISO 13849-1 and IEC 62061 in the design of safety-related control systems for machinery, IEC/TR 62061-1. 2010.

[12]    D. S. G. Nix, Y. Chinniah, F. Dosio, M. Fessler, F. Eng, and F. Schrever, “Linking Risk and Reliability—Mapping the output of risk assessment tools to functional safety requirements for safety-related control systems,” Kitchener: Compliance inSight Consulting Inc. 2015.

[13]     Safety of machinery—Safety-related parts of control systems. General principles for design, CEN EN 954-1. European Committee for Standardization (CEN). 1996.

[14]    Functional safety of electrical/electronic/programmable electronic safety-related systems — Part 2: Requirements for electrical/electronic/programmable electronic safety-related systems, IEC 61508-2. 2010.

[15]     Reliability Prediction of Electronic Equipment. Military Handbook MIL-HDBK-217F. 1991.

[16]     “IFA – Practical aids: Software-Assistant SISTEMA: Safety Integrity – Software Tool for the Evaluation of Machine Applications”, dguv.de, 2017. [Online]. Available: http://www.dguv.de/ifa/praxishilfen/practical-solutions-machine-safety/software-sistema/index.jsp. [Accessed: 30- Jan- 2017].

[17]      “failure mode”, 192-03-17, International Electrotechnical Vocabulary. IEC International Electrotechnical Commission, Geneva, 2015.

[18]      M. Gentile and A. E. Summers, “Common Cause Failure: How Do You Manage Them?,” Process Saf. Prog., vol. 25, no. 4, pp. 331-338, 2006.

[19]     Out of Control—Why control systems go wrong and how to prevent failure, 2nd ed. Richmond, Surrey, UK: HSE Health and Safety Executive, 2003.

[20]     Safeguarding of Machinery. 3rd Ed. CSA Z432. Canadian Standards Association (CSA). 2016.

[21]     O. Reg. 851, INDUSTRIAL ESTABLISHMENTS. Ontario, Canada, 1990.

[22]     “Field-programmable gate array”, En.wikipedia.org, 2017. [Online]. Available: https://en.wikipedia.org/wiki/Field-programmable_gate_array. [Accessed: 16-Jun-2017].

[23]     Analysis techniques for system reliability — Procedure for failure mode and effects analysis (FMEA). 2nd Ed. IEC 60812. International Electrotechnical Commission (IEC). 2018.

[24]     “Failure mode and effects analysis”, en.wikipedia.org, 2017. [Online]. Available: https://en.wikipedia.org/wiki/Failure_mode_and_effects_analysis. [Accessed: 16-Jun-2017].

© 2017 – 2022, Compliance inSight Consulting Inc. Creative Commons Licence
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

2 thoughts on “ISO 13849-1 Analysis — Part 8: Fault Exclusion

  1. My biggest problem with FE is hydraulic hoses. ISO 13849-2 states that fault exclusion cannot be applied to them, but getting calculation values for them is really difficult. We cannot use pipes in mobile machines, so that is not a solution either.

    1. Hi Joni,
      You are correct. ISO 13849-2:2012, Table C.8 does not permit fault exclusions for the “Bursting, tearing off at the fitting attachment and leakage” of hydraulic hoses. The rationale for not permitting these fault exclusions on flexible hydraulic hoses is that hydraulics are often used in harsh conditions. Since hydraulic systems operate at high pressures, hose failures can be immediately catastrophic. Hydraulic fluids are often flammable, so aerosolized oil from cracks and pinhole leaks can pose a significant fire risk. Many replacement hose assembly vendors do not include lifetime data with their assemblies. Although some hydraulic machine manufacturers do include hose inspection and replacement requirements in the preventative maintenance schedules for their products, many do not. Due to the cost, many machine owners choose not to replace the hoses at the required intervals. The consequence of all of this is that fault exclusions for these assemblies are likely to create significant risk to the equipment user.

      If you have sufficient failure data related to your specific product(s) as used in typical conditions, you might be able to develop a fault exclusion justification. This is a non-trivial task, as significant amounts of data and statistical analysis of that data are required to develop a justification like this. For guidance, I’d recommend reading the “proven in use” material in IEC 61508-2. While that standard is scoped for electrotechnical products, the methodologies can be applied to any components.

  2. I have always been very uncomfortable about fault exclusion.

    Either you are saying that you totally trust another company’s QA, and the comprehensive knowledge and discipline of all the designers and testers, or you are saying you think you know so much about something, there is nothing at all you don’t know.

    When put in that context I have always felt any gains from fault exclusion struggle to outweigh the certainty of the lack of unknowns.

    I feel it is far easier, even if only for your own sleep at night, to avoid exclusion wherever practical, if not possible.

    1. Hi Gareth,
      I think your position is more black-and-white than the one I would take; however, it does take some significant effort to justify a fault exclusion. If the component you are considering for fault exclusion is described in the tables in ISO 13849-2, then the justification is reasonably straightforward. If, however, you want to justify fault exclusion in a component not listed in Part 2, then you need to do your homework. I’d start with an FMEA, and then back that up with a fault-tree analysis. Once you are done with those two steps, you will have either convinced yourself that the fault exclusion is justifiable, or you will have realized that you can’t adequately justify it. In either case, you will have the basis for your decision documented, which is always important. If, after all that, you still don’t want to use fault exclusion, there’s nothing wrong with that. As a control systems designer, it is always your right to make the decision you feel most comfortable with.
