Post updated 2019-07-24. Ed.
Fault Consideration & Fault Exclusion
ISO 13849-1, Clause 7 [1, 7] discusses the need for fault consideration and fault exclusion. Fault consideration is examining the components and sub-systems used in the safety-related part of the control system (SRP/CS) and making a list of all possible faults. This is a non-trivial exercise!
Thinking back to some of the earlier articles in this series where I mentioned the different types of faults, you may recall that there are detectable and undetectable faults, and there are safe and dangerous faults, leading us to four kinds of faults:
- Safe undetectable faults
- Dangerous undetectable faults
- Safe detectable faults
- Dangerous detectable faults
For systems with no diagnostics, i.e., Category B and 1, faults need to be eliminated using inherently safe design techniques, particularly through the use of well-tried components and over-dimensioning. Care needs to be taken when classifying components as “well-tried” versus using a fault exclusion, as components that might normally be considered “well-tried” might not meet those requirements in every application. [2, Annex A], Validation tools for mechanical systems, discusses the concepts of “Basic Safety Principles,” “Well-Tried Safety Principles,” and “Well-tried components.” [2, Annex A] also provides examples of faults and relevant fault exclusion criteria. There are similar Annexes that cover pneumatic systems [2, Annex B], hydraulic systems [2, Annex C], and electrical systems [2, Annex D].
For systems where diagnostics are part of the design, i.e., Category 2, 3, and 4, the fault lists are used to evaluate the diagnostic coverage (DC) of the test systems. Depending on the architecture, certain levels of DC are required to meet the relevant PL, see [1, Fig. 5]. The fault lists are the starting point for determining DC. The values from the lists are inputs for the hardware and software designs. The diagnostics must cover all dangerous detectable faults, and the DC must be high enough to meet the PLr for the safety function.
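As a rough illustration of how a fault list feeds the DC estimate, the sketch below computes DC as the ratio of the detected dangerous failure rate to the total dangerous failure rate, then maps the result to the ranges used in ISO 13849-1 (none, low, medium, high). The fault names and failure rates are invented for the example, and a real assessment would use the DC estimates and averaging method given in [1] rather than this simplified single-channel calculation.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    name: str
    dangerous: bool   # True if the fault leads to loss of the safety function
    detected: bool    # True if the diagnostics reveal the fault
    rate_fit: int     # assumed failure rate in FIT (failures per 1e9 h), illustrative only


def diagnostic_coverage(faults):
    """DC = detected dangerous failure rate / total dangerous failure rate."""
    total_dangerous = sum(f.rate_fit for f in faults if f.dangerous)
    detected_dangerous = sum(
        f.rate_fit for f in faults if f.dangerous and f.detected
    )
    if total_dangerous == 0:
        raise ValueError("no dangerous faults in the list")
    return detected_dangerous / total_dangerous


def dc_range(dc):
    """Map a DC value to the ranges used in ISO 13849-1."""
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


# Hypothetical fault list for one channel
faults = [
    Fault("contactor welds closed", dangerous=True, detected=True, rate_fit=950),
    Fault("sensor output stuck high", dangerous=True, detected=False, rate_fit=50),
    Fault("sensor output stuck low", dangerous=False, detected=True, rate_fit=500),
]

dc = diagnostic_coverage(faults)
print(f"DC = {dc:.0%} ({dc_range(dc)})")
```

Note that the safe fault does not enter the calculation at all; only dangerous faults count toward DC.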
The fault lists and exclusions are also used in the Validation portion of this process. At the start of the Validation process flowchart [2, Fig. 1], you can see how the fault lists and the criteria used for fault exclusion are used as inputs to the validation plan.
Faults that can be excluded do not need to be validated, saving time and effort during the system verification and validation (V & V). How is this done?
Fault Consideration
The first step is to develop a list of potential faults that could occur, based on the components and subsystems included in the SRP/CS. ISO 13849-2 [2] includes lists of typical faults for various technologies. For example, [2, Table A.4] is the fault list for mechanical components.
[Table A.4 from [2]: fault list for mechanical components (e.g., cam, follower, chain, clutch, brake, shaft, screw, pin, guide, bearing)]
[2] contains tables similar to Table A.4 for:
- Pressure-coil springs
- Directional control valves
- Stop (shut-off) valves/non-return (check) valves/quick-action venting valves/shuttle valves, etc.
- Flow valves
- Pressure valves
- Pipework
- Hose assemblies
- Connectors
- Pressure transmitters and pressure medium transducers
- Compressed air treatment — Filters
- Compressed-air treatment — Oilers
- Compressed air treatment — Silencers
- Accumulators and pressure vessels
- Sensors
- Fluidic Information processing — Logical elements
- etc.
As you can see, many different types of faults need to be considered. Remember that I did not give you all the different fault lists – this post would be a mile long if I did that! You need to develop a fault list for your system and then consider each fault’s impact on the system’s operation. If you have components or subsystems not listed in the tables, you must develop your own fault lists for those items. Failure Modes and Effects Analysis (FMEA) can be used to develop fault lists for these components [23], [24].
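One way to keep track of this is to check each component in the SRP/CS against the published fault lists and flag anything that needs a list of its own. The sketch below is only an illustration: the component names and the `PUBLISHED_FAULT_LISTS` mapping are hypothetical, and only the Table A.4 entry comes from the text above. You would fill in the remaining table references from your own copy of [2].

```python
# Mapping from component type to the table in ISO 13849-2 [2] that lists
# its typical faults. Only one entry is shown; extend it from your copy of [2].
PUBLISHED_FAULT_LISTS = {
    "mechanical": "Table A.4",
}

# Hypothetical list of components for one safety function: (name, type)
components = [
    ("guard door cam", "mechanical"),
    ("brake", "mechanical"),
    ("custom vision sensor", "vision"),
]


def split_by_fault_list(components, published):
    """Separate components covered by a published list from those that are not."""
    covered, needs_own_list = [], []
    for name, kind in components:
        if kind in published:
            covered.append((name, published[kind]))
        else:
            needs_own_list.append(name)
    return covered, needs_own_list


covered, needs_own_list = split_by_fault_list(components, PUBLISHED_FAULT_LISTS)
for name, table in covered:
    print(f"{name}: start from {table}")
for name in needs_own_list:
    print(f"{name}: no published list, develop one (e.g., by FMEA)")
```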
When considering the faults to be included in the list, there are a few things that should be considered [1, 7.2]:
- if after the first fault occurs, other faults develop due to the first fault, then you can group those faults together as a single fault
- two or more single faults with a common cause can be considered as a single fault
- multiple faults with different causes but occurring simultaneously are considered improbable and do not need to be considered
Examples
#1 – Voltage Regulator
A voltage regulator fails in a system power supply so that the 24 Vdc output rises to an unregulated 36 Vdc (the internal power supply bus voltage), and after some time has passed, two sensors fail. All three failures can be grouped and considered as a single fault because they originate in a single failure in the voltage regulator.
#2 – Lightning Strike
If a lightning strike occurs on the power line and the resulting surge voltage on the 400 V mains causes an interposing contactor and the motor drive it controls to fail to danger, these failures may be grouped and considered as one. Again, a single event causes all of the subsequent failures.
#3 – Pneumatic System Lubrication
3a – A pneumatic lubricator runs out of lubricant and is not refilled, depriving downstream pneumatic components of lubrication.
3b – The spool on the system dump valve sticks open because it is not cycled often enough.
These two failures do not share a cause, so there is no need to consider them occurring simultaneously; the probability of both happening at the same time is extremely small. One caution: these two faults MAY have a common cause: poor maintenance. If that is the case and you treat them as two faults with a common cause, they can be grouped as a single fault.
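The grouping rules can be sketched in a few lines of Python. This is an illustration under assumptions, not a method taken from [1]: each failure is tagged with a root cause assigned by the analyst (the root cause strings are invented for the example), and failures that share a root cause collapse into one fault for analysis.

```python
from collections import defaultdict

# (failure, root cause assigned by the analyst); all values are illustrative
failures = [
    ("24 Vdc rail rises to 36 Vdc", "voltage regulator failure"),
    ("sensor 1 fails", "voltage regulator failure"),
    ("sensor 2 fails", "voltage regulator failure"),
    ("interposing contactor fails to danger", "lightning surge"),
    ("motor drive fails to danger", "lightning surge"),
    ("lubricator runs dry", "lubricator not refilled"),
    ("dump valve spool sticks", "valve not cycled"),
]


def group_by_root_cause(failures):
    """Collapse failures that share a root cause into a single fault."""
    groups = defaultdict(list)
    for failure, cause in failures:
        groups[cause].append(failure)
    return dict(groups)


faults = group_by_root_cause(failures)
for cause, members in faults.items():
    print(f"{cause}: {len(members)} failure(s) treated as one fault")
```

With the root causes as assigned here, the seven failures reduce to four faults. If you decide that 3a and 3b both stem from poor maintenance, giving them the same root cause merges them and leaves three.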
Fault Exclusion
Once you have your well-considered fault lists together, the next question is, “Can any of the listed faults be excluded?” This is a tricky question! There are a few points to consider:
- Does the system architecture allow for fault exclusion?
- Is the fault technically improbable, even if it is possible?
- Does experience show that the fault is unlikely to occur?*
- Are technical requirements related to the application and the hazard supporting fault exclusion?
* BE CAREFUL with this one!
Whenever faults are excluded, a detailed justification must be included in the system design documentation. Simply deciding that the fault can be excluded is NOT ENOUGH! Consider the risk a person will be exposed to if the fault occurs. If the severity is very high, i.e., severe permanent injury or death, you may not want to exclude the fault even if you think you could. Careful consideration of the resulting injury scenario is needed.
Basing a fault exclusion on personal experience is seldom considered adequate, so I added the asterisk (*) above. Look for good statistical data to support any fault exclusion decision.
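One way to enforce this discipline in your design records is to flag any exclusion that lacks a written justification or supporting evidence. The sketch below is a hypothetical record structure, not something prescribed by [1] or [2]; the field names, the example entries, and the rule that a severe-harm exclusion is always flagged for reconsideration are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FaultExclusion:
    fault: str
    justification: str = ""                       # why the fault can be excluded
    evidence: list = field(default_factory=list)  # e.g., table reference, failure data
    severe_harm: bool = False                     # severe permanent injury or death possible

    def problems(self):
        """Return the reasons this exclusion is not yet acceptable."""
        issues = []
        if not self.justification.strip():
            issues.append("no written justification")
        if not self.evidence:
            issues.append("no supporting evidence")
        if self.severe_harm:
            issues.append("severe harm possible: reconsider excluding this fault")
        return issues


# Hypothetical entries
exclusions = [
    FaultExclusion("fitting loosens", "never seen it happen"),
    FaultExclusion(
        "shaft breaks",
        "over-dimensioned for the application",
        evidence=["design calculation calc-001 (hypothetical reference)"],
    ),
]

for e in exclusions:
    status = "; ".join(e.problems()) or "documented"
    print(f"{e.fault}: {status}")
```

The first entry is the personal-experience case: it has a justification of sorts, but nothing to back it up, so it is flagged.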
There is much more information available in IEC 61508-2 on the subject of fault exclusion, and there is good information in some of the books mentioned below [0.1], [0.2], and [0.3]. If you know of additional resources you would like to share, please post the information in the comments!
The final part of this series, Part 9, is the complete reference list for the series.
Definitions
- 3.1.3
- fault
- state of an item characterized by the inability to perform a required function, excluding the inability during preventive maintenance or other planned actions, or due to lack of external resources
- Note 1 to entry: A fault is often the result of a failure of the item itself, but may exist without prior failure.
- Note 2 to entry: In this part of ISO 13849, “fault” means random fault. [SOURCE: IEC 60050-191:1990, 05-01.]
Book List
Here are some books that I think you may find helpful on this journey:
[0.1] B. Main, Risk Assessment: Basics and Benchmarks, 1st ed. Ann Arbor, MI USA: DSE, 2004.
[0.2] Electromagnetic Compatibility for Functional Safety, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2008.
[0.3] Overview of techniques and measures related to EMC for Functional Safety, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2013.
[0.5] Code of Practice: Competence for Safety Related Systems Practitioners, 1st ed. Stevenage, UK: The Institution of Engineering and Technology, 2016.
References
Note: This reference list starts in Part 1 of the series, so “missing” references may show in other parts of the series. Included in the last post of the series is the complete reference list.
[4] Safeguarding of Machinery, 2nd Ed. CSA Z432. 2004.
[6] Safety of machinery — Emergency stop function — Principles for design, ISO 13850. 2015.
[13] Safety of machinery—Safety-related parts of control systems. General principles for design, CEN EN 954-1. European Committee for Standardization (CEN). 1996.
[15] Reliability Prediction of Electronic Equipment. Military Handbook MIL-HDBK-217F. 1991.
[16] “IFA – Practical aids: Software-Assistant SISTEMA: Safety Integrity – Software Tool for the Evaluation of Machine Applications”, dguv.de, 2017. [Online]. Available: http://www.dguv.de/ifa/praxishilfen/practical-solutions-machine-safety/software-sistema/index.jsp. [Accessed: 30-Jan-2017].
[17] “failure mode”, 192-03-17, International Electrotechnical Vocabulary. IEC International Electrotechnical Commission, Geneva, 2015.
[18] M. Gentile and A. E. Summers, “Common Cause Failure: How Do You Manage Them?,” Process Saf. Prog., vol. 25, no. 4, pp. 331-338, 2006.
[19] Out of Control—Why control systems go wrong and how to prevent failure, 2nd ed. Richmond, Surrey, UK: HSE Health and Safety Executive, 2003.
[20] Safeguarding of Machinery. 3rd Ed. CSA Z432. Canadian Standards Association (CSA). 2016.
[21] O. Reg. 851, INDUSTRIAL ESTABLISHMENTS. Ontario, Canada, 1990.
[22] “Field-programmable gate array”, En.wikipedia.org, 2017. [Online]. Available: https://en.wikipedia.org/wiki/Field-programmable_gate_array. [Accessed: 16-Jun-2017].
[23] Analysis techniques for system reliability — Procedure for failure mode and effects analysis (FMEA). 2nd Ed. IEC 60812. International Electrotechnical Commission (IEC). 2018.
[24] “Failure mode and effects analysis”, en.wikipedia.org, 2017. [Online]. Available: https://en.wikipedia.org/wiki/Failure_mode_and_effects_analysis. [Accessed: 16-Jun-2017].
© 2017 – 2022, Compliance inSight Consulting Inc.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
My biggest problem with FE is hydraulic hoses. 13849-2 states that they cannot be fault-excluded, but getting calculation values for them is really difficult. We cannot use pipes in mobile machines, so that is not a solution either.
Hi Joni,
You are correct. ISO 13849-2:2012, Table C.8 does not permit fault exclusions for the “Bursting, tearing off at the fitting attachment and leakage” of hydraulic hoses. The rationale behind excluding flexible hydraulic hoses from these specific fault exclusions is that hydraulics are often used in harsh conditions. Since hydraulic systems operate at high pressures, hose failures can be immediately catastrophic. Hydraulic fluids are often flammable, so aerosolized oil from cracks and pinhole leaks can pose a significant fire risk. Many replacement hose assembly vendors do not include lifetime data with their assemblies. Although some hydraulic machine manufacturers do include hose inspection and replacement requirements in the preventative maintenance schedules for their products, many do not. Due to the cost, many machine owners choose not to replace the hoses at the required intervals. The consequence of all of this is that fault exclusions for these assemblies are likely to create significant risk to the equipment user.
If you have sufficient failure data related to your specific product(s) as used in typical conditions, you might be able to develop a fault exclusion justification. This is a non-trivial task, as significant amounts of data and statistical analysis of that data are required to develop a justification like this. For guidance, I’d recommend reading the “proven in use” material in IEC 61508-2. While that standard is scoped for electrotechnical products, the methodologies can be applied to any components.
I have always been very uncomfortable about fault exclusion.
Either you are saying that you totally trust another company’s QA, and the comprehensive knowledge and discipline of all the designers and testers, or you are saying you think you know so much about something that there is nothing at all you don’t know.
When put in that context I have always felt any gains from fault exclusion struggle to outweigh the certainty of the lack of unknowns.
I feel it is far easier, even if only on your own sleep at night, to avoid exclusion wherever practical, if not possible.
Hi Gareth,
I think your position is more black-and-white than the one I would take; however, it does take some significant effort to justify a fault exclusion. If the component you are considering for fault exclusion is described in the tables in ISO 13849-2, then the justification is reasonably straightforward. If, however, you want to justify fault exclusion in a component not listed in Part 2, then you need to do your homework. I’d start with an FMEA, and then back that up with a fault-tree analysis. Once you are done with those two steps, you will have either convinced yourself that the fault exclusion is justifiable, or you will have realized that you can’t adequately justify it. In either case, you will have the basis for your decision documented, which is always important. If after all that, you still don’t want to use fault exclusion, there’s nothing wrong with that. As a control systems designer, it is always your right to make the decision you feel most comfortable with.