Missing MTTFd data

Dealing with the huge information void that exists while trying to complete reasonable control reliability assessments is a major challenge for every engineer or technologist tasked with this activity. Here are a few thoughts on what to do now, and where things may be going…

What the heck is MTTFd???

When you first start to work through ISO 13849-1, the first thing that will smack you in the head is the pile of new acronyms. The first one you’ll run into is ‘PL’, of course, since the entire purpose of the standard is to aid the designer in determining the reliability Performance Level of the control system. Shortly after that you’ll find yourself face to face with MTTFd.

MTTFd, or the Mean Time To Failure (dangerous), is the name given to the expected time, in years, before a component used in the system being analyzed fails dangerously. It is the inverse of the component’s dangerous failure rate, and it differs from the straight failure figure for the component because it’s limited to the failures that result in a dangerous failure mode, or that may lead to a hazard.

So how do you get this data?

Obtaining MTTFd data for a component should be easy for a designer. Component manufacturers who market components intended for safety applications should provide this data in the component specifications, but there are thousands, perhaps millions, of different components being marketed today for use in safety systems. Most of the major manufacturers already provide this figure, or provide B10d, a figure from which MTTFd can be derived, but for many components this data is simply not available.

Here are some randomly chosen examples of manufacturers’ specification sheets that give this data:

Allen-Bradley Trojan™ T15 Interlock Switch

Pilz PNOZ X2 (pdf data sheet)

Preventa XPS MC Catalog Safety Controller (pdf 2015 Catalog)

B10d is the number of cycles until 10% of the components being tested fail in a dangerous way. Using failure rate data from the component’s data sheet, it is possible to estimate B10d from either B10 or T (the application-dependent lifetime of the component). MTTFd then follows from B10d and the number of operations per year in your application. Check out Annex C of the standard if you want to see how this can be done.
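To make the Annex C arithmetic a little more concrete, here is a rough sketch in Python. It assumes the relationships as I understand them from Annex C: nop = (dop × hop × 3600) / tcycle, MTTFd = B10d / (0.1 × nop) and T10d = B10d / nop. The function names and all of the example figures (B10d, operating days, hours per day, cycle time) are invented for illustration, so check the formulas against your own copy of the standard before relying on them.

```python
def operations_per_year(days_per_year, hours_per_day, cycle_time_s):
    """Mean number of operations per year (nop), assuming a steady cycle time."""
    return days_per_year * hours_per_day * 3600 / cycle_time_s


def mttfd_from_b10d(b10d_cycles, n_op):
    """Estimated MTTFd in years from B10d (cycles) and operations per year."""
    return b10d_cycles / (0.1 * n_op)


def t10d_from_b10d(b10d_cycles, n_op):
    """Estimated time in years until 10% of components have failed dangerously."""
    return b10d_cycles / n_op


# Hypothetical application: 220 days/year, 16 hours/day, one operation every 120 s.
n_op = operations_per_year(days_per_year=220, hours_per_day=16, cycle_time_s=120)

# Hypothetical data sheet value: B10d = 2,000,000 cycles.
mttfd = mttfd_from_b10d(2_000_000, n_op)
t10d = t10d_from_b10d(2_000_000, n_op)

print(f"nop   = {n_op:.0f} operations/year")
print(f"MTTFd = {mttfd:.1f} years")
print(f"T10d  = {t10d:.1f} years")
```

Notice how strongly the result depends on the duty cycle: halve the cycle time and the estimated MTTFd is halved too, which is why the same component can look very different from one application to the next.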

But what do you do if the manufacturer of your favourite contactor doesn’t provide ANY failure data? Some major manufacturers still don’t provide any failure rate data at all, some provide only expected lifetimes under specific operating conditions, and some provide only EN 954-1:1995 data. I think that last case is one of the reasons for the EC Machinery Working Group’s decision late last year to extend the transition period to ISO 13849-1:2007.

Now what?

Unless you work for a large organization, instituting a life testing program is not likely to be an option, since you either need a protracted period of time with a few components in test, or thousands of samples for a short time.

The standard provides the option to use 10 years as a default where no other data is available. 10 years sounds like a long time at first blush, particularly if the planned lifetime of the system involved is 20 years. Typical MTTFd values for high-reliability components are in the hundreds of years, so by comparison, 10 years is almost nothing. Tables are also provided for some kinds of components, but the tables are necessarily limited in size, so not every component will be listed.
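To see how much a single defaulted component can hurt, here is a small sketch. It assumes the parts count approach as I understand it from the standard (the channel value is the reciprocal of the sum of the reciprocals of the component values) and the usual low / medium / high MTTFd bands (3 to <10, 10 to <30, 30 to 100 years). The component values are made up for illustration.

```python
def channel_mttfd(component_mttfd_years):
    """Combine component MTTFd values (years) for one channel.

    Assumes the parts count method: the dangerous failure rates
    (1 / MTTFd) of all components in the channel are added together.
    """
    return 1.0 / sum(1.0 / m for m in component_mttfd_years)


def mttfd_band(mttfd_years):
    """Classify a channel MTTFd into the bands assumed in the lead-in."""
    if mttfd_years < 3:
        return "below the lowest band"
    if mttfd_years < 10:
        return "low"
    if mttfd_years < 30:
        return "medium"
    return "high"  # the standard caps the usable value at 100 years


# Hypothetical channel where every component has manufacturer data.
with_data = channel_mttfd([150, 200, 150])

# Same channel, but one component has no data and gets the 10 year default.
with_default = channel_mttfd([150, 200, 10])

print(f"All data available: {with_data:.1f} years ({mttfd_band(with_data)})")
print(f"One 10 year default: {with_default:.1f} years ({mttfd_band(with_default)})")
```

In this made-up case one missing number is enough to pull the whole channel from the ‘high’ band down into the ‘low’ band, no matter how good the other components are.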

Your remaining options are to use the data in the standard, or to pick up some of the other publications that include component failure data, like MIL-HDBK-217F, IEC/TR 62380 (based on UTE 80810 & RDF 2000), NPRD-95 or IEC 61709 (based on Siemens SN 29500 documents). Some of these documents may be difficult or impossible to obtain.

The results of this lack of objective data from the component manufacturers are:

  • Conservative results based on the minimum default MTTFd;
  • Potential over-design of safety related controls;
  • Increased manufacturing costs for machine builders.

The reasons for this situation vary by manufacturer, but ultimately it comes down to the cost of life testing components multiplied by number of components built by each manufacturer. Typical life tests require load simulators and switching for thousands of components, as well as data logging to trap failures and record relevant data. In the case of fluid power components (pneumatics and hydraulics), this becomes increasingly complex. For many component manufacturers, the cost of the life testing is prohibitive, even though this data is badly needed by their users.

Will we see an improvement in the future? The largest controls component manufacturers are very likely to provide this data as they have it available, meaning as they complete testing. New designs are much more likely to come with this data initially, while it may be a long time before some of the old standard components get time in the life test cell. Until then, lots of components will be assigned ‘10 years’.

A big thank you to Wouter Leusden for the idea for this post!

Have a thought to share on this topic? Correct an error in the article? Sound off? Leave a comment!

IEC/TR 62061-1 Reviewed

This entry is part 2 of 2 in the series IEC/TR 62061-1

Why You Need to Spend More Cash on Yet Another Document

Standards organizations publish documents in a fairly continuous stream, so for those of us tasked with staying current with a large number of standards (say, more than 10), the publication of another new standard or Technical Report isn’t news – it’s business as usual. The question is always: Do we really need to add this to the library?

For those who are new to this business, having to pay for critical design information is a new experience. Finding out that it can cost hundreds, if not thousands, to build the library you need can be overwhelming.

This review aims to help you decide if you need IEC/TR 62061-1 in your library.

The Problem

As a machine builder or a manufacturer building a product designed to be integrated into machinery, how do you choose between ISO 13849-1 and IEC 62061?

IEC/TR 62061-1 attempts to provide guidance on how to make this choice.

History

When CENELEC published EN 954-1 in 1995, machine builders were introduced to a whole new world of control reliability requirements. Prior to its publication, most machines were built with very simple interlocks, and no specific standards for interlocking devices existed. In the years since then, the EN 954-1 Categories have become well known and are applied inside and outside the EU.

In the intervening years, IEC published IEC 61508. This seven-part standard introduced the idea of ‘Safety Integrity Levels’ or SILs. This standard is aimed at process control systems and could be used for complex machinery as well.

Why the Confusion?

In 2005, IEC published a machinery sector specific standard based on IEC 61508, called IEC 62061. This standard offered a simplified application of the IEC 61508 methodology intended for machine builders. The key problem with this standard is that it did not provide a means to deal with pneumatic or hydraulic control elements, which are covered by ISO 13849-1.

ISO adopted EN 954-1 and reissued it as ISO 13849-1 in 1999. This edition of the standard was virtually identical to the standard it replaced from a technical requirements perspective. EN 954-1/ISO 13849-1 did not provide any means to estimate the integrity of the safety related controls, but did define circuit architectures (Categories B, 1-4) and spoke to the selection of components, introducing the concepts of ‘well-tried safety principles’ and ‘well-tried components’. A second problem had long existed in addition to this – EN 954-2, Validation, was never published by CENELEC except as a committee draft, so a key element in the application of the standard had been missing for five years at the point where ISO 13849-1 Edition 1 was published.

The first cut at guiding users in choosing an appropriate standard came with the publication of IEC 62061 Edition 1. Published in 2005, Edition 1 included a table that attempted to provide users with some guidance on how to choose between ISO 13849-1 and IEC 62061.

…and then came 2007…

In 2007, ISO published the Second Edition of ISO 13849-1, and brought a whole new twist to the discussion by introducing ‘Performance Levels’ or PLs. PLs can be loosely equated to SILs, even though the MTTFd figures behind a PL are stated in years while SIL calculations work in failures per hour. The same table included in IEC 62061 was included in this edition of ISO 13849-1.

Table 1: Recommended application of IEC 62061 and ISO 13849-1 (under revision)
(from the Second Edition, 2007)

Each row lists the technology implementing the safety related control function(s), followed by the entry for each standard.

A. Non electrical, e.g. hydraulics
   ISO 13849-1 (under revision): X
   IEC 62061: Not covered

B. Electromechanical, e.g. relays, or non-complex electronics
   ISO 13849-1 (under revision): Restricted to designated architectures (see Note 1) and up to PL=e
   IEC 62061: All architectures and up to SIL 3

C. Complex electronics, e.g. programmable
   ISO 13849-1 (under revision): Restricted to designated architectures (see Note 1) and up to PL=d
   IEC 62061: All architectures and up to SIL 3

D. A combined with B
   ISO 13849-1 (under revision): Restricted to designated architectures (see Note 1) and up to PL=e
   IEC 62061: X, see Note 3

E. C combined with B
   ISO 13849-1 (under revision): Restricted to designated architectures (see Note 1) and up to PL=d
   IEC 62061: All architectures and up to SIL 3

F. C combined with A, or C combined with A and B
   ISO 13849-1 (under revision): X, see Note 2
   IEC 62061: X, see Note 3

“X” indicates that this item is dealt with by the standard shown in the column heading.

NOTE 1 Designated architectures are defined in Annex B of EN ISO 13849-1(rev.) to give a simplified approach for quantification of performance level.

NOTE 2 For complex electronics: Use of designated architectures according to EN ISO 13849-1(rev.) up to PL=d or any architecture according to IEC 62061.

NOTE 3 For non-electrical technology use parts according to EN ISO 13849-1(rev.) as subsystems.

So how is a machine builder to choose the ‘correct’ standard, if both standards are applicable and both are correct? Furthermore, how do you assess the reliability of the safety-related controls when integrating equipment from various suppliers, some of whom rate their equipment in PLs and some in SILs? Why are two standards addressing the same topic required? Will ISO 13849-1 and IEC 62061 ever be merged?

The Technical Report

In July this year the IEC published a Technical Report that discusses the selection and application of these two key control reliability standards for machine builders. This guide has long been needed, and it precedes an event planned by IEC to bring machine builders and standards writers face to face to discuss these same issues.

The guide, titled ‘IEC/TR 62061-1, Technical Report: Guidance on the application of ISO 13849-1 and IEC 62061 in the design of safety-related control systems for machinery’, provides direct guidance on how to select between these two standards.

Merger

In the introduction to the report the TC makes it clear that the standards will be merged, although it doesn’t provide any kind of timeline for the merger. Quoting from the introduction:

It is intended that this Technical Report be incorporated into both IEC 62061 and ISO 13849-1 by means of corrigenda that reference the published version of this document. These corrigenda will also remove the information given in Table 1, Recommended application of IEC 62061 and ISO 13849-1, provided in the common introduction to both standards, which is now recognized as being out of date. Subsequently, it is intended to merge ISO 13849-1 and IEC 62061 by means of a JWG of ISO/TC 199 and IEC/TC 44.

The key statement in the paragraph above is the last sentence, regarding the eventual merger of the two documents. If you’re not familiar with the standards acronyms, a ‘JWG’ is a Joint Working Group, and a TC is a Technical Committee. TCs are formed from volunteer experts from industry and academia supported by their organizations. So a JWG formed from two TCs just means that a joint committee has been formed to work out the details of the merger. Eventually.

The other key point in this paragraph relates to the replacement of Table 1. In the interim, IEC/TR 62061-1 will be incorporated into both standards, replacing Table 1.

Eventually the confusion will be cleared up because only one standard will exist in the machinery sector, but until then, machine builders will need to figure out which standard best fits their products.

Comparing PLs and SILs

The Technical Report does a good job of discussing the differences between PL and SIL, including an explanation of how to convert one to the other, which is very useful if you are trying to integrate a SIL-rated device into a PL analysis or vice-versa.

Selecting a Standard

Clause 2.5 gives some solid advice on selecting between the two standards based on the technologies employed in the design and your own comfort level in using the analytical techniques in the two standards.

Another key point is that EITHER standard can be used to analyze complex OR simple control systems. Some fans of IEC 62061 have been known to put ISO 13849-1 down as useful exclusively for simple hardwired control systems. Clause 3.3 makes it clear that this is not the case. Pick the one you like or know the best and go with that. As an additional thought, consider which standard your competitors are using, and also which your customers are using. For example, if your customers use ISO 13849-1 primarily, qualifying your product under IEC 62061 might seem like a good idea, but may drive your customers to a competitor who makes their life easier by using ISO 13849-1. If your competitors are using a different standard, try to understand the choice before climbing on the bandwagon. There may be a competitive advantage lurking in being different.

Risk Assessment

Clause 4 speaks directly to the indispensable need to conduct a methodical risk assessment, and to use that to guide the design of the controls.

In my practice, many clients decide that they would prefer to choose a control reliability level that they feel will be more than good enough for any of their designs, and then to ‘standardize’ on that design for all their products, thereby eliminating the need to thoughtfully decide on the appropriate design for the application. In other cases, end-users may choose to use a ‘standard’ design throughout their facility to assist maintenance personnel by limiting their need to become technically familiar with a variety of designs. This is done to speed troubleshooting and reduce down time and spares stocks.

The problem is that some managers believe this approach eliminates the need to conduct risk assessments, which they see as an expensive and often futile exercise. This is emphatically NOT the case. Risk assessments address much more than the selection of control reliability requirements and need to be done to ensure that all hazards that cannot be eliminated or substituted are safeguarded. A missing or badly done risk assessment may invalidate your claim to a CE mark, or be the landmine that ends a liability case, with you on the losing end.

Safety Requirement Specification (SRS)

Each safety function needs to be defined in detail in a Safety Requirement Specification (SRS). A reliability assessment needs to be completed for each safety function defined in the SRS. This point is discussed in detail in IEC 62061, but is not dealt with in any detail in ISO 13849-1, so IEC/TR 62061-1 once again bridges the gap by providing an important detail that is missing in one of the two standards.

If you are unfamiliar with the concept of an SRS, each safety function needs to be described with a certain minimum amount of information, including:

  • The name of the safety function;
  • A description of the function;
  • The required level of performance based on the risk assessment and according to either ISO 13849-1 (PLr a to e) or the required safety integrity according to IEC 62061 (SIL 1 to 3).
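If it helps to picture the minimum content listed above, here is one way an SRS entry might be sketched as a simple record. This is only an illustration: the class name, the field names and the guard door example are my own inventions, not anything taken from the Technical Report or from either standard.

```python
from dataclasses import dataclass
from typing import Optional

VALID_PLR = {"a", "b", "c", "d", "e"}  # PLr range per ISO 13849-1
VALID_SIL = {1, 2, 3}                  # SIL range per IEC 62061


@dataclass(frozen=True)
class SafetyFunction:
    """One entry in a Safety Requirement Specification (hypothetical layout)."""
    name: str
    description: str
    required_pl: Optional[str] = None   # set when working to ISO 13849-1
    required_sil: Optional[int] = None  # set when working to IEC 62061

    def __post_init__(self):
        # Exactly one of the two requirement styles must be given.
        if (self.required_pl is None) == (self.required_sil is None):
            raise ValueError("Give exactly one of required_pl or required_sil")
        if self.required_pl is not None and self.required_pl not in VALID_PLR:
            raise ValueError("required_pl must be one of a, b, c, d, e")
        if self.required_sil is not None and self.required_sil not in VALID_SIL:
            raise ValueError("required_sil must be 1, 2 or 3")


# Hypothetical example entry.
door = SafetyFunction(
    name="Guard door interlock",
    description="Opening the guard door stops hazardous motion and prevents restart.",
    required_pl="d",
)
print(door)
```

A real SRS will carry a good deal more than this (operating modes, response times, fault reactions and so on), but even a bare record like this forces you to state the required level for every function rather than assuming it.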

Once the safety functions are defined and analyzed, each safety function must be implemented by a control circuit. The selected PL will drive the design to one or two of the defined ISO 13849-1 architectures, and then the component selections and other design details will drive the final failure rate and PL. Alternatively, the SRS will drive the selection of IEC 62061 architecture (1oo1, 1oo2, 2oo2, etc.) and the rest of the design details will lead to the final failure rate and SIL.

Table 1 in the Technical Report compares the levels.

Table 1: Relationship between PLs and SILs based on the average probability of dangerous failure per hour

Performance Level (PL)   Average probability of a dangerous failure per hour (1/h)   Safety integrity level (SIL)
a                        >= 10^-5 to < 10^-4                                          No special safety requirements
b                        >= 3 x 10^-6 to < 10^-5                                      1
c                        >= 10^-6 to < 3 x 10^-6                                      1
d                        >= 10^-7 to < 10^-6                                          2
e                        >= 10^-8 to < 10^-7                                          3

This table combines ISO 13849-1:2007, Tables 3 & 4. No similar tables exist in IEC 62061:2005.
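The table above can be expressed as a simple lookup, which is handy if you want to sanity check a calculated figure. This sketch uses only the ranges shown in the table; the function and variable names are my own, and it says nothing about the architectural and other requirements that each standard also imposes before a PL or SIL can be claimed.

```python
# (lower bound inclusive, upper bound exclusive, PL, SIL) from the table above.
# SIL is None where the table says "No special safety requirements".
PL_BANDS = [
    (1e-5, 1e-4, "a", None),
    (3e-6, 1e-5, "b", 1),
    (1e-6, 3e-6, "c", 1),
    (1e-7, 1e-6, "d", 2),
    (1e-8, 1e-7, "e", 3),
]


def pl_and_sil(pfhd_per_hour):
    """Return (PL, SIL) for an average probability of dangerous failure per hour."""
    for lower, upper, pl, sil in PL_BANDS:
        if lower <= pfhd_per_hour < upper:
            return pl, sil
    raise ValueError("Value is outside the ranges covered by the table")


for value in (5e-5, 5e-6, 2e-6, 5e-7, 5e-8):
    pl, sil = pl_and_sil(value)
    print(f"{value:.0e} per hour -> PL {pl}, SIL {sil}")
```

Note that PL b and PL c both land in SIL 1, so converting from a SIL back to a PL is not a one-to-one exercise.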

Combining Equipment with PLs and SILs

Section 7 of the report speaks to the challenge of integrating equipment with ratings in a mix of PLs and SILs. Until the standards merge and a single system for describing reliability categories is agreed on, this problem will be with us.

When designing systems using either standard the designer has to determine the approximate rate of dangerous failures. In ISO 13849-1, MTTFd is the component-level parameter, while in IEC 62061, PFHd is the subsystem-level parameter. MTTFd does not consider diagnostics or architecture, only the mean time, in years, before an individual component fails dangerously, while PFHd does include diagnostics and architecture, and it speaks to the probability of dangerous failure per hour. To compare the two, ISO 13849-1 Annex K describes the relationship between MTTFd and PFHd for different architectures.
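The years versus hours difference is easy to trip over, so here is a minimal sketch of the unit conversion only. It assumes a constant failure rate, so that the dangerous failure rate is simply the reciprocal of MTTFd, and 8760 hours per year. The result is NOT a PFHd: it ignores architecture and diagnostics entirely, which is exactly what the Annex K figures account for. The function names are my own.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def dangerous_failure_rate_per_hour(mttfd_years):
    """Convert an MTTFd in years to a dangerous failure rate per hour.

    Assumes a constant failure rate (rate = 1 / MTTFd).
    """
    return 1.0 / (mttfd_years * HOURS_PER_YEAR)


def mttfd_years_from_rate(rate_per_hour):
    """Convert a dangerous failure rate per hour back to an MTTFd in years."""
    return 1.0 / (rate_per_hour * HOURS_PER_YEAR)


for years in (10, 30, 100):
    rate = dangerous_failure_rate_per_hour(years)
    print(f"MTTFd = {years:>3} years -> {rate:.2e} dangerous failures per hour")
```

Treat this as a way to get both parameters onto the same footing for a rough comparison, and go to the standards themselves for the figures you actually put in your analysis.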

In the design process only one method can be used, so where equipment with different ratings must be combined the failure rates must be converted to either MTTFd or to PFHd, depending on the system being used to complete the analysis. Mixing requirements within the design of a subsystem is not permitted (See Clause 7.3.3).

Fault Exclusions

Fault exclusions are permitted under both standards, with some limitations. Under IEC 62061 they can be used up to SIL 2; no fault exclusions are permitted in SIL 3. Under ISO 13849-1, properly justified fault exclusions can be used up to PLe. “Properly justified” fault exclusions are those that can be shown to be valid through the lifetime of the SRP/CS.

In general, fault exclusions for mechanical failures of electromechanical devices such as interlock devices or emergency stop devices are not permitted, with a few exceptions given in ISO 13849-2 (see Clauses 7.2.2.4 and 7.2.2.5).

This approach is consistent with the current approach taken in Canada, as described in CSA Z432 & Z434. Fault exclusions are generally not permitted under ANSI standards.

Worked Examples

Section 8 of the Technical Report gives a couple of worked examples, one done under ISO 13849-1, and one under IEC 62061. For someone looking for a good example of what a properly completed analysis should look like, this section is the gold at the end of the rainbow. Section 8.2 provides a good, clear example of the application of the standards along with a nice, simple example of what a safety requirement specification might look like.

Understanding the Differences

One area where proponents of the two standards often disagree is on the ‘accuracy’ of the analytical procedures given in the two standards. The Technical Report provides a detailed explanation of why the two techniques provide slightly different results and provides the rationale explaining why this variation should be considered acceptable.

To Buy or Not to Buy…

At the end of the day, the question that needs to be answered is whether to buy this document or not. If you use either of these standards, I strongly recommend that you spend the money to get this Technical Report, if for nothing more than the worked examples. Until the two standards are merged, and that could be a few years, you will need to be able to effectively apply these approaches to PL and SIL rated equipment. This Technical Report will be an invaluable aid.

It also provides some guidance on the direction that the new merged standard will take. Some old arguments can be settled, or at least re-directed, by this document.

Finally, since the TR is to be incorporated in both standards and contains material replacing that in the current editions of the standard, you must buy a copy to remain current.

For all of these reasons, I would spend the money to acquire this document, read and apply it.

If you’ve bought the report and would like to add your thoughts, please add a comment below. Got questions? Contact me!