Posts Tagged ‘ISO 13849-1’

IEC/TR 62061-1 Reviewed

Wednesday, September 8th, 2010
This entry is part 2 of 2 in the series IEC/TR 62061-1

Why You Need to Spend More Cash on Yet Another Document

Standards organizations publish documents in a fairly continuous stream, so for those of us tasked with staying current with a large number of standards (say, more than 10), the publication of another new standard or Technical Report isn’t news – it’s business as usual. The question is always: Do we really need to add this to the library?

For those who are new to this business, having to pay for critical design information is a new experience. Finding out that it can cost hundreds, if not thousands, to build the library you need can be overwhelming.

This review aims to help you decide if you need IEC/TR 62061-1 in your library.

The Problem

As a machine builder or a manufacturer building a product designed to be integrated into machinery, how do you choose between ISO 13849-1 and IEC 62061?

IEC/TR 62061-1 attempts to provide guidance on how to make this choice.

History

When CENELEC published EN 954-1 in 1995, machine builders were introduced to a whole new world of control reliability requirements. Prior to its publication, most machines were built with very simple interlocks, and no specific standards for interlocking devices existed. In the years since then, the EN 954-1 Categories have become well known and are applied inside and outside the EU.

In the intervening years, IEC published IEC 61508. This seven-part standard introduced the idea of ‘Safety Integrity Levels’ or SILs. This standard is aimed at process control systems and could be used for complex machinery as well.

Why the Confusion?

In 2005, IEC published a machinery-sector-specific standard based on IEC 61508, called IEC 62061. This standard offered a simplified application of the IEC 61508 methodology intended for machine builders. The key problem with this standard is that it did not provide a means to deal with pneumatic or hydraulic control elements, which are covered by ISO 13849-1.

ISO adopted EN 954-1 and reissued it as ISO 13849-1 in 1999. This edition of the standard was virtually identical to the standard it replaced from a technical requirements perspective. EN 954-1/ISO 13849-1 did not provide any means to estimate the integrity of the safety related controls, but did define circuit architectures (Categories B, 1-4) and spoke to the selection of components, introducing the concepts of ‘well-tried safety principles’ and ‘well-tried components’. A second problem had long existed in addition to this – EN 954-2, Validation, was never published by CENELEC except as a committee draft, so a key element in the application of the standard had been missing for five years at the point where ISO 13849-1 Edition 1 was published.

The first cut at guiding users in choosing an appropriate standard came with Edition 1 of IEC 62061, published in 2005, which included a table intended to help users choose between ISO 13849-1 and IEC 62061.

…and then came 2007…

In 2007, ISO published the Second Edition of ISO 13849-1, and brought a whole new twist to the discussion by introducing ‘Performance Levels’ or PLs. PLs can be loosely equated to SILs, even though the ISO 13849-1 analysis starts from component MTTFd values stated in years, while IEC 62061 works with failure rates per hour. The same table included in IEC 62061 was included in this edition of ISO 13849-1.

Table 1
Recommended application of IEC 62061 and ISO 13849-1 (under revision)
(from the Second Edition, 2007)

| | Technology implementing the safety related control function(s) | ISO 13849-1 (under revision) | IEC 62061 |
|---|---|---|---|
| A | Non electrical, e.g. hydraulics | X | Not covered |
| B | Electromechanical, e.g. relays, or non-complex electronics | Restricted to designated architectures (see Note 1) and up to PL=e | All architectures and up to SIL 3 |
| C | Complex electronics, e.g. programmable | Restricted to designated architectures (see Note 1) and up to PL=d | All architectures and up to SIL 3 |
| D | A combined with B | Restricted to designated architectures (see Note 1) and up to PL=e | X (see Note 3) |
| E | C combined with B | Restricted to designated architectures (see Note 1) and up to PL=d | All architectures and up to SIL 3 |
| F | C combined with A, or C combined with A and B | X (see Note 2) | X (see Note 3) |

“X” indicates that this item is dealt with by the standard shown in the column heading.

NOTE 1 Designated architectures are defined in Annex B of EN ISO 13849-1(rev.) to give a simplified approach for quantification of performance level.

NOTE 2 For complex electronics: Use of designated architectures according to EN ISO 13849-1(rev.) up to PL=d or any architecture according to IEC 62061.

NOTE 3 For non-electrical technology use parts according to EN ISO 13849-1(rev.) as subsystems.

So how is a machine builder to choose the ‘correct’ standard, if both standards are applicable and both are correct? Furthermore, how do you assess the reliability of the safety-related controls when integrating equipment from various suppliers, some of whom rate their equipment in PLs and some in SILs? Why are two standards addressing the same topic required? Will ISO 13849-1 and IEC 62061 ever be merged?

The Technical Report

In July this year, the IEC published a Technical Report that discusses the selection and application of these two key control reliability standards for machine builders. This guide has long been needed, and precedes an event planned by IEC to bring machine builders and standards writers face-to-face to discuss these same issues.

The guide, titled ‘IEC/TR 62061-1, Technical Report: Guidance on the application of ISO 13849-1 and IEC 62061 in the design of safety-related control systems for machinery’, provides direct guidance on how to select between these two standards.

Download IEC standards, International Electrotechnical Commission standards.

Merger

In the introduction to the report the TC makes it clear that the standards will be merged, although they don’t provide any kind of a time line for the merger. Quoting from the introduction:

It is intended that this Technical Report be incorporated into both IEC 62061 and ISO 13849-1 by means of corrigenda that reference the published version of this document. These corrigenda will also remove the information given in Table 1, Recommended application of IEC 62061 and ISO 13849-1, provided in the common introduction to both standards, which is now recognized as being out of date. Subsequently, it is intended to merge ISO 13849-1 and IEC 62061 by means of a JWG of ISO/TC 199 and IEC/TC 44.

I added the bold face to the paragraph above to highlight the key statement regarding the eventual merger of the two documents. If you’re not familiar with the standards acronyms, a ‘JWG’ is a Joint Working Group, and a TC is a Technical Committee. TCs are formed from volunteer experts from industry and academia supported by their organizations. So a JWG formed from two TCs just means that a joint committee has been formed to work out the details of the merger. Eventually.

The other key point in this paragraph relates to the replacement of Table 1. In the interim, IEC/TR 62061-1 will be incorporated into both standards, replacing Table 1.

Eventually the confusion will be cleared up because only one standard will exist in the machinery sector, but until then, machine builders will need to figure out which standard best fits their products.

Comparing PLs and SILs

The Technical Report does a good job of discussing the differences between PL and SIL, including an explanation of how to convert one to the other. This is very useful if you are trying to integrate a SIL-rated device into a PL analysis or vice-versa.

Selecting a Standard

Clause 2.5 gives some solid advice on selecting between the two standards based on the technologies employed in the design and your own comfort level in using the analytical techniques in the two standards.

Another key point is that EITHER standard can be used to analyze complex OR simple control systems. Some fans of IEC 62061 have been known to put ISO 13849-1 down as useful exclusively for simple hardwired control systems. Clause 3.3 makes it clear that this is not the case. Pick the one you like or know the best and go with that. As an additional thought, consider which standard your competitors are using, and also which your customers are using. For example, if your customers use ISO 13849-1 primarily, qualifying your product under IEC 62061 might seem like a good idea, but may drive your customers to a competitor who makes their life easier by using ISO 13849-1. If your competitors are using a different standard, try to understand the choice before climbing on the bandwagon. There may be a competitive advantage lurking in being different.

Risk Assessment

Clause 4 speaks directly to the indispensable need to conduct a methodical risk assessment, and to use that to guide the design of the controls.

In my practice, many clients decide that they would prefer to choose a control reliability level that they feel will be more than good enough for any of their designs, and then to ‘standardize’ on that design for all their products, thereby eliminating the need to thoughtfully decide on the appropriate design for the application. In other cases, end-users may choose to use a ‘standard’ design throughout their facility to assist maintenance personnel by limiting their need to become technically familiar with a variety of designs. This is done to speed troubleshooting and reduce down time and spares stocks.

The problem is that some managers believe standardizing in this way eliminates the need to conduct risk assessments, which they see as an expensive and futile exercise. This is emphatically NOT the case. Risk assessments address much more than the selection of control reliability requirements and need to be done to ensure that all hazards that cannot be eliminated or substituted are safeguarded. A missing or badly done risk assessment may invalidate your claim to a CE mark, or be the landmine that ends a liability case – with you on the losing end.

Safety Requirement Specification (SRS)

Each safety function needs to be defined in detail in a Safety Requirement Specification (SRS). A reliability assessment needs to be completed for each safety function defined in the SRS. This point is discussed in detail in IEC 62061, but is not dealt with in any detail in ISO 13849-1, so IEC/TR 62061-1 once again bridges the gap by providing an important detail that is missing in one of the two standards.

If you are unfamiliar with the concept of an SRS, each safety function needs to be described with a certain minimum amount of information, including:

  • The name of the safety function;
  • A description of the function;
  • The required level of performance based on the risk assessment, stated either as the required performance level according to ISO 13849-1 (PLr a to e) or as the required safety integrity according to IEC 62061 (SIL 1 to 3).
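To make the minimum content of an SRS entry concrete, here is a small sketch of how one might record it. The class name, field names, and the rule that an entry carries exactly one of PLr or SIL are my own assumptions for illustration; neither standard prescribes a data format.

```python
from dataclasses import dataclass
from typing import Optional

VALID_PLR = {"a", "b", "c", "d", "e"}
VALID_SIL = {1, 2, 3}


@dataclass(frozen=True)
class SafetyFunction:
    """One SRS entry (hypothetical layout, not taken from either standard)."""
    name: str
    description: str
    required_pl: Optional[str] = None   # PLr per ISO 13849-1, "a" to "e"
    required_sil: Optional[int] = None  # SIL per IEC 62061, 1 to 3

    def __post_init__(self):
        # Assumption: each entry states its target under exactly one standard.
        if (self.required_pl is None) == (self.required_sil is None):
            raise ValueError("give exactly one of required_pl or required_sil")
        if self.required_pl is not None and self.required_pl not in VALID_PLR:
            raise ValueError("PLr must be one of a, b, c, d, e")
        if self.required_sil is not None and self.required_sil not in VALID_SIL:
            raise ValueError("SIL must be 1, 2 or 3")


# Example entry with made-up content.
guard_door = SafetyFunction(
    name="Guard door interlock",
    description="Opening the guard door stops hazardous motion.",
    required_pl="d",
)
```

A real SRS will carry more detail than this, but even a record this small forces each safety function to be named, described, and given a target level before design starts.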

Once the safety functions are defined and analyzed, each safety function must be implemented by a control circuit. The selected PL will drive the design to one or two of the defined ISO 13849-1 architectures, and then the component selections and other design details will drive the final failure rate and PL. Alternatively, the SRS will drive the selection of IEC 62061 architecture (1oo1, 1oo2, 2oo2, etc.) and the rest of the design details will lead to the final failure rate and SIL.

Table 1 in the Technical Report compares the levels.

Table 1 – Relationship between PLs and SILs based on the average probability of dangerous failure per hour

| Performance Level (PL) | Average probability of a dangerous failure per hour (1/h) | Safety integrity level (SIL) |
|---|---|---|
| a | >= 10^-5 to < 10^-4 | No special safety requirements |
| b | >= 3 x 10^-6 to < 10^-5 | 1 |
| c | >= 10^-6 to < 3 x 10^-6 | 1 |
| d | >= 10^-7 to < 10^-6 | 2 |
| e | >= 10^-8 to < 10^-7 | 3 |

This table combines ISO 13849-1 2007, Tables 3 & 4. No similar tables exist in IEC 62061 2005.
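As a quick illustration, the table above can be written as a lookup. This is only a sketch of the numeric bands; the function and variable names are mine, and a real PL claim also depends on the architecture, diagnostic coverage and CCF measures, not on the failure rate alone.

```python
# (lower bound inclusive, upper bound exclusive, PL) in 1/h, from the table above.
PL_BANDS = [
    (1e-8, 1e-7, "e"),
    (1e-7, 1e-6, "d"),
    (1e-6, 3e-6, "c"),
    (3e-6, 1e-5, "b"),
    (1e-5, 1e-4, "a"),
]

# SIL corresponding to each PL; None means "no special safety requirements".
SIL_FOR_PL = {"a": None, "b": 1, "c": 1, "d": 2, "e": 3}


def pl_from_pfhd(pfhd):
    """Return the PL band for an average probability of dangerous failure per hour."""
    for low, high, pl in PL_BANDS:
        if low <= pfhd < high:
            return pl
    raise ValueError("value is outside the range covered by the table")


pl = pl_from_pfhd(5e-7)   # falls in the band for PL d
sil = SIL_FOR_PL[pl]      # PL d corresponds to SIL 2
```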

Combining Equipment with PLs and SILs

Section 7 of the report speaks to the challenge of integrating equipment with ratings in a mix of PLs and SILs. Until the standards merge and a single system for describing reliability categories is agreed on, this problem will be with us.

When designing systems using either system the designer has to determine the approximate rate of dangerous failures. In ISO 13849-1, MTTFd is the component failure rate parameter, while in IEC 62061, PFHd is the subsystem failure rate parameter. MTTFd does not consider diagnostics or architecture, only the component failure rate per year, while PFHd does include diagnostics and architecture, and it speaks to the system failure rate per hour. To compare these rates, ISO 13849-1 Annex K describes the relationship between MTTFd and PFHd for different architectures.

In the design process only one method can be used, so where equipment with different ratings must be combined the failure rates must be converted to either MTTFd or to PFHd, depending on the system being used to complete the analysis. Mixing requirements within the design of a subsystem is not permitted (See Clause 7.3.3).
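To show why the units matter, here is a minimal sketch of the simplest possible conversion: a single channel with no diagnostics and a constant failure rate, where the hourly rate is just the reciprocal of the mean time to dangerous failure expressed in hours. That simplification is my assumption for illustration only. Any architecture with diagnostics or redundancy needs the Annex K relationships or the methods in the Technical Report, not this shortcut.

```python
HOURS_PER_YEAR = 8760


def pfhd_single_channel(mttfd_years):
    """Approximate dangerous failures per hour for one channel with no diagnostics.

    Assumes a constant failure rate; not valid for tested or redundant architectures.
    """
    if mttfd_years <= 0:
        raise ValueError("MTTFd must be positive")
    return 1.0 / (mttfd_years * HOURS_PER_YEAR)


# A channel with an MTTFd of 30 years fails dangerously about 3.8e-6 times per hour.
rate = pfhd_single_channel(30)
```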

Fault Exclusions

Fault exclusions are permitted under both standards, with some limitations. Under IEC 62061, fault exclusions can be used up to SIL 2; no fault exclusions are permitted in SIL 3. Under ISO 13849-1, properly justified fault exclusions can be used up to PLe. “Properly justified” fault exclusions are those that can be shown to be valid through the lifetime of the SRP/CS.

In general, fault exclusions for mechanical failures of electromechanical devices such as interlock devices or emergency stop devices are not permitted, with a few exceptions given in ISO 13849-2 (see Clauses 7.2.2.4 and 7.2.2.5).

This approach is consistent with the current approach taken in Canada, as described in CSA Z432 & Z434. Fault exclusions are generally not permitted under ANSI standards.

Worked Examples

Section 8 of the Technical Report gives a couple of worked examples, one done under ISO 13849-1, and one under IEC 62061. For someone looking for a good example of what a properly completed analysis should look like, this section is the gold at the end of the rainbow. Section 8.2 provides a good, clear example of the application of the standards along with a nice, simple example of what a safety requirement specification might look like.

Understanding the Differences

One area where proponents of the two standards often disagree is on the ‘accuracy’ of the analytical procedures given in the two standards. The Technical Report provides a detailed explanation of why the two techniques provide slightly different results and provides the rationale explaining why this variation should be considered acceptable.

To Buy or Not to Buy…

At the end of the day, the question that needs to be answered is whether to buy this document or not. If you use either of these standards, I strongly recommend that you spend the money to get this Technical Report, if for nothing more than the worked examples. Until the two standards are merged, and that could be a few years, you will need to be able to effectively apply these approaches to PL and SIL rated equipment. This Technical Report will be an invaluable aid.

It also provides some guidance on the direction that the new merged standard will take. Some old arguments can be settled, or at least re-directed, by this document.

Finally, since the TR is to be incorporated in both standards and contains material replacing that in the current editions of the standard, you must buy a copy to remain current.

For all of these reasons, I would spend the money to acquire this document, read and apply it.

Download ISO Standards

If you’ve bought the report and would like to add your thoughts, please add a comment below. Got questions? Contact me!

Interlock Architectures – Pt. 3: Category 2

Tuesday, August 24th, 2010

In the first two posts in this series, we looked at Category B, the Basic category of system architecture, and then moved on to look at Category 1. These two categories of system architecture underpin Categories 2, 3 and 4. In this post we’ll look more deeply into Category 2.

Let’s start by looking at the definition for Category 2, taken from ISO 13849-1:2007. Remember that in these excerpts, SRP/CS stands for Safety Related Parts of Control Systems.

Definition

6.2.5 Category 2

For category 2, the same requirements as those according to 6.2.3 for category B shall apply. “Well–tried safety principles” according to 6.2.4 shall also be followed. In addition, the following applies.

SRP/CS of category 2 shall be designed so that their function(s) are checked at suitable intervals by the machine control system. The check of the safety function(s) shall be performed

  • at the machine start-up, and
  • prior to the initiation of any hazardous situation, e.g. start of a new cycle, start of other movements, and/or
  • periodically during operation if the risk assessment and the kind of operation shows that it is necessary.

The initiation of this check may be automatic. Any check of the safety function(s) shall either

  • allow operation if no faults have been detected, or
  • generate an output which initiates appropriate control action, if a fault is detected.

Whenever possible this output shall initiate a safe state. This safe state shall be maintained until the fault is cleared. When it is not possible to initiate a safe state (e.g. welding of the contact in the final switching device) the output shall provide a warning of the hazard.

For the designated architecture of category 2, as shown in Figure 10, the calculation of MTTFd and DCavg should take into account only the blocks of the functional channel (i.e. I, L and O in Figure 10) and not the blocks of the testing channel (i.e. TE and OTE in Figure 10).

The diagnostic coverage (DCavg) of the total SRP/CS including fault-detection shall be low. The MTTFd of each channel shall be low-to-high, depending on the required performance level (PLr). Measures against CCF shall be applied (see Annex F).

The check itself shall not lead to a hazardous situation (e.g. due to an increase in response time). The checking equipment may be integral with, or separate from, the safety-related part(s) providing the safety function.

The maximum PL achievable with category 2 is PL = d.

NOTE 1 In some cases category 2 is not applicable because the checking of the safety function cannot be applied to all components.

NOTE 2 Category 2 system behaviour allows that

  • the occurrence of a fault can lead to the loss of the safety function between checks,
  • the loss of safety function is detected by the check.

NOTE 3 The principle that supports the validity of a category 2 function is that the adopted technical provisions, and, for example, the choice of checking frequency can decrease the probability of occurrence of a dangerous situation.

[Figure: ISO 13849-1 Figure 10 - Category 2 block diagram]

Breaking it down

Let’s start by taking apart the definition a piece at a time and looking at what each part means. I’ll also show a simple circuit that can meet the requirements.

Category B & Well-tried Components

The first paragraph speaks to the building block approach taken in the standard:

For category 2, the same requirements as those according to 6.2.3 for category B shall apply. “Well–tried safety principles” according to 6.2.4 shall also be followed. In addition, the following applies.

Systems meeting Category 2 are required to meet all of the same requirements as Categories B & 1 as far as the components are concerned. Other requirements for the circuits are different, and we will look at those in a bit.

Self-Testing required

Category 2 brings in the idea of diagnostics. If correctly specified components have been selected (Category B), and those components can be considered ‘well-tried’ and are applied following ‘well-tried safety principles’ (Category 1), then adding a diagnostic component to the system should allow the system to detect some faults and therefore achieve a certain degree of ‘fault-tolerance’ or the ability to function correctly even when some aspect of the system has failed.

Let’s look at the text:

SRP/CS of Category 2 shall be designed so that their function(s) are checked at suitable intervals by the machine control system. The check of the safety function(s) shall be performed

  • at the machine start-up, and
  • prior to the initiation of any hazardous situation, e.g. start of a new cycle, start of other movements, and/or
  • periodically during operation if the risk assessment and the kind of operation shows that it is necessary.

The initiation of this check may be automatic. Any check of the safety function(s) shall either

  • allow operation if no faults have been detected, or
  • generate an output which initiates appropriate control action, if a fault is detected.

Whenever possible this output shall initiate a safe state. This safe state shall be maintained until the fault is cleared. When it is not possible to initiate a safe state (e.g. welding of the contact in the final switching device) the output shall provide a warning of the hazard.

Periodic checking is required. The checks must happen at least each time there is a demand placed on the system, i.e. a guard door is opened and closed, or an emergency stop button is pressed and reset. In addition the integrity of the SRP/CS must be tested at the start of a cycle or hazardous period, and potentially periodically during operation if the risk assessment indicates that this is necessary.

The testing does not have to be automatic, although in practice it usually is. As long as the system integrity is good, then the output is allowed to remain on, and the machinery or process can run.

Watch Out!

Notice that the words ‘whenever possible’ are used in the last paragraph in this part of the definition where the standard speaks about initiation of a safe state. This wording alludes to the fact that these systems are still prone to faults that can lead to the loss of the safety function, and so cannot be called truly ‘fault-tolerant’. Loss of the safety function must be detected by the monitoring system and a safe state initiated. This requires careful thought, since the safety system components may have to interact with the process control system to initiate and maintain the safe state in the event that the safety system itself has failed.

All of this leads to an interesting question: If the system is hardwired through the operating channel, and all the components used in that channel meet Category B & 1 requirements, can the diagnostic component be provided by monitoring the system with a standard PLC?

Unfortunately, the answer to this is NO. This is true because ALL of the components must meet the well-tried requirement, and since programmable electronics are specifically excluded from being considered well-tried, this approach cannot be used. Some North American standards are written so that this approach could be applied, but under the International and EU requirements it is not acceptable.

Finally, for the faults that can be detected by the monitoring system, detection of a fault must initiate a safe state. This means that on the next demand on the system, i.e. the next time the guard is opened or the emergency stop is pressed, the machine must go into a safe condition. Generally, detection of a fault should prevent the subsequent reset of the system until the fault is cleared or repaired.

Testing is not permitted to introduce any new hazards or to slow the system down. The tests must occur ‘on-the-fly’ and without introducing any delay in the system compared to how it would have operated without the testing incorporated. Test equipment can be integrated into the safety system or be external to it.

Watch Out!

Note 1 in the definition highlights a significant pitfall for many designers: if all of the components in the functional channel of the system cannot be checked, you cannot claim conformity to Category 2. A system that otherwise would meet the architectural requirements for Category 2 must be downgraded to Category 1 in cases where all the components in the functional channel cannot be tested. This is a major point and one which many designers miss when developing their systems.

Calculation of MTTFd

The next paragraph deals with the calculation of the failure rate of the system, or MTTFd.

For the designated architecture of category 2, as shown in Figure 10, the calculation of MTTFd and DCavg should take into account only the blocks of the functional channel (i.e. I, L and O in Figure 10) and not the blocks of the testing channel (i.e. TE and OTE in Figure 10).

Calculation of the failure rate focuses on the functional channel, not on the monitoring system, meaning that the failure rate of the monitoring system is ignored when analyzing systems using this architecture. The MTTFd of each component in the functional channel is calculated and then the MTTFd of the total channel is calculated.
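Here is a rough sketch of that two-step calculation, assuming the components in the functional channel act in series so that their failure rates (the reciprocals of their MTTFd values) add. The component values are made up, and the LOW/MEDIUM/HIGH ranges and the 100-year cap are the ones I understand ISO 13849-1 to define, so check them against your copy of the standard.

```python
def channel_mttfd(component_mttfd_years):
    """Combine component MTTFd values (years) into one channel MTTFd.

    Assumes a series channel: 1/MTTFd_channel = sum(1/MTTFd_i).
    """
    return 1.0 / sum(1.0 / m for m in component_mttfd_years)


def mttfd_range(mttfd_years):
    """Classify a channel MTTFd; ranges assumed from ISO 13849-1."""
    capped = min(mttfd_years, 100.0)  # assumption: channel value capped at 100 years
    if capped < 3:
        raise ValueError("below the lowest range in the standard")
    if capped < 10:
        return "low"
    if capped < 30:
        return "medium"
    return "high"


# Hypothetical input, logic and output components (I, L and O).
functional_channel = [150.0, 60.0, 40.0]
total = channel_mttfd(functional_channel)   # about 20.7 years
rating = mttfd_range(total)
```

Notice how quickly the channel value drops below that of its weakest component; three respectable parts give a channel that only rates as medium.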

The Diagnostic Coverage (DCavg) is also calculated based exclusively on the components in the functional channel, so when determining what percentage of the faults can be detected by the monitoring equipment, only faults in the functional channel are considered.
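A minimal sketch of the averaging, using the weighted formula I understand the standard to give: each component’s DC is weighted by its failure rate (1/MTTFd), so unreliable components dominate the result. The component values are invented, and the range boundaries are as I understand them from ISO 13849-1; verify both against the standard before relying on them.

```python
def dc_avg(components):
    """Average diagnostic coverage of a channel.

    components: list of (dc, mttfd_years), with dc as a fraction from 0.0 to 1.0.
    DCavg = sum(dc_i / MTTFd_i) / sum(1 / MTTFd_i)
    """
    components = list(components)
    numerator = sum(dc / mttfd for dc, mttfd in components)
    denominator = sum(1.0 / mttfd for _, mttfd in components)
    return numerator / denominator


def dc_range(dc):
    """Classify DCavg; boundaries assumed from ISO 13849-1."""
    if dc < 0.60:
        return "none"
    if dc < 0.90:
        return "low"
    if dc < 0.99:
        return "medium"
    return "high"


# Hypothetical functional channel: (DC, MTTFd in years) for I, L and O.
channel = [(0.0, 150.0), (0.9, 60.0), (0.6, 40.0)]
average = dc_avg(channel)   # about 0.62
rating = dc_range(average)
```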

This highlights the fact that a failure of the monitoring system cannot be detected, so a single failure in the monitoring system that results in the system failing to detect a subsequent normally detectable failure in the functional channel will result in the loss of the safety function.

Summing Up

The next paragraph sums up the limits of this particular architecture:

The diagnostic coverage (DCavg) of the total SRP/CS including fault-detection shall be low. The MTTFd of each channel shall be low-to-high, depending on the required performance level (PLr). Measures against CCF shall be applied (see Annex F).

The first sentence reflects back to the previous paragraph on diagnostic coverage, telling you, as the designer, that you cannot make a claim to anything more than LOW diagnostic coverage when using this architecture.

This raises an interesting question, since Figure 5 in the standard shows columns for both DCavg = LOW and DCavg=MED. My best advice to you as a user of the standard is to abide by the text, meaning that you cannot claim higher than LOW for DCavg in this architecture.

Another problem raised by this sentence is the inclusion of the phrase “the total SRP/CS including fault-detection”, since the previous paragraph explicitly tells you that the assessment of DCavg ‘should’ only include the functional channel, while this sentence appears to include it. In standards writing, sentences including the word ‘shall’ are clearly mandatory, while those including the word ‘should’ indicate a condition which is advised but not required. Hopefully this confusion will be clarified in the next edition of the standard.

Failure rates in the functional channel can be anywhere in the range from LOW to HIGH depending on the components selected and the way they are applied in the design. The requirement will be driven by the desired PL of the system, so a PLd system will require HIGH MTTFd components in the functional channel, while the same architecture used for a PLa system would require only LOW MTTFd components.

Finally, applicable measures against Common Cause Failures (CCF) must be used. Some of the measures given in Table F.1 in Annex F of the standard cannot be applied, such as Channel Separation, since you cannot separate a single channel. Other CCF measures can and must be applied, so you must score at least the minimum 65 on the CCF table in Annex F to claim compliance with Category 2 requirements.
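The scoring itself is simple addition, as this sketch shows. The measure names and point values below are illustrative placeholders I chose for the example; take the real list and points from Table F.1 in your copy of the standard. Only the 65-point minimum comes from the text above.

```python
CCF_MINIMUM_SCORE = 65


def ccf_score(measures):
    """Sum the points for the measures actually applied.

    measures: dict of name -> (points, applied)
    """
    return sum(points for points, applied in measures.values() if applied)


def meets_ccf_requirement(measures):
    return ccf_score(measures) >= CCF_MINIMUM_SCORE


# Illustrative values only; replace with the entries from Table F.1.
measures = {
    "channel separation": (15, False),   # cannot be applied to a single channel
    "diversity": (20, False),
    "overvoltage protection": (15, True),
    "well-tried components": (5, True),
    "failure mode analysis": (5, True),
    "designer training": (5, True),
    "EMC immunity": (25, True),
    "environmental resistance": (10, True),
}

score = ccf_score(measures)
acceptable = meets_ccf_requirement(measures)
```

With these placeholder numbers the design only just reaches the minimum once separation and diversity are unavailable, which shows how little margin a single-channel architecture may leave you.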

Example Circuit

Here’s an example of what a simple Category 2 circuit constructed from discrete components might look like. Note that PB1 and PB2 could just as easily be interlock switches on guard doors as push buttons on a control panel. For the sake of simplicity, I did not illustrate surge suppression on the relays, but you should include MOVs or RC suppressors across all relay coils. All relays are considered to be constructed with ‘force-guided’ designs and meet the requirements for well-tried components.

[Figure: Example Category 2 circuit from discrete components]

Here is how the circuit works:

  1. The machine is stopped with power off. CR1, CR2, CR3 and M are off.
  2. The reset push button, PB3, is pressed. If CR1, CR2 and M are all off, their normally closed contacts will be closed, so pressing PB3 will result in CR3 turning on.
  3. CR3 closes its contacts, energizing CR1 and CR2 which seal their contact circuits in and de-energizing CR3. The time delays inherent in relays permit this to work.
  4. With CR1 and CR2 closed and CR3 held off because its coil circuit opened when CR1 and CR2 turned on, M energizes and motion can start.

In this circuit the monitoring function is provided by CR3. If any of CR1, CR2 or M were to weld closed, CR3 could not energize, and so a single fault is detected and the machine is prevented from re-starting. If the machine is stopped by pressing either PB1 or PB2, the machine will stop since CR1 and CR2 are redundant. If CR3 fails, then the M rung is held open because CR3 has not de-energized, preventing the machine from starting with a failed monitoring system. If CR1 or CR2 fail with an open coil, then M cannot energize because of the redundant contacts on the M rung.

This circuit cannot detect a failure in PB1, PB2, or PB3. Testing is conducted each time the circuit is reset.
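The reset behaviour described above can be sketched as a simple boolean model. This is my own simplification, not a timing-accurate simulation of the circuit: relay timing is ignored, the reset button is assumed to be released before M is evaluated, PB1 and PB2 are lumped into a single “stop devices closed” input, and all function and parameter names are hypothetical.

```python
MONITORED = {"CR1", "CR2", "M"}


def reset_permitted(welded=(), cr3_stuck_on=False, stop_devices_closed=True):
    """Return True if M would energize after PB3 is pressed and released.

    welded: names of relays (from CR1, CR2, M) whose contacts have welded closed.
    cr3_stuck_on: CR3 fails to drop out after the reset.
    """
    # A welded relay holds its normally closed monitoring contact open.
    monitoring_contacts_closed = not (MONITORED & set(welded))

    # Step 2: CR3 can only pick up through the closed monitoring contacts.
    if not monitoring_contacts_closed:
        return False

    # Step 3: CR3 energizes CR1 and CR2, which seal in if the stop devices are closed.
    cr1 = stop_devices_closed
    cr2 = stop_devices_closed

    # CR3 should drop out once CR1 and CR2 pick up; if it sticks, it holds the M rung open.
    cr3 = cr3_stuck_on

    # Step 4: M needs CR1 and CR2 on and CR3 off.
    return cr1 and cr2 and not cr3


healthy = reset_permitted()
welded_cr1 = reset_permitted(welded={"CR1"})
stuck_monitor = reset_permitted(cr3_stuck_on=True)
```

Running the three cases gives a permitted reset for the healthy circuit and a refused reset for both fault cases, which matches the behaviour described above. The model has no input for a failed push button, which mirrors the point that the circuit cannot detect those failures.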

If M is a motor starter rather than the motor itself, it will need to be duplicated for redundancy and a monitoring contact added to the CR3 rung unless a reasonable case for fault exclusion can be made.

In calculating MTTFd, PB1, PB2, CR1, CR2, CR3 and M must be included. CR3 is included because it has a functional contact in the M rung and is therefore part of the functional channel of the circuit as well as being part of the TE and OTE channels.


Watch for the next installment in this series where we’ll explore Category 3, the first of the ‘fault tolerant’ architectures!

