
Safety Integrity Levels: An Overview

posted 16 Feb 2015, 01:31 by John O'Sullivan   [ updated 16 Feb 2015, 01:46 ]
First published in Engineers Journal, Engineers Ireland 14th January 2015

John O’Sullivan outlines how to assign a Safety Integrity Level to a system and how to mitigate risk with Safety Instrumented Functions.

Author: John O’Sullivan, Engineering Director, Douglas Control and Automation

Introduction

The term SIL is used as convenient shorthand for the safety rating of hardware components and systems, e.g. “This PLC CPU is rated SIL3”. The Safety Integrity Level (SIL) was designed as shorthand for the results of a complex analysis, but it is only one part of an overall lifecycle approach to functional safety.

Technically, the Safety Integrity Level expresses how much the risk is reduced by introducing a Safety Instrumented System (SIS). There are four levels: SIL1 gives the least reduction in risk and SIL4 the greatest.

The Safety Instrumented System is separate from and independent of the Basic Process Control System (BPCS) and, like the BPCS, consists of sensor(s), logic solver(s) and final element(s). The SIS reduces the risk by intervening during a failure of the BPCS to ensure the system remains safe. While the SIS hardware and software components may resemble the BPCS components and may come from the same manufacturer, they are required to be more reliable.

The specification, design and operation (Safety Life Cycle, SLC) are defined in the standard IEC 61508, “Functional Safety of electrical/electronic/programmable electronic safety-related systems”. This standard has spawned a number of industry- and sector-specific standards that go into more detail for particular industries, although this article focuses on IEC 61508.

IEC 61508 defines the Safety Life Cycle in three sections:

  • Phases 1 to 5: Analysis

  • Phases 6 to 13: Realisation

  • Phases 14 to 16: Operation

The following standards elaborate on the approach to SIL assignment outlined in IEC 61508:

  • IEC 61511 “Functional Safety – Safety instrumented systems for the process industry sector”

  • IEC 61513 “Nuclear power plants – Instrumentation and control important to safety”

  • EN 50128 “Railway applications – communication, signalling and processing systems – software for railway control and protection systems”

  • EN 50129 “Railway applications – communication, signalling and processing systems – safety related electronic systems for signalling”

Hazard and Risk Analysis

During the analysis phases of a project, hazard identification and risk analysis are carried out by an interdisciplinary team. The team should include all the system stakeholders: designers, process owners, and safety, automation, mechanical and electrical specialists. Where possible, hazards are designed out of the system. Where this is not possible, e.g. a volatile raw material is essential to the process, the risks associated with the hazard are identified.

Hazards are treated as potential occurrences of harm. Once a hazard is identified, its risk is assessed as the product of the “frequency of the occurrence” and the “severity of the harm”.

Methods of analysis include:

  • HAZOP: Hazard and Operability Study

  • FME(C)A: Failure Mode Effect (and Criticality) Analysis

  • FMEDA: Failure Mode Effect and Diagnostic Analysis

  • ETA: Event Tree Analysis

  • FTA: Fault Tree Analysis

Normally a risk matrix uses the likelihood of the occurrence and the consequence of the event to categorise the risks. Risks that cannot be designed out and are not tolerable will require safety functions to reduce the risk to a tolerable level. This results in the “residual risk”, which must be less than the pre-defined “tolerable risk”. The greater the reduction required to reach the residual risk, the higher the SIL. See the diagram below (Figure 1), where the consequence, frequency/exposure, possibility of avoidance and probability of occurrence are used to determine the required SIL.

Figure 1: Risk Assessment

Risk Parameters:

C1:       Minor injury or damage

C2:       Serious injury or one death, temporary serious damage

C3:       Several deaths, long term damage

C4:       Many dead, catastrophic effects

Frequency / Exposure Time:

F1:        Rare to quite often

F2:        Frequent to continuous

Possibility of Avoidance:

P1:       Avoidance possible

P2:       Unavoidable, scarcely possible

Probability of Occurrence:

W1:      Very low, rarely

W2:      Low

W3:      High, frequent

Safety Integrity Levels Required:

-:         Tolerable Risk, no safety requirements

a:         No special safety requirements

b:         A single E/E/PE system is not sufficient

1:         SIL 1

2:         SIL 2

3:         SIL 3

4:         SIL 4
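
As a rough numerical illustration of the relationship described above between unmitigated risk, tolerable risk and the risk reduction a safety function must deliver, the Python sketch below treats risk as frequency multiplied by severity. It is not the risk graph of Figure 1. The function names, the severity score and all the figures are hypothetical and chosen only for illustration.

```python
def risk(frequency_per_year: float, severity: float) -> float:
    """Risk as the product of event frequency and severity of harm."""
    return frequency_per_year * severity


def required_risk_reduction(unmitigated: float, tolerable: float) -> float:
    """Factor by which the risk must fall so the residual risk meets the tolerable risk.

    A result of 1.0 means the unmitigated risk is already tolerable.
    """
    if tolerable <= 0:
        raise ValueError("tolerable risk must be positive")
    return max(1.0, unmitigated / tolerable)


# Hypothetical figures: an event expected once every 10 years,
# with a severity score of 50 on an arbitrary scale.
unmitigated = risk(frequency_per_year=0.1, severity=50.0)
tolerable = 0.01  # assumed tolerable risk on the same scale
factor = required_risk_reduction(unmitigated, tolerable)
print(f"required risk reduction factor: {factor:.0f}")
```

The larger this factor, the more demanding the safety function and therefore the higher the SIL.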

 

To meet the SIL demanded by the required risk reduction, a device must achieve a low enough Probability of Failure and a high enough Safe Failure Fraction.

Probability of Failure

Probability of Failure comes in two flavours: Probability of Failure on Demand (PFD), for safety functions that are only activated when required, and Probability of Failure per Hour (PFH), for safety functions that operate continuously. The lower the Probability of Failure, the higher the Risk Reduction Factor (RRF). The higher the RRF, the higher the SIL achieved. Tables 1 and 2 give the figures for PFD and PFH.

SIL   PFD               PFD (power)     RRF
1     0.1-0.01          10⁻¹ - 10⁻²     10-100
2     0.01-0.001        10⁻² - 10⁻³     100-1,000
3     0.001-0.0001      10⁻³ - 10⁻⁴     1,000-10,000
4     0.0001-0.00001    10⁻⁴ - 10⁻⁵     10,000-100,000

Table 1: Probability of Failure on Demand

 

SIL   PFH                       PFH (power)     RRF
1     0.00001-0.000001          10⁻⁵ - 10⁻⁶     100,000-1,000,000
2     0.000001-0.0000001        10⁻⁶ - 10⁻⁷     1,000,000-10,000,000
3     0.0000001-0.00000001      10⁻⁷ - 10⁻⁸     10,000,000-100,000,000
4     0.00000001-0.000000001    10⁻⁸ - 10⁻⁹     100,000,000-1,000,000,000

Table 2: Probability of Failure per Hour
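
The bands in Tables 1 and 2 can be expressed as a simple lookup. The following Python sketch is illustrative only. The function and variable names are invented here, and treating each band as inclusive of its lower bound and exclusive of its upper bound is an assumption, since the tables do not say how boundary values are handled. The risk reduction factor is taken as the reciprocal of the PFD, which is consistent with the RRF column of Table 1.

```python
from typing import Dict, Optional, Tuple

Bands = Dict[int, Tuple[float, float]]

# SIL -> (lower bound, upper bound), transcribed from Tables 1 and 2.
PFD_BANDS: Bands = {1: (1e-2, 1e-1), 2: (1e-3, 1e-2), 3: (1e-4, 1e-3), 4: (1e-5, 1e-4)}
PFH_BANDS: Bands = {1: (1e-6, 1e-5), 2: (1e-7, 1e-6), 3: (1e-8, 1e-7), 4: (1e-9, 1e-8)}


def sil_from_probability(value: float, bands: Bands) -> Optional[int]:
    """Return the SIL whose band contains value, or None if it lies outside every band.

    Assumption: lower bound inclusive, upper bound exclusive.
    """
    for sil, (low, high) in bands.items():
        if low <= value < high:
            return sil
    return None


def risk_reduction_factor(pfd: float) -> float:
    """Risk reduction factor as the reciprocal of the PFD."""
    if pfd <= 0:
        raise ValueError("PFD must be positive")
    return 1.0 / pfd


# Hypothetical values for a low-demand and a continuous safety function.
print(sil_from_probability(5e-3, PFD_BANDS), risk_reduction_factor(5e-3))
print(sil_from_probability(2e-7, PFH_BANDS))
```

A value outside all four bands returns None rather than a SIL, so the caller has to decide how to treat figures that are worse than SIL1 or better than SIL4.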

Safe Failure Fraction

While the PFD and PFH tell us how likely a failure is to occur, the Safe Failure Fraction (SFF) tells us what fraction of failures will be either safe or, if dangerous, detected. A high SFF is achieved through increased diagnostics and reporting in the safety function. The Greek letter λ is used to denote the rate of failure per hour.

  • λsafe = Failure rate leading to safe state

  • λdangerous = Failure rate leading to dangerous state

  • λtotal = λdangerous + λsafe

Each of these is split according to whether the failure is detected or undetected, giving four failure rates: safe detected (λsd), safe undetected (λsu), dangerous detected (λdd) and dangerous undetected (λdu).

Thus SFF = 1 − λdu / λtotal

So for the SFF to be as high as possible, failures have to be safe or detected. If all the failures were safe and/or detected, the SFF would be 1 or 100%.
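
As a worked example of the formula above, the Python sketch below computes the Safe Failure Fraction from the safe and dangerous, detected and undetected failure rates. The function name, the parameter names and the failure rates are hypothetical, picked so the arithmetic is easy to follow.

```python
def safe_failure_fraction(
    lambda_sd: float, lambda_su: float, lambda_dd: float, lambda_du: float
) -> float:
    """Safe Failure Fraction: 1 - (dangerous undetected rate / total failure rate).

    All rates are failures per hour.
    """
    lambda_total = lambda_sd + lambda_su + lambda_dd + lambda_du
    if lambda_total <= 0:
        raise ValueError("total failure rate must be positive")
    return 1.0 - lambda_du / lambda_total


# Hypothetical device: 1000e-9 failures per hour in total,
# of which 50e-9 are dangerous and undetected.
sff = safe_failure_fraction(
    lambda_sd=400e-9, lambda_su=300e-9, lambda_dd=250e-9, lambda_du=50e-9
)
print(f"SFF = {sff:.1%}")
```

Improving the diagnostics so that more dangerous failures are detected moves rate from the dangerous undetected term to the dangerous detected term, which raises the SFF without changing the total failure rate.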

Before the SFF can be used to determine the SIL, other factors have to be considered. The first is the Hardware Fault Tolerance (HFT) of the device. HFT is achieved through redundancy: an HFT of N means that N+1 faults are required before the safety function is lost. Second, devices are treated differently for SFF depending on their type. Type A devices are considered to be well defined and to have sufficient failure data from experience in the field. Type B devices are considered to have insufficient data and field experience. See the tables below for the figures related to SFF.

SFF           Hardware Fault Tolerance (HFT)
              0        1        2
<60%          SIL1     SIL2     SIL3
60% to 90%    SIL2     SIL3     SIL4
90% to 99%    SIL3     SIL4     SIL4
>99%          SIL3     SIL4     SIL4

Table 3: SFF for Type A subsystem

SFF           Hardware Fault Tolerance (HFT)
              0              1        2
<60%          Not allowed    SIL1     SIL2
60% to 90%    SIL1           SIL2     SIL3
90% to 99%    SIL2           SIL3     SIL4
>99%          SIL3           SIL4     SIL4

Table 4: SFF for Type B subsystem
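
Table 4 can be read as a lookup from SFF and HFT to the highest SIL that a Type B subsystem may claim. A minimal Python sketch follows, using invented names. It assumes each SFF band includes its lower bound (so exactly 60% falls in the “60% to 90%” row and exactly 99% in the top row), which the table itself does not specify.

```python
from typing import List, Optional, Tuple

# Each row of Table 4: (exclusive upper SFF bound, max SIL for HFT 0, 1, 2).
# None stands for "Not allowed".
TYPE_B_ROWS: List[Tuple[float, Tuple[Optional[int], int, int]]] = [
    (0.60, (None, 1, 2)),
    (0.90, (1, 2, 3)),
    (0.99, (2, 3, 4)),
    (float("inf"), (3, 4, 4)),
]


def max_sil_type_b(sff: float, hft: int) -> Optional[int]:
    """Highest SIL allowed for a Type B subsystem, or None if not allowed."""
    if not 0.0 <= sff <= 1.0:
        raise ValueError("SFF must be between 0 and 1")
    if hft not in (0, 1, 2):
        raise ValueError("Table 4 covers HFT values 0, 1 and 2 only")
    for upper_bound, sils in TYPE_B_ROWS:
        if sff < upper_bound:
            return sils[hft]
    raise AssertionError("unreachable: the last row has no upper bound")


# Hypothetical device with an SFF of 95%.
print(max_sil_type_b(0.95, hft=0))  # single channel
print(max_sil_type_b(0.95, hft=1))  # one level of redundancy
```

The sketch shows how adding redundancy (a higher HFT) raises the ceiling for the same SFF. The device still has to meet the Probability of Failure figures for the SIL in question.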

In summary, the tools are available to identify and analyse risks associated with a system design and then implement the appropriate Safety Instrumented System to mitigate those risks and save lives and assets.

John O’Sullivan BE, Dip Phys Sci, CEng MIEI, Engineering Director of Douglas Control and Automation, has 20 years’ experience in the automation industry focusing on the pharmaceutical, biotechnology and medical device sectors.

He has developed design and test specifications for the regulated environment and project manages automation and safety projects for life science customers.

He has consulted on the validation of certified failsafe, high availability systems.
