How Risk Assessment Fails

[Image: Fukushima Dai Ichi nuclear plant before the meltdown]
This entry is part 2 of 8 in the series Risk Assessment

[Image: Fukushima Dai Ichi Power Plant after the explosions]

The events unfolding at Japan’s Fukushima Dai Ichi Nuclear Power plant are a case study in the ways that the risk assessment process can fail or be abused. In an article published on Bloomberg.com, Jason Clenfield itemizes decades of fraud and failures in engineering and administration that have led to the catastrophic failure of four of six reactors at the 40-year-old Fukushima plant. Clenfield’s article, ‘Disaster Caps Faked Reports’, goes on to cover similar failures in the Japanese nuclear sector.

Most people believe that the more serious the public danger, the more carefully the risks are considered in the design and execution of projects like the Fukushima plant. Clenfield’s article points to failures by a number of major international businesses involved in the design and manufacture of components for these reactors that may have contributed to the catastrophe playing out in Japan. In some cases, the correct actions could have bankrupted the companies involved, so rather than risk financial ruin, the companies covered up these failures and rewarded the workers involved for their efforts. As you will see, sometimes the degree of care that we have a right to expect is not the level of care that is used.

How does this relate to the failure and abuse of the risk assessment process? Read on!

Risk Assessment Failures

[Image: Earthquake and tsunami damage at the Fukushima Dai Ichi Power Plant]

The Fukushima Dai Ichi nuclear plant was constructed in the late 1960s and early 1970s, with Reactor #1 going on-line in 1971. The reactors at this facility use ‘active cooling’, requiring electrically powered cooling pumps to run continuously to keep the core temperatures in the normal operating range. As you will have seen in recent news reports, the plant is located on the shore, drawing water directly from the Pacific Ocean.

Learn more about Boiling Water Reactors used at Fukushima.

Read IEEE Spectrum’s “24-Hours at Fukushima”, a blow-by-blow account of the first 24 hours of the disaster.

Japan is located along one of the most active fault lines in the world, with plate subduction rates exceeding 90 mm/year. Earthquakes are so commonplace in this area that the Japanese people consider Japan to be the ‘land of earthquakes’, starting earthquake safety training in kindergarten.

Japan is the country that created the word ‘tsunami’ because the effects of sub-sea earthquakes often include large waves that swamp the shoreline. These waves affect all countries bordering the world’s oceans, but are especially prevalent where strong earthquakes are frequent.

In this environment it would be reasonable to expect that earthquake and tsunami effects would merit the highest consideration when assessing the risks related to these hazards. Because risk is a function of the severity of the consequence and its probability, the risk assessed from earthquake and tsunami should have been critical. Loss of cooling can result in the catastrophic overheating of the reactor core, potentially leading to a core meltdown.
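To make the relationship between severity and probability concrete, here is a minimal Python sketch of a risk matrix. The 1 to 5 scales, the category names, and the thresholds are hypothetical values chosen for illustration; they are not taken from any standard or from the plant's actual assessment.

```python
# Hypothetical ordinal scales (assumptions for illustration only).
SEVERITY = {"negligible": 1, "minor": 2, "serious": 3, "major": 4, "catastrophic": 5}
PROBABILITY = {"remote": 1, "unlikely": 2, "possible": 3, "likely": 4, "frequent": 5}


def risk_score(severity: str, probability: str) -> int:
    """Combine the two ranks by multiplication, one common convention."""
    return SEVERITY[severity] * PROBABILITY[probability]


def risk_level(severity: str, probability: str) -> str:
    """Map a score to a level using hypothetical thresholds."""
    score = risk_score(severity, probability)
    if score >= 15:
        return "critical"
    if score >= 8:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Loss of cooling: catastrophic consequence, tsunami treated as at least "possible".
    print(risk_level("catastrophic", "possible"))
```

With these assumed scales, a catastrophic consequence paired with a merely ‘possible’ event already scores 15 and lands in the top band.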

The Fukushima Dai Ichi plant was designed to withstand 5.7 m tsunami waves, even though a 6.4 m wave had hit the shore close by 10 years before the plant went on-line. The wave generated by the recent earthquake was 7 m. Although the plant was not washed away by the tsunami, the wave created another problem.

Now consider that the reactors require constant forced cooling using electrically powered pumps. The backup generators, installed to ensure that the cooling pumps remain operational even if mains power to the plant is lost, are located in a basement subject to flooding. When the tsunami hit the seawall and spilled over the top, the floodwaters poured into the backup generator room, knocking out the diesel backup generators. The cooling system stopped. With no power to run the pumps, the reactor cores began to overheat. Although the reactors survived the earthquakes and the tsunami, without power to run the pumps the plant was in trouble.

Learn more about the accident.

Clearly there was a failure of reason when assessing the risks related to the loss of cooling capability in these reactors. With systems as mission critical as these, multiple levels of redundancy beyond a single backup system are often the minimum required.

In another plant in Japan, a section of piping carrying superheated steam from the reactor to the turbines ruptured, injuring a number of workers. The pipe was installed when the plant was new and had never been inspected since installation because it was left off the safety inspection checklist. This is an example of a failure that resulted from blindly following a checklist without looking at the larger picture. There can be no doubt that someone at the plant noticed that other pipe sections were inspected regularly while this particular section was skipped, yet no changes in the process resulted.

Here again, the risk was not recognized even though it was clearly understood with respect to other sections of pipe in the same plant.

In another situation at a nuclear plant in Japan, drains inside the containment area of a reactor were not plugged at the end of the installation process. As a result, a small spill of radioactive water was released into the sea instead of being properly contained and cleaned up. The risk was well understood, but the control procedure for this risk was not implemented.

Finally, the Kashiwazaki Kariwa plant was constructed along a major fault line. The designers used figures for the maximum seismic acceleration that were one third of the accelerations that could be created by the fault. Regulators permitted the plant to be built even though the relative weakness of the design was known.

Failure Modes

I believe that there are a number of reasons why these kinds of failures occur.

People have a difficult time appreciating the meaning of probability. Probability is a key factor in determining the degree of risk from any hazard, yet when figures like ‘1 in 1000’ or ‘1 × 10⁻⁵ occurrences per year’ are discussed, it’s hard for people to truly grasp what these numbers mean. Likewise, when more subjective scales are used it can be difficult to really understand what ‘likely’ or ‘rarely’ actually mean.

Consequently, even in cases where the severity may be very high, the risk related to a particular hazard may be neglected because the probability seems low and the risk is therefore deemed to be low.

When probability is discussed in terms of time, a figure like ‘1 × 10⁻⁵ occurrences per year’ can make the chance of an occurrence seem distant, and therefore less of a concern.
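One way to bring a small annual figure closer to home is to ask how likely at least one occurrence is over the whole life of a plant. The sketch below does that arithmetic in Python. It assumes each year is independent and treats the quoted figure as a per-year probability; the function name and the 40-year horizon are my own choices for illustration.

```python
def probability_at_least_once(annual_probability: float, years: int, units: int = 1) -> float:
    """Chance of one or more occurrences across `units` independent units over `years`.

    Assumes every unit-year is an independent trial with the same probability.
    """
    if not 0.0 <= annual_probability <= 1.0:
        raise ValueError("annual_probability must be between 0 and 1")
    unit_years = years * units
    return 1.0 - (1.0 - annual_probability) ** unit_years


if __name__ == "__main__":
    for label, p in (("1 in 1000 per year", 1e-3), ("1 x 10^-5 per year", 1e-5)):
        lifetime = probability_at_least_once(p, years=40)
        print(f"{label}: {lifetime:.4%} over 40 years")
```

Under these assumptions, a ‘1 in 1000 per year’ event has roughly a 4% chance of happening at least once in 40 years, which sounds far less remote than the annual figure.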

Most risk assessment approaches deal with hazards singly. This is done to simplify the assessment process, but this approach can miss the effects of multiple or cascading failures. In a multiple failure condition, several protective measures fail simultaneously from a single cause (sometimes called Common Cause Failure). In this case, back-up measures may fail from the same cause, resulting in no protection from the hazard.

In a cascading failure, an initial failure is followed by a series of failures resulting in the partial or complete loss of the protective measures, and therefore in partial or complete exposure to the hazard. Reasonably foreseeable combinations of failure modes in mission critical systems must be considered and the probability of each estimated.
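The effect of a common cause on redundancy can be illustrated with a small calculation. The sketch below uses a simplified beta-factor model, in which a fraction `beta` of each channel's failures is assumed to come from a cause shared by both channels. The function names and the numbers (a 1% failure probability per channel, `beta` of 0.1) are hypothetical and are not data from any plant.

```python
def both_fail_independent(p: float) -> float:
    """Probability that two redundant channels both fail, assuming full independence."""
    return p * p


def both_fail_with_common_cause(p: float, beta: float) -> float:
    """Probability that both channels fail under a simplified beta-factor model.

    `beta` is the assumed fraction of a channel's failures that stem from a shared cause.
    """
    independent_part = ((1.0 - beta) * p) ** 2  # both fail for unrelated reasons
    common_part = beta * p                      # one shared cause takes out both
    return independent_part + common_part


if __name__ == "__main__":
    p, beta = 0.01, 0.1  # illustrative values only
    print(f"independent:       {both_fail_independent(p):.6f}")
    print(f"with common cause: {both_fail_with_common_cause(p, beta):.6f}")
```

With these illustrative numbers, two truly independent backups would both fail about 1 time in 10,000 demands, while the same pair exposed to a shared cause fails about 1 time in 1,000, roughly ten times worse.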

Combinations of hazards can result in synergy between the hazards, producing a higher level of severity from the combination than is present from any one of the hazards taken singly. Reasonably foreseeable combinations of hazards and their potential synergies must be identified and the risk estimated.

Oversimplification of the hazard identification and analysis processes can result in overlooking hazards or underestimating the risk.

Thinking about the Fukushima Dai Ichi plant again, the combination of the effects of the earthquake on the plant, with the added impact of the tsunami wave, resulted in the loss of primary power to the plant followed by the loss of backup power from the backup generators, and the subsequent partial meltdowns and explosions at the plant. This combination of earthquake and tsunami was well known, not some ‘unimaginable’ or ‘unforeseeable’ situation. When conducting risk assessments, all reasonably foreseeable combinations of hazards must be considered.
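A simple way to avoid assessing hazards only one at a time is to list every combination and make each one an explicit review item. The Python sketch below does this with the standard library's `itertools.combinations`. The hazard names come from the Fukushima example above, while the function name and the `max_size` parameter are hypothetical; deciding which combinations are reasonably foreseeable still takes human judgment.

```python
from itertools import combinations


def hazard_combinations(hazards: list[str], max_size: int = 2) -> list[tuple[str, ...]]:
    """Return every combination of 2 up to `max_size` hazards, smallest groups first."""
    combos: list[tuple[str, ...]] = []
    for size in range(2, max_size + 1):
        combos.extend(combinations(hazards, size))
    return combos


if __name__ == "__main__":
    hazards = ["earthquake", "tsunami", "loss of mains power", "backup generator flooding"]
    for combo in hazard_combinations(hazards, max_size=3):
        print(" + ".join(combo))
```

Four hazards give six pairs and four triples, so even a short hazard list produces a review list that is easy to underestimate.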

Abuse and Neglect

The risk assessment process is subject to abuse and neglect. Risk assessment has been used by some as a means to justify exposing workers and the public to risks that should not have been permitted. Skewing the results of the risk assessment, either by underestimating the risk initially or by overestimating the effectiveness and reliability of control measures, can lead to this situation. Decisions relating to the ‘tolerability’ or the ‘acceptability’ of risks when the severity of the potential consequences is high should be approached with great caution. In my opinion, unless you are personally willing to take the risk you are proposing to accept, it cannot be considered either tolerable or acceptable, regardless of the legal limits that may exist.

In the case of the Japanese nuclear plants, the operators have publicly admitted to falsifying inspection and repair records, in some cases resulting in accidents and fatalities.

In 1990, the US Nuclear Regulatory Commission wrote a report on the Fukushima Dai Ichi plant that predicted the exact scenario that resulted in the current crisis. These findings were shared with the Japanese authorities and the operators, but no one in a position of authority took the findings seriously enough to do anything. Relatively simple and low-cost protective measures, like increasing the height of the protective sea wall along the coastline and moving the backup generators to high ground, could have prevented a national catastrophe and the complete loss of the plant.

A Useful Tool

Despite these human failings, I believe that risk assessment is an important tool. Increasingly sophisticated technology has rendered ‘common sense’ useless in many cases, because people do not have the expertise to have any common sense about the hazards related to these technologies.

Where hazards are well understood, they should be controlled with the simplest, most direct and effective measures available. In many cases this can be done by the people who first identify the hazard.

Where hazards are not well understood, bringing in experts with the knowledge to assess the risk and implement appropriate protective measures is the right approach.

The common aspect in all of this is the identification of hazards and the application of some sort of control measures. Risk assessment should not be neglected simply because it is sometimes difficult, or because it can be done poorly, or because the results may be neglected or ignored. We need to improve what we do with the results of these efforts, rather than neglect to do them at all.

In the meantime, the Japanese, and the world, have some cleanup to do.


Author: Doug Nix

Doug Nix is Managing Director and Principal Consultant at Compliance InSight Consulting, Inc. (http://www.complianceinsight.ca) in Kitchener, Ontario, and is Lead Author and Senior Editor of the Machinery Safety 101 blog. Doug's work includes teaching machinery risk assessment techniques privately and through Conestoga College Institute of Technology and Advanced Learning in Kitchener, Ontario, as well as providing technical services and training programs to clients related to risk assessment, industrial machinery safety, safety-related control system integration and reliability, laser safety and regulatory conformity. For more see Doug's LinkedIn profile.