Safety and reliability are often confused. Many people assume that increasing reliability automatically increases the safety of a system. In fact, these are two different properties, and both need to be taken into account. But let us answer the main question first: what is the difference between safety and reliability?
Reliability is closely tied to dependability. It is the ability to perform the required functions consistently, without failure. Reliability can be applied to components and to whole systems. It focuses on minimising breakdowns: the longer the interval between failures, the more reliable the system is.
Safety focuses on preventing incidents from happening. It is not limited to components and systems. A safe system is designed to avoid and mitigate hazardous outcomes of system operation.
Why is there confusion between Safety and Reliability?

The easiest and most common example for students is the braking system in a car. The reliability of a braking system depends on its parts: brake discs, brake pads, brake fluid, brake pedal, and the feedback to the user. However, the user has to know when to apply the brakes. So the components are directly responsible for reliability, but the user is responsible for safety.

If we take the same example for an autonomous system, an autonomous driving unit takes over the driver's responsibility. The set of sensors installed on the autonomous driving unit is responsible for triggering the braking action based on the collected information. The braking system itself is responsible for the reliability of braking. But the decision maker is responsible for safety.
Both factors are extremely important, but as you can see, increasing the reliability of a braking system can gain a few metres of braking distance without necessarily increasing the safety of the system. Every system is only as strong as the weakest link in the chain.
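To put rough numbers on the braking example, here is a minimal sketch. It assumes constant deceleration and a fixed decision delay, and every value in it (100 km/h, 8 and 9 m/s², 1 and 2 seconds) is an illustrative assumption, not measured data. The function name `stopping_distance` is our own.

```python
def stopping_distance(speed_mps, decision_delay_s, deceleration_mps2):
    """Distance travelled from the moment a hazard appears until standstill.

    Simplified model: the vehicle keeps its speed during the decision delay,
    then brakes with constant deceleration.
    """
    decision_part = speed_mps * decision_delay_s
    braking_part = speed_mps ** 2 / (2 * deceleration_mps2)
    return decision_part + braking_part


speed = 100 / 3.6  # 100 km/h in m/s (assumed)

baseline = stopping_distance(speed, decision_delay_s=1.0, deceleration_mps2=8.0)
better_brakes = stopping_distance(speed, decision_delay_s=1.0, deceleration_mps2=9.0)
late_decision = stopping_distance(speed, decision_delay_s=2.0, deceleration_mps2=9.0)

print(f"baseline:                      {baseline:.1f} m")
print(f"better brakes:                 {better_brakes:.1f} m")
print(f"better brakes, late decision:  {late_decision:.1f} m")
```

Under these assumptions the better brakes gain roughly 5 m, while a decision that comes one second later costs roughly 28 m. The decision maker, not the brake hardware, dominates the outcome.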
Reliability and its role in system engineering
Reliability is very important for providing a service. If a system is not reliable, it most probably cannot provide its service. That does not mean the system is unsafe. A good example is automatic cruise control in heavy weather. Usually it raises an alarm and forces you to take over control. It is highly unreliable in heavy rain or snow. However, by handing control back it mitigates the risk of causing an accident. So it is not reliable in heavy weather, but it is safe.
Key aspects of a reliable system include availability: how long the system can operate without maintenance, or how often unexpected maintenance is required between scheduled maintenance. Another important aspect is performance consistency: between maintenance periods, the system should behave the same way.
Different methods can be used to estimate the reliability of a system: Reliability Block Diagrams, statistical analysis of failure data, Fault Tree Analysis, and the Safety Integrity Level. That is actually funny, because SIL (Safety Integrity Level) is a measure of the reliability required of a safety function. So it focuses on reliability, not overall safety. Even in professional methods, reliability and safety get mixed up.
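Availability is often expressed as a number. A minimal sketch, using the common steady-state formula availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR the mean time to repair. The operating hours, failure count and repair time below are made-up values for illustration only.

```python
def mtbf(operating_hours, failures):
    """Mean time between failures: operating time divided by failure count."""
    if failures <= 0:
        raise ValueError("need at least one observed failure to estimate MTBF")
    return operating_hours / failures


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system can operate."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Assumed example: 8000 operating hours, 4 failures, 10 hours average repair.
example_mtbf = mtbf(operating_hours=8000, failures=4)
example_availability = availability(example_mtbf, mttr_hours=10)

print(f"MTBF: {example_mtbf:.0f} h")
print(f"Availability: {example_availability:.4f}")
```

Note that this says nothing about safety: a system can score very well here and still cause a loss while it is operating exactly as designed.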
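To show how much SIL is a reliability figure, here is a small sketch that maps an average probability of failure on demand (PFDavg) to a SIL. The bands are the low-demand-mode ranges as we understand them from IEC 61508; please check them against the standard before relying on them. The names `SIL_BANDS` and `sil_from_pfd` are our own.

```python
# (SIL, lower bound inclusive, upper bound exclusive) for PFDavg,
# low-demand mode. Bands as commonly quoted from IEC 61508 (verify first).
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_from_pfd(pfd_avg):
    """Return the SIL whose band contains pfd_avg, or None if outside all bands."""
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return None


# Assumed example: a safety function that fails on 5 out of 1000 demands.
print(sil_from_pfd(5e-3))
```

The input is purely a failure probability of one safety function. Whether that function is the right one, or is triggered at the right moment, is outside what the number tells you.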
How to increase reliability?
The most common methods to increase reliability come out of that analysis:
- Redundancy – Some system components can be duplicated, such as a main and a backup pump. This increases the reliability of the system.
- Fault tolerance – Continue operating in a degraded state, for example at reduced speed to prevent further damage.
- Robust design – Add a safety margin to the design limits. For example, add 5% to the thickness of a structure.
- Testing – Test the system properly to find the design limits and the fault tolerance.
- Maintenance strategies – Plan maintenance intervals based on the estimated wear of parts.
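The effect of redundancy from the list above can be sketched with the basic Reliability Block Diagram rules: blocks in series multiply their reliabilities, and parallel blocks fail only if all of them fail. This assumes independent failures (no common cause), and the pump and valve values are invented for illustration.

```python
from math import prod


def series(*reliabilities):
    """All blocks must work: multiply the individual reliabilities."""
    return prod(reliabilities)


def parallel(*reliabilities):
    """At least one block must work: 1 minus the chance that all fail."""
    return 1 - prod(1 - r for r in reliabilities)


pump = 0.90   # assumed reliability of one pump over the mission time
valve = 0.99  # assumed reliability of a valve in series with the pumps

single_pump_system = series(valve, pump)
redundant_pump_system = series(valve, parallel(pump, pump))

print(f"single pump:       {single_pump_system:.4f}")
print(f"main + backup pump: {redundant_pump_system:.4f}")
```

With these numbers the backup pump lifts the system from about 0.89 to about 0.98. If both pumps share a power supply or a software fault, the independence assumption breaks and the real gain is smaller.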
Safety and its role in system engineering
System safety is about avoiding damage to people, the environment, the company's good name, and so on. It focuses on hazard identification, risk assessment and risk mitigation. If we know that certain hazards exist in the system, we can prepare for them and work on mitigating the risks. The role of safety is to provide a safe environment for the system and its operators. A safe environment should cover internal and external impacts on the system. Safety focuses on interactions, impacts and integration, not only on system faults.
STPA as a way to increase system safety
STPA (System-Theoretic Process Analysis) is our favourite safety analysis method. It focuses on hazards arising from system components, component interactions, human factors and environmental impacts. In contrast to traditional methods like FMEA or HAZOP, STPA has a broader scope. It views accidents as the result of inadequate control actions within the system, not just of single component failures as traditional methods do.
The method is also brilliant for analysing human-machine interfaces. During the analysis, you ask questions like: how could a human wrongly believe the presented data? What should we consider to make this action clear to the operator? It covers all the factors: human, software, organisation, rules.
STPA covers the whole process, not only the design. Already at the concept phase, we think about maintenance and operation.
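One step of STPA is to take each control action and ask in which ways it could be unsafe. A minimal sketch of that step, using the four guide categories commonly given in STPA literature. The class and function names are our own, and the controller, action and context are taken from the braking example as an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class UcaType(Enum):
    """The four ways a control action can be unsafe, as commonly listed for STPA."""
    NOT_PROVIDED = "is not provided when needed"
    PROVIDED_UNSAFE = "is provided when it causes a hazard"
    WRONG_TIMING = "is provided too early, too late, or out of order"
    WRONG_DURATION = "is stopped too soon or applied too long"


@dataclass(frozen=True)
class UnsafeControlAction:
    controller: str
    action: str
    uca_type: UcaType
    context: str

    def prompt(self):
        return (f"What if '{self.action}' from the {self.controller} "
                f"{self.uca_type.value} while {self.context}?")


def candidate_ucas(controller, action, context):
    """Generate one question per category for the analyst to assess."""
    return [UnsafeControlAction(controller, action, t, context) for t in UcaType]


for uca in candidate_ucas("autonomous driving unit", "apply brakes",
                          "an obstacle is ahead"):
    print(uca.prompt())
```

None of these questions requires a component to be broken. Braking too late with perfectly reliable brakes still shows up in the list, which is exactly the kind of case a pure reliability analysis tends to miss.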

Key differences:
Reliability focuses on the availability of the system to perform an action. Safety focuses on delivering the system goal without causing a loss.
Reliability is responsible for preventing functional failures of the system. Safety is responsible for preventing hazardous outcomes.
Reliability analyses the failure modes of functions. Safety analyses hazards arising from inadequate system actions, including during correct operation of the system (e.g. applying the brakes too late).
Safety and reliability are not completely disconnected

Reliability failures can lead to safety hazards, so they have to be taken into account during safety analysis. However, we cannot achieve safety just by improving the reliability of the system. A system can be unreliable but still safe.
Safety requirements can guide the reliability requirements for components and system functions. Very often during an STPA analysis we find safety issues that relate directly to the reliability of the system. However, safety analyses, especially STPA, are wider in scope. Good practice is to do a safety analysis first and then apply reliability methods to fulfil the safety requirements.
Safety and reliability are complementary. They look at different aspects of a system. A system designed with STPA can be safe, and with FMEA it can be reliable. By applying these two methods at different stages of the project you can create a really robust system. Remember that if you have any questions, you can always contact us; we will be happy to help! Check out our other insights about STPA too!