Reliability of embedded software


A smart meter is composed of hardware and embedded software. These two parts have to work in harmony with each other to make the system work in an orderly fashion. Failure of any part of the meter will affect its normal operation.

The reliability of smart meters depends not only on the hardware but also, very closely, on the embedded software. Because smart meters are mass-produced, any quality problem in the embedded software can cause severe economic loss and even disastrous consequences. Embedded software therefore requires strict testing, validation and verification.

Embedded software failures and their sources

An embedded software fault is a defect or error that causes the software to fail to perform its required functions during operation. [1] For embedded software without fault tolerance, a fault means total failure. Embedded software failures in smart meters have the following characteristics: exponential scale, time independence, independence from the external environment, high concealment, high randomness, high dispersibility, being caused entirely by design, and non-recurrence once corrected. In general, embedded software failures are mainly caused by design errors and inadequate consideration, including the following seven factors:

Incorrect standards. A standard is the specification of the logic and algorithms used to solve a problem. [1] In programming, the development of programming standards is critical. If a standard is wrong, flawed, poorly thought out, or self-contradictory, bugs will appear in the embedded software.

Performance errors. The performance of the embedded software differs from what the customer required, for example in response time, execution time, or the accuracy of the control system.

Lack of requirements analysis. The first step in the design of smart meters is analysing customer requirements, which is extremely important. Before designing the smart meter system, we must thoroughly understand what the customer needs and review the customer requirements word by word. At the same time, we can carry out field trials, communicate and hold discussions with the customer in order to fully understand the requirements.

Mistakes in the overall scheme. The overall scheme has to specify: the embedded software architecture and data structures required by the customer; the relationship between system software and customer software; the main software program and its subroutines; the structure and function of the interrupt handlers; the interfaces of particular peripherals; the device drivers, and so on. Any carelessness or oversight at this stage will cause further trouble in the subsequent software development.

Errors in programming. Software engineers often make a variety of mistakes when coding, such as syntax errors, semantic errors, domain errors, logical errors, arithmetic errors, endless loops and so on.

Errors occurring in interrupt and stack operations. Interrupts are essential for the real-time response of smart meters to power drop events. If an error occurs during interrupt handling, it will cause energy data errors and other problems.

Human factors. The reliability of embedded software depends heavily on the engineer. [1] The embedded software designer must therefore have a good knowledge of data structures and program design, and be able to debug and test embedded software skilfully. At the same time, designers should have professional integrity and a good work ethic.

Software quality management system

Failures can be introduced in every step of the software development process. Proper management measures are necessary for developing software with high dependability. [1] The dependability management of software is assured by implementing the principle of feedback control. The basics are mainly planning, organization, supervision and control. Software quality management involves the collaboration of multiple departments throughout each stage of the software life cycle. Figure 1 describes the basic flow of software quality management.

Reliability design of smart meter embedded software

The smart meter system is characterised by the following factors: data acquisition and control; a massive number of hardware and embedded software combinations; a large number of functional operations; a large number of module calls; a complex external working environment; and susceptibility to interference from other devices. An execution error will result in data errors, system collapse or worse. During embedded software design, we should pay extra attention to redundancy and preventive design at the interface between hardware and software. We can adopt anti-jamming techniques such as watchdog circuits, status monitoring, software lock design, program trap design and backup technology to achieve effective system fault tolerance. [2]

Software noise removal

Smart meters are often subjected to various kinds of noise in practical operation. If this noise is not removed, mistakes will inevitably occur. Besides the countermeasures taken in hardware design, some measures can also be taken in software design to reduce the chance of errors. [1] For example, when detecting a power drop event, if only a single reading is taken as the real data parameter, the data will often be inaccurate because of accidental interference during acquisition. In this case, adding a proper filtering method to the software can suppress the effect of the interference.
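As a minimal sketch of this idea, and not the method used in any particular meter, the decision can be based on the median of several consecutive readings instead of a single one. The sample count, the threshold and the function names below are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_COUNT  5u        /* hypothetical: readings taken per decision */
#define POWER_DROP_MV 160000    /* hypothetical threshold, in millivolts */

static_assert(SAMPLE_COUNT % 2u == 1u, "median needs an odd sample count");

/* Return the median of SAMPLE_COUNT readings (insertion sort on a copy). */
int32_t median_of_samples(const int32_t samples[SAMPLE_COUNT])
{
    int32_t sorted[SAMPLE_COUNT];

    for (size_t i = 0; i < SAMPLE_COUNT; i++) {
        int32_t value = samples[i];
        size_t j = i;
        while (j > 0 && sorted[j - 1] > value) {
            sorted[j] = sorted[j - 1];   /* shift larger entries up */
            j--;
        }
        sorted[j] = value;
    }
    return sorted[SAMPLE_COUNT / 2u];
}

/* Report a power drop only if the median reading is below the threshold,
 * so that one or two disturbed readings cannot trigger the event. */
bool power_drop_detected(const int32_t samples[SAMPLE_COUNT])
{
    return median_of_samples(samples) < POWER_DROP_MV;
}
```

With five readings, up to two disturbed values are ignored; a real design would pick the sample count and interval from the required response time.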

Digital filtering design

At present, smart meters use a variety of metering chips. The central processing unit (CPU) and the metering chip communicate through a serial peripheral interface (SPI) or a universal asynchronous receiver-transmitter (UART) to obtain the operating parameters of the power system. If the bus is disturbed during communication, or the metering chip is in an abnormal state, the CPU will receive erroneous data. [3] It is therefore critical to add a filtering process to the embedded software. The commonly used digital filters include FIR and IIR filters.
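The two filter families can be illustrated with their simplest members: a moving average (an FIR filter with equal coefficients) and a first-order low-pass (an IIR filter). This is only a sketch in integer arithmetic; the window length, the type names and the function names are assumptions, and a production filter would be designed from the actual noise spectrum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIR_TAPS 4u   /* hypothetical window length */

/* FIR example: moving average over the last FIR_TAPS readings. */
typedef struct {
    int32_t  history[FIR_TAPS];
    uint32_t next;     /* slot that will be overwritten next */
    uint32_t filled;   /* valid entries, saturates at FIR_TAPS */
    int64_t  sum;      /* running sum of the valid entries */
} fir_avg_t;

int32_t fir_avg_update(fir_avg_t *f, int32_t sample)
{
    assert(f != NULL);
    if (f->filled == FIR_TAPS) {
        f->sum -= f->history[f->next];   /* drop the oldest reading */
    } else {
        f->filled++;
    }
    f->history[f->next] = sample;
    f->sum += sample;
    f->next = (f->next + 1u) % FIR_TAPS;
    return (int32_t)(f->sum / (int64_t)f->filled);
}

/* IIR example: first-order low-pass, y += (x - y) / 2^shift. */
typedef struct {
    int32_t y;        /* filter output, also the filter state */
    int     primed;   /* 0 until the first sample has been seen */
} iir_lp_t;

int32_t iir_lp_update(iir_lp_t *f, int32_t sample, unsigned shift)
{
    assert(f != NULL && shift < 31u);
    if (!f->primed) {
        f->y = sample;        /* start from the first reading */
        f->primed = 1;
    } else {
        int64_t diff = (int64_t)sample - f->y;
        f->y += (int32_t)(diff / ((int64_t)1 << shift));
    }
    return f->y;
}
```

Both structures are meant to be zero-initialised before the first call. The FIR version forgets a disturbance completely after FIR_TAPS readings, while the IIR version needs less memory but lets a disturbance decay gradually.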

Data redundancy design

To improve the reliability of the system, the system parameters and the calibration parameters can be stored with multiple backups; when one set of data is corrupted, another backup set can be used instead. [3] To guarantee security and improve the fault tolerance of the data, the backup sets should be stored in separate locations.
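One possible way to decide whether a backup set is still intact is to store a CRC with each copy. The sketch below assumes three copies and a hypothetical calibration record; the array stands in for copies that a real meter would place in separate memory areas or devices, and the CRC variant (CRC-16/CCITT-FALSE) is our choice for the example, not something prescribed by the references.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define COPY_COUNT 3u   /* hypothetical number of backup copies */

typedef struct {
    int32_t  voltage_gain;   /* hypothetical calibration fields */
    int32_t  current_gain;
    uint16_t crc;            /* CRC over the fields above */
} calib_record_t;

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

static uint16_t record_crc(const calib_record_t *r)
{
    return crc16_ccitt((const uint8_t *)r, offsetof(calib_record_t, crc));
}

/* Stamp a record with its CRC before it is written to the backups. */
void calib_seal(calib_record_t *r)
{
    assert(r != NULL);
    r->crc = record_crc(r);
}

/* Load the first intact copy into *out and repair damaged copies from it.
 * Returns false if every copy fails its CRC check. */
bool calib_load(calib_record_t copies[COPY_COUNT], calib_record_t *out)
{
    assert(copies != NULL && out != NULL);
    for (size_t i = 0; i < COPY_COUNT; i++) {
        if (record_crc(&copies[i]) == copies[i].crc) {
            *out = copies[i];
            for (size_t j = 0; j < COPY_COUNT; j++) {
                if (record_crc(&copies[j]) != copies[j].crc) {
                    copies[j] = *out;   /* restore the damaged backup */
                }
            }
            return true;
        }
    }
    return false;
}
```

Repairing the damaged copies at load time keeps the redundancy from eroding silently; whether that is desirable depends on the wear characteristics of the storage used.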

Redundant design of data validation and operations

When the central processor writes settings or calibration parameters to memory, the operation may be disturbed, so that erroneous data is written. The CPU itself cannot tell whether the data is correct. [3] To ensure that data is written properly, the software should compute a “checksum” over the data to be written and store this “checksum” in memory along with the data. After each write operation completes, the software should read the data back, recompute the “checksum” and compare the two “checksum” results. If they are inconsistent, the write operation is repeated until the data is written correctly. If the number of rewrites exceeds the maximum allowed, the operation is counted as a write operation error.
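The write, read back, compare and retry sequence described above can be sketched as follows. The simulated memory, its fault-injection field, the 8-bit additive checksum and the retry limit are all assumptions made so that the example runs on a PC; a real meter would call its EEPROM or flash driver instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD  32u
#define MAX_REWRITES 3u    /* hypothetical maximum number of rewrites */

/* Simulated memory standing in for a real EEPROM/flash driver. */
typedef struct {
    uint8_t  cells[MAX_PAYLOAD + 1u];   /* payload followed by checksum byte */
    unsigned faults_pending;            /* test hook: next N writes are disturbed */
} sim_memory_t;

static void mem_write(sim_memory_t *m, const uint8_t *src, size_t len)
{
    memcpy(m->cells, src, len);
    if (m->faults_pending > 0u) {
        m->faults_pending--;
        m->cells[0] ^= 0x01u;           /* emulate a disturbed write */
    }
}

static void mem_read(const sim_memory_t *m, uint8_t *dst, size_t len)
{
    memcpy(dst, m->cells, len);
}

/* Simple 8-bit checksum: inverted sum of all bytes. */
static uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return (uint8_t)~sum;
}

/* Write payload plus checksum, read it back, recompute the checksum and
 * compare. Retries up to MAX_REWRITES times; returns false on a write error. */
bool checked_write(sim_memory_t *m, const uint8_t *payload, size_t len)
{
    assert(m != NULL && payload != NULL && len <= MAX_PAYLOAD);
    uint8_t frame[MAX_PAYLOAD + 1u];
    uint8_t readback[MAX_PAYLOAD + 1u];

    memcpy(frame, payload, len);
    frame[len] = checksum8(payload, len);

    for (unsigned attempt = 0; attempt <= MAX_REWRITES; attempt++) {
        mem_write(m, frame, len + 1u);
        mem_read(m, readback, len + 1u);
        if (checksum8(readback, len) == frame[len] &&
            readback[len] == frame[len]) {
            return true;
        }
    }
    return false;   /* counted as a write operation error */
}
```

An additive checksum is used here only because it is short; it cannot detect every multi-byte error, so a CRC may be preferable where the storage allows it.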

Software trap design

A software trap is an application of instruction redundancy used to capture a runaway program. [3] The ‘software trap’ is a jump instruction that forces the captured program to a specific address, where the fault is handled. Following this principle, a software trap designed for an ARM core can record the stack information in a dedicated interrupt service routine and force execution into the trap program, helping the software designer to locate the error quickly. Software traps are best placed in large unused areas of read-only memory, in unused interrupt vectors, at “breakpoints” in the program, and at the head and tail of tables.
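The recording step of such a trap can be sketched as below. We assume a Cortex-M class core, which pushes eight registers (r0 to r3, r12, lr, pc, xPSR) onto the stack on exception entry when no floating-point state is saved; the record layout, the marker value and the function name are hypothetical, and the target-specific assembly that passes the stack pointer to this function is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers stacked by a Cortex-M core on exception entry
 * (basic frame, assuming no floating-point context). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} trap_frame_t;

#define TRAP_MAGIC 0x54524150u   /* hypothetical marker: record is valid */

/* On a real target this record would live in memory that survives a
 * reset (an assumption), so it can be read out after the restart. */
typedef struct {
    uint32_t     magic;
    trap_frame_t frame;
} trap_record_t;

trap_record_t g_trap_record;

/* Called by the trap or fault handler with the address of the stacked
 * frame. Copies the registers so the faulting location can be found. */
void trap_capture(const uint32_t *stacked)
{
    assert(stacked != NULL);
    g_trap_record.frame.r0   = stacked[0];
    g_trap_record.frame.r1   = stacked[1];
    g_trap_record.frame.r2   = stacked[2];
    g_trap_record.frame.r3   = stacked[3];
    g_trap_record.frame.r12  = stacked[4];
    g_trap_record.frame.lr   = stacked[5];
    g_trap_record.frame.pc   = stacked[6];   /* address where the program was trapped */
    g_trap_record.frame.xpsr = stacked[7];
    g_trap_record.magic      = TRAP_MAGIC;
    /* Real firmware would now save critical data and restart the system. */
}
```

After a restart, the saved pc and lr values point to the instruction and the caller that were active when the program ran away, which is usually enough to start the investigation.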

Software watchdog design

A ‘watchdog’ is a combined hardware and software method of preventing the program from getting stuck in an endless loop. The watchdog hardware is a standalone counter with a timeout period of T. During normal operation, the software zeroes the watchdog at intervals shorter than T, so the counter never overflows. However, when the system is in an abnormal state, the timing logic of the CPU is disrupted and the counter is not zeroed within the period T, which eventually causes it to overflow. The watchdog then produces a reset signal, which is sent to the CPU to reset it. This design allows the system to recover from transient disturbances and enhances its reliability.
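The counter logic can be modelled in a few lines of C. This is a simulation for illustration only: in a real meter the counter is a hardware peripheral, and the structure and function names here are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of a watchdog counter with period T (in ticks). */
typedef struct {
    uint32_t period_ticks;     /* T */
    uint32_t counter;          /* ticks since the last zeroing */
    bool     reset_requested;  /* models the reset signal to the CPU */
} watchdog_t;

void watchdog_init(watchdog_t *w, uint32_t period_ticks)
{
    assert(w != NULL && period_ticks > 0u);
    w->period_ticks = period_ticks;
    w->counter = 0;
    w->reset_requested = false;
}

/* Called by the healthy main loop at intervals shorter than T. */
void watchdog_kick(watchdog_t *w)
{
    assert(w != NULL);
    w->counter = 0;
}

/* Called once per timer tick, independently of the main program. */
void watchdog_tick(watchdog_t *w)
{
    assert(w != NULL);
    if (w->reset_requested) {
        return;                        /* reset already signalled */
    }
    w->counter++;
    if (w->counter >= w->period_ticks) {
        w->reset_requested = true;     /* counter overflowed: reset the CPU */
    }
}
```

The zeroing call is best placed at a single point in the main loop rather than in a timer interrupt, since an interrupt may keep running even when the main program is stuck.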

Embedded software reuse

Reusing mature embedded software as far as possible not only shortens the development cycle and improves development efficiency, but also improves the maintainability and reliability of the embedded software. Embedded software reuse is an integral part of the work at the beginning of project planning and is a necessary method of improving embedded software reliability.

Regular monitoring

Design a few system self-check functions and run them regularly while the system is operating. When an exception is detected, report the error promptly and save critical data. This method can minimise the loss caused by a failure and help the engineers find and fix the problem as soon as possible.
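A small table of check functions is one way to organise such self-checks. The two checks below (a guard word at the stack limit and an energy register that must never decrease), the global variables and the save hook are hypothetical examples rather than a list taken from a real meter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_CANARY 0xA5A5A5A5u

uint32_t g_stack_canary   = STACK_CANARY;  /* hypothetical guard word */
uint32_t g_energy_wh      = 0;             /* hypothetical energy register */
uint32_t g_last_energy_wh = 0;             /* value seen by the previous check */
unsigned g_critical_saves = 0;             /* how often the save hook ran */

static bool check_stack_canary(void)
{
    return g_stack_canary == STACK_CANARY;
}

static bool check_energy_monotonic(void)
{
    bool ok = g_energy_wh >= g_last_energy_wh;   /* energy must not decrease */
    g_last_energy_wh = g_energy_wh;
    return ok;
}

typedef struct {
    const char *name;
    bool (*run)(void);
} self_check_t;

static const self_check_t k_checks[] = {
    { "stack canary",     check_stack_canary },
    { "energy monotonic", check_energy_monotonic },
};

#define CHECK_COUNT (sizeof k_checks / sizeof k_checks[0])
static_assert(CHECK_COUNT <= 32u, "failure bitmask holds at most 32 checks");

/* Placeholder: real firmware would write critical data to storage here. */
static void save_critical_data(void)
{
    g_critical_saves++;
}

/* Run every self-check; bit i of the result is set if check i failed. */
uint32_t monitor_run(void)
{
    uint32_t failed = 0;
    for (size_t i = 0; i < CHECK_COUNT; i++) {
        if (!k_checks[i].run()) {
            failed |= (uint32_t)1u << i;
        }
    }
    if (failed != 0u) {
        save_critical_data();
    }
    return failed;
}
```

Calling monitor_run() from a periodic task and logging the returned bitmask together with the check names gives the engineers a starting point for locating the fault.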


This article is based on years of smart meter development experience at Inhemeter Company. The quality of smart meters can be significantly enhanced by improving the skills of software developers, enforcing embedded software quality management more rigorously and adopting a proper reliability design scheme.


1. Li Bocheng. Reliability design of embedded systems [M]. Beijing: Electronic Industry Press, 2006.

2. Qiao Bing. Reliability design of embedded software [J]. Electronic World, 2003, (5): 143-144.

3. Xu Xihua, Zheng Yujie, Xu Jinfeng. Study on reliability design of electric instrument [J]. Electric World, 2015, 56(6): 43-45.


She Zhi is a senior engineer at INHEMETER Company. He has more than 10 years of experience in software project development and has won first prize in the company’s technological innovation awards three times. His main research focus is the development of new technologies for smart meters.