6.6 Dynamic execution architecture

6.6.1 Overview

Before diving into this chapter, let's first look at a basic but very important concept in industrial embedded products: strong real-time versus weak real-time (commonly called hard and soft real-time).

Most embedded systems we meet in daily life are weak real-time systems; a household washing machine is a typical example. The interval between pressing the start button and the machine starting to wash may be 100 ms or 1 s, and sometimes the machine may even beep to ask you to press the button again. A slow response at worst makes for a poor experience or draws a curse or two, but it causes no other loss. Such systems are weak real-time systems.

Unlike a weak real-time system, a strong real-time system that fails to act correctly before its deadline causes very serious consequences. The microcomputer protection device discussed in this book is a strong real-time system. When a fault occurs in the power system, the faulty line must be cut off within a specified time; otherwise expensive electrical equipment may burn out, and extreme accidents such as large-scale blackouts may even follow. In microcomputer protection, this specified time is on the order of milliseconds. As the saying goes, an army is maintained for a thousand days to be used for a single moment.

Although the two terms differ by only one word, the programming philosophies behind them differ enormously. This course reveals that difference and takes you through my own journey of exploration over the years.

It should also be stressed that even in a strong real-time system, the strong real-time requirement usually applies to only one or a few points, while the other modules are weak real-time. Take the communication module of a microcomputer protection device: normally it is required to send the acknowledgement frame within 50 ms of receiving a request, but if it occasionally takes 100 ms, the worst outcome is a communication delay or a retransmission, and the loss is small. Because of this, the strong and weak real-time parts must be fully identified and treated separately when building the program execution framework for microcomputer protection. Strong real-time modules must be guaranteed to finish within a specified time, while weak real-time modules should run as fast and as stably as possible.

Unlike an ordinary strong real-time system, the microcomputer protection device has another characteristic: it requires a large amount of complex calculation, which further increases the complexity of software design. For the transformer protection discussed in this book, it is necessary to calculate not only the three-phase currents, three-phase differential currents and three-phase restraint currents on the four sides of the transformer, but also the second- and fifth-harmonic quantities used for blocking. Every channel needs its own complex filter calculation. How to guarantee real-time performance while still completing all these calculations is a key consideration in transformer protection.

The formula for calculating the electrical quantity of each channel is given below; converted into a computer algorithm, it is a series of multiply-add operations. The Fourier transform is often described as one of the most beautiful formulas in mathematics, and it is the foundation of many engineering disciplines.

[Figure: Fourier transform formula for the electrical quantity of each channel (image not available)]
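
Since the formula image is missing, here is a minimal sketch of how such a calculation turns into multiply-add operations. It assumes the widely used full-cycle Fourier algorithm, 12 samples per power cycle, and a coefficient table stored in ROM; the sampling rate and the names `fourier_filter` and `phasor_t` are illustrative assumptions, not the author's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLES_PER_CYCLE 12   /* assumed: 12 samples per power cycle */

/* cos(2*pi*k/12) and sin(2*pi*k/12), precomputed as they would be in ROM. */
static const double COS12[SAMPLES_PER_CYCLE] = {
     1.0,  0.8660254037844386,  0.5,  0.0, -0.5, -0.8660254037844386,
    -1.0, -0.8660254037844386, -0.5,  0.0,  0.5,  0.8660254037844386
};
static const double SIN12[SAMPLES_PER_CYCLE] = {
     0.0,  0.5,  0.8660254037844386,  1.0,  0.8660254037844386,  0.5,
     0.0, -0.5, -0.8660254037844386, -1.0, -0.8660254037844386, -0.5
};

typedef struct {
    double re;   /* real part of the phasor */
    double im;   /* imaginary part of the phasor */
} phasor_t;

/* Full-cycle Fourier filter over one cycle of samples.
 * h selects the harmonic (1 = fundamental, 2 = second, 5 = fifth);
 * with 12 samples per cycle it is only meaningful for h < 6. */
phasor_t fourier_filter(const double *x, unsigned h)
{
    phasor_t p = {0.0, 0.0};

    assert(x != NULL);
    for (unsigned k = 0; k < SAMPLES_PER_CYCLE; ++k) {
        unsigned idx = (h * k) % SAMPLES_PER_CYCLE;  /* table index for angle h*2*pi*k/N */
        p.re += x[k] * COS12[idx];                   /* one multiply-add per sample */
        p.im -= x[k] * SIN12[idx];
    }
    p.re *= 2.0 / SAMPLES_PER_CYCLE;
    p.im *= 2.0 / SAMPLES_PER_CYCLE;
    return p;
}

/* Squared amplitude; avoids a square root on a small CPU. */
double phasor_mag_squared(phasor_t p)
{
    return p.re * p.re + p.im * p.im;
}
```

For a pure fundamental cosine of amplitude 10, `fourier_filter(x, 1)` returns a phasor whose squared amplitude is 100, while the second-harmonic result is essentially zero. Every channel and every harmonic repeats this loop, which is why the filter calculation dominates the CPU budget.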

◇◇◇

There is another common misconception about the dynamic organization of programs.

In online discussions, many people complain that their company's products are too low-end because they do not even use an OS, as if using an OS made a product high-end and not using one made it low-end. Many college students also say that learning embedded development feels like just learning Linux: in the end they have only called a few interface functions and picked up a little knowledge.

To learn industrial embedded systems, I personally recommend starting from a bare-metal system and a simple CPU: set up a development environment, configure some registers, write a few simple drivers, get to know interrupts and mutual exclusion, get the program running, and get a first taste of how software and hardware work together. In short, build an overall understanding first, and everything that follows will take half the effort for twice the result.

In this chapter, I will take you through a variety of embedded program organization modes, which is also my own journey of exploration over the years.

6.6.2 Starting point: foreground/background system

The simplest program structure in an embedded system is called the foreground/background system, as shown in the following figure:

[Figure: structure of the foreground/background system (image not available)]

In this structure the main program is an infinite loop that calls each task module in turn; we usually call this main loop the background. Interrupt service routines handle asynchronous events, generally by setting a flag and leaving the processing to a background task; these routines are called the foreground. (Note: with the prevalence of Internet software, "front end" often refers specifically to the interface programs that interact with users, and "back end" to business processing, database storage and so on. Be careful not to confuse the concepts.)

This structure is very basic, but it is remarkably long-lived. It is almost the universal structure for simple embedded devices without strong real-time requirements, and it has many advantages:

  1. The program structure is simple and small and needs little program memory and RAM, so very low-cost CPUs can be used;
  2. Tasks execute serially, so synchronization and mutual exclusion hardly need to be considered;
  3. Normally a flag is set in the interrupt, and the background task reads the flag and does the corresponding processing. With this mechanism, mutual exclusion between interrupts and the main loop is easy to handle: the main concern is preventing a flag from being modified by an interrupt while the main program is using it, and the remedy is simple: copy the flag into a local buffer before each use;
  4. It is especially suitable for embedded applications that must save power, such as household remote controls, where the processor is normally halted and is woken quickly by an interrupt.
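
The flag mechanism in point 3 can be sketched roughly as follows. This is a host-side illustration under assumptions: the event names, the two ISR functions and the empty `irq_disable()`/`irq_enable()` placeholders are hypothetical, and on real hardware the placeholders would mask and unmask interrupts.

```c
#include <assert.h>
#include <stdint.h>

#define EVT_UART_RX  0x01u   /* hypothetical event bits */
#define EVT_KEY      0x02u

static volatile uint8_t g_event_flags;   /* written by ISRs, read by the main loop */
static unsigned g_uart_handled;
static unsigned g_key_handled;

/* Placeholders: on real hardware these would mask/unmask interrupts. */
static void irq_disable(void) {}
static void irq_enable(void)  {}

/* Foreground: the ISRs only record that something happened. */
void uart_rx_isr(void) { g_event_flags |= EVT_UART_RX; }
void key_isr(void)     { g_event_flags |= EVT_KEY; }

/* Background: one pass of the main loop. */
void background_pass(void)
{
    uint8_t pending;

    /* Copy the flags into a local buffer and clear them in one short
     * critical section, so an interrupt cannot change them while they
     * are being processed below. */
    irq_disable();
    pending = g_event_flags;
    g_event_flags = 0;
    irq_enable();

    if (pending & EVT_UART_RX) {
        g_uart_handled++;    /* handle received data here */
    }
    if (pending & EVT_KEY) {
        g_key_handled++;     /* handle the key press here */
    }
}
```

In firmware, `main()` would simply call `background_pass()` inside `for (;;)`. An event that arrives during a pass is picked up on the next pass, which is exactly why the worst-case response time equals one full pass of the loop.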

Of course, this structure also has many shortcomings. The main one is that the time the background program takes to complete one pass of the loop is not deterministic, so its worst-case response time is the maximum time of one full pass of the main loop. Obviously, this structure cannot meet the strong real-time requirements of microcomputer protection.

6.6.3 Exploration 1: Double-loop foreground/background system

Early embedded CPUs were all 8-bit single-chip microcomputers with extremely limited performance. To guarantee the strong real-time behavior of protection, the real-time demands of the other tasks had to be relaxed. The older generation of microcomputer protection developers came up with an innovative double-loop program structure, as shown in the following figure:

[Figure: double-loop program structure (image not available)]

The whole program is divided into two loops: the weak real-time loop runs weak real-time tasks such as communication and the LCD, while the strong real-time loop runs strong real-time tasks such as protection electrical-quantity calculation, logic judgment and trip output.

Normally the weak real-time tasks run, and a fast startup-logic check is performed in the sampling interrupt. If an abnormal sign is detected, the program switches immediately to the strong real-time tasks. When the entire protection logic has been processed and the whole group has reset, the program switches back to the weak real-time tasks.

This structure guaranteed the strong real-time performance of the protection module in the early days, when CPU computing power was very scarce, but it also has many disadvantages:

  1. To return directly into the protection program from the interrupt, the program stack must be modified, much like task switching in a modern OS. Our early practice, however, was rather crude: there was no register saving or stack switching, so the running environment of the interrupted weak real-time task was simply destroyed. When the whole group reset, the weak real-time tasks had to be reinitialized, and this requirement increased the complexity of the program.
  2. Over-reliance on the protection starting element. The starting element is a small but sensitive detection routine. Because of differences in algorithms, however, a protection element may meet its operating condition while the corresponding starting element has not operated, and in that case the protection does not act correctly.
  3. While some time-consuming, long-running protection elements execute, communication and the LCD do not respond for a long time. This mattered little in the early days, when each microcomputer protection device ran standalone, but as the devices were gradually networked, this shortcoming became intolerable.
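
The "crude" switch described in point 1 can be imitated on a PC with `setjmp`/`longjmp`. This is only an analogy under stated assumptions: the original code rewrote the stack inside a real interrupt, whereas here the sampling interrupt is simulated by an ordinary function call, and all function and variable names are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf g_protection_entry;   /* entry point of the strong real-time loop */
static int g_fault_present;          /* set by the simulated power system */
static int g_weak_inits;
static int g_weak_runs;
static int g_protection_runs;

/* Simulated sampling interrupt with a quick startup check. */
static void sampling_isr(void)
{
    if (g_fault_present) {
        /* Abandon the weak task's context and jump straight to protection. */
        longjmp(g_protection_entry, 1);
    }
}

static void weak_tasks_init(void) { g_weak_inits++; }

/* One weak real-time task, "interrupted" in the middle in this simulation. */
static void weak_task(int inject_fault)
{
    g_fault_present = inject_fault;
    sampling_isr();        /* does not return here if protection starts */
    g_weak_runs++;         /* reached only when the task was not abandoned */
}

static void protection_loop(void)
{
    g_protection_runs++;   /* calculation, logic judgment, trip output ... */
    g_fault_present = 0;   /* the whole group resets */
}

/* Runs `passes` weak passes; the fault appears during pass `fault_pass`. */
void double_loop_run(int passes, int fault_pass)
{
    volatile int i = 0;    /* volatile: modified between setjmp and longjmp */

    weak_tasks_init();
    if (setjmp(g_protection_entry) != 0) {
        protection_loop();
        weak_tasks_init(); /* the weak context was destroyed: reinitialize it */
        i++;               /* the interrupted pass is lost */
    }
    for (; i < passes; i++) {
        weak_task(i == fault_pass);
    }
}
```

With `double_loop_run(5, 2)`, the third weak pass never completes: protection runs once, the weak tasks are initialized a second time, and only four weak passes finish. That lost pass and the forced reinitialization are the price of skipping register saving and stack switching.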

I remember that when I first met this kind of program I had just graduated from university, and this programming technique made a deep impression on me. It gave me two important insights:

  1. The technique of returning from an interrupt into a strong real-time task was a good introduction to OS concepts, and it made learning operating systems later feel natural. Because of this experience, when I later coached newcomers learning a strong real-time OS, I often had them try this technique of returning from an interrupt into a different program.
  2. To recover from the disruption, a weak real-time task must be able to resume operation after being interrupted. Doing so requires some programming skill and a good deal of care. Later I realized that this is also a very simple strategy for improving program reliability. To improve reliability, an embedded system checks itself continuously. In the past, when a task was found to be abnormal, the crude strategy was to reset the whole device; with this technique there is a much neater option of resetting only that task.

6.6.4 Exploration 2: Time-limited foreground/background system

Around 2000, standalone microcomputer protection devices began to be built into integrated systems, and networking became a basic requirement that the original double-loop foreground/background system could no longer meet. At the same time, various 32-bit embedded CPUs began to appear, such as the classic MC68332 microcontroller, which was enormously faster than the earlier 51-series single-chip microcomputers. Against this background we adopted a new program scheduling strategy. Because it strictly limits the execution time of each task in the background program, it is called the time-limited foreground/background system. The program structure is shown below:

[Figure: structure of the time-limited foreground/background system (image not available)]

The most basic feature of this structure is that, whatever happens, the strong real-time protection module is guaranteed to run within the maximum response time. As I recall, the agreed maximum response time was 5 ms, so every weak real-time task had to finish within 5 ms. Based on this idea, each pass of the main loop runs one weak real-time task and the strong real-time protection task, and the weak real-time tasks take turns.
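
One pass of such a main loop might look like the sketch below. The three weak tasks and the counters are hypothetical stand-ins; the point is only the pattern of one weak task in rotation followed by the protection task on every pass.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*task_fn)(void);

static unsigned g_comm_runs, g_lcd_runs, g_record_runs, g_protection_runs;

/* Hypothetical weak real-time tasks; each must return within the time limit. */
static void comm_task(void)   { g_comm_runs++; }
static void lcd_task(void)    { g_lcd_runs++; }
static void record_task(void) { g_record_runs++; }

/* Strong real-time protection task. */
static void protection_task(void) { g_protection_runs++; }

static const task_fn weak_tasks[] = { comm_task, lcd_task, record_task };
#define WEAK_TASK_COUNT (sizeof weak_tasks / sizeof weak_tasks[0])

static size_t g_next_weak;   /* index of the weak task to run on the next pass */

/* One pass of the main loop: one weak task in rotation, then protection. */
void main_loop_pass(void)
{
    assert(g_next_weak < WEAK_TASK_COUNT);
    weak_tasks[g_next_weak]();
    g_next_weak = (g_next_weak + 1) % WEAK_TASK_COUNT;
    protection_task();
}
```

Because protection runs after every single weak task, its response time is bounded by the longest weak task plus its own execution time, which is why every weak task had to fit within the agreed limit.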

While developing this product, we were forced into the habit of measuring the execution time of every piece of code. The results were surprising, and they taught me what the 80/20 rule really means: 80% of CPU time is spent in 20% of the code, especially the algorithm code. The lesson is that great effort spent optimizing ordinary control-flow code is pointless, whereas saving even a single assembly instruction in the algorithm code can be worthwhile.

To ensure that every weak real-time task finished within 5 ms, we were forced to use many programming tricks and add many intermediate states. For example, if a flash write takes 10 ms, the flash write routine cannot issue the write command and then busy-wait until it completes. It must record the intermediate state and return quickly, then continue on later invocations until the complete write process is finished.
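
The flash example can be sketched as a small state machine. The hardware access functions here are simulated and hypothetical (the simulated device reports busy for two polls); a real driver would issue the command and read the status register of the actual flash chip.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { FLASH_IDLE, FLASH_WAIT_DONE } flash_state_t;

static flash_state_t g_state = FLASH_IDLE;
static int g_busy_polls_left;   /* simulated hardware: polls until the write completes */

/* Hypothetical hardware access, simulated for a host build. */
static void flash_hw_start_write(void) { g_busy_polls_left = 2; }
static bool flash_hw_busy(void)
{
    if (g_busy_polls_left > 0) {
        g_busy_polls_left--;
        return true;
    }
    return false;
}

/* Issue the write command and return immediately. */
void flash_write_begin(void)
{
    assert(g_state == FLASH_IDLE);   /* only one write at a time */
    flash_hw_start_write();
    g_state = FLASH_WAIT_DONE;       /* remember the intermediate state */
}

/* Called once per main-loop pass; never blocks.
 * Returns true exactly once, on the pass in which the write completes. */
bool flash_write_poll(void)
{
    if (g_state == FLASH_WAIT_DONE && !flash_hw_busy()) {
        g_state = FLASH_IDLE;
        return true;
    }
    return false;
}
```

The caller starts the write in one pass and keeps calling `flash_write_poll()` in later passes. Each call costs only a status check, so the 5 ms limit holds even though the write itself takes longer.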

Programming tricks are a double-edged sword: they raise the skill level of the project team, but they also increase code complexity, which makes the code error-prone and harder to read and maintain later. As non-protection functions kept growing, the shortcomings of the time-limited foreground/background organization became increasingly prominent, and we were forced to set out again.

6.6.5 Exploration 3: Interrupt virtual tasks

Today, a microcomputer protection device is customarily described as an intelligent device integrating protection, measurement, control, monitoring, waveform recording, communication and other functions. In the early days this was not so. With limited embedded CPU computing power and limited hardware and software resources, protection devices and measurement-and-control devices were separate. Later, as embedded CPUs developed rapidly, they were gradually integrated to save cost.

Under this trend, protection devices contain more and more weak real-time software modules, and the share of protection code keeps shrinking. The original time-limited foreground/background system became very awkward, and the new trend called for a program organization that could accommodate it. How could this be solved?

In the time-limited foreground/background system, guaranteeing the strong real-time behavior of the protection module placed heavy demands on the weak real-time modules. Now that weak real-time code makes up the larger share, could we turn this around? This change of thinking led to a new way of organizing the program, as shown in the figure below:

[Figure: program structure with interrupt virtual tasks (image not available)]

In the previous organization, the foreground had only one layer of interrupt, for sampling and filtering. Now an additional layer of foreground interrupt runs the strong real-time protection module; we call it the protection interrupt. To guarantee strong real-time behavior, the protection interrupt is a 5 ms timer interrupt, which brings an additional constraint: the protection interrupt routine must finish within 5 ms.

Under the new architecture, timing optimization focuses on the strong real-time modules, and the weak real-time modules can be written in the conventional way. With the rapid growth of CPU computing power, the strong real-time modules are also easier to squeeze into 5 ms, so we get the best of both worlds. Furthermore, if the sampling interrupt detects a sign that protection is starting, we can adjust the protection interrupt timer to fire early, further shortening the protection operating time.
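
The idea of bringing the protection interrupt forward can be sketched as follows. This is a host-side simulation under assumptions: a software countdown inside the sampling interrupt stands in for reloading a hardware timer, six sampling periods per 5 ms is an invented figure, and the protection interrupt is called directly instead of being a separate, lower-priority interrupt.

```c
#include <assert.h>
#include <stdbool.h>

#define SAMPLES_PER_PROTECTION_TICK 6   /* assumed: 6 sampling periods per 5 ms */

static unsigned g_countdown = SAMPLES_PER_PROTECTION_TICK;
static unsigned g_protection_runs;

/* Protection interrupt: must finish within the 5 ms period. */
static void protection_isr(void)
{
    g_protection_runs++;   /* protection calculation and logic go here */
}

/* Sampling interrupt: sampling and filtering, plus the quick startup check. */
void sampling_isr(bool start_detected)
{
    if (start_detected) {
        g_countdown = 1;   /* bring the protection interrupt forward */
    }
    if (--g_countdown == 0) {
        g_countdown = SAMPLES_PER_PROTECTION_TICK;
        protection_isr();
    }
}
```

Without a start sign, protection runs on every sixth sample; when a start sign appears, it runs on that very sample and the regular period restarts from there.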

Unfortunately, a new strategy always runs into new problems. The biggest drawback of this one is that interrupts take up too much CPU time. Our measurements at the time showed that the protection and sampling interrupts in the transformer differential protection device accounted for nearly 40% of the CPU load. And because the protection and sampling interrupts have relatively high priority, they affect other modules, such as serial communication.

[Figure: image not available]

Because interrupts are dense and the CPU load is heavy, serial port interrupts were often lost, causing communication errors. In this situation a serial port polling routine has to be inserted into the interrupt task. At first, with the help of the buffer in the UART module, we barely managed a baud rate of 9600, but pushing it higher degraded the performance of the protection task, and the trade-off was quite difficult.

6.6.6 Exploration 4: Introducing a strong real-time OS for the first time

Virtual tasks in interrupts are already very close to an OS scheduling mechanism, and they bring a series of new problems of their own, so why not go one step further and adopt a strong real-time OS directly? After "serious" deliberation within the team, and since I had just finished learning uC/OS and was full of enthusiasm, we finally took this crucial step. We did not expect to be hit head-on.

According to the principles of OS use, there should be as little code in interrupts as possible. We moved the filter calculation out of the original sampling interrupt and turned the protection interrupt into a task. The whole program structure is shown in the following figure:

[Figure: program structure based on a strong real-time OS (image not available)]

The first blow was discovering that the dispersion of the protection response time had increased. The most direct symptom was that the protection operating time became unstable, sometimes fast and sometimes slow. Testing showed that dense serial port or CAN interrupts affected the response time of the strong real-time task.

The second blow was finding that OS-based programs are not easy to write. Everyone was used to the foreground/background system, in which every software module is a polling routine. With task scheduling, we naively added a sleep to each old polling routine and wrapped it in a loop. In the end we found, to our regret, that although the CPU was faster, the program ran slower than before. In addition, a sufficiently large stack has to be reserved for each task, so RAM usage rose sharply.

The third blow concerned program stability. The program kept showing inexplicable problems, even though most of the code had been proven through years of hard use. In-depth analysis showed that the root cause was synchronization and mutual exclusion. However the earlier scheduling mechanisms were adjusted, the mutual exclusion was essentially between the interrupts and the main program, which is relatively easy to handle. With multiple tasks, the number of synchronization and mutual-exclusion points grows, and we did not realize this at the beginning.

After a painful struggle, we ended up with a program version that used more resources, ran slower and was less stable. Everyone complained, and in the end we had to go back to the previous version. This experience made me realize for the first time that good technology is a double-edged sword.

6.6.7 Exploration 5: Enhanced interrupt virtual tasks

Although our first attempt at an OS ended in retreat, every experience is valuable. This first contact with an OS changed our thinking in many ways. For example, the synchronization mechanisms of an OS greatly simplify program structure: a routine that used to be implemented as a state machine now only needs to wait for a synchronization event. So could we borrow some of these strategies?

Building on interrupt virtual tasks, we experimented with introducing a weak real-time scheduling mechanism into the main program. Priorities are assigned to the weak real-time tasks, so they are scheduled by priority rather than simply taking turns. Of course, because these tasks are weak real-time, a running task is not preempted before it finishes. The program structure is shown in the following figure:

[Figure: program structure of enhanced interrupt virtual tasks (image not available)]

Similar to how an OS works, we also built a task ready table. To emulate synchronization semaphores, we allowed recursive weak real-time scheduling while a task sleeps or waits for a semaphore, and we additionally optimized the system timer module. The whole weak real-time scheduling mechanism is shown in the following figure:

[Figure: weak real-time scheduling mechanism (image not available)]
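
A ready table with priority-based, non-preemptive scheduling can be sketched as below. This is a minimal illustration under assumptions: eight priority levels, a one-byte bitmap as the ready table, and two hypothetical demo tasks. It does not cover the recursive scheduling used to emulate semaphores.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TASKS 8

typedef void (*task_fn)(void);

static task_fn g_tasks[MAX_TASKS];   /* index = priority, 0 is the highest */
static volatile uint8_t g_ready;     /* ready table: bit n set => task n is ready */

void task_register(unsigned prio, task_fn fn)
{
    assert(prio < MAX_TASKS && fn != 0);
    g_tasks[prio] = fn;
}

/* May be called from an interrupt or from another task. */
void task_set_ready(unsigned prio)
{
    assert(prio < MAX_TASKS);
    g_ready |= (uint8_t)(1u << prio);
}

/* Runs the highest-priority ready task to completion (no preemption).
 * Returns the priority that ran, or -1 if nothing was ready. */
int schedule_once(void)
{
    for (unsigned prio = 0; prio < MAX_TASKS; ++prio) {
        uint8_t mask = (uint8_t)(1u << prio);
        if ((g_ready & mask) && g_tasks[prio]) {
            g_ready &= (uint8_t)~mask;   /* clear the ready bit before running */
            g_tasks[prio]();
            return (int)prio;
        }
    }
    return -1;
}

/* Hypothetical demo tasks that record their execution order as digits. */
static unsigned g_trace;
static void comm_task(void) { g_trace = g_trace * 10 + 1; }
static void lcd_task(void)  { g_trace = g_trace * 10 + 2; }
```

If the LCD task (priority 3) becomes ready before the communication task (priority 1), the communication task still runs first. The main loop simply calls `schedule_once()` repeatedly and can idle when it returns -1.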

6.6.8 Our current position: dynamic execution framework

An OS is not only a scheduling mechanism but also a set of software middleware, such as a file system, TCP/IP and a USB protocol stack. With the rapid iteration of microcomputer protection equipment, functions such as network protocols and upgrading programs or exporting disturbance data via a USB flash drive have become routine operations in users' daily maintenance. We cannot reinvent all of these wheels, so we had no choice but to use an OS.

Fortunately, this time we came prepared. Dense serial port interrupts affect the response time of the protection task, so we chose a chip that supports DMA. Synchronization and mutual exclusion affect program stability, so, as mentioned many times before, they are confined to the API interface functions, with a dedicated person responsible for them; ...

Beyond that, based on our requirements, we also built a dynamic execution framework.

◇◇◇

OS-based tasks are endless loops, whereas tasks in a foreground/background system are functions that are called and then return. To unify the two kinds of program, we re-abstracted the application tasks and restructured every application module into a set of message response functions. If you remember the distributed model, you will see that this strategy also makes it easier to distribute the program across multiple CPUs later.

[Figure: application modules as message response functions (image not available)]

◇◇◇

There is still one problem when using an embedded OS: there are many kinds of OS, and every change of OS means a lot of rework. We used pSOS in the earliest stage. After pSOS was acquired, we had to switch to VxWorks. Later I used uC/OS on a simple CPU and used Windows simulation to build virtual devices. Recently, because of the ZTE and Huawei incidents, everyone has slowly begun to switch to RT-Thread.

Different systems have different characteristics and their own API interfaces. In the early days we called the OS system interfaces directly. Perhaps because we were young, we also liked to exploit the special features of each OS to the full. Unfortunately, this nearly turned later porting into a nightmare, and large chunks of code had to be thrown away and rewritten.

To solve this problem, we built an OS abstraction layer and carefully selected a small subset of functions to define the boundary of the application layer. This approach makes later system migration much easier. The most typical example is that the weak real-time scheduling strategy we built earlier could also be ported cleanly, so the early code could be fully reused.

[Figure: OS abstraction layer (image not available)]
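
One common way to build such a layer is a table of function pointers that each OS port fills in. The sketch below is an assumption about the shape of the idea, not our actual interface: it exposes only two services, and the only backend shown is a host simulation. A real port would wrap the corresponding calls of the chosen OS.

```c
#include <assert.h>
#include <stdint.h>

/* The only OS services the application layer is allowed to use. */
typedef struct {
    const char *name;
    uint32_t (*tick_ms)(void);          /* current time in milliseconds */
    void     (*sleep_ms)(uint32_t ms);  /* suspend the caller */
} osal_ops_t;

static const osal_ops_t *g_osal;

/* Selects the backend once at startup. */
void osal_bind(const osal_ops_t *ops)
{
    assert(ops && ops->tick_ms && ops->sleep_ms);
    g_osal = ops;
}

uint32_t osal_tick_ms(void)         { return g_osal->tick_ms(); }
void     osal_sleep_ms(uint32_t ms) { g_osal->sleep_ms(ms); }

/* Backend for simulation on a PC: time advances only when someone sleeps. */
static uint32_t g_sim_now;
static uint32_t sim_tick_ms(void)         { return g_sim_now; }
static void     sim_sleep_ms(uint32_t ms) { g_sim_now += ms; }
const osal_ops_t osal_sim = { "sim", sim_tick_ms, sim_sleep_ms };

/* Application code sees only osal_*() and never the OS underneath. */
uint32_t app_wait_and_measure(uint32_t ms)
{
    uint32_t t0 = osal_tick_ms();
    osal_sleep_ms(ms);
    return osal_tick_ms() - t0;
}
```

Porting to another OS then means writing one more `osal_ops_t` table. Keeping the table small is deliberate: every function added to it must later be implemented on every OS.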

◇◇◇

The task is the basic concept of an OS: it is both the unit of scheduling and the carrier of resources. OS-based programs need a stack for each task, but memory is generally scarce in embedded devices. Too many tasks also increase CPU load, and in microcomputer protection CPU computing power is in short supply as well. How can these conflicts be eased?

In practice, we found that binding one OS task to one application module is often awkward. The most typical case is a TCP communication protocol: the classic strategy is to create a new task for each client connection, but this leads to too many tasks in the system, wastes resources and hurts CPU efficiency.

To solve these problems, we separated OS tasks from applications and abstracted the application conceptually as a set of message functions and timers. For convenience of description, we call the OS task the active object, which is mainly responsible for resource scheduling, and the application module the reactive object, which is mainly responsible for application abstraction. A reactive object must be assigned to an active object to execute, and multiple reactive objects can be assigned to the same active object.

With the active and reactive mechanism, application modules can be built in whatever form is most natural, and task scheduling can take full account of constraints such as hardware resources, CPU computing power and real-time requirements.

[Figure: active objects and reactive objects (image not available)]
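
The relationship between the two kinds of object can be sketched as follows. All type and function names are hypothetical. The active object is reduced to a plain ring buffer so the sketch runs on a PC; in a real system the queue would belong to an OS task and would need blocking and mutual exclusion, and timers are left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int id;      /* message type */
    int param;   /* message payload */
} msg_t;

/* Reactive object: an application module, i.e. a message response
 * function plus the module's own state. */
typedef struct reactive {
    void (*on_message)(struct reactive *self, const msg_t *msg);
    int handled;   /* example state */
} reactive_t;

#define QUEUE_LEN 8

typedef struct {
    reactive_t *target;
    msg_t msg;
} envelope_t;

/* Active object: stands for one OS task with its message queue. */
typedef struct {
    envelope_t queue[QUEUE_LEN];
    size_t head, tail, count;
} active_t;

/* Queues a message for a reactive object hosted by this active object. */
int active_post(active_t *ao, reactive_t *target, msg_t msg)
{
    assert(ao && target && target->on_message);
    if (ao->count == QUEUE_LEN) {
        return -1;   /* queue full */
    }
    ao->queue[ao->tail].target = target;
    ao->queue[ao->tail].msg = msg;
    ao->tail = (ao->tail + 1) % QUEUE_LEN;
    ao->count++;
    return 0;
}

/* Body of the task loop: dispatches one message.
 * Returns 0 if the queue was empty, 1 otherwise. */
int active_dispatch_one(active_t *ao)
{
    envelope_t e;

    if (ao->count == 0) {
        return 0;
    }
    e = ao->queue[ao->head];
    ao->head = (ao->head + 1) % QUEUE_LEN;
    ao->count--;
    e.target->on_message(e.target, &e.msg);
    return 1;
}

/* Example message response function: accumulates the payload. */
static void counter_on_message(reactive_t *self, const msg_t *msg)
{
    self->handled += msg->param;
}
```

Two reactive objects can share one active object, and therefore one stack, simply by posting to the same queue. Moving a module to another task, or to another CPU, changes only where its messages are posted, not the module itself.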
◇◇◇

Like microcomputer protection devices, many industrial embedded devices must run stably and reliably for long periods. To guard against a runaway program, a foreground/background system customarily includes a watchdog module: the watchdog must be fed at regular intervals, otherwise the system resets. Under OS-based scheduling, the original watchdog strategy no longer fits. If a dedicated task feeds the watchdog, many other tasks may already be dead while the watchdog-feeding task is still running happily.

The most common solution is to introduce a keepAlive mechanism. Each task must report its running status to the system regularly. If a task fails to report, it may have hung and needs to be reset. Of course, this also requires that the task can be reset easily. The most common strategy is to provide a separate abnormal-initialization routine that only restores the intermediate state of the program and must not call interfaces such as memory allocation again.

The watchdog and keepAlive mechanisms are compared in the following figure:

[Figure: comparison of the watchdog and keepAlive mechanisms (image not available)]
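
A keepAlive monitor can be sketched in a few lines. The task count, the function names and the recovery callback are assumptions for illustration; here the recovery routine only records which task was reset, where a real one would restore that task's intermediate state without allocating memory again.

```c
#include <assert.h>
#include <stdbool.h>

#define TASK_COUNT 3

typedef struct {
    bool reported;               /* set by the task, cleared by the monitor */
    void (*recover)(int task);   /* abnormal-initialization routine of the task */
} keepalive_slot_t;

static keepalive_slot_t g_slots[TASK_COUNT];
static int g_last_reset_task = -1;

/* Example recovery routine: only records which task was reset. */
static void task_recover(int task) { g_last_reset_task = task; }

void keepalive_init(void)
{
    for (int t = 0; t < TASK_COUNT; ++t) {
        g_slots[t].reported = false;
        g_slots[t].recover = task_recover;
    }
}

/* Called regularly by each task to report that it is still running. */
void keepalive_report(int task)
{
    assert(task >= 0 && task < TASK_COUNT);
    g_slots[task].reported = true;
}

/* Called periodically by the monitor. Resets every task that has not
 * reported since the previous check and returns how many were reset. */
int keepalive_check(void)
{
    int resets = 0;

    for (int t = 0; t < TASK_COUNT; ++t) {
        if (!g_slots[t].reported) {
            g_slots[t].recover(t);
            resets++;
        }
        g_slots[t].reported = false;   /* start a new reporting period */
    }
    return resets;
}
```

The check period must be longer than the slowest task's reporting interval, otherwise healthy tasks would be reset by mistake.
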
◇◇◇

Combining the OS abstraction layer, active and reactive objects, the message queue mechanism and other strategies, the entire dynamic execution framework is shown in the following figure:

[Figure: overall dynamic execution framework (image not available)]

So far, we have built a dynamic execution framework suited to microcomputer protection, and it is the strategy our team currently uses. Different requirements call for different OSs: simple devices continue to use the weak real-time system, complex weak real-time applications use Linux, and strong real-time applications may use uC/OS or another OS.

More importantly, because everything is built on the dynamic execution framework, the code structure above it is the same whichever OS is chosen, which provides the foundation for reuse across systems.

6.6.9 Reflection and summary

The double-loop structure, the time-limited system, interrupt virtual tasks, weak real-time scheduling and finally the dynamic execution architecture: this journey is almost a miniature of my 20 years of work. We explored all the way and dug ourselves into pits all the way; often new problems arose before the old ones were fully solved. I have many feelings about the road travelled.

Sometimes I cannot help closing the book and pondering what kept driving us on. On reflection there are two key factors: trends and requirements. The first is the trend. Embedded CPUs keep getting faster, resources more abundant, peripherals smarter and OSs friendlier. Under this trend we cannot cling to the old ways forever. Since the old ways cannot be kept, the only choice is to welcome the future bravely.

The abundance of resources brings endless possibilities. Protection and measurement-and-control used to require separate devices, but they were integrated long ago; devices that once ran in isolation can now support complex object-oriented protocols. The abundance of software also changes how users operate and maintain their systems, even affects organizational structures, and in turn drives ever faster iteration of requirements.

It is foreseeable that this cycle of trends and requirements will keep iterating, so our journey of exploration will not stop. It forces everyone either to keep running hard or to be left behind by history.

◇◇◇

Another important insight is a changed understanding of technology.

I remember that in the early days I loved chasing new technologies and often complained within the team: what era is this, and we are still using assembler, still not using an OS, still not ... Now, in my thinking, technology is only one dimension; what comes to mind first is the product and the team. Technology in itself is neither good nor bad, neither high nor low. Choosing the technology best suited to the product and the team is the better strategy.

Most of what we use is applied science, and new technology is relatively quick to learn. Even if a technology is completely unfamiliar, you can find a service team and learn together, learning as you use it, and progress is generally fast. What really matters is not the new technology but the ability to learn, which I often deliberately stress in team training.

Perhaps influenced by classmates or advertisements, some colleagues keep telling me that they want to learn a certain technology. I often ask: will it be used in the current product? If the answer is no, I generally suggest taking only a quick look rather than studying it in depth, because in all likelihood you will not stick with it, and that will in turn harm your ability to learn.

Learning that starts from the job and extends outward, consolidating each item, getting started quickly and putting it to use quickly, soon accumulates a great many programming skills and also effectively strengthens the ability to learn.

◇◇◇

Finally, I want to share one view: no experience is wasted. Even the original task abnormal-recovery mechanism in the double-loop structure was a very valuable programming skill and life experience.

The path I have travelled is offered for your reference. Everyone works in a different industry and meets different opportunities, and I hope each of you can walk a wonderful growth path of your own.

——————————————

I am Xiaomaer, an embedded software engineer who aspires to conscience and soul. You are welcome to travel with me. If you are interested, you can add my personal WeChat account nzn_xiaomaer to get in touch; please include the words "different dimension" in the note.

Origin blog.csdn.net/zhangmalong/article/details/106729593