Pan Aimin: The Evolution of Computer Programs - Thirty Years of My Program Life

This article is from "New Programmer 004", in which I talked with Pan Aimin about his programming life. "New Programmer 004" is coming soon, so stay tuned. From Michael "Monty" Widenius, the father of MySQL and founder of MariaDB; Bruce Momjian, co-founder of the PostgreSQL Global Development Group; Jia Yangqing, vice president of Alibaba; Pan Aimin, founder and CEO of Instruction Set; Wu Jun, a well-known technology author; to Vue.js author You Yuxi (Evan You): "New Programmer 004", themed "Our Technology Era, My Programming Life", holds in-depth dialogues with well-known technology pioneers at home and abroad and with representatives of the new generation of programmers, in the hope that the technical paths and life insights of these outstanding figures will inspire everyone.

Author | Pan Aimin Editor | Tang Xiaoyin

Produced | "New Programmer" editorial department

I first came into contact with computer programs in 1985. Although they were only simple utility programs written in BASIC on an early learning machine, I already felt the joy of programming and enjoyed writing all kinds of trick programs. Over the past 30 years, computer programs have become a way of thinking for me: whether it is application-layer functionality, system-layer capabilities, or the data-processing logic behind them, in my mind they all ultimately reduce to machine instructions.

I am fortunate to have fully experienced the development process from the PC era, to the rise and development of the Internet, to the mobile Internet, to the era of the Internet of Everything and industrial digitization. The past 30 years have not only been the development period of software technology and industry, but also my personal career and program life. This article introduces the evolution of computer programs in my cognition, which happens to be a summary of thirty years of program life.

Pan Aimin (taken while working at Peking University in 1999)

Software stack - from source code to machine instructions

Generally speaking, a computer program refers to the code portion of software. Software covers much more: data, documents, sometimes hardware (such as dongles), and possibly corresponding services. A computer program is a set of instructions that direct a computer, or another device with information-processing capability, to perform various actions. A program can exist as machine instructions (sequences of binary 0s and 1s that are difficult for humans to read) or as source code written by humans (which programmers write and maintain).

The development of software technology over the past three decades can be viewed from the way programs are written and run. The typical ways are as follows:

1. Code translated directly into machine instructions

In the early days of programming, programmers controlled the machine by thinking in terms of how it executes instructions. The most typical programs were written in C: almost every line of code corresponds to a sequence of instructions, and assembly instructions (the textual form of machine instructions) can even be embedded directly in C source code.

2. Code interpretation and execution

The source code is first translated into an intermediate, abstract representation, which is then converted into machine language for execution. Following the Java philosophy of "Write once, run anywhere," programs written in Java are inherently cross-platform. The programmer faces an abstract computing environment; between the Java code and its execution there is a layer of indirection.
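This layer of indirection can be made visible. As an illustration (using Python rather than Java, since its tooling is easy to show in a few lines), the standard-library `dis` module reveals the intermediate instructions that the interpreter executes for each line of source:

```python
import dis

def add(a, b):
    """A one-line function whose intermediate form we can inspect."""
    return a + b

# dis shows the bytecode the interpreter actually runs: the single source
# line maps to a small sequence of abstract instructions (loads, an add,
# a return), which the runtime then carries to the physical CPU.
dis.dis(add)

print(add(2, 3))  # -> 5
```

The bytecode, not the source text, is the portable artifact; the interpreter on each platform is the indirection layer that maps it onto real machine instructions.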

3. Virtual Machines and Containerization

The code written by the programmer is ultimately executed by a physical CPU. As hardware has become more capable, one machine can be virtualized into several computers, and the machine code compiled or interpreted from the source is mapped once more into the instruction sequence that actually runs. The development and popularity of cloud computing has made this way of running programs mainstream. Containers are a lightweight form of virtual machine; the underlying idea is essentially the same.

4. Front-end and back-end separation

Data presentation is done with Web technologies, back-end processing with a suitable programming language, and the two communicate through APIs that conform to Web standards. Separating the visualization and user-interaction parts from the business logic, rather than coupling them together, is the core idea of front-end/back-end separation.
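A minimal sketch of this idea, using only Python's standard library (the endpoint name `/api/orders` and its payload are invented for the example): the back end exposes business data as JSON over HTTP, while the front end, here reduced to a bare client, independently decides how to present it.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class ApiHandler(BaseHTTPRequestHandler):
    """Back end: business logic behind a JSON-over-HTTP API, with no
    knowledge of how any front end will render the data."""

    def do_GET(self):
        if self.path == "/api/orders":
            body = json.dumps({"orders": [{"id": 1, "total": 42.0}]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

# Start the back end on a free port in a background thread.
server = HTTPServer(("127.0.0.1", 0), ApiHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Front end": consumes the API over HTTP and owns presentation decisions.
url = f"http://127.0.0.1:{server.server_address[1]}/api/orders"
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

server.shutdown()
print(data["orders"][0]["id"])  # -> 1
```

Either side can be rewritten, in any language, without touching the other, as long as the API contract holds.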

Of the four approaches above, the first three introduce one or more intermediate layers into a program running on a single machine, so that the source code is interpreted or mapped several times before reaching the physical CPU; the fourth decouples, across machines, the visualization and user-interaction parts from the business logic (especially data processing). These changes are closely tied to the development of the software industry. The following points are worth noting.

  • From source code to corresponding machine instructions, the path is increasingly unclear. In the physical machine environment, the source code is separated from the CPU by one or more layers; in the cloud computing environment, the programmer does not even know how the code is physically executed. This changing trend has made program debugging and performance optimization more complex and difficult.

  • When the program runs, the code path of a business function becomes longer and may even be distributed across different computing environments. Where a piece of business logic is triggered and where it is processed may span execution environments or even networks. This demands broader, composite skills from programmers; those with a single technology stack will face great challenges.

  • These programming approaches now coexist, each with scenarios it suits. With the continued development of the Internet and its penetration into various industries, hybrid approaches have become the mainstream of the software industry. For software developers, depth and breadth must go hand in hand: depth means understanding, level by level, how source code is carried down to execution; breadth means understanding how an execution chain migrates from one execution environment to another.

  • Technological progress is the internal driving force of industrial development. Cloud computing benefits from great advances in hardware and the maturity of virtualization technology. In turn, the growth of the industry has massively expanded the ranks of software practitioners: China has millions of programmers, even tens of millions by some ways of counting. The evolution of these programming approaches is closely tied to the efficiency of the software industry. A new software technology can raise industrial efficiency exponentially, which in turn may force a large number of practitioners to adapt to technological change.

I have always believed that writing code is a creative activity; that is the sacred part of the programmer's profession. Ideally, programmers write code that is creative and valuable in ways beyond the reach of robots and artificial intelligence; otherwise, sooner or later those jobs will be replaced by technological innovation. If the code a programmer writes is just the repetition of a formula, or can be described by a relatively simple formalization, then with today's technology it can be produced without a human. This is the goal many low-code and zero-code development platforms are working toward, and it has led to a trend: building tools is highly valuable, while writing large amounts of logic without the help of tools is of limited value.

Writing code in the dormitory during graduate school

For the first half of my career, I was a believer in understanding a program or system "in a binary way", trying to figure out the underlying instruction sequence behind every function or task. With the growing complexity of programs and systems, and the popularity of cloud computing and front-end/back-end separation, it has become increasingly impossible to understand them at that refined binary level. Under these circumstances, a grasp of software architecture matters more and more, and, as it happened, I upgraded from a system programmer to a software architect.

Network - Connectivity Everywhere

The development of the Internet has changed our lives, which is one of the biggest changes in human society in the past three decades. The network has also changed the way computer programs run, and even changed the thinking of programming. Let's first look at the evolution of the network itself:

  • Initially, networks were used in machine rooms or offices, and their physical form was very intuitive: in most cases, every machine trailed a cable. Common network functions were performed through specialized applications, such as e-mail clients, browsers, and file transfer tools. Programming network functionality was an advanced technique.

  • Networks became widespread. Homes and many public places gained connectivity, whether wired or over Wi-Fi. Under such conditions, more and more programs added network capabilities, and network programming gradually became common, though it remained an advanced skill. With the help of middleware, however, the threshold for network programming was lowered.

  • Mobile data networks became ubiquitous. Advances in networking are comprehensive, covering both hardware infrastructure and software stacks. Mobile data networks are relatively unstable, which challenges software programming; the complexity comes from two sides: handling network anomalies, and the fact that a network connection requires coordination between two parties. Thanks to the basic network functions natively provided by mobile operating systems, however, the threshold for network programming has been greatly lowered.

Furthermore, the ideas behind networking are inseparable from the operating system, and the two have evolved in close step. The core philosophy of networking is protocol layering: the implementation of each layer depends only on the semantics provided by the layer below, and in turn offers standard or agreed semantics to the layer above. The operating system has a similar hierarchical structure: the higher the level, the further from the hardware, and the lower the threshold for programming. Most of the networking software stack lives inside the operating system, so writing applications is not made significantly harder, since most of the complexity introduced by networking is absorbed by the operating system. For example, most of the complexity of mobile data networks is handled by the mobile operating system, and the application development frameworks built on top do not expose that complexity to applications.
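The layering principle can be sketched with a toy example (not a real protocol stack): each layer wraps the payload handed down from above with its own header, and relies only on the semantics of the layer directly below it.

```python
# A toy illustration of protocol layering: each "layer" is just a function
# that adds (or strips) exactly its own header, and talks only to the layer
# directly below. Layer names are illustrative, not a real network stack.

def app_send(message: str) -> bytes:
    # "Application layer": encodes text, then hands it to the transport layer.
    return transport_send(message.encode("utf-8"))

def transport_send(segment: bytes) -> bytes:
    # "Transport layer": prefixes a 2-byte length header, then uses the
    # network layer without knowing how delivery actually happens.
    header = len(segment).to_bytes(2, "big")
    return network_send(header + segment)

def network_send(packet: bytes) -> bytes:
    # "Network layer": a real stack would add addressing and routing here;
    # this stub just delivers the bytes unchanged.
    return packet

def transport_receive(packet: bytes) -> bytes:
    # Peels off exactly the header its peer layer added, and nothing more.
    length = int.from_bytes(packet[:2], "big")
    return packet[2:2 + length]

def app_receive(packet: bytes) -> str:
    return transport_receive(packet).decode("utf-8")

wire = app_send("hello")
print(app_receive(wire))  # -> hello
```

Because each layer touches only its own header, any layer can be replaced (say, a network layer that actually routes) without the others changing, which is the point of the design.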

However, the network itself has a huge impact on the writing of applications, from software design to code writing, and the following are some notable changes.

  • The most basic patterns of network programming are asynchronous programming and exception handling and recovery. TCP and UDP are not just two transport protocols but two programming mindsets; they guide how we design and write server and client programs.

  • For programs running in a network environment, we may write only half the program. The other half may be developed by people entirely unknown to us, perhaps on the other side of the world, and perhaps it does not even exist yet. We need to abide by the conventions agreed between the two sides, respect the other party's norms, and stay flexible enough to deal with surprises.

  • Network functions are prone to performance and experience problems. The network is usually a shared resource, so its instability should be expected. Handle it well and the unstable factors can be absorbed; handle it badly and performance and user experience suffer severely, and the process may even deadlock or crash. In this sense, there is always room for optimization and improvement in network programming.

  • The instability of the network brings uncertainty and makes network programs hard to diagnose. On the one hand, the program's network environment may jitter, producing unreproducible behavior that complicates diagnosis; on the other hand, the program may run in a physically unreachable environment (such as a cloud host), which requires programmers to have a more comprehensive understanding of the network environment; otherwise it is hard to trace a root cause from symptoms or error codes.
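The asynchronous pattern named in the first point above can be sketched with Python's `asyncio`: a minimal TCP echo server and client, each written as coroutines that wait for the other side without blocking the event loop.

```python
import asyncio

# A minimal asyncio TCP echo pair: server and client are coroutines that
# await I/O instead of blocking, the asynchronous style basic to network
# programming. Port 0 asks the OS for any free port.

async def handle(reader, writer):
    data = await reader.read(100)   # wait for data without blocking the loop
    writer.write(data)              # echo it back
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"ping")
    await writer.drain()
    reply = await reader.read(100)
    writer.close()
    await writer.wait_closed()

    server.close()
    await server.wait_closed()
    return reply

reply = asyncio.run(main())
print(reply)  # -> b'ping'
```

Note that the "two halves" of the program (server and client) only share the agreed byte-level contract; each side must be written to tolerate the other misbehaving.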

Computer networking has gone through rounds of technical refinement and elimination. The stable networks and good network applications we enjoy today are the accumulation of that history. On the hardware side, networks keep getting more stable, and wireless networks can hand over seamlessly between base stations (or access points); on the software side, the operating system absorbs a great deal of the network's complexity, leaving applications with relatively easy processing logic: the easy-to-handle HTTP protocol, stateless remote requests, automatic offline caching, and more. In the early days of mobile data networks, many applications showed a blank screen and stopped responding when the network was unstable, and developers had to write a great deal of code to improve the experience. With stable, widespread mobile networks and mature mobile operating systems, that kind of application code has shrunk dramatically.

With the development of the mobile Internet we have entered the era of the Internet of Everything, and industrial digitization is in full swing. Network connections are no longer limited to computers and personal devices; more and more devices of every kind are coming online. Both network technology and operating systems face an upgrade, changing from concepts through to functional extensions. After many years in the business, I am fortunate to be riding this wave: in this round of upgrades, an operating system for connecting devices has emerged as the times require. That is why I founded Instruction Set in 2018, specializing in the development and commercialization of an IoT operating system.

Artificial Intelligence - From Simulated Intelligence to Beyond Human Intelligence

The development of artificial intelligence represents humanity's pursuit of what computing can do. Computing is a capability that serves many ends, from scientific computation to transactional tasks; the task of artificial intelligence is to give machines, through computing, intelligence like our own. Since the birth of the computer, artificial intelligence has gone through ups and downs, but over the past three decades the discipline has generally moved forward. Some landmark events in the field:

  • Deep Blue (IBM): in 1997 it defeated the world chess champion Garry Kasparov.

  • The bionic robot BigDog (Boston Dynamics): unveiled in 2005, it could walk on four legs.

  • AlphaGo (developed by DeepMind, a Google subsidiary): in 2016 it defeated the human Go champion Lee Sedol.

  • Face recognition applied to mobile apps: for example, in 2015 Jack Ma demonstrated pay-by-face at the CeBIT exhibition in Germany.

  • AlphaFold/AlphaFold2 (developed by DeepMind): in 2020/2021 it essentially solved the long-standing problem of predicting protein folding structures.

In addition, over the past 10 years most car manufacturers (both traditional automakers and EV startups) and some Internet technology companies have been researching self-driving cars, and some have been launched one after another. From the events above, we can see several paths of exploration in applied artificial intelligence:

  • Simulated intelligence

Computing power is used to simulate the human thinking process. The most typical cases are rule-governed intellectual activities such as chess and Go, where human thought processes can be suitably abstracted. Given sufficient storage and computing power, together with models of human experience, the machine has a chance to do better than humans.

  • Using algorithms to achieve intelligent tasks

In many application scenarios, artificial intelligence algorithms (mainly deep learning algorithms) can be used to complete some well-defined tasks, such as face recognition, license plate recognition, speech recognition, and so on. This type of artificial intelligence application requires two conditions: enough samples and enough computing power. In the past ten years, the vigorous development of the mobile Internet has brought together enough sample data for many business scenarios, and combined with the development of cloud computing, this type of artificial intelligence application has developed rapidly.

  • Comprehensively replacing humans to achieve human-level intelligence

The most typical examples are self-driving cars and various robots with complex decision-making capabilities. Self-driving cars can free humans from driving tasks, and robots can replace humans in complex scenarios to perform tasks. This type of artificial intelligence application requires the integration of various software and hardware technologies, and has become a hot spot in the industry in recent years.

  • Beyond human intelligence

Exploring uncharted territory for the benefit of mankind. Typically, in some fields of scientific research, the introduction of artificial intelligence has achieved revolutionary breakthroughs. The AlphaFold2 mentioned above, for example, made a breakthrough in predicting protein folding structures, reaching results that humans had not been able to obtain through experiment.

The three core elements of artificial intelligence are data, computing power, and algorithms. Computing power is the physical basis of computation, data is its raw material, and the algorithm is its logic, whose final form is software code. The rise of artificial intelligence has created a large number of data-engineering and algorithm-engineering jobs. Data engineers collect data and perform the processing that prepares it for algorithms; algorithm engineers implement algorithms, or call general-purpose ones, to complete specific tasks. Over years of development many algorithm libraries have accumulated, and many are open source for the whole industry to use, such as TensorFlow, PyTorch, Ray, and Spark MLlib.

Algorithm programming must pay particular attention to performance. This is not a simple matter; it requires solid knowledge of the underlying system and even an understanding of the hardware architecture. On the one hand, the transmission and distribution of data matter greatly for computation-heavy algorithms; on the other hand, across many computing nodes, single points of performance bottleneck must be avoided. In practice it is common for domain experts using AI algorithms to misunderstand the configuration requirements of the underlying computing platform, or to use computing libraries incorrectly, wasting resources or inflating computation time.
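One common way to address both concerns (heavy data movement and single-point bottlenecks) is partial aggregation: each node reduces its own chunk locally, and only the small partial results travel. A minimal illustration in pure Python, with the list of chunks standing in for data resident on different nodes:

```python
# Map-reduce-style partial aggregation: compute a global mean without
# shipping raw data to one place. Each chunk plays the role of data
# living on a separate node; only (sum, count) pairs cross "the network".

def local_reduce(chunk):
    # Runs where the data lives; only two numbers leave the node.
    return sum(chunk), len(chunk)

def global_mean(chunks):
    total, count = 0, 0
    for s, n in map(local_reduce, chunks):  # a framework would run these in parallel
        total += s
        count += n
    return total / count

chunks = [[1, 2, 3], [4, 5], [6]]
print(global_mean(chunks))  # -> 3.5
```

Frameworks like Spark or Ray apply exactly this shape at scale; getting it wrong, for instance by collecting all raw data onto the driver first, is one of the resource-wasting mistakes described above.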

I had the opportunity to build a large computing facility at Zhijiang Laboratory, called the Intelligent Computing Digital Reactor. The digital reactor aggregates a variety of heterogeneous computing resources and, through computing frameworks and algorithm libraries, provides a unified computing platform for various applications (including scientific discovery, the digital economy, industrial simulation, and so on, called application reactors). It is conceivable that once such computing facilities exist, combining artificial-intelligence algorithms with domain models to surpass human intelligence in specific areas will become a standard mode of development.

Visualization and User Interaction - From GUI to Digital Twin

In application software development, visualization and user interaction often account for a considerable share of the engineering. Engineers who write this part of the program are often in high demand, because they can quickly build software with visible results. With industrial digitization in full swing, the most sought-after engineers, besides data engineers, are those who develop visualization and user interaction. Over the past three decades, mainstream user-interaction and visualization technology has gone through several generations:

  • Native GUI (supported by operating system)

In the early 1990s, writing a user interface often meant facing the APIs provided by the operating system directly. To create a good-looking, tidy, aesthetically pleasing graphical interface, programmers had to be familiar not only with the operating system's window management and graphics APIs, but also proficient in computer graphics, and even needed a sense of color aesthetics. Worse, the interactive-interface part of a program was extremely code-heavy, and that code was very fragile across different display hardware and operating-system versions.

  • GUI in application programming framework

Implementing a graphical user interface through an application programming framework largely solves the problems of writing interface logic directly against the native GUI. Many application frameworks were therefore born from the mid-to-late 1990s onward: the most classic are Microsoft's MFC application framework and the cross-platform Qt. The Java environment provides its own user-interaction and visualization support, and .NET also offers powerful GUI development capabilities.

  • Graphical interface interaction engine

Applications with highly dynamic interface requirements, such as dynamic data or dynamically configurable forms, tend to embed a graphical interaction engine, so that interface rendering and the handling of interaction events are controlled within the program itself. Typical rendering and interaction engines are the Apple-backed WebKit engine (formerly KHTML), Adobe's Flash engine, and Google's Chromium browser engine (derived from WebKit and the V8 engine). When Flash was still in its prime (around 2010), the industry debated hotly whether Flash or HTML5 was the future.

  • B/S Architecture

After years of development, visualization and user interaction have gradually standardized on HTML+CSS+JavaScript, which is the basis of the front-end/back-end separation mentioned above. The premise is that every front-end environment has a Web browser (the Browser); front-end logic runs in the browser and communicates with the back end (the Server) over HTTP or HTTPS. This architecture fits perfectly with server virtualization in cloud computing: applications can be deployed in the cloud, and users anywhere on the Internet can see the results of program execution, and interact with them, through nothing more than a browser.

The technological evolution of GUIs has gone from optimizing single-machine graphics display, to standardizing the GUI, to separating the GUI's front and back ends. Mainstream forms of visual-interface development keep moving toward higher efficiency, and their architecture keeps becoming more reasonable. These approaches are not simple replacements for one another, however; each still has software and hardware environments it suits in today's industry.

In the PC Internet era, visualization and user-interaction technology focused on steadily improving rendering performance and responsiveness; in the mobile Internet era, besides performance, dynamism became the more important requirement, since page content and interaction logic in many applications must be easy to customize; and in the industrial Internet era, visualization and user interaction face new needs and trends. Two directions deserve special mention:

1. Low-code development platform

As the name suggests, a low-code development platform lets an application be developed with only a small amount of code, or even none. This is a new mode of application-layer software development that became possible once cloud-computing virtualization matured. Industrial digitization scenarios carry a large demand for dynamic data visualization, which spawned various page-customization tools, which in turn expanded into low-code development platforms. Their advantages: a low entry threshold, fast page building, and a short page-testing cycle.

2. Digital twin

Digital twins were first proposed and developed in industry, were later applied to the digitalization of buildings and cities, and have since been adopted across many sectors. A digital twin connects the physical world with a digital space for mapping and feedback. The technologies involved are broad, chiefly IoT technology (digitizing the physical world, and feeding back from the digital space into it), data modeling (building the twin in digital space), 3D visualization (rendering the twin), and GIS and BIM (establishing a consistent coordinate system in digital space).
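The mapping-and-feedback loop can be sketched in a few lines (the classes `Device` and `TwinSpace` here are illustrative toys, not a real IoT API): device state flows into the digital model, and decisions made on the model flow back to the device as commands.

```python
# A toy digital-twin loop: the physical side reports state into a digital
# mirror; logic runs on the mirror; commands flow back to the device.

class Device:
    """Stands in for a physical device with sensors and actuators."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.temperature = 20.0
        self.fan_on = False

    def report(self):
        # Physical -> digital: a state snapshot for the twin.
        return {"temperature": self.temperature, "fan_on": self.fan_on}

    def apply(self, command):
        # Digital -> physical: actuate based on a twin-side decision.
        if command == "fan_on":
            self.fan_on = True

class TwinSpace:
    """Holds the digital mirror of every connected device."""

    def __init__(self):
        self.twins = {}

    def ingest(self, device):
        self.twins[device.device_id] = device.report()

    def decide(self, device):
        # Business logic operates on the twin, not the device directly.
        if self.twins[device.device_id]["temperature"] > 30.0:
            device.apply("fan_on")

space = TwinSpace()
sensor = Device("hvac-01")
sensor.temperature = 35.0   # physical world changes
space.ingest(sensor)        # mapping: state enters the digital space
space.decide(sensor)        # feedback: a command returns to the device
print(sensor.fan_on)  # -> True
```

Real twin platforms add time series, 3D rendering, and coordinate systems on top, but the mapping-plus-feedback skeleton is the same.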

Over the past 30 years, visualization and user-interaction technology has moved from solving the most basic problems of rendering quality, performance, and usability to offering systematic, efficient solutions and platform tools. As a result, software projects spend less and less manpower on visualization and user interaction and more on the business itself. Thanks to the continued development of interaction engines and advances in artificial intelligence, more and more visualization and interaction work will be done by machines, eventually reaching zero-code development.

In my practice in recent years, our team has made full use of these advances, building a twin model and a low-code development platform into the Instruction Set IoT operating system. One direct benefit: in each user scenario, as soon as a device is connected it naturally becomes part of the twin model, participates in the business model, and can be displayed in the visual interface. This approach greatly improves the efficiency of digitizing IoT scenarios.

Prediction of technological progress in programming

After reviewing the important advances in programming technology over the past three decades, we also look to the future. I describe it in terms of what is likely to happen in the next ten years:

  • The development route of the operating system will enter a new track - the industry operating system.

The essence of an industry operating system is to capture what is common across an industry, forming a software system that can be replicated across the industry's broadly similar hardware computing environments. Typical examples are automotive operating systems and smart-building operating systems. Only industry scenarios whose hardware environments and common requirements can be abstracted can give rise to an industry operating system.


  • There will be major breakthroughs in artificial intelligence, and they will profoundly change human production and life.

What drives this round of AI development is that many fields of basic research have made great progress in exploiting computing power. Researchers in biology, medicine, materials, astronomy, and geography understand ever more deeply what computing power and algorithms can do. AlphaFold is one example, and the model can be extended to other fields.

  • A public digital twin space.

Many public facilities and devices will be open in the digital twin space, reachable over the public network through a URI; together they will constitute a public digital twin space. Ten years from now, connecting a device will be very convenient: unless the local network's firewall restricts it, a device becomes a node in the public digital twin space the moment it connects. Applications and services will be born on top of this space. The counterpart of the public digital twin space is each organization's private digital twin space.

  • In China, the number of people writing code will peak within 2-3 years and fall by an order of magnitude within 10 years.

In the past two years, industrial digitalization has been pushed forward across the board, all starting from application requirements, which inevitably pulls in large numbers of people to build application functions, including data engineering and visualization of every kind. As system tools improve and digital construction returns to rationality, opportunities for coders will shrink; the birth and spread of good tools eliminates many jobs. Of course, deeper digitization will also bring new coding opportunities; what matters then is who can relearn quickly.

  • A Chinese-based programming language will be born for defining application-layer logic.

Such a language will emerge from one industry and prove applicable to many more. Combined with NLP and other artificial intelligence technologies, programming for everyone in China can be expected to come true.

The above predictions are the thinking of a programmer, and some of them also carry my personal wishes, or even the continuation of the work I am currently doing, or the goals I expect to achieve. From this perspective, these predictions are too realistic and not outrageous enough.

Epilogue

Finally, I mention two fundamental principles behind the development of computer programs and programming techniques:

  • As the usage scenarios and scope of computer programs grow ever wider, the basic means of adapting to this expansion is layering, that is, adding layers;

  • In the hierarchical structure of a computer system, the lower the level, the more it provides common capabilities; conversely, the higher the level, the more individualized it is.

With the widespread push of industrial digitization, the industry operating system is a new layer, beneath which sits the more abstract technical operating system. The so-called progress of computer programming technology is, by this principle, to make the layers and the division of labor ever more reasonable.

This article reviews the evolution of computer programs in the past 30 years from the perspectives of software stack, network, artificial intelligence, and visualization and user interaction, which just corresponds to some insights from my 30-year programming career. Looking forward to the continuous progress of computer programming technology in the next ten years, opening up a better life and experience for human beings.

About the author: Pan Aimin, founder and chairman of Instruction Set, and chief architect of the Intelligent Computing Digital Reactor at Zhijiang Laboratory. He has long been engaged in research and development of software and systems technology. He has written a large number of technical articles, translated many classic computer books, and published more than 30 papers in academic journals at home and abroad. He taught at Peking University and (part-time) Tsinghua University before entering industry, working at Microsoft Research Asia, Shanda Network, and Alibaba. His main areas of interest include mobile operating systems, information security, big data, the mobile Internet, the Internet of Things, and smart cities.





Origin blog.csdn.net/programmer_editor/article/details/123741004