Editor thinking and system design thinking

Foreword

Compared with the history of human society, the history of computing is very short; the 1950s and 1960s already count as ancient times. Yet that history is remarkable, and early ideas were often far ahead of their time. For example, although EJB was introduced back in 1998, its design was very advanced, and later technologies such as microservices have borrowed from it to some degree. Studying the history of computing often turns up creative ideas that help solve today's problems. So today I want to dig through the historical gossip around Emacs and Vim, two enduring pieces of antique software, and see whether there is anything worth learning from them.

Vim gossip

Vim genealogy

Let's start with the origin of the Vim editor. The following figure shows the family tree of Vim:

Vim's predecessor was the ed editor:

  • ed is one of the oldest programs on UNIX. It has been part of the system since the first version and was written by Ken Thompson, one of the authors of UNIX. It provides basic line-oriented editing commands.
  • ex is a superset of ed. Bill Joy, later one of the founders of Sun, extended ed while developing BSD and named the result ex. ex is still a line-oriented editor.
  • Bill Joy later gave ex a visual interface (Visual Interface) that provides full-screen editing, and named it vi.
  • To bring vi to the Amiga, Bram Moolenaar developed Vi IMitation. As features kept accumulating, the name was upgraded to Vi IMproved, that is, Vim.

ed editor

ed is very different from modern editors such as VSCode and Sublime Text. As mentioned above, it is a line editor: the object being edited is a whole line of text.

ed has a command mode and an edit mode. After startup, ed is in command mode by default and waits for the user to type commands; it edits the file by executing them. Readers on a Mac can try running ed in the terminal. An ed command has the form [address][command]:

  • Address: selects the target line or lines to operate on. ed provides three kinds of address:
      ◦ Line number: an integer starting from 1; $ stands for the last line.
      ◦ Pattern: selects lines that match a regular expression, such as /re/. By default the search starts from the current line and selects the first matching line. Add the prefix g for a global match, such as g/re/.
      ◦ Range: two addresses joined by a comma, [address],[address], such as /BEGIN/,/END/.
  • Command: a single character. The most commonly used commands are:
      ◦ p: print, output the target line.
      ◦ i: insert, insert text before the target line.
      ◦ a: append, add text after the target line.
      ◦ c: change, replace the content of the target line.
      ◦ d: delete, delete the target line.
      ◦ s: substitute, replace the text matched by a regular expression in the target line.

The i, a and c commands switch ed from command mode into edit mode, where you type the new text. A line containing only a period (.) ends the input and returns ed to command mode. Here are a few examples of ed edits:

  • Delete all blank lines: g/^$/d. The prefix g searches globally for the regular expression /^$/, and the delete command is executed on every match.
  • Print all lines containing "re": g/re/p. Again, search globally for the regular expression /re/ and execute the print command. This operation was so common that it became a standalone program, grep.
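The [address][command] format above can be sketched in a few lines of Python. This is only an illustration of the idea, not how ed is implemented: the function names resolve and run are made up for this example, only the p and d commands are supported, a pattern search starts at the current line without wrapping around, and a range is split naively at the first comma.

```python
import re

def resolve(addr, lines, current=0):
    """Return the 0-based indices selected by a simplified ed-style address."""
    if addr.startswith("g/"):                      # g/re/ : every matching line
        pattern = re.compile(addr[2:-1])
        return [i for i, line in enumerate(lines) if pattern.search(line)]
    if "," in addr:                                # [address],[address] : a range
        start, end = addr.split(",", 1)
        first = resolve(start, lines, current)[0]
        last = resolve(end, lines, first)[0]
        return list(range(first, last + 1))
    if addr == "$":                                # $ : the last line
        return [len(lines) - 1]
    if addr.startswith("/"):                       # /re/ : first match from current line
        pattern = re.compile(addr[1:-1])
        for i in range(current, len(lines)):
            if pattern.search(lines[i]):
                return [i]
        return []
    return [int(addr) - 1]                         # line number, counted from 1

def run(command, lines):
    """Execute one command; return (new_lines, printed_lines)."""
    addr, cmd = command[:-1], command[-1]          # the command is the last character
    selected = resolve(addr, lines)
    if cmd == "p":                                 # print the selected lines
        return lines, [lines[i] for i in selected]
    if cmd == "d":                                 # delete the selected lines
        doomed = set(selected)
        return [l for i, l in enumerate(lines) if i not in doomed], []
    raise ValueError(f"unsupported command: {cmd}")

text = ["BEGIN", "", "grep me", "END", "", "regret"]
without_blanks, _ = run("g/^$/d", text)   # same idea as g/^$/d in ed
_, matches = run("g/re/p", text)          # same idea as g/re/p, that is, grep
```

Every command goes through the same two steps: resolve picks the lines, then the single-character command acts on them.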

Editor thinking

As mentioned above, ed is very different from a modern editor: it is really an interpreter for editing commands. Yet in another sense ed is very similar to a modern editor. In essence, every editor repeatedly performs two steps, "addressing" and "command"; different kinds of editors differ only in the objects they edit:

  • ed is a text line editor: the edited objects are lines of text.
  • Microsoft Word is a document editor: the edited objects are document elements such as chapters, paragraphs, and sentences.
  • Sketch is a graphic editor: the edited objects are graphic elements such as points, lines, and surfaces.
  • IntelliJ IDEA includes a Java code editor: the edited objects are Java semantic elements such as classes, methods, and statements.
  • jQuery is a DOM editor: the edited objects are DOM elements. First, a CSS selector addresses the DOM elements to process; then chained method calls perform a series of editing actions on them.
  • ……
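To make the "address + command" pattern concrete for objects other than text lines, here is a minimal Python sketch in the spirit of the jQuery item above. The Selection class and its where and each methods are hypothetical names invented for this example, and the "DOM nodes" are plain dictionaries rather than a real DOM.

```python
class Selection:
    """A set of selected objects plus chainable editing commands."""

    def __init__(self, items):
        self.items = list(items)

    def where(self, predicate):
        """Addressing: narrow the selection to items the predicate accepts."""
        return Selection(item for item in self.items if predicate(item))

    def each(self, command):
        """Command: apply an editing action to every selected item."""
        for item in self.items:
            command(item)
        return self  # returning self is what allows chaining

# Stand-ins for DOM elements (an assumption made for this example).
nodes = [
    {"tag": "p", "cls": "note", "text": "remember this"},
    {"tag": "p", "cls": "body", "text": "plain text"},
    {"tag": "li", "cls": "note", "text": "and this"},
]

# Address the "note" elements, then run two editing commands on them.
(Selection(nodes)
    .where(lambda n: n["cls"] == "note")
    .each(lambda n: n.update(text=n["text"].upper()))
    .each(lambda n: n.update(highlight=True)))
```

Swap the dictionaries for paragraphs, shapes or Java methods and the same two steps still apply; only the edited objects change.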

So editor thinking is everywhere: anything that fits the "address + command" pattern can be called an editor, so everything can be edited! In terms more familiar to developers, the essence of editor thinking, or of editing itself, is CRUD.

If someone sneers that developers just do simple create, read, update, and delete work, tell them boldly: in fact, I am building an editor for a vertical domain!

Once you realize that what you are building is really an editor, you can use editor thinking to quickly spot gaps in the system's capabilities. Take a product management system as an example. If it only supports looking up products by ID, that is like an ed that only supports addressing by line number: very inconvenient. You can borrow ed's ability to address by regular-expression pattern and let users find products by name and other attributes, or even find similar products from a photo. Likewise, product creation can borrow copy and paste from editors: let users quickly create a new product from a similar one, or even import products from other platforms.
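As a rough sketch of that idea, the Python below gives a toy product catalog three capabilities: addressing by ID, addressing by name pattern, and "copy and paste" creation. Everything here is an assumption made for illustration: the Product fields, the Catalog class and the method names by_id, by_name and clone do not come from any real system.

```python
import re
from dataclasses import dataclass, replace
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Product:
    id: int
    name: str
    price: float

class Catalog:
    def __init__(self):
        self._items: Dict[int, Product] = {}
        self._next_id = 1

    def create(self, name: str, price: float) -> Product:
        product = Product(self._next_id, name, price)
        self._items[product.id] = product
        self._next_id += 1
        return product

    def by_id(self, product_id: int) -> Optional[Product]:
        """Addressing by ID: the equivalent of ed's line numbers."""
        return self._items.get(product_id)

    def by_name(self, pattern: str) -> List[Product]:
        """Addressing by pattern: the equivalent of ed's /re/ addresses."""
        regex = re.compile(pattern, re.IGNORECASE)
        return [p for p in self._items.values() if regex.search(p.name)]

    def clone(self, source_id: int, **changes) -> Product:
        """Copy and paste: create a new product from a similar one."""
        source = self._items[source_id]
        copy = replace(source, id=self._next_id, **changes)
        self._items[copy.id] = copy
        self._next_id += 1
        return copy

catalog = Catalog()
mug = catalog.create("Blue ceramic mug", 9.5)
catalog.create("Steel bottle", 15.0)
big_mug = catalog.clone(mug.id, name="Blue ceramic mug (large)", price=12.0)
mugs = catalog.by_name(r"mug")   # finds both mugs, not the bottle
```

Matching by photo or importing from another platform would follow the same shape: one more way to address, one more way to create.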

ed's genealogy

The previous section only covered interactive editing with ed. In fact, ed also supports scripted editing: the editing commands you would type into the terminal are saved as a script file that can be executed again and again. The benefit is that you can batch-edit any number of files with the same editing commands.
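The point of scripted editing, one saved list of commands replayed over many files, can be sketched like this in Python. The tuple-based script format and the apply_script function are inventions for this example and are not ed's or sed's actual syntax; the file names and contents are also made up, and the "files" are kept in memory so the sketch runs anywhere. Note that the substitute command here replaces every match on a line.

```python
import re

def apply_script(script, text):
    """Run every command in `script` over `text` and return the edited text."""
    lines = text.splitlines()
    for command in script:
        if command[0] == "d":        # ("d", pattern): delete matching lines
            pattern = re.compile(command[1])
            lines = [line for line in lines if not pattern.search(line)]
        elif command[0] == "s":      # ("s", pattern, replacement): substitute
            pattern, replacement = re.compile(command[1]), command[2]
            lines = [pattern.sub(replacement, line) for line in lines]
        else:
            raise ValueError(f"unsupported command: {command[0]}")
    return "\n".join(lines)

# One script, written once...
script = [("d", r"^$"), ("s", r"\bcolour\b", "color")]

# ...applied to as many documents as you like.
files = {"a.txt": "colour\n\nred", "b.txt": "\nno colour here"}
edited = {name: apply_script(script, body) for name, body in files.items()}
```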

The picture above is the family tree of the ed editor. The programs derived from it each picked up and extended part of ed's capabilities. For example:

  • The ex, vi, and Vim branch took the interactive route.
  • grep, fgrep, and egrep took the pattern-matching route.
  • sed and awk took the scripting route.

Emacs gossip

Defection from the Vim camp

I used to be a heavy Vim user. The operating system I used in college was Debian Linux, and whether I was writing C or Java, I did everything in Vim. The upside was that in closed-book written exams I could write code out from memory, while classmates who used Eclipse could hardly remember the full names in the JDK API. :-p

After graduation I joined a foreign company and had to start using Windows XP. One day, while writing something in Notepad, I noticed that I kept pressing the Esc key without thinking. Anyone who has used Vim knows that this key switches modes. That was when I realized that Vim's modal design is quite unintuitive. Vim does not start in editing mode, so a novice who has learned nothing yet cannot even use it as an ordinary notepad. There is a joke about how to generate a random string: have a Vim novice try to quit Vim.

That approach does not suit my taste. I went looking for a new editor: one I could use as the plainest notepad before learning anything, and whose advanced features I could call up through shortcut keys and the like when I needed them. Emacs turned out to fit the bill, so in 2010 I defected from the Vim camp to the Emacs camp. I think this difference in how you get started is the most essential difference between Vim and Emacs: Vim forces users to follow its rules from the very beginning, while Emacs requires relatively little prior knowledge. A chart of editor learning curves has long circulated on the Internet, and it is quite apt:

Origins of Emacs

Vim's ancestor, ed, came from UNIX. Emacs's ancestor, TECO, is older still: it dates from the early 1960s at MIT, and an important chapter of the Emacs story played out on Multics, the predecessor of UNIX.

In the 1970s Richard Stallman, who later founded GNU, was working at MIT's AI Lab, where he used and extended the TECO editor, which ran on the PDP-10. Like ed, TECO is a command interpreter that receives and executes editing commands, and it also uses single-character command names: for example, "l" moves one line and "5l" moves five lines. The hackers at MIT wanted to use TECO commands for more complex editing work, so they added features such as conditionals and loops. But TECO had a congenital weakness: its command set was never designed as a complete programming language, which made later improvements difficult. For example, command names could only be a single character, and the characters soon ran out.

As the saying goes, when the foundation is weak, everything built on it shakes. Everyone agreed that TECO's half-finished scripting language had to be replaced by a rigorous, complete programming language. So Bernie Greenberg wrote a version of Emacs in MacLisp on the Multics system, along with a detailed manual that taught everyone how to extend the editor to fit their own work. This version of Emacs was a huge success: even Bernie's secretary, someone who claimed not to know how to program, followed the manual and wrote Lisp code to extend the editor. After this caused a stir in the lab, Bernie drew a conclusion: if an application, that is, a program that helps you do something useful, has Lisp embedded in it and can be extended with Lisp programs, it is a great way to start learning to program. People who think they cannot program get the chance to write small but useful programs and grow through practice until they discover that they are, in fact, programming.

Stallman and his colleagues thought this idea was terrific. They also wanted to bring this easy-to-use Emacs to systems other than Multics, but at the time only Multics had a complete Lisp environment, with both a compiler and an interpreter; systems such as UNIX did not.

There is a small episode here. James Gosling, the father of Java, also wrote a cross-platform Emacs, known as Gosmacs. The community originally wanted to improve this version together, but Gosling sold it to a commercial company. Moreover, its extension language was not a real, complete Lisp but a fake one called Mocklisp, which only resembled Lisp syntactically. So the community eventually gave up on this option and decided to build a brand-new Emacs from scratch: GNU Emacs. Stallman first wrote a cross-platform Lisp interpreter, Emacs Lisp, in C, and then implemented the editing logic in Lisp. This lets Emacs extensions be written in one Lisp dialect on every platform while keeping performance acceptable.

After GNU Emacs had been in development for a while, and because Stallman was too busy to keep up on his own, the community created a fork called XEmacs, which added features such as font anti-aliasing. Later, maintenance of GNU Emacs became active again and many XEmacs features were merged back, so XEmacs is now nearly obsolete and GNU Emacs remains the mainstream version.

System design

Editor Holy War

The programmer's world is full of pecking orders: among editors, among programming languages, among operating systems... Why do these holy wars never end? Is it like going to war over whether to crack an egg at the big end or the small end, or is it truly impossible to have both?

As mentioned earlier, Vim likes to make users do things its way. Vim inherits its line-editor character from ed: the underlying model is the "line", so every edited object is forced to fit that model. Write Java code in Vim, and you are editing lines of text; write a blog post in Vim, and you are editing lines of text; write a paper in Vim, and you are editing lines of text. Whether you are editing a class, a function, a paragraph, a table of contents, or anything else, you must first translate it in your head into line-oriented editing commands such as dd and yy.

Emacs is a customizable editor. It lets users first adapt Emacs to the target domain, so that it recognizes the domain's model, such as paragraphs, chapters, and tables of contents. To use a fashionable phrase, Emacs has industry know-how. The same examples again: write Java code in Emacs, and you edit classes, methods, statements...; write a blog post in Emacs, and you edit paragraphs, sentences...; write a paper in Emacs, and you edit the table of contents, chapters, body text, index....

Two design approaches

The reason for the above difference is the two different design methods behind it, called Top Down and Bottom Up:

  • Top Down
      ◦ Method: split large tasks into small tasks of suitable granularity, small enough to do something practical.
      ◦ Advantages: low difficulty, clear goals, fast iteration.
      ◦ Disadvantages: too tightly coupled to current requirements; somewhat weaker at coping with change.
  • Bottom Up
      ◦ Method: improve the underlying programming language and other infrastructure so that it keeps moving closer to the business domain and adapts to the tasks.
      ◦ Advantages: complete functionality, strong adaptability.
      ◦ Disadvantages: relatively slow progress.

Editing with Vim is a top-down approach: you keep splitting the editing task until it becomes line-oriented editing commands, just as everyday Java development is broken down level by level until it reaches the JDK API. Editing with Emacs is a bottom-up approach: first extend the underlying Emacs Lisp language, gradually abstract a domain-specific language (DSL) for the task at hand, and finally complete the editing task with that DSL. For example, moving to the next list item or the next table cell are editing operations specific to the Markdown domain.
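The bottom-up idea can be sketched in Python rather than Emacs Lisp: start from one generic line-oriented primitive, then build a small Markdown vocabulary on top of it. The function names next_line_matching, next_list_item and next_heading are hypothetical and are not the commands of any real Emacs mode, and the regular expressions cover only the simplest Markdown list and heading syntax.

```python
import re

# Bottom layer: a generic, line-oriented primitive.
def next_line_matching(lines, start, pattern):
    """Index of the first line after `start` that matches `pattern`, or None."""
    regex = re.compile(pattern)
    for i in range(start + 1, len(lines)):
        if regex.search(lines[i]):
            return i
    return None

# Domain layer: Markdown vocabulary built on top of the primitive.
def next_list_item(lines, start):
    """Move to the next bulleted or numbered list item."""
    return next_line_matching(lines, start, r"^\s*([-*+]|\d+\.)\s")

def next_heading(lines, start):
    """Move to the next heading."""
    return next_line_matching(lines, start, r"^#{1,6}\s")

doc = ["# Title", "intro", "- first", "- second", "## Section", "1. numbered"]
position = next_list_item(doc, 0)        # lands on "- first"
position = next_list_item(doc, position) # lands on "- second"
section = next_heading(doc, 0)           # lands on "## Section"
```

With the domain layer in place, the user thinks in list items and headings; the line-level search is still there, but it is no longer what the user has to spell out.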

The difference between these two design approaches is not the order in which code is written but the process of abstracting the system, and it ultimately shows up as a difference in system extensibility. I personally divide the extensibility of a system into 4 levels:

  • Hard-coded: When the system is running, data and behavior are hard-coded and cannot be changed.
  • Configurable: When the system is running, the data can change dynamically, but the behavior is fixed.
  • Controllable: When the system is running, the data can change dynamically, and the behavior can be selected dynamically from a set of predefined behaviors.
  • Programmable: When the system is running, the data can change dynamically, and new behavior can be added while it runs; that is, the user can redefine the system's behavior.
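The four levels can be shown side by side with a deliberately trivial greeting function. This is only a sketch of the distinction; all names here (the greet_* functions, PREDEFINED_STYLES, register_style) are made up for this example.

```python
# Level 1, hard-coded: data and behavior are both fixed in the source.
def greet_hardcoded():
    return "Hello, world"

# Level 2, configurable: the data comes from outside, the behavior is fixed.
def greet_configurable(name):
    return f"Hello, {name}"

# Level 3, controllable: the caller picks one of several predefined behaviors.
PREDEFINED_STYLES = {
    "plain": lambda s: s,
    "shout": lambda s: s.upper() + "!",
}

def greet_controllable(name, style):
    return PREDEFINED_STYLES[style](f"Hello, {name}")

# Level 4, programmable: users add new behaviors while the system is running.
user_styles = dict(PREDEFINED_STYLES)

def register_style(style, function):
    user_styles[style] = function

def greet_programmable(name, style):
    return user_styles[style](f"Hello, {name}")

# A behavior the original authors never anticipated, added at run time.
register_style("reverse", lambda s: s[::-1])
```

Each level keeps everything the previous one offered and opens up one more thing to change at run time: first the data, then the choice of behavior, then the behavior itself.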

The extreme of top-down is hard-coding: it pins functionality to current requirements too early, and later requirements can only be bent toward the initial model. The extreme of bottom-up is programmability, which easily leads to over-design: building in flexibility for scenarios that will never change, and even ending up as a general-purpose programming language.

Neither design approach is absolutely right or wrong; each has scenarios where it fits. Either one used alone will cause problems, so you must trade off rapid delivery against system extensibility according to the actual situation. And precisely because there is no right or wrong, the editor holy war will never end.

 


This article is original content of Alibaba Cloud and may not be reproduced without permission.
