How exciting is the online Excel project?

It has been several months since I joined the Tencent Docs Excel development team. When I started, I pulled down more than one million lines of code. The codebase is huge, but the module design and code quality are much better than I expected. Today I would like to share just how fun the online Excel project can be.

Challenges of real-time collaborative editing

When it comes to the difficulties of real-time collaborative editing, most people's first thought is handling collaborative conflicts.

Conflict handling

Conflict resolution approaches are actually fairly mature. They include:

  1. Edit lock: while someone is editing a document, the system locks it to prevent others from editing at the same time.

  2. diff-patch: based on ideas similar to Git and other version control systems, the content is diffed and merged. Implementations include GNU diff-patch and Myers' diff-patch.

  3. Eventual consistency: including Operational Transformation (OT) and conflict-free replicated data types (CRDT).

An edit lock is simple and crude to implement, but it directly hurts the user experience. diff-patch can merge conflicts automatically, or hand them to the user when they occur. OT is the approach adopted by Google Docs, and the Atom editor uses CRDT.

OT and CRDT

What OT and CRDT have in common is that both provide eventual consistency. The difference lies in how they achieve it:

  • OT does this by transforming operations

    • OT splits and transforms editing operations to resolve conflicts

    • OT does not prescribe a specific implementation, so each project has to implement it itself, but conflict handling can then be tailored precisely to the project's needs

  • CRDT does this by changing the state

    • Essentially, a CRDT is a data structure. When the same set of operations is applied to it, the replicas always converge to the same representation, even if the operations are applied in a different order.

    • CRDTs come in two flavors: operation-based and state-based
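To make the convergence property concrete, here is a minimal sketch of one of the simplest state-based CRDTs, a grow-only counter. It is not what any of the products mentioned here use; the names GCounter, increment, merge, and value are chosen purely for illustration.

```typescript
// State-based grow-only counter: each replica only ever increases its own slot.
type GCounter = Record<string, number>;

function increment(state: GCounter, replicaId: string, by = 1): GCounter {
  return { ...state, [replicaId]: (state[replicaId] ?? 0) + by };
}

// Merging takes the per-replica maximum, which makes the merge
// commutative, associative, and idempotent.
function merge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const [id, n] of Object.entries(b)) {
    out[id] = Math.max(out[id] ?? 0, n);
  }
  return out;
}

// The counter's value is the sum of all replica slots.
function value(state: GCounter): number {
  return Object.values(state).reduce((sum, n) => sum + n, 0);
}
```

Because merge does not care about order or duplicates, two replicas that exchange states in any sequence end up with the same value.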

OT is mainly used for text, and it is usually complex and does not scale well. CRDT implementations are very simple, but Google, Microsoft, CKSource, and many other companies rely on OT for a reason. The current state of CRDT research supports collaboration on two main types of data: plain text and arbitrary JSON structures.

For more advanced structures such as rich text, OT trades complexity for results that match what the user expects, while CRDT focuses more on the data structure. As the data structure becomes more complex, the time and space complexity of the algorithm can rise exponentially, which brings performance challenges. Therefore, most real-time collaborative editing today is based on OT.

Version management

In a multi-person collaboration scenario, the diff-patch or OT algorithm is generally used to handle conflicts in order to protect the user experience. To ensure that every user operation is applied in the correct order, a monotonically increasing version number has to be maintained, and it is bumped every time there is a new modification.

Data version update

Several prerequisites must hold for the data version to be updated in order as expected:

  • Collaborative data versions are updated normally

  • Missing data versions are patched successfully

  • Submitted data versions increase in order

How should these prerequisites be understood? Let's take an example.

Xiao Ming opens a document, and the data version pulled from the server is 100. The server then pushes a message saying that someone has updated the document to version 101, so Xiao Ming needs to apply the version 101 data to his interface. This is the normal update of a collaborative data version.

Xiao Ming edits on top of the latest version, 101, and generates a new piece of operation data. When he submits it to the server, the server sees that his data is based on version 101 and tells him that the latest version is now 110. Xiao Ming has to go back to the server and pull versions 102-110. This is the successful patching of missing data versions.

After versions 102-110 are pulled back, Xiao Ming's earlier operation data has to go through conflict handling against those versions, which finally yields operation data based on version 110. Xiao Ming then resubmits the data; the server accepts it and assigns it version 111, so Xiao Ming upgrades his local data version to 111. This is the ordered increase of submitted data versions.
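The three steps in this example can be sketched roughly as follows. This is only an illustration under simplifying assumptions: MemoryServer is an in-memory stand-in for the real server, DocClient and its method names are hypothetical, and the rebase callback stands for whatever conflict handling (for example an OT transform) the project uses.

```typescript
interface Change<Op> { version: number; op: Op }
type SubmitResult =
  | { status: "accepted"; version: number }
  | { status: "stale"; latest: number };

// In-memory stand-in for the server: it only accepts operations based on the latest version.
class MemoryServer<Op> {
  private log: Change<Op>[] = [];
  private base: number;
  constructor(base: number) { this.base = base; }

  get latest(): number { return this.base + this.log.length; }

  submit(baseVersion: number, op: Op): SubmitResult {
    if (baseVersion !== this.latest) return { status: "stale", latest: this.latest };
    const version = this.latest + 1;
    this.log.push({ version, op });
    return { status: "accepted", version };
  }

  // Changes with after < version <= upTo.
  pull(after: number, upTo: number): Change<Op>[] {
    return this.log.filter((c) => c.version > after && c.version <= upTo);
  }
}

class DocClient<Op> {
  version: number;
  private server: MemoryServer<Op>;
  private rebase: (local: Op, remote: Op) => Op;
  private applyRemote: (op: Op) => void;

  constructor(version: number, server: MemoryServer<Op>,
              rebase: (local: Op, remote: Op) => Op, applyRemote: (op: Op) => void) {
    this.version = version;
    this.server = server;
    this.rebase = rebase;
    this.applyRemote = applyRemote;
  }

  // Normal update: only the next consecutive version may be applied.
  receive(change: Change<Op>): boolean {
    if (change.version !== this.version + 1) return false;
    this.applyRemote(change.op);
    this.version = change.version;
    return true;
  }

  // Submit; if the server reports a newer version, pull the gap, rebase, and retry.
  submit(op: Op): number {
    let pending = op;
    for (;;) {
      const res = this.server.submit(this.version, pending);
      if (res.status === "accepted") {
        this.version = res.version;
        return res.version;
      }
      for (const missed of this.server.pull(this.version, res.latest)) {
        pending = this.rebase(pending, missed.op);
        this.receive(missed);
      }
    }
  }
}
```

A real client would do this asynchronously and would also keep its optimistic local state in sync, which the sketch leaves out.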

Maintain data task queue

To manage these versions, we need to maintain a data queue through which user operations are submitted in order. The responsibilities of this queue include:

  • User operation data enters the queue normally

  • Queue tasks are submitted to the access layer normally

  • Queue tasks are retried after a failed submission

  • Queue tasks are removed once the submission is confirmed

Such a queue also has to cope with the user suddenly closing the page, so we need to maintain a local cache as well. When the user opens the page again, data that was edited but not yet submitted is resubmitted to the server. Besides the browser being closed, the network may also drop while the user is editing because of changes in network conditions. In that case we also need to store the user's operations locally (offline) and continue uploading once the network is restored.
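A minimal sketch of such a queue might look like the following. The names SubmitQueue and TaskStore are hypothetical, and the store is an abstraction over whatever local persistence is available (localStorage, IndexedDB, and so on); sending is left to the caller, who reports the outcome through ack and fail.

```typescript
interface Task { id: number; payload: string; attempts: number }
interface TaskStore { save(tasks: Task[]): void; load(): Task[] }

class SubmitQueue {
  private tasks: Task[];
  private nextId: number;
  private store: TaskStore;

  constructor(store: TaskStore) {
    this.store = store;
    // Restore edits that a previous session left unsubmitted.
    this.tasks = store.load();
    this.nextId = this.tasks.reduce((max, t) => Math.max(max, t.id), 0) + 1;
  }

  // User operation data enters the queue and is cached immediately.
  enqueue(payload: string): number {
    const id = this.nextId++;
    this.tasks.push({ id, payload, attempts: 0 });
    this.store.save(this.tasks);
    return id;
  }

  // Only the head may be in flight, which keeps submissions ordered.
  peek(): Task | undefined { return this.tasks[0]; }

  // Called when the access layer confirms the submission.
  ack(id: number): void {
    if (this.tasks[0]?.id !== id) return; // ignore stale or duplicate acks
    this.tasks.shift();
    this.store.save(this.tasks);
  }

  // Called on a failed submission; the task stays at the head so it is retried first.
  fail(id: number): number {
    const head = this.tasks[0];
    if (!head || head.id !== id) return 0;
    head.attempts += 1;
    this.store.save(this.tasks);
    return head.attempts;
  }

  get size(): number { return this.tasks.length; }
}
```

The caller can use the attempt count returned by fail to decide on a backoff delay or to surface an error to the user.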

Room management

Because multiple people collaborate, there is also room and user management to handle compared with ordinary web pages. Users in the same document can be regarded as being in the same room. Besides seeing who else is in the room, users can receive messages from one another. In the document scenario, every user operation can be regarded as a message.

However, unlike an ordinary chat room, a document cannot lose any user operation, and a strict version order is required. The content of an operation may also be very large. For example, when a user copies and pastes a table with 100,000 or 200,000 cells, such a message obviously cannot be transmitted in one go. In this case, besides considering data compression for transports such as WebSocket (HTTP itself supports compression), we also need to implement our own fragmentation logic. Fragmentation in turn raises the questions of how to split the data and how to deal with lost fragments.
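As a rough illustration of fragmentation, the sketch below splits a message into numbered chunks and reassembles them, reporting which chunks are still missing so they can be requested again. The Chunk shape and the function names are assumptions made for this example; a production implementation would more likely chunk bytes than string code units.

```typescript
interface Chunk { messageId: string; index: number; total: number; data: string }

function splitMessage(messageId: string, body: string, chunkSize: number): Chunk[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const total = Math.max(1, Math.ceil(body.length / chunkSize));
  const chunks: Chunk[] = [];
  for (let index = 0; index < total; index++) {
    const data = body.slice(index * chunkSize, (index + 1) * chunkSize);
    chunks.push({ messageId, index, total, data });
  }
  return chunks;
}

// Collects the chunks of one message, in any arrival order.
class Reassembler {
  private parts = new Map<number, string>();
  private total = 0;

  add(chunk: Chunk): void {
    this.total = chunk.total;
    this.parts.set(chunk.index, chunk.data); // a duplicate simply overwrites
  }

  // Indexes that have not arrived yet and should be requested again.
  missing(): number[] {
    const absent: number[] = [];
    for (let i = 0; i < this.total; i++) {
      if (!this.parts.has(i)) absent.push(i);
    }
    return absent;
  }

  // The full message, or undefined while chunks are still missing.
  result(): string | undefined {
    if (this.total === 0 || this.missing().length > 0) return undefined;
    let body = "";
    for (let i = 0; i < this.total; i++) body += this.parts.get(i) ?? "";
    return body;
  }
}
```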

Multiple communication methods

There are many ways for the front end and back end to communicate. Common ones include HTTP short polling, WebSocket, HTTP long polling, and SSE (Server-Sent Events).

We can also see that different online document teams use different communication methods. For example, Google Docs uses Ajax for upstream data and HTTP long polling for downstream data; Shimo Docs (Graphite Docs) uses Ajax upstream and SSE downstream; Kingsoft Docs (Jinshan Docs), Feishu Docs, and Tencent Docs all transmit over WebSocket.

Each communication method has its own advantages and disadvantages in terms of compatibility, resource consumption, real-time performance, and so on, and the choice may also depend on the business team's own back-end architecture. Therefore, when designing the connection layer, we should consider the extensibility of the interface and reserve support for multiple methods.
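One way to reserve support for several methods is to hide them behind a small interface, as in the sketch below. The Transport interface, ConnectionLayer, and LoopbackTransport are hypothetical names; LoopbackTransport is an in-memory stand-in that echoes messages back so the example runs without a network.

```typescript
type Listener = (message: string) => void;

// What the rest of the application is allowed to know about a channel.
interface Transport {
  readonly kind: string;
  send(message: string): void;
  onMessage(listener: Listener): void;
}

class ConnectionLayer {
  private transport: Transport;
  private listeners: Listener[] = [];

  constructor(transport: Transport) {
    this.transport = transport;
    this.attach(transport);
  }

  // Swap the underlying channel, for example falling back from WebSocket to long polling.
  switchTo(transport: Transport): void {
    this.transport = transport;
    this.attach(transport);
  }

  send(message: string): void { this.transport.send(message); }
  subscribe(listener: Listener): void { this.listeners.push(listener); }
  get kind(): string { return this.transport.kind; }

  private attach(transport: Transport): void {
    transport.onMessage((m) => {
      // Drop messages from a transport that has since been replaced.
      if (transport === this.transport) this.listeners.forEach((l) => l(m));
    });
  }
}

// In-memory stand-in: echoes every sent message back, tagged with its kind.
class LoopbackTransport implements Transport {
  readonly kind: string;
  private listener: Listener = () => {};
  constructor(kind: string) { this.kind = kind; }
  send(message: string): void { this.listener(`${this.kind}:${message}`); }
  onMessage(listener: Listener): void { this.listener = listener; }
}
```

Subscribers registered on the connection layer keep working when the transport is switched, which is the point of the abstraction.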

Every cell is a rich text editor

In fact, besides real-time collaborative editing, the Excel project faces many other challenges. Everyone knows that rich text editors are full of pitfalls, but in Excel every cell is a rich text editor.

Rich text

There are generally several ways to implement rich text editing:

  • A plain div with the contenteditable attribute, with operations executed through the browser's native execCommand

  • div + event listeners, maintaining its own editor state (including cursor state)

  • textarea + event listeners, maintaining its own editor state

With the contenteditable attribute, operating on the selected text (for example, italics or color) requires determining the cursor position and using a Range to identify the selected text, then checking whether that text has already been processed and whether the existing effect should be overwritten, removed, or kept. There are many pitfalls here, and compatibility problems are common. Generally speaking, complex editors such as Atom and VSCode implement the contenteditable functionality themselves using div + event listeners. Ace editor, Kingsoft Docs (Jinshan Docs), and others use a hidden textarea to receive input and render it into a div to achieve the editing effect.

Copy and paste

Generally speaking, when one or more cells are selected and copied, what we get is the raw data of the cells, so two steps are required: convert the data into rich text (by assembling table/tr/td and other elements), and then write it to the clipboard.

Pasting likewise requires getting the content from the clipboard, converting it into cell data, and submitting the operation data. It may also involve uploading images and parsing all kinds of rich text. The complexity can rise sharply because of the attributes set on each cell (including merged cells, row height and column width, filters, functions, and so on).
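The copy direction can be sketched as a pure function from cell data to an HTML table, plus a tab-separated plain-text fallback. The CellData shape here is a deliberately tiny assumption (real cells carry far more attributes), and the function names are illustrative only. In a browser, the two strings would then be handed to the clipboard, for example with clipboardData.setData in a copy event handler.

```typescript
interface CellData { text: string; bold?: boolean; color?: string }

function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// One cell becomes one <td>, with its formatting expressed as inline styles.
function cellToTd(cell: CellData): string {
  const styles: string[] = [];
  if (cell.bold) styles.push("font-weight:bold");
  if (cell.color) styles.push(`color:${escapeHtml(cell.color)}`);
  const attr = styles.length > 0 ? ` style="${styles.join(";")}"` : "";
  return `<td${attr}>${escapeHtml(cell.text)}</td>`;
}

// Rich text flavor: table/tr/td assembled from the selected cells.
function cellsToHtml(rows: CellData[][]): string {
  const body = rows.map((row) => `<tr>${row.map(cellToTd).join("")}</tr>`).join("");
  return `<table>${body}</table>`;
}

// Plain text flavor: tabs between columns, newlines between rows.
function cellsToPlainText(rows: CellData[][]): string {
  return rows.map((row) => row.map((c) => c.text).join("\t")).join("\n");
}
```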

Copy-and-paste functionality can be divided into two types by usage scenario:

  1. Internal copy and paste.

  2. External copy and paste.

Internal copy and paste means copying and pasting within our own product. Since a copy-and-paste round trip involves a lot of computation and parsing, internal copy and paste can consider writing the cell data directly to the clipboard. On paste, the data can then be read directly, which eliminates the time-consuming and resource-intensive steps of converting data into rich text and parsing rich text back into cell data.

External copy and paste is more about compatibility with other Excel-like editing products and with the formats of system clipboard content, and the code implementation is particularly complicated.
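The internal fast path could look something like the sketch below, where the clipboard is modeled as a simple map from MIME type to string. The custom type name in INTERNAL_MIME is invented for this example, and toHtml and parseHtml stand for the expensive conversion steps described above.

```typescript
// Hypothetical custom clipboard type that carries serialized cell data.
const INTERNAL_MIME = "application/x-example-cells";

type Cells = string[][];
type ClipboardPayload = Map<string, string>;

// On copy, write both the rich text flavor (for other apps) and the raw cell data.
function writeCopy(cells: Cells, toHtml: (cells: Cells) => string): ClipboardPayload {
  return new Map<string, string>([
    ["text/html", toHtml(cells)],
    [INTERNAL_MIME, JSON.stringify(cells)],
  ]);
}

// On paste, prefer the raw cell data and only fall back to parsing rich text.
function readPaste(payload: ClipboardPayload, parseHtml: (html: string) => Cells): Cells {
  const internal = payload.get(INTERNAL_MIME);
  if (internal !== undefined) return JSON.parse(internal) as Cells; // fast path
  return parseHtml(payload.get("text/html") ?? "");
}
```

Content copied from another product carries no internal flavor, so it naturally takes the slower parsing path.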

How complicated is table rendering

Generally speaking, there are two approaches to drawing a table:

  1. DOM drawing.

  2. Canvas drawing.

The well-known open source library handsontable renders with the DOM, but rendering hundreds of thousands or millions of cells in the DOM obviously causes serious performance problems. Therefore, many web spreadsheets today are implemented with canvas plus overlaid DOM. Using canvas also requires handling the visible area, scrolling, and the layering of canvases. Canvas brings its own performance problems too, including how to produce the first screen directly ("straight-out" rendering) with canvas.

Table rendering involves scenarios such as merged cells, selections, zooming, freezing, rich text, and automatic line wrapping. Let's take a look at how complicated it is.

Word wrap

Generally speaking, automatic line wrapping is stored very simply: the data only contains the cell content plus a wrap attribute. But when such data has to be rendered, the wrapping must be computed:

We need to obtain the width of the cell's column and then split the cell content into lines at the rendering layer. As shown in the figure, a string of text is divided into three lines by this line-splitting logic. After wrapping, the height of the row containing the cell may need to be adjusted, and that adjustment may in turn affect the rendering of other cells in the row, for example their centering, which then has to be recalculated.

Therefore, enabling automatic line wrapping for a whole column of cells may trigger large-scale recalculation and rendering, which costs considerable performance.
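A simplified version of the line-splitting calculation is sketched below. It wraps character by character (which suits CJK text but ignores word boundaries) and takes the text measurement as a parameter; on a real canvas that function would typically wrap ctx.measureText(text).width. The function names are illustrative.

```typescript
type Measure = (text: string) => number;

// Split text into lines that fit maxWidth, honoring explicit line breaks.
function wrapText(text: string, maxWidth: number, measure: Measure): string[] {
  const lines: string[] = [];
  for (const paragraph of text.split("\n")) {
    let line = "";
    for (const ch of Array.from(paragraph)) {
      if (line !== "" && measure(line + ch) > maxWidth) {
        lines.push(line); // current line is full, start a new one
        line = ch;
      } else {
        line += ch;
      }
    }
    lines.push(line);
  }
  return lines;
}

// The row must be tall enough for its tallest wrapped cell.
function rowHeightFor(lineCounts: number[], lineHeight: number, minHeight: number): number {
  return Math.max(minHeight, ...lineCounts.map((n) => n * lineHeight));
}
```

Every cell in the column needs this calculation, and each change in row height can force neighboring cells to be laid out again, which is where the cost comes from.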

Frozen area

The freeze function divides the table into four areas: horizontally and vertically, each is split into a frozen area and a non-frozen area. The complexity of frozen areas mainly lies in special cases at the boundaries, including selections and the clipping of images. Let's look at a picture:

As shown in the figure, although an image floats directly over the whole table, in the data layer it belongs to one specific cell. When editing across the frozen boundary we need to slice it, but whichever area is selected, the original image still has to be displayed:

This means that when we get the position of a mouse click on the canvas, we also need to calculate whether the clicked cell falls within the area covered by an image.
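The click calculation can be sketched as follows, assuming a layout where the first few rows and columns are frozen and the rest scroll. All names (GridLayout, SheetImage, hitTest, and so on) and the idea of storing image rectangles in sheet coordinates are assumptions made for this illustration.

```typescript
interface SheetImage { id: string; x: number; y: number; width: number; height: number }

interface GridLayout {
  colWidths: number[];
  rowHeights: number[];
  frozenCols: number;
  frozenRows: number;
  scrollX: number;
  scrollY: number;
}

// Convert a canvas coordinate on one axis to a sheet coordinate.
// Inside the frozen pane nothing scrolls; elsewhere the scroll offset is added.
function toSheetPos(pos: number, sizes: number[], frozen: number, scroll: number): number {
  const frozenExtent = sizes.slice(0, frozen).reduce((sum, s) => sum + s, 0);
  return pos < frozenExtent ? pos : pos + scroll;
}

// Index of the row or column containing a sheet coordinate, or -1 if outside.
function indexAt(sheetPos: number, sizes: number[]): number {
  if (sheetPos < 0) return -1;
  let edge = 0;
  return sizes.findIndex((size) => {
    edge += size;
    return sheetPos < edge;
  });
}

// Topmost image covering the point; later images are assumed to be drawn on top.
function imageAt(x: number, y: number, images: SheetImage[]): SheetImage | undefined {
  return images
    .slice()
    .reverse()
    .find((img) => x >= img.x && x < img.x + img.width && y >= img.y && y < img.y + img.height);
}

function hitTest(canvasX: number, canvasY: number, layout: GridLayout, images: SheetImage[]) {
  const x = toSheetPos(canvasX, layout.colWidths, layout.frozenCols, layout.scrollX);
  const y = toSheetPos(canvasY, layout.rowHeights, layout.frozenRows, layout.scrollY);
  return {
    row: indexAt(y, layout.rowHeights),
    col: indexAt(x, layout.colWidths),
    image: imageAt(x, y, images),
  };
}
```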

Alignment and cell overflow

A cell generally has three kinds of horizontal alignment: left, center, and right. When a cell is not set to wrap and its content is wider than the cell, the content spills over neighboring cells:

In other words, when drawing a cell we also need to calculate whether a nearby cell overflows into it, and if so, draw that content in the current cell. In addition, when a column is hidden, the overflow logic may need to be adjusted and updated.

The points above are only some of the details. Table rendering also involves logic such as hiding, dragging, zooming, and selecting cells, rows, and columns, as well as some complex calculations for cell borders. In addition, because the canvas renders one screenful of content at a time, page scrolling, collaborative data updates, and so on may all cause the canvas to be redrawn frequently.
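The overflow calculation for a single row can be sketched like this. It assumes that text may only spill into empty neighbors and stops at the first non-empty cell, which is how spreadsheets commonly behave; hidden columns and merged cells are ignored. RowCell, overflowRange, and ownerOf are names invented for the example.

```typescript
type Align = "left" | "center" | "right";
// textWidth is the measured width of the content; 0 means the cell is empty.
interface RowCell { textWidth: number; align: Align; wrap?: boolean }

// Range of columns [start, end] that the content of `col` is drawn across.
function overflowRange(row: RowCell[], colWidths: number[], col: number): [number, number] {
  const cell = row[col];
  const width = colWidths[col] ?? 0;
  if (!cell || cell.wrap || cell.textWidth <= width) return [col, col];

  const isEmpty = (i: number): boolean => (row[i]?.textWidth ?? 0) === 0;
  const excess = cell.textWidth - width;
  let needLeft = cell.align === "right" ? excess : cell.align === "center" ? excess / 2 : 0;
  let needRight = cell.align === "left" ? excess : cell.align === "center" ? excess / 2 : 0;

  let start = col;
  while (needLeft > 0 && start > 0 && isEmpty(start - 1)) {
    start -= 1;
    needLeft -= colWidths[start] ?? 0;
  }
  let end = col;
  while (needRight > 0 && end + 1 < row.length && isEmpty(end + 1)) {
    end += 1;
    needRight -= colWidths[end] ?? 0;
  }
  return [start, end];
}

// Which cell's content must be drawn in `col`: its own, or a neighbor overflowing into it.
function ownerOf(row: RowCell[], colWidths: number[], col: number): number {
  if ((row[col]?.textWidth ?? 0) > 0) return col;
  for (let c = 0; c < row.length; c++) {
    const [start, end] = overflowRange(row, colWidths, c);
    if (c !== col && start <= col && col <= end) return c;
  }
  return col;
}
```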

Data management problems

When every cell supports rich text, in scenarios with hundreds of thousands or millions of cells, persisting the data to storage and applying the data changes produced by user operations are no small challenges either.

Atomic operations

Similar to database transactions, user operations on a spreadsheet can be split into indivisible atomic operations. Why do that? Mainly to make conflict handling in the OT algorithm easier: specific conflict calculation and transformation logic can be written for each indivisible atomic operation before the result is finally persisted to storage.

For example, inserting a sub-table (worksheet) may require moving other sub-tables in addition to the insertion itself. So for a sub-table, our operations may include:

  • Insert

  • Rename

  • Move

  • Delete

  • Update content

  • ...

As long as the split is fine-grained enough, every user action on a sub-table can be expressed as a combination of these operations. Operations that cannot be split any further are the final atomic operations. For example, copying and pasting a sub-table can be split into insert, rename, update content; cutting a sub-table can be split into insert, update content, delete, move other sub-tables. By analyzing user behavior we can extract these basic operations. Let's look at a specific example:

As shown in the figure, for the server, two new sub-tables are finally added, one is Zhang San's "Worksheet 2" and the other is Li Si's "Worksheet 2 (automatically renamed)".

In implementation, a transform function is generally used to handle concurrent operations. This function takes two operations that were applied to the same document state (but on different clients) and computes a new operation that can be applied after the second operation while preserving the intended change of the first operation.
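For the worksheet example, a transform for two concurrent insertions might be sketched as below. It only covers the direction where the later operation is rebased onto one that has already been accepted; a full OT system also needs the symmetric case with a deterministic tie-break. The InsertSheet shape, the function names, and the " (renamed)" suffix are assumptions for illustration.

```typescript
interface InsertSheet { type: "insertSheet"; index: number; name: string }

// Returns a version of `op` that can be applied after `accepted` has taken effect,
// while keeping the intent of `op` (a new sheet at roughly the same place).
function transformInsert(op: InsertSheet, accepted: InsertSheet): InsertSheet {
  // The accepted sheet occupies its slot first, so `op` shifts right when needed.
  const index = accepted.index <= op.index ? op.index + 1 : op.index;
  // Sheet names must stay unique, so the later insertion is renamed on a clash.
  const name = accepted.name === op.name ? `${op.name} (renamed)` : op.name;
  return { type: "insertSheet", index, name };
}

function applyInsert(sheets: string[], op: InsertSheet): string[] {
  const next = sheets.slice();
  next.splice(op.index, 0, op.name);
  return next;
}
```

With Zhang San's insertion accepted first, Li Si's identical insertion is transformed so that both sheets survive with distinct names.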

The names of the transformation functions differ between OT systems, but they fall into two categories:

  • Inclusion transformation (forward transformation): written IT(opA, opB), it transforms opA against opB into a new operation opA' in such a way that the effect of opB is effectively included.

  • Exclusion transformation (backward transformation): written ET(opA, opB), it transforms opA against opB into a new operation opA'' in such a way that the effect of opB is effectively excluded.

Some OT systems use both IT and ET functions, and some use only IT functions. The complexity of designing OT functions depends on many factors: whether the OT system supports consistency maintenance, whether it supports Undo/Redo, which transformation properties must be satisfied, whether ET is used, whether the OT operation model is generic, and whether the data in each operation is character-wise (a single object), string-wise (a sequence of objects), hierarchical, or some other structure.

Besides the client, which has to perform local conflict handling after receiving a collaborative message from the server, the server may also have to handle conflicts after receiving two messages based on the same version. The client and the server share one consistent set of conflict handling logic, which is what guarantees eventual consistency.

Version rollback/redo

For most editors, Undo/Redo is one of the most basic capabilities, and document editing is no exception. Earlier we mentioned that real-time collaboration comes with the concept of versions, and that each user operation may be split into multiple atomic operations.

In such a scenario, Undo/Redo involves not only restoring the persisted data but also handling conflicts that arise while reverting the user's operations. In a multi-person collaboration scenario, if operations from other people arrive while the user is editing, should Undo also revoke those other people's operations?

The OT-based approach to Undo is actually fairly simple: typically each atomic operation implements a corresponding invert() method, which generates the inverse of that atomic operation as a new atomic operation that is then applied.

Earlier we introduced that the transform function can be divided into IT and ET, and Undo can be implemented in two ways:

  • Inv & IT: invert + inclusion transformation

  • ET & IT: exclusion transformation + inclusion transformation

Whichever algorithm is used, the basic idea of OT undo is to transform the inverse of the operation to be undone into a new form according to the effects of the operations executed after it, so that the transformed inverse operation correctly achieves the undo effect. If the user has received new collaborative operations while editing, then when the user performs Undo, the atomic operations generated by the inverse also need to go through conflict handling against those new collaborative messages to guarantee eventual consistency.
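The "Inv & IT" idea can be sketched on the worksheet operations from earlier: invert the operation to be undone, then transform that inverse against everything executed after it. This is a deliberately small illustration; the SheetOp shape and function names are assumptions, and only index shifts are handled (for example, it does not reconcile two renames of the same sheet).

```typescript
type SheetOp =
  | { type: "insert"; index: number; name: string }
  | { type: "delete"; index: number; name: string }
  | { type: "rename"; index: number; from: string; to: string };

function applyOp(sheets: string[], op: SheetOp): string[] {
  const next = sheets.slice();
  switch (op.type) {
    case "insert": next.splice(op.index, 0, op.name); break;
    case "delete": next.splice(op.index, 1); break;
    case "rename": next[op.index] = op.to; break;
  }
  return next;
}

// Each atomic operation knows how to produce its inverse.
function invert(op: SheetOp): SheetOp {
  switch (op.type) {
    case "insert": return { type: "delete", index: op.index, name: op.name };
    case "delete": return { type: "insert", index: op.index, name: op.name };
    case "rename": return { type: "rename", index: op.index, from: op.to, to: op.from };
  }
}

// Inclusion-style transform of `op` against a `later` operation that was already applied.
// Returns null when the inverse no longer has anything to act on.
function transformAgainst(op: SheetOp, later: SheetOp): SheetOp | null {
  if (later.type === "insert" && later.index <= op.index) {
    return { ...op, index: op.index + 1 };
  }
  if (later.type === "delete") {
    if (later.index < op.index) return { ...op, index: op.index - 1 };
    if (later.index === op.index && op.type !== "insert") return null; // target already gone
  }
  return op;
}

// Undo `done`, given the operations executed after it (local or from collaborators).
function undo(sheets: string[], done: SheetOp, executedAfter: SheetOp[]): string[] {
  let inverse: SheetOp | null = invert(done);
  for (const later of executedAfter) {
    if (inverse === null) break;
    inverse = transformAgainst(inverse, later);
  }
  return inverse === null ? sheets : applyOp(sheets, inverse);
}
```

Note that the collaborator's operation is left in place: only the user's own operation is reverted, adjusted for what happened afterwards.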

Data

For cells that support rich text, besides the cell's own property settings (data format validation, function calculation, width and height, borders, fill color, and so on), each cell also has to maintain its rich text formatting and data about associated images. With 100,000 or even millions of cells, this data poses a great challenge for transmission and storage.

Versioning and restoring revision history, optimizing memory, reducing data size, using data efficiently, and lowering the time and space complexity of computation have all become problems the data layer has to face.

END

The above covers only a small part of the whole Excel project. Beyond that there are Workers, the menu bar, and all kinds of features such as data formats, functions, images, charts, filtering, sorting, smart drag-fill, import and export, range permissions, and find and replace. Every feature faces its own challenges because of the complexity of the project.

In addition, decoupling the modules from one another, organizing and structuring more than a million lines of code, optimizing the code loading process, the problems caused by many people working together, the maintainability and readability of the project, and performance optimization are all questions we often have to think about.

Concluding remarks

The strongest feeling from taking part in such a project is that there is no need to rack your brains about what highlights a project could possibly have, because there is so much that can be done. In many businesses, code quality, maintainability, and readability are often neglected. Because of the limits of the project itself (it is relatively simple), we often cannot find anything to dig deeper into, so in the end we can only improve efficiency as much as possible through automation and configuration. What we can do is actually very limited, and our own growth is limited as a result.

People often joke that the ceiling for front-end developers is too low and that they face elimination at 35. Leaving aside personal interest, enthusiasm, and bottlenecks, in many cases it is also because conditions do not allow it: the business scenarios are relatively simple, so there is nowhere to put your abilities to use. I used to think studying after work was fine, but if you can do what you like at work, isn't that killing two birds with one stone?

Finally, welcome all kinds of discussions and exchanges~

PS: Our team is still hiring~~

About AlloyTeam

AlloyTeam is one of the most influential front-end teams in China, with core members from the former WebQQ front-end team.

AlloyTeam has been responsible for large-scale Web projects such as WebQQ, QQ groups, interest tribes, and Tencent Docs, and has accumulated a lot of valuable Web development experience.

The technical atmosphere here is good, the leadership is nice, and the pay prospects are good. Whether you are an experienced engineer or a newcomer about to step from school into society, as long as you love challenges and want to improve your front-end skills rapidly together with us, this will be the best place for you.

To join us, please send your resume to [email protected]




Origin blog.csdn.net/Tencent_TEG/article/details/111306051