Implementing real-time collaborative editing

What is real-time collaborative editing?

Real-time collaborative editing here means several people editing the same document simultaneously. The most typical example is Google Docs, where you can see changes made by others in real time without manually refreshing the page.

To implement it, we need to solve two technical problems: real-time communication and edit conflicts. Real-time communication is the easier of the two, since long polling or WebSocket can be used, so it is not discussed much here. The focus is on how to resolve edit conflicts.

Options

The following sections introduce several viable options, from easy to difficult: "Edit lock", "GNU diff-patch", "Myers' diff-patch", "Operational Transformation", and "distributed Operational Transformation".

Edit lock

An edit lock is the simplest way to implement collaborative editing: while someone is editing a document, the system locks it and prevents others from editing it at the same time. Because it is simple, it is also the most widely used approach, for example in TWiki, which is common in internal company systems. Although it avoids the overwriting problem to some extent, the experience is poor and it cannot be "real-time", so it is not discussed further here.

GNU diff-patch

Version control software such as Git is in fact also a collaborative editing tool: everyone can edit in parallel, and edits to the same file can be merged automatically. We can therefore implement collaborative editing on a similar principle. There are two specific methods: diff-patch and merge.

Start with diff-patch. Here diff and patch refer to two Unix commands: diff outputs the differences between two texts, and patch applies those differences to update another file. If we implement these two algorithms in JS, collaborative editing can work as follows:

Each user establishes a long-lived connection when they open the document and keeps a copy of the current document.
When someone pauses for 5 seconds while editing (the exact interval depends on the product strategy), the client diffs the current document against its copy, sends the result to the server, and updates the copy.
The server updates its document and pushes the diff result over the long-lived connections to the other users editing the same document, who use patch to update their local documents.
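
The three steps above can be sketched in a few lines of JavaScript. This is only an illustration under simplifying assumptions: `simpleDiff` and `simplePatch` are hypothetical stand-ins that describe a single changed region by trimming the common prefix and suffix (they are not GNU diff), the server is an in-memory object instead of a real long-lived connection, and `flush` is called by hand rather than after a typing pause.

```javascript
// Hypothetical stand-in for diff: describe one changed region by trimming
// the common prefix and suffix of the two texts.
function simpleDiff(oldText, newText) {
  let start = 0;
  while (start < oldText.length && start < newText.length &&
         oldText[start] === newText[start]) start++;
  let endOld = oldText.length;
  let endNew = newText.length;
  while (endOld > start && endNew > start &&
         oldText[endOld - 1] === newText[endNew - 1]) { endOld--; endNew--; }
  return {
    start,
    removed: oldText.slice(start, endOld),
    inserted: newText.slice(start, endNew),
  };
}

// Hypothetical stand-in for patch: replay the changed region by position.
function simplePatch(text, d) {
  return text.slice(0, d.start) + d.inserted + text.slice(d.start + d.removed.length);
}

// Step 1: every client keeps a copy ("shadow") of the last synchronized text.
function createClient(initialText, sendToServer) {
  const client = {
    text: initialText,
    shadow: initialText,
    // Step 2: call after the user pauses; send the diff and update the copy.
    flush() {
      if (client.text === client.shadow) return;
      sendToServer(simpleDiff(client.shadow, client.text));
      client.shadow = client.text;
    },
    // Step 3 (receiving side): apply another user's diff to text and copy.
    receive(d) {
      client.text = simplePatch(client.text, d);
      client.shadow = simplePatch(client.shadow, d);
    },
  };
  return client;
}

// Step 3 (server side): update the document, forward the diff to the others.
function createServer(initialText) {
  const server = {
    text: initialText,
    clients: [],
    join() {
      const client = createClient(server.text, (d) => server.submit(client, d));
      server.clients.push(client);
      return client;
    },
    submit(sender, d) {
      server.text = simplePatch(server.text, d);
      for (const c of server.clients) if (c !== sender) c.receive(d);
    },
  };
  return server;
}

const server = createServer('百度 Web');
const alice = server.join();
const bob = server.join();
alice.text = '百度 Web 前端'; // Alice types, then pauses
alice.flush();
console.log(server.text, '|', bob.text);
```

Because the patch is replayed purely by position, this sketch only behaves well when edits arrive one at a time. Concurrent edits are exactly where the conflicts discussed in the rest of this article appear.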

However, GNU diff has a problem: because it matches line by line, conflicts occur easily. Let us test the diff result for the two texts "百度 Web" ("Baidu Web") and "百度 Web 前端" ("Baidu Web front-end"):

[nwind@fex ~]$ diff old.txt other-new.txt > old-to-other-new.patch
[nwind@fex ~]$ cat old-to-other-new.patch
1c1
< 百度 Web
---
> 百度 Web 前端

In this diff result, in 1c1 the first "1" is the line number before the change, "c" stands for "change", and the second "1" is the line number after the change. In other words, line 1, "百度 Web", is changed to "百度 Web 前端", and the new content is placed on line 1. This means that if two people both modify the same line, there is a conflict, which the following test confirms:

[nwind@fex ~]$ cat my-new.txt
Web

[nwind@fex ~]$ patch my-new.txt < old-to-other-new.patch
patching file b-new.txt
Hunk #1 FAILED at 1.
1 out of 1 hunk FAILED -- saving rejects to file my-new.txt.rej

[nwind@fex ~]$ cat my-new.txt.rej
***************
*** 1
- 百度 Web--- 1 -----
+ 百度 Web 前端

Here my-new.txt is my modified version, in which I removed the leading "百度" and left only "Web". In fact the two modifications do not conflict: they could be combined into "Web 前端".
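
The rejection can be reproduced with a toy line-based patch. The sketch below is not GNU diff or patch: `lineDiff` and `linePatch` are hypothetical helpers that assume both texts have the same number of lines, and a hunk is applied only when the target line still matches the old line exactly.

```javascript
// Record every line that differs, together with the content it replaces.
// Assumption: oldText and newText have the same number of lines.
function lineDiff(oldText, newText) {
  const a = oldText.split('\n');
  const b = newText.split('\n');
  const hunks = [];
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) hunks.push({ line: i + 1, expected: a[i], replacement: b[i] });
  }
  return hunks;
}

// Apply each hunk only if the target line is exactly what the hunk expects;
// otherwise reject it, much like patch writing a .rej file.
function linePatch(text, hunks) {
  const lines = text.split('\n');
  const rejected = [];
  for (const h of hunks) {
    if (lines[h.line - 1] === h.expected) lines[h.line - 1] = h.replacement;
    else rejected.push(h);
  }
  return { text: lines.join('\n'), rejected };
}

const hunks = lineDiff('百度 Web', '百度 Web 前端');
console.log(linePatch('百度 Web', hunks)); // applies cleanly
console.log(linePatch('Web', hunks));      // the only hunk is rejected
```

Since the whole line is the unit of comparison, any local edit to that line makes the hunk fail, even when the two edits touch different characters.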

merge

When the patch command hits a conflict, it generates a new file, my-new.txt.rej, describing why it failed. This is not an intuitive presentation, because you have to open two files to compare them. The merge command shows conflicts in a better way. It is used as follows:

[nwind@fex ~]$ merge my-new.txt old.txt other-new.txt
merge: warning: conflicts during merge

[nwind@fex ~]$ cat my-new.txt
<<<<<<< my-new.txt
Web=======
百度 Web 前端>>>>>>> other-new.txt

You can see that it writes the conflict directly into my-new.txt, which is more convenient than patch. This result will look familiar to most readers, because the merge command uses the same kind of three-way merge as tools such as Git.

The drawback of the merge command is that it needs three complete texts to compare. To avoid transferring the full text every time, we can combine it with diff to reduce the amount of data transferred: send only the diff, then use patch on the receiving end to rebuild the new text before merging.
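
To make the three-text requirement concrete, here is a minimal line-based three-way merge. It is a simplified sketch, not the algorithm used by the merge command or Git: `merge3` is a hypothetical helper that assumes all three texts have the same number of lines and compares them position by position.

```javascript
// Merge `mine` and `theirs`, both derived from `base`, line by line.
// Assumption: all three texts have the same number of lines.
function merge3(mine, base, theirs) {
  const m = mine.split('\n');
  const o = base.split('\n');
  const t = theirs.split('\n');
  const out = [];
  let conflicts = 0;
  for (let i = 0; i < o.length; i++) {
    if (m[i] === t[i]) out.push(m[i]);        // both sides agree
    else if (m[i] === o[i]) out.push(t[i]);   // only theirs changed this line
    else if (t[i] === o[i]) out.push(m[i]);   // only mine changed this line
    else {                                    // both changed it differently
      conflicts++;
      out.push('<<<<<<< mine', m[i], '=======', t[i], '>>>>>>> theirs');
    }
  }
  return { text: out.join('\n'), conflicts };
}

console.log(merge3('Web', '百度 Web', '百度 Web 前端').text);
console.log(merge3('A\nb', 'a\nb', 'a\nB').text);
```

The first call reports a conflict because both sides changed line 1. The second merges cleanly because the two sides changed different lines.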

Both diff and merge compare line by line, so conflicts are inevitable when the same line is edited. To solve this, we can try a diff algorithm that works at character granularity, namely Myers' diff-patch, introduced next.

Myers' diff-patch

Myers' algorithm is another diff-patch algorithm with open-source implementations in many languages. The details of the algorithm are not covered here. Instead, we test it directly with the previous example. First look at its diff result, using the following code:

var old_text = "百度 Web";
var new_text = "百度 Web 前端";

var dmp = new diff_match_patch();
var patch_list = dmp.patch_make(old_text, new_text);
var patch_text = dmp.patch_toText(patch_list);

// patch_toText URL-encodes non-ASCII characters, so decode before printing
console.log(decodeURI(patch_text));

The output is

@@ -1,6 +1,9 @@
 百度 Web
+ 前端

The - and + signs in the first line do not mean deletion or addition here. -1,6 means that before the change the hunk starts at position 1 (arrays are zero-based, so the implementation subtracts one internally) and has length 6; +1,9 means that after the change it starts at position 1 and has length 9. The next two lines describe the text changes. Note the space in front of "百度 Web": it marks the text as equal, that is, copied unchanged into the result, while the + on the last line marks added text. For the exact details, read the implementation source code.
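
As a small illustration of how such a header line can be read, the sketch below parses it into zero-based positions and lengths. `parsePatchHeader` is a hypothetical helper, not part of the diff_match_patch API, and it only handles the full "start,length" form shown above.

```javascript
// Parse a header such as "@@ -1,6 +1,9 @@" into zero-based start
// positions and lengths. Returns null for anything else.
function parsePatchHeader(line) {
  const m = /^@@ -(\d+),(\d+) \+(\d+),(\d+) @@$/.exec(line);
  if (!m) return null;
  const [oldStart, oldLength, newStart, newLength] = m.slice(1).map(Number);
  return {
    oldStart: oldStart - 1, // the header is one-based
    oldLength,
    newStart: newStart - 1,
    newLength,
  };
}

console.log(parsePatchHeader('@@ -1,6 +1,9 @@'));
```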

So its diff strategy is indeed based on character matching. Does it resolve the conflict we ran into earlier? The following code tests this:

// Setup code is the same as above
var patches = dmp.patch_fromText(patch_text);
var results = dmp.patch_apply(patches, "Web");

console.log(results[0]); //Web 前端

The output is correct, so it solves the earlier problem well. But what happens if both edits touch the same position? I ran a few more experiments:

var old_text = "百度 Web";
var other_new_text = "百度 Web 后端";
var my_new_text = "百度 Web 前端";
...
// Result: "百度 Web 前端 后端"

===
var old_text = "百度 Web 前端";
var other_new_text = "百度 Web 后端";
var my_new_text = "百度 Web 全端";
...
// Result: "百度 Web 后端"

===
var old_text = "百度 Web";
var other_new_text = "Web 前端";
var my_new_text = "百度 FE";
// Result: "FE 前"

In the first example, different characters are appended at the end, and both additions take effect. In the second example, the same position is changed to different characters, and the other person's change wins. In the last example, however, an error occurs: the character "端" is lost. That may look harmless here, but with rich text it is a problem. For example, a <b> that loses its > is not acceptable.

Overall, Myers' algorithm solves most problems at low cost, so some online editors, such as CodeBox, have chosen it to implement collaborative editing, in both their client-side and server-side code.

But Myers' algorithm loses characters in some cases. Is there a better way? Yes: the Operational Transformation technique described next.

Operational Transformation

Operational Transformation (abbreviated OT below) is the technique used in Google Docs, so it is proven in practice and worth studying.

At first I assumed OT would be very complicated, because the articles introducing it, such as the Wikipedia entry, are very long. After reading them, however, I found that its principle is not complicated. Here is a simple explanation.

First, any modification to a text can be converted into the following three types of operations:

retain(n): keep n characters, that is, leave these n characters unchanged
insert(str): insert the string str
delete(str): delete the string str

For example, if user A changes "百度 Web" to "Web 前端", this corresponds to the following three operations:

delete('百度 '),  // delete "百度 "
retain(3),       // skip 3 characters (that is, "Web")
insert(' 前端')   // insert " 前端"
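
One way such an operation list could be derived from two texts is an edit-distance table. The sketch below is an assumption-laden illustration: `diffToOps` is a hypothetical helper, it counts only insertions and deletions (a simplification of Levenshtein distance, which also counts substitutions), and it uses quadratic time and memory, so it only suits short texts.

```javascript
// Derive retain / insert / delete operations that turn oldText into newText.
function diffToOps(oldText, newText) {
  const a = Array.from(oldText);
  const b = Array.from(newText);

  // dist[i][j]: minimum insertions + deletions to turn a[i..] into b[j..]
  const dist = [];
  for (let i = a.length; i >= 0; i--) {
    dist[i] = [];
    for (let j = b.length; j >= 0; j--) {
      if (i === a.length) dist[i][j] = b.length - j;
      else if (j === b.length) dist[i][j] = a.length - i;
      else if (a[i] === b[j]) dist[i][j] = dist[i + 1][j + 1];
      else dist[i][j] = 1 + Math.min(dist[i + 1][j], dist[i][j + 1]);
    }
  }

  // Append one character to the operation list, merging with the last
  // operation when it has the same type.
  const ops = [];
  const push = (type, ch) => {
    const last = ops[ops.length - 1];
    if (last && last.type === type) {
      if (type === 'retain') last.n += 1;
      else last.str += ch;
    } else {
      ops.push(type === 'retain' ? { type, n: 1 } : { type, str: ch });
    }
  };

  // Walk the table from the start, always taking a cheapest step.
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      push('retain'); i++; j++;
    } else if (i < a.length && (j === b.length || dist[i + 1][j] <= dist[i][j + 1])) {
      push('delete', a[i]); i++;
    } else {
      push('insert', b[j]); j++;
    }
  }
  return ops;
}

console.log(JSON.stringify(diffToOps('百度 Web', 'Web 前端')));
```

For the example above this yields the same three operations: delete "百度 ", retain 3, insert " 前端".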

These operations can be extracted with the Levenshtein distance (edit distance) algorithm. So how does OT resolve conflicts? Suppose user B at the same time changes "百度 Web" to "百度 FE". B's operations are as follows:

retain(3),       // skip 3 characters (that is, "百度 ")
delete('Web'),
insert('FE')

If we apply A's operations first, the string becomes "Web 前端". Applying B's operations after that fails, because when B's second operation, delete('Web'), runs, the text starting at the fourth character is no longer "Web" but " 前端".
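
This failure can be reproduced with a small interpreter for the three operations. It is a minimal sketch: `applyOps` and the constructor helpers are names chosen for this illustration, delete is assumed to verify the text it removes, and any text after the last operation is assumed to be kept unchanged.

```javascript
function retain(n) { return { type: 'retain', n }; }
function insert(str) { return { type: 'insert', str }; }
function del(str) { return { type: 'delete', str }; }

// Apply a list of operations to `text`, reading it from left to right.
function applyOps(text, ops) {
  let pos = 0;  // read position in the original text
  let out = '';
  for (const op of ops) {
    if (op.type === 'retain') {
      if (pos + op.n > text.length) throw new Error('retain past end of text');
      out += text.slice(pos, pos + op.n);
      pos += op.n;
    } else if (op.type === 'insert') {
      out += op.str;
    } else {
      // delete: the text at the current position must match what is removed
      if (text.slice(pos, pos + op.str.length) !== op.str) {
        throw new Error('expected "' + op.str + '" at position ' + pos);
      }
      pos += op.str.length;
    }
  }
  return out + text.slice(pos); // assumption: the untouched tail is retained
}

const opsA = [del('百度 '), retain(3), insert(' 前端')];
const opsB = [retain(3), del('Web'), insert('FE')];

const afterA = applyOps('百度 Web', opsA);
console.log(afterA);
try {
  applyOps(afterA, opsB);
} catch (e) {
  console.log('B no longer applies:', e.message);
}
```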

So B's operations must be transformed to fit the new string, for example into the following:

delete('Web'),
insert('FE'),
retain(3)

This transformation algorithm is the core of OT. OT actually refers to a class of techniques rather than one specific algorithm. The idea is to first convert each edit into operations (Operational) and, when several people edit at the same time, to transform those operations (Transformation), hence the name Operational Transformation. Which operations to split edits into, and how to transform them, are both customizable, so OT is flexible enough to support many kinds of collaborative editing applications, including editing of non-text content.
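
A transformation function for the three operations might look like the sketch below. It is one possible design, not the algorithm of Google Docs or of any particular library. It assumes both operation lists cover the whole base text (trailing text needs an explicit retain), and the `opInsertsFirst` flag is a tie-breaking choice made for this example: when both sides insert at the same position, it decides whose text comes first.

```javascript
function size(c) { return c.type === 'retain' ? c.n : c.str.length; }

// What is left of a retain/delete component after its first n characters.
function remainder(c, n) {
  if (size(c) <= n) return null;
  return c.type === 'retain'
    ? { type: 'retain', n: c.n - n }
    : { type: 'delete', str: c.str.slice(n) };
}

// Rewrite `op` so that it can be applied after `other` has been applied.
// Both lists must describe edits to the same base text.
function transform(op, other, opInsertsFirst = true) {
  const a = op.slice();
  const b = other.slice();
  const out = [];
  let x = a.shift();
  let y = b.shift();
  while (x || y) {
    const xIns = Boolean(x) && x.type === 'insert';
    const yIns = Boolean(y) && y.type === 'insert';
    if (xIns && (!yIns || opInsertsFirst)) {      // our own insert survives
      out.push(x); x = a.shift(); continue;
    }
    if (yIns) {                                   // step over the other's insert
      out.push({ type: 'retain', n: y.str.length }); y = b.shift(); continue;
    }
    if (!x || !y) throw new Error('operations must span the same base text');
    const n = Math.min(size(x), size(y));
    if (x.type === 'retain' && y.type === 'retain') {
      out.push({ type: 'retain', n });
    } else if (x.type === 'delete' && y.type === 'retain') {
      out.push({ type: 'delete', str: x.str.slice(0, n) });
    }
    // If `other` deleted this text, it is already gone: emit nothing.
    x = remainder(x, n) || a.shift();
    y = remainder(y, n) || b.shift();
  }
  return out;
}

const opA = [{ type: 'delete', str: '百度 ' }, { type: 'retain', n: 3 }, { type: 'insert', str: ' 前端' }];
const opB = [{ type: 'retain', n: 3 }, { type: 'delete', str: 'Web' }, { type: 'insert', str: 'FE' }];
console.log(JSON.stringify(transform(opB, opA)));
```

With B's inserts placed first, B is transformed into delete "Web", insert "FE", retain 3, matching the operations listed above. For both sides to converge, A has to be transformed against B with the opposite setting, `transform(opA, opB, false)`.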

Back to the example where Myers' algorithm lost a character: can OT solve it? Here I use an open-source library, changesets. The following merge example is based on it:

var Changeset = require('changesets').Changeset;

var text = "百度 Web"
  , textA = "Web 前端"
  , textB = "百度 FE";

var csA = Changeset.fromDiff(text, textA);
var csB = Changeset.fromDiff(text, textB);

var csB_new = csB.transformAgainst(csA); // this is the operational transformation step

var textA_new = csA.apply(text);
console.log(csB_new.apply(textA_new)); // Result: " 前端FE"

The result is not correct; it should be "FE 前端". Looking at the contents of csB_new shows that it was actually transformed into the following:

delete(3),   // note: in changesets the argument here is a number, not a string; it deletes 3 characters regardless of what they are
retain(3),
insert('FE')

Note that this is not a problem with OT itself but with the transformation algorithm implemented in changesets. Although the result is not perfect, at least no characters were lost, unlike with Myers' algorithm. I ran a few more tests and found OT to be more accurate than Myers', so it is the most suitable technique for collaborative editing.

Distributed Operational Transformation

If after reading the above you think real-time collaborative editing does not seem hard, you are wrong, because so far we have not considered distribution. OT has been studied in academia for more than 20 years, and no one has settled on a single best approach. A former Google Wave engineer wrote on the ShareJS home page:

Unfortunately, implementing OT sucks. There’s a million algorithms with different tradeoffs, mostly trapped in academic papers. The algorithms are really hard and time consuming to implement correctly. I am an ex Google Wave engineer. Wave took 2 years to write and if we rewrote it today, it would take almost as long to write a second time.

So it is in fact very hard to get right, and the most troublesome problems are caused by distribution. The following sections introduce three problems I can think of, together with their solutions.

1. Ordering problem

The first problem is ordering. OT algorithms depend on order, and a different order leads to a different result, as the image below illustrates:

[Figure: order-problem]

Suppose Client A makes two modifications and sends them as two asynchronous requests. Because of network latency, the second request may arrive first, so the version on the server ends up inconsistent with what Client A sees. The same ordering problem can occur when the server forwards requests to other clients, which is why Client B in the figure is also wrong.

The solution is simple: add queues on both the client and the server to preserve request order, for example by sending the next request only after the previous one has finished.
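
A client-side queue of this kind can be sketched as follows. `createSendQueue` and the `transport(message, done)` callback shape are assumptions made for this example: `transport` stands for whatever actually sends a request and calls `done` once the server has acknowledged it.

```javascript
// Send queued messages strictly one at a time, in the order they were added.
function createSendQueue(transport) {
  const pending = [];
  let inFlight = false;

  function sendNext() {
    if (inFlight || pending.length === 0) return;
    inFlight = true;
    const message = pending.shift();
    transport(message, () => {
      inFlight = false; // the previous request has finished
      sendNext();       // only now may the next one go out
    });
  }

  return {
    enqueue(message) {
      pending.push(message);
      sendNext();
    },
    get size() {
      return pending.length;
    },
  };
}
```

Usage could look like `queue.enqueue(diff)` each time the user pauses. Because a second request is never sent while the first is still in flight, network latency can no longer reorder them. The cost is that a slow request delays everything queued behind it.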

2. Atomic storage operations

With multiple servers, or multiple threads or processes, handling requests at the same time, updates can overwrite each other, because reading from and writing to the database is not one atomic operation, as in the following example:

[Figure: data-atomic]

Web Server A and Web Server B access the database at the same time, and as a result Web Server A's modification is overwritten.

Fortunately, this is a fairly common problem, and there are three possible solutions:

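Ensure that the operation runs in only one thread. For example, handle all updates to a given document on one fixed machine and serve them with a single-threaded model such as Node, so that parallel modification is impossible.
If the database supports transactions, use a transaction.
If the database does not support transactions, the only option is a distributed lock, such as ZooKeeper.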

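From an implementation point of view, the first and second methods are relatively simple, while the third brings many problems. For example, a document may become locked forever: if the unlock step never runs after the lock is taken, for whatever reason, the document stays locked, so policies such as a timeout are also needed.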
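
The timeout idea can be illustrated with an in-memory lock table. This is only a single-process stand-in for a real distributed lock service: `createLockTable`, its parameters, and the injectable `now` clock are all assumptions made for this sketch.

```javascript
// A lock table in which every lock expires after ttlMs milliseconds, so a
// holder that never unlocks cannot block a document forever.
function createLockTable(ttlMs, now = Date.now) {
  const locks = new Map(); // docId -> { owner, expiresAt }

  return {
    // Returns true if `owner` now holds the lock on `docId`.
    acquire(docId, owner) {
      const held = locks.get(docId);
      const stillValid = held && held.expiresAt > now();
      if (stillValid && held.owner !== owner) return false;
      locks.set(docId, { owner, expiresAt: now() + ttlMs });
      return true;
    },
    // Only the current owner can release a lock.
    release(docId, owner) {
      const held = locks.get(docId);
      if (held && held.owner === owner) locks.delete(docId);
    },
  };
}
```

A caller would acquire the lock, read and write the document, then release it. If the caller crashes in between, the next `acquire` succeeds once the timeout has passed. Choosing the timeout is a trade-off: too short and a slow writer loses its lock mid-update, too long and a crashed writer blocks everyone else.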

Once atomicity is solved, however, a new problem appears: version management.

3. Version management problem

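In the earlier examples, both new texts were modifications of the same old version. If the old versions differ, things can go wrong, as the following figure explains: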

[Figure: version-problem]

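In this example, Web Server A receives an operation that changes the text "a" to "aa", and Web Server B receives an operation that changes "a" to "ab". A lock is used here to avoid concurrent reads and writes. Web Server A gets the lock first, then modifies and updates the data, while Web Server B has to wait for the data to be unlocked. By the time Web Server B gets the data, it has already changed from "a" to "aa". If B still applies retain(1), insert('b'), the data becomes "ab" instead of the correct "aab". The cause is that the old versions do not match: Web Server B must transform its operation against Web Server A's operation, turning it into retain(2), insert('b'), before it modifies the data.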

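To solve this problem we must therefore introduce versions and store a new version after every modification. With versions, we can use diff to compute the differences between versions and obtain what other people changed, then combine the two operations with the OT merge algorithm, as shown below: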

[Figure: version-problem]

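Before Web Server A's operation the data version is v=1, and afterwards it is v=2. When Web Server B handles its request, a version comparison shows the mismatch, so it first uses the edit distance algorithm to compute the operation Web Server A performed, then transforms its own operation against it to obtain the correct new operation, which avoids the overwrite.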
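
The version check can be sketched as below, under two simplifying assumptions that differ from the description above: operations are reduced to a single insert `{ pos, str }`, and the store keeps a history of applied operations instead of recomputing them by diffing stored versions. `createVersionedDoc` is a hypothetical name used only for this example.

```javascript
// A document store that tags every state with a version number and
// transforms stale operations before applying them.
function createVersionedDoc(initialText) {
  let text = initialText;
  let version = 1;
  const history = []; // history[i] is the insert that produced version i + 2

  // Shift `op` so that it still fits after `applied` has been applied.
  function transformInsert(op, applied) {
    return applied.pos <= op.pos
      ? { pos: op.pos + applied.str.length, str: op.str }
      : op;
  }

  return {
    get text() { return text; },
    get version() { return version; },
    // `op` was created by a client that last saw `baseVersion`.
    submit(op, baseVersion) {
      if (baseVersion < 1 || baseVersion > version) {
        throw new Error('unknown base version');
      }
      let current = op;
      // Transform against every operation applied since baseVersion.
      for (const applied of history.slice(baseVersion - 1)) {
        current = transformInsert(current, applied);
      }
      text = text.slice(0, current.pos) + current.str + text.slice(current.pos);
      history.push(current);
      version += 1;
      return version;
    },
  };
}

const doc = createVersionedDoc('a');
doc.submit({ pos: 1, str: 'a' }, 1); // Web Server A: "a" -> "aa", now v=2
doc.submit({ pos: 1, str: 'b' }, 1); // Web Server B: still based on v=1
console.log(doc.text, doc.version);
```

The second call is based on v=1 while the store is already at v=2, so the insert position is shifted from 1 to 2 before it is applied, and the text ends up as "aab" at v=3.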

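Storing every version greatly increases the amount of data, so further optimization is needed, for example having each server keep its own copy of the data, but that is not covered here. As you can see, supporting distribution is quite troublesome. However, some integrated front-end and back-end solutions have appeared, such as ShareJS and OpenCoweb Framework, which are worth a look.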

In addition, the Myers' diff algorithm mentioned earlier also has a distributed solution; for details, refer to the relevant document.

Preliminary conclusions

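If it is just a small internal project with low real-time requirements but high accuracy requirements
- Use merge or the diff3 tool and let users resolve conflicts on the same line, which avoids the errors automatic merging can introduce
If you want some degree of real-time behavior, traffic is low, you do not want a complex implementation, and a small number of conflicts is tolerable
- Use Myers' diff with a single Node process on the back end
If you want real-time behavior and have multiple back-end servers handling requests at the same time
- Use Operational Transformation or Myers' diff, but watch out for the problems distribution brings
If you need fine-grained control, such as rich-text editing or other formats beyond plain text
- Operational Transformation is the only option, but you have to implement the operation merge algorithm yourself; for XML, for example, there is an article that can serve as a reference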

Follow-up

Besides text merging, a real online editor has many more details to handle. Interested readers can explore further:

Selections: showing the text ranges other people have selected, which is of course also a merge problem
Cursor position: moving the cursor to the correct position as the text changes
Undo support

Reproduced from: https://www.jianshu.com/p/a3b350063cb5