[Vue2.0 source code learning] template compilation - template parsing stage (text parser)

1. Introduction

In the previous article, we said that when the HTML parser encounters text content, it calls chars, one of its four hook functions, to create a text-type AST node. We also said that chars distinguishes two cases depending on whether the text contains variables: it creates either an AST node with variables or an AST node without variables, as follows:

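// Triggered when the parser reaches the text inside a tag
chars (text) {
  let res, element
  if ((res = parseText(text))) {
    // The text contains variables
    element = {
      type: 2,
      expression: res.expression,
      tokens: res.tokens,
      text
    }
  } else {
    // Plain text without variables
    element = {
      type: 3,
      text
    }
  }
}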

As can be seen from the code above, an AST node with variables has type 2 and, compared with an AST node without variables, carries two extra attributes: expression and tokens. So how do we determine whether the text contains variables, and what are these two extra attributes? This is where the text parser comes in. When Vue's HTML parser has parsed out a piece of text, it passes that text to the text parser. The text parser then determines whether the text contains variables and, if it does, produces expression and tokens. In this article we will analyze what the text parser does.

2. Analysis of results

Before studying the internals of the text parser, let's look at what the text content produced by the HTML parser looks like after it has passed through the text parser. This will help a great deal when we analyze the internals later.

As can be seen from the chars function above, the text content produced by the HTML parser is passed to the text parser function parseText. Whether the text contains variables is determined by whether parseText returns a value, and the required expression and tokens are read from that return value. So let's first look at what the return value of parseText looks like when there is one.

Suppose the text content parsed by the HTML parser is as follows:

// In English: "My name is {{name}}, I am {{age}} years old"
let text = "我叫{{name}},我今年{{age}}岁了"

After being parsed by the text parser, we get:

let res = parseText(text)
res = {
  expression: '"我叫"+_s(name)+",我今年"+_s(age)+"岁了"',
  tokens: [
    "我叫",
    { '@binding': 'name' },
    ",我今年",
    { '@binding': 'age' },
    "岁了"
  ]
}

From the result above, we can see that expression is built by extracting the variable and non-variable parts of the text, wrapping each variable in _s(), and joining all parts with + in the order in which they appear in the text. tokens is an array that also holds the variable and non-variable parts of the text; the difference is that each variable is represented as {'@binding': xxx}.

So what is this for? It is mainly used later, in the code generation phase, to generate the render function. We will explain this in detail when we reach code generation; for now, it can be understood as simply constructing the required form.
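To make the purpose of expression more concrete, here is a minimal sketch of how such a string could be evaluated against some data. The _s below is a simplified stand-in for Vue's helper, and renderText is a hypothetical function invented for this illustration; neither is Vue's actual implementation.

```javascript
// Simplified stand-in for Vue's _s helper (an assumption for this sketch):
// turn a value into a string, treating null/undefined as empty.
function _s (val) {
  return val == null ? '' : String(val)
}

// The expression string that parseText would produce for the example text.
const expression = '"我叫"+_s(name)+",我今年"+_s(age)+"岁了"'

// Hypothetical helper: evaluate the expression with `data` as the scope,
// so that `name` and `age` resolve to properties of `data`.
function renderText (expression, data) {
  const fn = new Function('_s', 'data', 'with (data) { return ' + expression + ' }')
  return fn(_s, data)
}

console.log(renderText(expression, { name: '小明', age: 18 }))
// 我叫小明,我今年18岁了
```

The point is only that expression is a piece of JavaScript source code: once the variables are in scope, evaluating it yields the final text.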

OK, now we know that the text parser does three things internally:

  • Determine whether the incoming text contains variables
  • Construct expression
  • Construct tokens

Then we will analyze the internal working principle of the text parser line by line by reading the source code.

3. Source code analysis

The source code of the text parser is located in src/compiler/parser/text-parser.js, and the code is as follows:

import { cached } from 'shared/util'
import { parseFilters } from './filter-parser'

const defaultTagRE = /\{\{((?:.|\n)+?)\}\}/g
const regexEscapeRE = /[-.*+?^${}()|[\]\/\\]/g

const buildRegex = cached(delimiters => {
  const open = delimiters[0].replace(regexEscapeRE, '\\$&')
  const close = delimiters[1].replace(regexEscapeRE, '\\$&')
  return new RegExp(open + '((?:.|\\n)+?)' + close, 'g')
})

export function parseText (text, delimiters) {
  const tagRE = delimiters ? buildRegex(delimiters) : defaultTagRE
  if (!tagRE.test(text)) {
    return
  }
  const tokens = []
  const rawTokens = []
  /**
   * let lastIndex = tagRE.lastIndex = 0
   * The line above is equivalent to the following two lines:
   * tagRE.lastIndex = 0
   * let lastIndex = tagRE.lastIndex
   */
  let lastIndex = tagRE.lastIndex = 0
  let match, index, tokenValue
  while ((match = tagRE.exec(text))) {
    index = match.index
    // push text token
    if (index > lastIndex) {
      // First put the text in front of '{{' into tokens
      rawTokens.push(tokenValue = text.slice(lastIndex, index))
      tokens.push(JSON.stringify(tokenValue))
    }
    // tag token
    // Take out the variable exp between '{{' and '}}'
    const exp = parseFilters(match[1].trim())
    // Turn the variable exp into the form _s(exp) and put it into tokens as well
    tokens.push(`_s(${exp})`)
    rawTokens.push({ '@binding': exp })
    // Move lastIndex just past '}}' so that the next plain-text slice starts there
    lastIndex = index + match[0].length
  }
  // When the regex no longer matches the remaining text, all variables have been processed.
  // If lastIndex < text.length at this point, there is still text after the last variable,
  // so that trailing text is added to tokens as well.
  if (lastIndex < text.length) {
    rawTokens.push(tokenValue = text.slice(lastIndex))
    tokens.push(JSON.stringify(tokenValue))
  }

  // Finally, join all elements of the tokens array with '+'
  return {
    expression: tokens.join('+'),
    tokens: rawTokens
  }
}

As we can see, apart from the comments we added, the code is not complicated. Let's analyze it line by line.

The parseText function receives two parameters: text, the text content to be parsed, and delimiters, the symbols used to wrap variables. The first parameter is easy to understand, so what does the second one do? Don't worry, let's look at the first line of the function body:

const tagRE = delimiters ? buildRegex(delimiters) : defaultTagRE

The function body first defines tagRE, a regular expression used to check whether the text contains variables. We usually write variables in templates like this: hello {{name}}. The content wrapped in {{}} is the variable, so tagRE is used to detect whether the text contains {{}}. tagRE is not fixed, however: it depends on whether the delimiters parameter is passed in. If delimiters is not passed, tagRE checks whether the text contains {{}}; if it is passed, tagRE checks whether the text contains the delimiters that were passed in. In other words, when developing a Vue project, the user can customize the symbols used to wrap variables in text; for example, with % as the delimiter, a variable is written as hello %name%.
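As a small illustration of custom delimiters, the sketch below rebuilds the regular expression in the same way as buildRegex, but without the cached() wrapper so that it runs on its own. The regexEscapeRE defined here is assumed to escape regular expression metacharacters; the sample delimiters and strings are made up for the example.

```javascript
// Assumed definition: matches characters that have a special meaning in a regex.
const regexEscapeRE = /[-.*+?^${}()|[\]\/\\]/g

// Same construction as buildRegex in the source above, minus caching.
function buildRegex (delimiters) {
  const open = delimiters[0].replace(regexEscapeRE, '\\$&')
  const close = delimiters[1].replace(regexEscapeRE, '\\$&')
  return new RegExp(open + '((?:.|\\n)+?)' + close, 'g')
}

// '%' needs no escaping, so the variable is whatever sits between two '%'.
const percentRE = buildRegex(['%', '%'])
console.log(percentRE.exec('hello %name%')[1]) // name

// '${' and '}' contain metacharacters, which is why they are escaped first.
const dollarRE = buildRegex(['${', '}'])
console.log(dollarRE.exec('hello ${ name }')[1].trim()) // name
```

Without the escaping step, delimiters such as ${ would be interpreted as regex syntax instead of literal characters.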

Next, tagRE is used to match the incoming text to determine whether it contains variables. If not, the function returns immediately, as follows:

if (!tagRE.test(text)) {
  return
}

If it contains variables, then continue to look down:

const tokens = []
const rawTokens = []
let lastIndex = tagRE.lastIndex = 0
let match, index, tokenValue
while ((match = tagRE.exec(text))) {

}

Then a while loop starts. The loop ends when tagRE.exec(text), whose result is assigned to match, returns null. exec() searches a string for a match: it returns null if no match is found, and an array if one is found. For example:

tagRE.exec("hello {{name}},I am {{age}}")
// returns: ["{{name}}", "name", index: 6, input: "hello {{name}},I am {{age}}", groups: undefined]
tagRE.exec("hello")
// returns: null

As can be seen, when there is a match, the first element of the result is the first complete wrapped variable in the string (delimiters included), the second element is the variable name inside the delimiters, and the index property is the starting position of that wrapped variable in the string.
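The shape of the match result is easier to see when every match is collected in a loop. The following is a small standalone sketch using the default regular expression; the found array and its property names are introduced only for this illustration.

```javascript
const tagRE = /\{\{((?:.|\n)+?)\}\}/g
const text = 'hello {{name}},I am {{age}}'

// Collect the interesting parts of every match result.
const found = []
let match
while ((match = tagRE.exec(text))) {
  found.push({
    whole: match[0],   // the wrapped variable, delimiters included
    inner: match[1],   // the captured content between the delimiters
    index: match.index // where the wrapped variable starts in the text
  })
}

console.log(found)
// [ { whole: '{{name}}', inner: 'name', index: 6 },
//   { whole: '{{age}}', inner: 'age', index: 20 } ]
```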

Then look down the body of the loop:

while ((match = tagRE.exec(text))) {
  index = match.index
  if (index > lastIndex) {
    // First put the text in front of '{{' into tokens
    rawTokens.push(tokenValue = text.slice(lastIndex, index))
    tokens.push(JSON.stringify(tokenValue))
  }
  // tag token
  // Take out the variable exp between '{{' and '}}'
  const exp = match[1].trim()
  // Turn the variable exp into the form _s(exp) and put it into tokens as well
  tokens.push(`_s(${exp})`)
  rawTokens.push({ '@binding': exp })
  // Move lastIndex just past '}}' so that the next plain-text slice starts there
  lastIndex = index + match[0].length
}

In the code above, the starting position of the matched variable in the string is first assigned to index, and then index is compared with lastIndex. At this point you may wonder what lastIndex is. It is declared above as let lastIndex = tagRE.lastIndex = 0, which sets both tagRE.lastIndex and the local variable lastIndex to 0. So what is tagRE.lastIndex? When exec() is called on a regular expression object with the g flag, it sets the object's lastIndex property to the position immediately after the matched substring. When exec() is called again on the same regular expression, the search starts from the position indicated by lastIndex, and if exec() finds no match, lastIndex is reset to 0. test() updates lastIndex in the same way, which is why tagRE.lastIndex is reset to 0 after the tagRE.test(text) check. The local lastIndex, in turn, records where the previous match ended. An example:

const tagRE = /\{\{((?:.|\n)+?)\}\}/g
tagRE.exec("hello {{name}},I am {{age}}")
tagRE.lastIndex   // 14

As the example shows, after the first match tagRE.lastIndex is 14, the position immediately after the last } of the first wrapped variable in the string. The initial value of lastIndex is 0.
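The sketch below records tagRE.lastIndex after each call to show how the g flag makes the regular expression stateful, and why parseText sets tagRE.lastIndex back to 0 after calling test(). The positions array is introduced only for this illustration.

```javascript
const tagRE = /\{\{((?:.|\n)+?)\}\}/g
const text = 'hello {{name}},I am {{age}}'

const positions = []
positions.push(tagRE.lastIndex) // initial value

tagRE.test(text)                // test() also advances lastIndex
positions.push(tagRE.lastIndex) // just past "{{name}}"

tagRE.exec(text)                // continues from lastIndex, finds "{{age}}"
positions.push(tagRE.lastIndex) // just past "{{age}}", the end of the text

tagRE.exec(text)                // nothing left to match, returns null
positions.push(tagRE.lastIndex) // reset to 0

console.log(positions) // [ 0, 14, 27, 0 ]
```

If parseText did not reset tagRE.lastIndex after test(), the first exec() in the loop would start at position 14 and skip the first variable.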

The rest is then easy to understand. When index > lastIndex, there is plain text in front of the variable, so this plain text is sliced out and stored in rawTokens; at the same time, JSON.stringify is called to wrap the text in double quotes, and the result is stored in tokens, as follows:
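Why JSON.stringify rather than simply adding quotes by hand? The text ends up inside a string of JavaScript source code, so any quotes or line breaks in it must be escaped. Here is a minimal sketch of the difference; evaluates is a hypothetical helper written only for this comparison.

```javascript
// Plain text that contains double quotes and a line break.
const plain = 'say "hi"\nto '

const naive = '"' + plain + '"'    // quotes added by hand, nothing escaped
const safe = JSON.stringify(plain) // quotes added and special characters escaped

// Hypothetical helper: try to evaluate a piece of source code as an expression.
function evaluates (source) {
  try {
    return { ok: true, value: new Function('return ' + source)() }
  } catch (e) {
    return { ok: false, error: e.name }
  }
}

console.log(evaluates(safe).value === plain) // true
console.log(evaluates(naive))                // { ok: false, error: 'SyntaxError' }
```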

if (index > lastIndex) {
  // First put the text in front of '{{' into tokens
  rawTokens.push(tokenValue = text.slice(lastIndex, index))
  tokens.push(JSON.stringify(tokenValue))
}

If index is not greater than lastIndex, there is no plain text in front of the variable: either the text starts with a variable (index is 0, for example {{name}}hello) or the variable immediately follows the previous one. In that case nothing needs to be sliced out. Either way, the next step takes the second element of the match result, the variable name, wraps it in _s() and stores it in tokens, then constructs {'@binding': exp} from the variable name and stores it in rawTokens, as follows:

// Take out the variable exp between '{{' and '}}'
const exp = match[1].trim()
// Turn the variable exp into the form _s(exp) and put it into tokens as well
tokens.push(`_s(${exp})`)
rawTokens.push({ '@binding': exp })

Then, lastIndex is updated to point just past }}, so that in the next iteration only the text after the current variable is considered, as follows:

lastIndex = index + match[0].length

When the while loop ends, all variables in the text have been parsed. If lastIndex < text.length at this point, there is still plain text after the last variable, which is then stored in tokens and rawTokens as well, as follows:

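// When the regex no longer matches the remaining text, all variables have been processed.
// If lastIndex < text.length at this point, there is still text after the last variable,
// so that trailing text is added to tokens as well.
if (lastIndex < text.length) {
  rawTokens.push(tokenValue = text.slice(lastIndex))
  tokens.push(JSON.stringify(tokenValue))
}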

Finally, the elements of the tokens array are joined with + and returned together with rawTokens, as follows:

return {
  expression: tokens.join('+'),
  tokens: rawTokens
}

The above is all the logic of the text parser's parseText function.
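To tie the steps together, here is a self-contained sketch that can be run directly. It is a simplification of the source shown earlier, not Vue's actual code: the name parseTextSketch is made up, only the default {{}} delimiters are supported, and the parseFilters and cached helpers are left out.

```javascript
const defaultTagRE = /\{\{((?:.|\n)+?)\}\}/g

// Simplified version of parseText: default delimiters only, no filters.
function parseTextSketch (text) {
  const tagRE = defaultTagRE
  if (!tagRE.test(text)) {
    return // no variables: the caller creates a plain text node
  }
  const tokens = []    // pieces of JavaScript source code
  const rawTokens = [] // plain strings and { '@binding': ... } objects
  let lastIndex = tagRE.lastIndex = 0 // undo the side effect of test()
  let match
  while ((match = tagRE.exec(text))) {
    const index = match.index
    if (index > lastIndex) {
      // Plain text between the previous variable and this one
      const plain = text.slice(lastIndex, index)
      rawTokens.push(plain)
      tokens.push(JSON.stringify(plain))
    }
    const exp = match[1].trim()
    tokens.push(`_s(${exp})`)
    rawTokens.push({ '@binding': exp })
    lastIndex = index + match[0].length
  }
  if (lastIndex < text.length) {
    // Plain text after the last variable
    const plain = text.slice(lastIndex)
    rawTokens.push(plain)
    tokens.push(JSON.stringify(plain))
  }
  return { expression: tokens.join('+'), tokens: rawTokens }
}

const res = parseTextSketch('我叫{{name}},我今年{{age}}岁了')
console.log(res.expression) // "我叫"+_s(name)+",我今年"+_s(age)+"岁了"
console.log(res.tokens)

console.log(parseTextSketch('hello'))        // undefined
console.log(parseTextSketch('{{ a }}{{b}}')) // adjacent variables, no plain text
```

The last call exercises the case where index is not greater than lastIndex, so no plain text is pushed between the two variables.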

4. Summary

This article introduced the internal working principle of the text parser. The text parser performs a second pass over the text content produced by the HTML parser and checks whether it contains variables. If it does, the variables are extracted and processed to prepare for the later generation of the render function.


Origin blog.csdn.net/weixin_46862327/article/details/131744220