Use pegjs to parse Java code

What is pegjs

pegjs is a JavaScript parser generator based on PEG (parsing expression grammar), a kind of analytic formal grammar. PEG expressions look much like the regular expressions most developers already use. Note that a PEG cannot be ambiguous: its choice operator tries the alternatives in order and commits to the first one that matches, so any input has at most one parse.

pegjs official website: https://pegjs.org/
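To make the "first match wins" behaviour concrete, here is a minimal hand-written sketch in plain JavaScript. It is not how pegjs is implemented; literal and choice are hypothetical helpers invented for this illustration, and the two type names are only sample data.

```javascript
// Hand-written sketch of PEG ordered choice; not the pegjs implementation.
// Each parser takes (input, pos) and returns { value, pos } or null.
const literal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? { value: text, pos: pos + text.length } : null;

// Ordered choice: try the alternatives left to right, commit to the first match.
const choice = (...parsers) => (input, pos) => {
  for (const parser of parsers) {
    const result = parser(input, pos);
    if (result !== null) return result;
  }
  return null;
};

// Similar in spirit to the PEG rule: Type = 'Int' / 'Integer'
const shortFirst = choice(literal("Int"), literal("Integer"));
// Reordered so the longer literal gets the first chance: 'Integer' / 'Int'
const longFirst = choice(literal("Integer"), literal("Int"));

console.log(shortFirst("Integer", 0)); // { value: 'Int', pos: 3 }
console.log(longFirst("Integer", 0));  // { value: 'Integer', pos: 7 }
```

Because the first matching alternative always wins, the order of alternatives in a PEG rule matters: a shorter literal listed first can hide a longer one that shares its prefix.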

The role of pegjs

When a regular expression cannot do the job, or can only do it with difficulty, pegjs is a good choice for parsing, for example parsing SQL statements. It is also convenient for writing custom rules when building a DSL.
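Nested structure is a typical case where a single regular expression struggles, because matching brackets to any depth requires recursion. The sketch below shows the idea with a hand-written recursive function in plain JavaScript; parseGroup and isBalanced are hypothetical names, and the rule in the comment is only an illustration of what the equivalent recursive PEG rule might look like.

```javascript
// Recursive-descent sketch for a rule like:  Group = '(' Group* ')'
// Returns the position just after one balanced group, or -1 if none starts at pos.
function parseGroup(input, pos) {
  if (input[pos] !== "(") return -1;
  let next = pos + 1;
  // Group*: consume as many nested groups as possible.
  for (let inner = parseGroup(input, next); inner !== -1; inner = parseGroup(input, next)) {
    next = inner;
  }
  return input[next] === ")" ? next + 1 : -1;
}

// True only when the whole input is a single balanced group.
function isBalanced(input) {
  return parseGroup(input, 0) === input.length;
}

console.log(isBalanced("(()(()))")); // true
console.log(isBalanced("(()"));      // false
```

A grammar rule that refers to itself expresses the same recursion declaratively, which is what makes a parser generator a better fit than a regular expression here.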

Simple application of pegjs

1. Take parsing a piece of Java code as an example. First prepare the Java code to be parsed

class Test {
  @tag(1)
  @label("名字")
  String name;

  @tag(2)
  @label("性别 0-男 1-女")
  Int sex;
}

2. Define the root node, which contains the type, the version and an array of code blocks. The * operator means zero or more matches

CodeBlock = blocks:IdlBlock* {
    return {
        type: 'javaSchema',
        version: '1.0.0',
        blocks
    }
} 

3. Use _ to match white space, then match the class keyword. The Identifier rule matches the class name, which is bound to the label className and returned by the action. After the white space, _ '{' children:Children* '}' parses the child nodes between the two curly braces

IdlBlock =
    _ 'class'
    _ className:Identifier
    _ '{' children:Children* '}'
    _ {
        return {
            className,
            children
        }
    }
    
    
_ "whitespace" = [ \t\r\n]*

Identifier = $([a-zA-Z_])+

4. Parse the variables in the child nodes

Children =
    _ variable:Variable ';'
    _ {
        return {
            variable
        }
    }

5. Define the rule for parsing a variable. Every part is followed by ?, so each one is optional

Variable =
    _ tag:Tag?
    _ label:Label?
    _ type:DataType?
    _ name:Identifier?
    _ {
        return {
            tag,
            label,
            type,
            name
        }
    }

6. Parse @tag(1) and return the argument

Tag = '@tag(' tagColumn:TagColumn ')' {
    return tagColumn
}

TagColumn = $([0-9])*

7. Parse @label("名字") and return the description text from the argument

Label = '@label("' labelColumn:LabelColumn '")' {
    return labelColumn
}

LabelColumn = $([^\r\n\t\"\)])*

8. Parse the variable type. Only the two types used in the sample code are defined here; more types can be added as further alternatives separated by /

DataType = 'String' / 'Int'

Complete example

CodeBlock = blocks:IdlBlock* {
    return {
        type: 'javaSchema',
        version: '1.0.0',
        blocks
    }
} 

IdlBlock =
    _ 'class'
    _ className:Identifier
    _ '{' children:Children* '}'
    _ {
        return {
            className,
            children
        }
    }
    
_ "whitespace" = [ \t\r\n]*

Identifier = $([a-zA-Z_])+

Children =
    _ variable:Variable ';'
    _ {
        return {
            variable
        }
    }

Variable =
    _ tag:Tag?
    _ label:Label?
    _ type:DataType?
    _ name:Identifier?
    _ {
        return {
            tag,
            label,
            type,
            name
        }
    }

Tag = '@tag(' tagColumn:TagColumn ')' {
    return tagColumn
}

TagColumn = $([0-9])*

Label = '@label("' labelColumn:LabelColumn '")' {
    return labelColumn
}

LabelColumn = $([^\r\n\t\"\)])*

DataType = 'String' / 'Int'

Parse result

{
   "type": "javaSchema",
   "version": "1.0.0",
   "blocks": [
      {
         "className": "Test",
         "children": [
            {
               "variable": {
                  "tag": "1",
                  "label": "名字",
                  "type": "String",
                  "name": "name"
               }
            },
            {
               "variable": {
                  "tag": "2",
                  "label": "性别 0-男 1-女",
                  "type": "Int",
                  "name": "sex"
               }
            }
         ]
      }
   ]
}
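Since the parse result is a plain object, ordinary JavaScript can walk it. The sketch below is one hypothetical way to flatten the result above into a list of fields; listFields and the owner property are names made up for this illustration and are not part of pegjs.

```javascript
// The parse result shown above, written as a JavaScript object literal.
const ast = {
  type: "javaSchema",
  version: "1.0.0",
  blocks: [
    {
      className: "Test",
      children: [
        { variable: { tag: "1", label: "名字", type: "String", name: "name" } },
        { variable: { tag: "2", label: "性别 0-男 1-女", type: "Int", name: "sex" } },
      ],
    },
  ],
};

// listFields is a hypothetical helper, not part of pegjs.
function listFields(schema) {
  return schema.blocks.flatMap((block) =>
    block.children.map(({ variable }) => ({
      owner: block.className,
      name: variable.name,
      type: variable.type,
      // The Tag rule returns the matched text, so convert it to a number here.
      tag: Number.parseInt(variable.tag, 10),
    }))
  );
}

console.log(listFields(ast));
```

Note that every part of the Variable rule is optional, and pegjs returns null for an optional expression that did not match, so code that consumes real parse results would probably want to check for null before converting.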

The rules above can be verified directly in the online editor on the pegjs official website. pegjs is also available as an npm package, and its JavaScript API is fairly simple; see the official pegjs documentation for details.
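As a rough sketch of the npm route, the snippet below compiles two rules copied from the grammar above and parses a single annotation. It assumes pegjs 0.10.x has been installed with npm install pegjs; tagGrammar and parseTag are hypothetical names used only here.

```javascript
// Assumes pegjs 0.10.x is installed locally: npm install pegjs
// Two rules copied from the grammar above; String.raw keeps backslashes literal.
const tagGrammar = String.raw`
Tag = '@tag(' tagColumn:TagColumn ')' {
  return tagColumn
}

TagColumn = $([0-9])*
`;

// parseTag is a hypothetical helper name used only in this sketch.
function parseTag(source) {
  // Required here so the sketch only needs pegjs when the function is called.
  const peg = require("pegjs");
  // Compile the grammar text into a parser object.
  const parser = peg.generate(tagGrammar);
  // parse() returns the start rule's result and throws on invalid input.
  return parser.parse(source);
}
```

With pegjs installed, parseTag("@tag(1)") should return the string "1". The same two calls, generate and parse, should work with the complete grammar and the Java sample; in a real project the parser would normally be generated once and reused rather than rebuilt on every call.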

Origin blog.csdn.net/vipshop_fin_dev/article/details/109172798