JavaCC Tutorial 1: Simple Grammar Parsing Examples

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/yunfeng482/article/details/89004272

JavaCC Getting Started Tutorial 1: Matching Braces

1. Configure the JavaCC environment variable

Add the JavaCC path to the system Path variable: D:\java源码包\javacc\javacc-5.0\bin

Test the javacc command

In a cmd window, type javacc to check that the command is found.

The contents of Simple1.jj are as follows:


options {
  LOOKAHEAD = 1;
  CHOICE_AMBIGUITY_CHECK = 2;
  OTHER_AMBIGUITY_CHECK = 1;
  STATIC = true;
  DEBUG_PARSER = false;
  DEBUG_LOOKAHEAD = false;
  DEBUG_TOKEN_MANAGER = false;
  ERROR_REPORTING = true;
  JAVA_UNICODE_ESCAPE = false;
  UNICODE_INPUT = false;
  IGNORE_CASE = false;
  USER_TOKEN_MANAGER = false;
  USER_CHAR_STREAM = false;
  BUILD_PARSER = true;
  BUILD_TOKEN_MANAGER = true;
  SANITY_CHECK = true;
  FORCE_LA_CHECK = false;
}

PARSER_BEGIN(Simple1)

/** Simple brace matcher. */
public class Simple1 {

  /** Main entry point. */
  public static void main(String args[]) throws ParseException {
    Simple1 parser = new Simple1(System.in);
    parser.Input();
  }

}

PARSER_END(Simple1)

/** Root production. */
void Input() :
{}
{
  MatchedBraces() ("\n"|"\r")* <EOF>
}

/** Brace matching production. */
void MatchedBraces() :
{}
{
  "{" [ MatchedBraces() ] "}"
}

Test steps

1. Run the javacc command to generate a set of Java files that implement the parser and the lexical analyzer (token manager)

javacc Simple1.jj

2. Compile the Java files

javac *.java

3. Run the generated parser

java Simple1

Test cases

% java Simple1
{{}}<return>
<control-d>
%
% java Simple1
{x<return>
Lexical error at line 1, column 2.  Encountered: "x"
TokenMgrError: Lexical error at line 1, column 2.  Encountered: "x" (120), after : ""
        at Simple1TokenManager.getNextToken(Simple1TokenManager.java:146)
        at Simple1.getToken(Simple1.java:140)
        at Simple1.MatchedBraces(Simple1.java:51)
        at Simple1.Input(Simple1.java:10)
        at Simple1.main(Simple1.java:6)
%
% java Simple1
{}}<return>
ParseException: Encountered "}" at line 1, column 3.
Was expecting one of:
    <EOF> 
    "\n" ...
    "\r" ...

        at Simple1.generateParseException(Simple1.java:184)
        at Simple1.jj_consume_token(Simple1.java:126)
        at Simple1.Input(Simple1.java:32)
        at Simple1.main(Simple1.java:6)
%

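What it does

This is a simple JavaCC grammar that recognizes a run of left braces followed by the same number of right braces, then zero or more line terminators, and finally end of file.

Examples of legal strings in this grammar are:
"{}", "{{{{{}}}}}"

Examples of illegal strings are:
"{{{{", "{}{}", "{}}", "{{}{}}", and so on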

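Square brackets [...]
in a JavaCC input file indicate that the ... is optional.

[...] may also be written as (...)?. The two forms are equivalent.
Other constructs that may appear in expansions are:
e1 | e2 | e3 | ... : a choice of e1, e2, e3, etc.
(e)+ : one or more occurrences of e
(e)* : zero or more occurrences of e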
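To make the expansion constructs concrete, here is a minimal hand-written sketch in plain Java of what the Simple1 grammar recognizes. It is not the code that javacc generates; the class name BraceMatcherSketch and its methods are hypothetical and only mirror the two productions, with the optional [ MatchedBraces() ] written as an if and the ("\n"|"\r")* written as a while loop.

```java
/** Hand-written sketch of the Simple1 grammar; this is NOT JavaCC output. */
final class BraceMatcherSketch {
    private final String text;
    private int pos = 0;

    private BraceMatcherSketch(String text) {
        this.text = text;
    }

    /** Returns true if the whole string matches the Input production. */
    static boolean matches(String text) {
        return new BraceMatcherSketch(text).input();
    }

    // Input ::= MatchedBraces ("\n" | "\r")* <EOF>
    private boolean input() {
        if (!matchedBraces()) {
            return false;
        }
        // (e)* : zero or more line terminators
        while (peek('\n') || peek('\r')) {
            pos++;
        }
        // <EOF> : nothing may be left over
        return pos == text.length();
    }

    // MatchedBraces ::= "{" [ MatchedBraces ] "}"
    private boolean matchedBraces() {
        if (!consume('{')) {
            return false;
        }
        // [ ... ] : the nested part is taken only if the next character is "{"
        if (peek('{') && !matchedBraces()) {
            return false;
        }
        return consume('}');
    }

    private boolean peek(char c) {
        return pos < text.length() && text.charAt(pos) == c;
    }

    private boolean consume(char c) {
        if (peek(c)) {
            pos++;
            return true;
        }
        return false;
    }
}
```

For example, BraceMatcherSketch.matches("{{}}\n") accepts, while "{}{}" is rejected because a second "{" follows the first matched pair where only line terminators or end of input are allowed.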

Example 2: Simple2.jj

Simple2.jj is a minor modification of Simple1.jj that allows white space
characters to be interspersed among the braces. Input such
as:

"{{} \n} \n\n"

is now legal.

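The other difference between this file and Simple1.jj is that this
file contains a lexical specification: the region that starts with
"SKIP". Within this region are 4 regular expressions: space, tab,
newline, and return. It says that matches of these regular
expressions are to be ignored (and not considered for parsing). Hence
whenever any of these 4 characters is encountered, it is
simply thrown away.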
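The effect of the SKIP region can be pictured with a small plain-Java sketch. This is only an illustration of the idea, not how the generated token manager is implemented; the class SkipSketch and its tokens method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration of SKIP: the four white space characters never reach the parser. */
final class SkipSketch {

    /** Returns every non-skipped character of the input as a one-character token. */
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Same four characters as the SKIP region: space, tab, newline, return.
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                continue; // thrown away
            }
            out.add(String.valueOf(c));
        }
        return out;
    }
}
```

With this sketch, the input "{{} \n} \n\n" yields the same four tokens as "{{}}", which is why the Simple2 grammar no longer needs ("\n"|"\r")* in its Input production.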

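In addition to SKIP, JavaCC has three other lexical specification
regions. These are:

TOKEN: used to specify lexical tokens (see the next example)
SPECIAL_TOKEN: used to specify lexical tokens that are
ignored during parsing. In this sense, SPECIAL_TOKEN is
the same as SKIP. However, these tokens can be recovered
within parser actions and handled appropriately.
MORE: specifies a partial token. A complete token is
made up of a sequence of MOREs followed by a TOKEN
or SPECIAL_TOKEN.

You can build Simple2 and invoke the generated parser with input from
the keyboard as standard input. To see what the parser is doing, you can also
regenerate it with a debug option turned on, as in the two command sequences that follow.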

javacc -debug_parser Simple2.jj
javac Simple2*.java
java Simple2

javacc -debug_token_manager Simple2.jj
javac Simple2*.java
java Simple2

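Note that token manager debugging (debug_token_manager) produces a lot of
diagnostic information; it is typically used to look at debug traces
a single token at a time.

The contents of Simple2.jj are as follows: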

/* Copyright (c) 2006, Sun Microsystems, Inc.
 * All rights reserved.
 * 
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 * 
 *     * Redistributions of source code must retain the above copyright notice,
 *       this list of conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *     * Neither the name of the Sun Microsystems, Inc. nor the names of its
 *       contributors may be used to endorse or promote products derived from
 *       this software without specific prior written permission.
 * 
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 * THE POSSIBILITY OF SUCH DAMAGE.
 */


PARSER_BEGIN(Simple2)

/** Simple brace matcher. */
public class Simple2 {

  /** Main entry point. */
  public static void main(String args[]) throws ParseException {
    Simple2 parser = new Simple2(System.in);
    parser.Input();
  }

}

PARSER_END(Simple2)

SKIP :
{
  " "
| "\t"
| "\n"
| "\r"
}

/** Root production. */
void Input() :
{}
{
  MatchedBraces() <EOF>
}

/** Brace matching production. */
void MatchedBraces() :
{}
{
  "{" [ MatchedBraces() ] "}"
}

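Example 3: Simple3.jj

Simple3.jj is the third and final version of our brace matcher. This example illustrates the use of the TOKEN region for specifying lexical tokens. In this case, "{" and "}" are defined as tokens and given the names LBRACE and RBRACE respectively. These labels can then be used within angle brackets (as in the example) to refer to the tokens. Such token specifications are typically used for complex tokens such as identifiers and literals. Tokens that are simple strings can be left as they are (as in the previous examples).

This example also illustrates the use of actions in grammar productions. The actions inserted in this example count the number of matching braces. Note the use of the declaration region to declare the variables "count" and "nested_count". Also note how the nonterminal "MatchedBraces" returns its value as a function return value.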

/* Copyright (c) 2006, Sun Microsystems, Inc.
 * All rights reserved.
 * 
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 * 
 *     * Redistributions of source code must retain the above copyright notice,
 *       this list of conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *     * Neither the name of the Sun Microsystems, Inc. nor the names of its
 *       contributors may be used to endorse or promote products derived from
 *       this software without specific prior written permission.
 * 
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 * THE POSSIBILITY OF SUCH DAMAGE.
 */


PARSER_BEGIN(Simple3)

/** Simple brace matcher. */
public class Simple3 {

  /** Main entry point. */
  public static void main(String args[]) throws ParseException {
    Simple3 parser = new Simple3(System.in);
    parser.Input();
  }

}

PARSER_END(Simple3)

SKIP :
{
  " "
| "\t"
| "\n"
| "\r"
}

TOKEN :
{
  <LBRACE: "{">
| <RBRACE: "}">
}

/** Root production. */
void Input() :
{ int count; }
{
  count=MatchedBraces() <EOF>
  { System.out.println("The levels of nesting is " + count); }
}

/** Brace counting production. */
int MatchedBraces() :
{ int nested_count=0; }
{
  <LBRACE> [ nested_count=MatchedBraces() ] <RBRACE>
  { return ++nested_count; }
}
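The counting action in MatchedBraces can be followed more easily in a hand-written plain-Java sketch. The class NestingDepthSketch below is hypothetical and is not what javacc generates; it assumes the same SKIP characters as the grammar and throws IllegalArgumentException where the generated parser would report a ParseException.

```java
/** Hand-written sketch of the Simple3 counting logic; this is NOT JavaCC output. */
final class NestingDepthSketch {
    private final String text;
    private int pos = 0;

    private NestingDepthSketch(String text) {
        this.text = text;
    }

    /** Input ::= MatchedBraces <EOF>; returns the nesting level. */
    static int depth(String text) {
        NestingDepthSketch p = new NestingDepthSketch(text);
        int count = p.matchedBraces();
        p.skipWhitespace();
        if (p.pos != text.length()) {
            throw new IllegalArgumentException("unexpected input at offset " + p.pos);
        }
        return count;
    }

    /** MatchedBraces ::= <LBRACE> [ MatchedBraces ] <RBRACE>, returning ++nested_count. */
    private int matchedBraces() {
        int nestedCount = 0; // stays 0 when the optional part is absent
        expect('{');
        skipWhitespace();
        if (pos < text.length() && text.charAt(pos) == '{') {
            nestedCount = matchedBraces();
        }
        expect('}');
        return ++nestedCount; // one more level than whatever was nested inside
    }

    private void expect(char c) {
        skipWhitespace();
        if (pos >= text.length() || text.charAt(pos) != c) {
            throw new IllegalArgumentException("expected '" + c + "' at offset " + pos);
        }
        pos++;
    }

    /** Mirrors the SKIP region: space, tab, newline, return. */
    private void skipWhitespace() {
        while (pos < text.length() && " \t\n\r".indexOf(text.charAt(pos)) >= 0) {
            pos++;
        }
    }
}
```

Each call returns one more than the count of the braces nested inside it, so the innermost pair returns 1 and the value grows by one per enclosing pair, just as nested_count does in the grammar.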

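Example 4: IdList.jj

This example illustrates an important property of the SKIP specification. The main point to note is that the regular expressions in the SKIP specification are ignored only between tokens, not
within tokens. This grammar accepts any sequence of identifiers with white space in between.

A legal input for this grammar is:

"abc xyz123 A B C \t\n aaa"

This is because any number of SKIP regular expressions is allowed between consecutive <Id>s. However, the following is not a legal input:

"xyz 123"

This is because the space character after "xyz" is in the SKIP category and therefore causes one token to end and another to begin. This requires "123" to be a separate token, and "123" does not match the grammar.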
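The behavior described above can be reproduced with java.util.regex as a rough analogy. This sketch is not the generated token manager; the class IdListSketch is a hypothetical name, and the pattern is a regex rendering of the Id token from IdList.jj.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Regex analogy for IdList.jj: white space separates tokens but is never inside one. */
final class IdListSketch {
    // Regex rendering of < Id: ["a"-"z","A"-"Z"] ( ["a"-"z","A"-"Z","0"-"9"] )* >
    private static final Pattern ID = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");

    /** Returns the identifiers in the input, or throws if the input is not ( <Id> )+ <EOF>. */
    static List<String> ids(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = ID.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                pos++; // SKIP: ignored between tokens
                continue;
            }
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("no Id token at offset " + pos);
            }
            out.add(m.group());
            pos = m.end();
        }
        if (out.isEmpty()) {
            throw new IllegalArgumentException("at least one Id is required");
        }
        return out;
    }
}
```

Calling IdListSketch.ids("xyz 123") fails at "123": the space ends the token "xyz", and a token cannot start with a digit.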

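If spaces were acceptable within <Id>s, all one has to do is replace the definition of Id with:

TOKEN :
{
  < Id: ["a"-"z","A"-"Z"] ( (" ")* ["a"-"z","A"-"Z","0"-"9"] )* >
}

Note that having a space character within a TOKEN specification does not mean that the space character cannot be used in the SKIP specification. All it means is that any space character
that appears in a context where it can be placed within an identifier will take part in the match for <Id>, whereas all other space characters will be ignored. The details of the matching algorithm are described in the JavaCC documentation on the web pages.

As a corollary, anything within which characters such as white space must not appear has to be defined as a token. In the example above, if <Id> were defined as a grammar production, as shown below, rather than as a lexical token, then "xyz 123" would have been recognized (wrongly) as a legal <Id>.

void Id() :
{}
{
  <["a"-"z","A"-"Z"]> ( <["a"-"z","A"-"Z","0"-"9"]> )*
}

Note that in the above definition of the nonterminal Id, it is made up of a sequence of single-character tokens (note the location of the <...>s), and hence white space is allowed between these characters.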

/* Copyright (c) 2006, Sun Microsystems, Inc.
 * All rights reserved.
 * 
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 * 
 *     * Redistributions of source code must retain the above copyright notice,
 *       this list of conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *     * Neither the name of the Sun Microsystems, Inc. nor the names of its
 *       contributors may be used to endorse or promote products derived from
 *       this software without specific prior written permission.
 * 
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 * THE POSSIBILITY OF SUCH DAMAGE.
 */

PARSER_BEGIN(IdList)


/** ID lister. */
public class IdList {

  /** Main entry point. */
  public static void main(String args[]) throws ParseException {
    IdList parser = new IdList(System.in);
    parser.Input();
  }

}

PARSER_END(IdList)

SKIP :
{
  " "
| "\t"
| "\n"
| "\r"
}

TOKEN :
{
  < Id: ["a"-"z","A"-"Z"] ( ["a"-"z","A"-"Z","0"-"9"] )* >
}

/** Top level production. */
void Input() :
{}
{
  ( <Id> )+ <EOF>
}

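Example 5: NL_Xlator.jj

This example goes into the details of writing regular expressions in JavaCC grammar files. It also illustrates a slightly more complex set of actions that translate the expressions described by the grammar into
English.

The new concept in this example is the use of more complex regular expressions. The regular expression:

< ID: ["a"-"z","A"-"Z","_"] ( ["a"-"z","A"-"Z","_","0"-"9"] )* >

creates a new regular expression named ID. It can be referred to anywhere else in the grammar simply as <ID>. What follows in square brackets is a set of allowable characters: in this case, any lower or upper case letter or the underscore. This is followed by 0 or more occurrences of any lower or upper case
letter, digit, or the underscore.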

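Other constructs that may appear in regular expressions are:

( ... )+ : one or more occurrences of ...
( ... )? : an optional occurrence of ... (note that in the case of
lexical tokens, (...)? and [...] are not equivalent)
( r1 | r2 | ... ) : any one of r1, r2, ...

A construct of the form [...] is a pattern that is matched by the characters specified in ... . These characters can be individual characters or character ranges. A "~" before this construct is a pattern that matches any character not specified in ... . Therefore:
["a"-"z"] matches all lower case letters
~[] matches any character
~["\n","\r"] matches any character except the new line characters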
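For readers more familiar with java.util.regex, the three character-list patterns above have close regex counterparts. The sketch below is only an analogy under that assumption (JavaCC uses its own matching code, not java.util.regex); the class CharClassSketch and its constants are hypothetical names.

```java
import java.util.regex.Pattern;

/** java.util.regex counterparts of three JavaCC character lists (analogy only). */
final class CharClassSketch {
    static final Pattern LOWER = Pattern.compile("[a-z]");           // ["a"-"z"]
    static final Pattern ANY = Pattern.compile("[\\s\\S]");          // ~[]
    static final Pattern NOT_NEWLINE = Pattern.compile("[^\\n\\r]"); // ~["\n","\r"]

    /** Returns true if the single character c is matched by the pattern p. */
    static boolean matchesChar(Pattern p, char c) {
        return p.matcher(String.valueOf(c)).matches();
    }
}
```

The regex [\s\S] is used for ~[] because an unqualified "." in java.util.regex does not match line terminators by default, whereas ~[] matches every character.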

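When a regular expression is used in an expansion, it takes a value of type "Token". This class is generated into the generated parser directory as "Token.java". In this example, the Factor production declares a variable of type "Token" and assigns the value of the regular expression to it.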

/* Copyright (c) 2006, Sun Microsystems, Inc.
 * All rights reserved.
 * 
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 * 
 *     * Redistributions of source code must retain the above copyright notice,
 *       this list of conditions and the following disclaimer.
 *     * Redistributions in binary form must reproduce the above copyright
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *     * Neither the name of the Sun Microsystems, Inc. nor the names of its
 *       contributors may be used to endorse or promote products derived from
 *       this software without specific prior written permission.
 * 
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 * THE POSSIBILITY OF SUCH DAMAGE.
 */

PARSER_BEGIN(NL_Xlator)

/** New line translator. */
public class NL_Xlator {

  /** Main entry point. */
  public static void main(String args[]) throws ParseException {
    NL_Xlator parser = new NL_Xlator(System.in);
    parser.ExpressionList();
  }

}

PARSER_END(NL_Xlator)

SKIP :
{
  " "
| "\t"
| "\n"
| "\r"
}

TOKEN :
{
  < ID: ["a"-"z","A"-"Z","_"] ( ["a"-"z","A"-"Z","_","0"-"9"] )* >
|
  < NUM: ( ["0"-"9"] )+ >
}

/** Top level production. */
void ExpressionList() :
{
	String s;
}
{
	{
	  System.out.println("Please type in an expression followed by a \";\" or ^D to quit:");
	  System.out.println("");
	}
  ( s=Expression() ";"
	{
	  System.out.println(s);
	  System.out.println("");
	  System.out.println("Please type in another expression followed by a \";\" or ^D to quit:");
	  System.out.println("");
	}
  )*
  <EOF>
}

/** An Expression. */
String Expression() :
{
	java.util.Vector termimage = new java.util.Vector();
	String s;
}
{
  s=Term()
	{
	  termimage.addElement(s);
	}
  ( "+" s=Term()
	{
	  termimage.addElement(s);
	}
  )*
	{
	  if (termimage.size() == 1) {
	    return (String)termimage.elementAt(0);
          } else {
            s = "the sum of " + (String)termimage.elementAt(0);
	    for (int i = 1; i < termimage.size()-1; i++) {
	      s += ", " + (String)termimage.elementAt(i);
	    }
	    if (termimage.size() > 2) {
	      s += ",";
	    }
	    s += " and " + (String)termimage.elementAt(termimage.size()-1);
            return s;
          }
	}
}

/** A Term. */
String Term() :
{
	java.util.Vector factorimage = new java.util.Vector();
	String s;
}
{
  s=Factor()
	{
	  factorimage.addElement(s);
	}
  ( "*" s=Factor()
	{
	  factorimage.addElement(s);
	}
  )*
	{
	  if (factorimage.size() == 1) {
	    return (String)factorimage.elementAt(0);
          } else {
            s = "the product of " + (String)factorimage.elementAt(0);
	    for (int i = 1; i < factorimage.size()-1; i++) {
	      s += ", " + (String)factorimage.elementAt(i);
	    }
	    if (factorimage.size() > 2) {
	      s += ",";
	    }
	    s += " and " + (String)factorimage.elementAt(factorimage.size()-1);
            return s;
          }
	}
}

/** A Factor. */
String Factor() :
{
	Token t;
	String s;
}
{
  t=<ID>
	{
	  return t.image;
	}
|
  t=<NUM>
	{
	  return t.image;
	}
|
  "(" s=Expression() ")"
	{
	  return s;
	}
}
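The actions in Expression() and Term() build their English phrase in the same way, so the logic can be isolated in one small helper. The following is a sketch in plain Java with generics instead of the raw java.util.Vector used in the grammar; the class PhraseSketch and the method join are hypothetical names, not part of the JavaCC example.

```java
import java.util.List;

/** Isolated sketch of the phrase-building action shared by Expression() and Term(). */
final class PhraseSketch {

    /**
     * Joins operand descriptions into English.
     * kind is "sum" or "product"; parts must contain at least one element,
     * which the grammar guarantees because each production parses one operand first.
     */
    static String join(String kind, List<String> parts) {
        if (parts.size() == 1) {
            return parts.get(0); // a single operand is passed through unchanged
        }
        StringBuilder s = new StringBuilder("the " + kind + " of " + parts.get(0));
        for (int i = 1; i < parts.size() - 1; i++) {
            s.append(", ").append(parts.get(i));
        }
        if (parts.size() > 2) {
            s.append(","); // comma before "and" when there are three or more operands
        }
        s.append(" and ").append(parts.get(parts.size() - 1));
        return s.toString();
    }
}
```

Under this sketch, the operands a, b, c joined as a sum give "the sum of a, b, and c", and because Term() results are themselves operands of Expression(), an input such as a + b*c would be described as "the sum of a and the product of b and c".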


Reposted from blog.csdn.net/yunfeng482/article/details/89004272