4.1.1. Identifiers and Key Words

Tokens such as SELECT, UPDATE, or VALUES in the example above are examples of key words, that is, words that have a fixed meaning in the SQL language. The tokens MY_TABLE and A are examples of identifiers. They identify names of tables, columns, or other database objects, depending on the command they are used in. Therefore they are sometimes simply called “names”. Key words and identifiers have the same lexical structure, meaning that one cannot know whether a token is an identifier or a key word without knowing the language. A complete list of key words can be found in Appendix C.

SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_). Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($). Note that dollar signs are not allowed in identifiers according to the letter of the SQL standard, so their use might render applications less portable. The SQL standard will not define a key word that contains digits or starts or ends with an underscore, so identifiers of this form are safe against possible conflict with future extensions of the standard.
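
The lexical rule above can be approximated in a few lines of Python. This is only a rough sketch, not PostgreSQL's actual lexer: the function name `is_plain_identifier` is made up for illustration, `str.isalpha()` stands in for "letters, including letters with diacritical marks and non-Latin letters", and the key word list is not consulted, so a word such as `select` passes the check even though it would be read as a key word.

```python
ASCII_DIGITS = "0123456789"


def is_plain_identifier(token: str) -> bool:
    """Return True if token has the lexical shape of an unquoted identifier.

    Hypothetical helper: it models only the character rules from the
    paragraph above and does not check for key words or length.
    """
    if not token:
        return False
    first, rest = token[0], token[1:]
    # First character: a letter or an underscore.
    if not (first.isalpha() or first == "_"):
        return False
    # Later characters: letters, underscores, digits 0-9, or dollar signs.
    return all(ch.isalpha() or ch in ASCII_DIGITS or ch in "_$" for ch in rest)
```

For instance, `is_plain_identifier("a$b")` is true under this model, while `is_plain_identifier("1abc")` and `is_plain_identifier("$a")` are false because of the first-character rule.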

The system uses no more than NAMEDATALEN-1 bytes of an identifier; longer names can be written in commands, but they will be truncated. By default, NAMEDATALEN is 64 so the maximum identifier length is 63 bytes. If this limit is problematic, it can be raised by changing the NAMEDATALEN constant in src/include/pg_config_manual.h.
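
Because the limit is counted in bytes rather than characters, names with multibyte characters hit it sooner. The sketch below illustrates the arithmetic under two assumptions that are not stated in the text: the name is encoded as UTF-8, and a character that would be cut in half is dropped entirely. The function name `truncate_identifier` is hypothetical.

```python
NAMEDATALEN = 64  # default value, per the paragraph above


def truncate_identifier(name: str, namedatalen: int = NAMEDATALEN) -> str:
    """Clip name to at most namedatalen - 1 bytes of UTF-8 (assumed encoding)."""
    limit = namedatalen - 1
    raw = name.encode("utf-8")
    if len(raw) <= limit:
        return name
    # errors="ignore" drops a trailing, incomplete multibyte sequence
    # (assumption: truncation never splits a character).
    return raw[:limit].decode("utf-8", errors="ignore")
```

With the default limit, a 70-letter ASCII name keeps 63 characters, while a name made of two-byte characters such as `é` keeps only 31 of them (62 bytes).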

Key words and unquoted identifiers are case insensitive. Therefore:

UPDATE MY_TABLE SET A = 5;

can equivalently be written as:

uPDaTE my_TabLE SeT a = 5;

A convention often used is to write key words in upper case and names in lower case, e.g.:

UPDATE my_table SET a = 5;

There is a second kind of identifier: the delimited identifier or quoted identifier. It is formed by enclosing an arbitrary sequence of characters in double-quotes ("). A delimited identifier is always an identifier, never a key word. So "select" could be used to refer to a column or table named “select”, whereas an unquoted select would be taken as a key word and would therefore provoke a parse error when used where a table or column name is expected. The example can be written with quoted identifiers like this:

UPDATE "my_table" SET "a" = 5;

Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.) This allows constructing table or column names that would otherwise not be possible, such as ones containing spaces or ampersands. The length limitation still applies.
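
Client code that builds SQL text sometimes needs to apply this quoting rule itself. A minimal sketch, assuming a hypothetical helper named `quote_identifier` (it applies only the two rules stated above and ignores the length limit):

```python
def quote_identifier(name: str) -> str:
    """Wrap name in double quotes, doubling any embedded double quote."""
    if "\x00" in name:
        # The character with code zero is the one character not allowed.
        raise ValueError("identifier must not contain the character with code zero")
    return '"' + name.replace('"', '""') + '"'
```

For example, `quote_identifier("my table")` yields `"my table"`, and a name containing a double quote, such as `a"b`, becomes `"a""b"`.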

Names containing spaces or ampersands are possible, but using them is strongly discouraged.

A variant of quoted identifiers allows including escaped Unicode characters identified by their code points. This variant starts with U& (upper or lower case U followed by ampersand) immediately before the opening double quote, without any spaces in between, for example U&"foo". (Note that this creates an ambiguity with the operator &. Use spaces around the operator to avoid this problem.) Inside the quotes, Unicode characters can be specified in escaped form by writing a backslash followed by the four-digit hexadecimal code point number or alternatively a backslash followed by a plus sign followed by a six-digit hexadecimal code point number. For example, the identifier "data" could be written as:

U&"d\0061t\+000061"

The following less trivial example writes the Russian word “slon” (elephant) in Cyrillic letters:

U&"\0441\043B\043E\043D"

If a different escape character than backslash is desired, it can be specified using the UESCAPE clause after the string, for example:

U&"d!0061t!+000061" UESCAPE '!'

The escape character can be any single character other than a hexadecimal digit, the plus sign, a single quote, a double quote, or a whitespace character. Note that the escape character is written in single quotes, not double quotes.

To include the escape character in the identifier literally, write it twice.

For example, if the escape character is the backslash, write \\.

The Unicode escape syntax works only when the server encoding is UTF8. When other server encodings are used, only code points in the ASCII range (up to \007F) can be specified. Both the 4-digit and the 6-digit form can be used to specify UTF-16 surrogate pairs to compose characters with code points larger than U+FFFF, although the availability of the 6-digit form technically makes this unnecessary. (Surrogate pairs are not stored directly, but combined into a single code point that is then encoded in UTF-8.)
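
The escape rules described so far (four-digit and six-digit forms, a configurable escape character, doubling to get a literal escape character, and surrogate pairs) can be pulled together in a small decoder. This is an illustrative model rather than PostgreSQL's implementation: the names `decode_unicode_escapes` and `_combine_surrogates` are invented here, the input is the text between the double quotes only, and the server-encoding restriction is not modeled.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def _combine_surrogates(codes):
    """Merge UTF-16 surrogate pairs into single code points."""
    out = []
    it = iter(codes)
    for code in it:
        if 0xD800 <= code <= 0xDBFF:  # high surrogate needs a low partner
            low = next(it, None)
            if low is None or not 0xDC00 <= low <= 0xDFFF:
                raise ValueError("unpaired surrogate")
            code = 0x10000 + ((code - 0xD800) << 10) + (low - 0xDC00)
        elif 0xDC00 <= code <= 0xDFFF:
            raise ValueError("unpaired surrogate")
        if code == 0:
            raise ValueError("code zero is not allowed in identifiers")
        out.append(code)
    return out


def decode_unicode_escapes(body: str, escape: str = "\\") -> str:
    """Decode the text between the quotes of a U&"..." identifier."""
    if (len(escape) != 1 or escape in HEX_DIGITS
            or escape in "+'\"" or escape.isspace()):
        raise ValueError("invalid escape character")
    codes = []
    i = 0
    while i < len(body):
        if body[i] != escape:
            codes.append(ord(body[i]))  # ordinary character
            i += 1
        elif body[i + 1:i + 2] == escape:
            codes.append(ord(escape))  # doubled escape -> literal
            i += 2
        else:
            plus = body[i + 1:i + 2] == "+"
            start, width = (i + 2, 6) if plus else (i + 1, 4)
            digits = body[start:start + width]
            if len(digits) != width or not set(digits) <= HEX_DIGITS:
                raise ValueError("malformed Unicode escape")
            codes.append(int(digits, 16))
            i = start + width
    return "".join(chr(c) for c in _combine_surrogates(codes))
```

Under this model, `decode_unicode_escapes(r"d\0061t\+000061")` and `decode_unicode_escapes("d!0061t!+000061", "!")` both produce `data`, and the surrogate pair `\D83D\DE00` decodes to the single code point U+1F600.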

Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other. (The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard. If you want to write portable applications you are advised to always quote a particular name or never quote it.)
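
The folding rule can be modeled by reducing each written form to the name it denotes. The sketch below is an approximation under stated assumptions: `canonical_name` is a hypothetical helper, Python's `str.lower()` and `str.upper()` stand in for the server's case folding, and the `sql_standard` flag switches to the upper-case folding that the SQL standard prescribes.

```python
def canonical_name(token: str, sql_standard: bool = False) -> str:
    """Return the name an identifier token refers to.

    Quoted tokens keep their exact spelling (with doubled quotes
    collapsed); unquoted tokens are folded to lower case, or to
    upper case when sql_standard is True.
    """
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1].replace('""', '"')
    return token.upper() if sql_standard else token.lower()
```

With the default folding, `FOO`, `foo`, and `"foo"` all map to the same name, while `"Foo"` and `"FOO"` stay distinct. With `sql_standard=True`, `foo` maps to the same name as `"FOO"` instead, which is why mixing quoted and unquoted spellings of one name is not portable.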


Reposted from blog.csdn.net/ghostliming/article/details/104185626