C language character sets

When the compiler translates source code, the environment it runs in is called the translation environment; when the compiled program runs, the environment it runs in is called the execution environment. In C, these two environments need not be the same (for example, when cross-compiling). C therefore defines two character sets: the source character set and the execution character set. The source character set is the set of characters in which C source code is written; the execution character set is the set of characters that the running program can interpret. In many C implementations the two are identical. Where they differ, the compiler converts the characters in character constants and string literals to the corresponding members of the execution character set.
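As a rough illustration of the conversion described above, the following sketch lets a program inspect the codes its character constants received in the execution character set. The function names (execution_code, looks_like_ascii, dump_codes) are made up for this example, and the ASCII check is only a heuristic based on three sample characters.

```c
#include <assert.h>
#include <stdio.h>

/* Return the numeric code a character has in the execution character set.
   The character constant passed in was already converted by the compiler. */
int execution_code(char c)
{
    return (unsigned char)c;
}

/* Heuristic (an assumption of this sketch): treat the execution character
   set as ASCII-compatible if three sample characters have their ASCII codes. */
int looks_like_ascii(void)
{
    return execution_code('A') == 0x41
        && execution_code('a') == 0x61
        && execution_code('0') == 0x30;
}

/* Print every character of a string together with its execution code. */
void dump_codes(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        printf("'%c' -> %d\n", *s, execution_code(*s));
}
```

Calling dump_codes("C99") prints one line per character; the numbers shown depend on the execution character set of the implementation, so they are not fixed by the language itself.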

Each of the two character sets consists of a basic character set plus zero or more extended characters. The C standard does not specify the extended characters; they are usually determined by the local language. The basic character set together with the extended characters forms the extended character set.

The basic source character set and the basic execution character set both contain the following characters:

The 26 uppercase and 26 lowercase letters of the Latin alphabet, and the 10 decimal digits

The following 29 characters:

!  "  #  %  &  '  (  )  *  +  ,  -  .  /  :  ;  <  =  >  ?  [  \  ]  ^  _  {  |  }  ~

Five whitespace characters:

            Space, horizontal tab, vertical tab, newline (line feed), form feed
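The lists above can be turned into a small membership test. This is only a sketch: the name is_basic_char and the four table names are invented here, and the tables follow the lists in this article (the C99/C11 wording), so characters such as @, $ and the backtick are treated as not basic.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Tables mirroring the lists in the text; names are hypothetical. */
static const char BASIC_LETTERS[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
static const char BASIC_DIGITS[] = "0123456789";
static const char BASIC_PUNCT[]  = "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~";
static const char BASIC_SPACE[]  = " \t\v\n\f";

/* Return 1 if c is one of the letters, digits, 29 punctuation characters
   or 5 whitespace characters listed above, and 0 otherwise. */
int is_basic_char(int c)
{
    /* Sanity check: the punctuation table holds exactly 29 characters. */
    assert(sizeof BASIC_PUNCT - 1 == 29);

    /* Reject the null character (strchr would match the terminator)
       and anything that does not fit in an unsigned char. */
    if (c <= 0 || c > UCHAR_MAX)
        return 0;

    return strchr(BASIC_LETTERS, c) != NULL
        || strchr(BASIC_DIGITS, c) != NULL
        || strchr(BASIC_PUNCT, c) != NULL
        || strchr(BASIC_SPACE, c) != NULL;
}
```

With these tables, is_basic_char('~') returns 1 and is_basic_char('@') returns 0. The null character is deliberately excluded here because it belongs only to the basic execution character set.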

The basic execution character set additionally defines four non-printable characters:

The null character \0 (used to terminate a string), alert \a, backspace \b, and carriage return \r
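To make the role of the null character concrete, here is a minimal sketch that counts characters up to the terminating \0, in the spirit of the standard strlen. The function names length_until_null and storage_of_abc are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters that precede the terminating null character. */
size_t length_until_null(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        ++n;
    return n;
}

/* Size in bytes of the literal "abc": three letters plus the null
   character that the compiler appends to every string literal. */
size_t storage_of_abc(void)
{
    return sizeof "abc";
}
```

The escape sequences \a, \b and \r each denote a single character, so length_until_null("\a\b\r") returns 3, while length_until_null("ab\0cd") returns 2 because counting stops at the first null character.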

 

Origin www.cnblogs.com/hoganben/p/12152457.html