Properties of the Go string type

Reference article:  http://c.biancheng.net/view/36.html

1. Getting the length: byte (ASCII-style) length versus UTF-8 character count

a. len("咪咪") // returns 6: the length in bytes, i.e. the ASCII-style length

b. utf8.RuneCountInString("咪咪") // returns 2: the number of UTF-8 characters
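The two calls above can be compared side by side. The following is a minimal sketch; the helper name lengths is made up for illustration, and the two-character Chinese string "咪咪" is an assumed sample input.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths is a hypothetical helper: it returns the byte length of s
// and the number of UTF-8 encoded characters (runes) in s.
func lengths(s string) (byteLen, runeCount int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	b, r := lengths("咪咪")
	fmt.Println(b, r) // 6 2: each of these characters takes 3 bytes

	b, r = lengths("abc")
	fmt.Println(b, r) // 3 3: ASCII characters take 1 byte each
}
```

For pure ASCII text the two numbers agree, which is why the difference only shows up with multi-byte characters.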

2. String traversal: by byte (ASCII) and by Unicode character

// test string (the snippet needs import "fmt")
str := "我是 abcd"

// traverse byte by byte (ASCII style)
for i := 0; i < len(str); i++ {
  fmt.Printf("ASCII  %c, %d\n", str[i], str[i])
}

// traverse by Unicode character
for _, s := range str {
  fmt.Printf("Unicode  %c, %d\n", s, s)
}

Output:

ASCII  æ, 230
ASCII  ˆ, 136
ASCII  ‘, 145
ASCII  æ, 230
ASCII  ˜, 152
ASCII  ¯, 175
ASCII   , 32
ASCII  a, 97
ASCII  b, 98
ASCII  c, 99
ASCII  d, 100
Unicode  我, 25105 
Unicode  是, 26159 
Unicode   , 32 
Unicode  a, 97 
Unicode  b, 98 
Unicode  c, 99 
Unicode  d, 100 

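You can see that traversing byte by byte prints garbled characters for the Chinese text, while traversing in Unicode mode prints it correctly. The difference between the two lies in which form of the for loop is used:

ASCII: traversing by index returns one byte at a time, so the number printed is a byte value, which is an ASCII code only for ASCII characters.

Unicode: traversing with for range returns one character (rune) at a time, so the number printed is its Unicode code point.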
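One detail of for range is easy to miss: the index it yields is the byte offset at which each character starts, not a character counter. The sketch below illustrates this; the helper name runeOffsets is made up for illustration.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper: it returns the byte offset
// at which each character of s starts, as reported by for range.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// The two Chinese characters take 3 bytes each,
	// so the offsets jump from 0 to 3 to 6.
	fmt.Println(runeOffsets("我是 abcd")) // [0 3 6 7 8 9 10]
}
```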

 

Further reading: what is the difference between UTF-8 and Unicode?

Unicode, like ASCII, is a character set.

A character set assigns a unique ID to each character. Every character we use has a unique ID in the Unicode character set; for example, the letter a in the example above is encoded as 97 in both Unicode and ASCII, and the Chinese character "你" is encoded as 20320 in Unicode. In the character sets designed for individual countries, the ID of the same character can differ from one set to another. In Unicode, a character's ID never changes.

UTF-8 is an encoding rule: it encodes a character's Unicode ID as bytes in a particular way. UTF-8 is a variable-length encoding that uses 1 to 4 bytes per character. The encoding rules are as follows:

  • 0xxxxxxx represents the text symbols 0 to 127 in a single byte and is compatible with the ASCII character set.
  • Values from 128 to 0x10ffff represent all other characters and take 2 to 4 bytes.


According to this rule, characters in Latin-script languages generally take one byte each, and each Chinese character takes 3 bytes.
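The byte counts can be checked with the standard unicode/utf8 package. The following is a minimal sketch; the helper name utf8Bytes and the sample characters are chosen only for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// utf8Bytes is a hypothetical helper: it returns the UTF-8 encoding
// of the single character r as a byte slice.
func utf8Bytes(r rune) []byte {
	buf := make([]byte, utf8.UTFMax) // UTFMax is 4, the longest encoding
	n := utf8.EncodeRune(buf, r)
	return buf[:n]
}

func main() {
	fmt.Println(utf8Bytes('a'))  // [97]: one byte, same as ASCII
	fmt.Println(utf8Bytes('你')) // [228 189 160]: three bytes
	fmt.Println(utf8.RuneLen('a'), utf8.RuneLen('你')) // 1 3
}
```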

In the broad sense, Unicode refers to a standard that defines both a character set and encoding rules, that is, the Unicode character set together with encodings such as UTF-8 and UTF-16.

Reference:  http://c.biancheng.net/view/18.html

3. Type conversion

// type conversion (the snippet needs import "fmt")
str := "This is a AAA"
bytestr := []byte(str)
fmt.Println(bytestr)
fmt.Println(string(bytestr))

a. Convert a string to a byte slice: []byte(str)

b. Convert a byte slice to a string: string(bytestr)
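One common reason to convert in both directions is that Go strings cannot be modified in place, while a []byte can. The sketch below assumes ASCII input and uses a made-up helper name, upperFirst; it is an illustration, not part of the original article.

```go
package main

import "fmt"

// upperFirst is a hypothetical helper: it upper-cases the first byte of s
// when that byte is an ASCII lowercase letter, and returns a new string.
func upperFirst(s string) string {
	b := []byte(s) // the conversion copies the string's bytes
	if len(b) > 0 && b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A' // edit the copy; s itself is untouched
	}
	return string(b)
}

func main() {
	s := "go"
	fmt.Println([]byte(s))     // [103 111]
	fmt.Println(upperFirst(s)) // Go
	fmt.Println(s)             // go
}
```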

4. Efficient string concatenation

// string concatenation (the snippet needs import "bytes" and "fmt")
hammer := "eat me a hammer"
sickle := " go die" // the leading space keeps the joined text readable

/** ordinary string concatenation */
hammer += sickle
fmt.Println(hammer)
fmt.Println(sickle)

/** efficient way to concatenate strings */
// declare a byte buffer
var stringBuilder bytes.Buffer

// write the strings into the buffer
stringBuilder.WriteString(hammer)
stringBuilder.WriteString(sickle)

// output the buffer contents as a string
fmt.Println(stringBuilder.String())

Output:

eat me a hammer go die
 go die
eat me a hammer go die go die

The simple way is not necessarily the efficient way. Besides joining strings with the plus sign, Go also provides a mechanism similar to StringBuilder for efficient string concatenation.
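Besides bytes.Buffer, the standard library also has strings.Builder (available since Go 1.10), which serves the same purpose. The sketch below swaps it in as an alternative to the buffer used above; the helper name joinAll is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// joinAll is a hypothetical helper: it concatenates all parts
// using a strings.Builder instead of repeated += operations.
func joinAll(parts ...string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}

	var sb strings.Builder
	sb.Grow(total) // reserve the space once, up front
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinAll("eat me a hammer", " go die")) // eat me a hammer go die
}
```

For two or three short strings the plus sign is fine; a builder or buffer tends to pay off when many pieces are joined in a loop.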

5. Common formatting verbs and their functions

Table: common string formatting verbs and their functions

Verb    Function
%v      Outputs the value in its default format
%+v     Like %v, but also expands struct field names along with their values
%#v     Outputs the value in Go syntax
%T      Outputs the type of the value in Go syntax
%%      Outputs a literal percent sign
%b      Integer displayed in binary
%o      Integer displayed in octal
%d      Integer displayed in decimal
%x      Integer displayed in hexadecimal, lowercase letters
%X      Integer displayed in hexadecimal, uppercase letters
%U      Unicode code point, for example U+4F60
%f      Floating-point number
%p      Pointer, displayed in hexadecimal
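Several of the verbs in the table can be tried out with fmt.Sprintf. The following is a minimal sketch; the point type and the verbSamples function are made up for illustration, and the %#v result assumes the code is compiled in package main.

```go
package main

import "fmt"

// point is a hypothetical struct used only to show the struct-related verbs.
type point struct{ X, Y int }

// verbSamples formats a few sample values, one verb per entry.
func verbSamples() []string {
	p := point{X: 1, Y: 2}
	return []string{
		fmt.Sprintf("%v", p),      // {1 2}
		fmt.Sprintf("%+v", p),     // {X:1 Y:2}
		fmt.Sprintf("%#v", p),     // main.point{X:1, Y:2}
		fmt.Sprintf("%T", 42),     // int
		fmt.Sprintf("%b", 5),      // 101
		fmt.Sprintf("%o", 8),      // 10
		fmt.Sprintf("%d", 42),     // 42
		fmt.Sprintf("%x", 255),    // ff
		fmt.Sprintf("%X", 255),    // FF
		fmt.Sprintf("%U", '你'),   // U+4F60
		fmt.Sprintf("%f", 3.5),    // 3.500000
		fmt.Sprintf("%d%%", 100),  // 100%
	}
}

func main() {
	for _, s := range verbSamples() {
		fmt.Println(s)
	}
}
```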


 


Origin www.cnblogs.com/ITPower/p/11925668.html