Reference article: http://c.biancheng.net/view/36.html
1. Getting the length of a string: byte length (ASCII-style) vs. UTF-8 character count
a. len("咪咪") // returns 6: the length in bytes (ASCII-style length)
b. utf8.RuneCountInString("咪咪") // returns 2: the number of UTF-8 characters
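A minimal runnable sketch of the two length measurements. The helper name lengths is made up for illustration, and the input is assumed to be a two-character Chinese string such as "咪咪", where each character takes 3 bytes in UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths is a hypothetical helper: it returns the byte length of s
// and the number of UTF-8 encoded characters (runes) in s.
func lengths(s string) (byteLen, runeCount int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	b, r := lengths("咪咪")
	fmt.Println(b, r) // 6 2

	b, r = lengths("abc")
	fmt.Println(b, r) // 3 3
}
```

For pure ASCII text the two numbers are the same, so the difference only shows up once the string contains multi-byte characters.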
2. String traversal: byte by byte (ASCII) and character by character (Unicode)

    // Test string
    str := "我是 abcd"

    // Traverse byte by byte (ASCII style)
    for i := 0; i < len(str); i++ {
        fmt.Printf("ASCII %c, %d\n", str[i], str[i])
    }

    // Traverse character by character (Unicode style)
    for _, s := range str {
        fmt.Printf("Unicode %c, %d\n", s, s)
    }
Output:

    ASCII æ, 230
    ASCII , 136
    ASCII , 145
    ASCII æ, 230
    ASCII , 152
    ASCII ¯, 175
    ASCII  , 32
    ASCII a, 97
    ASCII b, 98
    ASCII c, 99
    ASCII d, 100
    Unicode 我, 25105
    Unicode 是, 26159
    Unicode  , 32
    Unicode a, 97
    Unicode b, 98
    Unicode c, 99
    Unicode d, 100
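As you can see, traversing byte by byte prints garbled characters for the Chinese text, while traversing in Unicode mode prints it correctly. The difference between the two lies in which form of the for loop is used:
ASCII: traversing with an index (subscript) returns one byte at a time, so each number is a single byte value, which matches ASCII only for ASCII characters.
Unicode: traversing with for range returns one character at a time, so each number is the character's Unicode code point.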
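One detail worth seeing in code: for range also yields the byte offset at which each character starts, and that offset jumps by the encoded size of the previous character. The sketch below assumes the test string is "我是 abcd"; the helper name runeOffsets is invented for this example.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper that collects the byte offset
// at which each character of s starts, as reported by for range.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	for i, r := range "我是 abcd" {
		fmt.Printf("offset %d: %c\n", i, r)
	}
	// Each Chinese character occupies 3 bytes, so the offsets skip ahead.
	fmt.Println(runeOffsets("我是 abcd")) // [0 3 6 7 8 9 10]
}
```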
Extension: what is the difference between UTF-8 and Unicode?
Unicode, like ASCII, is a character set.
A character set assigns a unique ID to each character. Every character we use has a unique ID in the Unicode character set; for example, the letter a in the example above is encoded as 97 in both Unicode and ASCII. The Chinese character "你" is encoded as 20320 in Unicode. In the character sets defined by different countries, the ID for the same character can differ, but in Unicode a character's ID never changes.
UTF-8 is an encoding rule: it encodes the Unicode ID of a character as a sequence of bytes. UTF-8 is a variable-length encoding that uses 1 to 4 bytes per character. The encoding rules are as follows:
- 0xxxxxxx represents the text symbols 0 to 127 and is compatible with the ASCII character set.
- Values from 128 to 0x10ffff represent all other characters and take 2 to 4 bytes.
Under these rules, a Latin-script character generally takes one byte, and a Chinese character generally takes 3 bytes.
In the broad sense, Unicode refers to a standard that defines both the character set and the encoding rules, that is, the Unicode character set together with encodings such as UTF-8 and UTF-16.
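The byte counts described above can be checked with the standard unicode/utf8 package. This is only a sketch: the helper name encodedLen is made up, and the emoji is an extra sample character (not from the original text) chosen to show the 4-byte case.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen is a hypothetical helper that reports how many bytes
// UTF-8 needs to encode the Unicode code point r.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// 'a' is ASCII (1 byte), '你' is a Chinese character (3 bytes),
	// and '😀' lies above U+FFFF (4 bytes).
	for _, r := range []rune{'a', '你', '😀'} {
		fmt.Printf("%c id=%d bytes=%d\n", r, r, encodedLen(r))
	}
}
```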
Reference: http://c.biancheng.net/view/18.html
3. Type conversion (casting)

    // Type conversion
    str := "This is a AAA"
    bytestr := []byte(str)
    fmt.Println(bytestr)
    fmt.Println(string(bytestr))

a. Convert a string to a byte slice: []byte(str)
b. Convert a byte slice back to a string: string(bytestr)
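A typical reason to convert is that strings in Go are immutable, while a byte slice can be modified. The sketch below uses an invented helper, upperFirst, and assumes the first byte is an ASCII letter; the conversion copies the data, so the original string is left untouched.

```go
package main

import "fmt"

// upperFirst is a hypothetical helper: it returns a copy of s whose first
// byte is upper-cased when that byte is an ASCII lowercase letter.
func upperFirst(s string) string {
	b := []byte(s) // the conversion copies the string's bytes
	if len(b) > 0 && b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return string(b) // convert the modified bytes back to a string
}

func main() {
	s := "this is a AAA"
	fmt.Println(upperFirst(s)) // This is a AAA
	fmt.Println(s)             // this is a AAA
}
```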
4. Efficient string concatenation

    // String concatenation
    hammer := "eat me a hammer "
    sickle := "go die "

    // Ordinary concatenation with the plus operator
    hammer += sickle
    fmt.Println(hammer)
    fmt.Println(sickle)

    // Efficient concatenation
    // Declare a byte buffer
    var stringBuilder bytes.Buffer

    // Write the strings into the buffer
    stringBuilder.WriteString(hammer)
    stringBuilder.WriteString(sickle)

    // Output the buffer's contents as a string
    fmt.Println(stringBuilder.String())

Output:

    eat me a hammer go die
    go die
    eat me a hammer go die go die
The simplest approach is not always the most efficient. Besides joining strings with the plus operator, Go also offers a StringBuilder-like mechanism for efficient string concatenation: bytes.Buffer.
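As a self-contained sketch of the same idea, the program below joins a list of strings once with a bytes.Buffer and once with the plus operator. The function names joinWithBuffer and joinWithPlus are made up for illustration; both produce the same result, but the buffer version avoids building a new string on every step.

```go
package main

import (
	"bytes"
	"fmt"
)

// joinWithBuffer is a hypothetical helper that concatenates parts
// by writing them into a bytes.Buffer.
func joinWithBuffer(parts []string) string {
	var buf bytes.Buffer
	for _, p := range parts {
		buf.WriteString(p)
	}
	return buf.String()
}

// joinWithPlus is a hypothetical helper that concatenates parts with +=,
// which creates a new string on every iteration.
func joinWithPlus(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

func main() {
	parts := []string{"eat me a hammer ", "go die"}
	fmt.Println(joinWithBuffer(parts))                        // eat me a hammer go die
	fmt.Println(joinWithBuffer(parts) == joinWithPlus(parts)) // true
}
```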
5. Common formatting verbs and what they do
Verb | Function |
---|---|
%v | Outputs the value in its default format |
%+v | Like %v, but also expands struct field names along with their values |
%#v | Outputs the value in Go syntax |
%T | Outputs the type of the value in Go syntax |
%% | Outputs a literal % sign |
%b | Integer displayed in binary |
%o | Integer displayed in octal |
%d | Integer displayed in decimal |
%x | Integer displayed in hexadecimal, lowercase letters |
%X | Integer displayed in hexadecimal, uppercase letters |
%U | Unicode code point format, such as U+4F60 |
%f | Floating-point number |
%p | Pointer, displayed in hexadecimal |
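A short sketch that exercises most of these verbs. The point struct and the format helper are invented for this example; %p is left out because the printed address differs from run to run.

```go
package main

import "fmt"

// point is a hypothetical struct used only to show the struct-related verbs.
type point struct {
	X, Y int
}

// format is a hypothetical helper that applies a single verb to a single value.
func format(verb string, v interface{}) string {
	return fmt.Sprintf(verb, v)
}

func main() {
	p := point{1, 2}
	fmt.Println(format("%v", p))    // {1 2}
	fmt.Println(format("%+v", p))   // {X:1 Y:2}
	fmt.Println(format("%#v", p))   // main.point{X:1, Y:2}
	fmt.Println(format("%T", p))    // main.point
	fmt.Println(format("%b", 255))  // 11111111
	fmt.Println(format("%o", 255))  // 377
	fmt.Println(format("%d", 255))  // 255
	fmt.Println(format("%x", 255))  // ff
	fmt.Println(format("%X", 255))  // FF
	fmt.Println(format("%U", '你'))  // U+4F60
	fmt.Println(format("%f", 3.14)) // 3.140000
	fmt.Printf("100%%\n")           // 100%
}
```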