DataWhale & Golang (Part 5: Dictionaries and Strings)
Supplement:
package main

import "fmt"

func main() {
	// Use make to create a map with string keys and int values
	m := make(map[string]int)
	// Set values
	m["k1"] = 7
	m["k2"] = 13
	// Get the value for a given key
	v1 := m["k1"]
	fmt.Println("v1: ", v1)
	// Get the number of entries
	fmt.Println("len:", len(m))
	// Iterate over the map
	for key, value := range m {
		fmt.Println(key, ":", value)
	}
	// Delete an entry
	delete(m, "k2")
	fmt.Println("map:", m)
	// Specify the values directly at initialization
	n := map[string]int{"foo": 1, "bar": 2}
	fmt.Println("map:", n)
}
String definition
A string is an immutable sequence of bytes. It can contain arbitrary data, but it usually holds human-readable text. By convention that text is UTF-8 encoded: characters in the ASCII table take 1 byte, and other characters take 2 to 4 bytes as needed.
UTF-8 is a widely used encoding and the standard encoding for text files, including XML and JSON. Because it is a variable-width encoding, a character in a Go string occupies 1 to 4 bytes as needed. This differs from languages such as Java, whose strings are stored as 2-byte UTF-16 code units. As a result, Go saves memory and disk space for mostly-ASCII text, and UTF-8 text does not have to be encoded and decoded on the way in and out, as it does in languages that use a different internal representation.
A string is a value type and its value is immutable: once a string is created, its contents cannot be modified. At a lower level, a string is a read-only sequence of bytes of fixed length.
To define a string, enclose the text in double quotes (""). Escape characters can be used inside the string for effects such as line breaks and indentation. Commonly used escape characters include:
- \n: Newline character
- \r: carriage return
- \t: tab character
- \u or \U: Unicode character
- \\: the backslash itself
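The escape characters listed above can be tried out in a short program. This is a minimal sketch; the helper name escapeDemo and the sample text are made up for illustration. The last line contrasts an interpreted string with a raw string literal in backquotes, where backslashes are not interpreted.

```go
package main

import "fmt"

// escapeDemo (hypothetical helper) returns a string that uses
// \t, \n, \\ and \u escapes.
func escapeDemo() string {
	return "name:\tGo\npath:\tC:\\go\nchar:\t\u4e2d"
}

func main() {
	// Prints three lines; each label is followed by a tab.
	fmt.Println(escapeDemo())

	// "\n" is one newline byte; `\n` is a backslash followed by 'n'.
	fmt.Println(len("\n"), len(`\n`)) // 1 2
}
```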
String implementation is based on UTF-8 encoding
Strings in Go are UTF-8 encoded internally, and each character can be accessed conveniently through the rune type. Go also supports byte-by-byte access by index, which corresponds to character-by-character access only when the text is ASCII.
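The difference between rune access and byte access can be sketched as follows. The sample string and the helper runeOffsets are illustrative assumptions, not part of the original notes. A for-range loop over a string decodes UTF-8 and yields one rune per iteration, while indexing yields a single raw byte.

```go
package main

import "fmt"

// runeOffsets (hypothetical helper) returns the byte offset at which
// each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s { // range decodes UTF-8 and yields rune start offsets
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "Go语言"
	for i, r := range s {
		fmt.Printf("offset %d: %c (%d bytes)\n", i, r, len(string(r)))
	}
	// Indexing returns raw bytes, not characters.
	fmt.Println(s[0], s[2])     // 71 232
	fmt.Println(runeOffsets(s)) // [0 1 2 5]
}
```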
5.1 Dictionaries
A map (dictionary) is a key-value data structure: given a key, the corresponding value can be retrieved quickly.
5.1.1 How to define a dictionary
var m1 map[string]int                // nil map; must be initialized with make before writing
m2 := make(map[int]interface{}, 100) // capacity hint of 100
m3 := map[string]string{
	"name": "james",
	"age":  "35",
}
You do not have to specify a capacity when defining a dictionary, because a map grows dynamically. When the size can be estimated in advance, however, passing a capacity to make avoids repeated reallocation and improves efficiency. Note that keys must be of a comparable type: slices, maps, and functions cannot be used as keys (arrays can, provided their element type is comparable). The value can be of any type. With interface{} as the value type, the dictionary accepts values of any type, but a type assertion is needed to recover the concrete type when a value is used.
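The point about interface{} values and type assertions can be sketched like this. The helper describe and the sample data are assumptions made for illustration. The comma-ok form of a type assertion avoids a panic when the stored value has a different type, and a type switch handles several possible types at once.

```go
package main

import "fmt"

// describe (hypothetical helper) reports the dynamic type of a value
// stored as interface{} by using a type switch.
func describe(v interface{}) string {
	switch x := v.(type) {
	case string:
		return "string: " + x
	case int:
		return fmt.Sprintf("int: %d", x)
	default:
		return fmt.Sprintf("other: %v", x)
	}
}

func main() {
	// Capacity hint of 2; the map still grows if more entries are added.
	m := make(map[string]interface{}, 2)
	m["name"] = "james"
	m["age"] = 35

	// Comma-ok type assertion: ok is false instead of panicking on a mismatch.
	if age, ok := m["age"].(int); ok {
		fmt.Println("age next year:", age+1) // age next year: 36
	}
	fmt.Println(describe(m["name"])) // string: james
	fmt.Println(describe(m["age"]))  // int: 35
}
```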
m3["key1"] = "v1" // put an element into the dictionary
len(m3)           // get the length of the dictionary
Determine whether a key exists in the dictionary:
if value, ok := m3["name"]; ok {
	fmt.Println(value)
}
In the code above, if the key "name" exists in the dictionary, ok is true and value holds the corresponding value; otherwise ok is false and value is the zero value of the value type.
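The second return value matters because a missing key and a stored zero value look identical otherwise. A minimal sketch, with the map contents and the helper lookup invented for illustration:

```go
package main

import "fmt"

// lookup (hypothetical helper) returns the value for key and whether
// the key is present in m.
func lookup(m map[string]int, key string) (int, bool) {
	v, ok := m[key]
	return v, ok
}

func main() {
	stock := map[string]int{"apple": 0, "pear": 3}

	// Both expressions yield 0, although only "apple" is actually stored.
	fmt.Println(stock["apple"], stock["kiwi"]) // 0 0

	_, hasApple := lookup(stock, "apple")
	_, hasKiwi := lookup(stock, "kiwi")
	fmt.Println(hasApple, hasKiwi) // true false
}
```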
Traverse the dictionary:
for key, value := range m3 {
	fmt.Println("key:", key, "value:", value)
}
The output order of this loop can differ from run to run, because a dictionary is unordered.
(To traverse a dictionary in a fixed order, collect its keys in a slice and sort the slice.)
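One way to get a stable traversal order is sketched below, assuming string keys and the standard sort package. The helper name sortedKeys is made up for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys (hypothetical helper) returns the keys of m in ascending order.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m3 := map[string]string{"name": "james", "age": "35", "key1": "v1"}
	// Iterating over the sorted keys prints age, key1, name on every run.
	for _, k := range sortedKeys(m3) {
		fmt.Println("key:", k, "value:", m3[k])
	}
}
```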
Delete an entry from the dictionary with Go's built-in function delete:
delete(m3, "key1")
Functions can also be stored as values in a dictionary:
package main

import "fmt"

func main() {
	m := make(map[string]func(a, b int) int)
	m["add"] = func(a, b int) int {
		return a + b
	}
	m["multi"] = func(a, b int) int {
		return a * b
	}
	fmt.Println(m["add"](3, 2))   // 5
	fmt.Println(m["multi"](3, 2)) // 6
}
5.2 String
5.2.1 String definition
A string is a value type, and its value is immutable after the string is created.
To modify the content of a string, convert it to a byte slice, modify the slice, and convert it back to a string. Each conversion copies the data, so new memory is allocated.
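package main

import "fmt"

func main() {
	s := "hello"
	b := []byte(s) // copy the bytes into a mutable slice
	b[0] = 'g'
	s = string(b)  // convert back, allocating a new string
	fmt.Println(s) // gello
}
len(s) // get the length of the string in bytes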
However, if the string contains Chinese characters, it cannot be processed one byte at a time, because each such character spans several bytes. In Go, the string can be converted to a rune slice instead:
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	s := "hello你好中国"
	fmt.Println(len(s))                    // 17 (bytes)
	fmt.Println(utf8.RuneCountInString(s)) // 9 (characters)
	b := []byte(s)
	for i := 0; i < len(b); i++ {
		fmt.Printf("%c", b[i])
	} // "hello" followed by garbled characters, one per byte
	fmt.Println()
	r := []rune(s)
	for i := 0; i < len(r); i++ {
		fmt.Printf("%c", r[i])
	} // hello你好中国
}
Go stores strings as UTF-8, so each Chinese character occupies three bytes; together with the 5 bytes of "hello", the length is 5 + 4×3 = 17. utf8.RuneCountInString returns 9, the number of characters, which matches intuition. Printing byte by byte produces garbled output, because a single byte of a multi-byte character is not a printable character by itself, whereas converting the string to a rune slice prints it correctly.
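The same rune-slice conversion also makes it possible to change a single Chinese character, which the byte-slice approach cannot do safely. A minimal sketch: the helper replaceRune is hypothetical and, for brevity, does not check that the index is in range.

```go
package main

import "fmt"

// replaceRune (hypothetical helper) returns a copy of s in which the
// character at position i, counted in runes, is replaced by r.
// It panics if i is out of range.
func replaceRune(s string, i int, r rune) string {
	rs := []rune(s) // decode UTF-8 into one element per character
	rs[i] = r
	return string(rs) // encode back to UTF-8
}

func main() {
	s := "hello你好中国"
	fmt.Println(replaceRune(s, 5, '您')) // hello您好中国
	fmt.Println(s)                      // the original string is unchanged
}
```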
5.2.2 strings package
The strings package provides many functions for manipulating strings.
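package main

import (
	"fmt"
	"strings"
)

func main() {
	var str string = "This is an example of a string"
	// Check whether the string starts with "Th"
	fmt.Printf("%t\n", strings.HasPrefix(str, "Th")) // true
	// Check whether the string ends with "aa"
	fmt.Printf("%t\n", strings.HasSuffix(str, "aa")) // false
	// Check whether the string contains the substring "an"
	fmt.Printf("%t\n", strings.Contains(str, "an")) // true
}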
5.2.3 strconv package
The strconv package implements conversions between basic data types and strings.
i, err := strconv.Atoi("-42") // convert a string to an int
s := strconv.Itoa(-42)        // convert an int to a string
If a conversion fails (for example, Atoi is given a non-numeric string), a non-nil error value is returned.
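Handling the error returned by strconv.Atoi might look like the sketch below. The helper parseOrDefault and the input "4x2" are assumptions chosen for illustration; falling back to a default value is one possible policy, not the only one.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseOrDefault (hypothetical helper) converts s to an int and
// returns def when the conversion fails.
func parseOrDefault(s string, def int) int {
	n, err := strconv.Atoi(s)
	if err != nil {
		return def
	}
	return n
}

func main() {
	// "4x2" is not a valid integer, so err is non-nil.
	if _, err := strconv.Atoi("4x2"); err != nil {
		fmt.Println("conversion failed:", err)
	}
	fmt.Println(parseOrDefault("-42", 0), parseOrDefault("4x2", 0)) // -42 0
}
```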
5.2.4 String splicing
Besides the operations above, string splicing (concatenation) is also very common. Go offers several ways to do it, and they differ in efficiency. The following benchmarks compare several of them.
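1. Sprintf
package main

// The four benchmarks below share one _test.go file and these imports.
import (
	"bytes"
	"fmt"
	"strconv"
	"strings"
	"testing"
)

const numbers = 100

func BenchmarkSprintf(b *testing.B) {
	b.ResetTimer()
	for idx := 0; idx < b.N; idx++ {
		var s string
		for i := 0; i < numbers; i++ {
			s = fmt.Sprintf("%v%v", s, i)
		}
	}
	b.StopTimer()
}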
2. + splicing
func BenchmarkStringAdd(b *testing.B) {
	b.ResetTimer()
	for idx := 0; idx < b.N; idx++ {
		var s string
		for i := 0; i < numbers; i++ {
			s += strconv.Itoa(i)
		}
	}
	b.StopTimer()
}
3. bytes.Buffer
func BenchmarkBytesBuf(b *testing.B) {
	b.ResetTimer()
	for idx := 0; idx < b.N; idx++ {
		var buf bytes.Buffer
		for i := 0; i < numbers; i++ {
			buf.WriteString(strconv.Itoa(i))
		}
		_ = buf.String()
	}
	b.StopTimer()
}
4. strings.Builder splicing
func BenchmarkStringBuilder(b *testing.B) {
	b.ResetTimer()
	for idx := 0; idx < b.N; idx++ {
		var builder strings.Builder
		for i := 0; i < numbers; i++ {
			builder.WriteString(strconv.Itoa(i))
		}
		_ = builder.String()
	}
	b.StopTimer()
}
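5. Comparison
BenchmarkSprintf-8          68277    18431 ns/op
BenchmarkStringBuilder-8  1302448      922 ns/op
BenchmarkBytesBuf-8        884354     1264 ns/op
BenchmarkStringAdd-8       208486     5703 ns/op
In this run, strings.Builder is the fastest, followed by bytes.Buffer and then + splicing, while Sprintf is the slowest.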