Machine learning basic study notes [2]

1. Basics of Python data processing

1. Basic data types

  • Python has 6 standard data types: number, str, list, tuple, set, and dict (dictionary).
  • Among them, number and str belong to basic data types, and list, tuple, dict, and set belong to composite data types.
  • List, set, and dictionary are mutable data types, while number, str, and tuple are immutable data types.

1.1 number

  • The numeric type is used to store numerical values and is immutable.
  • Python has no increment and decrement operators such as ++ and -- for numeric data; use += 1 and -= 1 instead.
  • Four numeric types are supported: int, float, bool, complex.
  • Dividing two numbers with / always yields a floating-point number, even if both operands are integers and the division is exact.
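
The points above can be checked with a short sketch. The variable names are chosen here only for illustration.

```python
# Each literal below has one of the four numeric types.
values = [7, 7.0, True, 3 + 4j]
type_names = [type(v).__name__ for v in values]
print(type_names)  # ['int', 'float', 'bool', 'complex']

# "/" always returns a float, even when the division is exact.
true_quotient = 6 / 3
print(true_quotient, type(true_quotient).__name__)  # 2.0 float

# There is no ++ operator; augmented assignment rebinds the name instead.
count = 5
count += 1
print(count)  # 6
```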

1.2 str

1. Index

  • A single character of a string can be obtained by index.
  • Indexes start from 0, and negative indexes can also be used.

2. Slice

  • A substring can be obtained by slicing, which uses indexes to extract multiple characters from a string at the same time.
  • The character at the end index is not included in the slice.

3. Other operations

  • A string can be reversed with variable[::-1]
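s = 'Picture'      # named s to avoid shadowing the built-in str
print(s[2])        # index
print(s[1:3])      # slice
print(s[-3:-1])
print(s[3:-1])
print(s[-6:7])
print(s[2:])       # from the start index to the end
print(s[:5])       # from the beginning up to, but not including, the end index
print(s[:])        # the whole string
print(s[1::2])     # step
print(s[::-1])     # reversed
print(s * 2)       # repetition
print(s + 'TEST')  # concatenation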

c
ic
ur
tur
icture
cture
Pictu
Picture
itr
erutciP
PicturePicture
PictureTEST

  • A string variable can be reassigned as a whole, but an individual character in a string cannot be modified.
  • Common methods for string rewriting: [Table: Common methods for string rewriting]
  • Note: besides changing the first letter to uppercase, .capitalize() and .title() also change the other uppercase letters to lowercase. .capitalize() acts on the first letter of the whole string, while .title() acts on the first letter of every word.
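
A minimal sketch of these rules follows. The sample text and variable names are illustrative assumptions, not taken from the notes.

```python
word = "hello WORLD"

# Reassigning the whole variable is allowed.
word = "python IS fun"

# Assigning to a single character raises TypeError because str is immutable.
try:
    word[0] = "P"
    item_assignment_failed = False
except TypeError:
    item_assignment_failed = True

capitalized = word.capitalize()        # first letter upper, the rest lower
titled = word.title()                  # first letter of every word upper
replaced = word.replace("fun", "FUN")  # returns a new string; word is unchanged

print(item_assignment_failed)  # True
print(capitalized)             # Python is fun
print(titled)                  # Python Is Fun
print(replaced, "|", word)     # python IS FUN | python IS fun
```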

1.2.1 String judgment

  • To determine what the string starts with, use the method .startswith()
  • To determine what a string ends with, use the method .endswith()
  • To determine whether a character (or substring) is in a string, use the "in" operator
  • To obtain the index of a character or substring, use the method .find(). A return value of -1 means that it was not found.
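
As a quick illustration of the four checks, here is a small sketch. The file name is a made-up example value.

```python
filename = "report_2023.csv"  # hypothetical sample value

starts_with_report = filename.startswith("report")
ends_with_csv = filename.endswith(".csv")
contains_year = "2023" in filename
year_index = filename.find("2023")     # index of the first match
missing_index = filename.find("xlsx")  # -1 because it is not found

print(starts_with_report, ends_with_csv, contains_year)  # True True True
print(year_index, missing_index)                         # 7 -1
```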

1.2.2 Split string

  • The method .split() returns a list.
  • Indexing and slicing of lists work the same as for strings, but individual elements of a list can be modified.
  • Commonly used methods for lists: [Table: Commonly used methods for lists]
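
The following sketch shows split() producing a list that can then be indexed, sliced, and modified in place. The sample strings are illustrative.

```python
line = "apple,banana,cherry"

fruits = line.split(",")  # split on an explicit separator
first = fruits[0]         # indexing works as with strings
last_two = fruits[-2:]    # so does slicing

fruits[1] = "blueberry"   # unlike a string, a list element can be replaced

# With no argument, split() splits on any run of whitespace.
words = "a  b   c".split()

print(first, last_two)  # apple ['banana', 'cherry']
print(fruits)           # ['apple', 'blueberry', 'cherry']
print(words)            # ['a', 'b', 'c']
```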

1.3 List[]

1.3.1 Delete list elements

  • The first three methods below delete by position, and the last one deletes by value.
  1. Use the del statement to delete an element: del mylist[0] deletes the element with index 0 in the list.
  2. Use the pop() method to delete the last element of the list. pop() returns the removed element, so it can be assigned
    to a variable: mylist1=mylist.pop()
  3. Use the pop() method to delete an element at any position. Specify the index of the element in the parentheses; the removed element can again be assigned
    to a variable: mylist1=mylist.pop(3)
  4. To delete an element by value, use the remove() method: mylist.remove('elem'). Unlike pop(), remove() returns None, so there is nothing useful to assign to a variable.
    This method only deletes the first occurrence of the value. If the value appears multiple times in the list, use a loop to delete all of them.
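
The four deletion methods can be traced step by step in the sketch below. The list contents are illustrative, and the while loop is one possible way to remove every occurrence of a value.

```python
mylist = ["a", "b", "c", "b", "d", "b"]

del mylist[0]                   # by position -> ['b', 'c', 'b', 'd', 'b']
last = mylist.pop()             # removes and returns the last element
second = mylist.pop(1)          # removes and returns the element at index 1
result = mylist.remove("b")     # by value, first occurrence only; returns None
after_single_remove = list(mylist)

# Remove every remaining occurrence of "b" with a loop.
while "b" in mylist:
    mylist.remove("b")

print(last, second, result)  # b c None
print(after_single_remove)   # ['b', 'd']
print(mylist)                # ['d']
```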

1.4 tuple ()

  • A tuple is written inside (), with elements separated by commas. The elements can have different types.
  • Tuples are similar to lists, but the elements of a tuple cannot be modified.
    However, if an item inside the tuple is of a mutable type, for example a list, that item itself can be modified.
  • Tuples are indexed and sliced in the same way as strings and lists. Indexes start from 0 at the front, and negative indexes count from the end starting at -1.
  • Tuples can be assigned directly to variables
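
A short sketch of tuple behavior follows. The unpacking line at the end is one common reading of "assigned directly to variables" and is an assumption about what the note means.

```python
point = (1, "two", [3, 4])  # elements may have different types

# Replacing an element of the tuple is not allowed.
try:
    point[0] = 10
    replace_failed = False
except TypeError:
    replace_failed = True

# A mutable item inside the tuple can still be changed in place.
point[2].append(5)

last_item = point[-1]  # negative indexes count from the end
first_two = point[:2]  # slicing returns a new tuple

# Unpacking assigns each element to its own variable.
number, label, items = point

print(replace_failed)        # True
print(point)                 # (1, 'two', [3, 4, 5])
print(first_two, last_item)  # (1, 'two') [3, 4, 5]
```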

1.5 dict {}

  1. Dictionary values are accessed by key, with the key enclosed in [].
  2. Modifying a dictionary: add or modify an entry by assigning to a key, use the update() method to modify or add several entries, use the del statement to
    delete an entry, and use the clear() method to empty the dictionary.
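d = {'name': 'zhangsan', 'age': 20}  # named d to avoid shadowing the built-in dict
print('Original dictionary:')
print(d)
# Add
d['gender'] = 'Female'
print('After addition:')
print(d)
# Modify (1): assign to an existing key
d['name'] = 'lisi'
print('After modification:')
print(d)
# Modify (2): update() changes existing keys and adds new ones
d.update({'No': 1, 'age': 22})
print('After modification:')
print(d)
# Delete
del d['gender']
print('After deletion:')
print(d)
# Clear
d.clear()
print('After clearing:')
print(d)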

Original dictionary:
{'name': 'zhangsan', 'age': 20}
After addition:
{'name': 'zhangsan', 'age': 20, 'gender': 'Female'}
After modification:
{'name': 'lisi', 'age': 20, 'gender': 'Female'}
After modification:
{'name': 'lisi', 'age': 22, 'gender': 'Female', 'No': 1}
After deletion:
{'name': 'lisi', 'age': 22, 'No': 1}
After clearing:
{}

1.6 set {}

  • A set consists of a series of unordered, non-repeating data items. Because it is unordered it cannot be indexed, and each element in the set is unique.
  • To create a set, use {} or the set() function. When set() is given a str, the str is split into single characters.
  • To create an empty set, you must use the set() function, because empty {} creates an empty dictionary.
  • The main use of a set is to remove duplicates, for example by passing a list to set().
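
These set rules can be seen in a brief sketch. The sample values are illustrative; sorted() is used only to get a predictable order for printing, since a set itself is unordered.

```python
letters = set("hello")  # a str is split into unique single characters
empty_set = set()       # the only way to create an empty set
empty_braces = {}       # this is an empty dict, not a set

scores = [3, 1, 3, 2, 1]
unique_scores = sorted(set(scores))  # duplicates removed, then sorted

print(sorted(letters))              # ['e', 'h', 'l', 'o']
print(type(empty_set).__name__)     # set
print(type(empty_braces).__name__)  # dict
print(unique_scores)                # [1, 2, 3]
```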

1.7 range() function

  • range() is a function that generates a series of numbers; its parameters are separated by commas. The return value is of type range. [Note: it is a series of numbers, not a list]
  • With 1 parameter, range(a) gives the numbers from 0 to a-1. The value a itself is not included,
    which is a result of the off-by-one behavior common in programming languages.
  • With 2 parameters, range(b,a) gives a total of a-b numbers, from b to a-1.
  • With 3 parameters, range(b,a,c) gives the numbers starting from b and stopping before a, with a difference of c between consecutive numbers; that is, the third parameter sets the step
    size.
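
The three call forms can be compared side by side in this sketch. list() is used only to make the generated numbers visible.

```python
one_arg = list(range(5))           # 0 up to, but not including, 5
two_args = list(range(2, 6))       # 2 up to, but not including, 6
three_args = list(range(1, 10, 3)) # start 1, stop before 10, step 3

range_type = type(range(5)).__name__  # a range object, not a list

print(one_arg)                  # [0, 1, 2, 3, 4]
print(two_args, len(two_args))  # [2, 3, 4, 5] 4
print(three_args)               # [1, 4, 7]
print(range_type)               # range
```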

2. Operator

2.1 Arithmetic operators

  • +, -, *, /, %, **, //
  • The / division operator always produces a floating-point number, regardless of whether the quotient is a whole number; // is floor division.
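
A small sketch of the seven operators on one pair of operands, with the negative case added to show how // and % round. The operand values are arbitrary.

```python
a, b = 7, 2
results = {
    "a + b": a + b,    # 9
    "a - b": a - b,    # 5
    "a * b": a * b,    # 14
    "a / b": a / b,    # 3.5 (always a float)
    "a % b": a % b,    # 1   (remainder)
    "a ** b": a ** b,  # 49  (power)
    "a // b": a // b,  # 3   (floor division)
}
for expression, value in results.items():
    print(f"{expression} = {value}")

# Floor division rounds toward negative infinity,
# and the result of % takes the sign of the divisor.
negative_floor = -7 // 2  # -4
negative_mod = -7 % 2     # 1
print(negative_floor, negative_mod)
```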

2.2 Bit operators

  • &, |, ^, ~, <<, >>
  • ^ is XOR
  • ~ is bitwise inversion. The conversion formula is ~x = -x-1, which follows from negative numbers being stored in two's complement form in computers.
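
The bit operators can be checked on two small binary literals, as in this sketch. The operand values are arbitrary.

```python
a, b = 0b1100, 0b1010  # 12 and 10

and_result = a & b  # 0b1000 -> 8
or_result = a | b   # 0b1110 -> 14
xor_result = a ^ b  # 0b0110 -> 6
not_result = ~a     # -a - 1 -> -13
left_shift = a << 2   # 12 * 4 -> 48
right_shift = a >> 2  # 12 // 4 -> 3

print(and_result, or_result, xor_result)    # 8 14 6
print(not_result, left_shift, right_shift)  # -13 48 3
```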

2.3 Input and output

2.3.1 input

  • The input() function always returns the received data as a str. To get another data type, an explicit conversion such as int() or float() is required.
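
The conversion step might look like the sketch below. To keep it runnable without a keyboard, fixed strings stand in for the values that input() would return; that substitution is an assumption made for the example.

```python
# These strings stand in for values returned by input(), e.g.
# raw_age = input("Age: ")
raw_age = "20"
raw_price = "3.5"

# Without conversion, arithmetic on the raw str fails.
try:
    raw_age + 1
    addition_failed = False
except TypeError:
    addition_failed = True

age = int(raw_age)        # str -> int
price = float(raw_price)  # str -> float

print(addition_failed)     # True
print(age + 1, price * 2)  # 21 7.0
```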

2.3.2 Formatted output

  • Formatted output makes statements easier to modify and reduces the amount of code to write; it also covers automatic control of digits (width and precision), base conversion, etc.
  • There are three ways to format output: the % format-character method, the format() method, and the f-string method.
  1. %s, %d, %f, %%. For example,
    print("My name is %s"%name)
    print("My student number is %06d"%student_no)
    print("Apple unit price %.02f"%price)
    print("Data proportion %.02f%%"%scale)

  2. Use format. For example,
    print("{} said: Learn and practice it from time to time, not {}".format("Confucius", "Shuohu"))
    print("{1} said: Learn and practice it from time to time, not {0}".format("Shuohu","Confucius"))

  3. f-string (f string) method
    print(f"{name}'s hobby is {fondness}")
    where name and fondness are variable names, such as name="zhangsan" fondness="pingpang"

  4. The print(__doc__) statement prints the docstring of the current module, that is, the triple-quoted documentation string
    placed at the top of the file.
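
The three formatting styles can be compared on the same data, as sketched below. The variable values are illustrative, and the #x format specifier is included only as one example of base conversion.

```python
name = "zhangsan"
student_no = 42
price = 3.14159
scale = 25.5

# 1. % format characters: zero-padded width, fixed precision, literal %
percent_style = "No. %06d, price %.2f, share %.2f%%" % (student_no, price, scale)

# 2. str.format() with positional indexes
format_style = "{0} pays {1:.2f}".format(name, price)

# 3. f-string with a format specifier for hexadecimal output
fstring_style = f"{name}'s number in hex is {student_no:#x}"

print(percent_style)  # No. 000042, price 3.14, share 25.50%
print(format_style)   # zhangsan pays 3.14
print(fstring_style)  # zhangsan's number in hex is 0x2a
```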

3. Control structure

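  • About break and continue: break exits the nearest enclosing loop, while continue skips the rest of the current iteration and moves on to the next one.
for i in range(10):
    if i == 3:
        break  # leave the outer loop entirely once i reaches 3
    print(f'--------{i}--------')
    for j in range(6):
        if j == 2:
            continue  # skip printing 2
        elif j == 4:
            continue  # skip printing 4
        else:
            print(j)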

--------0--------
0
1
3
5
--------1--------
0
1
3
5
--------2--------
0
1
3
5

4. Function

4.1 Function parameters

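  • An indefinite number of positional arguments is collected into a tuple; prefix the parameter name with one asterisk (*) in the definition.
  • An indefinite number of keyword arguments is collected into a dictionary; prefix the parameter name with two asterisks (**) in the definition.
# positional parameter, default parameter, variable positional parameters (tuple), variable keyword parameters (dict)
def func(a, b=1, *num, **kwargs):
    print(a, b, num, kwargs)

func(2)
func(4, 2, 3, 4, 5, 6, c=2, d=5, e=8)
func(4, 2, 3, d=5, e=8)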
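
To see how the arguments are distributed, the sketch below uses a variant that returns its parameters instead of printing them. The name describe is hypothetical and not part of the original notes.

```python
def describe(a, b=1, *num, **kwargs):
    """Return the received arguments so they can be inspected."""
    return a, b, num, kwargs

only_required = describe(2)
many_args = describe(4, 2, 3, 4, 5, 6, c=2, d=5, e=8)
one_extra = describe(4, 2, 3, d=5, e=8)

print(only_required)  # (2, 1, (), {})
print(many_args)      # (4, 2, (3, 4, 5, 6), {'c': 2, 'd': 5, 'e': 8})
print(one_extra)      # (4, 2, (3,), {'d': 5, 'e': 8})
```

The extra positional arguments always arrive as a tuple, even when there is only one, and the extra keyword arguments arrive as a dictionary.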

4.2 Anonymous functions

  • An anonymous function has no name and is created with the lambda keyword.
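
A minimal sketch of lambda follows, first bound to a name for demonstration and then passed directly as a sort key, which is a common use. The data is illustrative.

```python
# A lambda is an expression that evaluates to a function object.
square = lambda x: x ** 2
squared = square(4)

# Typical use: a short throwaway function passed as an argument.
pairs = [("b", 2), ("a", 3), ("c", 1)]
by_count = sorted(pairs, key=lambda pair: pair[1])

print(squared)   # 16
print(by_count)  # [('c', 1), ('b', 2), ('a', 3)]
```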

5. Data file reading and writing

5.1 Python native standard file opening, reading (writing), and closing operations

  1. The built-in function for opening a file is open(). It returns a file object, and the file is accessed through this
    file object.
  • The most commonly used parameters are the file name, the opening mode, and an optional third parameter that controls buffering.
  • The file name and the mode are strings, so they are enclosed in quotation marks.
  2. Different ways to open files
  • The opening modes include w, w+, r, r+, a, a+, etc. A mode with + opens the file for both reading and writing.
  • There are also modes such as wb, rb, ab, etc., which open the file in binary mode.
  • When the mode is writing or appending, the file is opened if it exists and created if it does not.
  3. The initial position of the file pointer depends on the opening mode
  • When the mode is w or r, the pointer points to the beginning of the file.
  • When the mode is a, the pointer points to the end of the file.
  4. To write data to a file, use the write() method of the file object. The argument is the string to be written to the file.
  • To write to a file, open it in 'w' (overwrite), 'a' (append), or 'r+' mode.
  • Opened with w: the existing contents are cleared when the file is opened, and data is then written from the beginning.
  • Opened with a: data is added at the end of the file.
  • Opened with r+: writing overwrites the original content starting from the current pointer, and content that is not overwritten is
    retained.
  5. The file object also provides methods for reading files, including read(), readline(), readlines(), etc.
  • f.read() reads the entire file by default; if the parameter count is given, it reads count characters (bytes in binary mode) from the current position. The return value
    is a string.
  • f.readline() reads one line from the current position and returns it as a string.
  • f.readlines() reads all remaining lines from the current position and returns them as a list, where each item corresponds to
    one line of the file as a string. A file object can also be traversed line by line with a for loop.
  • After a read or write, the position of the file pointer changes.
  6. After using the file, close it with the .close() method of the file object.
  • A more convenient approach is the with statement provided by Python. A file opened with with is closed automatically, so f.close() does not need to be called.
    The file is guaranteed to be closed even if an error occurs while reading it.
with open('f:/temp.txt','a+') as f: 
	f.write('lisi\n')
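
The modes and read methods described in this section can be tried together in the sketch below. It writes to a temporary directory rather than a fixed drive path, and the file name and contents are illustrative assumptions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "temp.txt")

    # 'w' creates the file (or clears it) and writes from the beginning.
    with open(path, "w") as f:
        f.write("zhangsan\nlisi\n")

    # 'a' positions the pointer at the end, so new data is appended.
    with open(path, "a") as f:
        f.write("wangwu\n")

    # readline() reads one line; readlines() reads the rest as a list.
    with open(path, "r") as f:
        first_line = f.readline()
        remaining = f.readlines()

    # 'r+' starts at the beginning and overwrites in place.
    with open(path, "r+") as f:
        f.write("ZHANG")

    with open(path) as f:
        content = f.read()

print(repr(first_line))  # 'zhangsan\n'
print(remaining)         # ['lisi\n', 'wangwu\n']
print(repr(content))     # 'ZHANGsan\nlisi\nwangwu\n'
```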

5.2 Numpy access files

  • Data can be read from a text file with loadtxt(); the result is an ndarray.
  • Use savetxt() to write an array to a text file.
  • The main reading and writing functions of Numpy are as follows: [Table: Numpy's main reading and writing functions]
  • An ndarray in Numpy requires all elements to have a single data type.
  • The columns of a .csv file may have different data types, so pass dtype=np.str_ to read all data as one unified type, namely strings.
# Read
import numpy as np
tmp=np.loadtxt('f:/temp1.txt',dtype=np.str_,delimiter='\n')
print(tmp,type(tmp))
# Another way to import the package and function
from numpy import loadtxt
tmp=loadtxt('f:/temp1.txt',dtype=np.str_,delimiter='\n')  # np.str_ still relies on the import above
print(tmp,type(tmp))
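
A round trip with savetxt() and loadtxt() might look like the sketch below. It assumes NumPy is installed, uses a temporary file instead of a fixed path, and the array values and the "%.2f" format are illustrative choices.

```python
import os
import tempfile

import numpy as np

data = np.array([[1.5, 2.0], [3.25, 4.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.csv")

    # Write the array as comma-separated text with two decimals.
    np.savetxt(path, data, delimiter=",", fmt="%.2f")

    # Read it back as floats (the default dtype) ...
    as_float = np.loadtxt(path, delimiter=",")
    # ... and again with every field kept as a string.
    as_text = np.loadtxt(path, delimiter=",", dtype=np.str_)

print(as_float, type(as_float).__name__)
print(as_text)
```

Reading with dtype=np.str_ keeps each field exactly as it appears in the file, which is useful when columns mix numbers and text.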

5.3 Pandas access files

  • Pandas is built on Numpy, and its core function is data calculation and processing.
  • The Pandas library provides specialized file input and output functions, which are roughly divided into read functions and write functions: [Table: Pandas access files]
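import pandas as pd
data=pd.read_csv('f:/data/film.csv') # the default separator is a comma, so it can be omitted
data.head() # shows the first 5 rows in a notebook; use print(data.head()) in a script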
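
The read and write functions can be tried without a file on disk, as in this sketch. It assumes pandas is installed; the CSV text is made-up sample data, and io.StringIO stands in for a real file.

```python
import io

import pandas as pd

# Made-up sample data standing in for a real .csv file.
csv_text = "title,year,score\nFilm A,2001,8.1\nFilm B,2010,7.4\nFilm C,2019,9.0\n"

# read_csv accepts a path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))
top_rows = data.head(2)  # first two rows

# to_csv is the matching write function; index=False omits the row labels.
buffer = io.StringIO()
data.to_csv(buffer, index=False)
written = buffer.getvalue()

print(data.shape)  # (3, 3)
print(top_rows)
print(written)
```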

Origin blog.csdn.net/QwwwQxx/article/details/124707390