Python basics and using PyCharm
1. PyCharm
1.1 Add header information to new files of a given type
File--Settings--Editor--File and Code Templates--Python Script:
"""
===============
author:${USER}
time:${DATE}
E-mail:[email protected]
===============
"""
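${PROJECT_NAME} - name of the current project;
${NAME} - file name specified in the new-file dialog;
${USER} - current user name;
${DATE} - current system date;
${TIME} - current system time;
${YEAR} - year;
${MONTH} - month;
${DAY} - day;
${HOUR} - hour;
${MINUTE} - minute;
${PRODUCT_NAME} - name of the IDE in which the file is created;
${MONTH_NAME_SHORT} - abbreviated English month name, e.g. Jan, Feb, etc;
${MONTH_NAME_FULL} - full English month name, e.g. January, February, etc;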
1.2 The difference between a Directory and a Package (the __init__.py file)
A Package is a Directory that additionally contains an __init__.py file, which indicates that the py files in that directory can be imported as modules by other py files.
Benefits of using modules: avoids conflicts between function and variable names; enables code reuse.
Benefits of using packages: avoids conflicts between module names; enables code reuse.
What the __init__.py file is for:
(1) It marks a folder as a Python package.
(2) It defines what a wildcard import brings in: the names imported by from package import * are controlled by the __all__ variable.
** Note: __all__ only affects the import * form; it has no effect on from package.module import obj! **
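# package1 contains the modules module1 (func1, func2), module2 and module3, which package2 needs to import
from package1 import module1, module2, module3
# The lazy form below is sometimes used, although PEP 8 discourages it; it imports all methods and variables whose names do not start with an underscore
from package1 import *
# To keep some non-underscore names from being imported this way, define __all__ in the module and in the package's __init__.py to expose only the objects that may be imported. For example, to block func1 of module1, and module3:
# add to module1.py
__all__ = ['func2']
# in package1's __init__.py; module3 can then no longer be imported via from package1 import *, but from package1 import module3 still works
__all__ = ['module1', 'module2']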
(3) It shortens the import statements for modules and classes inside the package
# module2 in package2 needs to import class1 from module1, a module in the sub-package module_package1 under package1
from package1.module_package1.module1 import class1
# If the __init__.py of module_package1 contains (Python 3 needs the relative form with a leading dot)
from .module1 import class1
# then module2 in package2 can import class1 with one module name fewer
from package1.module_package1 import class1
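The effect of __all__ described above can be tried without touching a real project. The sketch below writes a throwaway package into a temporary directory; the package name demo_pkg1 and the helper build_demo_package are made up for this illustration, and module1 here exposes only func2.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a throwaway package called demo_pkg1 (hypothetical name) under root."""
    pkg = root / "demo_pkg1"
    pkg.mkdir()
    # __all__ in __init__.py limits what `from demo_pkg1 import *` brings in.
    (pkg / "__init__.py").write_text("__all__ = ['module1', 'module2']\n")
    (pkg / "module1.py").write_text(
        "__all__ = ['func2']\n"
        "def func1():\n    return 'func1'\n"
        "def func2():\n    return 'func2'\n"
    )
    (pkg / "module2.py").write_text("VALUE = 2\n")
    (pkg / "module3.py").write_text("VALUE = 3\n")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Run the wildcard imports in separate namespaces so the result can be inspected.
        star_pkg = {}
        exec("from demo_pkg1 import *", star_pkg)
        star_mod = {}
        exec("from demo_pkg1.module1 import *", star_mod)
        # Explicit imports are not restricted by __all__.
        module3 = importlib.import_module("demo_pkg1.module3")
        from demo_pkg1.module1 import func1
    finally:
        sys.path.remove(tmp)

print("module3" in star_pkg)  # False: not listed in the package's __all__
print("func1" in star_mod)    # False: not listed in module1's __all__
print(func1(), module3.VALUE)  # explicit imports still work
```

The wildcard imports are run through exec only so that the imported names land in a dictionary that can be checked; in normal code the import statement is written directly.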
1.3 Delete a project
Close the project in PyCharm, then go to the project's location on disk and delete the project files.
1.4 Shortcuts
https://blog.csdn.net/weixin_37292229/article/details/81737194
Run the py file: ctrl + shift + F10
Debug mode: shift + F9
Reformat to PEP 8 style: ctrl + shift + alt + l
Comment: ctrl + /
Expand code: ctrl + shift + +
Collapse code: ctrl + shift + -
Batch indentation (also called "alignment")
Indent: Tab
Unindent: Shift+Tab
1.5 venv virtual environment
PyCharm works with the virtual environments (venv) that Python 3 provides. The third-party libraries inside each virtual environment are isolated from one another, which makes it possible to develop or run projects with different Python versions on the same machine. To leave a venv in the terminal, run deactivate.
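A quick way to see which environment the interpreter is running in is to compare sys.prefix with sys.base_prefix. This is a small sketch; the function name in_virtualenv is made up for the example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


print("inside a venv:", in_virtualenv())
print("packages are installed under:", sys.prefix)
```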
1.6 Installing third-party libraries with pip
Change the index URL to a mirror in China to speed up downloads
For a single command: -i https://pypi.tuna.tsinghua.edu.cn/simple
Permanent change:
On Linux, edit ~/.pip/pip.conf (create it if it does not exist) and set index-url to the mirror address, as follows:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
On Windows, create a pip directory directly under the user directory, e.g. C:\Users\xx\pip, and create the file pip.ini in it with the following content:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
pip install lib_name
# other commands
list
download
uninstall
show
check
search
2. Python basics
2.1 Naming convention
Naming rules: a name consists of letters, digits and underscores and must start with a letter or an underscore (preferably a letter); it must not be a Python keyword. Names that begin with an underscore have a special meaning, so use them with caution.
# list the Python keywords
from keyword import kwlist
print(kwlist)
['False', 'None', 'True', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']
Naming styles:
- Snake case: student_name
- Upper camel case: StudentName
- Lower camel case: studentName
Variable, method and module names are generally lowercase snake case; constants are all uppercase; class names use upper camel case.
Names beginning with an underscore carry a special meaning and are generally used only where the context calls for it.
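The naming rules can be checked in code. The sketch below combines str.isidentifier() and keyword.iskeyword() from the standard library; the helper names and the two regular expressions are illustrative choices, not an official definition of the styles.

```python
import keyword
import re

# Illustrative patterns for two of the naming styles listed above.
SNAKE_CASE = re.compile(r"[a-z_][a-z0-9_]*")
UPPER_CAMEL = re.compile(r"(?:[A-Z][a-z0-9]*)+")


def is_valid_name(name: str) -> bool:
    """A usable name is a legal identifier that is not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


def is_snake_case(name: str) -> bool:
    return is_valid_name(name) and SNAKE_CASE.fullmatch(name) is not None


def is_upper_camel_case(name: str) -> bool:
    return is_valid_name(name) and UPPER_CAMEL.fullmatch(name) is not None


for candidate in ["student_name", "StudentName", "2nd_student", "class"]:
    print(candidate, is_valid_name(candidate))
```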
2.2 Quotes and backslash
Single and double quotes are equivalent. When a string contains single / double quotes, enclose it in double / single quotes.
Triple quotes have 2 uses:
(1) Preserving text layout such as line breaks, or holding a string that contains both single and double quotes
(2) Comments (docstrings)
Line continuation inside quotes: end the line with a \ before wrapping; the printed string contains no line break, and the line of code does not get too long
Backslash usage
- Gives the character that follows a special meaning: adding it before n, i.e. \n, represents a newline
- Turns a special sequence back into ordinary characters: adding another backslash in front of \n, i.e. \\n, represents a literal backslash followed by n
- r + string (a raw string) disables escaping. Note that a raw string cannot end with an odd number of \, otherwise a syntax error is raised
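The quoting and backslash rules above can be seen in a few lines. This is a minimal sketch; the variable names are arbitrary.

```python
# Pick the quote style that does not clash with the quotes inside the text.
single = 'She said "hi"'
double = "It's fine"
both = '''It's "fine"'''

# A backslash at the end of the line continues the literal without adding a newline.
continued = "first part, \
second part"

newline = "a\nb"   # 3 characters: a, newline, b
literal = "a\\nb"  # 4 characters: a, backslash, n, b
raw = r"a\nb"      # raw string: same 4 characters as `literal`

# A raw string cannot end with a single backslash, so append it separately.
raw_path = r"C:\temp" "\\"

print(continued)
print(len(newline), len(literal), raw == literal)
print(raw_path)
```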
2.3 Data types
type(str1) returns the data type
isinstance(a, type) checks whether a is an instance of the given type
Mutable / immutable types
Original address: http://www.cnblogs.com/huamingao/p/5809936.html
Mutable types vs immutable types
Mutable types: list, dictionary
Immutable types: number, string, tuple
Mutable or immutable here refers to whether the content (value) stored at that memory location can be changed
Finding the minimum of three numbers: x if x < y and x < z else y if y < z else z
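A short sketch of the points above: id() stays the same when a list is changed in place, while "changing" a string creates a new object. The function min_of_three simply wraps the conditional expression from the text; its name is made up here.

```python
def min_of_three(x, y, z):
    """Chained conditional expression from the text, wrapped in a function."""
    return x if x < y and x < z else y if y < z else z


# Mutable: the list is modified in place, so its identity does not change.
numbers = [1, 2]
list_id_before = id(numbers)
numbers.append(3)
list_unchanged = id(numbers) == list_id_before

# Immutable: += builds a new string and rebinds the name.
text = "ab"
original = text
text += "c"

print(type(numbers), isinstance(numbers, list))
print(list_unchanged, original, text)
print(min_of_three(3, 1, 2))
```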
2.3.1 Numbers
1. Types
- Integer int
Small integer pool: integers from -5 to 256 are objects created in advance and shared. (The effect below may not reproduce when the code is run as a script in PyCharm: a script is compiled as a whole, so equal integer constants outside this range can also end up shared.)
Use id() to check whether two names point to the same memory address. == tests whether the values are equal; is tests whether the memory address is the same
>>> a=100
>>> b=100
>>> id(a)
140708647059376
>>> id(b)
140708647059376
>>> a==b
True
>>> a is b
True
>>> c=1000
>>> d=1000
>>> id(c)
2349952671472
>>> id(d)
2349952671280
>>> c == d
True
>>> c is d
False
- Float float
About precision
1. round(float_num, n) rounds to n decimal places
2. The floor and ceil functions of the math module round down and up
3. The decimal module offers higher precision
4. Specify the precision when formatting the output
1. round
In [3]: a = 21.2345
In [4]: round(a, 2)
Out[4]: 21.23
2. %.nf
In [5]: b = '%.2f' % a
In [6]: b
Out[6]: '21.23'
In [7]: b = float('%.2f' % a)
3. '{:.nf}'.format()
In [10]: b = '{:.2f}'.format(a)
In [11]: b
Out[11]: '21.23'
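The four approaches to precision can be put side by side. This is a sketch using only the standard library; the rounding mode ROUND_HALF_UP is just one of the modes decimal offers and is chosen here for illustration.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

a = 21.2345

rounded = round(a, 2)                     # 21.23
lower, upper = math.floor(a), math.ceil(a)  # 21 and 22
formatted = "{:.2f}".format(a)            # '21.23' (a string, not a float)

# Binary floats cannot represent every decimal fraction exactly.
float_sum = 0.1 + 0.2
exact_sum = Decimal("0.1") + Decimal("0.2")

# round() on a float can surprise; Decimal lets the rounding mode be chosen.
float_round = round(2.675, 2)
half_up = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(rounded, lower, upper, formatted)
print(float_sum, exact_sum)
print(float_round, half_up)
```

Decimal values are built from strings here; building them from floats would carry the binary representation error along.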
- Boolean bool
True: any non-zero number and any non-empty string or container is True (including ' ', a string holding one space); note the capitalized first letter
False: None, 0, '', [], {}, ()
- Complex complex
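The truth values listed above, plus a quick look at complex, can be checked directly. A minimal sketch; the lists of sample values are illustrative.

```python
falsy_values = [None, 0, 0.0, "", [], {}, (), set()]
truthy_values = [1, -1, " ", "0", [0], {"k": None}, (None,)]

all_falsy = not any(bool(v) for v in falsy_values)
all_truthy = all(bool(v) for v in truthy_values)

# bool is a subclass of int, so True and False behave like 1 and 0 in arithmetic.
bool_is_int = issubclass(bool, int)
total = True + True

# complex numbers have a real and an imaginary part.
z = complex(1, 2)
magnitude = abs(3 + 4j)

print(all_falsy, all_truthy, bool_is_int, total)
print(z.real, z.imag, magnitude)
```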
2. Operators
Arithmetic operators: + - * / % // **
Assignment operators: = += -= *= /=
Comparison operators: > < >= <= != ==
Logical operators: and or not
Membership operators: in, not in (not supported by numeric types)
Identity operators: is, is not (test whether two names reference the same object, i.e. point to the same block of memory)
Priority: in arithmetic, () can be used to mark what is calculated first
a/b full result
a//b result rounded down to an integer
a%b remainder
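A compact run through the operator groups above. The values are arbitrary; the comments show what each expression evaluates to.

```python
a, b = 7, 2
true_div = a / b     # 3.5, always a float
floor_div = a // b   # 3, rounded down
remainder = a % b    # 1
power = a ** b       # 49
neg_floor = -7 // 2  # -4: rounds toward negative infinity, not toward zero

counter = 10
counter += 5         # 15
counter *= 2         # 30

first = [1, 2]
second = [1, 2]
alias = first
equal_values = first == second      # True: same content
same_object = first is second       # False: two separate lists
alias_same = alias is first         # True: both names point to one list
has_two = 2 in first                # True
no_three = 3 not in first           # True

# Membership is not defined for plain numbers.
membership_error = None
try:
    1 in 5
except TypeError as exc:
    membership_error = str(exc)

# Parentheses override the default precedence.
without_parens = 2 + 3 * 4    # 14
with_parens = (2 + 3) * 4     # 20

print(true_div, floor_div, remainder, power, neg_floor)
print(equal_values, same_object, alias_same, membership_error)
```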
3. random module
randint(a, b) returns a random integer between a and b, including both a and b
random() returns a random float in the range 0 to 1, including 0 but not 1
uniform(a, b) returns a random float between a and b; it is implemented on top of random()
choice(seq) returns a random element from the non-empty sequence seq
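The four functions can be tried with a seeded generator so that runs are repeatable. This sketch uses a private random.Random instance; the seed 42 and the sample sizes are arbitrary.

```python
import random

rng = random.Random(42)  # private, seeded generator; the module-level functions work the same way

dice = [rng.randint(1, 6) for _ in range(1000)]          # both 1 and 6 can appear
fractions = [rng.random() for _ in range(1000)]          # 0 <= x < 1
temps = [rng.uniform(36.0, 37.5) for _ in range(1000)]   # floats between the two bounds
colors = ["red", "green", "blue"]
pick = rng.choice(colors)                                # one element of the sequence

print(min(dice), max(dice))
print(min(fractions) >= 0, max(fractions) < 1)
print(pick)
```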
2.3.2 String
1. Indexing and slicing
Index: strings support subscript indexing, starting from 0 going left to right and from -1 going right to left
Slice: includes the start and excludes the end (closed on the left, open on the right); a step can be set
Concatenation: strings can be joined with +; repetition is written with *
>>> str1='abcdefg'
>>> str1[0]
'a'
>>> str1[2]
'c'
>>> str1[-1]
'g'
>>> str1[1:]
'bcdefg'
>>> str1[1:3]
'bc'
>>> str1[1:7:3]
'be'
>>> str1[::-1]
'gfedcba'
>>> str2='123'
>>> str1+str2
'abcdefg123'
>>> str1*2
'abcdefgabcdefg'
2. Common methods
Note that the string data type is immutable; none of these operations modifies the original string
'-'.join(iterable) joins the items of iterable with - between them and returns the new string
str.find(sub) returns the index of sub, or -1 if it is absent
str.index(sub) returns the index of sub, or raises an exception (ValueError) if it is absent
str.split(sub[, count]) cuts str at each sub; by default it cuts at whitespace (spaces, newlines, tabs)
str.strip() removes leading and trailing whitespace
str.replace(old, new[, count]) replaces substrings
>>> str1='abcdefg'
>>> '-'.join(str1)
'a-b-c-d-e-f-g'
>>> str1
'abcdefg'
>>> str3='11 A2\t3A3\n4A4'
>>> str3.split()
['11', 'A2', '3A3', '4A4']
>>> str3
'11 A2\t3A3\n4A4'
>>> str3.split('A')
['11 ', '2\t3', '3\n4', '4']
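The remaining methods from the list (find, index, strip, replace, and split with a count) behave as follows. A small sketch with an arbitrary sample string.

```python
text = "  hello world, hello python  "

stripped = text.strip()                         # whitespace removed at both ends
found = stripped.find("hello")                  # 0: index of the first match
missing = stripped.find("java")                 # -1 when absent

# index() is like find() but raises ValueError when the substring is absent.
index_failed = False
try:
    stripped.index("java")
except ValueError:
    index_failed = True

replaced_once = stripped.replace("hello", "bye", 1)  # only the first match is replaced
split_once = stripped.split(" ", 1)                  # at most one cut
with_empty = "a,b,,c".split(",")                     # an explicit separator keeps empty pieces
joined = ", ".join(["x", "y", "z"])

print(repr(stripped), found, missing, index_failed)
print(replaced_once)
print(split_once, with_empty, joined)
print(repr(text))  # the original string is unchanged
```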
3. Output formatting
str.format() is a powerful way to format output. It uses {} as the placeholder; the older % placeholder is gradually being replaced
{:.2f} keeps 2 decimal places
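A few common format specifications, shown with str.format() and, for comparison, the older % style. The f-string on the last line is an addition not mentioned in the notes; it accepts the same format specifications.

```python
name, score = "Tom", 92.456

by_position = "{} scored {:.2f}".format(name, score)  # 'Tom scored 92.46'
by_index = "{1} {0}".format("a", "b")                 # 'b a'
aligned = "{n:>6}|{n:<6}|{n:^6}".format(n="ab")       # right, left, centered in width 6
zero_padded = "{:05d}".format(42)                     # '00042'
percent = "{:.1%}".format(0.256)                      # '25.6%'

old_style = "%s scored %.2f" % (name, score)          # older % placeholders
f_string = f"{name} scored {score:.2f}"               # same spec inside an f-string

print(by_position)
print(by_index, aligned, zero_padded, percent)
print(old_style == by_position == f_string)
```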
Python resolves names according to the LEGB rule: Local, Enclosing, Global, Built-in.
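The lookup order can be observed by giving the same name a different value in each scope. This is a sketch; all function and variable names are made up for the illustration.

```python
label = "global"


def outer():
    label = "enclosing"

    def inner_local():
        label = "local"  # Local: found first
        return label

    def inner_enclosing():
        return label     # not local, so the enclosing function's value is used

    return inner_local(), inner_enclosing()


def read_global():
    return label         # neither local nor enclosing, so the module-level value is used


def read_builtin():
    return len("abc")    # len is not defined in this file; it comes from the built-in scope


def make_counter():
    count = 0

    def increment():
        nonlocal count   # rebind the name in the enclosing scope instead of creating a local
        count += 1
        return count

    return increment


print(outer())
print(read_global(), read_builtin())
```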
List
Tuple
Dictionary
Set
3. Encoding formats
1.7 Encoding format
https://zhuanlan.zhihu.com/p/67865867
https://zhuanlan.zhihu.com/p/25148581
Python 2 uses ASCII as its default source encoding, because Python appeared before Unicode was widely adopted. ASCII does not support Chinese, so a declaration such as # coding=utf8 has to be added at the top of the file to handle Chinese.
Python 3 uses UTF-8 as its default encoding.
Concepts
Byte
The unit of computer storage; one byte equals 8 bits
Character
A unit of information; an English letter, a digit or a Chinese character each counts as one character
Character set
A collection of characters is called a character set. The ASCII character set, for example, contains only 128 characters, including digits, letters and symbols but no Chinese; GB2312 contains roughly 7,000 characters and GBK more than 20,000 Chinese characters
Character code
The number of a character inside a character set. In the ASCII character set the letter a has number 97
Character encoding
Maps the characters of a character set to a byte stream. For example, the letter a has number 97 in the ASCII character set; encoded as a single byte it is 0x61, and the bits 01100001 are what gets written to the storage device
Encoding and decoding
Converting characters into a byte stream is encoding; converting a byte stream back into characters is decoding. Different encodings turn the same character into different bytes: ASCII uses a single byte, UTF-8 a variable number of bytes, UCS-2 two bytes and UCS-4 four bytes.
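In Python 3 these concepts map onto str (characters) and bytes (byte stream). A minimal sketch, assuming the built-in utf-8, gbk and ascii codecs:

```python
text = "a好"

# Character code: the number of a character in the character set.
code_a = ord("a")                   # 97
hex_a = hex(code_a)                 # '0x61'
bits_a = format(code_a, "08b")      # '01100001'

# Encoding: characters -> bytes. Decoding: bytes -> characters.
utf8_bytes = text.encode("utf-8")   # b'a\xe5\xa5\xbd'
gbk_bytes = text.encode("gbk")      # same text, different bytes
round_trip = utf8_bytes.decode("utf-8")

# Decoding with the wrong codec does not give the text back.
mojibake = utf8_bytes.decode("gbk", errors="replace")

# ASCII has no Chinese characters, so encoding fails.
ascii_failed = False
try:
    "好".encode("ascii")
except UnicodeEncodeError:
    ascii_failed = True

print(code_a, hex_a, bits_a)
print(utf8_bytes, gbk_bytes, round_trip == text)
print(mojibake != text, ascii_failed)
```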
ASCII encoding
An early encoding format in which one byte (8 bits) represents one character (no Chinese). It defines 128 characters in total, 2 to the power of 7, so 7 bits are enough; this is called basic ASCII, and the 8th bit is used for extensions. However, different countries extended it differently, which made interchange very confusing.
ASCII table
0 to 31 and 127 (33 in total) are control characters or special communication characters (the rest are displayable characters). Control characters include LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), BEL (bell) and so on. Special communication characters include SOH (start of heading), EOT (end of transmission), ACK (acknowledgment) and so on;
ASCII values 8, 9, 10 and 13 are the backspace, tab, line feed and carriage return characters respectively. They have no specific glyph, but affect how text is displayed, with the exact effect depending on the application.
32 to 126 (95 in total) are printable characters (32 is the space), of which 48 to 57 are the ten Arabic numerals 0 to 9;
65 to 90 are the 26 uppercase letters and 97 to 122 the 26 lowercase letters; the rest are punctuation and arithmetic symbols
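The ranges in the ASCII table can be confirmed with chr() and ord(). A short sketch; the constants from the string module are used only for comparison.

```python
import string

control_codes = [c for c in range(128) if c < 32 or c == 127]
printable_chars = [chr(c) for c in range(32, 127)]

digits = "".join(chr(c) for c in range(48, 58))
uppercase = "".join(chr(c) for c in range(65, 91))
lowercase = "".join(chr(c) for c in range(97, 123))

# Backspace, tab, line feed and carriage return.
special_codes = [ord(ch) for ch in "\b\t\n\r"]

print(len(control_codes), len(printable_chars))
print(digits, uppercase, lowercase)
print(special_codes)
```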
GB2312, GBK
GB2312 is a double-byte character encoding defined in China that accommodates roughly 7,000 characters and is compatible with ASCII. GBK is an extension of GB2312 that adds many more characters; an English character is represented by one byte and a Chinese character by two bytes. But GBK only solves the problem for Chinese users; interchange across the world is still a problem, hence Unicode
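The byte counts can be checked with Python's built-in gbk and gb2312 codecs. A minimal sketch:

```python
english = "a".encode("gbk")
chinese = "好".encode("gbk")
mixed = "abc好".encode("gbk")

# ASCII compatibility: an English letter has the same single byte in ASCII, GB2312 and GBK.
ascii_compatible = english == "a".encode("ascii") == "a".encode("gb2312")

print(len(english), len(chinese), len(mixed))
print(ascii_compatible)
```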
Unicode
Created to overcome the limitations of traditional character encodings. It encodes and organizes most of the world's writing systems and can hold more than one million code points, making it easier for computers to process and display text.
It started out as a 16-bit code space in which each character occupies 2 bytes. The ways of implementing Unicode are called Unicode Transformation Formats (UTF), and Unicode is a superset of ASCII. There are 2 fixed-width forms, UCS-2 and UCS-4; because the world has far more than 65,535 characters, UCS-4 represents each character with 4 bytes.
However, a fixed number of bytes per character is wasteful: a letter, for example, does not need 2 bytes.
And if some characters took two bytes and others four, the computer, which stores everything as a string of 0s and 1s, could not tell whether a run of four bytes is one character or two, or whether two bytes are a Chinese character or English characters.
Hence UTF-8
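The waste of fixed-width encodings shows up when the same text is encoded in different ways. In this sketch utf-16-be and utf-32-be stand in for UCS-2 and UCS-4; for the characters used here they produce the same byte counts, but that substitution is an assumption of the example.

```python
encodings = ("utf-8", "utf-16-be", "utf-32-be")

english_sizes = {enc: len("hello".encode(enc)) for enc in encodings}
chinese_sizes = {enc: len("好".encode(enc)) for enc in encodings}

print(english_sizes)  # bytes needed for 5 English letters
print(chinese_sizes)  # bytes needed for 1 Chinese character
```

For plain English text the 2-byte form doubles the size and the 4-byte form quadruples it compared with one byte per letter.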
UTF-8
UTF-8 is one implementation of Unicode: a variable-length character encoding that uses 1 to 4 bytes per character according to fixed rules. An English letter takes 1 byte, a Chinese character usually 3 bytes.
UTF-8 rule: for a character of n bytes (n > 1), the first n bits of the first byte are 1 and bit n + 1 is 0; the first two bits of every following byte are 10. All remaining bits are filled with the bits of the character's Unicode code point
Take "好" as an example. Its Unicode code point is 597D, which falls in the range 0000 0800-0000 FFFF, so UTF-8 needs three bytes to store it. 597D in binary is 0101100101111101; filling it into 1110xxxx 10xxxxxx 10xxxxxx gives 11100101 10100101 10111101, which is e5a5bd in hexadecimal. So the code point U+597D of "好" corresponds to the UTF-8 encoding "E5A5BD". This can be verified with Python code (the session below is Python 2):
>>> a = u"好"
>>> a
u'\u597d'
>>> b = a.encode('utf-8')
>>> len(b)
3
>>> b
'\xe5\xa5\xbd'
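The same check in Python 3, together with a hand-written version of the 3-byte rule. The function utf8_encode_3byte is a sketch written for this note and covers only the range U+0800 to U+FFFF; real code should call str.encode.

```python
def utf8_encode_3byte(ch: str) -> bytes:
    """Encode one character from U+0800..U+FFFF by filling 1110xxxx 10xxxxxx 10xxxxxx."""
    code = ord(ch)
    if not 0x0800 <= code <= 0xFFFF or 0xD800 <= code <= 0xDFFF:
        raise ValueError("only non-surrogate code points U+0800..U+FFFF use the 3-byte form")
    byte1 = 0b11100000 | (code >> 12)              # top 4 bits of the code point
    byte2 = 0b10000000 | ((code >> 6) & 0b111111)  # middle 6 bits
    byte3 = 0b10000000 | (code & 0b111111)         # lowest 6 bits
    return bytes([byte1, byte2, byte3])


a = "好"
code_point = hex(ord(a))      # '0x597d'
builtin = a.encode("utf-8")   # b'\xe5\xa5\xbd'
manual = utf8_encode_3byte(a)

print(code_point, builtin, len(builtin))
print(manual.hex(), manual == builtin)
```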