1.ModuleNotFoundError: No module named 'numpy'
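Cause: the numpy package is not installed in the Python environment.
Solution: install it with pip install numpy.
Note: numpy may already be installed in the cmd environment while PyCharm still reports No module named 'numpy'.
Solution: this happens when the interpreter selected in PyCharm under File -> Settings -> Project Interpreter does not have numpy; install the numpy package from that page.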
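The cmd/PyCharm mismatch usually means the two environments run different interpreters. A minimal sketch for checking this from inside either one; the helper name `is_installed` is made up for illustration:

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


# Path of the interpreter actually running this script; compare the value
# printed from cmd with the one printed from PyCharm.
print(sys.executable)
print("numpy installed:", is_installed("numpy"))
```

If the two paths differ, the package was installed into a different interpreter than the one PyCharm is using.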
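2.Installing Python packages is slow
Cause: pip installs from https://pypi.python.org/simple by default, which is slow to reach from networks in China.
Solution: change pip's default index to a mirror hosted in China
http://pypi.douban.com/simple/ Douban
http://mirrors.aliyun.com/pypi/simple/ Alibaba Cloud
http://pypi.hustunique.com/simple/ Huazhong University of Science and Technology
http://pypi.sdutlinux.org/simple/ Shandong University of Technology
http://pypi.mirrors.ustc.edu.cn/simple/ University of Science and Technology of China
https://pypi.tuna.tsinghua.edu.cn/simple Tsinghua University
On Windows, create a pip directory under C:\Users\<username>, create a pip.ini configuration file inside it, and write the following content to the file: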
[global]
index-url=http://mirrors.aliyun.com/pypi/simple/
[install]
trusted-host=mirrors.aliyun.com
In PyCharm, the pip source can also be changed under File -> Settings -> Project Interpreter.
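The pip.ini file can also be generated from a script. This is only a sketch: the function name `write_pip_ini` is hypothetical, and it writes to a temporary directory so that nothing on the real system changes.

```python
import configparser
import tempfile
from pathlib import Path


def write_pip_ini(directory, index_url, trusted_host):
    """Write a pip.ini with the given mirror into `directory` and return its path."""
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    config["install"] = {"trusted-host": trusted_host}
    path = Path(directory) / "pip.ini"
    with open(path, "w", encoding="utf-8") as f:
        config.write(f)
    return path


# A temporary directory stands in for the real target, which on Windows
# would be the pip directory under the user's home folder.
with tempfile.TemporaryDirectory() as tmp:
    ini_path = write_pip_ini(
        tmp, "http://mirrors.aliyun.com/pypi/simple/", "mirrors.aliyun.com"
    )
    ini_text = ini_path.read_text(encoding="utf-8")

print(ini_text)
```

The trusted-host entry is there because the Aliyun mirror URL uses plain http.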
Anaconda mirror configuration:
Create a configuration file named .condarc under C:\Users\<your username> and write the following content:
channels:
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msys2/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
show_channel_urls: true
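3.%matplotlib inline raises an error
%matplotlib inline: in a Jupyter notebook, renders matplotlib figures directly below the cell
It is only needed when running in a notebook; in a regular .py file the line is a SyntaxError.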
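For code that has to run both in a notebook and as a plain script, one option is to apply the magic only when IPython is present. A minimal sketch, assuming the usual behaviour that IPython and Jupyter inject a `get_ipython` name; the helper `running_in_ipython` is made up for illustration:

```python
def running_in_ipython():
    """Return True when the code runs under IPython or a Jupyter kernel."""
    try:
        get_ipython  # this name is injected only by IPython/Jupyter
    except NameError:
        return False
    return True


if running_in_ipython():
    # Programmatic equivalent of writing `%matplotlib inline` in a cell.
    get_ipython().run_line_magic("matplotlib", "inline")
else:
    print("Plain Python: skip the magic and call plt.show() after plotting.")
```

In a plain script, calling `plt.show()` (with `plt` being `matplotlib.pyplot`) opens the figure window instead.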
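4.Remove punctuation from a string
import re
import string
# Named text so the variable does not shadow the string module
text = "Hello, how are you!"
# re.escape keeps characters such as ] and \ literal inside the character class
print(re.sub(r'[{}]+'.format(re.escape(string.punctuation)), "", text))
# Explicit character class that also covers common Chinese punctuation
print(re.sub(r"[+.!/_,$%^*(\"']+|[+——!,。??、~@#¥%……&*()]+", "", text))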
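As an alternative to regular expressions, `str.translate` can delete the same characters. A small sketch that handles ASCII punctuation only; the variable names are illustrative:

```python
import string

text = "Hello, how are you!"

# Translation table that deletes every ASCII punctuation character.
remove_punct = str.maketrans("", "", string.punctuation)
cleaned = text.translate(remove_punct)
print(cleaned)  # Hello how are you
```

Chinese punctuation would have to be appended to the third argument of `str.maketrans`, since `string.punctuation` contains ASCII characters only.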
5.Matrix multiplication
import numpy as np
# 2-D array: 2 x 3
two_dim_matrix_one = np.array([[1, 2, 3], [4, 5, 6]])
# 2-D array: 3 x 2
two_dim_matrix_two = np.array([[1, 2], [3, 4], [5, 6]])
two_multi_res = np.dot(two_dim_matrix_one, two_dim_matrix_two)
print('two_multi_res: %s' %(two_multi_res))
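To see what np.dot computes for two 2-D arrays, here is a pure-Python reference sketch using the same numbers. The function name `matmul` is made up for illustration and this is not how NumPy implements it:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of a must equal rows of b")
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1, 2, 3], [4, 5, 6]]      # 2 x 3
b = [[1, 2], [3, 4], [5, 6]]    # 3 x 2
product = matmul(a, b)
print(product)  # [[22, 28], [49, 64]]
```

For 2-D NumPy arrays, `two_dim_matrix_one @ two_dim_matrix_two` gives the same result as np.dot.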
6.Using len
n_greater_50k = len(data[data['income']=='>50K'].index)
Equivalent to:
n_greater_50k = 0
for info in data['income']:
    if info == '>50K':
        n_greater_50k += 1
7.Convert binary labels to 0/1 encoding
# Method 1
labels = {'<=50K':0,'>50K':1}
income = income_raw.map(labels)
# Method 2
income = (income_raw=='>50K').astype(int)
# Method 3
income = income_raw.apply(lambda x: 0 if x == '<=50K' else 1)
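The three methods agree on clean data but differ when an unexpected label appears. A small sketch on made-up sample data, assuming `income_raw` is a pandas Series:

```python
import pandas as pd

# Hypothetical sample; the last value is a deliberately unexpected label.
income_raw = pd.Series(["<=50K", ">50K", ">50K", "<=50K", "unknown"])

by_map = income_raw.map({"<=50K": 0, ">50K": 1})
by_compare = (income_raw == ">50K").astype(int)
by_apply = income_raw.apply(lambda x: 0 if x == "<=50K" else 1)

print(by_map.tolist())      # unmapped label becomes NaN
print(by_compare.tolist())  # unmapped label becomes 0
print(by_apply.tolist())    # unmapped label becomes 1
```

Method 1 is the only one that exposes bad labels (as NaN), which can help catch dirty data early.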
8.Convert strings to lowercase
lower_case_documents = []
for i in documents:
    lower_case_documents.append(i.lower())
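The same loop can be written as a list comprehension. A short sketch; the contents of `documents` are made-up sample data:

```python
documents = ["Hello, how are you!", "Win money, WIN from home."]  # sample data

# Build the lowercased list in one expression instead of appending in a loop.
lower_case_documents = [doc.lower() for doc in documents]
print(lower_case_documents)
```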
9.Error when running jupyter notebook ***
ValueError: Please install nodejs 5+ and npm before continuing installation. nodejs may be installed using conda or directly from the nodejs website.
Install Node.js: download and install 8.11.2 LTS from https://nodejs.org/en/
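10.Get the key with the largest or smallest value in a dictionary
d = {'u': -1, 'r': 0, 'd': 0, 'l': 0}  # named d so the built-in dict is not shadowed
max(d, key=d.get)  # the key= argument is required; returns 'r'
or
max(d.items(), key=lambda x: x[1])  # returns the (key, value) pair ('r', 0)
min(d, key=lambda x: d[x])  # returns 'u'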
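The example dictionary has three keys tied at the maximum value 0, and max returns only the first of them. A sketch for collecting every tied key; the name `scores` is illustrative:

```python
scores = {'u': -1, 'r': 0, 'd': 0, 'l': 0}

best = max(scores.values())
# Collect every key whose value equals the maximum, in insertion order.
best_keys = [key for key, value in scores.items() if value == best]
print(best_keys)  # ['r', 'd', 'l']
```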
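11.Check whether a string contains a substring
s = 'hello world'
if 'world' in s:                # preferred: plain membership test
    print('found')
if s.find('world') != -1:       # find() returns the start index (6 here), or -1 if absent
    print('found')
if s.index('world') > -1:       # index() raises ValueError if the substring is absent
    print('found')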
12. Get the size of a dictionary
d = {'u': -1, 'r': 0, 'd': 0, 'l': 0}
keys = d.keys()
len(keys)  # same as len(d)