I rewrote the code from <Machine Learning in Action> myself. The book itself is a good book, but I have to complain that its sample code is really badly written.
ch2
2-1
from ... import vs. import https://www.zhihu.com/question/38857862
tile https://docs.scipy.org/doc/numpy/reference/generated/numpy.tile.html similar to broadcast
.sum(axis=1) sums across each row of a two-dimensional array (one result per row); axis=0 sums down each column (one result per column)
.get(key, default) returns the value for key from the dictionary, or default if the key does not exist
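A minimal sketch of how tile and .sum(axis=...) fit together when computing distances. The data and query values are made up for illustration and are not taken from the book.

```python
import numpy as np

# Hypothetical data: three training points and one query point.
data = np.array([[1.0, 1.0], [2.0, 2.0], [4.0, 5.0]])
query = np.array([1.0, 1.0])

# tile repeats the query once per row so its shape matches data.
tiled = np.tile(query, (data.shape[0], 1))

# axis=1 collapses the columns: one sum per row.
sq_dist = ((tiled - data) ** 2).sum(axis=1)
distances = sq_dist ** 0.5

# axis=0 collapses the rows: one sum per column.
col_sums = data.sum(axis=0)
```

Broadcasting would give the same differences without tile: query - data has the same values as tiled - data.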
sorted https://docs.python.org/3/howto/sorting.html Figure out how to sort tuples, objects, dictionaries
.items() https://docs.python.org/3/library/stdtypes.html#mapping-types-dict By default, iterating over a dict yields its keys. To iterate over the values, use for value in d.values(); to iterate over keys and values at the same time, use for k, v in d.items()
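A small sketch of how .get, .items() and sorted can combine into a majority vote. The labels list is hypothetical, and the variable names are my own.

```python
import operator

# Hypothetical labels of the nearest neighbors.
labels = ['A', 'B', 'A', 'B', 'B']

# .get(label, 0) returns 0 the first time a label is seen.
class_count = {}
for label in labels:
    class_count[label] = class_count.get(label, 0) + 1

# Sort the (label, count) tuples by count, largest first.
sorted_counts = sorted(class_count.items(),
                       key=operator.itemgetter(1),
                       reverse=True)
top_label = sorted_counts[0][0]
```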
To prevent problems when two modules import each other, Python imports every module only once by default. If you need to re-import a module (for example after editing it),
you can call the built-in reload() directly in Python 2.7; in Python 3 you can use the following:
Method 1: Basic method
from importlib import reload  # the older imp module is deprecated and was removed in Python 3.12
reload(module)
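A runnable sketch of the Python 3 approach. The json module is only a stand-in for whatever module you are editing.

```python
import importlib
import json  # stand-in; any already-imported module works

# reload re-executes the module's code and returns the same module object.
reloaded = importlib.reload(json)
```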
2-2
open returns a file object; the default mode is read-only ('r'). Remember to close it after use (or open it in a with block) https://docs.python.org/3/tutorial/inputoutput.html
.readlines() reads all lines (until the end of the file, EOF) and returns them as a list, which can be processed with Python's for ... in ... construct
zeros https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.zeros.html
.strip https://www.tutorialspoint.com/python/string_strip.htm
.split http://python-reference.readthedocs.io/en/latest/docs/str/split.html
map http://www.runoob.com/python/python-func-map.html
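The calls above (open, .readlines(), zeros, .strip, .split, map) can be sketched together as a small file parser. The file contents, the tab-separated layout and the variable names are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# Write a tiny tab-separated file so the example is self-contained.
sample = "1.0\t2.0\t3.0\tlabelA\n4.0\t5.0\t6.0\tlabelB\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

# The with block closes the file automatically.
with open(path) as f:
    lines = f.readlines()
os.remove(path)

features = np.zeros((len(lines), 3))  # one row per line, three features
labels = []
for i, line in enumerate(lines):
    fields = line.strip().split("\t")  # drop the newline, split on tabs
    # In Python 3 map returns an iterator, so wrap it in list().
    features[i, :] = list(map(float, fields[:3]))
    labels.append(fields[-1])
```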
.add_subplot http://blog.csdn.net/You_are_my_dream/article/details/53439518 The object-oriented approach is recommended
.scatter https://matplotlib.org/api/_as_gen/matplotlib.pyplot.scatter.html
2-3
.min() is the global minimum; .min(0) gives the minimum of each column; .min(1) gives the minimum of each row
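A short sketch of the three forms of .min, followed by a min-max normalization that uses the per-column values. The numbers are made up.

```python
import numpy as np

data = np.array([[1.0, 200.0],
                 [3.0, 400.0],
                 [5.0, 1000.0]])

global_min = data.min()   # single smallest value
col_min = data.min(0)     # one minimum per column
row_min = data.min(1)     # one minimum per row

# Min-max normalization per column; broadcasting handles the shapes.
ranges = data.max(0) - col_min
normed = (data - col_min) / ranges
```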
2-4
\ at the end of a line means the statement is too long and continues on the next line
print('%d and %f' % (a, b)) formatted print
2-5
input https://www.ibm.com/developerworks/cn/linux/l-python3-1/index.html
.readline http://www.runoob.com/python/file-readline.html
2-6
listdir http://www.runoob.com/python/os-listdir.html
ch3
3-1
.keys http://www.runoob.com/python/att-dictionary-keys.html
if currentLabel not in labelCounts.keys():
log https://docs.python.org/3.6/library/math.html
3-2
.extend http://www.runoob.com/python/att-list-extend.html
.append http://www.runoob.com/python/att-list-append.html
3-4
.count http://www.runoob.com/python/att-list-count.html
del (note the difference between .remove) http://www.jb51.net/article/35012.htm
subLabels = labels[:] makes a (shallow) copy of the list, so later changes to subLabels do not affect labels
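A small sketch contrasting .count, del, .remove and the slice copy. The label strings are hypothetical.

```python
votes = ['yes', 'no', 'no']
no_count = votes.count('no')   # number of occurrences of a value

labels = ['no surfacing', 'flippers', 'fish']
sub_labels = labels[:]         # shallow copy; labels stays untouched

del sub_labels[0]              # del removes by index
sub_labels.remove('fish')      # .remove removes the first matching value
```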
3-5
dict http://www.runoob.com/python/python-func-dict.html
.figure https://matplotlib.org/api/_as_gen/matplotlib.pyplot.figure.html
.clf https://matplotlib.org/api/_as_gen/matplotlib.figure.Figure.html#matplotlib.figure.Figure
.subplot https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplot.html Note the difference from .add_subplot; the latter is recommended
.annotate https://matplotlib.org/users/annotations.html#using-connectorpatch
3-6
.keys() http://www.runoob.com/python/att-dictionary-keys.html In Python 3 it returns a view object instead of a list; the view does not support indexing (convert it with list() before indexing)
.values() http://www.runoob.com/python/att-dictionary-values.html In Python 3 it returns a view object instead of a list; the view does not support indexing (convert it with list() before indexing)
type http://www.runoob.com/python/python-func-type.html
.__name__ is a built-in attribute of a class that holds the class name
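A sketch of indexing the result of .keys() in Python 3 and checking a type by name. The tree dictionary is a made-up example.

```python
tree = {'no surfacing': {0: 'no', 1: 'yes'}}

# Indexing the view directly fails in Python 3.
try:
    tree.keys()[0]
except TypeError:
    view_is_indexable = False
else:
    view_is_indexable = True

# Convert to a list first (next(iter(tree)) is an alternative).
first_key = list(tree.keys())[0]

# type(...).__name__ gives the class name as a string.
child_is_dict = type(tree[first_key]).__name__ == 'dict'
```

isinstance(tree[first_key], dict) is the more common way to write the last check.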
Spyder Shortcuts
Ctrl + 1: Comment/Uncomment
Ctrl + 4/5: block comment/block uncomment
Ctrl + s: save
3-7
.text https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.text.html#matplotlib.axes.Axes.text
3-8
.index http://www.runoob.com/python/att-list-index.html
3-9
pickle.dump & pickle.load https://blog.oldj.net/2010/05/26/python-pickle/ (code is a bit outdated) https://docs.python.org/3/library/pickle.html (current documentation)
open http://www.runoob.com/python/python-func-open.html
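A minimal sketch of storing and loading a tree with pickle in Python 3. The file must be opened in binary mode ('wb' / 'rb'); the tree contents and the temporary path are assumptions for illustration.

```python
import os
import pickle
import tempfile

tree = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}

# Hypothetical location: a file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'tree.pkl')

with open(path, 'wb') as f:   # binary write
    pickle.dump(tree, f)

with open(path, 'rb') as f:   # binary read
    restored = pickle.load(f)
```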
ch4
4-1
set http://blog.csdn.net/business122/article/details/7541486
list * 5 repeats the contents of the list 5 times, e.g. [0] * 5 gives [0, 0, 0, 0, 0]
4-2
list.append([1,2,3]) adds the whole list as one element; list.extend([1,2,3]) adds each of its elements
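A quick sketch of the difference between append and extend, plus list repetition.

```python
a = [0]
a.append([1, 2, 3])   # the whole list becomes one nested element

b = [0]
b.extend([1, 2, 3])   # each element is added separately

zeros = [0] * 5       # repetition, not arithmetic
```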
4-5
The r prefix before a string in Python makes it a raw string: backslashes are not treated as escape characters. It is often used in regular expressions. s = r'test\tddd'; print(s) outputs: test\tddd
python regular expressions (greedy matching) http://www.runoob.com/python/python-reg-expressions.html
.lower() string to lowercase .upper() uppercase
.read() returns a string containing everything in the file
range(1,26) covers 1 to 25 (the end value is excluded)
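A sketch tying together raw strings, regular expressions, .lower() and range. The sample sentence and the length filter are assumptions. I use \W+ rather than \W* because, since Python 3.7, re.split also splits on empty matches, so \W* would break the text into single characters.

```python
import re

s = r'test\tddd'   # raw string: the backslash and the t stay as two characters

text = 'This book is the BEST book on Python.'
# Split on runs of non-word characters, drop short tokens, lowercase the rest.
tokens = [tok.lower() for tok in re.split(r'\W+', text) if len(tok) > 2]

numbers = list(range(1, 26))   # 1 through 25
```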
random.shuffle() http://www.runoob.com/python/func-number-shuffle.html
Python string encoding and decoding https://www.cnblogs.com/evening/archive/2012/04/19/2457440.html
# -*- coding: UTF-8 -*- http://www.runoob.com/python/python-chinese-encoding.html
ANSI encoding under windows https://mozillazg.com/2013/09/python-windows-ansi.html
Python debugging (using the pdb package, very powerful) https://www.ibm.com/developerworks/cn/linux/l-cn-pythondebugger/
Python debugging (using the Spyder IDE) http://blog.csdn.net/qq_33039859/article/details/54645465
4-6
How to get Craigslist RSS feed http://brittanyherself.com/cgg/tutorial-how-to-subscribe-to-craigslists-rss-feeds/
feedparser (mainly need to understand the feed and entries fields) http://pythonhosted.org/feedparser/common-rss-elements.html http://blog.topspeedsnail.com/archives/8156
The book's authors use the RSS feeds at these URLs: https://newyork.craigslist.org/search/stp https://sfbay.craigslist.org/search/stp (found via ny.feed.link)
ny.feed.title ny.feed.title_detail (RSS feed) ny.feed.link ny.entries ny.entries[i].title ny.entries[i].summary ny.entries[i].link ny.entries[i].published (entries is a list, so index it before accessing an entry's fields)
min http://www.runoob.com/python/func-number-min.html
.remove (note the difference from del) http://www.runoob.com/python/att-list-remove.html
Python rounding https://www.cnblogs.com/lipijin/p/3714312.html In addition, int() can also be used to get an integer, but it truncates toward zero instead of rounding
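A short sketch comparing the common ways of turning a float into an integer in Python 3. Note that round uses round-half-to-even.

```python
import math

truncated_pos = int(2.7)       # truncates toward zero
truncated_neg = int(-2.7)      # also toward zero, not down

rounded_even = round(2.5)      # ties go to the nearest even integer
rounded_up = round(3.5)

floored = math.floor(-2.7)     # always rounds down
ceiled = math.ceil(2.1)        # always rounds up
```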
New stopwords list https://www.ranks.nl/stopwords
ch5
Indicator function http://www.cnblogs.com/xiaoxuesheng993/p/7977629.html
5-1
int() float() can directly convert strings to numbers http://www.runoob.com/python/python-func-int.html
math.exp() http://www.runoob.com/python/func-number-exp.html
5-2
ax.scatter https://matplotlib.org/api/_as_gen/matplotlib.pyplot.scatter.html
arange https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html
Pyplot tutorial and Artist tutorial: the former calls pyplot functions and requires understanding the concepts of current figure and current axes; the latter calls methods on Artist objects (recommended) and requires understanding the figure container and axes container
https://matplotlib.org/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py
https://matplotlib.org/tutorials/intermediate/artists.html#sphx-glr-tutorials-intermediate-artists-py
numpy provides its own versions of many math functions, which accept arrays or matrices as input and return arrays or matrices of the same size, e.g. numpy.exp, numpy.cos
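A minimal sketch of why the numpy versions matter: a sigmoid written with numpy.exp works element-wise on a whole array, while math.exp is meant for scalars. The function name and inputs are my own.

```python
import math

import numpy as np


def sigmoid(z):
    """Element-wise logistic function; z may be a scalar or an array."""
    return 1.0 / (1.0 + np.exp(-z))


z = np.array([-1.0, 0.0, 1.0])
out = sigmoid(z)                          # same shape as z

scalar_out = 1.0 / (1.0 + math.exp(-0.0)) # math.exp on a single number
```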
ch6
6-1
random.randrange https://docs.python.org/3/library/random.html
6-2
multiply https://docs.scipy.org/doc/numpy/reference/generated/numpy.multiply.html
Python logical operators: not, or, and http://www.runoob.com/python/python-operators.html
python direct assignment, shallow copy, deep copy http://www.runoob.com/w3cnote/python-understanding-dict-copy-shallow-or-deep.html
When slicing a single column, a matrix returns a column vector (still two-dimensional), while a two-dimensional array returns a one-dimensional array
A name imported later in Python overrides a previously imported name, e.g. after from numpy import *; import random, the name random refers to Python's built-in random module rather than numpy.random
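A sketch of the column-slicing difference. I use numpy.matrix directly because the numpy.mat alias is not available in recent NumPy releases; the values are made up.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])
mat = np.matrix(arr)

arr_col = arr[:, 0]        # one-dimensional array
mat_col = mat[:, 0]        # column vector, still two-dimensional

# Slicing with a list keeps the array two-dimensional as well.
arr_col_2d = arr[:, [0]]
```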
indexing with Boolean Arrays https://docs.scipy.org/doc/numpy-dev/user/quickstart.html#indexing-with-boolean-arrays
6-3
matrix.A https://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.matrix.A.html
nonzero https://docs.scipy.org/doc/numpy/reference/generated/numpy.nonzero.html works wonders with indexing with boolean arrays
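A sketch combining matrix.A, nonzero and boolean indexing to pick out the non-zero entries of a column vector. The alphas values are hypothetical.

```python
import numpy as np

alphas = np.matrix([[0.0], [0.3], [0.0], [0.6]])

mask = alphas.A > 0               # .A returns the data as a plain ndarray
row_idx = np.nonzero(mask)[0]     # row indices where the mask is True
selected = alphas.A[mask]         # boolean indexing returns a 1-D array

n_rows = len(alphas)              # len only reports the first dimension
full_shape = alphas.shape         # shape reports every dimension
```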
shape is suitable for a matrix (it reports every dimension); len is enough for a one-dimensional array or list (it only reports the first dimension)