Python Data Analysis - The Basics

1. Five commonly used Python libraries for data science

NumPy

  • N-dimensional arrays (matrices) with fast, efficient vectorized math
  • Efficient indexing, with no need for explicit loops
  • Free, open source, and cross-platform, with performance comparable to C / Matlab
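
As a minimal sketch of the first two points (the array values are made up for illustration), vectorized arithmetic and boolean indexing let NumPy work on whole arrays without a Python loop:

```python
import numpy as np

# Illustrative data, not taken from the article
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])

# Vectorized math: one expression operates on every element
total = a + b        # element-wise sum
scaled = a * 2.5     # the scalar is applied to each element

# Indexing without a loop: a boolean mask selects the matching elements
large = b[b > 15]

print(total)
print(scaled)
print(large)
```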

SciPy

  • Depends on NumPy
  • Designed for science and engineering
  • Implements many commonly used scientific computing routines, such as linear algebra, Fourier transforms, and signal and image processing
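
To give a feel for two of the areas listed above, here is a small sketch using scipy.linalg and scipy.fft. The matrix and the signal are made-up examples, and the scipy.fft module is assumed to be available (SciPy 1.4 or later):

```python
import numpy as np
from scipy import fft, linalg

# Linear algebra: solve the system A x = rhs
A = np.array([[3.0, 1.0], [1.0, 2.0]])
rhs = np.array([9.0, 8.0])
x = linalg.solve(A, rhs)

# Fourier transform: 8 samples containing exactly one cycle of a sine wave
n = 8
signal = np.sin(2 * np.pi * np.arange(n) / n)
spectrum = fft.fft(signal)

# Strongest frequency bin in the first half of the spectrum
peak_bin = int(np.argmax(np.abs(spectrum[: n // 2])))

print(x)
print(peak_bin)
```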

Pandas

  • Structured data analysis tool (built on NumPy)
  • Offers a variety of advanced data structures: time series (Series), DataFrame, Panel (Panel has been removed in later pandas releases)
  • Powerful indexing and data processing capabilities
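
The following sketch shows a DataFrame with a time index, plus label-based indexing, filtering, and grouping. The column names and sales figures are hypothetical and only serve as an illustration:

```python
import pandas as pd

# Hypothetical daily sales figures
dates = pd.date_range("2024-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {"city": ["A", "B", "A", "B"], "sales": [10, 20, 30, 40]},
    index=dates,
)

# Label-based indexing on the time index (both end labels are included)
first_two_days = df.loc["2024-01-01":"2024-01-02"]

# Boolean filtering and grouped aggregation
big = df[df["sales"] > 15]
per_city = df.groupby("city")["sales"].sum()

print(first_two_days)
print(per_city)
```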

Matplotlib

  • The most widely used package for 2D graphics in Python
  • Its basic plotting functions can stand in for Matlab (scatter plots, line plots, bar charts, etc.)
  • Attractive 3D plots can be drawn with mplot3d
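
A minimal sketch of two of the basic plot types mentioned above. The data and the output file name are made up, and the non-interactive Agg backend is chosen only so the script also runs without a display:

```python
import matplotlib

matplotlib.use("Agg")  # assumption: render to a file instead of a window
import matplotlib.pyplot as plt

# Illustrative data
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# Two plots side by side: a scatter plot and a bar chart
fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))
ax_scatter.scatter(x, y)
ax_scatter.set_title("Scatter plot")
ax_bar.bar(x, y)
ax_bar.set_title("Bar chart")

fig.savefig("basic_plots.png")  # hypothetical output file name
```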

Scikit-Learn

  • Machine learning module for Python
  • Built on SciPy, it provides common machine learning algorithms, such as clustering and regression
  • Easy-to-learn API
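
As a rough illustration of the API, the sketch below fits one regression model and one clustering model. The data points are made up, and the parameter choices (n_clusters=2, n_init=10, random_state=0) are assumptions for this example only:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Regression: made-up points that lie exactly on y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X.ravel() + 1.0
reg = LinearRegression().fit(X, y)
slope = reg.coef_[0]
intercept = reg.intercept_

# Clustering: two well-separated groups of made-up 2D points
points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = km.labels_

print(slope, intercept)
print(labels)
```

Both models follow the same pattern: create the estimator, then call fit on the data.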

2. Review of basic matrix operations

Basic concepts

  • Matrix: a rectangular array, i.e. a two-dimensional array. Vectors and scalars are special cases of a matrix.
  • Vector: a 1xn or nx1 matrix
  • Scalar: a 1x1 matrix
  • Array: an N-dimensional array, a generalization of the matrix
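
These definitions can be checked through array shapes in NumPy. The sketch below uses made-up values; only the shapes matter:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # 2x3 matrix
row_vector = np.array([[1, 2, 3]])          # 1xn matrix (n = 3)
col_vector = np.array([[1], [2], [3]])      # nx1 matrix (n = 3)
scalar = np.array([[7]])                    # 1x1 matrix
cube = np.zeros((2, 3, 4))                  # 3-dimensional array

print(matrix.shape, row_vector.shape, col_vector.shape, scalar.shape)
print(cube.ndim)
```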

Special Matrices

  • All-zeros and all-ones matrices: every element is 0, or every element is 1
  • Identity matrix: an nxn matrix whose diagonal elements are all 1 and whose other elements are 0. Multiplying any matrix by the identity matrix gives back the original matrix.
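
A small sketch of these special matrices in NumPy. The matrix M is a made-up example used to check the identity property:

```python
import numpy as np

zeros = np.zeros((2, 3))   # every element is 0
ones = np.ones((2, 3))     # every element is 1
identity = np.eye(3)       # 3x3: ones on the diagonal, zeros elsewhere

# Multiplying by the identity matrix leaves M unchanged, on either side
M = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
left = identity @ M
right = M @ identity

print(np.array_equal(left, M), np.array_equal(right, M))
```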

Matrix addition and subtraction

  • Two matrices can only be added or subtracted if they have the same number of rows and columns.
  • Elements in the same row and column position are added or subtracted.
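
As a quick illustration with made-up matrices, addition and subtraction work position by position, and mismatched shapes are rejected:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

added = A + B        # [[11, 22], [33, 44]]
subtracted = B - A   # [[ 9, 18], [27, 36]]

# A 2x2 matrix and a 2x3 matrix cannot be added: NumPy raises ValueError
C = np.ones((2, 3), dtype=int)
try:
    A + C
    shapes_ok = True
except ValueError:
    shapes_ok = False

print(added)
print(subtracted)
print(shapes_ok)
```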

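Array multiplication (element-wise)

  • Array multiplication is element-wise: each element is multiplied by the element in the same position of the other array. This is not the matrix product (in NumPy it is the * operator rather than np.dot).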
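
A short sketch of the difference between element-wise multiplication and the matrix product, using made-up 2x2 arrays:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Element-wise: multiply the elements that share the same position
elementwise = A * B        # [[ 5, 12], [21, 32]]

# Matrix product, shown only for contrast
matrix_product = A @ B     # [[19, 22], [43, 50]]

print(elementwise)
print(matrix_product)
```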

Origin blog.csdn.net/qq_34156628/article/details/91787816