Python Reference for Data Analysis and Data Mining Tools

If you are already familiar with how Python and R load modules and packages, the tables below should be easy to navigate.

Python entries in the tables are given as module paths. Some of these modules are not part of the standard library; install them with pip install *.
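Before working through the tables, it can help to check which modules are already importable. The sketch below uses the standard library's importlib.util.find_spec; the helper name is_installed and the sample module names are illustrative assumptions, not part of the original tables.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Sample names only. The import name is not always the pip package name:
    # for example, sklearn is installed with "pip install scikit-learn".
    for name in ["json", "csv", "sklearn", "statsmodels"]:
        status = "available" if is_installed(name) else "missing, install it with pip"
        print(f"{name}: {status}")
```

The check only looks the module up; it does not import it, so it stays cheap even for large packages.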

R entries follow the same idea to make lookup easier: package::function names the function together with the package that contains it. An entry without :: belongs to one of R's default packages; for any other package, run install.packages("*") to install it.

Connector & IO

Machine Learning

Category | Subcategory | Python
  | LDA (Linear Discriminant Analysis) | sklearn.discriminant_analysis.LinearDiscriminantAnalysis
  | QDA (Quadratic Discriminant Analysis) | sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis
SVM (Support Vector Machine) | Support Vector Classifier (SVC) | sklearn.svm.SVC
SVM (Support Vector Machine) | Nu-Support Vector Classifier (NuSVC) | sklearn.svm.NuSVC
SVM (Support Vector Machine) | Linear Support Vector Classifier (LinearSVC) | sklearn.svm.LinearSVC
Nearest Neighbors | K-Nearest Neighbors Classifier | sklearn.neighbors.KNeighborsClassifier
Nearest Neighbors | Radius Neighbors Classifier | sklearn.neighbors.RadiusNeighborsClassifier
Nearest Neighbors | Nearest Centroid Classifier | sklearn.neighbors.NearestCentroid
Bayes | Gaussian Naive Bayes | sklearn.naive_bayes.GaussianNB
Bayes | Multinomial Naive Bayes | sklearn.naive_bayes.MultinomialNB
Bayes | Bernoulli Naive Bayes | sklearn.naive_bayes.BernoulliNB
Decision Tree | Decision Tree Classifier | sklearn.tree.DecisionTreeClassifier
Decision Tree | Decision Tree Regressor | sklearn.tree.DecisionTreeRegressor
Ensemble Method | Bagging: Random Forest Classifier | sklearn.ensemble.RandomForestClassifier
Ensemble Method | Bagging: Random Forest Regressor | sklearn.ensemble.RandomForestRegressor
Ensemble Method | Boosting: Gradient Boosting | xgboost module
Ensemble Method | Boosting: AdaBoost | sklearn.ensemble.AdaBoostClassifier
Cluster | K-Means | scipy.cluster.vq.kmeans
Cluster | Hierarchical Clustering | scipy.cluster.hierarchy.fcluster
Cluster | DBSCAN | sklearn.cluster.DBSCAN
Cluster | Birch | sklearn.cluster.Birch
Cluster | K-Medoids | pyclust.KMedoids (unknown reliability)
Association Rule | Apriori Algorithm | apriori (unknown reliability, no Python 3 support), PyFIM (unknown reliability, cannot be installed with pip)
Association Rule | FP-Growth Algorithm | fp-growth (unknown reliability, no Python 3 support), PyFIM (unknown reliability, cannot be installed with pip)
Neural Network | Neural Network | neurolab.net, keras.*
Neural Network | Deep Learning | keras.*

Database

Category | Python
MySQL | mysql-connector-python (official)
Oracle | cx_Oracle
Redis | redis
MongoDB | pymongo
neo4j | py2neo
Cassandra | cassandra-driver
ODBC | pyodbc
JDBC | Unknown (Jython only)
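Several of the relational drivers above (mysql-connector-python, cx_Oracle, pyodbc) follow the Python DB-API 2.0 interface (PEP 249), so the connect / cursor / execute / fetch pattern carries over between them. A minimal sketch of that pattern follows, using the standard library's sqlite3 as a stand-in so it runs without a database server; sqlite3 is not one of the drivers in the table, and the table name and rows are made up for illustration.

```python
import sqlite3

# Stand-in connection: an in-memory SQLite database instead of MySQL/Oracle/ODBC.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE samples (name TEXT, value REAL)")

# Placeholder style differs by driver: sqlite3 and pyodbc use "?",
# while mysql-connector-python uses "%s".
cur.executemany(
    "INSERT INTO samples (name, value) VALUES (?, ?)",
    [("a", 1.5), ("b", 2.5)],
)
conn.commit()

cur.execute("SELECT name, value FROM samples ORDER BY name")
rows = cur.fetchall()  # list of tuples, one per row
conn.close()

print(rows)
```

The non-relational clients in the table (redis, pymongo, py2neo, cassandra-driver) each expose their own API and do not follow this pattern.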

IO

Category | Python
Excel | XlsxWriter, pandas.read_excel, pandas.DataFrame.to_excel, openpyxl
CSV | csv.writer
JSON | json
Image | PIL (installed as Pillow)
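The csv and json entries are both standard-library modules, so they need no installation. A small round-trip sketch, assuming made-up records and an in-memory buffer in place of a real file:

```python
import csv
import io
import json

records = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]

# Write CSV into an in-memory buffer. With a real file, open it with
# newline="" so the csv module controls line endings itself.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "value"])  # header row
for rec in records:
    writer.writerow([rec["name"], rec["value"]])
csv_text = buffer.getvalue()

# Reading back: csv.reader yields every field as a string.
parsed_rows = list(csv.reader(io.StringIO(csv_text)))

# JSON keeps the numeric types through a dumps/loads round trip.
json_text = json.dumps(records)
restored = json.loads(json_text)

print(parsed_rows)
print(restored)
```

Note the difference in the round trip: CSV returns the numbers as strings, while JSON restores them as integers.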


Statistics

Category | Python
Descriptive summary statistics | scipy.stats.describe
Mean | scipy.stats.gmean (geometric mean), scipy.stats.hmean (harmonic mean), numpy.mean, numpy.nanmean, pandas.Series.mean
Median | numpy.median, numpy.nanmedian, pandas.Series.median
Mode | scipy.stats.mode, pandas.Series.mode
Quantile | numpy.percentile, numpy.nanpercentile, pandas.Series.quantile
Empirical cumulative distribution function (ECDF) | statsmodels.distributions.empirical_distribution.ECDF
Standard deviation | numpy.std, numpy.nanstd, pandas.Series.std (scipy.stats.std and scipy.stats.nanstd have been removed from SciPy)
Variance | numpy.var, pandas.Series.var
Coefficient of variation | scipy.stats.variation
Covariance | numpy.cov, pandas.Series.cov
(Pearson) correlation coefficient | scipy.stats.pearsonr, numpy.corrcoef, pandas.Series.corr
Kurtosis | scipy.stats.kurtosis, pandas.Series.kurt
Skewness | scipy.stats.skew, pandas.Series.skew
Histogram | numpy.histogram, numpy.histogram2d, numpy.histogramdd
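To make a few of the less familiar entries concrete, here is a dependency-free sketch that uses the standard library's statistics module rather than the SciPy and NumPy calls listed above. It assumes Python 3.8 or later (for statistics.geometric_mean), and the sample data are made up.

```python
import statistics

data = [1.0, 2.0, 4.0, 8.0]

arithmetic = statistics.mean(data)           # counterpart of numpy.mean
geometric = statistics.geometric_mean(data)  # counterpart of scipy.stats.gmean
harmonic = statistics.harmonic_mean(data)    # counterpart of scipy.stats.hmean
median = statistics.median(data)             # counterpart of numpy.median

# Population vs. sample standard deviation (divide by n vs. n - 1).
population_std = statistics.pstdev(data)
sample_std = statistics.stdev(data)

# Coefficient of variation: population standard deviation over the mean,
# which matches the default (ddof=0) behaviour of scipy.stats.variation.
cv = population_std / arithmetic

print(arithmetic, geometric, harmonic, median, cv)
```

The two standard deviations matter when switching libraries: numpy.std defaults to the population formula (ddof=0), while pandas.Series.std defaults to the sample formula (ddof=1), so the same data can give different results.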

Regression (including statistics and machine learning)

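Category | Python
Ordinary least squares regression (OLS) | statsmodels.api.OLS, sklearn.linear_model.LinearRegression
Generalized least squares regression (GLS) | statsmodels.api.GLS
Quantile regression | statsmodels.api.QuantReg
Ridge regression | sklearn.linear_model.Ridge
LASSO | sklearn.linear_model.Lasso
Least-angle regression (LARS) | sklearn.linear_model.LassoLars
Robust regression | statsmodels.api.RLM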
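For a single predictor, ordinary least squares has a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept is mean(y) minus slope times mean(x). The plain-Python sketch below, with an illustrative helper name fit_simple_ols and invented data, shows what an OLS fit computes in this simple case; it is not a replacement for the libraries in the table.

```python
def fit_simple_ols(x, y):
    """Fit y = slope * x + intercept by ordinary least squares (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-deviations and squared deviations; the 1/n factors cancel.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1 with no noise

slope, intercept = fit_simple_ols(x, y)
print(slope, intercept)
```

With several predictors or regularization (Ridge, LASSO), the closed form no longer reduces to two sums, which is where the library implementations become the practical choice.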

Hypothesis Testing

Category | Python
t-test | statsmodels.stats.weightstats.ttest_ind, statsmodels.stats.weightstats.ttost_ind, statsmodels.stats.weightstats.ttost_paired; scipy.stats.ttest_1samp, scipy.stats.ttest_ind, scipy.stats.ttest_ind_from_stats, scipy.stats.ttest_rel
KS test (distribution test) | scipy.stats.kstest, scipy.stats.ks_2samp
Wilcoxon (nonparametric test of differences) | scipy.stats.wilcoxon, scipy.stats.mannwhitneyu
Shapiro-Wilk normality test | scipy.stats.shapiro
Pearson correlation test | scipy.stats.pearsonr

Time series

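Category | Python
AR | statsmodels.tsa.ar_model.AutoReg (statsmodels.tsa.ar_model.AR in older releases)
ARIMA | statsmodels.tsa.arima.model.ARIMA (statsmodels.tsa.arima_model.ARIMA in older releases)
VAR | statsmodels.tsa.vector_ar.var_model.VAR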

Reposted from www.cnblogs.com/aiden-liu/p/10773803.html