Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/wuxintdrh/article/details/84258032
Spark MLlib
1. Data Types - MLlib
- Local vector
- Labeled point
- Local matrix
- Distributed matrix
  - RowMatrix
  - IndexedRowMatrix
  - CoordinateMatrix
  - BlockMatrix
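To make the data types listed above more concrete, here is a minimal plain-Python sketch of the ideas behind them. It does not use the MLlib API: the names `SparseVector`, `LabeledPoint`, and `coordinate_entries` are hypothetical stand-ins defined only for this illustration. It assumes a sparse local vector is stored as a size plus parallel index and value lists, that a labeled point pairs a label with a feature vector, and that a coordinate-style matrix is a collection of (row, column, value) entries.

```python
# Plain-Python illustration only; these classes are hypothetical stand-ins,
# not the MLlib classes of the same name.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SparseVector:
    """A sparse local vector: its size plus parallel index and value lists."""
    size: int
    indices: List[int]
    values: List[float]

    def to_dense(self) -> List[float]:
        # Start from all zeros and fill in only the stored positions.
        dense = [0.0] * self.size
        for i, v in zip(self.indices, self.values):
            dense[i] = v
        return dense


@dataclass
class LabeledPoint:
    """A labeled point: a label paired with a feature vector."""
    label: float
    features: List[float]


def coordinate_entries(rows: List[List[float]]) -> List[Tuple[int, int, float]]:
    """List the nonzero cells of a small matrix as (row, col, value) entries."""
    return [
        (i, j, v)
        for i, row in enumerate(rows)
        for j, v in enumerate(row)
        if v != 0.0
    ]


sparse = SparseVector(size=3, indices=[0, 2], values=[1.0, 3.0])
dense = sparse.to_dense()
point = LabeledPoint(label=1.0, features=dense)
entries = coordinate_entries([[1.0, 0.0], [0.0, 2.0]])
```

The sparse form stores only two of the three values, which is the reason a sparse representation pays off when most entries are zero. The coordinate entries play the same role for matrices.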
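2. Basic Statistics
3. Classification and Regression
In supervised learning, the task is called classification when the predicted variable is discrete (handled by, for example, decision trees or support vector machines) and regression when the predicted variable is continuous.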
- Linear models
  - classification (SVMs, logistic regression)
  - linear regression (least squares, Lasso, ridge)
- Decision trees
- Ensembles of decision trees
  - random forests
  - gradient-boosted trees
- Naive Bayes
- Isotonic regression
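The difference between a continuous and a discrete prediction can be sketched in a few lines of plain Python. This is a toy illustration, not the MLlib API and not any of the algorithms listed above in full: `fit_least_squares`, `predict_value`, and `predict_class` are hypothetical helpers, the data is made up, and the classifier simply thresholds the linear score at an assumed cut-off.

```python
from typing import List, Tuple


def fit_least_squares(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept by ordinary least squares on one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_value(model: Tuple[float, float], x: float) -> float:
    """Regression: the prediction is a continuous number."""
    slope, intercept = model
    return slope * x + intercept


def predict_class(model: Tuple[float, float], x: float, threshold: float) -> int:
    """Classification: the prediction is a discrete label (0 or 1)."""
    return 1 if predict_value(model, x) >= threshold else 0


# Made-up training data that lies exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
model = fit_least_squares(xs, ys)

value = predict_value(model, 4.0)                  # a real number
low = predict_class(model, 1.0, threshold=4.0)     # a label
high = predict_class(model, 2.0, threshold=4.0)    # a label
```

The same fitted line serves both tasks here. Only the form of the output changes: a real number for regression, one of a fixed set of labels for classification.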