Learn PyCaret while writing code

Introduction to PyCaret

PyCaret is an open-source, low-code Python library that simplifies machine learning workflows. It provides a high-level interface for automating various aspects of the machine learning process, making it easier for data scientists and analysts to build and deploy machine learning models. Some of PyCaret's key features and uses include:

1. Automated Machine Learning (AutoML): PyCaret can automate many tedious and time-consuming tasks in machine learning, such as data preprocessing, feature selection, hyperparameter tuning, model selection and evaluation.

2. Simplified Workflow: It provides a consistent and organized workflow, allowing users to execute common machine learning tasks with just a few lines of code.

3. Model selection: PyCaret supports a variety of machine learning algorithms to help users quickly compare and select the model most suitable for specific tasks.

4. Hyperparameter tuning: It automates hyperparameter tuning and helps users find the best hyperparameters for a model.

5. Model evaluation: PyCaret provides comprehensive model evaluation and comparison metrics to help users understand the performance of the model and make informed decisions.

6. Data preprocessing: It automates data preprocessing steps such as missing value imputation, categorical encoding and feature scaling.

7. Visualization: PyCaret provides a variety of visualization tools to help users explore data, understand model performance, and make data-driven decisions.

8. Model deployment: It allows users to easily deploy machine learning models to production environments and is therefore suitable for creating end-to-end machine learning pipelines.

9. Anomaly Detection: PyCaret also includes anomaly detection functionality that helps detect outliers in the data set.

10. Natural Language Processing (NLP): It supports natural language processing tasks, making it suitable for various machine learning applications.

PyCaret is especially useful for data scientists and analysts who want to quickly try out different machine learning models and techniques without having to write tons of code for each step. It is designed to reduce the time and effort required to build and deploy machine learning models, making it a valuable tool in the data science and machine learning communities.

Sample code (Classification)

1. setup

This function initializes the training environment and creates the transformation pipeline. The setup function must be called before any other function is executed. It has two required parameters: data and target. All other parameters are optional.

import pandas as pd
from pycaret.datasets import get_data

pd.set_option('display.max_columns', None)
data = get_data('diabetes')

PyCaret 3.0 has two APIs: a functional API and an object-oriented (OOP) API. You can choose either according to your preference. The functionality and the experiment results are the same in both.

Functional API

from pycaret.classification import *
s = setup(data, target = 'Class variable', session_id = 123)

OOP API

from pycaret.classification import ClassificationExperiment
s = ClassificationExperiment()
s.setup(data, target = 'Class variable', session_id = 123)
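Both calls above pass session_id = 123, which fixes the random seed for the experiment. As a rough illustration of why a fixed seed matters, here is a minimal stand-alone sketch using only the standard library. It is not PyCaret's implementation: the helper name seeded_split and the 70/30 ratio are assumptions made for this example.

```python
import random

def seeded_split(rows, train_fraction=0.7, seed=123):
    """Shuffle rows with a fixed seed and split them into train and test lists."""
    rng = random.Random(seed)      # private generator, global random state is untouched
    shuffled = list(rows)          # copy, so the caller's data is not modified
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(10))             # made-up row identifiers
train_a, test_a = seeded_split(rows, seed=123)
train_b, test_b = seeded_split(rows, seed=123)

# The same seed reproduces the same split on every run.
print(train_a == train_b and test_a == test_b)
print(len(train_a), len(test_a))
```

Repeating an experiment with the same seed gives the same train/test partition, which is what makes results comparable between runs.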

2. Compare Models

This function uses cross-validation to train and evaluate the performance of all estimators available in the model library. The output of this function is a scoring grid with average cross-validation scores. Metrics evaluated during CV can be accessed using the get_metrics function. Custom metrics can be added or removed using the add_metric and remove_metric functions.

# functional API
best = compare_models()

# OOP API
best = s.compare_models()
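To make "average cross-validation scores" concrete, the sketch below computes a mean accuracy over k folds by hand. It uses only the standard library and a toy majority-class "model", so it illustrates the idea rather than what compare_models does internally. The function names and the label list are made up for this example.

```python
from statistics import mean

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def majority_class_accuracy(labels, train_idx, test_idx):
    """Fit a toy model (always predict the most common training label) and score it."""
    train_labels = [labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    hits = sum(1 for i in test_idx if labels[i] == majority)
    return hits / len(test_idx)

labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]   # made-up binary target
fold_scores = [
    majority_class_accuracy(labels, train_idx, test_idx)
    for train_idx, test_idx in k_fold_indices(len(labels), k=5)
]
mean_accuracy = mean(fold_scores)          # one averaged value per model and metric
print(fold_scores, round(mean_accuracy, 3))
```

Each row of the scoring grid corresponds to one model, and each metric column holds an average of this kind across the folds.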

3. Analyze Model

This function analyzes the performance of the trained model on the test set. In some cases it may be necessary to retrain the model.

# functional API
evaluate_model(best)

# OOP API
s.evaluate_model(best)

evaluate_model can only be used in a notebook because it relies on ipywidgets. You can also generate plots individually using the plot_model function.

# functional API
plot_model(best, plot = 'auc')

# OOP API
s.plot_model(best, plot = 'auc')

# functional API
plot_model(best, plot = 'confusion_matrix')

# OOP API
s.plot_model(best, plot = 'confusion_matrix')

4. Predictions

This function scores the data and returns prediction_label and prediction_score (the probability of the predicted class). When data is None, it predicts labels and scores on the test set (created by the setup function).

# functional API
predict_model(best)

# OOP API
s.predict_model(best)

Evaluation metrics are calculated on the test set. The returned pd.DataFrame contains the predictions for the test set (see the last two columns). To generate labels on an unseen (new) dataset, pass it as the data parameter of the predict_model function. The example below reuses the original data purely for illustration.

# functional API
predictions = predict_model(best, data=data)
predictions.head()

# OOP API
predictions = s.predict_model(best, data=data)
predictions.head()

The score represents the probability of the predicted class (not the positive class). If prediction_label is 0 and prediction_score is 0.90, it means the probability of class 0 is 90%. If you want to see the probabilities for both classes, just pass raw_score=True in the predict_model function.

# functional API
predictions = predict_model(best, data=data, raw_score=True)
predictions.head()

# OOP API
predictions = s.predict_model(best, data=data, raw_score=True)
predictions.head()
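The relationship between the positive-class probability and the reported score can be sketched in plain Python. This is an illustration of the rule described above, not PyCaret code: the function name label_and_score and the 0.5 threshold are assumptions.

```python
def label_and_score(p_positive, threshold=0.5):
    """Turn a positive-class probability into (prediction_label, prediction_score).

    The score is the probability of whichever class was predicted,
    so it is 1 - p_positive when the predicted label is 0.
    """
    label = 1 if p_positive >= threshold else 0
    score = p_positive if label == 1 else 1.0 - p_positive
    return label, score

# A positive-class probability of 0.10 means class 0 is predicted.
low_label, low_score = label_and_score(0.10)
# A positive-class probability of 0.80 means class 1 is predicted.
high_label, high_score = label_and_score(0.80)

print(low_label, round(low_score, 2))
print(high_label, round(high_score, 2))
```

Under this rule the score of a binary prediction never drops below the threshold complement, which is why a label of 0 can still carry a score of 0.90.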

5. Save the model

# functional API
save_model(best, 'my_best_pipeline')

# OOP API
s.save_model(best, 'my_best_pipeline')

6. Load the model

# functional API
loaded_model = load_model('my_best_pipeline')
print(loaded_model)

# OOP API
loaded_model = s.load_model('my_best_pipeline')
print(loaded_model)

Errors encountered

1. On a Mac, installing PyCaret failed with "Could not build wheels for lightgbm, which is required to install pyproject.toml-based projects". The relevant output:

  note: This error originates from a subprocess, and is likely not a problem with pip.

  ERROR: Failed building wheel for lightgbm

Failed to build lightgbm

ERROR: Could not build wheels for lightgbm, which is required to install pyproject.toml-based projects

Solution:

First install the OpenMP runtime with Homebrew, then install PyCaret:

brew install libomp
pip install pycaret

2. Loading LightGBM failed with an architecture mismatch error: lib_lightgbm.so' (mach-o file, but is an incompatible architecture (have (x86_64), need (arm64e))

Solution:

conda install \
   --yes \
   -c conda-forge \
   'lightgbm>=3.3.3'
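The second error means a native library built for one CPU architecture was loaded by a process running as another. One quick way to see what the current Python interpreter reports is the standard library's platform module. This is only a diagnostic sketch, and the helper name interpreter_info is hypothetical. Note that an Intel build of Python running under Rosetta reports x86_64 even on Apple Silicon hardware.

```python
import platform
import struct
import sys

def interpreter_info():
    """Collect basic facts about the running Python interpreter."""
    return {
        "executable": sys.executable,              # path of the interpreter in use
        "machine": platform.machine(),             # e.g. 'arm64' or 'x86_64'
        "pointer_bits": struct.calcsize("P") * 8,  # 32 or 64
        "python_version": platform.python_version(),
    }

info = interpreter_info()
for key, value in info.items():
    print(f"{key}: {value}")
```

If machine shows arm64 while the error says the library is x86_64, the installed LightGBM binary was built for Intel. Reinstalling it from conda-forge, as in the command above, is one way to get a matching build.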

Reference: "Is LightGBM available for Mac M1?" (Stack Overflow)

Origin blog.csdn.net/keeppractice/article/details/133826184