[Deep Learning | Transformer] Transformers Tutorial: One-Click Prediction with pipeline()

1. Introduction

Transformers is a library of pretrained state-of-the-art models for natural language processing (NLP), computer vision, and audio and speech processing tasks. The library contains not only Transformer models, but also non-Transformer models such as modern convolutional networks for computer vision tasks.

pipeline() makes it easy to load many of these models for inference: even if you have no experience with a particular modality and aren't familiar with the code behind a model, you can still run it through pipeline().
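
The pattern is the same for every task in this tutorial: create a pipeline from a task name (a default checkpoint is downloaded on first use), or pin a specific checkpoint from the Hub. A minimal sketch; the model id below is a real Hub checkpoint chosen here only as an illustration:

from transformers import pipeline
# Task alone: pipeline() picks a default checkpoint for the task.
classifier = pipeline(task="sentiment-analysis")
# Or name a checkpoint explicitly; the task is then inferred from the model.
classifier = pipeline(model="distilbert-base-uncased-finetuned-sst-2-english")
print(classifier("Pipelines hide the preprocessing and postprocessing steps."))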

2. Computer vision

2.1 Image classification

Label images from a set of predefined classes.

from transformers import pipeline
classifier = pipeline(task="image-classification")
preds = classifier(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)

preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")

The output is:

{'score': 0.4335, 'label': 'lynx, catamount'}
{'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}
{'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}
{'score': 0.0239, 'label': 'Egyptian cat'}
{'score': 0.0229, 'label': 'tiger cat'}
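
If you would rather pin the checkpoint than rely on the task default, pass a model id; top_k limits how many labels come back. A sketch, assuming the google/vit-base-patch16-224 checkpoint (a standard ViT classifier on the Hub, not mentioned in the original text):

from transformers import pipeline
classifier = pipeline(task="image-classification", model="google/vit-base-patch16-224")
preds = classifier(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg",
    top_k=3,  # keep only the 3 highest-scoring labels
)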

2.2 Object detection

Object detection identifies objects and their locations in an image.

from transformers import pipeline
detector = pipeline(task="object-detection")
preds = detector(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)

preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
print(preds)

The output is:

[{'score': 0.9865, 'label': 'cat', 'box': {'xmin': 178, 'ymin': 154, 'xmax': 882, 'ymax': 598}}]
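
Because every prediction carries a "box" with pixel coordinates, a few lines of Pillow are enough to visualize the detections. A sketch, assuming Pillow and requests are installed and reusing preds from the block above:

import requests
from PIL import Image, ImageDraw

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
image = Image.open(requests.get(url, stream=True).raw)
draw = ImageDraw.Draw(image)
for pred in preds:  # preds from the detector block above
    box = pred["box"]
    draw.rectangle((box["xmin"], box["ymin"], box["xmax"], box["ymax"]), outline="red", width=3)
    draw.text((box["xmin"], box["ymin"]), pred["label"], fill="red")
image.save("detections.jpg")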

2.3 Image segmentation

Image segmentation is a pixel-wise task that assigns each pixel in an image to a class.

from transformers import pipeline
segmenter = pipeline(task="image-segmentation")
preds = segmenter(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)

preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")

The output is:

{'score': 0.9879, 'label': 'LABEL_184'}
{'score': 0.9973, 'label': 'snow'}
{'score': 0.9972, 'label': 'cat'}
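
Besides "score" and "label", each raw prediction also contains a "mask": a PIL image marking the pixels that belong to that segment (it is dropped by the rounding step above). A sketch reusing segmenter from the block above:

raw_preds = segmenter(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
for i, pred in enumerate(raw_preds):
    pred["mask"].save(f"mask_{i}_{pred['label']}.png")  # one binary mask per segment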

2.4 Depth estimation

Depth estimation predicts the distance of each pixel in an image from the camera.

from transformers import pipeline
depth_estimator = pipeline(task="depth-estimation")
preds = depth_estimator(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
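
For a single image the result is a dict rather than a list: "predicted_depth" holds the raw depth values as a tensor, and "depth" is a ready-made PIL visualization of them:

print(preds["predicted_depth"].shape)  # raw depth values as a torch tensor
preds["depth"].save("depth.png")  # depth map rendered as a PIL image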

3. NLP

3.1 Text classification

Assign a label to a sequence of text from a set of predefined classes.

from transformers import pipeline
classifier = pipeline(task="sentiment-analysis")
preds = classifier("Hugging Face is the best thing since sliced bread!")
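
As with the vision pipelines above, the scores can be rounded for readability; the default sentiment checkpoint returns one label/score pair per input:

preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(preds)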

3.2 Token classification

Assign each token in a sequence a label from a set of predefined classes; named entity recognition (NER) is a common example.

from transformers import pipeline
classifier = pipeline(task="ner")
preds = classifier("Hugging Face is a French company based in New York City.")
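
By default the "ner" pipeline labels every subword token separately. Passing aggregation_strategy="simple" (a documented option of the token-classification pipeline) merges subwords into whole entities; a sketch:

from transformers import pipeline
classifier = pipeline(task="ner", aggregation_strategy="simple")
preds = classifier("Hugging Face is a French company based in New York City.")
for pred in preds:
    # grouped results expose "entity_group" and "word" instead of per-token fields
    print(pred["entity_group"], "->", pred["word"])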

3.3 Question answering

Return an answer to a question, sometimes given context (open domain) and sometimes without context (closed domain).

from transformers import pipeline
question_answerer = pipeline(task="question-answering")
preds = question_answerer(
    question="What is the name of the repository?",
    context="The name of the repository is huggingface/transformers",
)
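
The result is a single dict containing the answer span and its character offsets in the context:

print(
    f"score: {round(preds['score'], 4)}, "
    f"start: {preds['start']}, end: {preds['end']}, "
    f"answer: {preds['answer']}"
)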

3.4 Summarization

Create a shorter version of a longer text while attempting to preserve most of the meaning of the original document.

from transformers import pipeline
summarizer = pipeline(task="summarization")
text = "In this work, we presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention. For translation tasks, the Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers. On both WMT 2014 English-to-German and WMT 2014 English-to-French translation tasks, we achieve a new state of the art. In the former task our best model outperforms even all previously reported ensembles."
preds = summarizer(text)
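
The summary is returned under the "summary_text" key, and its length can be bounded with the standard min_length/max_length generation arguments (the values below are arbitrary):

preds = summarizer(text, min_length=20, max_length=60)
print(preds[0]["summary_text"])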

3.5 Translation

Convert a sequence of text from one language to another.

from transformers import pipeline
text = "translate English to French: Hugging Face is a community-based open-source platform for machine learning."
translator = pipeline(task="translation", model="t5-small")
preds = translator(text)
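
The result is a list with one dict per input; the translated string is under the "translation_text" key:

print(preds[0]["translation_text"])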

3.6 Language modeling

3.6.1 Predicting the next word in a sequence

from transformers import pipeline
prompt = "Hugging Face is a community-based open-source platform for machine learning."
generator = pipeline(task="text-generation")
preds = generator(prompt)
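
Generation can be steered with the standard generate arguments; for example, sampling two continuations of bounded length (the argument values below are arbitrary):

preds = generator(
    prompt,
    max_new_tokens=30,  # cap the length of each continuation
    do_sample=True,  # sample instead of greedy decoding
    num_return_sequences=2,  # return two different continuations
)
for pred in preds:
    print(pred["generated_text"])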

3.6.2 Predicting a masked token in a sequence

text = "Hugging Face is a community-based open-source <mask> for machine learning."
fill_mask = pipeline(task="fill-mask")
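
Each candidate fill comes back with a score and the predicted token string; top_k (a documented option of the fill-mask pipeline) limits how many candidates are returned:

preds = fill_mask(text, top_k=3)
for pred in preds:
    print(round(pred["score"], 4), pred["token_str"])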
