Cluster & Tune: Boost Cold Start Performance in Text Classification, ACL 2022

Enterprise Development, 2023-04-10 05:15:22

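In real-world scenarios, a text classification task often begins with a cold start: labeled data is scarce, and a model fine-tuned on so few labeled texts overfits easily. The paper proposes a way to improve the performance of such models by adding an intermediate unsupervised classification task between the pre-training and fine-tuning stages. For this intermediate task, clustering is performed and the pre-trained model is trained to predict the cluster labels. The paper tests this hypothesis on a variety of datasets, and the results show that when only a few dozen to a few hundred labeled instances are available for fine-tuning, this extra classification stage can significantly improve performance, mainly on topic classification tasks.
This works because a well-chosen intermediate task is expected to give the final fine-tuning stage, which runs on the scarce labeled data available for the target task, a better starting point, and thereby improve final performance. Although the intermediate task and the target task are not directly related, learning the underlying semantics of the in-domain data is still highly beneficial.
The unsupervised method adopted in the paper is also about the simplest available: using bag-of-words (BOW) text representations, the unlabeled training data is partitioned into relatively homogeneous clusters of text instances. These clusters are then treated as labeled data for an intermediate text classification task, and the pre-trained model, with or without additional MLM pre-training, is trained on this multi-class problem before the final fine-tuning on the actual target-task labels. Since the MLM, clustering, and fine-tuning tasks can be combined in different ways, the overall structure proposed in the paper is shown in the figure below: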
[Figure: overall structure of the proposed pipelines]
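Steps 1-4 correspond to four different pipelines:

1. Fine-tune directly on the downstream task.
2. First run MLM self-supervised training on the domain data, then fine-tune on the downstream task.
3. Cluster first, then fine-tune.
4. After MLM self-supervised training …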
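
To make the clustering stage more concrete, here is a minimal sketch of how unlabeled texts might be turned into pseudo-labeled data for the intermediate classification task. It rests on several assumptions: this post does not say which clustering algorithm is used, so plain k-means with farthest-first initialisation stands in for it; the helper names (`bow_vector`, `kmeans`, `cluster_pseudo_labels`) and the four toy headlines are hypothetical and made up for illustration; whitespace tokenisation replaces any real preprocessing.

```python
import math
from collections import Counter


def bow_vector(text, vocab):
    """Return an L2-normalised bag-of-words count vector over `vocab`."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Plain k-means (a stand-in clustering choice); returns one cluster id per vector."""
    # Farthest-first initialisation: deterministic and spreads the seeds out.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_pseudo_labels(texts, k):
    """Cluster raw texts on BOW vectors and return cluster ids as pseudo-labels."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = [bow_vector(t, vocab) for t in texts]
    return kmeans(vectors, k)


# Toy unlabeled data (invented for this sketch).
unlabeled = [
    "stock market shares fall",
    "market shares rise as stock rally",
    "team wins football match",
    "football team loses match",
]
pseudo_labels = cluster_pseudo_labels(unlabeled, k=2)
# (text, cluster id) pairs: the training set for the intermediate task.
intermediate_task = list(zip(unlabeled, pseudo_labels))
```

On this toy input the two finance headlines end up in one cluster and the two sports headlines in the other. The resulting `intermediate_task` pairs would then be used to train the pre-trained model as an ordinary multi-class classifier before the final fine-tuning on the real target labels.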

Reposted from blog.csdn.net/qq_36618444/article/details/124527381
