Attendance Record | CNCC 2022 China Computer Conference Attendance Summary

Foreword

The 19th CNCC was held online for the first time on December 8-10, 2022. Under the theme "Computing Power, Data, Ecology", the three-day conference emphasized diversity of topics, covered hot and cutting-edge subjects, and balanced participation from academia and industry. The conference was chaired by Wang Huaimin, CCF Fellow, academician of the Chinese Academy of Sciences, and professor at the National University of Defense Technology. The program included 14 invited keynote reports, 3 plenary forums, and 118 technical forums and special activities spanning more than 30 fields. Speakers included Turing Award winner Professor Jack Dongarra of the University of Tennessee, Professor Qian Depei of the School of Computer Science at Beihang University (academician of the Chinese Academy of Sciences), Professor Guan Xiaohong of Xi'an Jiaotong University (academician of the Chinese Academy of Sciences), and Professor Zhang Ping of Beijing University of Posts and Telecommunications (academician of the Chinese Academy of Engineering), with more than 700 lecturers in the field of computing giving reports. This article summarizes the two NLP-related reports from this conference.

CNCC 2022 China Computer Conference


Report 1 - Few-Label NLP Forum

Current mainstream natural language processing models depend heavily on large-scale labeled data. However, because NLP tasks are hard to annotate, come in many types, differ greatly across domains, and keep emerging, the amount of labeled data for any specific task is often small. It is therefore of great significance to study how to build high-accuracy NLP systems from only a small amount of labeled data. At the same time, NLP has characteristics such as knowledge dependence, symbolic representation, and task diversity, so existing few-label learning methods often fall short on NLP problems. This forum invited a number of NLP experts to discuss in depth the latest research progress and future directions of few-label NLP theory and methods.

This forum was chaired by Professor Che Wanxiang from Harbin Institute of Technology and consisted of 4 sub-reports, given by Professor Zhang Yue from Westlake University, Professor Chen Huajun from Zhejiang University, Professor Liu Zhiyuan from Tsinghua University, and Professor Qiu Xipeng from Fudan University.

  • In the first report, Professor Zhang Yue discussed the robustness of language models in cross-domain scenarios and a first attempt to apply prompt learning to named entity recognition. He also showed that data augmentation can greatly improve a model's few-shot learning ability in both in-distribution and cross-distribution scenarios.

    After this report, Professor Che Wanxiang raised a question about the leap in model capabilities: will large models automatically handle out-of-distribution (OOD) and similar problems? Professor Zhang Yue replied that this question deserves continued attention.

    Personal thoughts: if a language model finds a shortcut during training, it will exploit it opportunistically and its generalization will drop accordingly. Could models with low security (e.g., ones into which backdoors are easily implanted) be identified by measuring their generalization?

  • In the second report, Professor Chen Huajun built on the two concepts of knowledge graphs (KG) and low-resource learning (LRL) to develop the two major tasks LRL4KG and KG4LRL, and summed up the KG4LRL scenario as: with large samples, rely on machine learning; with small samples, reason with knowledge. This highlights the importance of knowledge in low-resource scenarios.

  • In the third report, Professor Liu Zhiyuan spoke on "Delta Tuning: Parameter-Efficient Fine-Tuning of Large Models". For the "pre-training + fine-tuning" paradigm, he compared fine-tuning with prompt learning and discussed how to better adapt large models to downstream tasks while updating only a small fraction of the parameters (a minimal sketch of the idea follows this list).

    For OpenDelta-related work, see the paper: Delta Tuning: A Comprehensive Study of Parameter-Efficient Methods for Pre-trained Language Models (arXiv, 2022).

  • In the fourth report, Professor Qiu Xipeng spoke on "Language Model as a Service" (LMaaS), which faces two main challenges: (1) building one model suitable for all NLP tasks, i.e., "one model fits all tasks"; (2) designing fine-tuning methods that work through the service interface, such as y-Tuning, Black-Box Tuning, and other research work mentioned in the report (a small gradient-free sketch of this setting is given at the end of this section).

    Related papers: see the "Language Model as a Service" must-read paper list in reference [4].
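
To make the delta-tuning idea from the third report concrete, here is a minimal sketch of one parameter-efficient pattern (a LoRA-style low-rank delta added to a frozen linear layer). It assumes PyTorch and is only an illustration of the general idea, not the OpenDelta implementation or the exact methods surveyed in the paper above.

```python
import torch
import torch.nn as nn

class LowRankDelta(nn.Module):
    """Illustrative delta-tuning sketch: freeze a pretrained linear layer and
    train only a small low-rank update (W x + B A x), LoRA-style. This is not
    the OpenDelta API, just the general idea of tuning a few extra parameters."""

    def __init__(self, base_linear: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():   # the pretrained weights stay frozen
            p.requires_grad = False
        in_f, out_f = base_linear.in_features, base_linear.out_features
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # trainable delta
        self.B = nn.Parameter(torch.zeros(out_f, rank))        # zero-init: start from the base model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ (self.B @ self.A).T

# Only the delta parameters are optimized; the backbone is untouched.
layer = LowRankDelta(nn.Linear(768, 768), rank=8)
trainable = [p for p in layer.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable delta parameters")
```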

Across this forum, the speakers all emphasized the importance of knowledge for language models in the era of large models: the training and development of language models, especially large-scale ones, is inseparable from the support of large amounts of knowledge, and this appears to be the general trend.
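
Relatedly, in the LMaaS setting from the fourth report the served model is reachable only through forward API calls, so adaptation has to be gradient-free. The sketch below is a deliberately simple stand-in for that constraint: it scores a handful of candidate prompts through a hypothetical `query_score` function and keeps the best one. The function name, prompts, and scoring are assumptions for illustration; Black-Box Tuning itself optimizes continuous prompt embeddings with derivative-free methods rather than picking from a fixed list.

```python
from typing import Callable, List, Tuple

def pick_prompt(query_score: Callable[[str], float],
                candidates: List[str]) -> Tuple[str, float]:
    """Gradient-free prompt selection against a black-box LM service.

    query_score(prompt) stands in for an API call that returns how well the
    served model performs on a small validation set when using `prompt`.
    Only forward queries are made; no gradients from the model are needed."""
    best_prompt, best_score = candidates[0], float("-inf")
    for prompt in candidates:
        score = query_score(prompt)      # one (or a few) API calls per candidate
        if score > best_score:
            best_prompt, best_score = prompt, score
    return best_prompt, best_score

# Usage with a toy stand-in scorer (longer prompts "win" here, purely for demo):
demo_candidates = [
    "Review: {text} Sentiment:",
    "Is the following review positive or negative? {text}",
]
prompt, score = pick_prompt(lambda p: float(len(p)), demo_candidates)
print(prompt, score)
```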


Report 2 - Research on Modern Text Summarization Technology

In recent years, the emergence of pre-trained language models has greatly advanced natural language processing. What technological changes has text summarization, one of the most classic NLP tasks, undergone? Factual consistency and low-resource settings have become new research hotspots; at the same time, Internet companies such as Google and Amazon have launched online summarization services for different domains, creating new application scenarios and setting off a new wave of exploration. This "Research on Modern Text Summarization Technology" forum hopes not only to discuss the latest issues in summarization technology, but also to dig deeper into how to formulate scientific problems and build distinctive method models with the support of large-scale model technology. To this end, five guests were invited to discuss the topic from multiple dimensions: natural language generation technology, scientific literature summarization, factual consistency of summaries, dialogue summarization, and low-resource text summarization technology.

This forum was co-chaired by Professor Qin Bing from Harbin Institute of Technology and Professor Wan Xiaojun from Peking University. It consisted of 5 sub-reports, given by Professor Huang Minlie from Tsinghua University, Dr. Xiao Xinyan from Baidu, Mr. Feng Xiaopin, Associate Professor Gao Yang, and Associate Professor Yan Rui from Renmin University of China.

  • In the first report, Professor Huang Minlie started from ChatGPT and gave his report on "the future of natural language generation". The outline is as follows:

    • NLG Challenges and Opportunities

      • Challenges:
        1. Model performance is hard to improve further
        2. Timeliness of models (model iteration is speeding up)
        3. Growing costs in resources and time
      • Opportunities:
        1. New tasks and application scenarios
        2. New generation methods
    • Universal LM (understanding of pre-trained language models)

      • Understanding: Explore the knowledge learned by the pre-trained language model to better complete downstream tasks
      • Universality: Interpretability of the model - why pre-trained language models can adapt to many downstream tasks
      • Reliability: Prompt exploration, choose the best prompt
    • Long text generation: open problems include (1) controllability, (2) repetition, (3) coherence, (4) conflict

    • Non-Autoregressive Text Generation (NATG): all output tokens are decoded in parallel, which speeds up inference, avoids exposure bias, and allows more flexible decoding. Future direction: machine translation ➡️ general text generation such as dialogue generation

    • Evaluation (evaluation of text generation)

    • Summary:

      • Large-scale online deployment of generative models faces two bottlenecks: (1) computing power consumption, (2) decoding speed
      • Security and controllability issues remain to be resolved: detection algorithms, and making generation safer

Here he recommended trying out the "AI Utopia" mini program, a personified AI creation engine.


  • In the second report, Dr. Xiao Xinyan from Baidu gave a report on "Reliable Text Generation for Factual Consistency", treating factual consistency as the core of reliability and looking ahead to methods for reliable text generation and for evaluating it.

  • In the third report, Mr. Feng Xiaopin delivered a report on "Research on Knowledge-Guided Dialogue Summarization Technology". Conversation summarization already has industrial applications, such as Google: Conversation Summary, Amazon: Call Summarization, Microsoft: Call Summarization, and Headroom: Meeting Summarization. The report positioned dialogue summarization relative to open-ended text generation along the dimension of text length, and summed it up as: dialogue summarization = conversation understanding + summary generation (a minimal usage sketch is given after this list).

    Example conversation summary (shown as figures in the report): the input text is a multi-turn conversation and the output text is its summary.
    Unlike traditional text summarization, dialogue summarization faces challenges such as dialogue as the source content, data scarcity, dialogue modeling, and scene understanding. The report also mentioned the topic-drift phenomenon: a long input dialogue may contain multiple topics, so the topics need to be distilled first.

    Future directions: multimodal, multi-domain, multilingual, and reliable dialogue summarization

  • In the fourth and fifth reports, Associate Professor Gao Yang and Associate Professor Yan Rui spoke on "Low-Resource-Oriented Text Summary Generation Technology" and "Learning towards Abstractive Text Generation", respectively. Since the research work introduced in these two reports is more specific, it is not recorded in detail here.
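
As a small illustration of the dialogue-summarization pipeline mentioned in the third report (conversation understanding + summary generation), here is a hedged sketch using the Hugging Face transformers pipeline with a generic pretrained summarizer. The model choice and example dialogue are assumptions for illustration only, not the knowledge-guided system described in the talk.

```python
# Minimal sketch: summarizing a short dialogue with a generic pretrained
# summarization model. facebook/bart-large-cnn is a general-purpose
# summarizer used here purely for illustration.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

dialogue = (
    "Alice: Can we move tomorrow's review meeting to 3 pm? "
    "Bob: Works for me, I'll update the invite. "
    "Alice: Thanks, please also attach the latest design doc."
)

# The dialogue is flattened into plain text; real dialogue summarizers also
# model speakers, turns, and topic drift, as discussed in the report.
summary = summarizer(dialogue, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```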


In addition, I learned a new concept from other reports at the conference: MLOps. MLOps is short for Machine Learning Operations, an engineering discipline that aims to unify ML system development (Dev) and ML system operations (Ops) in order to standardize production processes and continuously deliver high-performing models. See this blog for details.
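
As a toy illustration of one MLOps idea (an automated quality gate inside a continuous-delivery loop for models), here is a hedged sketch; the function names, metric, and threshold are hypothetical and not taken from any specific MLOps tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ModelCandidate:
    name: str
    metrics: Dict[str, float]   # e.g. {"rouge_l": 0.41}

def quality_gate(candidate: ModelCandidate, baseline: ModelCandidate,
                 metric: str = "rouge_l", min_gain: float = 0.0) -> bool:
    """Hypothetical MLOps-style gate: promote a new model to deployment only
    if it beats the current production baseline on the chosen metric."""
    return candidate.metrics[metric] >= baseline.metrics[metric] + min_gain

def release(candidate: ModelCandidate, baseline: ModelCandidate,
            deploy: Callable[[ModelCandidate], None]) -> None:
    """One step of a continuous-delivery loop: evaluate, gate, then deploy."""
    if quality_gate(candidate, baseline):
        deploy(candidate)        # e.g. push to a model registry or serving layer
    else:
        print(f"{candidate.name} rejected: does not beat {baseline.name}")

# Usage with stand-in values:
baseline = ModelCandidate("summarizer-v1", {"rouge_l": 0.39})
candidate = ModelCandidate("summarizer-v2", {"rouge_l": 0.41})
release(candidate, baseline, deploy=lambda m: print(f"deploying {m.name}"))
```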


References

  1. 2022 China Computer Conference (CNCC 2022)
  2. 2022 China Computer Conference (CNCC 2022) Conference Manual
  3. Demystifying the mechanism behind large models: Tsinghua's 49-page article comprehensively analyzes the parameter-efficient fine-tuning scheme Delta Tuning (Tencent Cloud Developer Community, tencent.com)
  4. "Language Model as a Service" must-read papers (Zhihu, zhihu.com)
