Liu Zhiyuan, Tsinghua University: Where do good research ideas come from?

(This article is reproduced from Liu Zhiyuan's Zhihu column. Original link: https://zhuanlan.zhihu.com/p/93765082 )

1. Background description

As the ACL 2020 submission deadline approaches, I have been in intensive discussions with my students about which research ideas are suitable for submission to ACL and stand a chance of being accepted. In my more than ten years of research experience, how to judge whether a research idea is good, and where good ideas come from, has indeed been a hard problem for beginners. So I put together this short article to share some experience and thoughts, hoping it will be useful to students who have just entered the field of NLP. Please correct me if there are any mistakes.

There is a classic duel in Wong Kar-wai's film "The Grandmaster". President Gong says to Ip Man (Ye Wen): "Today we will not compete in martial arts; we will compete in ideas." Indeed, a good idea is the soul of an excellent piece of research. In the computer field there is a popular saying, "IDEA is cheap, show me the code", which shows that in a discipline that values practice as much as computing does, the quality of an idea depends on how well it actually works. Let's talk about where good research ideas come from.

2. What counts as a good idea

In 2015, I wrote a joke on Weibo:

The ML School is located in the mountains of the United States of America. Over the past century it has produced martial arts prodigies in great numbers and has become the most famous orthodox school in the martial world. The school teaches three sets of introductory moves: adding nodes to graphical models, adding layers to neural networks, and adding regularizers to optimization objectives. A nursery rhyme bears witness: master the introductory ML moves, and even if you cannot write a paper you can still cobble one together.

In 2018, I added a sequel:

Within a few years, the Northern DL Sect rose suddenly. Internally it cultivates representation learning, externally it trains neural networks, and its mental methods are many: gating, attention, memory, adversarial training, and reinforcement. After the Battle of ImageNet it shook the martial world, and the AlphaGo it raised was unstoppable. For a time every household built an alchemy furnace and everyone was busy refining elixirs; disciples gathered and followers were many. A nursery rhyme bears witness: big data in the left hand, Nvidia in the right, busy with alchemy whenever a top conference comes around.

Adding nodes to graphical models, layers to neural networks, and regularizers to optimization objectives, as well as gating, attention, and memory in neural networks, are all innovative ideas for improving model performance. They have been widely applied across the major NLP tasks and published as papers. Perhaps because they have been reused and published again and again on different NLP tasks, they cause a kind of aesthetic fatigue and come to seem lacking in deeper innovation. Some netizens and scholars have criticized such work as "water-filling" (padding out publication counts), as if these were not good ideas.

So what is a good idea? I understand that the word "good" has at least two levels of meaning.

3. "Good" from the perspective of disciplinary development

The essence of academic research is the exploration of unknown territory and the search for answers to open questions. Therefore, from the perspective of advancing the discipline, the standard for judging a good research idea lies first of all in the word "new".

There is an old saying that artificial intelligence is under a curse: any part of AI that gets solved (or acquires a workable solution) is no longer regarded as representing "human intelligence". Computer vision, natural language processing, machine learning, and robotics are still listed as the main directions of artificial intelligence, perhaps precisely because they remain unsolved and can still stand for the dignity of "human intelligence". Doing innovative research means proposing new ideas to solve these problems. The "new" here can take the form of posing new problems and tasks, exploring new solutions, proposing new algorithms and techniques, or building new tools and systems.

Given that an idea is "new", whether it is good depends on how much it contributes to the development of the discipline. Deep learning has such prominent influence because it has had a revolutionary impact on all the important directions of artificial intelligence, such as natural language processing, speech recognition, and computer vision. It has completely changed the technical approach to the semantic representation of unstructured signals (speech, images, text).

4. "Good" from the perspective of research and practice

So is it enough for an idea to be "new"? Is newer always better? I don't think so, because only ideas that can actually be carried out are worth judging as good or bad. Therefore, from the perspective of research practice, we must also consider the feasibility and verifiability of a research idea.

Feasibility means that there are enough mathematical or machine learning tools to support realizing the idea. Verifiability means that the idea has suitable datasets and widely accepted evaluation criteria. The ideas of many "folk scientists" (amateur theorists outside academia) are not recognized by the academic community precisely because they often lack feasibility and verifiability. They remain castles in the air, ideas and nothing more.

5. Where do good research ideas come from

Whether an idea is good is not a black-and-white dichotomy but a continuous spectrum, one that varies with the times and from person to person. The development of computer science and technology involves both gradual accumulation and sudden leaps. Only quantitative accumulation brings qualitative change. If the third steamed bun is the one that fills you up, it is because the first two laid the foundation.

Academic research has now become a highly specialized profession with a large community of practitioners. "Publish or Perish" is something everyone in an academic career (professors, researchers, graduate students) must learn to balance. We cannot demand that every piece of a researcher's work be at the "Nobel Prize" or "Turing Award" level before it is worth publishing. As long as a piece of work contributes to the development of the research field, it is worth publishing to help colleagues move forward. Lu Xun said: geniuses are not monsters that grow up by themselves in deep forests and wildernesses; they are produced and nurtured by a public that allows genius to grow, so without such a public there will be no geniuses. This huge community of researchers is the mass foundation from which geniuses grow. At the same time, newcomers to academia are being trained through innovative research, constantly honing their ability to find good ideas. Lu Xun also said: even a genius's first cry at birth is the same as an ordinary child's; it is never a fine poem.

So, where do good research ideas come from? In my view, we must first have the ability to tell good research ideas from bad ones. This requires a thorough and comprehensive understanding of the history and current state of the research direction, specifically a comprehensive grasp of its literature. Humans are the animals best at learning. We can take the ideas behind research work from different periods in the existing literature as learning examples, observe their impact on the field after they were proposed (reflected in paper citations, academic evaluations, and so on), and build an evaluation model of good and bad research ideas. It is hard for us to enumerate all the features that distinguish good ideas from bad ones, but the human brain's powerful learning ability can, given enough input data, automatically learn a discriminative model in its own neural network, so that we learn from the past to understand the present and see large trends in small signs. This may be what is often called academic insight.

Students who have done some research will find that reading only the literature of their own research direction still does not yield many new ideas. This is because what they read are ideas for research problems that have already been worked out, and these by themselves do not necessarily inspire new ones. How, then, do we generate new ideas? I would summarize three basic approaches:

Practice method. That is, implement the best existing algorithms for the research task. Analyzing the experimental results, for example finding that these algorithms have extremely high computational complexity or converge very slowly in training, or that their errors show obvious patterns, can inspire ideas for improving them. The latest algorithms on the leaderboards of many natural language processing tasks were obtained by analyzing error samples and improving the algorithms in a targeted way [1].

Analogy method. That is, draw an analogy between the research problem and other tasks, survey the latest effective ideas, algorithms, or tools on those similar tasks, and apply them to the current research problem through reasonable adaptation and transfer. For example, the attention mechanism was a great success in neural machine translation, where attention was at the time mainly computed at the word level. Later, Lin Yankai and Shen Shiqi of our research group proposed sentence-level attention to address the label-noise problem in distantly supervised training data for relation extraction [2]. This is an analogy.

Combination method. That is, decompose the new research problem into several sub-problems that are already well solved, and build a solution to the new problem by organically combining the best practices on these sub-problems. For example, the knowledge-graph-enhanced pre-trained language model we proposed is a new model built by fusing existing methods such as BERT and TransE [3].
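To make the analogy example concrete, here is a minimal sketch of sentence-level attention over the sentences that mention one entity pair. It is not the authors' implementation from [2]. It assumes a bilinear score e_i = x_i A r with a diagonal weight matrix A and a relation query vector r; the function name, the toy vectors, and the weights are hypothetical, and in practice the sentence vectors would come from a trained sentence encoder.

```python
import math


def selective_attention(sentence_vecs, relation_query, diag_weights):
    """Return attention weights and the attention-weighted bag vector.

    sentence_vecs:  list of sentence vectors x_i (lists of floats)
    relation_query: query vector r for one candidate relation
    diag_weights:   diagonal of the weight matrix A (assumed diagonal)
    """
    # Score each sentence against the relation query: e_i = x_i * A * r.
    scores = [
        sum(x * a * r for x, a, r in zip(vec, diag_weights, relation_query))
        for vec in sentence_vecs
    ]
    # Softmax over sentences, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag representation: attention-weighted sum of the sentence vectors.
    dim = len(relation_query)
    bag_vec = [
        sum(w * vec[d] for w, vec in zip(weights, sentence_vecs))
        for d in range(dim)
    ]
    return weights, bag_vec


# Hypothetical toy bag: two sentences that express the relation, one noisy one.
sentences = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.0]
diag = [2.0, 2.0]
weights, bag = selective_attention(sentences, query, diag)
```

In this toy bag, the third sentence scores lowest against the query and so receives the smallest weight, which is how noisy sentences are down-weighted rather than discarded.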

Just as the highest level of martial arts is that having no moves beats having moves, good research ideas are not limited to the paths above. In many cases they are the result of an "epiphany" that draws on the researcher's deep understanding of the problem, rich research experience, and ingenuity. Beginners will probably find it hard to see the way in at first. You need to start from the fundamentals, and only after a great deal of research training and practice will you feel that you have truly entered the field.

In research practice, besides learning the history through extensive literature reading and developing insight through deep thinking and summarization, one more thing is indispensable: an active, open attitude toward academic exchange and collaboration. The exchange and collision of ideas and results from different research fields both provides new sources of innovative ideas and creates opportunities for "analogy" and "epiphany". Anyone familiar with the history knows that artificial intelligence itself emerged from the intersection of mathematics, computer science, cybernetics, information theory, brain science, and other disciplines. Parallel Distributed Processing (PDP) in the 1980s, the origin of today's popular deep learning, was likewise the product of collaboration among researchers in computer science, brain and cognitive science, psychology, biology, and other fields. Below is the cover of the first volume of the famous book "Parallel Distributed Processing: Explorations in the Microstructure of Cognition", published in 1986.

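[Figure 3456.jpg: cover of "Parallel Distributed Processing: Explorations in the Microstructure of Cognition", Volume 1]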

In the preface, the authors describe how they collaborated. For the first six months, they met twice a week to discuss research progress:

We expected the project to take about six months. We began in January 1982 by bringing a number of our colleagues together to form a discussion group on these topics. During the first six months we met twice weekly and laid the foundation for most of the work presented in these volumes.

The list of PDP research group members given in the book still amazes me, 40 years later, with how strongly cross-institutional and cross-disciplinary it was. I therefore especially recommend that students, while staying focused on their research problems, keep up an active habit of academic exchange during their research training. Whether by attending talks, going to academic conferences, or taking elective courses, consciously broaden the range of your academic contacts. Talk not only with peers in your own narrow subfield, but also with academic partners in research fields that seem entirely unrelated. As your research experience grows, you will feel more and more strongly that the wider the span of the academic talks you attend, the more inspiration you get, and the more research ideas you produce that excite you.

[Figure 3457.jpg: list of members of the PDP research group]

6. What should beginners do

Compared with reading papers, writing papers, and designing experiments, generating good research ideas is the step with the fewest rules to follow, and it is hard to distill into a fixed paradigm. Like the pony crossing the river in the fable, you have to try it yourself, and it takes a great deal of training and practice to accumulate your own research experience. Still, there are a few simple and workable principles that beginners can refer to.

The publishable value of a paper depends on the delta between it and the most directly related work. Most of our research builds on the work of our predecessors. Newton said: if I have seen farther than others, it is because I stood on the shoulders of giants. In my opinion, to judge the value of the research idea in a paper is to look at which giant or giants it stands on and how much farther it has gone from there. Conversely, before starting a piece of research, while the idea is still taking shape, you should make clear which giant you intend to stand on and how you plan to go further. The delta relative to the most directly related existing work determines the value of the research idea.

Balance picking fruit with gnawing on bones. Research ideas that are easy to think of are generally called low-hanging fruit. Low-hanging fruit is easy to pick, but many people are picking it at the same time, so if you choose to pick fruit you will easily run into idea collisions. For example, pre-trained language models, led by BERT, made a major breakthrough in 2018, and a large amount of follow-up improvement work appeared in mid-2019. Take cross-modal pre-training as an example: in just a few months, more than six pre-trained models fusing images and text, from different teams, were posted on arXiv (http://arxiv.org) [4]. Put yourself in their position: cross-modal pre-training is a direction that is easy to think of. You need the foresight to know that many teams around the world will be doing the same research at the same time, so if you choose to enter the field, your work must go deeper, be more distinctive, and make its own unique contribution. By comparison, fewer people are willing to take on the hard problems, so gnawing on hard bones is also a good choice. Of course, you then face the risk of not being able to solve the problem, or of not getting much attention when you do. Students need to balance the two kinds of research ideas according to their own strengths, experience, and needs.

[Figure 3458.jpg]

Pay attention to the thematic continuity of your research projects. A student's research training usually lasts several years, so it is necessary to keep successive research topics connected and to maintain a unified internal logic. Consider whether you can put these results together on your resume, in the Personal Statement when applying abroad, or in award applications and presentations, and state the overall goal and overall approach behind them. Objectively speaking, research in artificial intelligence moves quickly and the technology is updated fast, so published results also tend to be small, short, and quick. I have friends in business schools and the social sciences whose research projects often take a year or even several years; research cycles in high-performance computing and computer networks are also relatively long. The "small steps, fast pace" character of artificial intelligence means that many students publish multiple papers even by the time they finish their undergraduate degrees, not to mention master's and doctoral students. In this situation it is especially necessary, when choosing research topics, to pay attention to the continuity and coherence between earlier and later work. When several pieces of research are put side by side, whether they are disconnected from one another or are all working toward one unified goal particularly reflects the researcher's sense of the big picture and ability to plan. For example, the picture below shows the chapter structure of the doctoral dissertation "Network Representation Learning for Social Computing" by Dr. Tu Cunchao of our research group, who graduated in 2018. On the whole, it is more convincing than a title such as "Several Important Issues of Social Computing", whose parts have no inherent connection.
Of course, beginners cannot be expected to think through a five-year research plan clearly from the very start. But whether or not you think about it does make a difference to the result.

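[Figure 3459.jpg: chapter structure of the doctoral dissertation "Network Representation Learning for Social Computing"]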

Pay attention to summarizing and grasping research dynamics and trends, and move with the times. In 2019, the following question was asked on Zhihu: "In the field of NLP in 2019, what valuable and promising work can individuals/teams with limited resources do?" My answer at the time was as follows:

I feel that when industry starts tackling a problem at corporate scale, it indicates that the main open challenges in it have largely been solved, as with language recognition and face recognition, which have gradually entered wide commercial use over the past 20 years. As for the recent BERT and GPT-2, my understanding is that they mostly push deep learning's ability to fit large-scale data to its limit. Given that the deep learning technical route is basically mature, large companies with strong computing power can use more data, build bigger models, and achieve better fitting results.

When a mature technology enters commercial competition, its development roughly follows something like Moore's Law. Training BERT and similar models may seem out of reach now, but as computing power and other factors develop, perhaps in a few years everyone will be able to train BERT and GPT-2 easily. By then everyone will be on the same starting line again, and attention will have shifted to the next challenging problem.

So it is better to consider in advance which problems cannot be solved by purely data-driven techniques. Hard tasks in NLP and AI, such as commonsense and knowledge reasoning, complex-context and cross-modal understanding, and explainable intelligence, still have no feasible solutions, and I personally am not optimistic that data-driven methods can solve them completely. Higher-level cognitive abilities such as association, creativity, and insight remain even further out of reach. These are the directions that far-sighted researchers should start paying attention to.

Note that research dynamics and trends differ from period to period. Grasping them lets you produce results that interest the research community. Otherwise, the very same research result can meet a very different reception depending on whether it appears a few years earlier or later. For example, after word2vec was published in 2013, research on word representation learning in 2014-2016 was relatively easy to get accepted at conferences such as ACL and EMNLP; but by 2017-2018, work on word representation learning had become relatively rare at ACL and similar conferences.

7. Final supplement

This short article is mainly meant to introduce beginners to some experience and caveats in the search for new ideas, in the hope that everyone can avoid a few detours. But reading the literature, thinking deeply, receiving rejections, and improving continuously are hardships you still have to go through. Academic research and publishing may mean high salaries and scholarships for the individual, but the ultimate goal is to genuinely advance the discipline. Therefore, the key to doing academic research that stands the test of time lies in "true" and "new", which we must always uphold and diligently pursue.

The famous historian and Tsinghua alumnus Mr. He Bingdi mentioned in his autobiography "Sixty Years of Reading History and Reading the World" a piece of advice from the famous mathematician Lin Jiaqiao: "The important thing is that no matter which field you are in, you should never work on second-rate problems." What counts as a first-rate problem in each field is a matter on which opinions differ, but the advice ultimately points to an inner attitude of "seeking truth".

 

References

[1] https://paperswithcode.com/ & http://nlpprogress.com/

[2] Yankai Lin, Shiqi Shen, Zhiyuan Liu, Huanbo Luan, Maosong Sun. Neural Relation Extraction with Selective Attention over Instances. The 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).

[3] Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, Qun Liu. ERNIE: Enhanced Language Representation with Informative Entities. The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).

[4] https://github.com/thunlp/PLMpapers

 
