An MIT Ph.D. student's experience: what should you pay attention to when doing AI research?

Translation | reason_W

Editor | Aspirin

Produced | AI Technology Base Camp (Public ID: rgznai100)

AI research is so hot right now, so why am I still lost in a sea of formulas? There are so many subfields; where should I start, and what should I learn? Those big names were publishing at top conferences as undergraduates, so why have I been a master's/doctoral student for a year without even scratching the surface of real research? How can I find a research path that suits me? Don't worry: even a Ph.D. student at MIT goes through the same mental journey as you.

The following article describes the research experience of Tom Silver, a second-year Ph.D. student at MIT. Tom Silver graduated from Harvard University with a degree in computer science and mathematics, and has interned at several well-known AI labs and companies, including Sabeti Lab, Google, and Vicarious. I believe his experience will give you some inspiration.

The following content is translated by AI Technology Base Camp:

A friend of mine is about to start AI research, and I happened to start two years before him, so he has recently been asking me about my research experience. This article is a summary of those two years; it includes both everyday life lessons and research skills. I hope it helps readers and friends alike.

▌Getting Started

Find the right people to ask "stupid questions"

When I first started doing research, I was often afraid to ask my colleagues for advice, for fear that my questions would sound too unprofessional and I would be looked down on. It took several months for this to ease, and even then I was still very cautious. But now I have a few close confidants with whom I can discuss problems directly. I wish I had found them sooner!

In the past, when I ran into a problem I would google it directly, and the links and information on the screen often left me confused. Now, whenever I hit a problem, I can raise it and discuss it with others instead of struggling through it alone.

Find research inspiration in different places

Deciding what to do next can often be the most difficult part of a research career for many people. Here are a few strategies that are commonly used by researchers:

  • Communicate with people from different fields of study. Ask them what questions interest them, and try to rephrase those questions in computational terms. Ask them whether they have datasets they want to analyze but that are difficult to handle with existing techniques. Much of the most impactful work in machine learning comes from collisions between computing and biology/chemistry/physics, the social sciences, or pure mathematics. For example, the paper by Matthew Johnson et al. at NIPS 2016 (Composing graphical models with neural networks for structured representations and fast inference) was inspired by a mouse behavior dataset; another example is the paper by Justin Gilmer et al. at ICML 2017 (Neural Message Passing for Quantum Chemistry), which applies machine learning methods to quantum chemistry.

  • Write a simple baseline to get a feel for the problem. For example, try writing some carefully calibrated code that controls an inverted pendulum, or try implementing a bag-of-words model on a natural language dataset. When writing baselines, I often run into unexpected situations: errors in my mental model or in the code. Even when my baseline works, I usually try some other ideas to build a better understanding of the problem.
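For the bag-of-words baseline mentioned above, a minimal sketch in plain Python might look like the following. The example documents and the cosine-similarity comparison are illustrative additions of mine, not from the original article:

```python
from collections import Counter

def bag_of_words(text):
    """Map a document to word counts, ignoring case and punctuation."""
    tokens = "".join(c if c.isalnum() else " " for c in text.lower()).split()
    return Counter(tokens)

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = lambda v: sum(n * n for n in v.values()) ** 0.5
    return dot / (norm(a) * norm(b)) if a and b else 0.0

doc1 = bag_of_words("The robot picks up the block.")
doc2 = bag_of_words("A robot picks up a red block.")
print(f"similarity: {cosine_similarity(doc1, doc2):.3f}")
```

Even a baseline this small surfaces the kinds of surprises the author describes: tokenization choices, casing, and punctuation all silently change the counts.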

  • Extend the experimental section of a paper you like. Read the methods and results of the paper carefully, and try to find its most valuable part. First, consider the simplest possible extensions and ask whether the paper's approach still applies. Then consider the baseline methods the paper does not discuss, and think about where those methods might fail.

Master visualization tools and skills

When writing code, my usual strategy is to start by creating a visualization script. After I finish writing the rest of the code, the visualization script helps me quickly verify that the code matches my mental model. More importantly, good visualization often makes it easier for me to spot errors in my thinking or my code than other methods do. Another reason is self-motivation: every time I finish a piece of code, I can show everyone a nice diagram or video!

Of course, choosing the right visualization for the problem at hand may take some skill. If you are training an iteratively optimized model (such as a deep network), you can start by plotting the loss curve. There are also many techniques for visualizing and interpreting the learned weights of (especially convolutional) neural networks, such as guided backpropagation.
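As a concrete instance of the loss-curve idea, here is a sketch of the exponential-moving-average smoothing that dashboards like TensorBoard apply to noisy training losses. The loss values below are synthetic, purely for illustration:

```python
import math

def ema_smooth(values, alpha=0.9):
    """Exponential moving average, as commonly used to smooth noisy loss curves."""
    smoothed, prev = [], values[0]
    for v in values:
        prev = alpha * prev + (1 - alpha) * v
        smoothed.append(prev)
    return smoothed

# Synthetic noisy loss curve: a decaying trend plus an oscillation.
raw = [1.0 / (1 + 0.1 * t) + 0.05 * math.sin(3 * t) for t in range(100)]
smooth = ema_smooth(raw)
print(f"final raw loss {raw[-1]:.3f}, smoothed {smooth[-1]:.3f}")
```

The smoothed curve makes the downward trend obvious at a glance, which is exactly the "conclusion jumps out" property a good visualization should have.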

In reinforcement learning and planning, the thing to visualize is obvious: the behavior of an agent in its environment, whether an Atari game, a robotics task, or a simple Grid World (such as the environments in OpenAI Gym). Depending on the setting, we can also visualize the value function and how it changes during training (shown below), or visualize the tree of states explored so far.

When working with graphical models, visualizing how the distribution of a 1D or 2D variable changes during inference is very informative (see below). One way to gauge the effectiveness of a visualization technique is to estimate how much information you need to keep in your head each time you analyze it. A bad visualization forces you to re-examine the code you wrote in detail, while a good one lets the conclusion jump out on its own.

TensorBoard is a popular GUI for visualizing TensorFlow deep learning models.

Plotting distributions as evidence accumulates makes debugging graphical models easier (from Wikimedia).
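The evidence-accumulation idea in the caption above can be sketched with a conjugate Beta posterior over a coin's bias: as flips accumulate, the distribution sharpens. The flip data below are synthetic, chosen only to show the effect:

```python
def beta_mean_std(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, var ** 0.5

# Start from a uniform prior Beta(1, 1) and accumulate coin-flip evidence.
alpha, beta = 1.0, 1.0
flips = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1] * 5  # 1 = heads (synthetic data)
for i, outcome in enumerate(flips, 1):
    alpha += outcome
    beta += 1 - outcome
    if i in (1, 10, 50):
        mean, std = beta_mean_std(alpha, beta)
        print(f"after {i:2d} flips: mean={mean:.3f}, std={std:.3f}")
```

Printing (or plotting) the posterior at checkpoints like this is a one-glance check that inference is behaving: the mean should drift toward the empirical frequency while the spread shrinks.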

A value function learned through Q-learning can be visualized over the Grid World it represents (by Andy Zeng).
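The kind of value-function grid described in the caption above can be reproduced on a toy problem. The sketch below uses value iteration rather than Q-learning (a simplification of mine) on a hypothetical 4x4 Grid World with a goal in the top-right corner; printing the values as a grid is the simplest form of this visualization:

```python
# Value iteration on a 4x4 Grid World with a reward for reaching the top-right corner.
N, GOAL, GAMMA = 4, (0, 3), 0.9
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

V = [[0.0] * N for _ in range(N)]
for _ in range(100):  # iterate the Bellman backup to convergence
    for r in range(N):
        for c in range(N):
            if (r, c) == GOAL:
                V[r][c] = 0.0  # terminal state
                continue
            best = -float("inf")
            for dr, dc in ACTIONS:
                # Moves off the grid leave the agent in place.
                nr, nc = max(0, min(N - 1, r + dr)), max(0, min(N - 1, c + dc))
                reward = 1.0 if (nr, nc) == GOAL else 0.0
                best = max(best, reward + GAMMA * V[nr][nc])
            V[r][c] = best

# Print the value "heatmap": values rise toward the goal cell.
for row in V:
    print(" ".join(f"{v:.2f}" for v in row))
```

A glance at the printed grid confirms the expected structure: values decay geometrically (by the discount factor) with distance from the goal, and any cell that breaks the pattern points directly at a bug.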

Learn to identify the basic starting points of researchers and papers

While many researchers publish at the same conferences, use the same terminology, and all claim to work in artificial intelligence, their motivations may be nearly opposite. Some have even proposed renaming the field to address this (Michael Jordan, in a recent article, called for a name change: https://medium.com/@mijordan3/artificial-intelligence-the-revolution-hasnt-happened-yet-5e1d5812e1e7). There are at least three main perspectives in the field: the "mathematical", the "engineering", and the "cognitive".

  • From a "mathematical" perspective: what are the fundamental properties and limitations of intelligent systems?

  • From an "engineering" perspective: how can we develop intelligent systems that can better solve real-world problems?

  • From a "cognitive" perspective: how should we model the natural intelligence found in humans and other animals?

These starting points do not conflict, and many interesting pieces of work in AI are interesting from every perspective. Moreover, individual researchers are often motivated by more than one of these angles, which helps hold the AI field together.

Of course, the starting points may also diverge. I have friends and colleagues who are clearly focused on the "engineering" angle, while others are mainly interested in "biology". An article showing that some clever combination of existing techniques beats the state of the art on a benchmark will likely interest the engineers, while the cognitive scientists may have no interest at all, or may even scoff at it. An article that argues for biological plausibility or cognitive connections but remains theoretical, or lacks hard results, may draw exactly the opposite responses from the two groups.

Good papers and good researchers state their starting point at the outset, but the underlying motivation can be buried deep. When the starting point is not obvious, it helps to analyze the article from each perspective in turn.

▌Drawing nourishment from the research community

Find papers

There are tons of AI papers on arXiv, all free to view. Besides the rapidly growing number of papers, the large, active community also makes it easier to find the high-quality ones. Fei-Fei Li's student Andrej Karpathy built the arXiv sanity preserver, which helps us sort, search, and filter relevant articles. Miles Brundage tweets a curated list of arXiv papers almost every night; much of that work is done by the Brundage Bot. Many other Twitter users also share interesting references from time to time; I suggest following researchers you are interested in on Twitter.

If you like Reddit, consider r/MachineLearning, though the posts there tend to be geared more toward machine learning engineers than academic researchers. Jack Clark publishes a weekly newsletter called "Import AI" (https://jack-clark.net/), and Denny Britz has a column called "The Wild Week in AI".

The proceedings of the major AI conferences are also worth following. The three top conferences in machine learning are NIPS, ICML, and ICLR. Other conferences include AAAI, IJCAI, and UAI, and each subfield has its own more specialized venues: in computer vision, CVPR, ECCV, and ICCV; in natural language processing, ACL, EMNLP, and NAACL; in robotics, CoRL, ICAPS, ICRA, IROS, and RSS; for more theoretical work, AISTATS, COLT, and KDD. These conferences are currently the main channel for publishing AI papers, but there are also journals: JAIR and JMLR are the two most important in the field. Occasionally, high-profile articles appear in scientific journals such as Nature and Science.

Tracking down classic papers is just as important, but harder. The names of those classics often appear in the references of many articles or on the reading lists of graduate courses. Another way to discover them is to start from senior professors in the field and look at their earlier work, i.e., their research trajectories; you can also email these professors to ask for more references (and don't take it personally if they are too busy to reply). For older articles, a keyword search on Google Scholar works well.

How much time should you spend reading papers?

I often hear two pieces of advice about how much time to spend on previous research. First: if you are just starting out, read all the papers! People often say that a graduate student's first semester or first year should be spent doing nothing but reading papers. Second: once you have an initial understanding of the field, do not spend too much time reading! The rationale for the latter is that researchers find it easier to frame and solve problems in creative ways when they are not influenced by previous approaches.

I personally agree with the first suggestion and disagree with the second. In my opinion, researchers should read as many papers as possible, as long as they protect time for original research. "If I'm not familiar with what others have tried, it will be easier for me to devise a newer, better solution to a hard problem" — that seems unlikely, and a little arrogant. Yes, a fresh perspective matters, and many stories of amateur breakthroughs come from thinking outside the box. But as professional researchers, we cannot rely on luck alone to stumble onto solutions. For the vast majority of a research career, we solve problems patiently, step by step, and methodically. Reading related papers is an efficient way to learn where we stand and what to try next.

On the subject of reading as many papers as possible, one reminder: taking time to digest a paper is just as important as reading it. It is better to read a few papers and carefully take notes on and reflect about each one than to devour them one after another.

Conversation >> video > paper > conference talk

To get up to speed on an unfamiliar research topic, reading papers is of course the most obvious approach, but is it the most efficient? Different people will answer differently; for me, talking to others (ideally, someone who already understands what I'm wrestling with) is by far the fastest and most effective way to understand. If no such person is around, watching a video on the topic, such as an invited talk by the paper's author, is also an excellent option. When speakers address a live audience, they prioritize clarity over precision; in most paper writing, those priorities are reversed. In a paper, word count matters (authors cannot devote much space to clarifying a single concept), and an imprecise presentation of background can make the author appear ignorant of the field. Finally, short conference presentations are often more a formality than a source of insight, though chatting with the speaker afterward can be very valuable.

Beware of Hype

Success in artificial intelligence has drawn public attention and pulled more people into the field. The effects of this cycle are mostly benign, but it has one harmful side effect: hype. The media want more clicks; tech companies want to attract investors and recruit employees; and researchers want to boost the visibility and citation counts of their papers. All of this makes the hype worse and worse. So when you see a media headline or a paper title, think about the incentives behind it and beware of clickbait.

During a paper Q&A session at NIPS 2017, hundreds of attendees heard a rather prominent professor (pushing back against the hype) take the microphone and admonish the authors to be careful about using the word "imagination" in their title. I have mixed feelings about this kind of semi-public objection, and I happen to like that particular paper, but I completely understand the professor's frustration. One of the most common and most irritating hype tactics in AI research is to rebrand an old idea with a new term. Beware of these buzzwords; as a serious researcher, you should judge papers primarily by their experiments and results.

▌Research is a marathon

Set quantifiable goals

When I was looking for research projects early on, I spent a lot of time brainstorming. For me, brainstorming back then meant putting my head down on my desk and hoping that vague intuitions would crystallize into concrete insights. After a day of it, I often felt tired and frustrated. Is this what research life is like?

Of course, there is no recipe for instant research progress, and groping in the dark is part of most research careers. But I have found that by setting quantifiable goals and then planning my work around them, I can make my research life easier and more fulfilling. When I don't know what to do next, I write down my vague ideas in as much detail as possible; if, in the process of writing an idea down, I decide it won't work, I write down the reason for ruling it out (rather than discarding the idea entirely and losing any measure of my progress). When I have no ideas at all, I can fall back on reading articles or talking with colleagues. At the end of each day, my work finally leaves some tangible trace. Even if none of the ideas are ever used, my confidence improves greatly, and I no longer worry about wasting time on the same ideas later.

Learn to recognize and avoid dead ends

Good researchers spend more time on good ideas because they spend less time on bad ones. Telling good ideas from bad ones seems to be largely a matter of experience. Even so, researchers at every skill level regularly face the following choice: my research idea is flawed or inconclusive, so should I (A) try to rescue or push the idea further, or (B) abandon it altogether? I personally regret spending time on (A) when I should have chosen (B). Especially at the beginning, I got stuck in dead ends many times and lingered there far too long. The root of my reluctance to let go was probably the sunk cost fallacy: if I abandon this dead end, the time I have already spent is wasted.

I still feel disappointed every time I give up on a dead end. But I keep telling myself that backtracking is also a way of moving forward. It is counterintuitive, but I have been internalizing it: the cost already paid was worth it and has not sunk. If I had not explored this dead end today, I might have wandered into it again tomorrow. Dead ends are not the end; they are part of normal scientific life. Hopefully one of these ideas will eventually stick; if not, there is always Feynman's famous line: we try to prove ourselves wrong as quickly as possible, because only then can we make progress.

We are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress. ― Richard Feynman

Start writing!

I once had the chance to ask a very well-known AI researcher for early-career advice. His advice was simple: write! Besides writing blogs and papers, and more importantly, write down your thoughts every day. Since taking his advice, I have noticed a marked difference in my progress whenever I am actively writing rather than just thinking.

Physical and mental health is the premise of scientific research

Researchers often forget to eat and sleep when engrossed in their work, and that is dangerous. I used to idealize that state and often felt ashamed that I couldn't reach it. I now understand (at least on a rational level) that exercise and mental rest are investments, not distractions. If I sleep 8 hours and work 4, I am more productive than if I sleep 4 hours and work 8.

Of course, stepping away from a hard problem can still be very difficult. Even past the point of exhaustion or frustration, with no real progress being made, I tend to keep digging rather than rest. But when I finally manage to stop and take a deep breath, I am always genuinely glad that I did. I hope to carry this discipline into the next phase of my research career.

Original Author: Tom Silver

Original link:

http://web.mit.edu/tslvr/www/lessons_two_years.html

