Douban score 9.9! The Chinese edition of the reinforcement learning bible, acclaimed at home and abroad, is finally here!

On December 7, 2017, the team behind AlphaGo, the program that had swept aside every opponent in the Go world, introduced the even more powerful AlphaZero. Trained from scratch, it needed only 8 hours to beat AlphaGo v18, the version that played Lee Sedol!

However, AlphaZero's impact goes far beyond that! In a clash of titans against Stockfish, then the strongest chess engine on Earth, AlphaZero went unbeaten over one hundred games, with 28 wins and 72 draws, and knocked the champion off its horse. The result shocked everyone. Stockfish was thought to be close to perfection, its code packed with countless carefully crafted human algorithmic techniques. In raw speed, Stockfish's ability to calculate 60 million positions per second completely outclasses AlphaZero's 60,000 per second.

The reality, though, is that Stockfish could not overcome AlphaZero. AlphaZero has a smarter way of thinking, which makes it more judicious: it knows what to think about and what to ignore. This smarter way of thinking comes from reinforcement learning.

Deep learning, the flagship of connectionist neural networks, is undoubtedly one of the most important and most practically significant technological breakthroughs in artificial intelligence in the early 21st century. It has made great contributions from basic research through to industrial application, and has accordingly earned great reputation and attention.

However, the boom in industrial applications cannot hide the concerns that clear-headed artificial intelligence researchers have about the field's future direction. More and more researchers regard improving the application techniques of deep learning as the business of industry, and have begun to pay attention to and explore classic artificial intelligence paradigms that differ from connectionist deep learning.

Reinforcement learning is a typical representative of this exploration.

Reinforcement learning differs fundamentally from traditional supervised learning, which relies on data that has been collected or constructed and labeled in advance. It emphasizes obtaining, through interaction with the environment, feedback signals that reflect how well the goal is actually being achieved, and it emphasizes a trial-and-error mode of learning together with the dynamic, long-term effects of sequential decision-making. This gives reinforcement learning an irreplaceable role in some of the most challenging problems in artificial intelligence. These valuable ideas also provide an important foundation for the further development of connectionist deep learning in small-data, dynamic-environment, and self-learning settings.
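To make the trial-and-error idea concrete, here is a minimal sketch. It is not taken from the book or from AlphaZero; the multi-armed bandit setting, the function name `run_bandit`, and all parameter values are assumptions chosen purely for illustration. The agent is never given labeled answers. It only sees a noisy reward after each action and gradually learns which action is best.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Gaussian multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # agent's running estimate of each arm's value
    counts = [0] * n_arms       # how many times each arm has been tried

    for _ in range(steps):
        # Trial and error: explore a random arm with probability epsilon,
        # otherwise exploit the arm that currently looks best.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment answers with a noisy reward, not a correct label.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update of the chosen arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts

if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.5, 1.0])
    print("estimated values:", [round(v, 2) for v in estimates])
    print("pull counts:", counts)
```

The hidden `true_means` play the role of the environment. The agent only ever touches them through the rewards it receives, which is the essential contrast with supervised learning described above.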

After AlphaGo's victory over Lee Sedol, AlphaZero, relying entirely on self-learning, surpassed thousands of years of accumulated human experience in a variety of board games. It once again reshaped how people understand artificial intelligence, and it brought the combination of reinforcement learning and deep learning unprecedented attention from both academia and industry.

"Reinforcement Learning (2nd Edition)" (Reinforcement Learning: An Introduction, Second Edition) was published against this background.

The book's authors, Richard S. Sutton and Andrew G. Barto, are pioneers of the reinforcement learning field. As early as late 1979 they began to focus on, and do research in, the area now known as reinforcement learning. In 1998 they published the first edition of this book, which caused a sensation in the field.

As a pioneering, foundational work in reinforcement learning, the book dissects the ideas of reinforcement learning in depth and gives clear, concise explanations of its core concepts and algorithms. Over 20 years it has led countless enthusiasts into reinforcement learning and nurtured several generations of outstanding researchers in the field.

Today, 20 years later, artificial intelligence has made significant progress, driven by cutting-edge technologies such as machine learning (including reinforcement learning). These advances owe much not only to the rapid growth of computing power in recent years, but also to many innovations in theory and algorithms. "Reinforcement Learning (2nd Edition)" arrives against this backdrop. The second edition adds a great deal of new content, including an introduction to applications of deep reinforcement learning (such as AlphaGo), along with the authors' updated thinking and understanding, so that the book keeps its clear and concise explanation of the core theory while also covering the latest ideas and applications of its time.

In China the book continues the stellar reputation it enjoys internationally: learners on a well-known Chinese book-review site have given it a high score of 9.9!

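Readers have also given very pertinent reviews:

This book is by far the most systematic and complete textbook on the field of reinforcement learning. Besides covering many aspects of artificial intelligence such as machine learning and neural networks, the second edition also touches on psychology and neuroscience. With so many new concepts and new terms, it presents a very high reading threshold for most readers in China.

Fortunately, a team led by Professor Kai Yu of Shanghai Jiao Tong University has produced a high-quality translation that conveys the ideas and content of this industry bible in a way that suits Chinese readers' habits of understanding!

Professor Kai Yu is a professor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University and the founder and chief scientist of AISpeech. He has long worked on the research and industrialization of interactive artificial intelligence, especially intelligent speech and natural language processing, and has very rich practical experience in reinforcement learning and deep learning. This also ensures that the Chinese edition is faithful to the original and reads fluently.

"Reinforcement Learning (2nd Edition)" is now available on all major platforms!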

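Starting from the basic ideas of reinforcement learning, the book introduces, accessibly yet rigorously, fundamental concepts and methods such as Markov decision processes, Monte Carlo methods, temporal-difference methods, and on-policy and off-policy learning. It uses a large number of examples to help readers understand how reinforcement learning problems are modeled and how the core algorithms work in detail.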
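For readers who want a taste of what a temporal-difference method looks like, the following is a minimal sketch under stated assumptions. The five-state random walk, the function name `td0_random_walk`, and the step size and episode count are all chosen here for illustration, not quoted from the Chinese edition. The walk starts in the middle state, moves left or right with equal probability, and pays a reward of 1 only when it exits on the right. The true value of state i (counting from 0) is therefore (i + 1) / 6, which gives something to compare the learned values against.

```python
import random

def td0_random_walk(episodes=5000, alpha=0.01, gamma=1.0, seed=0):
    """Tabular TD(0) value prediction on a 5-state random walk (illustrative only)."""
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # initial guess for every non-terminal state

    for _ in range(episodes):
        state = n_states // 2  # every episode starts in the middle state
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:             # fell off the left end: reward 0
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:   # fell off the right end: reward 1
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # TD(0) update: move V(s) toward the one-step bootstrapped target.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error

            if done:
                break
            state = next_state

    return values

if __name__ == "__main__":
    learned = td0_random_walk()
    for i, v in enumerate(learned):
        print(f"state {i}: learned {v:.3f}, true {(i + 1) / 6:.3f}")
```

A Monte Carlo method would instead wait until the end of each episode and update every visited state toward the final return. TD(0) updates after every step using its own current estimate of the next state, which is the bootstrapping idea that distinguishes the two families.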

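Professor Qiang Yang of the Hong Kong University of Science and Technology praised it: "It is no exaggeration to say that the publication of the Chinese edition of 'Reinforcement Learning (2nd Edition)' builds a bridge for Chinese scholars and students in machine learning to the treasury of classic reinforcement learning knowledge."

In addition, many leading figures at home and abroad, including Yoshua Bengio, Demis Hassabis, Zhi-Hua Zhou, and Li Deng, also strongly recommend it: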

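Csaba Szepesvari

Research scientist at DeepMind, professor of computer science at the University of Alberta

Demis Hassabis

Co-founder and CEO of DeepMind

Li Deng

Chief AI Officer at Citadel, former Chief AI Scientist at Microsoft

Aja Huang (黃士傑)

Lead Programmer of AlphaGo

Pedro Domingos

Professor of computer science at the University of Washington, author of "The Master Algorithm"

Yuan Qi (漆远)

Vice President and Chief AI Scientist at Ant Financial

Tom Mitchell

Professor of computer science at Carnegie Mellon University

Qiang Yang

Chief AI Officer at WeBank, Chair Professor at the Hong Kong University of Science and Technology, President of the IJCAI Board of Trustees (2017-2019)

Yoshua Bengio

Professor of computer science and operations research at the Université de Montréal

Bo Zhang (张钹)

Academician of the Chinese Academy of Sciences, Dean of the Institute for Artificial Intelligence at Tsinghua University

Zhi-Hua Zhou

Head of the Department of Computer Science and Dean of the School of Artificial Intelligence at Nanjing University, Foreign Member of the Academia Europaea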

On learning that the book was about to be released in China, the two original authors wrote a special message for Chinese readers.

We are most pleased that Professor Kai Yu has produced this Chinese translation of our textbook, which we hope will enable more Chinese students to self-study reinforcement learning and lead to the development of new ideas within China that contribute to the diversity and vigour of worldwide reinforcement learning research.

——Richard Sutton and Andrew Barto

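Reinforcement learning is a jewel of the artificial intelligence field, and it will be one of the important sparks for technological development in the post-deep-learning era. As Professor Kai Yu puts it in the translator's preface:

"I hope the Chinese translation of this book will make their ideas known to more Chinese researchers and, as a seed, nurture and give rise to new ideas in frontier artificial intelligence research in China."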

Broadview (博文视点) corporate blog
