[AI underlying logic] - Chapter 7 (Part 2): Computing resources & software code sharing

Continuing from the previous article...

Table of contents

Continuing from the previous article...

3. Computing resources

1. The first stage: data centralization

2. The second stage: resource cloudification

①Classification of “cloud”

②Virtualization technology

③Popularization of edge computing

4. Software code sharing

Summary

  Highlights from the past:


3. Computing resources

Although the AlphaGo algorithm paper has been published, reproducing it is almost certainly a money-losing proposition from a commercial point of view, because implementing the algorithm requires a huge number of servers and data centers. For most AI applications, however, as long as they can connect to the Internet they can in theory use nearly unlimited computing resources: physical resources (server equipment, computing nodes, storage nodes, network nodes, and so on) are turned into services available on demand. This process has gone through two stages: data centralization and resource cloudification.

1. The first stage: data centralization

Data centralization : build large amounts of data center infrastructure and bring data together in one place for use.

Data center : sometimes abbreviated as DC. A data center that provides Internet services such as website publishing, virtual hosting, and e-commerce is called an IDC (Internet Data Center); one that provides cloud computing services is called a CDC (Cloud Data Center); others include the Enterprise Data Center ( EDC ), Government Data Center ( GDC ), and so on. There is no unified definition, but broadly speaking a data center carries three levels of meaning :

① A physical space : an important piece of infrastructure for enterprise informatization, providing the physical site and supporting environment needed for information systems to run.

② A logical architecture : the IT application environment formed when an enterprise concentrates its data at scale, serving as the hub that provides IT services such as data computation, processing, and storage.

③ An organizational structure : the organization or team responsible for operating and maintaining information systems through centralized operation, monitoring, and management (ensuring system stability and business continuity).

Energy consumption : data centers are huge energy consumers; the industry already accounts for more than 1% of global energy consumption, so green, energy-saving construction plans have been put on the agenda. Natural resources such as hydropower, wind power, and solar power can be used to reduce data centers' energy consumption!

2. The second stage: resource cloudification

Resource cloudification : computing, storage, and other resources are also centralized and turned into virtual resources that provide services to the outside world.

In 2006, Google first proposed the concept of "cloud computing". The idea was that electronic devices such as mobile phones and computers are merely terminals attached to the Internet; in the future, all programs and data can live on the Internet, so people no longer need to manage software upgrades and security patches, nor worry about losing important data.

"Cloud" usually refers to two things: ① Services , the cloud services provided over the Internet, including cloud computing, cloud storage, cloud security, and so on; ② Technology , the technology platform that delivers those services, which has to solve problems such as big data, virtualization, and distribution. In essence, the cloud is a very large-scale distributed system that shifts massive computing and storage tasks onto a large number of computer nodes located in different physical places.
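To make the "large-scale distributed system" idea concrete, here is a minimal, purely illustrative Python sketch: one big job is split into chunks that a pool of local worker processes handles in parallel. A real cloud does the same thing, except the "workers" are many machines in different data centers.

```python
# Toy illustration (not a real cloud): split one big job into chunks,
# let a pool of workers process them in parallel, then merge the results.
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk):
    """Stand-in for a compute task that one 'node' would handle."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]

    # In a real cloud each chunk would travel over the network to a
    # different machine; here local processes stand in for those nodes.
    with ProcessPoolExecutor(max_workers=4) as pool:
        partial_results = list(pool.map(process_chunk, chunks))

    print(sum(partial_results))
```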

①Classification of “cloud”

The development of almost every technology goes through the stages of budding creativity, hype, disillusionment, recovery, and maturity. Cloud computing is no exception!

The first classification is by deployment model , such as private cloud, public cloud, hybrid cloud, community cloud, industry cloud, and so on. For most people, the "cloud" simply means the public cloud.

① Private cloud: a cloud built by an enterprise for its own internal use, concentrated in important service industries such as finance, healthcare, and government. It offers high security and a high degree of customization.
② Public cloud: built and maintained by a cloud service provider and offered to enterprises and individuals over the Internet; widely used in the gaming and video industries. It is low cost, maintenance-free, virtually infinitely scalable, and highly reliable.
③ Hybrid cloud: a cloud that combines an enterprise's internal infrastructure and private cloud with the public cloud. It has the advantages of both, but also brings a more complex architecture and harder maintenance.

The second classification is by service content : Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) .

① IaaS provides hardware services: a virtual data center of servers, computing, storage, networking, and supporting management tools, typically aimed at the people who operate an enterprise's infrastructure.
② PaaS provides platform services such as business software, middleware, and databases, aimed at application developers. It hides the complex management operations at the bottom of the system, letting developers quickly build high-performance, scalable programs and services.
③ SaaS provides ready-made software services such as online mail, online storage, and online Office, aimed at ordinary users.

②Virtualization technology

From a technical perspective, the bottom layer of the cloud is built on virtualization technology!

Virtualization in the traditional sense : one or more operating systems can run simultaneously in isolated environments, that is, several completely different operating systems can run on one physical device or server ( a physical device can be divided into several virtual machines ). Later, people began providing services with containers, a form of virtualization finer-grained than a whole operating system (a container essentially virtualizes the software and components running inside an operating system). Compared with virtualizing an entire operating system, containers offer a finer service granularity: each component can be customized and deployed independently. Many AI and cloud services are built on virtualization technology.

Benefits: resources can be allocated dynamically, better matching IT resources to business needs . For example, when an online game first launches it is hard to estimate how many servers will be needed, so the number of servers can be adjusted dynamically according to the number of players after launch; likewise, Alibaba adds a large number of servers in advance to keep "Double Eleven" online shopping running, and Weibo expands capacity to absorb the surge in users and activity caused by unexpected hot topics, as in the sketch below.
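The "dynamic allocation" idea can be sketched as a simple scaling rule. The function name and thresholds below are made up for illustration; real cloud platforms expose comparable auto-scaling policies rather than this exact logic.

```python
def decide_server_count(current_servers, avg_load, low=0.3, high=0.7):
    """Toy auto-scaling rule: add capacity when average load is high,
    release capacity when it is low (thresholds are illustrative only)."""
    if avg_load > high:
        return current_servers + max(1, current_servers // 2)  # scale out
    if avg_load < low and current_servers > 1:
        return max(1, current_servers // 2)                    # scale in
    return current_servers

# Example: a game launch sees a spike in players, then traffic settles down.
servers = 10
for load in [0.9, 0.95, 0.8, 0.4, 0.2]:
    servers = decide_server_count(servers, load)
    print(f"load={load:.2f} -> servers={servers}")
```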

③Popularization of edge computing

In addition to cloud computing , another computing technology used alongside AI is edge computing . In recent years, with the spread of network communication technologies such as 5G, the number of devices people need to monitor and the amount of data they need to process have grown sharply. Hardware such as sensors, wearable devices, and dedicated chips has become commonplace and can even be implanted into animals, plants, and the human body; many computing tasks can be completed locally on this hardware. This computing model is called edge computing.

Comparison: unlike the cloud computing model, edge computing does not need to upload data to the cloud for processing. Instead, it completes some computing tasks on local devices or nearby infrastructure, giving users better and faster services while reducing the cost of data processing and transmission. For example, a self-driving car processes its sensor and camera data locally first; only then can it perceive road conditions in real time and respond to all kinds of emergencies.

Edge computing nodes can also filter data locally and upload only the valuable parts to the cloud . This lets them keep providing local services when the network is interrupted and avoids the risk of leaking private data. For example, video captured by neighborhood road cameras is usually monitored and analyzed locally, and only the key clips are uploaded to the cloud, which protects user privacy and significantly reduces network bandwidth costs, as in the sketch below.
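A minimal sketch of that "filter locally, upload only what matters" pattern: the scoring function, threshold, and upload_to_cloud placeholder are all hypothetical stand-ins, not a real camera or cloud API.

```python
# Hypothetical edge-node loop: analyze frames locally and only forward
# the "interesting" ones to the cloud, saving bandwidth and protecting privacy.
THRESHOLD = 0.8  # made-up value for illustration

def motion_score(frame):
    """Stand-in for a local analysis step (e.g. motion or object detection)."""
    return sum(frame) / len(frame)

def upload_to_cloud(frame):
    """Placeholder for a network upload; a real node would POST to a cloud API."""
    print("uploaded 1 key frame")

def process_stream(frames):
    for frame in frames:
        if motion_score(frame) > THRESHOLD:   # only key clips leave the device
            upload_to_cloud(frame)
        # low-value frames are handled (or discarded) locally

process_stream([[0.1, 0.2], [0.9, 0.95], [0.3, 0.4]])
```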


4. Software code sharing

When talking about AI, we always mention data, algorithms, and the infrastructure that provides computing and storage resources. These are the visible threads of the technology, but there is also a hidden thread that is indispensable: open source . It is what makes large-scale technical collaboration possible.

When it comes to open source, we have to mention GitHub , the world's largest programming community and code hosting website . Founded in San Francisco in 2008, GitHub created a new way of collaborative development in which people can obtain massive amounts of free code on the site. Many good AI algorithms and projects can be found there, such as Google's TensorFlow, Microsoft's CNTK (Cognitive Toolkit), Meta's PyTorch, and other open source deep learning frameworks such as Caffe and Theano.

GitHub is built on Git. Earlier version control systems ran on a centralized version control server, while Git's innovation was to make version control distributed. Through the GitHub website, we only need to clone code from any public repository into our own account to start developing and editing. We can also send the original author a pull request with our changes; if the author thinks the changes look good, they can merge them directly into the original code. This is how collective programming is achieved.
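The clone, edit, commit, and push steps of that workflow can be scripted. The sketch below uses the open source GitPython library; the repository URL, file name, and commit message are placeholders, and opening the actual pull request is done on the GitHub website or through its REST API rather than through Git itself.

```python
# Sketch of clone -> edit -> commit -> push using GitPython (pip install GitPython).
# The URL, path, and file below are placeholders for illustration only.
from git import Repo

repo = Repo.clone_from("https://github.com/your-name/some-fork.git", "some-fork")

with open("some-fork/README.md", "a") as f:
    f.write("\nA small improvement.\n")

repo.index.add(["README.md"])          # stage the change
repo.index.commit("Improve README")    # record it locally
repo.remote(name="origin").push()      # publish it to your fork on GitHub
# The final step -- asking the original author to merge -- is a pull request
# opened on the GitHub website (or via the GitHub REST API).
```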

Open source dominates the technology ecosystem . For example, many people like to use Python to study AI algorithms, because many practical deep learning frameworks and machine learning packages run on Python, such as TensorFlow, scikit-learn, Keras, Theano, and Caffe. This open source software provides an excellent learning environment!
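As a small taste of that learning environment, here is a minimal example built on the open source scikit-learn library; the dataset and model are chosen only for illustration.

```python
# A few lines with scikit-learn: train and evaluate a simple classifier.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```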

But there are side effects. Some developers no longer program from scratch but copy entire code bases straight from the Internet, and some software companies rely on wrapping and lightly tuning open source code while claiming independent research and development.

Even when software is open source, it is not always the best choice for an enterprise, because open source software cannot guarantee high availability and security , and relying on it can be risky for a business . In addition, AI differs from ordinary programs: having the code is not enough to make it run well, because it also depends on massive data, high-performance computing power, and many rounds of model training. For that reason, some of the more mature AI systems have not taken the open source route. Take the recently popular natural language processing model GPT: when GPT-2 was released, OpenAI chose not to publish all of the code, only part of it, arguing that GPT-2 was so powerful that fully open-sourcing it could pose security risks; when GPT-3 was released, OpenAI offered invitation-only testing through an API instead of publishing the code, in order to prevent the technology from being abused .
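Accessing such a model "as an API" looks roughly like the sketch below: an HTTP request to a hosted endpoint instead of running open source code locally. The endpoint, model name, and request fields follow OpenAI's public HTTP API as it existed at the time of writing and are shown only as an illustration; check the current documentation before relying on them.

```python
# Illustrative only: call a hosted language model over HTTP instead of
# downloading its code or weights. Requires an API key from the provider;
# the endpoint and fields below may change over time.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"model": "text-davinci-003", "prompt": "Hello, world", "max_tokens": 20},
    timeout=30,
)
print(response.json())
```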


Summary

Although computers can already perform billions of operations per second, they still cannot meet humanity's computational needs!

Although we can now simulate models with more than 100 billion parameters, that may be less than one ten-thousandth of the human brain's neurons!

Even today, we are still far from having the computing power needed for general AI; pursuing high-performance computing is a direction in which humanity still needs to work hard!

  Highlights from the past:

[AI underlying logic] - Chapter 5 (Part 2): Clustering & dimensionality reduction & time series of machine learning algorithms

[AI underlying logic] - Chapter 3 (Part 2): Information exchange & information encryption and decryption & noise in information

[AI underlying logic] - Chapter 3 (Part 1): Data, information and knowledge & Shannon information theory & information entropy

[Machine Learning] - Continued: Convolutional Neural Network (CNN) and parameter training

[AI underlying logic] - Chapter 1&2: Statistics and probability theory & data "traps"

[AI underlying logic] - Chapter 5 (Part 1): Regression & Classification of Machine Learning Algorithms
