All eyes on Google: the counterattack has arrived! The new PaLM 2 takes on GPT-4, the office suite gets a sweeping upgrade, and Bard evolves dramatically...

Datawhale featured content

Latest: Google PaLM 2 | Source: QbitAI

All eyes are on Google, and its counterattack has arrived.

Google Search is finally adding a conversational AI feature, and the waitlist is now open.

Of course this is only the first step.

The bigger announcements are still to come:

The new large language model PaLM 2 was officially unveiled; Google claims it surpasses GPT-4 on some tasks.

Bard has been significantly upgraded: no more waitlist, and it now supports new languages.

Google's own AI office assistant has also launched, debuting first in Gmail.

Google Cloud has also rolled out several foundation models to bring generative AI services to enterprises...

Google's showing at this year's I/O developer conference was genuinely impressive.

Some commenters exclaimed:

The AI war is in full swing.

Some even said:

Now I regret paying for ChatGPT.

Following the keynote, Google's stock price rose by more than 4%.

PaLM 2 surpasses GPT-4 on some tasks

PaLM 2 was unquestionably the centerpiece of this year's I/O, and Pichai introduced it himself.

Bard and more than 25 of Google's AI products and features are now built on PaLM 2 as their underlying technology.

As Google's most advanced large model to date, PaLM 2 is an upgraded version of PaLM based on the Pathways architecture and built on TPU v4 using JAX.

According to Google, PaLM 2 was trained on more than 100 languages, which makes it more capable at language understanding, generation, and translation, and better at common-sense reasoning and mathematical analysis.

Google says the PaLM 2 training data includes a large number of scientific papers and web pages containing mathematical expressions; after training on this data, PaLM 2 can solve math problems and even plot graphs.

On the programming side, PaLM 2 now supports 20 programming languages, including commonly used ones such as Python and JavaScript as well as Prolog, Fortran, and Verilog.

Google is launching PaLM 2 in four different sizes.

Each size is named after an animal: the smallest is "Gecko" and the largest is "Unicorn".

Among them, the "Gecko" version is lightweight enough to run quickly on mobile devices, even offline, and can process 20 tokens per second.

A DeepMind vice president said at a press briefing ahead of the I/O conference:

We've found that bigger isn't always better, which is why we've decided to offer a range of models in different sizes.

This means PaLM 2 will be easier to fine-tune, allowing it to support more products and applications.

At the I/O conference, Google announced that more than 25 products and applications are now using PaLM 2 capabilities.

The most concrete embodiment of this is Duet AI.

Think of it as Google's counterpart to Microsoft 365 Copilot: an AI assistant that can be embedded in its various office apps.

At the keynote, Google demonstrated Duet AI's capabilities in Gmail, Google Docs, and Google Sheets.

These included drafting email content from prompts, generating slides, creating image assets from prompts, and building tables with one click.

The assistant can also help with programming. Within Google Cloud, it can suggest and correct code blocks in real time and answer programming questions conversationally; it currently supports Go, JavaScript, Python, and SQL.

In addition, Google has launched several PaLM 2-based models for specialized domains.

Google's health team built Med-PaLM 2, which can answer a wide range of medical questions and is said to be the first large language model to achieve expert-level performance on US Medical Licensing Examination questions.

Google is currently working on making it multimodal, for example reading an X-ray and offering a diagnosis. Later this summer, the model will be available to a small group of Google Cloud customers.

Another specialized large model is Sec-PaLM 2.

This model targets cybersecurity: it can analyze and explain potentially malicious scripts and assess how dangerous they are.

Having shown off PaLM 2's capabilities, the next question is how to get access to it.

Google says PaLM 2 is now available through the PaLM API, Firebase, and Colab.
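
For readers who want to try it, below is a minimal sketch of what a call through the PaLM API might look like, assuming the `google.generativeai` Python client, the "text-bison-001" model name, and a placeholder API key; the exact package and model identifiers may differ, so check Google's current documentation.

```python
# Hypothetical sketch: calling PaLM 2 through the PaLM API with the
# google.generativeai client. Model name and setup are assumptions,
# not confirmed details from the article.
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # placeholder key

response = palm.generate_text(
    model="models/text-bison-001",       # assumed PaLM 2 text model name
    prompt="Explain in two sentences what PaLM 2 is.",
    temperature=0.2,
    max_output_tokens=128,
)
print(response.result)  # the generated text
```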

Bard fully opens up, adding image support and integration with Google apps such as Maps

Bard, Google's counterpart to ChatGPT, has finally dropped its waitlist and is fully available in more than 180 countries and regions.

It has also gained a dark mode, much appreciated by programmers (tongue firmly in cheek).

Besides the wider rollout, Bard can now respond in Japanese and Korean in addition to English. Chinese apparently has to wait for a later wave; Google says it will soon add support for 40 languages.

As of today, Bard runs fully on PaLM 2, so its programming and reasoning abilities have improved substantially, and its code generation, debugging, and code explanation are more professional (the kind programmers actually respect).

When you ask it to write the scholar's mate ("four-move checkmate") sequence in Python and its answer draws on existing code, it will cite the relevant links for reference.

You can ask follow-up questions about a function you don't understand in the code, ask whether it can be improved, or ask it to combine everything into a single code block.
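
For context on what Bard was being asked to produce, here is a rough sketch of such a program using the third-party python-chess library; this is an illustrative assumption, not Bard's actual output.

```python
# Illustrative sketch (not Bard's output): play out the scholar's mate,
# the four-move checkmate, with the python-chess library (pip install chess).
import chess

board = chess.Board()

# White mates on move four against a cooperative defence.
for san in ["e4", "e5", "Bc4", "Nc6", "Qh5", "Nf6", "Qxf7#"]:
    board.push_san(san)

print(board)                               # final position as ASCII art
print("Checkmate:", board.is_checkmate())  # -> Checkmate: True
```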

The most pleasant surprise, however, is a one-click export feature added at developers' request.

Now, you can export the code generated by Bard directly to Colab.

Beyond code, any content you generate with Bard, such as email drafts and tables, can also be sent directly to Gmail, Docs, and Sheets.

Bard can now also include images in its answers, which is especially handy when asking for travel guides.

Besides answering with images, you can also send it images directly; for example, upload a photo of two dogs and have it make up an amusing story about them.

This feature is powered by Google Lens, Google's AI app for recognizing images and describing them in words.

Beyond Google Lens, Google's own apps such as Docs, Drive, Gmail, and Maps are also being integrated into Bard.

For example, a Bard answer can embed Google Maps directly to show the locations of several universities.

It starts to feel as though Bard could become the single entry point for Google's various products.

Beyond Google's own apps, Bard also brings in Adobe Firefly, so licensed creative images can be generated right inside the conversation.

Search rebuilt: conversational AI arrives

After much anticipation, Google Search has finally gained conversational AI.

"For a family with a child under 3 and a dog, is Bryce Canyon or Arches National Park better?"

Previously, you might have had to break this question into several smaller ones and comb through a pile of search results before arriving at an answer.

Now Google lets you try to do it in one step.

In the demo, Google Search did not simply relay retrieved results; it weighed both factors, the toddler and the dog, and gave an organized answer. For example, it noted:

Bryce Canyon has two loop trails that allow dogs and is very stroller-friendly; Arches National Park bans pets from most trails; both parks require pets to be kept on a leash; and so on.

Each sentence comes with a source link you can check.

It also displays links to guides posted by users on various websites.

Best of all, you can ask further conversational questions about its answers by clicking the "ask for a follow up" button.

Shopping also gets better with the new Google Search, which claims to help you make quick, well-reasoned buying decisions.

For example, when you search for a "bike for a 5-mile mountain commute," it first lays out the important factors to consider, such as:

First, the design: e-bikes, road bikes, and hybrid bikes are all suitable for commuting;

second, the motor and battery; and third, the suspension, since a mountain commute has to absorb cracks and bumps.

It then recommends suitable bikes, complete with product descriptions, recent reviews, prices, and pictures.

You can also ask follow-up questions; say you only want a red e-bike, and it will refine the answer further.

This feature is powered by Google's shopping comparison product Shopping Graph, which collects and constantly updates product listings from around the world.

Notably, Google stated bluntly that the updated AI search interface will still carry advertisements, though they will appear only in dedicated ad slots and will not be mixed into your search results.

For now, the new feature is available only by signing up through Google Search Labs, and only to users in the United States.

Three foundation models launched on Google Cloud

Google Cloud also had eye-catching announcements at this year's I/O.

After a wave of AI capability updates, Google launched three new models for Vertex AI, its cloud machine learning platform:

  • Codey: text-to-code, helping programmers write code

  • Imagen: text-to-image, generating high-quality images

  • Chirp: speech-to-text, making communication easier

The capabilities of these three models were demonstrated at the keynote, for example code generation and the smart editing features in Google Photos.

In addition, embedding APIs for text and images are now available on Vertex AI. They convert text and image data into high-dimensional numerical vectors that capture semantic relationships, allowing developers to build more interesting applications.
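
As a rough idea of how such an embedding API is typically called, here is a minimal sketch assuming the `google-cloud-aiplatform` (vertexai) Python SDK and a "textembedding-gecko@001" model name; the project ID, region, and model identifier are placeholders, not details from the article.

```python
# Hypothetical sketch: getting text embeddings from Vertex AI.
# Project, location, and model name are placeholder assumptions.
import vertexai
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-gcp-project", location="us-central1")

model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")
embeddings = model.get_embeddings([
    "PaLM 2 was announced at Google I/O.",
    "Bard now answers in Japanese and Korean.",
])

for emb in embeddings:
    vector = emb.values            # list of floats (one semantic vector per input)
    print(len(vector), vector[:3])
```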

Another major update concerns RLHF: Google says it is the first to offer this capability as a managed service on an end-to-end machine learning platform. The benefit is that companies can use RLHF to quickly train a reward model and fine-tune the foundation model, which is critical for improving the accuracy of large models in industrial applications.

Beyond the models, Google Cloud also introduced its next-generation A3 GPU supercomputer for training. By pairing A3 virtual machines with Nvidia H100 GPUs, Google Cloud can deliver greater compute throughput and bandwidth, enabling enterprises to develop machine learning models faster.

Google also brought new hardware, including its first folding-screen phone priced at US$1,799 (roughly RMB 12,000), as well as Android 14 with built-in AI features (such as suggested message replies), which we won't go through one by one here.

All in all, for its 15th I/O conference, Google delivered a great deal of substance this time.

It is worth noting that Jeff Dean was not among the speakers introduced on stage this time; his role changed just a few days ago.

As the executive who most represented Google AI in the past, where will he stand in the AI 2.0 wave?

It also remains to be seen whether Google can catch up in large models and AI search.

Are you satisfied with Google's counterattack this time?

Origin blog.csdn.net/Datawhale/article/details/130633413