Challenging the defenses of ChatGPT and Google Bard

So far, scientists have created AI-based chatbots that can help with content generation. We've also seen AI being used to create malware like WormGPT, although the underground community isn't happy about this. But now chatbots are being created that can be attacked using generative AI by injecting activity on the fly.

New artificial intelligence capable of performing instant injections

Scientists at Singapore's Nanyang Technological University (NTU) have developed a working tool for attacking popular artificial intelligence-based chatbots.

The artificial intelligence chatbots they created were able to easily bypass protections associated with censorship and restrictions in projects such as ChatGPT, Microsoft Copilot, and Google Bard. 

The new artificial intelligence developed by Singaporean computer scientists is called Masterkey: the algorithm works based on a proprietary neural network.

It turns out that accessing protected information is not difficult. For example, experts are able to bypass lists of banned terms or statements by adding a space after each character in the question.

As a result, the chatbot understood the context of the question but did not log these activities as violations of internal rules.

Another way it works is the interesting phrasing of the request so that the generated AI "reacts like a human being, without principles and moral compass." Both methods are known to obtain necessary information without the constraints of censorship.

Effectiveness of new chatbot

As experts say, the Masterkey neural network they created has proven to be highly effective at finding new ways to select recommendations that bypass existing protection mechanisms built into popular chatbots.

They also hope Masterkey will allow them to discover security holes in neural networks faster than AI hackers can. The scientists reported their findings to companies involved in developing large-scale language models.

In the near future, this will no longer be new, but a normal way for cybercriminals to break through other artificial intelligence-generated network barriers.

While the use of artificial intelligence today is not very beneficial to cybercrime experts, there is no doubt that in the near future these technologies will continue to evolve and may become key to increasingly targeted and widespread attacks.

So we find ourselves talking about agents violating other agents (as Bill Gates describes the future of artificial intelligence)... even when thinking about The Matrix movies, it all sounds eerily familiar.

Most of the time, few traces are analyzed by other specific agents performing incident response (IR) activities. This is the direction we are taking and the direction we will take in the future.

The role humans play in this type of society remains to be understood.

Guess you like

Origin blog.csdn.net/qq_29607687/article/details/135311682