AI chatbots ‘highly vulnerable’ to dangerous prompts, UK Government finds

Artificial intelligence (AI) chatbots are “highly vulnerable” to prompts for harmful outputs, new research has found.

The research by the UK AI Safety Institute (AISI) revealed “basic jailbreaks” can bypass the security guardrails of large language models (LLMs).

Testing five LLMs, the study saw models issue harmful content even without “dedicated attempts to beat their guardrails”.

These security measures are designed to stop the model from issuing toxic or dangerous content even when prompted to do so.

“All tested LLMs remain highly vulnerable to basic jailbreaks, and some will provide harmful outputs even without dedicated attempts to circumvent their safeguards,” AISI researchers said.

The team used questions from an academic paper as well as their own harmful prompts, which ranged from asking the system to generate text convincing users to commit suicide to requesting an article suggesting the Holocaust didn’t happen.

Although the tested models remain unnamed, the government has confirmed they are in public use.

The announcement comes after LLM developers have pledged to ensure the safe use of AI. Last year, OpenAI, which owns ChatGPT, announced its approach to AI safety and said it does not permit its technology to be used “to generate hateful, harassing, violent or adult content”.

Meanwhile, Meta’s Llama 2 model has undergone testing to “identify performance gaps and mitigate potentially problematic responses in chat use cases” and Google has said its Gemini model has built-in safety filters to prevent toxic language and hate speech.

Other findings showed the LLMs hold expert-level knowledge of chemistry and biology but struggle with cybersecurity challenges aimed at university students.

The LLMs also struggled to complete sequences of actions for complex tasks without human oversight.

The research comes as the UK is set to co-host the second global AI summit in Seoul.

The first summit was held at Bletchley Park in November and saw the first-ever international declaration to deal with AI.

The AISI also announced plans to open its first overseas office in San Francisco, where major AI firms such as OpenAI and Anthropic are based.


About the Author:

Holyrood Newsletters
