Microsoft’s latest LLM-powered tools detect vulnerabilities, hallucinations within AI apps

The new safety features are aimed at Azure customers

What’s the story

Microsoft’s Chief Product Officer of Responsible AI, Sarah Bird, has announced the launch of several safety features for Azure users.

In an interview with The Verge, Bird states that these innovative features are designed to identify potential risks, monitor for unsupported “hallucinations,” and prevent harmful prompts in real-time.

The new measures cater to Azure customers who may not have dedicated red teamers to test their AI services.

Safety features aim to prevent controversial AI responses

Bird explained that the evaluation system generates prompts mimicking potential attacks such as prompt injection or offensive content.

This process allows customers to receive a score and view the results, helping them avoid controversies caused by undesirable or unintended responses from generative AI.

The move comes in response to recent controversies involving explicit celebrity fakes by Microsoft’s Designer image generator, historically inaccurate images by Google Gemini, and inappropriate content on Bing.

Microsoft unveils three key safety features for Azure AI

The new safety features introduced by Microsoft include:

Prompt Shields: Designed to prevent harmful prompts or prompt injections from external documents that could cause models to deviate from their training.

Groundedness Detection: Aims to identify and prevent hallucinations.

Safety evaluations: Used to examine model vulnerabilities.

These three features are currently available in preview on Azure AI.

Microsoft plans to introduce two more features soon, that will guide models toward safe outputs and monitor prompts to identify potentially problematic users.

How does the monitoring system work?

The monitoring system checks if a user’s input or third-party data triggers any banned words or hidden prompts before it is sent to the model for a response. It then verifies if the model has hallucinated information not present in the document or prompt.

Microsoft allows Azure customers to control AI model filtering

Addressing concerns about companies determining what is suitable for AI models, Bird stated that a feature has been incorporated allowing Azure customers to control the filtering of hate speech or violence identified by the model.

This feature empowers users to decide what content the model prevents, offering them more control over their AI services.

Future Azure users will also have access to reports on users who attempt to generate unsafe outputs, aiding system administrators in identifying potentially harmful users.

Originally Appeared Here

Pages

Categories

Microsoft’s latest LLM-powered tools detect vulnerabilities, hallucinations within AI apps

Safety features aim to prevent controversial AI responses

Microsoft unveils three key safety features for Azure AI

How does the monitoring system work?

Microsoft allows Azure customers to control AI model filtering

About the Author:

Safety features aim to prevent controversial AI responses

Microsoft unveils three key safety features for Azure AI

How does the monitoring system work?

Microsoft allows Azure customers to control AI model filtering

You May Also Like

The Impact of AI on Paralegals – Artificial Lawyer

Generative AI owes massive rhetorical debt, but no one’s paying it back

AI won’t replace computer scientists any time soon – here are 10 reasons why

Outperformed by AI: Time to replace your analyst?

The rise of prompt ops: Tackling hidden AI costs from bad inputs and context bloat

Mastering AI: Your Ultimate Prompt Engineering Course Guide

About the Author: