OpenAI releases Model Spec, a guideline on how its AI tools like ChatGPT and Sora should behave

OpenAI has released the first draft of the Model Spec, a document outlining the desired behaviour for its models in the OpenAI API and ChatGPT. The document sets out core objectives and gives guidance on handling conflicting objectives or instructions. OpenAI says the Model Spec is meant to serve as a set of guidelines for researchers and data labellers involved in reinforcement learning from human feedback. Although the Model Spec has not yet been fully implemented, it draws upon existing documentation at OpenAI. OpenAI also wants its AI models, such as ChatGPT, Sora and Dall-E, to eventually learn directly from the Model Spec.

The Model Spec outlines three types of principles: objectives, rules, and defaults. Objectives provide broad directional guidance, while rules establish specific behaviours for high-stakes situations. Defaults, on the other hand, offer baseline behaviours that can be overridden by developers or users as needed. Conflicts between objectives are addressed through a combination of rules and defaults. Rules are employed for situations where negative consequences are unacceptable, while defaults provide stable behaviours that align with underlying principles.

OpenAI views the release of the Model Spec as an integral part of an ongoing dialogue about model behaviour and ethical AI development. They aim to engage a diverse range of stakeholders, including policymakers, trusted institutions, and domain experts, to gather feedback on the approach and specific objectives, rules, and defaults outlined in the document.

Through this engagement, OpenAI seeks to understand stakeholders’ perspectives on the Model Spec and determine if there are additional considerations to be included. They are particularly interested in hearing whether stakeholders support the approach and its components.

OpenAI is also inviting the general public to provide feedback on the Model Spec over the next two weeks. This feedback will be used to refine the document and ensure that it aligns with OpenAI’s mission of responsible AI development.

OpenAI plans to provide regular updates on any changes to the Model Spec and on how it is incorporating feedback into its research and development process. This transparency underscores OpenAI’s commitment to building AI models that prioritise safety, fairness, and beneficial outcomes for society.

In essence, the Model Spec serves as a comprehensive framework for guiding the behaviour of AI models, ensuring they remain beneficial and safe for users and society at large.

Nandini Yadav

May 9, 2024
