Comment on NTIA AI Accountability Policy

OpenAI Blog 06/12/23, 12:00 AM News

ai-policy regulation accountability openai ntia governance

Summary

OpenAI submits formal comments to the NTIA on AI accountability policy, outlining their approach to responsible development of foundation models and supporting both horizontal and vertical accountability frameworks across the AI ecosystem.


Original Article



# Comment on NTIA AI Accountability Policy

Source: [https://openai.com/global-affairs/comment-on-ntia-ai-accountability-policy/](https://openai.com/global-affairs/comment-on-ntia-ai-accountability-policy/)

OpenAI is pleased to respond to the National Telecommunications and Information Administration’s (NTIA) April 13, 2023 request for comments (RFC) on AI Accountability Policy. In this comment, we describe our thinking on AI accountability based on the safety practices we apply to the services we provide today, and the practices we plan to apply to services we anticipate offering in the future.

We welcome NTIA’s decision to frame this discussion in terms of an “ecosystem” of AI accountability. As the RFC observes, policy stakeholders are exploring a “range of trustworthy AI system goals and deployment contexts.”[1](https://openai.com/global-affairs/comment-on-ntia-ai-accountability-policy/#citation-bottom-1) Policies and practices designed to achieve accountability will vary accordingly. At the same time, specific accountability measures will need to coexist with one another, and what matters most is the impact they have in concert.

We believe that a mature ecosystem for AI accountability will include both horizontal and vertical elements. That is, we expect some elements to apply to certain AI systems across domains of application, and other elements to be tailored to particular domains.

We are engaged in the development and deployment of highly capable foundation models: models that learn from a large amount of data in order to be able to perform a wide range of downstream tasks. In our view, AI developers like us must act responsibly and take a careful and safety-focused approach to the development and deployment of the most advanced capabilities. This is true regardless of the particular domains in which such models may be used.

A wide range of existing laws already apply to AI, including to our products, and the legal landscape is quickly evolving, with legislative initiatives in Congress, the AI Act under development in Europe, and legislative and policy initiatives unfolding around the world. At the same time, long-established bodies of law, regulation, and other expectations in areas like medicine, education, and employment are already being interpreted and adapted in ways that will shape the role AI plays in those domains. We see these sector-specific efforts, informed by deep domain expertise, as a critical part of the AI accountability landscape.

We strongly support efforts to harmonize the emergent accountability expectations for AI, including the efforts of the NIST AI Risk Management Framework, the U.S.-E.U. Trade and Technology Council, and a range of other global initiatives. While these efforts continue to progress, and even before new laws are fully implemented, we see a role for ourselves and other companies to make voluntary commitments on issues such as pre-deployment testing, content provenance, and trust and safety.

Our current engineering approach requires a unique scale of computing resources, and we regard this as a promising basis for defining additional and distinctive accountability expectations that would apply to actors like us. We support scoping any new regulation for highly capable foundation models carefully so as to preserve the ability of all actors to fairly compete and innovate.

Accountability plays a role throughout the technology lifecycle. Our efforts to make our models safe and reliable begin before development starts, continue throughout deployment and operation of our models, and address both creators and users of highly capable foundation models.

We provide developers with world-leading capabilities for their applications, and provide powerful capabilities directly to the millions of people who use ChatGPT and our other services every day. Our usage policies apply to all users of our models, tools, and services.[2](https://openai.com/global-affairs/comment-on-ntia-ai-accountability-policy/#citation-bottom-2) We comply with existing laws, and require that our developers and users comply when they use our services.

We focus the remainder of this comment on our current approaches to AI accountability, and describe important areas where we and others are working to strengthen the ecosystem. We note that policymakers in the United States and around the world are considering a wide range of policies and measures intended to achieve AI accountability, including legislation, regulations, international agreements, self-regulatory programs, and enforceable technical and other standards. We appreciate these efforts and stand ready to partner with other stakeholders to develop and implement effective approaches to AI accountability.

Red teaming is the process of qualitatively testing our models and systems in a variety of domains to create a more holistic view of the safety profile of our models. We conduct red teaming internally with our own staff as part of model development, as well as with people who operate independently of the team that builds the system being tested. In addition to probing our organization’s capabilities and resilience to attacks, red teams also use stress testing and boundary testing methods, which focus on surfacing edge cases and other potential failure modes with potential to cause harm. Red teaming is complementary to automated, quantitative evaluations of model capabilities and risks that we also conduct, which we describe in the next section. It can shed light on risks that are not yet quantifiable, or those for which more standardized evaluations have not yet been developed.

Our prior work on red teaming is described in the DALL-E 2 System Card and the GPT‑4 System Card. Our red teaming and testing is generally conducted during the development phase of a new model or system. Separately from our own internal testing, we recruit testers outside of OpenAI and provide them with early access to a system that is under development. Testers are selected by OpenAI based on prior work in the domains of interest (research or practical expertise), and have tended to be a combination of academic researchers and industry professionals (e.g., people with work experience in Trust & Safety settings). We evaluate and validate results of these tests, and take steps to make adjustments and deploy mitigations where appropriate. OpenAI continues to take steps to improve the quality, diversity, and experience of external testers for ongoing and future assessments.

In addition to the qualitative red teaming described above, we create automated, quantitative evaluations for various capabilities and safety-oriented risks, including risks that we find via methods like red teaming. These evaluations allow us to compare different versions of our models with each other, iterate on research methodologies that improve safety, and ultimately act as an input into decision-making about which model versions we choose to deploy. Existing evaluations span topics such as erotic content, hateful content, and content related to self-harm, among others, and measure the propensity of the models to generate such content.

Independent assessments of models and systems, including by third parties, may be increasingly valuable as model capabilities continue to increase. Such assessments can strengthen accountability and transparency about the behaviors and risks of AI systems.

Some forms of assessment can occur within a single organization, such as when a team assesses its own work or when a team or part of the organization produces a model and another team or part, acting independently, tests that model. A different approach is to have an external third party conduct an assessment. As described above, we currently rely on a mixture of internal and external evaluations of our models.

Third-party assessments may focus on specific deployments, a model or system at some moment in time, organizational governance and risk management practices, specific applications of a model or system, or some combination thereof. The thinking and potential frameworks to be used in such assessments continue to evolve rapidly, and we are monitoring and considering our own approach to assessments.

For any third-party assessment, the process of selecting auditors or assessors with appropriate expertise and incentive structures would benefit from further clarity. In addition, selecting the appropriate expectations against which to assess organizations or models is an open area of exploration that will require inputs from different stakeholders. Finally, it will be important for assessments to consider how systems might evolve over time and build that into the process of an assessment or audit.


Similar Articles

OpenAI’s comment to the NTIA on open model weights

Response to NIST Executive Order on AI

OpenAI’s comments to the NTIA on data center growth, resilience, and security

Questions for the Record

Our views on AI policy and political advocacy
