A hazard analysis framework for code synthesis large language models

OpenAI Blog · Papers

Summary

OpenAI presents a hazard analysis framework for uncovering the safety risks that code synthesis LLMs like Codex may pose technically, socially, politically, and economically. The analysis is informed by a novel methodology for evaluating code generation capabilities against the complexity and expressivity of specification prompts.

# A hazard analysis framework for code synthesis large language models

Source: [https://openai.com/index/a-hazard-analysis-framework-for-code-synthesis-large-language-models/](https://openai.com/index/a-hazard-analysis-framework-for-code-synthesis-large-language-models/)

## Abstract

Codex, a large language model (LLM) trained on a variety of codebases, exceeds the previous state of the art in its capacity to synthesize and generate code. Although Codex provides a plethora of benefits, models that may generate code at such scale have significant limitations, alignment problems, the potential to be misused, and the possibility to increase the rate of progress in technical fields that may themselves have destabilizing impacts or misuse potential. Yet such safety impacts are not yet known or remain to be explored. In this paper, we outline a hazard analysis framework constructed at OpenAI to uncover hazards or safety risks that the deployment of models like Codex may impose technically, socially, politically, and economically. The analysis is informed by a novel evaluation framework that determines the capacity of advanced code generation techniques against the complexity and expressivity of specification prompts, and their capability to understand and execute them relative to human ability.

Similar Articles

Evaluating large language models trained on code

OpenAI Blog

OpenAI introduces Codex, a GPT model fine-tuned on GitHub code, achieving 28.8% functional correctness on HumanEval (a new benchmark for code synthesis from docstrings), significantly outperforming GPT-3 (0%) and GPT-J (11.4%). The paper demonstrates that repeated sampling improves performance to 70.2% with 100 samples, and discusses limitations and broader impacts of code generation systems.
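
The 70.2% figure reflects a pass@k-style evaluation: generate k samples per problem and count a problem as solved if any sample passes its unit tests. Below is a minimal sketch of the unbiased pass@k estimator described in that paper, using numpy; the function name and the example numbers are illustrative.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k for one problem.

    n: total number of samples generated for the problem
    c: number of those samples that pass the unit tests
    k: the k in pass@k

    Computes 1 - C(n - c, k) / C(n, k): the probability that at least
    one of k samples drawn without replacement is correct.
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k samples
        # must include at least one correct one.
        return 1.0
    # Product form of 1 - C(n - c, k) / C(n, k), numerically stable.
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Illustrative example: 100 samples, 40 passing, estimate pass@10.
print(pass_at_k(n=100, c=40, k=10))
```

The product form avoids computing large binomial coefficients directly, which would overflow for realistic sample counts.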

Running Codex safely at OpenAI

OpenAI Blog

OpenAI details how it deploys Codex with safety controls including sandboxing, approval policies, network policies, and agent-native telemetry to ensure secure operation of coding agents in enterprise environments.
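
The post describes these controls rather than publishing an implementation. As a purely hypothetical illustration of the approval-policy idea, the sketch below gates commands issued by a coding agent before they execute; the command allowlist, function names, and policy behavior are assumptions made for this example, not OpenAI's actual configuration.

```python
import shlex
import subprocess

# Hypothetical allowlist of commands treated as read-only and safe to
# auto-approve; a real deployment would use a far more careful policy.
READ_ONLY_COMMANDS = {"ls", "cat", "grep", "head", "git"}

def approve(command: str) -> bool:
    """Return True if the command may run without a human in the loop."""
    parts = shlex.split(command)
    if not parts:
        return False
    if parts[0] in READ_ONLY_COMMANDS:
        return True  # auto-approved under the read-only policy
    # Everything else escalates to a human reviewer.
    answer = input(f"Agent wants to run: {command!r}. Allow? [y/N] ")
    return answer.strip().lower() == "y"

def run_agent_command(command: str) -> None:
    if not approve(command):
        print("Command denied by approval policy.")
        return
    # A real deployment would execute this inside a sandbox subject to a
    # network policy and record it via telemetry; here we simply run it.
    subprocess.run(shlex.split(command), check=False)

run_agent_command("ls -la")
```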

Lessons learned on language model safety and misuse

OpenAI Blog

OpenAI shares lessons learned on language model safety and misuse, discussing challenges in measuring risks, the limitations of existing benchmarks, and their development of new evaluation metrics for toxicity and policy violations. The post also highlights concerns about labor market impacts and the need for continued research on measuring social effects of AI deployment at scale.