Webmaster’s Home (ChinaZ.com) December 19 News: OpenAI, the developer of ChatGPT, has drawn up a plan to address the serious dangers artificial intelligence may pose. The framework includes risk “scorecards” for its AI models that measure and track various indicators of potential harm, along with evaluations and forecasts. OpenAI says it will continue to refine and update the framework based on new data, feedback and research.
The company’s Preparedness Framework will employ AI researchers, computer scientists, national security experts and policy professionals to monitor the technology, test it continually, and alert the company if they believe any AI capability is becoming dangerous. The team sits between OpenAI’s “Safety Systems” team and its “Superalignment” team. The former works on problems in existing artificial intelligence, such as racist bias in models, while the latter studies how to ensure AI does not harm humans in an imagined future where it completely surpasses human intelligence.
It is reported that the preparedness team is recruiting national security experts from outside the artificial intelligence field to help OpenAI understand how to respond to major risks. The team is in discussions with organizations including the U.S. National Nuclear Security Administration to ensure the company can properly study the risks posed by artificial intelligence.
The company will also allow "qualified, independent third parties" from outside OpenAI to test its technology.
OpenAI’s Preparedness Framework contrasts sharply with the policies of its main competitor, Anthropic.
Anthropic recently released its Responsible Scaling Policy, which defines specific AI Safety Levels and the corresponding protocols for developing and deploying AI models. There are significant structural and methodological differences between the two frameworks. Anthropic’s policy is more formal and prescriptive, tying safety measures directly to model capabilities and pausing development when safety cannot be demonstrated. OpenAI’s framework is more flexible and adaptable, setting general risk thresholds that trigger review rather than predefined levels. Both frameworks have their strengths and weaknesses, but experts say Anthropic’s approach may have the advantage in incentivizing and enforcing safety standards. Some observers also believe that OpenAI is catching up on safety protocols after facing criticism over the rapid and aggressive deployment of models such as GPT-4, and that Anthropic’s policy is advantageous in part because it is proactive rather than reactive.
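To make the contrast concrete, the sketch below is a purely hypothetical illustration, not code or terminology from either company: a threshold-triggered risk “scorecard” of the kind OpenAI’s framework describes, where each tracked category carries a score and crossing its threshold flags the model for review before further deployment. The category names and threshold values are invented for the example.

```python
# Hypothetical sketch of a threshold-triggered risk scorecard.
# Not OpenAI's or Anthropic's actual implementation; names and values are illustrative.

from dataclasses import dataclass

@dataclass
class RiskCategory:
    name: str                 # a tracked harm indicator (hypothetical)
    score: float              # current measured risk score, 0.0 (low) to 1.0 (high)
    review_threshold: float   # crossing this triggers a safety review

def categories_needing_review(scorecard: list[RiskCategory]) -> list[str]:
    """Return names of categories whose score meets or exceeds their threshold."""
    return [c.name for c in scorecard if c.score >= c.review_threshold]

if __name__ == "__main__":
    scorecard = [
        RiskCategory("cybersecurity", 0.35, 0.70),
        RiskCategory("persuasion", 0.80, 0.70),
    ]
    flagged = categories_needing_review(scorecard)
    if flagged:
        print("Hold deployment pending review of:", ", ".join(flagged))
    else:
        print("All tracked categories are below their review thresholds.")
```

Under this reading, the difference between the two approaches is mainly where the gate sits: OpenAI’s thresholds trigger a review, while Anthropic’s safety levels are tied to capabilities and can pause development outright until safety is demonstrated.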
Regardless of their differences, both frameworks represent important advances in AI safety. As AI models become more powerful and ubiquitous, collaboration and coordination on safety techniques among leading laboratories and stakeholders are now key to ensuring AI is used beneficially and ethically for humanity.