Experts Urge Standardized AI Vulnerability Evaluations and Independent Oversight to Mitigate Risks
November 7, 2024

A recent panel featuring experts from Google DeepMind, OpenAI, Hugging Face, and Microsoft discussed current practices and challenges in evaluating AI vulnerabilities.
While general-purpose AI systems like ChatGPT and Stable Diffusion are widely utilized, they pose significant risks, including biased decision-making and the potential for creating non-consensual intimate imagery.
In her keynote address, Rumman Chowdhury, CEO of Humane Intelligence, emphasized the necessity of independent oversight, criticizing self-evaluations by companies as potentially biased.
Nicholas Carlini from Google DeepMind highlighted the absence of standardized procedures for disclosing AI vulnerabilities, contrasting it with established norms for software vulnerabilities.
To address these vulnerabilities, Avijit Ghosh introduced a framework for Coordinated Flaw Disclosures (CFD), which includes various components for ethical and safety assessments.
Lama Ahmad from OpenAI detailed the company's three forms of evaluations, stressing the importance of external red teaming and partnerships with AI safety organizations.
Independent third-party evaluations are crucial for providing unbiased assessments of AI systems, as they incorporate diverse perspectives and expertise.
Key outcomes from the discussions included a call for legal protections, termed 'safe harbors,' for third-party evaluators, as well as the need for standardization in evaluation processes and terminology.
Casey Ellis of Bugcrowd pointed out the necessity of a common language in policy discussions regarding AI vulnerabilities, acknowledging that such vulnerabilities are inevitable.
Chowdhury also advocated for building a robust pipeline of talent and collaboration among stakeholders, including lawyers and AI specialists, to enhance evaluation practices.
A third panel addressed legal and policy aspects, with Harley Geiger emphasizing the need for updated laws to protect evaluators and to cover non-security AI risks.
Jonathan Spring from CISA outlined seven security goals that should guide the management of AI vulnerabilities, while Lauren McIlvenny reiterated the importance of coordination in addressing both software and AI risks.
Source: Stanford HAI, "Strengthening AI Accountability Through Better Third Party Evaluations," November 6, 2024.