Global AI experts suggest steps towards trustworthy AI development

Leading researchers co-author unique report proposing ten mechanisms to help AI developers make more verifiable claims

58 experts on the technical and policy aspects of AI have jointly authored a ground-breaking report – proposing ten detailed, concrete steps AI companies should take to move towards trustworthy AI development.

“In order for AI developers to earn trust from system users, purchasers, civil society, governments, and other stakeholders that they are building AI responsibly, there is a need to move beyond principles to a focus on mechanisms for demonstrating responsible behavior,” says the newly published report.

Assessing the limits of ethics principles and codes of conduct – as well as the substantial impact AI development is having on communities around the globe – the report is a clarion call for AI developers worldwide to address the clear lack of trust in how AI is currently developed.

To address this, the report – Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims – recommends ten interventions to move toward trustworthy AI development:

A coalition of stakeholders should create a task force to research options for conducting and resourcing third-party auditing of AI systems.
Organizations developing AI should run red-teaming exercises to explore risks associated with systems they develop and share best practices and tools.
AI developers should pilot bias and safety bounties for AI systems.
AI developers should share more information about AI incidents including through collaborative channels such as the Partnership on AI.
Standard setting bodies should work with academia and industry to develop audit trail requirements for safety-critical applications of AI systems.
Organizations developing AI and funding bodies should support research into the interpretability of AI systems, with a focus on supporting risk assessment and auditing.
AI developers should develop, share, and use suites of tools for privacy-preserving machine learning that include measures of performance against agreed standards.
Industry and academia should work together to develop hardware security features for AI accelerators or otherwise establish best practices for the use of secure hardware (including secure enclaves on commodity hardware) in machine learning contexts.
One or more AI labs should attempt to comprehensively account for the computing power used in the context of a single project, and report on lessons learned regarding the potential for standardizing such reporting.
Government funding bodies should substantially increase funding for computing power resources for researchers in academia and civil society, in order to improve the ability of those researchers to verify claims made by industry.

The co-authors come from a wide range of organisations and disciplines, including the Alan Turing Institute and the Partnership on AI; Cambridge University’s Centre for the Future of Intelligence and Oxford University’s Future of Humanity Institute; Google Brain and OpenAI, leading AI research companies; the Center for Security and Emerging Technology, a US-based bipartisan think-tank; and other organisations.

The 72-page report identifies three areas (institutional, software and hardware) in which progress can be made on specific mechanisms.

It suggests that institutional mechanisms can shape incentives or constrain the behavior of the people involved in AI development. They can help clarify an organization’s goals and values, increase transparency regarding an organization’s AI development, create incentives for organizations to act responsibly, and foster the exchange of information between developers. The authors call for AI developers to explore third-party auditing, red teams, safety and bias bounties and incident sharing.

Likewise, software mechanisms allow researchers, auditors, and others to understand the internal workings of an AI system. They can also help characterize how an AI system can be expected to behave when used in a particular setting. The proposed mechanisms are audit trails, interpretability, and privacy-preserving machine learning.

Hardware mechanisms address who has which physical computing resources, and how those resources are accessed and monitored. They also cover how the resources are designed, manufactured, and tested. Hardware mechanisms aim to condition or constrain the behavior of actors who use these resources. The report emphasises the importance of secure hardware for machine learning, high-precision compute measurement, and computing power support for academia.

While the trustworthy development of AI has been highlighted in high-profile settings (e.g. the European Commission’s High-Level Expert Group on AI), a set of concrete, voluntary mechanisms that AI developers can adopt to make more verifiable claims has not yet been analysed comprehensively – until now.

“We are facing a crisis of trust in AI development, and AI developers need to take concrete steps now to address this crisis.”

– Haydn Belfield, CSER Research Associate and one of the report’s lead co-authors

Belfield added: “People understand the opportunities and challenges AI and machine learning bring. Almost all AI developers want to act responsibly, safely and ethically – but it’s been unclear what they can concretely do. No longer. It’s now time for AI developers to move beyond well-meaning ethical principles, and introduce concrete mechanisms to move towards trustworthy AI development.”

Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims is available to download and read at http://www.towardtrustworthyai.com/.

Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims
Report by Miles Brundage, Shahar Avin, Jasmine Wang, Haydn Belfield, Gretchen Krueger, Gillian Hadfield, Heidy Khlaaf, Jingying Yang, Helen Toner, Ruth Fong, Tegan Maharaj, Pang Wei Koh, Sara Hooker, Jade Leung, Andrew Trask, Emma Bluemke, Jon Lebensold, Cullen O'Keefe, Mark Koren, Théo Ryffel, JB Rubinovitz, Tamay Besiroglu, Federica Carugati, Jack Clark, Peter Eckersley, Sarah de Haas, Maritza Johnson, Ben Laurie, Alex Ingerman, Igor Krawczuk, Amanda Askell, Rosario Cammarota, Andrew Lohn, Shagun Sodhani, Charlotte Stix, Peter Henderson, Logan Graham, Carina Prunkl, Bianca Martin, Elizabeth Seger, Noa Zilberman, Seán Ó hÉigeartaigh, Frens Kroeger, Girish Sastry, Rebecca Kagan, Adrian Weller, Brian Tse, Beth Barnes, Allan Dafoe, Paul Scharre, Martijn Rasser, David Krueger, Carrick Flynn, Ariel Herbert-Voss, Thomas Krendl Gilbert, Lisa Dyer, Saif Khan, Markus Anderljung, Yoshua Bengio