EA - Cost-effectiveness of student programs for AI safety research by Center for AI Safety

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Cost-effectiveness of student programs for AI safety research, published by Center for AI Safety on July 10, 2023 on The Effective Altruism Forum.

Summary

This post explores the cost-effectiveness of field-building programs for students, specifically the Atlas Fellowship (a rationality program with some AI safety programming), MLSS (an ML safety course for undergraduates), a top-tier university student group, and undergraduate research stipends.

We estimate the benefit of these programs in 'Quality-Adjusted Research Years' (QARYs), using cost-effectiveness models built for the Center for AI Safety (introduction post here, full code here). Since our framework focuses exclusively on benefits for technical AI safety research, we will not account for other benefits of programs with broader objectives, such as the Atlas Fellowship.

We intend for these models to support - not determine - strategic decisions. We do not believe, for instance, that programs the model rates as less cost-effective are necessarily not worthwhile as part of a portfolio of programs.

The models' tentative results, summarized below, suggest that student groups and undergraduate research stipends are considerably more cost-effective than Atlas and MLSS (with many important caveats and uncertainties, discussed in the post). Additionally, student groups and undergraduate research stipends compare favorably to 'baseline' programs: directly funding a talented research scientist or PhD student working on trojans research for 1 year or 5 years respectively.

| Program | Cost (USD) | Benefit (counterfactual expected QARYs) | Cost-effectiveness (QARYs per $1M) |
|---|---|---|---|
| Atlas | 9,000,000 | 43 | 4.7 |
| MLSS | 330,000 | 6.4 | 19 |
| Student Group | 350,000 | 50 | 140 |
| Undergraduate Stipends | 50,000 | 17 | 340 |
| Baseline: Scientist Trojans | 500,000 | 84 | 170 |
| Baseline: PhD Trojans | 250,000 | 8.7 | 35 |

(The cost-effectiveness column is the benefit divided by the cost, scaled to QARYs per $1M; a short sketch after the outline below makes this explicit.)

For readers who are after high-level takeaways, including which factors are driving these results, skip ahead to the cost-effectiveness in context section. For those keen on understanding the model and results in more detail, read on as we:

- Give important disclaimers. (Read more.)
- Direct you to background information about this project. (Read more.)
- Walk through the model. (Read more.)
- Contrast these programs with one another, and with funding researchers directly. (Read more.)
- Test the robustness of the model. (Read more.)
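As a consistency check on the summary table, the cost-effectiveness column can be recomputed directly from the cost and benefit columns. The sketch below does this in plain Python; the figures are transcribed from the table rather than produced by the underlying models (which are linked in the post), and small discrepancies against the published column (e.g., 43 / $9M gives 4.8 rather than 4.7) presumably reflect rounding of the published inputs.

```python
# Recompute the cost-effectiveness column of the summary table:
# QARYs per $1M = benefit (QARYs) / cost (USD) * 1,000,000.
# Cost and benefit figures are transcribed from the table above.

programs = {
    # program: (cost in USD, benefit in counterfactual expected QARYs)
    "Atlas": (9_000_000, 43),
    "MLSS": (330_000, 6.4),
    "Student Group": (350_000, 50),
    "Undergraduate Stipends": (50_000, 17),
    "Baseline: Scientist Trojans": (500_000, 84),
    "Baseline: PhD Trojans": (250_000, 8.7),
}

for program, (cost_usd, benefit_qarys) in programs.items():
    qarys_per_million = benefit_qarys / cost_usd * 1_000_000
    print(f"{program:<28} {qarys_per_million:6.1f} QARYs per $1M")
```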
Disclaimers

This analysis is a starting point for discussion, not a final verdict. The most critical reasons for this are that:

- These models are reductionist. Even if we have avoided other pitfalls associated with cost-effectiveness analyses, the models might ignore factors that turn out to be crucial in practice, including (but not limited to) interactions between programs, threshold effects, and diffuse effects.
- The models' assumptions are first-pass guesses, not truths set in stone. Most assumptions are imputed second-hand after brief reflection, then adjusted ad hoc for internal consistency and for differences of belief between Center for AI Safety (CAIS) staff and external practitioners. In some cases, parameters have been redefined since initial practitioner input. This caveat is particularly important for the Atlas Fellowship, where we have not discussed parameter values with key organizers.

Instead, the analyses in this post represent an initial effort to lay out assumptions explicitly, in order to take a more systematic approach towards AI safety field-building.

Background

For an introduction to our approach to modeling - including motivations for using models, the benefits and limitations of our key metric, guidance for adopting or adapting the models for your own work, comparisons between programs for students and professionals, and more - refer to the introduction post.

The models' default parameters are based on practitioner surveys and the ex...
