An experiment eliciting relative estimates for Open Philanthropy’s 2018 AI safety grants, by NunoSempere

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: An experiment eliciting relative estimates for Open Philanthropy’s 2018 AI safety grants, published by NunoSempere on September 12, 2022, on The Effective Altruism Forum.

Summary

I present the design and results of an experiment eliciting relative values from six different researchers for the nine large AI safety grants Open Philanthropy made in 2018. The specific elicitation procedures I used might be usable for rapid evaluation setups, for going from zero to some evaluation, or for identifying disagreements. For weighty decisions, I would recommend more time-intensive approaches, like explicitly modelling the pathways to impact.

Background and motivation

This experiment follows up on past work around relative values (1, 2, 3) and, more generally, on work to better estimate values. The aim of this research direction is to explore a possibly scalable way of producing estimates and evaluations. If successful, this would bring utilitarianism and/or longtermism closer to producing practical guidance on more topics, which has been a recurring thread in my work over the last few years.

Methodology

My methodology was as follows:

1. I selected a group of participants whose judgment I consider to be good.
2. I selected a number of grants which I thought would be suitable for testing purposes.
3. Participants familiarized themselves with the grants and with what exactly they ought to be estimating.
4. Participants made their own initial estimates using two different methods:
   - Method 1: using a utility function extractor app (one way such pairwise comparisons might be reconciled is sketched below, after the list of participants).
   - Method 2: making a “hierarchical tree” of estimates (one possible such tree is sketched below, after the list of grants).
5. For each participant, I aggregated and/or showed their two estimates side by side, and asked them to make a best-guess estimate.
6. I took their best-guess estimates and held a discussion going through each grant, with participants discussing their viewpoints where they disagreed.
7. After the discussion, I asked participants to make new estimates.

Overall, participants took about two to three hours each to complete this process, roughly divided as follows:

- 10 to 30 mins to familiarize themselves with the estimation target and to re-familiarize themselves with the grants
- 20 to 40 mins to do the two initial estimates
- 5 to 30 mins to give their first best-guess estimate after seeing the results of the two different methods
- 1h to hold a discussion
- 5 to 30 mins to give their resulting best-guess estimate

The rest of this section goes through these steps individually.

Selection of participants

I selected participants by asking friends or colleagues whose judgment I trust and who had some expertise in or knowledge of AI safety. In particular, I selected participants who would be somewhat familiar with Open Philanthropy grants, because otherwise the time required for research would have been too onerous. The participants were Gavin Leech, Misha Yagudin, Ozzie Gooen, Jaime Sevilla, Daniel Filan, and another participant who prefers to remain anonymous. Note that one participant didn’t take part in all the rounds, which is why some summaries contain only five datapoints.
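The post does not spell out how the utility function extractor turns pairwise answers into a single set of relative values. As a minimal, purely illustrative sketch: one standard way to reconcile possibly inconsistent pairwise ratio judgments (“grant A is X times as valuable as grant B”) is a least-squares fit in log space, and a geometric mean is one natural way to fold two methods’ outputs into a best guess. The grant subset, the ratios, and the reconciliation and aggregation rules below are all assumptions for illustration, not the app’s actual algorithm.

```python
import math

import numpy as np

# Hypothetical pairwise judgments: (grant_a, grant_b, ratio) reads as
# "grant_a is `ratio` times as valuable as grant_b". Ratios are made up.
comparisons = [
    ("AI Impacts", "Ought", 0.5),
    ("Ought", "UC Berkeley", 0.8),
    ("AI Impacts", "UC Berkeley", 0.3),  # deliberately slightly inconsistent
]

grants = sorted({g for a, b, _ in comparisons for g in (a, b)})
index = {g: i for i, g in enumerate(grants)}

# Solve for log-values v with v[a] - v[b] ~ log(ratio) in the least-squares
# sense. Relative values are only defined up to a common scale, so softly
# pin the first grant's log-value to 0 to make the solution unique.
rows, rhs = [], []
for a, b, ratio in comparisons:
    row = np.zeros(len(grants))
    row[index[a]], row[index[b]] = 1.0, -1.0
    rows.append(row)
    rhs.append(math.log(ratio))
rows.append(np.eye(len(grants))[0])  # anchor: v[grants[0]] = 0
rhs.append(0.0)

v, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
method_1 = {g: math.exp(v[index[g]]) for g in grants}

# A second estimate of the same grants, e.g. from the tree method.
method_2 = {"AI Impacts": 1.0, "Ought": 1.8, "UC Berkeley": 2.5}

# The geometric mean is a natural aggregate for ratio-scale quantities.
best_guess = {g: math.sqrt(method_1[g] * method_2[g]) for g in grants}
print(best_guess)
```

A real elicitation would carry uncertainty (distributions over ratios rather than point estimates); this sketch uses point ratios only to keep the reconciliation and aggregation steps visible.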
Selection of grants

The grants I selected were:

- AI Impacts — General Support (2018): $100,000
- Machine Intelligence Research Institute — AI Safety Retraining Program: $150,000
- Open Phil AI Fellowship — 2018 Class: $1,135,000
- Ought — General Support (2018): $525,000
- Oxford University — Research on the Global Politics of AI: $429,770
- Stanford University — Machine Learning Security Research Led by Dan Boneh and Florian Tramer: $100,000
- UC Berkeley — AI Safety Research (2018): $1,145,000
- Wilson Center — AI Policy Seminar Series: $400,000

These are all the grants that Open Philanthropy made to reduce AI risk in 2018 above a threshold of $10k, according to their database. The year these grants were made is long enough ago that we have some information...
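The post describes Method 2 only as making a “hierarchical tree” of estimates. Under one plausible reading, sketched below, each grant is valued relative to a parent grant, with a reference grant at the root, so a grant’s value in units of the root is the product of the ratios along its path. The tree shape and all ratios are invented; only the grant names come from the list above.

```python
# Each entry maps a grant to (parent grant, value relative to that parent).
# "AI Impacts" serves as the root and the unit of value. The tree shape
# and the ratios are invented purely for illustration.
tree = {
    "Ought": ("AI Impacts", 2.0),
    "Open Phil AI Fellowship": ("Ought", 3.0),
    "UC Berkeley": ("Ought", 2.5),
    "Wilson Center": ("AI Impacts", 1.5),
}

def value(grant: str) -> float:
    """Relative value of `grant`, in units of the root grant."""
    total = 1.0
    while grant in tree:  # walk up to the root, multiplying ratios
        parent, ratio = tree[grant]
        total *= ratio
        grant = parent
    return total

for g in ["AI Impacts", *tree]:
    print(f"{g}: {value(g):.2f}")
```

One appeal of chaining ratios like this is that an estimator only ever makes local comparisons between similar grants; the trade-off is that errors compound along longer paths.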
