EA - AI Safety - 7 months of discussion in 17 minutes by Zoe Williams

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Safety - 7 months of discussion in 17 minutes, published by Zoe Williams on March 15, 2023 on The Effective Altruism Forum.

In August 2022, I started making weekly summaries of the top EA and LW forum posts. This post collates the key trends I've seen in AI Safety discussions since then. Note that a lot of good work is happening outside what's posted on these forums too! This post doesn't try to cover that work.

If you'd like to keep up on a more regular basis, consider subscribing to the Weekly EA & LW Forum Summaries. And if you're interested in similar overviews for other fields, check out this post covering 6 months of animal welfare discussion in 6 minutes.

Disclaimer: this is a blog post and not a research report - meaning it was produced quickly and is not to our (Rethink Priorities') typical standards of substantiveness and careful checking for accuracy. Please let me know if anything looks wrong or if I've missed key pieces!

Table of Contents

(It's a long post! Feel free to pick and choose sections to read; they're all written to make sense individually.)

Key Takeaways
Resource Collations
AI Capabilities
  Progress
  What AI still fails at
  Public attention moves toward safety
AI Governance
  AI Safety Standards
  Slow down (dangerous) AI
  Policy
  US / China Export Restrictions
  Paths to impact
Forecasting
  Quantitative historical forecasting
  Narrative forecasting
Technical AI Safety
  Overall Trends
  Interpretability
  Reinforcement Learning from Human Feedback (RLHF)
  AI assistance for alignment
  Bounded AIs
  Theoretical Understanding
Outreach & Community-Building
  Academics and researchers
  University groups
Career Paths
  General guidance
  Should anyone work in capabilities?
Arguments for and against high x-risk
  Against high x-risk from AI
  Counters to the above arguments
Appendix - All Post Summaries

Key Takeaways

- There are multiple living websites that provide good entry points into understanding AI Safety ideas, communities, key players, research agendas, and opportunities to train or enter the field. (see more)
- Large language models like ChatGPT have drawn significant attention to AI and kick-started race dynamics. There seems to be slowly growing public support for regulation. (see more)
- Holden Karnofsky recently took a leave of absence from Open Philanthropy to work on AI Safety Standards, which have also been called out as important by leading AI lab OpenAI. (see more)
- In October 2022, the US announced extensive restrictions on the export of AI-related products (e.g. chips) to China. (see more)
- There has been progress on AI forecasting (quantitative and narrative), with the aim of allowing us to understand likely scenarios and prioritize between governance interventions. (see more)
- Interpretability research has seen substantial progress, including identifying the meaning of some neurons, eliciting what a model has truly learned or knows (in limited, specific cases), and circumventing features of models, like superposition, that can make this more difficult. (see more)
- There has been discussion of new potential methods for technical AI safety, including building AI tooling that assists alignment researchers without requiring agency, and building AIs which emulate human thought patterns. (see more)
- Outreach experimentation has found that AI researchers prefer arguments that are technical and written by ML researchers, and that greater engagement is seen in university groups with a technical rather than altruistic or philosophical focus. (see more)

Resource Collations

The AI Safety field is growing (80K estimates there are now ~400 FTE working on AI Safety). To improve efficiency, many people have put together collations of resources to help people quickly understand the relevant players and their approaches - as well as materials that make it easier to enter the field or upskill...
