EA - Four mindset disagreements behind existential risk disagreements in ML by RobBensinger
The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Four mindset disagreements behind existential risk disagreements in ML, published by RobBensinger on April 11, 2023 on The Effective Altruism Forum.

I've previously written that good ideas and conversations about AGI seem to have propagated through ML weirdly slowly.

A different weird phenomenon I observe is that the field's relative inaction about AGI seems less based on a confident set of beliefs about AGI definitely going well (or definitely being far off), and more based on an implicit sense like "the default is everything going well, and we don't need to change anything until there's overwhelming evidence to the contrary".

Some people do have confident beliefs that imply "things will go well"; I disagree there, but I expect some amount of disagreement like that. But that doesn't seem to be the crux for most people in ML.

In a sane world, it doesn't seem like "well, maybe AI will get stuck at human-ish levels for decades" or "well, maybe superintelligence couldn't invent any wild new tech" ought to be cruxes for "Should we pause AI development?" or "Is alignment research the world's top priority?"

Note that I'm not arguing "an AGI-mediated extinction event is such a big deal that we should make it a top priority even if it's very unlikely". There are enough other powerful technologies on the horizon, and enough other risks for civilizational collapse or value lock-in, that I don't in fact think AGI x-risk should get major attention if it's very unlikely.

But the most common view within ML seems to be less "it's super unlikely for reasons X Y Z", and more of an "I haven't thought about it much" and/or "I see some reasons to be very worried, but also some reasons things might be fine, so I end up with medium-ish levels of worry".

48% of researchers who had recently published in NeurIPS or ICML gave double-digit probabilities in mid-2022 to advanced AI's long-term effect being "extremely bad (e.g., human extinction)". A similar number gave double-digit probabilities to "human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species".

91% of researchers working on "long-term AI topics" at CHAI, DeepMind, MIRI, OpenAI, Open Philanthropy, and what would become Anthropic gave double-digit probabilities in early 2021 to "the overall value of the future will be drastically less than it could have been, as a result of AI systems not doing/optimizing what the people deploying them wanted/intended".

The level of concern and seriousness I see from ML researchers discussing AGI on any social media platform or in any mainstream venue seems wildly out of step with "half of us think there's a 10+% chance of our work resulting in an existential catastrophe".

I think the following four factors help partly (though not completely) explain what's going on. If I'm right, then I think there's some hope that the field can explicitly talk about these things and consciously course-correct.

1. "Conservative" predictions, versus conservative decision-making.
2. Waiting for a fire alarm, versus intervening proactively.
3. Anchoring to what's familiar, versus trying to account for potential novelties in AGI.
4. Modeling existential risks in far mode, versus near mode.
"Conservative" predictions, versus conservative decision-makingIf you're building toward a technology as novel and powerful as "automating every cognitive ability a human can do", then it may sound "conservative" to predict modest impacts. But at the decision-making level, you should be "conservative" in a very different sense, by not gambling the future on your technology being low-impact.The first long-form discussion of AI alignment, Eliezer Yudkowsky's Creating Friendly AI 1.0, made this point in 2001:The conservative assumption according to futur...
