EA - Could someone help me understand why it's so difficult to solve the alignment problem? by Jadon Schmitt

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Could someone help me understand why it's so difficult to solve the alignment problem?, published by Jadon Schmitt on July 24, 2023 on The Effective Altruism Forum.

AGI will be able to model human language and psychology very accurately. Given that, wouldn't alignment be easy if you trained the AGI to interpret linguistic prompts in the way that the "average" human would? (I know language doesn't encode an exact meaning, but for any chunk of text, there does exist a distribution of ways that humans interpret it.)

Thus, on its face, inner alignment seems fairly doable. But apparently, according to Rob Bensinger, "We don't know how to get an AI system's goals to robustly 'point at' objects like 'the American people' ... [or even] simpler physical systems." Why is this so difficult? Is there an argument that it is impossible?

Outer alignment doesn't seem very difficult to me, either. Here's a prompt I thought of: "Do not take an action if anyone on a specified list of philosophers, intellectuals, members of the public, etc. would prefer that you not take it, given all relevant knowledge of the action and its effects beforehand, consistent with the human legal standard of informed consent." Wouldn't this prompt (in its ideal form, not exactly as I wrote it) guard against many bad actions, including power-seeking behavior?

Thank you for the help!

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
