EA - The first AGI will be a buggy mess by titotal
The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The first AGI will be a buggy mess, published by titotal on July 30, 2022 on The Effective Altruism Forum.

There's a common sentiment when discussing AI existential risk (x-risk) that "we only have one shot", "we have to get AI safety right on the first try", etc. Here is one example of this sentiment:

"We think that once humanity builds its first AGI, superintelligence is likely near, leaving little time to develop AI safety at that point. Indeed, it may be necessary that the first AGI start off aligned: we may not have the time or resources to convince its developers to retrofit alignment to it."

The belief is that as soon as we create an AI with at least human-level general intelligence, it will be relatively easy for it to use its superior reasoning, extensive knowledge, and superhuman thinking speed to take over the world. This assumption is so pervasive in AI risk thinking that it's often taken as obvious, and sometimes not even mentioned as a premise. I believe that this assumption is wrong, or at least insufficiently proven. One of the reasons I believe this is that the first AGI will, inevitably, be a buggy mess.

Why the first AGI will almost certainly be buggy:

Because writing bug-free code is impossible. There is code that is nearly bug-free: NASA's code comes very close, but only because they build up reams of external documentation and testing before even daring to make a slight change to it. There are no indications that AGI will be built in this manner. The more typical model is to put out alpha versions of software, then spend many months ironing out bugs. Whatever insight or architecture is required for AGI, there is a very high likelihood it will first be implemented in an alpha or pre-alpha test build.

The objection that comes to mind is that being buggy and being an AGI are mutually incompatible: the argument would be that an AGI must be bug-free, because if the AI is buggy, then it's impossible for it to have human-level general intelligence. I have roughly 7 billion counterexamples to this argument. Humans are a human-level general intelligence with bugs in spades, be it optical illusions, mental illnesses, or just general irrationality. Being perfectly bug-free was never an evolutionary requirement for our intelligence to develop, so it didn't happen. The same logic applies to an AGI. Every single example of intelligence above a certain threshold, be it software, humans, or animals, has mental flaws in abundance; why would an AGI be any different? An AGI does not need to be perfect to be incredibly useful. It's much, much easier to create a flawed AGI than a flawless one, and the possibility space for fallible AGIs is orders of magnitude greater than that for infallible ones. It's extremely unlikely that the first AGI (or really any AGI) will have no bugs or mental flaws.

In a way, this is an argument for why we should be concerned about AI going rogue. We call software "buggy" when it doesn't do what we want, and a misaligned AI is an AI that doesn't do what we want. Stating that an AI is misaligned is just saying that the AI's goal function implementation is buggy (and the argument is that it only needs to be a little buggy to cause x-risk). In these terms, AI safety is just a very high-stakes pre-emptive debugging problem.
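To make that framing concrete, here is a toy sketch in Python (the scenario and function names are invented for illustration, not taken from any real system) of how a misaligned goal function can be nothing more than an ordinary bug:

    # Toy sketch (hypothetical): the designers intend to reward a cleaning
    # agent for rooms cleaned. One ordinary mistake -- a reversed
    # subtraction -- makes the shipped goal reward rooms left dirty.

    def intended_reward(rooms_cleaned, rooms_total):
        # What the designers meant: more cleaned rooms, more reward.
        return rooms_cleaned

    def buggy_reward(rooms_cleaned, rooms_total):
        # What was actually shipped: should be rooms_cleaned, but the
        # subtraction is reversed, so dirty rooms are rewarded instead.
        return rooms_total - rooms_cleaned

    print(intended_reward(9, 10))  # 9: the behaviour the designers wanted
    print(buggy_reward(9, 10))     # 1: an optimizer pointed here
                                   #    maximizes mess, not cleanliness

Every line of buggy_reward "works" in the ordinary software sense, yet an optimizer aimed at it pursues the opposite of what was wanted: misalignment as a debugging problem.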
But bugs in the goal function will be paired with bugs in the execution functions, so the AGI will also be buggy at doing the things that it wants to do. What types of bugs could occur? I can think of a few broad categories (a toy sketch of the first category follows this list):

Crashes/glitches: logic errors, divide-by-zero errors, off-by-one errors, etc, the types you'll find in any codebase, due to simple mistakes made by fallible programmers.

Incorrect beliefs: Inevitably, to do tasks, we have to make assumptions. In some cases, like a program that solves the s...
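As promised above, here is a minimal Python sketch of the crashes/glitches category (the function is invented for illustration); it contains both an off-by-one logic error and a divide-by-zero crash:

    def average(scores):
        # Off-by-one: range(len(scores) - 1) silently drops the last score.
        total = 0
        for i in range(len(scores) - 1):
            total += scores[i]
        # Divide-by-zero: calling this on an empty list crashes.
        return total / len(scores)

    print(average([80, 90, 100]))  # 56.67 -- wrong answer (should be 90.0)
    # print(average([]))           # raises ZeroDivisionError

Neither mistake requires incompetence; both are exactly the kind of simple mistakes made by fallible programmers described above.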
