The Alignment Problem From a Deep Learning Perspective
AI Safety Fundamentals: Alignment - A podcast by BlueDot Impact
Within the coming decades, artificial general intelligence (AGI) may surpass human capabilities at a wide range of important tasks. We outline a case for expecting that, without substantial effort to prevent it, AGIs could learn to pursue goals which are undesirable (i.e. misaligned) from a human perspective. We argue that if AGIs are trained in ways similar to today's most capable models, they could learn to act deceptively to receive higher reward, learn internally-represented goals which g...