EA - A survey of concrete risks derived from Artificial Intelligence by Guillem Bas

The Nonlinear Library: EA Forum - A Podcast by The Nonlinear Fund

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A survey of concrete risks derived from Artificial Intelligence, published by Guillem Bas on June 9, 2023 on The Effective Altruism Forum.

Riesgos Catastróficos Globales has conducted a literature review and an expert elicitation exercise to categorize concrete risks associated with Artificial Intelligence (AI). This is part of our ongoing work on the implementation of the EU AI Act in Spain.

Here we present a short overview of the risks we have found. It is meant as a mental framework for policymakers to consider when developing AI policies, but we think it might also be useful for prompting discussion within the community. Please feel free to leave your thoughts as comments!

To facilitate comprehension, we have split the identified risks into two categories: adversarial and structural risks. Adversarial risks are those caused by the direct action of an agent, be it rogue groups, state actors, or misaligned AI. Structural risks are those derived from the wide-scale or high-impact deployment of AI, with diffuse causes.

The distinction builds on the categorization of accidents, misuse, and structural risks by Zwetsloot & Dafoe (2019). We merged the first two because there is not always a clear difference in how accidents (AI misalignment) and misuse (humans exploiting an AI system to cause harm) materialize as specific threats.

As for this materialization, we outline risks by integrating present and future implications. That is, we state that their long-term impact is potentially large, but we ground them in existing, modest evidence. This choice is based on the assumption that policymakers tend to discount speculative framings. The underlying logic we try to convey is that damage will grow along with capabilities and deployment.

We have identified nine concrete risks within these categories, summarized below. The categorization is not perfect, but we prioritized clarity and concreteness over accuracy and exhaustiveness.

Adversarial risks (directly caused by agents, either humans or misaligned AI):
- Cyberattacks and other unauthorized access. Example vignette: LLM-enabled spear-phishing campaigns.
- Strategic technology development. Example vignette: development of a new biological weapon.
- User manipulation. Example vignette: individuals persuaded to support a certain political option.

Structural risks (caused by widespread automation):
- Job market disruption. Example vignette: a 10% increase in unemployment over a year.
- Socioeconomic inequality. Example vignette: leading companies capturing AI-created surpluses.
- Bias amplification. Example vignette: minority groups being systematically denied access to housing or loans.
- Epistemic insecurity. Example vignette: proliferation of deepfakes.
- Faulty automation of critical processes. Example vignette: an accidental nuclear attack from fully automated command and control.
- Defective optimization. Example vignette: hospitals rejecting patients with serious conditions to maximize performance metrics.

We briefly introduce these risks below, together with references for further reading.

Adversarial risks

This section compiles potential threats from rogue human actors and misaligned AI. The final list coincides with what Shevlane et al. (2023) call "extreme risks" and is loosely connected to the distinction between digital, physical, and political dimensions proposed by Brundage et al. (2018). Readers might note that our selected risks are commonly mentioned as instances of power-seeking behavior. We have not included vignettes about goal misspecification and misgeneralization for two reasons: they tended to be too vague and, to be impactful, most of them required the instrumental use of the actions listed below.

Cyberattacks and other unauthorized access

AI promises to enhance the execution of cyber offenses, increasing their scale and impact (Brundage et al., 2018). New tools can automate manual tasks (see ...
