Introduction
Anders Sandberg, Eudoxa AB
Why do we want friendly superintelligence?
Because we definitely do not want to encounter an unfriendly superintelligence.
Perhaps more seriously, the goal of "friendly AI" is to maximise the
benefits of artificial intelligence while minimising the risks. That artificial
intelligence can be extremely useful is obvious: there are few areas of human
activity that would not be helped by the presence of extra intelligence – everything
from simplifying dangerous work to providing an alternate perspective in decision
making. On the other hand, we mistrust independently acting systems that are
not bound by ethics or other constraints that would prevent them from harming
us.
Friendly AI is in the end a practical problem: creating artificial intelligence that benefits rather than harms humans. It is a transdisciplinary problem, involving issues from philosophy, cognitive science, economics and engineering, among others. The current discussion deals with friendly superintelligence because it may pose the greatest challenge: how do we design and act in order to achieve not just beneficial artificial intelligence, but beneficial powerful artificial intelligence? We want a better solution than Bill Joy's suggestion that all such research be relinquished (an impractical proposal with minimal chance of working).
What do we mean by general intelligence? There is no agreed-upon definition of intelligence, but for the purposes of this discussion two rough definitions may suffice. The first, due to Eliezer Yudkowsky, defines intelligence as the ability to model, predict and manipulate regularities in reality. Another definition, due to Peter Voss and me, is that intelligence is the ability to accomplish things in a general domain, given knowledge and the environment. Both definitions deal with general intelligence rather than ability in a narrowly restricted domain.
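For readers who want something more formal, a later sketch in the same spirit as the second definition (due to Legg and Hutter, and not part of either definition above) scores an agent π by its expected performance across all computable environments, weighted towards the simpler ones:

Υ(π) = Σ_μ 2^(−K(μ)) · V(π, μ)

where the sum runs over computable environments μ, K(μ) is the Kolmogorov complexity of μ, and V(π, μ) is the expected cumulative reward π earns in μ. An agent can only score well by performing across many different environments, which is exactly the "general domain" requirement: a narrow specialist gains little from mastering a single μ.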
Superintelligence is another diffuse concept. It is not obvious how to measure or compare forms of intelligence with each other, but it is clear that there can exist both quantitative and qualitative differences. A dog mind transferred to vastly faster hardware would still not be able to solve mathematical problems, regardless of the amount of canine education given to it, since its basic structure likely does not allow this form of abstract thought. Minds may have different abilities to find and exploit patterns. For this discussion, the potential consequences are more relevant than any particular measure. If an artificial intelligence can affect the world to a great extent, then it may be regarded as a superintelligence for our purposes, even when it is in some sense mentally inferior to (say) a human intelligence. Of course, a mind with greater abilities in pattern detection and exploitation would likely be more dangerous than a less skilled mind in the same situation.
What is friendliness? Eliezer S. Yudkowsky defined it in Creating Friendly AI as: "The term 'Friendly AI' refers to the production of human-benefiting, non-human-harming actions in Artificial Intelligence systems that have advanced to the point of making real-world plans in pursuit of goals."
With these preliminaries dealt with, we turn to the discussion.