October 09, 2010

What did you learn about the singularity today?

What did I learn about the singularity during our track at ECAP10? Anna Salamon pressed me into trying to answer this good question. First, an overview of the proceedings:

Defining the subject

Amnon Eden started out by trying to clear up the concepts, or at least show where concepts need to be cleared up. His list of central questions included:


  • What is the singularity hypothesis? What is actually being claimed? This is really the first thing to settle; without clarifying the claim there is no real subject.
  • What is the empirical content of that claim? Can it be refuted or corroborated, and if so, how?
  • What exactly is the nature of the event forecasted? A discontinuity a la a phase transition or a process like Toffler's waves?
  • What - if anything - accelerates? Which metrics can be used to measure acceleration? What evidence supports its existence?
  • Is Moore's law a law of nature? What about the law of accelerating returns? (after all, some people seem to think this, and if that were true it would indeed be a non-trivial result)
  • What is the actual likelihood of an intelligence explosion (or other runaway effect)? Can it be prevented?
  • Has machine intelligence really been growing?
  • What does it mean to claim that biological evolution will be replaced by technological evolution?
  • How much can human intelligence increase?
  • How different would posthuman or cyborg children be from us?
  • What are the necessary and sufficient conditions for WBE? What must be emulated?


Conditions for the singularity


Anna Salamon gave a talk on "How intelligible is intelligence?" The key question is whether there exist simple principles that could produce powerful optimizers working across many domains and targets, and whether those principles are easy to figure out. This has obvious implications for the questions above, such as how likely sudden breakthroughs are compared to slow trial-and-error or complete failure (cf. my and Carl's talk). While she did not have a precise formal definition of intelligible concepts or systems, the notion is pretty intuitive: one can extend intelligible systems from principles or simpler versions, they are not domain specific, they can be implemented in many substrates, and aliens would likely be able to come up with the concept. Unintelligible systems are just arbitrary accumulations of parts or everything-connected-to-everything jumbles.

Looking through evidence from theoretical computer science, practical computer science, biology and human communities, her conclusion was that intelligence is at least somewhat intelligible - it doesn't seem to rely just on accumulating domain-specific tricks, but appears to have a few general and likely relatively simple modules that are extendible. Overall, a good start. As she said, now we need to see whether we can formally quantify the question and gather more evidence. It actually looks possible.

She made the point that many growth curves (in technology) look continuous rather than stair-step-like, which suggests they are due to progress on unintelligible systems (an accumulation of many small hacks). It might also be that systems have an intelligibility spectrum: modules on different levels are differently difficult, and while there might be smooth progress on one level, other levels might be resistant (for example, neurons are easier to figure out than cortical microcircuits). This again has bearing on the WBE problem: at what level do intelligibility, the ability to get the necessary data, and having enough computer power intersect first? Depending on where, the result might be very different (Joscha Bach and I had a big discussion on whether 'generic brains'/brain-based AI (Joscha's view) or individual brains (my view) would be the first outcome of WBE, with Professor Günther Palm arguing for brain-inspired AI).
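One toy way to pose the "which level wins" question (entirely my own illustration - the levels are the ones from our discussion, but the year figures are arbitrary placeholders, not estimates from anyone at the meeting): a level only becomes usable once understanding, scan data and compute are all available, so its arrival time is the maximum of the three, and the first outcome is the level whose maximum arrives earliest.

```python
# Toy sketch: which emulation level becomes feasible first?
# The year numbers below are arbitrary placeholders for illustration only.
levels = {
    # level: (understanding ready, scan data ready, compute ready)
    "brain-inspired AI (generic principles)": (2040, 2025, 2030),
    "generic brain model":                    (2035, 2035, 2035),
    "individual brain emulation":             (2030, 2045, 2040),
}

# A level is usable only when ALL three requirements are met.
arrival = {name: max(reqs) for name, reqs in levels.items()}
first = min(arrival, key=arrival.get)

print(arrival)
print("First feasible outcome:", first)
```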

Joscha Bach argued that there were four preconditions for reaching an AI singularity:


  1. Perceptual/cognitive access. The AIs need to be able to sense and represent the environment; this requires universal representations and general enough intelligence.
  2. Operational access. They need to be able to act upon the outside environment: they need write access, feedback, the ability to reach the critical parts of the environment and (in the case of self-improvement) access to their own substrate.
  3. Directed behaviour. They need to autonomously pursue behaviour that includes reaching the singularity. This requires a motivational system or some functional equivalent, agency (directed behaviour), autonomy (the ability to set their own goals), and a tendency to set goals that increase abilities and survivability.
  4. Resource sufficiency. There have to be enough resources for them to do all this.

His key claim was that these functional requirements are orthogonal to the architecture of actual implementations, and hence an AI singularity is not automatically a consequence of having AI.

I think this claim is problematic: precondition 1 (and maybe parts of 3) is essentially implied by any real progress in AI. But I think he clarified a set of important assumptions, and if all these preconditions are necessary, it is enough that one of them fails for an AI singularity not to happen. Refining this a bit further might be really useful.

He also made a very important point that is often overlooked: the threat/promise is not tied to a particular implementation but is a functional one. We should worry about intelligent agents pursuing a non-human agenda that are self-improving and self-extending. Many organisations come close. Just because they are composed of humans doesn't mean they work in the interest of those humans. I think he is right on the money that we should watch for the possibility of an organisational singularity, especially since AI or other technology might further strengthen the preconditions above even when the AI itself is not enough to go singular.

Kaj Sotala talked about factors that give a system high "optimization power"/intelligence. Calling it optimization power has the benefit of discouraging anthropomorphizing, but it might miss some of the creative aspects of intelligence. He categorised them into: 1) hardware advantages: faster serial processing, faster parallel processing, superior working memory equivalents; 2) self-improvement and architectural advantages: the ability to modify itself, overcome biased reasoning, use algorithms for formally correct reasoning, and add new modules such as fully integrated complex models; 3) software advantages: copyability, improved communication bandwidth, speed, etc. Meanwhile humans have various handicaps, ranging from our clunky hardware to our tendency to model others by modelling them on ourselves. So there are good reasons to think an artificial intelligence, once it came into existence, could achieve various optimization/intelligence advantages over humans relatively easily. Given the previous talk, his list is also interestingly functional rather than substrate-based. He concluded: "If you are building an AI, please be careful, please try to know what you are doing". Which nicely segues into the next pair of talks:


Will superintelligences behave morally or not?

Joshua Fox argued that superintelligence does not imply benevolence. We cannot easily extrapolate nice moral trends among humans or human societies to (self-)constructed intelligences with different cognitive architectures. He argued that instrumental morality is not guaranteed: reputations, the powers to monitor, punish and reward each other, and the economic incentives to cooperate are not reliable arguments for proving the benevolence of such agents. Axiological morality depends on what the good truly is. Kant would no doubt argue that any sufficiently smart rational mind will discover moral principles making it benevolent. But looking at AIXI (an existence proof of superintelligence) suggests that there is no "room" for benevolence as a built-in terminal value - to get a benevolent AIXI you need to put benevolence into the utility function you plug in. A non-benevolent AIXI will not become benevolent unless it suits its overall goal, and it will not change its utility function to become more benevolent, since preservation of goals is a stable equilibrium. Simple goals are too simple to subsume human welfare, while overly complex goals are unlikely to be benevolent.
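For reference (writing it down from memory, so treat the notation as a sketch rather than the exact definition), Hutter's AIXI chooses actions by an expectimax over all programs consistent with its interaction history:

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Here U is a universal Turing machine, q ranges over programs consistent with the history of actions, observations and rewards, and ℓ(q) is the length of q. Everything except the rewards r_i is fixed machinery; benevolence, if any, has to come from whatever process generates those rewards.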

He concluded that if dispositions can be modified and verified, then weak AI can credibly commit to benevolence. If benevolent terminal values are built in then the AI will fight to protect them. But if AI advances abruptly, and does not have built in beneficence from the start, then benevolence does not follow from intelligence. We need to learn benevolence engineering and apply it.

I personally think instrumental morality is more reliable and robust than Joshua gave it credit for (e.g. things like comparative advantage), but this of course needs more investigation and might be contingent on factors such as whether fast intelligence explosions produce intelligence distributions that are discontinuous. Overall, it all links back to the original question: are intelligence explosions fast and abrupt? If they aren't, then benevolence can likely be achieved (if we are lucky) through instrumental means, the presence of some AIs engineered to be benevolent and the "rearing" of AI within the present civilization. But if they are sharp, then benevolence engineering (a replacement for 'friendliness theory'?) becomes essential - yet there are no guarantees it will be applied to the first systems to really improve themselves.

Mark R. Waser had a different view. He claims superintelligence implies moral behavior. Basically he argues that the top goal we have is "we want what we want". All morality is instrumental to this, and human morality is simply imperfect because it has evolved from emotional rules of thumb. Humans have a disparity of goals, but a reasonable consensus on the morality of most actions (then ethicists and other smart people come in and confuse us about why, gaining social benefits). Basically ethics is instrumental, and cooperation gives us what we want while not preventing others from getting what they want. The right application of game theory a la the iterated prisoner's dilemma leads to a universal consistent goal system, and this will be benevolent.
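The game-theoretic core he appeals to is the textbook iterated prisoner's dilemma result that conditional cooperation beats mutual defection over repeated play. A minimal sketch of that standard result (my illustration with the usual payoff values, not a reconstruction of his actual argument):

```python
# Iterated prisoner's dilemma: tit-for-tat against itself sustains cooperation
# and outscores mutual defection over many rounds. Payoffs are the standard
# T=5 > R=3 > P=1 > S=0 values, given as (row player, column player).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    # Cooperate first, then copy the opponent's last move.
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play(strategy_a, strategy_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # (300, 300): sustained cooperation
print(play(always_defect, always_defect))  # (100, 100): mutual defection
print(play(tit_for_tat, always_defect))    # (99, 104): exploitation gains little
```

Of course, whether this kind of instrumental cooperation extends to agents with vastly asymmetric power is exactly what is in dispute between the two talks.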

I think he crossed the is-ought line a few times here. Overall, it was based more on assertion than a strict argument, although I think it can be refined into one. It clearly shows that the AI friendliness problem is getting to the stage where professional ethicists would be helpful - the hubris of computer scientists is needed to push forward into these tough metaethical matters, but it would help if the arguments were refined by people who actually know what they are doing.

Intelligence explosion dynamics


Stephen Kaas presented an endogenous growth model of the economic impact of AI. It was extremely minimal: just Romer's model with the assumption that beyond a certain technology level new physical capital also produces human capital (an AI or upload is, after all, physical capital that has 'human' capital). The result is a finite-time singularity, of course. This is generic behaviour even when there are diminishing marginal returns. The virtue of the model is that it is a minuscule extension of the standard model: no complex assumptions, just AI doing what it is supposed to do. Very neat.
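A back-of-the-envelope caricature of why this happens (my own gloss, not the actual model presented; A, c and φ are just illustrative symbols): once accumulated capital feeds back into producing the research input, the growth law for the technology stock becomes superlinear, and any superlinear growth law blows up in finite time.

```latex
\dot{A} = c\,A^{1+\phi}, \quad \phi > 0
\;\Longrightarrow\;
A(t) = \bigl(A_0^{-\phi} - \phi c\,t\bigr)^{-1/\phi},
\quad \text{which diverges at } t^{*} = \frac{A_0^{-\phi}}{\phi c}.
```

With φ = 0 (the usual linear spillover) the same equation only gives exponential growth; the finite-time singularity needs the extra positive feedback.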

Carl Shulman and I gave a talk about hardware vs. software as the bottleneck for intelligence explosions. See the previous posting for details. Basically we argued that if hardware is the limiting factor we should see earlier but softer intelligence explosions than if software is hard to do, in which case we should see later, less expected and harder takeoffs.

There is an interesting interaction between intelligibility and these dynamics. Non-intelligible intelligence requires a lot of research and experimentation, slowing down progress and requiring significant amounts of hardware: a late breakthrough, but a likely slower transition. If intelligence is intelligible it does not follow that it is easy to figure out: in that case a late, sharp transition is likely. Easy intelligence, on the other hand, gives us an early, soft, hardware-limited scenario. So maybe unintelligible intelligence is a way to get a late soft singularity, and evidence for it (such as signs that there are just lots of evolved modules rather than overarching neural principles) should be seen as somewhat reassuring.
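As a mnemonic, here is my reading of that rough scenario space (my own summary of the paragraph above, not a model from either talk):

```python
# Rough mapping from how intelligible intelligence turns out to be, and how
# easy it is to figure out, to the expected timing and character of a takeoff.
scenarios = {
    # (intelligible?, easy to figure out?): (timing, character)
    ("unintelligible", "hard"): ("late", "soft - many accumulated hacks, hardware-hungry"),
    ("intelligible",   "hard"): ("late", "sharp - a delayed insight lands on mature hardware"),
    ("intelligible",   "easy"): ("early", "soft - hardware-limited, tracks the hardware curve"),
}

for (intelligibility, difficulty), (timing, character) in scenarios.items():
    print(f"{intelligibility}, {difficulty} -> {timing}: {character}")
```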

Constructing the singularity?

Scott Yim approached the whole subject from a future studies/sociology angle. According to "the central dogma of future studies" there is no future: we construct it. Many images of the future are a reaction to one's views on the uncertainty in life: it might be a rollercoaster - no control, moving in a deterministic manner (Kurzweil was mentioned), it might be like rafting - some control along a circumscribed path (Platt), or like sailing - ultimate control over where one is going (Ian Pearson).

While the talk itself didn't have much content I think Scott was right in bringing up the issue: what do our singularity models tell us about ourselves? And what kind of constructions do they suggest? In many ways singularity studies is about the prophecy that the human condition can be radically transformed, taking it from the more narrow individualistic view of transhumanism (where individuals can get transformed) to look at how the whole human system gets transformed. It might be implicit in posing the whole concept that we think singularities are in a sense desirable, even though they are also risky, and that this should guide our actions.

I also think it is important for singularity studies to try to find policy implications of their results. Even if the field were entirely non-normative, it should (in order to be relevant) bring us useful information about our options and their likely impact. Today most advice is of course of the "more research is needed" type, but at least meetings like this show that we are beginning to figure out *where* this research would be extra important.


What did I learn?

The big thing was that it actually looks like one could create a field of "acceleration studies", dealing with Amnon's questions in an intellectually responsible manner. Previously we have seen plenty of handwaving, but some of that handwaving has now been distilled through internal debate and some helpful outside action to the stage where real hypotheses can be stated, evidence collected and models constructed. It is still a pre-paradigmatic field, which might be a great opportunity - we do not have any hardened consensus on How Things Are, but plenty of potentially useful disagreements.

Intelligibility seems to be a great target for study, together with a more general theory of how technological fields can progress. I got some ideas that are so good I will not write about them until I have tried to turn them into a paper (or blog post).

One minor realization was to recall Amdahl's law, which had really slipped my mind but seems quite relevant for updates of our paper. Overall, the taxonomies of preconditions and optimization-power increases, as well as Carl's analysis of 'serial' versus 'parallel' parts of technology development, suggest that this kind of analysis could be extended to look for bottlenecks in singularities: they will at the very least be dominated by the least 'parallelisable' element of the system. Which might very well be humans in society-wide changes.
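For reference, Amdahl's law in its textbook form: if a fraction p of a process can be sped up by a factor s while the remaining (1 - p) cannot, the overall speedup is bounded by the serial remainder.

```latex
S_{\text{overall}} = \frac{1}{(1-p) + p/s} \;\le\; \frac{1}{1-p}
```

Applied loosely to a singularity: if, say, a fifth of a society-wide transition runs through processes that cannot be accelerated (p = 0.8, an illustrative number only), the whole transition can never go more than five times faster, however extreme the speedup of the rest.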

Posted by Anders3 at October 9, 2010 10:20 AM