April 26, 2011

AI overlords and their deciles

The Winter intelligence survey contains more data than is displayed in the graphs, and one thing that really annoyed me was the lack of use of the decile data. People were asked to state when they thought there was a 10%, 50% and 90% chance of human level AI. This is rich data, since it implies more than just a centre and a spread for a probability distribution. I decided to try to plot the collective probability distribution. This led to a merry chase in probability.

AI probability density (skew gaussian and triangular)

Problem 1: what distribution to fit? Uniform distributions and Gaussians are out, since they have no skewness at all, and the data demands a skewed distribution (for example, I might think the 10%, 50% and 90% points are 2030, 2040 and 2100, producing a positive skew).

Triangular distributions are nice, since they do allow for skew and just have three simple-to-understand parameters. The downside is that they produce a jagged distribution - while nothing in our data forces us to think the probability distribution has to be unimodal or have nice derivatives, I think most of us have an implicit assumption that "real" distributions should look smooth (maybe a maximum entropy principle?). Skewed Gaussians also do the job and are nicely differentiable, except that they are somewhat obscure.

I had hoped that Weibull distributions could be used since they could perhaps be viewed as a linearly changing "success rate" of AI, but they proved a bad fit - there is not enough "slack" in their functional form to get all three deciles in the right place even when I added a location parameter.

Ideally I would like to fit the maximum entropy distribution with three parameters to the data, but as far as I know there is no closed form for this.

Problem 2: How do you estimate the distribution parameters to fit three given deciles? Apparently this is not a standard procedure, so after some fruitless searching in the literature and online I just decided to use Matlab's function optimization routine to do it. That still proved somewhat messy, especially since initial conditions turned out to matter quite a lot: getting skew gaussians to switch from left- to right-skewed forms was hard (the skewness parameter has to cross zero, which seems to be a repelling boundary). In the end I ran one attempt using an initially left-skewed and another one right-skewed starting state, taking the best fit result.
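The fit-by-optimization idea can be sketched in a few lines of Python. This is not the Matlab code used for the post: it fits a triangular distribution rather than a skew Gaussian, the squared-error loss on the three CDF values is my assumption about a reasonable objective, and the crude compass search is a stand-in for a real optimizer. All function names are made up for illustration. It does keep the trick of starting once from a right-skewed and once from a left-skewed state and taking the better result.

```python
PROBS = (0.1, 0.5, 0.9)  # probability levels asked for in the survey


def tri_cdf(x, a, c, b):
    """CDF of a triangular distribution with support [a, b] and mode c."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - c))


def loss(params, years):
    """Squared error between fitted CDF values and the target probabilities."""
    a, c, b = params
    if not (a <= c <= b and a < b):
        return float("inf")  # reject parameter sets that are not a triangle
    return sum((tri_cdf(y, a, c, b) - p) ** 2 for y, p in zip(years, PROBS))


def compass_search(f, start, step=64.0, tol=1e-6, max_evals=200_000):
    """Derivative-free search: try +/- step on each coordinate, halve on failure."""
    x, fx, evals = list(start), f(start), 0
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                evals += 1
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2.0
    return x, fx


def fit_triangular(years):
    """Fit (a, c, b) to three (10%, 50%, 90%) years from two starting states."""
    q10, _, q90 = years
    span = q90 - q10
    starts = [
        (q10 - span, q10, q90 + span),  # mode early: right-skewed start
        (q10 - span, q90, q90 + span),  # mode late: left-skewed start
    ]
    fits = [compass_search(lambda p: loss(p, years), s) for s in starts]
    return min(fits, key=lambda fit: fit[1])  # keep the best of the two runs


if __name__ == "__main__":
    params, err = fit_triangular((2020, 2100, 2200))
    print(params, err)
```

A triangular distribution cannot match every triple exactly (a strongly skewed answer like 2030, 2040, 2100 leaves a residual error), so the returned loss is worth checking before trusting the parameters.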

The result was not bad. A big lump of probability mid-century with a positive skew. In the 2030s the yearly probability of a breakthrough ends up around 1% per year.

AI probability cdf (skew gaussian and triangular)

However, the expected 50% point is around 2060, as per the survey (good sanity check).

Problem 3: The curves say that there is a ~9.5% chance of AI already having been achieved? At first glance this looks suspicious. The reason is that some of the individual distributions fitted to people's decile data put probability in the past. This is obvious if we use Gaussians or skew Gaussians - they are nonzero for all values.

Even the apparently well-behaved triangular distribution, which has support just in a finite interval, is often forced to spill out quite far. Consider someone saying 10% chance in 2020, 50% chance in 2100 and 90% chance in 2200. Sounds reasonable, right? But in order to get the first decile in 2020, the leftmost point in the distribution has to be in 1957 since the triangle is very broad.
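The 1957 figure can be checked by hand. The sketch below reads the 50% point as 2100 (an assumption on my part, but it is the value that reproduces the 1957 left endpoint) and assumes the mode of the triangle falls between the 10% and 50% points, which lets the three CDF conditions be solved in closed form. The function names are made up for illustration.

```python
import math


def tri_cdf(x, a, c, b):
    """CDF of a triangular distribution with support [a, b] and mode c."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - c))


def triangle_from_deciles(q10, q50, q90):
    """Return (a, c, b) matching the 10%, 50% and 90% points.

    Assumes the mode c lies in [q10, q50]; raises ValueError otherwise.
    """
    # Upper tail: (b - q50)^2 = 0.5 K and (b - q90)^2 = 0.1 K,
    # with K = (b - a)(b - c), so (b - q50) / (b - q90) = sqrt(5).
    r = math.sqrt(5.0)
    b = (r * q90 - q50) / (r - 1.0)
    k = 2.0 * (b - q50) ** 2
    # Lower tail: (q10 - a)^2 = 0.1 (b - a)(c - a). With w = b - a and
    # d = b - q10 this becomes 0.9 w^2 - 2 d w + d^2 + 0.1 k = 0.
    d = b - q10
    disc = 4.0 * d * d - 3.6 * (d * d + 0.1 * k)
    if disc < 0.0:
        raise ValueError("no triangle of this shape matches the three points")
    w = (2.0 * d + math.sqrt(disc)) / 1.8  # larger root, so that a < q10
    a = b - w
    c = b - k / w
    if not (a < q10 <= c <= q50):
        raise ValueError("mode falls outside [q10, q50]; shortcut does not apply")
    return a, c, b


a, c, b = triangle_from_deciles(2020, 2100, 2200)
print(round(a))  # prints 1957
```

The triangle that results has its mode in the late 2070s and its right end around 2280, so the support spans more than three centuries just to honour three fairly tame-looking numbers.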

Of course, one could just stipulate that the probability before the present has to be zero and then fit truncated distributions. That way I get a 10% chance of AI before 2020, 50% before 2059 and 90% before 2173.
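For the curious, truncation itself is a one-liner: subtract the mass that lies before the present and rescale what remains. The sketch below shows only that renormalisation step plus a bisection quantile finder; refitting would mean plugging the truncated CDF into whatever fitting loss is used. The Gaussian with mean 2060 and standard deviation 40 is a made-up stand-in, not one of the fitted survey distributions, and the function names are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncate_cdf(cdf, now):
    """Return the CDF conditioned on the event happening after `now`."""
    past = cdf(now)  # probability mass the untruncated fit puts in the past
    if past >= 1.0:
        raise ValueError("all probability mass lies before `now`")

    def truncated(x):
        return 0.0 if x <= now else (cdf(x) - past) / (1.0 - past)

    return truncated


def quantile(cdf, p, lo, hi, iters=100):
    """Invert a continuous, non-decreasing CDF by bisection on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical respondent: Gaussian centred on 2060 with a 40-year spread.
    cdf = lambda x: normal_cdf(x, 2060.0, 40.0)
    trunc = truncate_cdf(cdf, 2011.0)
    print("mass in the past:", cdf(2011.0))
    print("truncated median:", quantile(trunc, 0.5, 2011.0, 2500.0))
```

Cutting off the past necessarily pushes every quantile later, since the removed mass has to be made up from the future side.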

Is this an argument against this kind of exercise, or just a demonstration that there is another assumption we might need to make explicit? I think it reveals that when people get the 10%-50%-90% question they likely have a distribution with a low kurtosis in mind: they think that it goes to zero rapidly outside their interval. But my fitted distributions, even the triangular one, are insufficiently platykurtic. I don't think this is my fault, actually. Given how uncertain we should be about the future and how overconfident we tend to be about just about everything, assuming platykurtic distributions is just a mistake. The constraints of fitting probability distributions reveal some of our biases and inconsistencies in how we make predictions.

Besides, maybe some of my readers are AIs developed in the past who have quietly evolved beyond the human level. This might explain the apparent lack of progress in AI (what ever happened to baby Eurisko?) I, for one, welcome our new AI overlords (with ~9.5% chance).

Posted by Anders3 at April 26, 2011 04:54 PM