I have mentioned my skepticism about the idea that we could be reconstructed in a meaningful way from our stored email, life recordings, personality quizzes and genomes. In a recent thread on the ExtroBritannia list the issue came up again, and I think we made some progress towards showing its implausibility (no, just saying it is impossible is not convincing).

The goal of the exercise is to estimate how much information is needed to reconstruct an individual “well enough” without having direct access to their nervous system. The reconstructing agency is assumed to have arbitrary computational powers, but is limited by available information.

In the following I will be using the Stirling approximation of factorials (log(n!) ≈ n log(n) – n) to calculate binomial coefficients: log(N over k) ≈ N log(N) – (N–k) log(N–k) – k log(k). Also, remember that the number of bits of information you need to supply to find a particular object among N is log_{2}(N), and that log_{2}(10^{x}) ≈ 3.32x.
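The approximation is easy to sanity-check numerically; here is a quick sketch (the helper name `log2_binomial` is mine, used throughout the estimates below):

```python
import math

def log2_binomial(n, k):
    """log2 C(n, k) via Stirling's approximation log(n!) ~ n log(n) - n."""
    return (n * math.log2(n)
            - (n - k) * math.log2(n - k)
            - k * math.log2(k))

# Compare against the exact value for a size Python can still compute:
exact = math.log2(math.comb(10**6, 10**3))
approx = log2_binomial(1e6, 1e3)
print(exact, approx)  # both around 1.14e4 bits, agreeing to well under 1%
```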

Some think it is enough to reconstruct the right personality. "The number of personalities" is however not well defined. If we consider just the “big five” and assume we can tell apart 1% differences, we end up with 100^5 = 10^{10} possible personality profiles, a mere 33 bits of information. Personality in this coarse sense clearly underdetermines identity.
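The arithmetic, assuming five traits each resolved to 100 distinguishable levels (1% steps):

```python
import math

traits, levels = 5, 100           # big five, 1% resolution per trait
personalities = levels ** traits  # 100^5 = 10^10 possible profiles
bits = math.log2(personalities)   # ~33 bits to pick one out
print(personalities, round(bits, 1))
```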

Another argument would be that we are shaped by our experiences, so the number of possible experiences determines the number of possible humans. Linde and Vanchurin argue that during a lifetime we can acquire at most 10^{16} bits of information, giving on the order of 2^{10^{16}} ≈ 10^{3*10^{15}} possible experience histories.

Other estimates can be made. Human retinas have a bandwidth of about 8.75 megabits per second each, providing the brain with about 20 Mb/s in total across the 2 million axons in the optic nerves. The spinal cord similarly has a couple of million fibers, and we can lump it all together by guessing that the maximum input is on the order of 200 Mb/s. 200 Mb/s over 80 years is 5*10^{17} bits. So by this approach, there might be up to 2^{5*10^{17}} = 10^{1.5*10^{17}} possible persons.
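Checking the input-bandwidth numbers (the 200 Mb/s figure is the rough guess from above):

```python
import math

input_rate = 200e6                      # bits/s, guessed total sensory input
seconds = 80 * 365.25 * 24 * 3600       # an 80-year life, ~2.5e9 seconds
total_bits = input_rate * seconds       # ~5e17 bits
# 2^(5e17) possible input histories = 10^(log10(2) * 5e17) ~ 10^(1.5e17)
log10_histories = total_bits * math.log10(2)
print(f"{total_bits:.1e} bits, 10^({log10_histories:.1e}) histories")
```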

Unfortunately, while storing a few hundred petabits might soon be doable and life recording might allow us to document our lives very well, it doesn’t follow that we record the *right* information. My experience of a piece of music depends on complex details of my auditory physiology and mental processing: replaying it to another system is unlikely to produce the same experience.

The life recording approach forgets about initial conditions. Two pieces of software given the same input can behave utterly differently, including storing different information. Chaotic dynamical systems (and we certainly have some of them inside us) diverge exponentially when given different initial conditions, even when perfectly deterministic. And babies already demonstrate personality differences. So the above numbers need to be multiplied by the number of distinct starting states.

There are about 1.5 gigabytes of information in the genome, but much of this is shared between different humans; the genetic specification of a person relative to a baseline human genome is likely about 20 megabytes (1.6*10^{8} bits), giving 2^{1.6*10^{8}} possibilities.
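In the same style as before:

```python
import math

individual_bits = 20e6 * 8   # ~20 MB of difference from a baseline genome = 1.6e8 bits
# 2^(1.6e8) possible genomes ~ 10^(4.8e7)
log10_genomes = individual_bits * math.log10(2)
print(f"{individual_bits:.1e} bits, 10^({log10_genomes:.1e}) genomes")
```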

Genetics also doesn’t specify much of our brains: we have far fewer genes than neural connections, and they are generated in a complex semi-random process influenced by the environment in utero. So we have to take a look at the number of possible human brains. The calculations below show that this pumps up the information need enormously, at least by 15 orders of magnitude. And this information is not externally available (unlike the genetics, which is left in every skin flake).

So it appears unlikely that documenting the information in our environment plus initial conditions will be feasible within the conceivable future. Instead we need to get information from the brain itself to pin down which particular brain it is.

Even if we were to visibly move our ~639 muscles at 10 Hz (about as fast as they can twitch), that would provide just a few kilobits of information per second.
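The upper bound, assuming each muscle contributes at most one bit per twitch:

```python
muscles, twitch_hz = 639, 10
bits_per_second = muscles * twitch_hz   # ~6.4 kbit/s at one bit per twitch
print(bits_per_second)
```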

Spoken and written words are even worse: the average entropy per English text character is about one bit, and the entropy rate of spoken dialogue is a few bits per second.

The average daily email production is about 15 emails containing about 30 kilobytes each, corresponding to an information production of 41 bits/s – much of it of course header information generated by computer.
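The email figure works out as:

```python
bits_per_day = 15 * 30e3 * 8   # 15 emails of ~30 kilobytes each
rate = bits_per_day / 86400    # spread over a day: ~41.7 bits/s
print(round(rate, 1))
```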

A high resolution video and audio recording of our lives might have a far higher bit rate, but we do not contribute much new information to each frame.

**This leads to a first tentative argument against reconstruction based on external data: we are acquiring potentially personality-affecting information at a fairly high rate during our waking life, yet not revealing information at the same high rate. The ratio seems to be at least 1000:1.**

Still, a reconstruction enthusiast might be undeterred. Most of those input bits are discarded: we learn and change far more slowly than what we sense. If the number of possible distinguishable human minds is small enough, we should be able to determine which one inhabits a certain brain by inferring it from its behavior.

A human brain contains 10^{11} neurons with a few thousand connections each, giving us around 10^{15} synapses.

A simple argument for the number of possible persons would be that the 10^{15} synapses of a brain can each be in either a potentiated or unpotentiated state, leading to 2^{10^{15}} possible states. Or, more simply, we need 10^{15} bits = 1 petabit of information to specify which of them a given individual is. There is also the issue of how many ways the 0.5*10^{22} neuron pairs can be connected by these synapses. (0.5*10^{22} over 10^{15}) is around 10^{7*10^{15}}. So we need 23 petabits to specify which brain connectivity a person has.
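Using the Stirling approximation from above (base 10 this time, since the answers are most readable as powers of ten):

```python
import math

def log10_binomial(n, k):
    """log10 C(n, k) via Stirling: log(n!) ~ n log(n) - n."""
    return n*math.log10(n) - (n-k)*math.log10(n-k) - k*math.log10(k)

pairs, synapses = 0.5e22, 1e15   # neuron pairs among 10^11 neurons; synapse count
log10_brains = log10_binomial(pairs, synapses)   # ~7e15
petabits = log10_brains * math.log2(10) / 1e15   # ~23 petabits to pick one
print(f"10^({log10_brains:.1e}) connectivities, {petabits:.0f} petabits")
```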

But most of these brains are indistinguishable. We do not become entirely different persons because one pixel on a TV screen 10 years ago was different or because a synapse just got removed. Just like macrostates in statistical mechanics contain *lots* of different microstates, our personal identity macrostates have room for plenty of microvariations. The fact that we remain identifiable (mostly) from day to day demonstrates this.

Many neural disorders can progress undetected until a few tens of percent of neurons in affected areas are gone. So let’s make the optimistic (in the sense that it makes reconstruction easier) assumption that brains with 90% the same connections produce the same person. This is likely not too far out: neural networks are robust to the deletion of a few connections. But it of course ignores that certain focal deletions can have big effects.

If we randomize 10^{14} out of the 10^{15} connections, we can select them in (10^{15} over 10^{14}) ways, or 10^{1.4*10^{14}}. We can then reconnect them in (0.5*10^{22}–10^{15} over 10^{14}) ways, around 10^{8.1*10^{14}}. So the total number of brain connectivities giving persons indistinguishable from the original is the product, 10^{9.5*10^{14}}. So out of the 10^{7*10^{15}} possible brains only 10^{6.0*10^{15}} are distinguishable. Specifying one of them requires 2*10^{16} bits, or 20 petabits.
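The whole indistinguishability calculation, step by step:

```python
import math

def log10_binomial(n, k):
    # Stirling: log(n!) ~ n log(n) - n
    return n*math.log10(n) - (n-k)*math.log10(n-k) - k*math.log10(k)

synapses, pairs, changed = 1e15, 0.5e22, 1e14   # 10% of connections randomized

select = log10_binomial(synapses, changed)          # ~1.4e14: ways to pick them
rewire = log10_binomial(pairs - synapses, changed)  # ~8.1e14: ways to reconnect
indistinct = select + rewire                        # ~9.5e14 per identity macrostate
distinct = log10_binomial(pairs, synapses) - indistinct   # ~6.0e15
bits = distinct * math.log2(10)                     # ~2e16 bits = 20 petabits
print(f"10^({distinct:.1e}) distinguishable brains, {bits/1e15:.0f} petabits")
```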

20 petabits is staggering, yet not unheard of - there are certainly bigger data centers around today. However, assuming that we produce 10 kilobits of personal data per second by moving or not moving, it would take about 63,000 years to gather enough information to specify a mind uniquely, i.e. to construct a brain with close enough connectivity. If we actually produce one megabit per second of personal data it can be done in about 630 years. To get it down into the range of a human lifetime we need tens of megabits per second: this can likely not be done by external means, but requires interfacing directly with the nervous system.
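The transmission times follow directly from the 20 petabit requirement:

```python
bits_needed = 2e16            # ~20 petabits to pin down a distinguishable brain
year = 365.25 * 24 * 3600
for rate in (1e4, 1e6):       # bits/s: observed behavior vs. a generous estimate
    print(f"{rate:.0e} bits/s -> {bits_needed / rate / year:,.0f} years")
```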

Note that this only counts incompressible, relevant bits – information that actually helps determine the structure of the mind. In reality our activities are of course highly compressible and noisy: after having observed one of my verbal or kinetic tics for the first time, seeing repetitions is not very informative.

Even with arbitrarily powerful computation, inferring which mind a given human contains appears infeasible given the limited amount of information about its internal state revealed in normal activity. Accurate documentation of the environment may provide helpful constraints, but since the relevant question is how environmental information is processed internally, emitted information will always have far higher constraining value.

It might be simpler for arbitrarily powerful future entities to just simulate all possible past humans than try to reconstruct particular ones based on personal information.

In the absence of arbitrary computational power, we will likely have to make do by preserving the brains themselves.

Posted by Anders3 at April 11, 2012 04:17 PM