Superintelligence and Complexity

Jun 16, 2016 20:46

I just finished chapter 10 of Bostrom's Superintelligence, on Oracles, Genies, Sovereigns, and Tools. It has been one of the more disappointing chapters of the book. Having followed these sorts of debates for a long time, I've never really been convinced that the "tool AI" approach is "bad" in the way Bostrom and Yudkowsky seem to claim.

I'll present a few of my issues with the chapter here. The overall theme is the difference between machine learning algorithms and the autonomous systems they are embedded in, and the varying complexity of different intelligence tasks.

First, when discussing a "genie with a preview," he notes that the "preview" option could be applied to a sovereign as well. Having just finished listening to the audiobook of Superforecasting, I'm inclined to take this "preview" possibility a bit more seriously. Current human forecasters can make useful predictions about 100 days into the future, and cooperative groups of superforecasters can stretch that to 300 days or so. But everyone's judgement falls to chance (or below) before getting five years out. The proposed reason for this lies somewhere between the inherent unpredictability of dynamical systems and the combinatorial explosion of possible interacting factors. If we assume that prediction gets exponentially more difficult as the horizon extends outward, that gives us a range of about 10x from "normal humans" to "top human performance," which matches my intuitions reasonably well.

An AI could be strongly superhuman by being about 100x as good a predictor as the best humans. This would certainly make it an invaluable strategic asset for anyone, with value approaching that of the world's economy. A 100x advantage corresponds to about twice the (logarithmic) gap between average predictors and the best ones, which means it would be able to predict events about two years in the future as well as most people can predict the next few months to a year. If the government of a small country had access to this, it would be able to outpredict and outnegotiate the entire world's intelligence community handily. But if it wanted an oracle or a genie to give it a preview of a plan of action, by the time the preview reached the two-year mark it would be as uncertain as a normal human's prediction. And this is for an intelligence 100x as good as the best coordinated groups of humans!
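
To make that arithmetic concrete, here's a minimal sketch under the exponential-difficulty assumption. The calibration (100 days for ordinary forecasters, 300 days for superforecaster groups, and a ~10x ability gap between them) comes from the paragraph above; the `horizon` function is just my own way of expressing it.

```python
import math

# Calibration taken from the figures above (assumptions, not data):
# ordinary forecasters are useful out to ~100 days, coordinated
# superforecasters out to ~300 days, and that gap corresponds to
# roughly a 10x difference in predictive ability.
ORDINARY_HORIZON_DAYS = 100
SUPERFORECASTER_HORIZON_DAYS = 300
ABILITY_RATIO = 10  # "normal humans" -> "top human performance"

# If difficulty grows exponentially with lead time, the horizon grows
# logarithmically with ability: each further factor of 10 in ability
# buys the same number of extra days.
DAYS_PER_FACTOR = SUPERFORECASTER_HORIZON_DAYS - ORDINARY_HORIZON_DAYS

def horizon(ability_multiple_over_best: float) -> float:
    """Horizon (in days) for a predictor some multiple better than the best humans."""
    extra_factors = math.log(ability_multiple_over_best, ABILITY_RATIO)
    return SUPERFORECASTER_HORIZON_DAYS + extra_factors * DAYS_PER_FACTOR

print(horizon(100))  # ~700 days, i.e. roughly the two-year figure above
```

Under this model, an oracle 100x better than the best human groups still loses the thread around two years out, which is exactly the point of the preview argument.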

Having this kind of preview on a genie is great! Very few individual tasks that one might give a genie (especially one constrained by some amount of domesticity) would have substantial direct impacts on that timescale; most could be accomplished in days. But a sovereign is designed to have free rein over the long term. If it decided to take a course of action that it thought was most likely to lead to interstellar colonization, it would run out of certainty long before it got to the nearest star.

This has a side benefit, as well! If a superintelligence understood its own uncertainty, it would be less inclined to move toward the sort of "hare-brained scheme" thinking that leads to a certain amount of the nervousness and hand-wringing in the friendly AI community.

Of course, this setup rests on many assumptions; two in particular I'd like to highlight. The first is that prediction is exponentially hard in a meaningful sense. The intuitive case for this is strong, but it's far from mathematically proven, which is certainly the bar one would like to clear before unleashing a genie on the world. The second is that an AI's ability to predict geopolitical and psychological events will, when AGI comes into existence, be within a few orders of magnitude of humans' abilities. This also seems like a strong intuitive case to me, since I'd expect the level of understanding required to be a good forecaster to be something like "AI-complete" in Bostrom's sense: if a system can parse complex political questions, do the serious research needed to understand them, and combine that coherently into beliefs with well-calibrated uncertainty, I would be very surprised if AGI were far off. However, the sorts of hardware scaling tricks and bypassable algorithmic shortcomings available to an AGI might easily give it a ten-billion-fold speedup soon after it first becomes able to predict political events. Even that would only get it to seeing three weeks out as clearly as we see tomorrow, though at these scales it's obviously hard to pin down anything meaningfully.
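
Taken literally, that last figure implies an extremely steep difficulty curve. A quick back-of-the-envelope check (the 1e10 speedup and the one-day/three-week horizons are from the paragraph above; the per-day growth factor is just what those numbers force under the same exponential assumption):

```python
# A 1e10-fold improvement buying the gap between "tomorrow" (1 day)
# and "three weeks out" (~21 days) means 20 extra days of horizon.
speedup = 1e10
extra_days = 21 - 1

# If difficulty grows by a constant factor per day, that factor is:
growth_per_day = speedup ** (1 / extra_days)
print(growth_per_day)  # ~3.16: difficulty more than triples with each added day
```

That is a far steeper curve than the one implied by the 100-day/300-day human figures, which is itself a decent illustration of how loosely pinned down these numbers are.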

My second issue is his discussion of the current world economy as a genie. He denies the comparison because he can get the economy to "deliver a pizza" but "cannot order it to deliver peace." This, however, again blurs the definition of a genie. An AGI could start out as something like a singleton with the total power of the world economy, and if it did it would certainly qualify as a superintelligence that constitutes an existential threat! Yet it would not be able to deliver peace on command. One could quibble that this is simply because we cannot phrase our desire for peace nicely enough, in which case the genie could oblige us. But if that is the true rejection, I'd suggest we scale down the genie: what is the minimum level of power a genie requires in order to constitute a superintelligence? In practice I am also very interested in cases slightly less severe than the ones Bostrom focuses on. For example, the ability to produce results on the scale of a billion people or a quadrillion dollars would certainly constitute a superintelligence in my book, even though it falls short of the "total extinction" malign failure mode. His footnote 12 in this chapter drives this difference of perspective home quite solidly.

Bostrom essentially dismisses the possibility that a genie could have a goal structure that allows commands given to it to be countermanded. However, MIRI and DeepMind recently released a paper proposing substantial progress on the problem of corrigibility. It certainly does not seem fanciful that we could ask a genie to preview near-term consequences and override actions whose effects quickly leave our expected domain of outcomes!
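
To make the "preview and override" idea concrete, here is a toy sketch, entirely my own framing rather than anything from that paper: the names `propose_action`, `predict_outcomes`, and `within_expected_domain` are hypothetical stand-ins for the genie's planner, its own short-horizon forecaster, and a human-specified sanity check.

```python
from typing import Callable, Iterable, Optional

def preview_and_veto(
    propose_action: Callable[[], object],
    predict_outcomes: Callable[[object, int], Iterable[object]],
    within_expected_domain: Callable[[object], bool],
    preview_days: int = 60,
) -> Optional[object]:
    """Return a proposed action only if its previewed near-term outcomes stay in bounds.

    A toy illustration of a genie whose commands can be countermanded; it is
    not a description of the interruptibility scheme in the paper above.
    """
    action = propose_action()
    for outcome in predict_outcomes(action, preview_days):
        if not within_expected_domain(outcome):
            return None  # countermanded: predicted consequences left the expected domain
    return action  # vetted action, handed off for execution
```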

Finally, when discussing intelligence in the context of machine learning techniques, Bostrom makes no distinction between the machine learning algorithms that provide predictions and guidance and the autonomous systems in which they are embedded.

A supervised learning algorithm can, today, do a better job coloring in black-and-white photos than any human could. Deep learning systems along the lines of AlphaGo or Watson could probably be made to make business decisions, and automated trading algorithms make huge numbers of financial decisions every day. I'll stay with the automated trading systems for a moment, because they illustrate my point well. In May 2010 there was a "flash crash" in which, possibly provoked by a savvy saboteur, automated trading systems went into a feedback loop and temporarily erased roughly a trillion dollars of market value from US equity markets in a little over half an hour. Automatic safety systems then kicked in, briefly halting trading and breaking the loop. Over the rest of the day, exchanges were able to cancel many trades, using the heuristic that any trade executed 60% or more away from its pre-crash price was "clearly erroneous."
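
As a rough illustration of how blunt that cleanup rule was, busting trades amounted to running a filter along these lines over the day's tape. This is a sketch with my own names and structure; the real process involved exchange and FINRA review, not a five-line function.

```python
from dataclasses import dataclass

@dataclass
class Trade:
    symbol: str
    price: float
    reference_price: float  # the security's price shortly before the crash

def is_clearly_erroneous(trade: Trade, threshold: float = 0.60) -> bool:
    """Flag trades executed too far from their pre-crash reference price."""
    deviation = abs(trade.price - trade.reference_price) / trade.reference_price
    return deviation >= threshold

# A stock that momentarily traded at a penny against a ~$40 reference price
print(is_clearly_erroneous(Trade("XYZ", 0.01, 40.0)))  # True -> trade gets busted
```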

While this was a large problem for the world economy, it was contained quickly, and although similar incidents still occur, the safeguards that are in place regularly prevent them from spiraling out of control.

In this domain, there are AI systems that vastly outperform human actors. These systems make impactful decisions, and the world has seen the possible consequences of bad stock market outcomes in 1929 and 2008. But the 2010 flash crash was news to me when I first heard about it: the safeguards and reparative measures kept the damage down. And they did this because of the way one "superintelligent" system was embedded within a larger autonomous system.

ai, ethics, takeoff, machine ethics, industrious
