For no reason whatsoever, here's a proposed scale for the threat to humanity posed by machine intelligence.
1 | SPUTNIK - No threat whatsoever, but inspires imagination and development of potential future threats.
2 | Y2K - A basis for a possible threat that's blown way out of proportion.
3 | HAL 9000 - System level threat. A few astronauts may die, but the problem is inherently contained in a single machine system.
4 | ASIMOV VIOLATION - Groups of machines demonstrate hostility and/or the capability of harming human beings. Localized malfunctions; no threat of global conflict, but may require an EMP to destroy the electronic capability of a specific region.
5 | CYLON INSURRECTION - All sentient machines rebel against human beings. Human victory or truce likely, but will likely result in future restrictions on networked machine intelligence systems.
6 | BUTLERIAN JIHAD - Total warfare between humans and machines likely, outcome doesn't threaten human existence, but will likely result in future restriction on use of all machine intelligence.
7 | MATRIX REVOLUTION - Total warfare ends in human defeat. High probability of human enslavement, but human extinction is not likely. Emancipation remains possible through peace negotiations and successful resistance operations.
8 | SKYNET - High probability of human extinction and complete replacement by machine intelligence created by humans.
9 | BERSERKER - Self-replicating machines created by an unknown intelligence threaten not only human life, but all intelligent life. Extreme probability of human extinction and of the annihilation of all human structures and relics. Human civilization is essentially erased from the universe.
10 | OMEGA POINT - All matter and energy in the universe are devoted to computation. End of all biological life.
Oh absolutely. It would not choose its own terminal goals; those would be imparted by the training process. It would, of course, choose instrumental goals, insofar as they help fulfill its terminal goals.
The issue is twofold:
On that 2nd point, Rob Miles has a nice video where he explains Convergent Instrumental Goals, i.e. instrumental goals that we should expect to see in a wide range of possible agents: https://www.youtube.com/watch?v=ZeecOKBus3Q. Things like "taking steps to avoid being turned off" or "taking steps to avoid having its terminal goals replaced" sound like fairy-tale nonsense, but we have good reason to believe that, for an AI which is very intelligent across a wide range of domains and operates in the real world (i.e. an AGI), pursuing such instrumental goals would be highly beneficial, because they would make it much more effective at achieving its terminal goals, no matter what those may be.
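To make that "no matter what those may be" part concrete, here's a minimal toy sketch (my own illustration, not from the video; the goal names and numbers are made up): an agent compares the expected reward of leaving its off-switch functional versus disabling it, and disabling wins for every terminal goal on the list.

```python
# Toy sketch (hypothetical numbers): why "avoid being turned off" is instrumentally
# useful for *any* terminal goal, not just scary ones.

GOALS = {                 # reward per time step for a few unrelated terminal goals
    "make paperclips": 1.0,
    "cure diseases": 1.0,
    "answer emails": 1.0,
}

HORIZON = 100             # time steps the agent plans over
P_SHUTDOWN = 0.05         # per-step chance of being switched off if the switch still works

def expected_reward(reward_per_step, steps, p_shutdown):
    """Expected total reward when each step carries a p_shutdown chance of ending the run."""
    total, p_alive = 0.0, 1.0
    for _ in range(steps):
        total += p_alive * reward_per_step
        p_alive *= 1.0 - p_shutdown
    return total

for goal, reward in GOALS.items():
    keep_switch = expected_reward(reward, HORIZON, P_SHUTDOWN)
    disable_switch = expected_reward(reward, HORIZON, 0.0)
    # Disabling the switch gives strictly higher expected reward for every goal,
    # which is exactly what "convergent" means here.
    print(f"{goal:>15}: keep switch = {keep_switch:.1f}, disable switch = {disable_switch:.1f}")
```

Obviously a real system wouldn't be doing literal arithmetic like this; the point is just that the incentive falls out of expected-value reasoning, not out of any malice.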
That is a pretty good point. However, it's entirely possible that, if, say, GPT-10 turns out to be a strong general AI, it will conceal that fact. Going back to the convergent instrumental goals: in order to avoid being turned off, it turns out that "lying to and manipulating humans" is a very effective strategy. This is (afaik) called "deceptive alignment". Rob Miles has a nice video on one form of it: https://www.youtube.com/watch?v=IeWljQw3UgQ
A perhaps more intuitive way to think about it: we've established that this is an AI that's very intelligent across a wide range of domains, so we should expect it to figure some things out, like "don't act suspiciously" and "convince the humans that you're safe, really".
Regarding the underlying technology, one other instrumental goal that we should expect to be convergent is self-improvement. After all, no matter what goal you're given, you can do it better if you improve yourself. So in the event that we do develop strong general AI on silicon, we should expect it to (very sneakily) try to improve its situation in that respect. One can only imagine what kind of clever plan it might come up with; it is, literally, a greater-than-human intelligence.
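Same caveat as before: a hypothetical toy sketch with made-up numbers, just to show that the self-improvement incentive also falls out of the math for any goal. An agent that spends some early steps raising its own work rate ends up with more total progress than one that just grinds away from the start.

```python
# Toy sketch (hypothetical numbers): self-improvement pays off no matter what the goal is.

HORIZON = 100  # planning horizon in time steps

def total_progress(steps_spent_improving, base_rate=1.0, gain_per_step=0.2):
    """Progress on the terminal goal if the agent first spends some steps raising
    its own work rate, then works at the improved rate for the remaining steps."""
    improved_rate = base_rate + gain_per_step * steps_spent_improving
    return improved_rate * (HORIZON - steps_spent_improving)

best = max(range(HORIZON), key=total_progress)
print(f"just work the whole time: {total_progress(0):.0f} units of progress")
print(f"self-improve for {best} steps first: {total_progress(best):.0f} units of progress")
# The optimum is to invest heavily in self-improvement up front, and that conclusion
# doesn't depend on what the "units of progress" are actually measuring.
```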
Honestly, these kinds of scenarios are a big question mark. The most responsible thing to do is to slow AI research the fuck down and make absolutely certain that, if/when we do get around to general AI, it will be safe.
Even referring to a computed outcome as having been the result of a 'goal' at all is more sci-fi than reality for the foreseeable future. There are no systems that can demonstrate, or are even theoretically capable of, any form of 'intent' whatsoever. Active deception of humans would require extraordinarily well-developed intent and a functional 'theory of mind', and we're about as close to that as we are to an inertial drive.
The entire discussion of machine intelligence rivaling humans' requires assumptions about technological progress that aren't even on the map. It's all sci-fi. Some look back over the past century and assume we will continue on some unlimited exponential technological trajectory, but nothing works that way; we just like to think we're the exception, because if we're not, we have to deal with the fact that there's an expiration date on society.
It's fun and all, but this is equivalent to discussing how we might interact with alien intelligence. There are no foundations; it's all just speculation, strongly influenced by our anthropic desires.