The question of urgency
How urgent is the task of clarifying and adopting the Singularity Principles?
I see three good reasons why this task needs to proceed without delay.
First, there are credible scenarios of the future in which AGI (Artificial General Intelligence) arrives as early as 2030, and in which significantly more capable versions (sometimes called Artificial Superintelligence, ASI) arise very shortly afterwards. These scenarios aren’t necessarily the ones that are most likely. Scenarios in which AGI arises some time before 2050 are more credible. However, the early-Singularity scenarios cannot easily be ruled out. That’s the argument explored later in this chapter. Therefore, there is no time to lose.
Second, even if the advent of AGI might be 20, 30, or even more years in the future, there’s the problem that it might take considerable time before all the relevant parts of the world understand and adopt the Singularity Principles. Some powerful forces will need to be challenged and redirected: that won’t happen overnight. Compare the threat of runaway climate change: it has been understood for several decades, yet significant action to head it off has only recently begun. Therefore, there is no time to lose.
Third, the Singularity Principles make good sense, not only for the anticipation and management of the rise of AGI, but also for the anticipation and management of many other technologies which will become significantly more powerful in the years ahead – including aspects of nanotech, biotech, infotech (today’s AI systems), and cognotech.
Therefore the conclusion can be stated again: there is no time to lose.
Factors causing AI to improve
The question of the rate of improvement in AI can be split into three:
- Demand factors – forces which are intensifying the search for improvements in AI
- Multiplication factors – mechanisms which mean that the same amount of effort can have larger impact
- Supply factors – ideas which could be developed in order to improve AI.
What has transformed the demand for improved AI is the increasing perception that leading-edge AI can make a significant difference in the effectiveness of products and services. AI is no longer merely a “science project” of interest only to academics and people with a philosophical bent. Instead, gaining a decisive advantage with AI can make all the difference:
- Between commercial success and commercial failure – via AI that increases product performance, reduces time-to-market, and improves customer service
- Between geopolitical success and geopolitical failure – via the utilisation of AI breakthroughs in defensive or offensive weapons systems, including cyberweapons and tools for psychological manipulation.
The world’s most valuable companies are mainly companies that develop and utilise AI in many of their products and services. This includes Amazon, Microsoft, Alphabet (Google), Apple, Meta (Facebook), Tesla, Nvidia, Tencent, and Alibaba. As AI becomes more powerful, the advantages to owning and using the best AI are likely to become even more significant. For this reason, the companies just listed – along with many others – are investing ever greater amounts of money and resources to attain breakthroughs in AI capability.
That brings us to the multiplication factors:
- More people around the world than ever before have received sufficient education to be able to apply themselves in significant research and development of new AI
- When people need to learn about new AI methods, they can access unparalleled quantities of free online training materials
- The wide availability of cheap cloud computing resources and specialist chips (such as GPUs and TPUs) means that companies can pursue many lines of research in parallel, in order to determine which ones look most promising
- Each new generation of AI provides tools to assist with the construction of a subsequent, improved generation of AI.
In other words, the more that AI improves, the greater the conditions for yet more improvements to take place in AI. For example, one generation of AI can automate online personalised training courses, to allow students to learn more quickly the skills that will bring them up to date with the latest ideas on best practice with AI development.
But neither the demand factors nor the multiplication factors would have much effect, if there were no options on the table for the significant improvement of AI – no supply line of innovative ideas. If it turns out that all the “low hanging fruit” has already been picked, we might experience a stasis in AI capability, rather than ongoing improvements.
However, as I’ll now review, there are numerous options on the table – plenty of promising ideas for how AI can be made much more capable than at present.
15 options on the table
Here is a list of fifteen ways in which AI could change over the next 5-10 years. These are ways which would each (probably) still leave AI short of AGI. But each step forward opens new possibilities.
1: Synthetic data sets: So far, the data sets which are used to train Deep Learning systems have generally been assembled from real-world data. Progress has been restricted by the effort required to label items in the training sets – for example, with human volunteers giving their assessments of the content of each picture in the set. However, new training sets can be created synthetically, with a vast diversity of pictures being created and labelled by one AI (which knows what it has put in each picture, so the labelling is trivially easy) before being passed to another AI to strengthen its skills in recognition.
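This generator/recogniser division of labour can be sketched in a few lines of Python. Everything below is an illustrative toy of my own devising, not any production pipeline: two Gaussian clusters stand in for two classes of images, and a nearest-centroid classifier stands in for a deep network. The key point is that the generating code knows each label by construction, so labelling costs nothing.

```python
import random

random.seed(42)

# "Generator" side: creates synthetic examples and knows each label by
# construction, so labelling is trivially easy. (Toy stand-in for an AI
# that renders and labels images.)
def make_synthetic_dataset(n):
    data = []
    for _ in range(n):
        label = random.choice(["cat", "dog"])
        centre = (0.0, 0.0) if label == "cat" else (3.0, 3.0)
        point = (random.gauss(centre[0], 1.0), random.gauss(centre[1], 1.0))
        data.append((point, label))
    return data

# "Recogniser" side: trains on the synthetic data. A nearest-centroid
# classifier stands in for the deep network being strengthened.
def train_centroids(dataset):
    sums, counts = {}, {}
    for (x, y), label in dataset:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lbl: (sx / counts[lbl], sy / counts[lbl])
            for lbl, (sx, sy) in sums.items()}

def classify(centroids, point):
    return min(centroids,
               key=lambda lbl: (point[0] - centroids[lbl][0]) ** 2
                             + (point[1] - centroids[lbl][1]) ** 2)

train = make_synthetic_dataset(2000)   # unlimited labelled data, for free
test = make_synthetic_dataset(500)
centroids = train_centroids(train)
accuracy = sum(classify(centroids, p) == lbl for p, lbl in test) / len(test)
```

Because the generator can produce as much perfectly labelled data as desired, the recogniser's training set is limited only by compute, not by human labelling effort.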
2: Cleaned data sets: Existing training sets have been limited by quality as well as by quantity. A solution for quantity has just been described: synthetic data sets. The quality problem arises when a proportion of the real-world data has been mislabelled, as a result of limitations in how that data was collected and categorised. However, once again, a division of AI responsibility can come to the rescue. A specialist AI can be tasked with checking the labels on the data set, using a variety of clues. Once the data has been cleaned up in this way – with portions that remain uncertain having been removed – it can be passed to the main AI to train it more effectively. As an example, consider the solution used by DeepMind to learn how to lip-read, based on 5,000 hours of recordings from various BBC programmes. Before the training could produce good results, the software system needed to detect – and fix – cases where the audio and video were slightly misaligned.
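A minimal sketch of such a checker, under toy assumptions I have invented for illustration (one-dimensional data, two well-separated classes, 15% of labels flipped at random): the checker flags items whose label disagrees with the nearest class centroid and drops them, raising the purity of the surviving labels.

```python
import random

random.seed(3)

# Simulate a real-world dataset in which 15% of the labels are wrong.
# (Hypothetical stand-in for mislabelled audio/video segments.)
def noisy_dataset(n, flip=0.15):
    data = []
    for _ in range(n):
        true = random.choice(["speech", "silence"])
        centre = 0.0 if true == "silence" else 4.0
        x = random.gauss(centre, 1.0)
        wrong = "speech" if true == "silence" else "silence"
        label = true if random.random() > flip else wrong
        data.append((x, label, true))   # true label kept only for scoring
    return data

def centroids(data):
    sums, counts = {}, {}
    for x, label, _ in data:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {lbl: sums[lbl] / counts[lbl] for lbl in sums}

# The "checker" AI: remove items whose label disagrees with the nearest
# centroid (portions that remain uncertain are dropped, not guessed).
def clean(data):
    cents = centroids(data)
    return [(x, label, true) for x, label, true in data
            if min(cents, key=lambda lbl: abs(x - cents[lbl])) == label]

raw = noisy_dataset(4000)
cleaned = clean(raw)
purity_before = sum(label == true for _, label, true in raw) / len(raw)
purity_after = sum(label == true for _, label, true in cleaned) / len(cleaned)
```

Training the main AI on the cleaned subset then proceeds from labels that are far more trustworthy, at the modest cost of discarding the ambiguous portion of the data.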
3: Transfer learning: Systems that have been trained with one task in mind can sometimes be repurposed, relatively quickly, to handle another task as well. For this new task, only small amounts of additional training data are required, since learning transfers over from the previous task. This is similar to how evolution creates brains in a state ready to learn new tasks with limited numbers of input examples. For real breakthroughs with transfer learning, it is likely that changes will be needed in how the initial training is done. As with many of the items on this list, the speed of progress cannot be predicted in advance.
4: Self-learning of natural language: Systems that explore vast quantities of text have gained more prominence due to the surprising results from the GPT-3 text prediction tool released by OpenAI in May 2020. The training of GPT-3 involved scanning 45 terabytes of text – equivalent to more than a hundred million average-length books – and the consequent fine-tuning of a vast matrix of 175 billion numbers (“parameters”). When presented with some new text, GPT-3 generates sentences of text in response, based on its internal model of what flows of text tend to look like. It lacks any genuine understanding, but parts of the text generated do resemble what an intelligent human might have typed in response to the prompt. It even generates some passable humour. Variations of these methods – perhaps with names of the form “GPT-n”, but likely also using some new mechanisms – are likely to increase the degree to which the output text appears to possess “common sense” knowledge.
5: Generative Adversarial Networks: Another area of AI where progress has taken observers by surprise is the output of GANs (Generative Adversarial Networks). These involve an arms race between one deep network that aims to create new examples conforming to a general pattern, and another deep network that aims to identify which examples have been generated, and which belong to the original data set. Like an arms race between coin counterfeiters and authorities wishing to spot counterfeits in circulation, the competition between the two networks can produce results that increasingly look indistinguishable from genuine examples. The first applications of GANs included the generation of realistic photographs from given specifications, showing what someone’s face would look like at a different age, altering the clothing in a photograph, predicting subsequent frames in a video, and improving the resolution of blurry images or videos. Wider uses of GANs have been explored subsequently, in fields such as chemistry and drug discovery.
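The adversarial dynamic can be demonstrated at miniature scale. The sketch below is a deliberately tiny GAN with assumptions far simpler than any image model: the "generator" is a single learnable number mu, the "discriminator" is a two-parameter logistic classifier, and both are updated by hand-derived gradient steps. Over training, the generator's output distribution drifts towards the real one, at which point the discriminator can no longer tell them apart.

```python
import math, random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Real data: the distribution the generator must learn to imitate.
REAL_MEAN = 3.0
def real_sample():
    return random.gauss(REAL_MEAN, 1.0)

mu = 0.0          # generator parameter: fakes are mu + noise
w, b = 0.0, 0.0   # discriminator: D(x) = sigmoid(w*x + b)

lr, batch = 0.05, 32
for step in range(2000):
    # Discriminator step: raise log D(real) + log(1 - D(fake))
    gw = gb = 0.0
    for _ in range(batch):
        xr = real_sample()
        xf = mu + random.gauss(0, 1)
        dr, df = sigmoid(w * xr + b), sigmoid(w * xf + b)
        gw += (1 - dr) * xr - df * xf
        gb += (1 - dr) - df
    w += lr * gw / batch
    b += lr * gb / batch

    # Generator step: move mu so that fakes fool the discriminator
    gmu = 0.0
    for _ in range(batch):
        xf = mu + random.gauss(0, 1)
        gmu += (1 - sigmoid(w * xf + b)) * w   # gradient of log D(fake)
    mu += lr * gmu / batch
```

The counterfeiter analogy is visible in the code: each side's update makes the other side's job harder, and the equilibrium is a generator whose output is statistically indistinguishable from the real data.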
6: Evolutionary algorithms: The arms race aspect of GANs is an example of the wider possibilities in which AIs could be improved by copying methods from biological evolution. An idea that has been explored since the 1950s is to include “genetic algorithms”, in which decisions are based on the combinations of small “genes”. Much as in biological evolution, sets of genes that result in greater algorithmic fitness are preferentially used as the basis for the next generation of algorithms, obtained from previous ones by a mixture of random mutation and recombination. Until now, genetic algorithms have had limited success. That’s a bit like the situation with neural networks until around 2012. It’s an open question as to what kinds of changes in genetic algorithms could result in similar kinds of dramatic breakthrough as for neural networks.
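Here is a minimal genetic algorithm on a toy problem (maximising the number of 1-bits in a genome, often called "OneMax"); the population size, mutation rate, and generation count are arbitrary illustrative choices, not recommendations.

```python
import random

random.seed(1)

GENES, POP, GENERATIONS = 40, 60, 80

def fitness(genome):
    # "Algorithmic fitness": here, simply the count of 1-bits.
    return sum(genome)

def mutate(genome, rate=0.02):
    # Random mutation: each gene flips with small probability.
    return [g ^ 1 if random.random() < rate else g for g in genome]

def recombine(a, b):
    # Recombination: single-point crossover of two parent genomes.
    cut = random.randrange(1, GENES)
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    # Fitter genomes are preferentially used as the basis for the next
    # generation, obtained by recombination plus mutation.
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]
    population = [mutate(recombine(random.choice(parents), random.choice(parents)))
                  for _ in range(POP)]

best = max(fitness(g) for g in population)
```

On this easy landscape the population converges on near-perfect genomes within a few dozen generations; the open question raised above is what changes would let the same mechanism excel on much harder landscapes.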
7: Learning from neuroscience: Do neurons in the brain actually operate in ways similar to neurons in deep neural networks? Can the differences in operating modes be ignored? Or might a deeper appreciation of what actually happens in brains lead to new directions in AI? Significant fractions of the researchers in large AI companies have done research, not just in the computer science departments of universities, but in their neuroscience departments. One example is Demis Hassabis, founder of DeepMind, who studied neuroscience at Harvard, MIT, and the Gatsby Computational Neuroscience Unit of UCL – where Shane Legg, another DeepMind co-founder, also studied. On their website, DeepMind declare that “better understanding biological brains could play a vital role in building intelligent machines”. Consider also Jeff Hawkins, inventor of the Palm Pilot, who moved on from his ground-breaking career in the mobile computing and smartphone industry to lead teams carrying out brain research at his new company Numenta; Hawkins recently explained his theories for significantly improving AI in his thought-provoking book A Thousand Brains.
8: Neuromorphic computing: It’s not only the software systems of AI that could be improved by studying what happens in human brains. In their use of energy, brains are much more efficient than their silicon equivalents. A typical laptop computer consumes energy at the rate of around 100W, whereas a human brain operates at around 20W. Companies such as Intel have dedicated units looking at what they call “neuromorphic computing”, to see if novel hardware structures inspired by the biology of brains could enable leaps forward in AI capability.
9: Quantum computing: The novel capabilities of quantum computers could enable new sorts of AI algorithms, and could radically speed up existing algorithms that are presently too slow to be useful. For example, quantum computers can accelerate the machine-learning task of “feature matching”, as well as “dimensionality reduction algorithms” as used in unsupervised learning. Given that quantum computing is such a new field, it’s likely that further applications for AI will come to mind as the field matures.
10: Affective computing: What will happen to AI systems as they gain a richer understanding of human emotion? Research in affective computing, such as that carried out by the company Affectiva, looks for ways to make software notice and understand human expressions of emotion, to add apparent emotion into interactions with humans, and to influence the emotional states of humans. Such software possesses emotional intelligence, even though it need have no intrinsic emotional feelings of its own. This will surely alter the dynamics of interactions between humans and computers – though it remains to be seen whether these changes will truly benefit humans, or instead manipulate us into actions different from our actual best interests.
11: Sentient computing: A different approach to computers with emotional intelligence is to try to understand which aspects of biology give rise to inner sensations – sentience – and then to duplicate these relationships in computer hardware. Sentience is a subject that is more elusive and controversial than intelligence, and it is unclear whether any real progress with sentient computing can be expected any time soon. Nevertheless, a growing number of researchers are interested in this subject, including Mark Solms, Susan Schneider, Anil Seth, and David Chalmers. We would be wise to keep an open mind.
12: Algorithms that understand not just correlation but causation: Much of machine learning is about spotting patterns: data with such-and-such a characteristic is usually correlated with such-and-such an output. However, humans have a strong intuition that there is a difference between correlation – when two events, A and B, are associated with each other – and causation – when event A is understood to be the cause of event B. In a case of causation, if we want B to happen, we can arrange for A to happen; and if we want to reduce the chance of B happening, we can stop A from happening. Thus, stopping smoking is recommended as a way to decrease the chance of developing lung cancer. However, the correlation in a city between rising ice cream sales and greater deaths from drowning accidents provides no reason to reduce ice cream sales as an attempt to reduce the number of drowning accidents; instead, both these events likely have a common cause: the weather being hotter. The computer scientist Judea Pearl argues that software that can reliably detect causation will embody a significant step forward in intelligence: see his book The Book of Why: The New Science of Cause and Effect. Breakthroughs here may come by applying methods from the field known as “Probabilistic Programming” which has recently been generating considerable interest.
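The ice cream example can be simulated directly. In this sketch (all numbers invented purely for illustration), temperature is the common cause of both quantities: the raw correlation between ice cream sales and drownings comes out strongly positive, yet it vanishes once the influence of temperature is regressed out.

```python
import random

random.seed(7)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def residuals(ys, xs):
    # Remove the least-squares influence of xs from ys.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

# Common cause: daily temperature drives both quantities independently.
temp = [random.gauss(20, 8) for _ in range(5000)]
ice_cream = [t * 3.0 + random.gauss(0, 5) for t in temp]
drownings = [t * 0.5 + random.gauss(0, 2) for t in temp]

raw = pearson(ice_cream, drownings)   # strongly positive
adjusted = pearson(residuals(ice_cream, temp),
                   residuals(drownings, temp))   # near zero
```

An algorithm that only sees `raw` would wrongly link ice cream to drowning; one that conditions on the confounder, as in `adjusted`, correctly finds no direct connection. Reliably discovering which variables to condition on, without being told, is the hard part that Pearl's programme addresses.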
13: Decentralised network intelligence: The human brain can be considered as a network of modular components with a division of responsibilities. Various parts of the brain are specialised in recognising faces, in recognising music, in preventing the body from falling over, in consolidating memories, and so on. It’s the same with organisations: they draw their capabilities from relationships between the different people in the organisation who have different skills and responsibilities. AI systems often embody similar modularity: decisions are taken as the outcome of multiple sub-units performing individual calculations whose results are then integrated. One approach to building better AI is to take that idea further: allow vast numbers of different AI modules to discover each other and interact with each other in a decentralised manner, without any predetermined hierarchy. Higher levels of intelligence might emerge from this kind of relationship. That’s the driving thought behind, for example, the SingularityNET “AI marketplace”.
14: Provably safe AI: Most AI development regards safety as a secondary consideration. Yes, software might have bugs, but these bugs can be found and removed later, once they prove troublesome. Yes, software might behave unexpectedly when placed into a new environment – for example, one in which other novel software algorithms have been placed. But, again, these interactions can be reconfigured later, if the need arises. At least, that’s the dominant practice behind much of the industry. AI developers might nowadays shy away from the infamous mantra from the early days of Facebook, “Move fast and break things”, but they often still seem to be guided by that slogan in practice. However, a minority movement puts the issue of safety at the heart of its research. It’s possible that the new designs for AI that arise from this different focus will, as well as being safer, attain new capabilities. These ideas are explored in, for example, the book by Stuart Russell, Human Compatible: Artificial Intelligence and the Problem of Control.
15: Combination approaches: To the fourteen items already on this list, we should add two more that are straightforward: improvements with classical-style AI expert systems, and improvements with deep neural networks. That takes the count to sixteen. Next consider combinations of two of these sixteen items: that makes a total of 120 approaches to consider. Adding ideas from a third item into the mix raises the number yet higher (over five hundred). OK, many of these combinations could provide little additional value, but in other cases, who knows in advance what disruptive new insights might arise?
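These counts are straightforward binomial coefficients, and can be checked directly:

```python
from math import comb

# The fourteen preceding items plus the two extras (classical expert
# systems, deep neural networks) give sixteen base items.
base_items = 16
pairs = comb(base_items, 2)     # combinations of exactly two items: 120
triples = comb(base_items, 3)   # combinations of exactly three items: 560
```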
The difficulty of measuring progress
For some tasks, it’s relatively easy to predict when a particular target will be reached. That’s when the landscape is already well understood, and there are previous examples from which useful comparisons can be drawn.
But when the landscape still contains significant uncertainties, no confident predictions can be made. It’s like travelling down a road at a reasonably steady pace, but the road ahead contains a swamp made out of an unknown kind of sticky vegetation. In that case, the vehicle could be held up indefinitely.
In the case of AGI, it’s possible to list a number of aspects of human intelligence which it seems no form of AI can currently match. These include:
- Being able to learn new concepts from being shown only a small number of examples
- Having a rich “common sense” that can resolve ambiguities in communications
- When two data points are correlated, being able to deduce whether one of these points is the cause of the other one.
Different writers include different sets of items in their own versions of this list. It’s often the case, however, that forms of AI can already deliver the functionality that the writer supposes is beyond AI capability. For example, writing in 2021, Kai-Fu Lee, a leading Venture Capitalist, lists “three capabilities where I see AI falling short, and that AI will likely still struggle to master even in 2041”:
- Creativity. AI cannot create, conceptualize, or plan strategically. While AI is great at optimizing for a narrow objective, it is unable to choose its own goals or to think creatively. Nor can AI think across domains or apply common sense.
- Empathy. AI cannot feel or interact with feelings like empathy and compassion. Therefore, AI cannot make another person feel understood and cared for. Even if AI improves in this area, it will be extremely difficult to get the technology to a place where humans feel comfortable interacting with robots in situations that call for care and empathy, or what we might call “human-touch services.”
- Dexterity. AI and robotics cannot accomplish complex physical work that requires dexterity or precise hand-eye coordination. AI can’t deal with unknown and unstructured spaces, especially ones that it hasn’t observed.
As it happens, each item on this list by Kai-Fu Lee can be strongly disputed. AIs are already involved in many creative activities, for example using GANs (as described earlier). AI that includes affective computing can indeed convey feelings of support and empathy to people interacting with that software. And modern robots using software from Covariant.AI, trained in simulated environments before being deployed in the real world, demonstrate remarkable dexterity.
In other words, there’s scope to disagree with what exactly belongs on the list of tasks that are presently beyond the ability of AI.
Such disagreements shouldn’t come as a surprise. They’re a consequence of our lack of knowledge of the challenges that still lie ahead. They also reflect our far-from-complete understanding of what is happening inside the human brain. That is, we still only have a rudimentary grasp of the nature of the “general intelligence” which humans possess.
Despite these disagreements, the point is clear that there are at least some aspects of human general intelligence which are currently beyond AI capability. What is not clear is:
- How much effort will be required in order to solve any of these shortcomings
- How independent the various items on the list are.
As a thought experiment, suppose that it were agreed that the list contained seven different items. Suppose also that some AI researchers have a credible idea for a way to improve AI in order to address one of these items. Once that piece of research completes, in line with the idea the researchers had in mind, three outcomes are possible:
- The research results in a small step forward, but doesn’t actually deliver the missing functionality. That missing functionality turns out to be even harder to create than expected
- The research does deliver the expected functionality, and it also turns out to deliver several other items on the list of seven unsolved problems; in other words, the items weren’t as independent as had previously been thought
- The research delivers just the single piece of missing functionality; six other tasks remain.
It’s because all three possibilities are credible that it’s particularly difficult to make any confident predictions about the date by which AGI might emerge.
That realisation should warn us against making either of two forecasting mistakes:
- Wrongly insisting that AGI cannot be attained before a specified date, such as 2030
- Wrongly insisting that AGI must surely be attained after a specified date, such as 2065.
Learning from Christopher Columbus
As just reviewed, it’s hard to calculate the distance between today’s AI and AGI. Accordingly, it’s hard to estimate the amount of effort required for a project to convert today’s AI into AGI.
This situation can be compared to a problem facing European navigators in the late fifteenth century. They were interested in a particular destination, namely the Far East. In this case, there already was one known route to travel to the destination, namely overland, travelling eastward, following in the footsteps of Marco Polo. But could there be a more convenient route travelling in the opposite direction?
That was the idea of maverick seafarer Christopher Columbus. Columbus spent years trying to drum up support for an idea that most educated people of the time considered to be a hare-brained scheme. These observers believed that Columbus had fallen victim to a significant mistake – he estimated that the distance from the Canary Islands (off the coast of Morocco) to Japan was around 3,700 km, whereas the generally accepted figure was closer to 20,000 km. Indeed, the true size of the sphere of the Earth had been known since the 3rd century BC, due to a calculation by Eratosthenes, based on observations of shadows at different locations.
Accordingly, when Columbus presented his bold proposal to courts around Europe, the learned members of the courts time and again rejected the idea. The effort would be hugely larger than Columbus supposed, they said. It would be a fruitless endeavour.
Columbus, an autodidact, wasn’t completely crazy. He had done a lot of his own research. However, he was misled by a number of factors:
- Confusion between various ancient units of distance (the “Arabic mile” and the “Roman mile”)
- An overestimate of how many degrees of longitude the Eurasian landmass occupied (225 degrees versus 150 degrees)
- A speculative 1474 map, by the Florentine astronomer Toscanelli, which showed a mythical island “Antilla” located to the east of Japan; therefore “the east” might be closer than previously expected.
No wonder Columbus thought his plan might work after all. Nevertheless, the 1490s equivalents of today’s VCs kept saying “No” to his pitches. Finally, spurred on by competition with the neighbouring Portuguese (who had, just a few years previously, successfully navigated around the southern tip of Africa), the Spanish king and queen agreed to take the risk of supporting his adventure. After stopping in the Canaries to restock, the Niña, the Pinta, and the Santa Maria set off westward. Five weeks later, the crew spotted land, in what we now call the Bahamas. And the rest is history.
But it wasn’t the history expected by Columbus, or by his backers, or by his critics. No-one had foreseen that a huge continent existed in the oceans in between Europe and Japan. No ancient writer – either secular or religious – had spoken of such a continent.
Nevertheless, once Columbus had found it, the history of the world proceeded in a very different direction – including mass deaths from infectious diseases transmitted from the European sailors, genocide and cultural apocalypse, and enormous trade in both goods and slaves. In due course, it would be the ingenuity and initiatives of people subsequently resident in the Americas that propelled humans beyond the Earth’s atmosphere all the way to the moon.
Here’s the relevance of this analogy to the future of AI.
Rational critics may have ample justification in thinking that true AGI is located many decades in the future. But this fact does not deter a multitude of modern-day AGI explorers from setting out, Columbus-like, in search of some dramatic breakthroughs. And who knows what new intermediate forms of AI might be discovered, unexpectedly?
Just as the contemporaries of Columbus erred in presuming they already knew all the large features of the earth’s continents (after all: if America really existed, surely God would have written about it in the Bible…), modern-day critics of AI can err in presuming they already know all the large features of the landscape of possible artificial minds.
When contemplating the space of all possible minds, some humility is in order. We cannot foretell in advance what configurations of intelligence are possible. We don’t know what may happen, if separate modules of reasoning are combined in innovative ways.
When critics say that it is unlikely that present-day AI mechanisms will take us all the way to AGI, they are very likely correct. But it would be a serious error to draw the conclusion that meaningful new continents of AI capability are inevitably still the equivalent of 20,000 km into the distance. The fact is, we simply don’t know. And for that reason, we should keep an open mind.
One day soon, indeed, we might read news of some new “AUI” having been discovered – some Artificial Unexpected Intelligence, which changes history. It won’t be AGI, but it could have all kinds of unexpected consequences.
To be clear, every analogy has its drawbacks. Here are three ways in which the discovery of an AUI could be different from the discovery by Columbus of America:
- In the 1490s, there was only one Christopher Columbus. Nowadays, there are hundreds (perhaps thousands) of schemes underway to try to devise new models of AI. Many of these are proceeding with significant financial backing.
- Whereas the journey across the Atlantic (and, eventually, the Pacific) could be measured by a single variable (longitude), the journey across the vast multidimensional landscape of artificial minds is much less predictable. That’s another reason to avoid dogmatism.
- Discovering an AUI could drastically transform the future of exploration in the landscape of artificial minds. Assisted by AUI, we might get to AGI much quicker than without it. Indeed, in some scenarios, it might take only a few months after we reach AUI for us (now going much faster than before) to reach AGI. Or days. Or hours.
The possibility of fast take-off
Discussing the possible date for the technological singularity involves two separate questions:
- How long will it take for AI to match all the thinking capabilities of humans, that is, to reach AGI?
- How soon after the advent of AGI will AI reach superintelligence – levels of all-round capability that completely surpass human intelligence?
The discussion in this chapter so far can be summarised as follows: there’s wide uncertainty about the date at which AGI is reached, but it is unwise to categorically rule out reaching AGI within just a few years’ time. That’s because any of a number of breakthroughs, that can already be foreseen, could unexpectedly turn out to solve not just one but a number of apparently different problems that have been holding up AGI. And once some extra progress has been made, the additional capabilities created could play important roles in progressing AI yet further.
What remains open for consideration is the second question: once AGI has been reached, how soon thereafter will further improvements take place?
The best answer to this second question is similar to the best answer to the first question: it’s not possible to give an answer with any certainty, but it’s wise to keep an open mind. That is, it’s not possible to rule out the scenario of “fast take-off” in which AGI is able to contribute to the creation of superintelligence within a timescale of perhaps just months or weeks.
The possibility of fast take-off was described in the early 1960s by IJ Good, a long-time colleague of Alan Turing:
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.
In other words, not long after humans manage to create an AGI, the AGI is likely to evolve itself into an artificial superintelligence that far exceeds human powers.
In case it seems far-fetched that an AI might redesign itself without any human involvement, consider a slightly different possibility: humans will still be part of that design process, at least in the initial phases. That’s already the case today, when humans use one generation of AI tools to help design a new generation of improved AI tools, before going on to repeat the process.
IJ Good foresaw that too. This is from a lecture he gave at IBM in New York in 1959:
Once a machine is designed that is good enough…, it can be put to work designing an even better machine…
There will only be a very short transition period between having no very good machine and having a great many exceedingly good ones.
At this point an “explosion” will clearly occur; all the problems of science and technology will be handed over to machines and it will no longer be necessary for people to work. Whether this will lead to a Utopia or to the extermination of the human race will depend on how the problem is handled by the machine.
An AI that is able to reason more precisely and more comprehensively than any human would, in principle, have the following methods to improve its performance yet further:
- Reading and understanding vast swathes of published articles (including many in relatively obscure locations), to notice where important new ideas have been mentioned that have not yet received appropriate analysis of their potential to improve AI capability
- Modelling vast numbers of new possibilities in advance, to determine which are likely to result in significant enhancements
- Identifying new ways to connect together different hardware resources, in order to boost the overall computing power available to an AI.
Each time one of these improvements is adopted, it raises the possibility of better research into yet further improvements.
There’s no need to assume any indefinite ongoing sequence of significant improvement steps. All that’s necessary is to assume:
- That some additional improvements remain possible
- That the reasoning capability of the human brain represents no absolute upper bound on possible general intelligence systems.
On the other hand, other factors could act to slow down these potential improvements. For example, new hardware configurations might require experimentation that takes more time.
But it’s by no means obvious, in advance, whether the limiting factor to significant additional improvement will be:
- A “soft” factor, that can be modified quickly
- A “hard” factor, whose modification would take longer.
Any presupposition that the second case applies would be reckless.
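The difference between the two cases can be illustrated with a deliberately crude model (all parameters here are arbitrary, chosen only to make the contrast visible): capability compounds multiplicatively step by step, and the only thing that varies is how long each improvement step takes.

```python
# Capability compounds step by step; what varies between the two scenarios
# is the time each step takes. "Soft" limiting factors can be modified
# quickly (constant step time); "hard" factors take longer each round
# (e.g. building and testing new hardware).
def time_to_threshold(step_time, gain=0.5, start=1.0, threshold=1000.0):
    capability, elapsed, step = start, 0.0, 0
    while capability < threshold:
        elapsed += step_time(step)
        capability *= 1 + gain   # each step leverages current capability
        step += 1
    return elapsed

soft = time_to_threshold(lambda s: 1.0)       # every step takes one time unit
hard = time_to_threshold(lambda s: 1.0 + s)   # steps slow down over time
```

With these illustrative numbers, both scenarios reach the threshold after the same number of steps (18), but the soft-factor scenario takes 18 time units while the hard-factor scenario takes 171: roughly an order of magnitude apart. Since we cannot tell in advance which regime applies, timelines spanning a similar range should all be kept on the table.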
Therefore, I say again, there is no time to lose.
So without any further preliminary, let’s now review the Singularity Principles themselves.