Dual process theory: Curious AI combines two advanced technologies to create the world’s most advanced AI. According to dual process theory, people solve problems through two modes of cognition. One is fast, automatic and subconscious, while the other is slow, deliberate, methodical and conscious. We call these modes of problem solving System 1 and System 2. There is a clear parallel between System 1 and current developments in AI, which are based on imitating the responses seen in large amounts of data (big data). Typical machine learning systems learn immediate responses through this imitation, meaning that reasoning, planning and more elaborate analysis are largely missing.
System 2 cognition is the more reasoned approach to problem solving. It is based on internal models and simulations, and draws on imagination, planning and analysis, for example by simulating potential root causes. Current AI systems based on internal models have typically been hand-crafted by humans, since internal models learned from data have not been compatible with planning and internal simulations.
There are many benefits of System 2 AI, namely:
- The ability to work with small data sets. In many cases, System 2 AI reaches the same performance as System 1 AI with several orders of magnitude less data, making it suitable for more commercial applications.
- Explainability. While System 1 AI results in a black box whose decisions and classifications are not open to interpretation or explanation, System 2 AI is inherently explainable, meaning it can be deployed and accepted more readily.
- Adaptability. When an environment changes, System 2 AI typically detects the change and can learn it very quickly, often in real time. In many real-world cases, outdated data is a big problem for System 1 AI, particularly when the system is controlling a real-time environment.
A potential drawback of System 2 AI is that deriving answers in this more deeply structured way may take more time than with System 1 models. These two approaches to AI are therefore best seen as complementary rather than competing solutions.
Deep neural networks, the mathematical models that power modern AI, are capable of learning different tasks from examples, but are typically used as System 1 AI, with all its inherent problems. Curious AI uniquely harnesses the power of these models for learning in a System 2 AI framework.
Key to this success has been a unique combination of deep neural networks and Bayesian inference.
Bayesian inference is the theoretically optimal way to draw conclusions and make decisions in the face of uncertainty. It is the theoretical framework for AI (in the same way as aerodynamics is the theoretical framework for aeroplanes). If there were infinite computational resources, AI would be solved by simply applying the rules of Bayesian statistics (much as we could build infinitely efficient aeroplanes by giving them infinitely long and thin wings, if we didn’t need to worry about real-world practical constraints).
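To make "applying the rules of Bayesian statistics" concrete, here is a minimal sketch of a single Bayesian update over two competing hypotheses. The scenario, the function name `bayes_update` and all the numbers are illustrative assumptions, not taken from any Curious AI system.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(hypothesis | data) for each hypothesis.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(observed data | hypothesis)
    """
    # Bayes' rule: posterior is proportional to prior times likelihood.
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    # The evidence P(data) normalises the posterior so it sums to one.
    evidence = sum(unnormalised.values())
    return {h: p / evidence for h, p in unnormalised.items()}


# Hypothetical example: is a machine faulty, given that an alarm went off?
prior = {"faulty": 0.01, "healthy": 0.99}
likelihood = {"faulty": 0.90, "healthy": 0.05}  # P(alarm | hypothesis)

posterior = bayes_update(prior, likelihood)
print(posterior)
```

With two hypotheses the update is trivial. The cost referred to above arises when the hypothesis space is the set of all possible settings of a large model, where the normalising sum can no longer be computed exactly.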
In practice, any real-world AI system has to make do with finite computational resources and can only approximate Bayesian inference. Standard deep neural networks are efficient and rich models but are quite crude approximations to optimal Bayesian inference. For instance, they do not realise when they are given completely unfamiliar inputs and can make drastic errors while still being confident about their answers. For critical systems, this can make such models unusable. Curious AI has developed technologies which uniquely augment deep neural networks with the ability to estimate their own level of confidence, a primary goal for many AI practitioners. This has been a key step in harnessing deep neural networks for System 2 AI.
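The text does not describe how Curious AI estimates confidence, so the following is only a sketch of one generic, widely used approximation: averaging the predictions of several independently trained models (an ensemble) and measuring the entropy of the result. The logits below are made-up numbers standing in for the outputs of three hypothetical models.

```python
import math


def softmax(logits):
    """Convert raw scores into class probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means less confident."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def ensemble_predict(member_logits):
    """Average the class probabilities predicted by each ensemble member."""
    member_probs = [softmax(logits) for logits in member_logits]
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


# Hypothetical logits from three models for a two-class problem.
familiar = [[4.0, 0.5], [3.8, 0.2], [4.2, 0.4]]       # members agree
unfamiliar = [[6.0, -1.0], [-5.0, 2.0], [0.5, 0.3]]   # members disagree

single_model = softmax(unfamiliar[0])      # one network on its own
p_familiar = ensemble_predict(familiar)
p_unfamiliar = ensemble_predict(unfamiliar)

print(max(single_model), entropy(p_familiar), entropy(p_unfamiliar))
```

In this toy setting a single model is almost certain about the unfamiliar input, while the averaged prediction is close to even odds and has much higher entropy, which is the kind of signal a critical system could use to defer or ask for help.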