NeurIPS, the leading global research conference on artificial intelligence and machine learning, is on! As we’ve already posted, Finnish universities made quite a contribution this year, with 10 accepted papers. One of them was submitted by the Curious AI team.

The paper, Regularizing Trajectory Optimization with Denoising Autoencoders, was produced as a collaboration between Aalto University in Helsinki, Curious AI, and Sapienza University in Rome. It describes how denoising autoencoders (DAEs) help robustly solve model-based optimization tasks – one of the keys to the practical implementation of System 2 type AI.

Let’s clarify some of the terms first. Trajectory optimisation means finding the best sequence of actions to achieve a desired outcome. For example, this could mean controlling the valves in a chemical reactor or the motion of robotic arms in a factory. Traditionally, these systems have been run by human operators or automated with PID controllers. AI systems have now advanced to the level where they, too, can operate such complex and dynamic systems.
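
To make this concrete, here is a minimal toy sketch of trajectory optimisation (our own illustration, not taken from the paper): a sequence of valve actions is tuned by gradient descent so that a simple, made-up tank model reaches a target level. The dynamics, horizon and cost terms below are all illustrative assumptions.

```python
# Toy trajectory optimisation: pick a sequence of actions that drives a simple,
# known system to a target state. Purely illustrative; not the paper's setup.
import torch

T = 20                      # planning horizon (number of actions)
target = torch.tensor(1.0)  # desired end state, e.g. a tank level

def dynamics(state, action):
    # Made-up toy dynamics: the valve action raises the level, which also leaks.
    return 0.9 * state + 0.5 * action

actions = torch.zeros(T, requires_grad=True)
optimizer = torch.optim.Adam([actions], lr=0.05)

for step in range(500):
    optimizer.zero_grad()
    state = torch.tensor(0.0)
    total_cost = torch.tensor(0.0)
    for a in actions:
        state = dynamics(state, a)
        # Track the target and add a small effort penalty on the action.
        total_cost = total_cost + (state - target) ** 2 + 0.01 * a ** 2
    total_cost.backward()
    optimizer.step()

print("optimised action sequence:", actions.detach().numpy().round(2))
```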

Model-based learning is efficient, but has sometimes given weird answers

AI software for controlling robots often uses reinforcement learning. This is how a robotic vacuum cleaner moves around a house. Each possible action is given a reward: moving into an uncleaned area yields a reward, whereas an area that is already clean yields none. The robot then moves around maximizing reward and eventually cleans the entire room. It can do this step by step, by trial and error, without building an inner model of the room. This is so-called model-free reinforcement learning. For larger houses or under tight time limits, however, this stops being practical, as the robot spends too much time cruising around looking for the next dirty patch. To plan efficient actions, the system needs a model of its environment. This is model-based reinforcement learning.
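
As a rough caricature of the model-free, trial-and-error approach (our own toy illustration, not from the paper; a real model-free learner would also learn a policy or value function from these rewards):

```python
# Caricature of model-free trial and error: a "vacuum" on a 10-cell strip tries
# blind moves and collects reward whenever it lands on a dirty cell.
# It never builds a map of the room. Illustrative only.
import random

dirty = {2, 5, 7, 9}          # which cells of the strip are dirty
position, reward = 0, 0

for step in range(200):              # pure trial and error, no model of the room
    position = max(0, min(9, position + random.choice([-1, 1])))
    if position in dirty:            # landing on a dirty cell earns a reward
        dirty.remove(position)
        reward += 1

print("reward collected:", reward, "| cells still dirty:", dirty)
```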

Model-based learning is much more data-efficient. Furthermore, using AI systems such as neural networks to build the model makes it possible to capture very complex dynamics: they can model almost any system, even ones beyond human scale.
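
A minimal sketch of what "building a model" can look like in practice: a small neural network is fitted to logged (state, action, next state) transitions so that it predicts what the system does next. The architecture, sizes and synthetic data below are placeholders, not the networks used in the paper.

```python
# Fit a neural-network dynamics model s_{t+1} ≈ f(s_t, a_t) from logged transitions.
# Architecture, dimensions and the synthetic data are illustrative assumptions.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 2

model = nn.Sequential(
    nn.Linear(state_dim + action_dim, 64),
    nn.ReLU(),
    nn.Linear(64, state_dim),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Pretend these are transitions logged from the real system.
states = torch.randn(1000, state_dim)
actions = torch.randn(1000, action_dim)
next_states = torch.randn(1000, state_dim)

for epoch in range(200):
    pred = model(torch.cat([states, actions], dim=1))
    loss = loss_fn(pred, next_states)   # predict the next state from (state, action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```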

However, actual planning with a learned model is hard, because the model is always somewhat inaccurate. A planner can exploit those inaccuracies and end up proposing unrealistic solutions, and this is the problem our paper solves. Please watch our video explaining System 2 for a brief overview of the problem.

Keeping AI real

The solution uses a denoising autoencoder (DAE), a type of artificial neural network, to keep the AI planner in check by penalising unrealistic solutions. The DAE is trained separately from the main model: it learns what the traditional, proven ways of controlling the system being optimised look like. The solutions proposed by the planning system are then fed into the DAE, which evaluates how familiar they are.
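
A sketch of the training step, with our own placeholder shapes and architecture (not the paper's exact configuration): the DAE learns to reconstruct windows of past, known-good trajectories from corrupted copies, so it ends up encoding what normal control sequences look like.

```python
# Train a denoising autoencoder on windows of past trajectories.
# Window length, noise level and architecture are illustrative assumptions.
import torch
import torch.nn as nn

window, feature_dim = 10, 6       # time steps per window, features per step
x_dim = window * feature_dim      # flattened trajectory window
sigma = 0.1                       # corruption noise level

dae = nn.Sequential(
    nn.Linear(x_dim, 128), nn.ReLU(),
    nn.Linear(128, x_dim),
)
optimizer = torch.optim.Adam(dae.parameters(), lr=1e-3)

# Pretend these are flattened windows from historical, known-good trajectories.
trajectories = torch.randn(5000, x_dim)

for epoch in range(100):
    noisy = trajectories + sigma * torch.randn_like(trajectories)
    recon = dae(noisy)
    loss = ((recon - trajectories) ** 2).mean()   # denoising objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```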

The result is a model-based trajectory optimiser that favours the more familiar sequences of actions (trajectories), because for those the model is less likely to make significant prediction mistakes. Exceptional, never-before-seen solutions are heavily penalised, as there the model is likely to make large errors.
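
One way to express the idea in code (a hedged sketch; the paper's exact formulation differs in its details): the planner minimises the task cost plus a penalty proportional to how much the trained DAE wants to "correct" the proposed trajectory. Familiar trajectories are reconstructed almost unchanged and incur little penalty; unfamiliar ones are pushed back towards the data. The cost function, shapes and weighting below are our own placeholders.

```python
# Planning with a DAE regulariser: minimise the task cost plus a penalty on
# trajectories that the trained DAE would "correct" a lot (i.e. unfamiliar ones).
import torch
import torch.nn as nn

x_dim = 60                                   # flattened trajectory window, e.g. 10 steps x 6 features

# Stand-in for a DAE trained on past trajectories, as in the sketch above.
dae = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))
for p in dae.parameters():
    p.requires_grad_(False)                  # the DAE stays fixed during planning

def task_cost(trajectory):
    # Placeholder task objective, e.g. how far the planned states are from a set point.
    return ((trajectory - 1.0) ** 2).mean()

def regularised_cost(trajectory, lambda_reg=1.0):
    reconstruction = dae(trajectory)
    # Trajectories the DAE reconstructs almost unchanged are familiar and cheap;
    # trajectories it corrects heavily look unrealistic and are penalised.
    familiarity_penalty = ((reconstruction - trajectory) ** 2).mean()
    return task_cost(trajectory) + lambda_reg * familiarity_penalty

trajectory = torch.zeros(x_dim, requires_grad=True)
optimizer = torch.optim.Adam([trajectory], lr=0.01)
for step in range(300):
    optimizer.zero_grad()
    cost = regularised_cost(trajectory)
    cost.backward()
    optimizer.step()
```

In a sketch like this, the weight lambda_reg trades off raw task performance against staying close to the kinds of trajectories the model has actually seen.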

In the paper, we tested our solution on the control of an industrial chemical process and achieved better performance than all previously tested methods. In the figure below, you can see how quickly our method (C) reaches the desired end state compared to a traditional PI control system (A), and how much more smoothly it does so than an AI control system without the denoising autoencoder (B). The lower-right graph for each method shows the applied controls. It is also worth noting that the controls of the DAE-regularized system (C) end up at similar settings to those of the PI control (A), only faster.

[Figure: three groups of four plots showing selected process variables and controls for PI control (A), AI planning without regularization (B), and AI planning with DAE regularization (C)]

Figure 1: Controlling an industrial chemical process

At Curious AI, we bridge the gap between leading-edge AI research and real-world challenges. This is one example of why our new AI is a viable solution for real-world control tasks. We are currently working on applying the presented methods to real-world problems such as assisting the operators of complex industrial processes and controlling autonomous mobile machines.

And if you’re at NeurIPS: come meet us and let’s have a chat!


Link to the full paper: https://arxiv.org/pdf/1903.11981.pdf
