Reinforcement Learning (RL) is a family of powerful AI methods that let users express high-level goals and targets and leave the AI system to explore the best way to achieve them. This replaces the traditional approach of writing hard-coded, step-by-step instructions that tell a system what to do in every situation. Reinforcement Learning has proved effective in tasks such as maximising scores in computer games and beating human experts at board games. It is among the technologies behind autonomous vehicles and, in everyday life, smart speakers and text analysis, and it is increasingly used in medical diagnosis. Reinforcement Learning, then, is very fast at comprehending complex tasks, with their associated rules and objectives, and at calculating optimal outcomes.
Interest in Reinforcement Learning now extends to real-time industrial processes, where control engineers and process managers want to use data from their existing infrastructure to stabilise large-scale systems and respond flexibly to changes in demand. Processes in industries such as pulp and paper and petrochemicals, for example, are difficult to optimise because the quality and complexity of the raw materials vary, and the economic modelling needed to maximise profit can be complicated too. Add a high number of system constraints and variables, and it’s easy to see why these processes have been run conservatively by experienced control staff with years of on-the-job training.
To demonstrate the use of these techniques in the process industries, we created a model of a large-scale industrial processing plant with nearly 200 measured variables and 6 main control variables. We then trained a Deep Reinforcement Learning model on simulated process data that included disturbances, unplanned events and multiple changes to the end products being manufactured. The AI model interfaced directly with the simulated plant through the existing OPC-UA interface.
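To make the idea of an agent interacting with a simulated plant concrete, here is a minimal sketch of the interaction loop. Everything in it is an illustrative assumption and not the model or simulator described above: `ToyPlant` is a hypothetical one-variable plant, `proportional_policy` is a hand-written placeholder standing in for a trained policy, and the disturbance range and product targets are invented numbers.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyPlant:
    """Hypothetical one-variable stand-in for a plant simulator."""
    target: float = 50.0   # set point for the product currently being made
    level: float = 40.0    # the single measured process variable
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def step(self, action: float) -> Tuple[float, float]:
        """Apply a control move plus a random disturbance; return (level, reward)."""
        disturbance = self.rng.uniform(-0.5, 0.5)
        self.level += action + disturbance
        reward = -abs(self.level - self.target)  # closer to the target is better
        return self.level, reward


def proportional_policy(level: float, target: float) -> float:
    """Placeholder for a trained policy: close half the gap, at most 2.0 per step."""
    move = 0.5 * (target - level)
    return max(-2.0, min(2.0, move))


def run_episode(plant: ToyPlant, steps: int = 60,
                change_at: int = 30, new_target: float = 45.0) -> List[float]:
    """Run the interaction loop, switching to a new product target partway through."""
    rewards = []
    for t in range(steps):
        if t == change_at:
            plant.target = new_target  # simulated change of end product
        action = proportional_policy(plant.level, plant.target)
        _, reward = plant.step(action)
        rewards.append(reward)
    return rewards


plant = ToyPlant()
rewards = run_episode(plant)
print(f"final level {plant.level:.2f}, final reward {rewards[-1]:.2f}")
```

In a real Reinforcement Learning setup the placeholder policy would be replaced by a model that learns from the rewards collected in many such episodes.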
The model had to predict production process variables accurately over a rolling 24-hour period, including fast, smooth and safe transitions to the manufacture of new products with different blend mixes, temperatures and processes. It also had to generate an action plan that plant operators and management could review and study in detail. An essential part of the optimisation was therefore to regularise the neural network training so that the actions were safe, easy to implement and understandable. The output was a 30-second time series for every control variable that optimised the profit function, a task subject to nearly 250 constraints.
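One common way to steer an optimiser towards smooth, in-bounds actions is to subtract penalty terms from the profit. The sketch below illustrates that general idea only; it is not the regularisation used in the project. The function names, the weights, the `feed_rate` variable and its bounds are all hypothetical, and the real constraints are assumed here to be simple lower and upper limits.

```python
from typing import Dict, List, Tuple

# Hypothetical types: none of these names come from the original project.
Plan = Dict[str, List[float]]            # control variable -> value per time step
Bounds = Dict[str, Tuple[float, float]]  # control variable -> (low, high)


def smoothness_penalty(plan: Plan) -> float:
    """Sum of squared step-to-step changes, so large jumps cost the most."""
    return sum(
        (b - a) ** 2
        for series in plan.values()
        for a, b in zip(series, series[1:])
    )


def constraint_violation(plan: Plan, bounds: Bounds) -> float:
    """Total distance by which values fall outside their allowed range."""
    total = 0.0
    for name, series in plan.items():
        low, high = bounds[name]
        for value in series:
            total += max(0.0, low - value) + max(0.0, value - high)
    return total


def regularised_objective(profit: float, plan: Plan, bounds: Bounds,
                          smooth_weight: float = 0.5,
                          violation_weight: float = 100.0) -> float:
    """Profit minus weighted penalties for abrupt moves and limit violations."""
    return (profit
            - smooth_weight * smoothness_penalty(plan)
            - violation_weight * constraint_violation(plan, bounds))


bounds = {"feed_rate": (0.0, 10.0)}
smooth_plan = {"feed_rate": [4.0, 5.0, 6.0]}
jumpy_plan = {"feed_rate": [4.0, 12.0, 6.0]}  # abrupt, and 12.0 exceeds the limit

print(regularised_objective(100.0, smooth_plan, bounds))  # 99.0
print(regularised_objective(100.0, jumpy_plan, bounds))   # -150.0
```

With the same raw profit, the smooth plan scores far higher than the abrupt one, which is the behaviour a regulariser of this kind is meant to encourage. The weights trade profit against smoothness and safety and would need tuning for any real process.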
The example below shows how the AI model accurately predicted the profitability of the process and consistently outperformed the most experienced human operators at the task. Importantly, the plan generated by the AI model stayed well within the plant’s operating procedures, delivering optimal profit in a safe and realistic way.
Figure 1: The model consistently outperformed human operators in safely optimising profit. The solid blue line is the profit of the AI-created control sequence. The light blue line is the AI’s own prediction of that profit. The yellow line is the profit of the human operator’s solution to the same transition.
Reinforcement Learning has matured from being effective in small-scale “toy” problem areas, like playing chess or computer games, to being robust and effective in industrial environments. These environments are, by nature, probabilistic, high-dimensional and non-linear; in other words, they are intrinsically hard to control both safely and profitably. Over the next few months, we’ll be diving deeper into specific industrial problem areas and showing why Reinforcement Learning isn’t just industry’s new best friend, but its long-term partner! You can start by checking out our NAPCON Advisor white paper.