Reinforcement Learning for Model-Free Output Feedback Optimal Control

Author:
Rizvi, Syed Ali Asad, Electrical Engineering - School of Engineering and Applied Science, University of Virginia
Lin, Zongli, Department of Electrical and Computer Engineering, University of Virginia

Stability is a bare minimum requirement in the design of control systems. It is often desirable that a control system operate in a prescribed optimal manner. Traditionally, optimal control has been a model-based paradigm, which relies on the availability of an accurate system model to design a controller that achieves closed-loop stability while minimizing a certain cost function. An accurate system model is often difficult, and sometimes impossible, to obtain owing to the increasing complexity of systems and ever-present modeling uncertainties. Modeling inaccuracies not only compromise the optimality of the controller but may also destabilize the system. The frameworks of adaptive and robust control address the modeling uncertainty problem, but optimality is generally not the prime goal in these approaches.

Reinforcement learning (RL) is a form of machine learning inspired by living organisms, which learn and optimize their behavior based on past experience. Recently, RL has been used to design model-free controllers that are both adaptive and optimal. However, the applications of RL-based controllers are still limited owing to several challenges. Being a data-driven approach, RL requires sensors that capture the complete internal state of the system. In many applications, measurement of the complete internal state is not feasible owing to the cost and difficulty of installing a large number of sensors. Another difficulty is that many of these methods need an initially stabilizing control policy. Furthermore, practical control systems are subject to challenges such as external disturbances, actuator limitations, and time delays. The development of RL algorithms that address these difficulties is essential for their practical applicability, which serves as the motivation for this research.

In the first part of this research, we present model-free output feedback RL algorithms to solve the optimal control problem for linear dynamical systems. The proposed output feedback methods relax the requirement on the number of sensors, thereby reducing the cost and complexity of the control system and improving its reliability. These new algorithms have the advantage that they do not incur estimation bias from the use of exploration signals. Furthermore, our approach eliminates the need for a discounted cost function, which has been a bottleneck in earlier works for ensuring closed-loop stability. In the second part of this research, we address some practical challenges in the design of RL controllers. We employ the framework of game theory to develop an output feedback model-free H-infinity controller, which is capable of rejecting external disturbances. Another practical issue we address is actuator saturation. A model-free low gain feedback design method is developed that achieves global stabilization of the system without causing the actuators to saturate. Finally, this research also presents model-free techniques to handle time delays in the control loop. All these developments aim to strengthen the framework of reinforcement learning control by enabling it to deal with practical control challenges.
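The model-free optimal control idea underlying this line of work can be illustrated with a minimal Q-learning sketch for the discrete-time LQR problem. This is a standard state-feedback policy-iteration scheme, not the dissertation's output feedback algorithms (which work with measured input-output data instead of the state); the system matrices, noise level, and sample counts below are illustrative assumptions, and the model appears only to simulate data, never inside the learning update:

```python
import numpy as np

# Illustrative system (unknown to the learner; used only to generate data).
A = np.array([[0.0, 1.0], [-0.5, 1.1]])   # Schur stable, so K = 0 is stabilizing
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)             # stage cost x'Qc x + u'Rc u
n, m = 2, 1
rng = np.random.default_rng(0)

def q_learning_step(K, steps=200):
    """One policy-evaluation + improvement step, using only simulated data.

    The quadratic Q-function Q(x,u) = z'Hz with z = [x; u] satisfies the
    Bellman equation z'Hz = cost + z_next'H z_next, where the next action
    is the greedy one, u_next = -K x_next (off-policy, so the exploration
    noise introduces no estimation bias)."""
    Phi, y = [], []
    x = rng.standard_normal(n)
    for _ in range(steps):
        u = -K @ x + 0.5 * rng.standard_normal(m)     # exploratory input
        cost = x @ Qc @ x + u @ Rc @ u
        x_next = A @ x + B @ u                        # data from the plant
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])
        Phi.append(np.kron(z, z) - np.kron(z_next, z_next))
        y.append(cost)
        x = x_next
    # Least-squares fit of H; the minimum-norm solution is symmetric.
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = theta.reshape(n + m, n + m)
    H = 0.5 * (H + H.T)
    # Greedy policy improvement: minimize z'Hz over u.
    return np.linalg.solve(H[n:, n:], H[n:, :n])

K = np.zeros((m, n))
for _ in range(10):
    K = q_learning_step(K)

# Model-based answer via the discrete Riccati equation, for comparison only.
P = np.eye(n)
for _ in range(500):
    P = Qc + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(
        Rc + B.T @ P @ B, B.T @ P @ A)
K_opt = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
print("learned K:", K, "optimal K:", K_opt)
```

Because the next action in the Bellman regressor is the greedy one rather than the noisy applied input, the exploration signal does not bias the fit; this off-policy structure is one way to see how bias-free, undiscounted learning is possible.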

PHD (Doctor of Philosophy)
Reinforcement Learning, Model-Free Optimal Control, Output Feedback Control, Adaptive Optimal Control, Adaptive Dynamic Programming, Approximate Dynamic Programming
All rights reserved (no additional license for public reuse)
Issued Date: