Volume 13
Issue 5
IEEE/CAA Journal of Automatica Sinica
| Citation: | Y. Su, D. Wang, M. Zhao, D. Xiong, Y. Huang, and W. Han, “Intelligent safe optimal control towards Koopman operator-driven nonlinear systems with asymmetric state and input constraints,” IEEE/CAA J. Autom. Sinica, vol. 13, no. 5, pp. 1135–1150, May 2026. doi: 10.1109/JAS.2025.125945 |
| [1] |
L. Brunke, M. Greeff, A. W. Hall, Z. Yuan, S. Zhou, J. Panerati, and A. P. Schoellig, “Safe learning in robotics: From learning-based control to safe reinforcement learning,” Annu. Rev. Control Robot. Auton. Syst., vol. 5, no. 1, pp. 411–444, May 2022. doi: 10.1146/annurev-control-042920-020211
|
| [2] |
L. Yang, H. Dai, A. Amice, and R. Tedrake, “Approximate optimal controller synthesis for cart-poles and quadrotors via sums-of-squares,” IEEE Robot. Autom. Lett., vol. 8, no. 11, pp. 7376–7383, Nov. 2023. doi: 10.1109/LRA.2023.3315228
|
| [3] |
D. Wang, N. Gao, D. Liu, J. Li, and F. L. Lewis, “Recent progress in reinforcement learning and adaptive dynamic programming for advanced control applications,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 1, pp. 18–36, Jan. 2024. doi: 10.1109/JAS.2023.123843
|
| [4] |
B. Kiumarsi, K. G. Vamvoudakis, H. Modares, and F. L. Lewis, “Optimal and autonomous control using reinforcement learning: A survey,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 6, pp. 2042–2062, Jun. 2018. doi: 10.1109/TNNLS.2017.2773458
|
| [5] |
R. Kamalapurkar, P. Walters, and W. E. Dixon, “Model-based reinforcement learning for approximate optimal regulation,” Automatica, vol. 64, pp. 94–104, Feb. 2016. doi: 10.1016/j.automatica.2015.10.039
|
| [6] |
C. Li, J. Ding, F. L. Lewis, and T. Chai, “A novel adaptive dynamic programming based on tracking error for nonlinear discrete-time systems,” Automatica, vol. 129, Art. no. 109687, Jul. 2021. doi: 10.1016/j.automatica.2021.109687
|
| [7] |
J. Li, S. E. Li, J. Duan, Y. Lyu, W. Zou, Y. Guan, and Y. Yin, “Relaxed policy iteration algorithm for nonlinear zero-sum games with application to H-infinity control,” IEEE Trans. Autom. Control, vol. 69, no. 1, pp. 426–433, Jan. 2024. doi: 10.1109/TAC.2023.3266277
|
| [8] |
Y. Jiang, W. Gao, J. Wu, T. Chai, and F. L. Lewis, “Reinforcement learning and cooperative H∞ output regulation of linear continuous-time multi-agent systems,” Automatica, vol. 148, Art. no. 110768, Feb. 2023. doi: 10.1016/j.automatica.2022.110768
|
| [9] |
B. Zhao, S. Zhang, and D. Liu, “Self-triggered approximate optimal neuro-control for nonlinear systems through adaptive dynamic programming,” IEEE Trans. Neural Netw. Learn. Syst., vol. 36, no. 3, pp. 4713–4723, Mar. 2025. doi: 10.1109/TNNLS.2024.3362800
|
| [10] |
D. P. Bertsekas, “Value and policy iterations in optimal control and adaptive dynamic programming,” IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 3, pp. 500–509, Mar. 2017. doi: 10.1109/TNNLS.2015.2503980
|
| [11] |
M. Zhao, D. Wang, J. Qiao, M. Ha, and J. Ren, “Advanced value iteration for discrete-time intelligent critic control: A survey,” Artif. Intell. Rev., vol. 56, no. 10, pp. 12315–12346, May 2023. doi: 10.1007/s10462-023-10497-1
|
| [12] |
K. Zhang, S. Luo, H.-N. Wu, and R. Su, “Data-driven tracking control for nonaffine yaw channel of helicopter via off-policy reinforcement learning,” IEEE Trans. Aerosp. Electron. Syst., vol. 61, no. 3, pp. 7725–7737, Jun. 2025. doi: 10.1109/TAES.2025.3539264
|
| [13] |
O. Qasem, H. Gutierrez, and W. Gao, “Experimental validation of data-driven adaptive optimal control for continuous-time systems via hybrid iteration: An application to rotary inverted pendulum,” IEEE Trans. Ind. Electron., vol. 71, no. 6, pp. 6210–6220, Jun. 2024. doi: 10.1109/TIE.2023.3292873
|
| [14] |
K. Zhang, R. Su, H. Zhang, and Y. Tian, “Adaptive resilient event-triggered control design of autonomous vehicles with an iterative single critic learning framework,” IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 12, pp. 5502–5511, Dec. 2021. doi: 10.1109/TNNLS.2021.3053269
|
| [15] |
C. Mu, D. Wang, and H. He, “Novel iterative neural dynamic programming for data-based approximate optimal control design,” Automatica, vol. 81, pp. 240–252, Jul. 2017. doi: 10.1016/j.automatica.2017.03.022
|
| [16] |
D. Wang, H. He, C. Mu, and D. Liu, “Intelligent critic control with disturbance attenuation for affine dynamics including an application to a microgrid system,” IEEE Trans. Ind. Electron., vol. 64, no. 6, pp. 4935–4944, Jun. 2017. doi: 10.1109/TIE.2017.2674633
|
| [17] |
P. Bevanda, S. Sosnowski, and S. Hirche, “Koopman operator dynamical models: Learning, analysis and control,” Annu. Rev. Control, vol. 52, pp. 197–212, 2021. doi: 10.1016/j.arcontrol.2021.09.002
|
| [18] |
M. O. Williams, I. G. Kevrekidis, and C. W. Rowley, “A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition,” J. Nonlinear Sci., vol. 25, pp. 1307–1346, Jun. 2015. doi: 10.1007/s00332-015-9258-5
|
| [19] |
M. Korda and I. Mezić, “Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control,” Automatica, vol. 93, pp. 149–160, Jul. 2018. doi: 10.1016/j.automatica.2018.03.046
|
| [20] |
D. Bruder, D. Bombara, and R. J. Wood, “A Koopman-based residual modeling approach for the control of a soft robot arm,” Int. J. Robot. Res., vol. 44, no. 3, pp. 388–406, Mar. 2025. doi: 10.1177/02783649241272114
|
| [21] |
J. Jia, W. Zhang, K. Guo, J. Wang, X. Yu, Y. Shi, and L. Guo, “EVOLVER: Online learning and prediction of disturbances for robot control,” IEEE Trans. Robot., vol. 40, pp. 382–402, 2024. doi: 10.1109/TRO.2023.3326318
|
| [22] |
M. Zhou, M. Lu, G. Hu, Z. Guo, and J. Guo, “Koopman operator-based integrated guidance and control for strap-down high-speed missiles,” IEEE Trans. Control Syst. Technol., vol. 32, no. 6, pp. 2436–2443, Nov. 2024. doi: 10.1109/TCST.2024.3401609
|
| [23] |
X. Zhang, W. Pan, R. Scattolini, S. Yu, and X. Xu, “Robust tube-based model predictive control with Koopman operators,” Automatica, vol. 137, Art. no. 110114, Mar. 2022. doi: 10.1016/j.automatica.2021.110114
|
| [24] |
L. Hewing, K. P. Wabersich, M. Menner, and M. N. Zeilinger, “Learning-based model predictive control: Toward safe learning in control,” Annu. Rev. Control Robot. Autonom. Syst., vol. 3, pp. 269–296, May 2020. doi: 10.1146/annurev-control-090419-075625
|
| [25] |
C. Dawson, S. Gao, and C. Fan, “Safe control with learned certificates: A survey of neural Lyapunov, barrier, and contraction methods for robotics and control,” IEEE Trans. Robot., vol. 39, no. 3, pp. 1749–1767, Jun. 2023. doi: 10.1109/TRO.2022.3232542
|
| [26] |
H. Modares and F. L. Lewis, “Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning,” Automatica, vol. 50, no. 7, pp. 1780–1792, Jul. 2014. doi: 10.1016/j.automatica.2014.05.011
|
| [27] |
Y. Yang, Y. Yin, W. He, K. G. Vamvoudakis, H. Modares, and D. C. Wunsch, “Safety-aware reinforcement learning framework with an actor-critic-barrier structure,” in Proc. American Control Conf., Philadelphia, USA, 2019, pp. 2352−2358.
|
| [28] |
Y. Yang, K. G. Vamvoudakis, H. Modares, Y. Yin, and D. C. Wunsch, “Safe intermittent reinforcement learning with static and dynamic event generators,” IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 12, pp. 5441–5455, Dec. 2020. doi: 10.1109/TNNLS.2020.2967871
|
| [29] |
Y. Yang, K. G. Vamvoudakis, and H. Modares, “Safe reinforcement learning for dynamical games,” Int. J. Robust Nonlinear Control, vol. 30, no. 9, pp. 3706–3726, Jun. 2020. doi: 10.1002/rnc.4962
|
| [30] |
Z. Marvi and B. Kiumarsi, “Safe reinforcement learning: A control barrier function optimization approach,” Int. J. Robust Nonlinear Control, vol. 31, no. 6, pp. 1923–1940, Apr. 2021. doi: 10.1002/rnc.5132
|
| [31] |
M. Zhao, D. Wang, S. Song, and J. Qiao, “Safe Q-learning for data-driven nonlinear optimal control with asymmetric state constraints,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 12, pp. 2408–2422, Dec. 2024. doi: 10.1109/jas.2024.124509
|
| [32] |
S. Liu, L. Liu, and Z. Yu, “Safe reinforcement learning for affine nonlinear systems with state constraints and input saturation using control barrier functions,” Neurocomputing, vol. 518, pp. 562–576, Jan. 2023. doi: 10.1016/j.neucom.2022.11.006
|
| [33] |
L. Zhang, L. Xie, Y. Jiang, Z. Li, X. Liu, and H. Su, “Optimal control for constrained discrete-time nonlinear systems based on safe reinforcement learning,” IEEE Trans. Neural Netw. Learn. Syst., vol. 36, pp. 854–865, Jan. 2025. doi: 10.1109/TNNLS.2023.3326397
|
| [34] |
G. Mamakoukas, I. Abraham, and T. D. Murphey, “Learning stable models for prediction and control,” IEEE Trans. Robot., vol. 39, no. 3, pp. 2255–2275, Jun. 2023. doi: 10.1109/TRO.2022.3228130
|
| [35] |
S. L. Brunton, J. L. Proctor, and J. N. Kutz, “Discovering governing equations from data by sparse identification of nonlinear dynamical systems,” Proc. Natl. Acad. Sci. USA, vol. 113, no. 15, pp. 3932–3937, Mar. 2016. doi: 10.1073/pnas.1517384113
|
| [36] |
L. Shi and K. Karydis, “ACD-EDMD: Analytical construction for dictionaries of lifting functions in Koopman operator-based nonlinear robotic systems,” IEEE Robot. Autom. Lett., vol. 7, no. 2, pp. 906–913, Apr. 2022. doi: 10.1109/LRA.2021.3133001
|
| [37] |
W. Hao, B. Huang, W. Pan, D. Wu, and S. Mou, “Deep Koopman learning of nonlinear time-varying systems,” Automatica, vol. 159, Art. no. 111372, Jan. 2024. doi: 10.1016/j.automatica.2023.111372
|
| [38] |
A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs for safety critical systems,” IEEE Trans. Autom. Control, vol. 62, no. 8, pp. 3861–3876, Aug. 2017. doi: 10.1109/TAC.2016.2638961
|
| [39] |
A. G. Wills and W. P. Heath, “Barrier function based model predictive control,” Automatica, vol. 40, no. 8, pp. 1415–1422, Aug. 2004. doi: 10.1016/j.automatica.2004.03.002
|
| [40] |
A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof,” IEEE Trans. Syst. Man Cybern. Part B Cybern., vol. 38, no. 4, pp. 943–949, Aug. 2008. doi: 10.1109/TSMCB.2008.926614
|