CN116382093A - Optimal control method and equipment for nonlinear system with unknown model - Google Patents

Publication number
CN116382093A
Authority
CN
China
Prior art keywords
equation
nonlinear system
optimal
cost function
control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310559968.7A
Other languages
Chinese (zh)
Inventor
张斌
马聪辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202310559968.7A
Publication of CN116382093A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion; electric
    • G05B13/04: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion; electric; involving the use of models or simulators
    • G05B13/042: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion; electric; involving the use of models or simulators; in which a parameter or coefficient is automatically adjusted to optimise the performance

Abstract

An optimal control method and equipment for a nonlinear system with an unknown model are provided. For the nonlinear system with an unknown model, an optimal cost function of the system is established, and a partial differential equation for solving the optimal cost function is determined; according to an empirical data set of the nonlinear system, the partial differential equation is expanded to obtain a high-order ordinary differential equation; function approximation is introduced into the high-order ordinary differential equation to obtain a data-driven model of the nonlinear system; and the data-driven model is processed iteratively according to the set constraints to determine the optimal control of the nonlinear system. The method effectively addresses the curse of dimensionality caused by the heavy computational load, converges quickly, and improves the efficiency of nonlinear system control.

Description

Optimal control method and equipment for nonlinear system with unknown model
Technical Field
The invention belongs to the field of control and planning, and in particular relates to an optimal control method and equipment for a nonlinear system with an unknown model.
Background
Control theory, as used in control system engineering, is a subfield of mathematics that deals with the control of continuously operating dynamic systems in engineering processes and machines. The aim is to develop a control strategy that drives such a system with control actions in an optimal manner, without delay or overshoot, while ensuring control stability.
For example, optimization-based control and estimation techniques, such as Model Predictive Control (MPC), allow for a model-based design framework in which system dynamics and constraints can be considered directly. MPC is used in many applications to control dynamic systems of various complexities. Examples of such systems include production lines, automotive engines, robots, numerically controlled machining, motors, satellites, and generators. However, in many cases the model of the controlled system is nonlinear and may be difficult to design, difficult to use in real time, or inaccurate. Such cases are common in robotics, building control (HVAC), smart grids, factory automation, transportation, self-tuning machines, and traffic networks. In addition, even if the nonlinear model is fully available, designing an optimal controller is inherently challenging because it requires solving a partial differential equation known as the Hamilton-Jacobi-Bellman (HJB) equation.
Finding the optimal control law of a general nonlinear system requires solving the HJB equation. Various conventional solutions exist for the optimal control problem of a dynamic system with a performance index, or so-called cost function, but they have two drawbacks. On the one hand, solving the HJB equation has inherent computational complexity that grows exponentially with the state dimension, i.e., the "curse of dimensionality". On the other hand, conventional solutions depend on an accurate system model and cannot be applied to systems that are difficult to model. Optimal control that does not depend on a mathematical model therefore remains an active research topic.
Disclosure of Invention
In view of the foregoing problems of the prior art, an object of the present invention is to provide an optimal control method and equipment for a nonlinear system with an unknown model, which can improve the computational efficiency of optimal control of the nonlinear system.
To solve the above technical problems, the specific technical solution is as follows:
In one aspect, provided herein is an optimal control method for a nonlinear system with an unknown model, the method comprising:
for a nonlinear system with an unknown model, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;
expanding the partial differential equation according to an empirical data set of the nonlinear system to obtain a high-order ordinary differential equation;
introducing function approximation into the high-order ordinary differential equation to obtain a data-driven model of the nonlinear system;
and iteratively processing the data-driven model according to the set constraints to determine the optimal control of the nonlinear system.
Further, for a nonlinear system with an unknown model, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function, including:
establishing a state equation of the nonlinear system with an unknown model:
Figure BDA0004234609100000021
x(t_0) = x_0, where
Figure BDA0004234609100000022
is the state variable of the system,
Figure BDA0004234609100000023
is the control input of the system,
Figure BDA0004234609100000024
is the system dynamic equation, and
Figure BDA0004234609100000025
is the system input state equation;
determining an optimal cost function of the nonlinear system according to the dynamic constraint of the nonlinear system and the state equation of the nonlinear system:
Figure BDA0004234609100000026
Figure BDA0004234609100000027
where t ∈ [t_0, t_f], and u[t, t_f] indicates that the control input u is restricted to the time interval [t, t_f];
determining a partial differential equation that solves the optimal cost function:
Figure BDA0004234609100000028
Further, expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a high-order ordinary differential equation includes:
establishing the empirical data set according to historical input and output data of the nonlinear system;
determining, from the empirical data set and the partial differential equation, a saturated state equation and a saturated partial differential equation based on the state trajectory and the control input, as follows:
Figure BDA0004234609100000031
where
Figure BDA00042346091000000312
is the state trajectory with initial value
Figure BDA0004234609100000032
and
Figure BDA0004234609100000033
is defined as the unknown optimal control;
and obtaining, from the saturated state equation and the saturated partial differential equation together with the state equation and the cost function, a high-order ordinary differential equation of the optimal cost function through a differential dynamic programming algorithm, as follows:
Figure BDA0004234609100000034
where
Figure BDA0004234609100000035
Figure BDA0004234609100000036
Figure BDA0004234609100000037
S_12 = V_xx g,
Figure BDA0004234609100000038
S_22 = W_uu,
Figure BDA0004234609100000039
with boundary condition
Figure BDA00042346091000000310
Further, introducing function approximation into the high-order ordinary differential equation to obtain a data-driven model of the nonlinear system includes:
determining basis functions for approximating the optimal cost function, as follows:
Figure BDA00042346091000000311
determining an estimation function of the optimal control according to the basis functions;
and substituting the estimation function into the high-order ordinary differential equation to obtain a high-order differential dynamic approximation.
Further, after substituting the estimation function into the high-order ordinary differential equation to obtain the high-order differential dynamic approximation, the method further includes:
determining, according to the high-order differential dynamic approximation, an algebraic matrix equation satisfied by the weight approximation;
and, when the persistent excitation condition is met, optimizing the algebraic matrix equation to obtain a target algebraic matrix equation for calculating the weight approximation.
Further, according to the set constraint, performing iterative processing on the data-driven model to determine optimal control of the nonlinear system, including:
defining an approximate Hamiltonian and a control input;
and performing differential dynamic iteration on the high-order differential dynamic approximation according to the definitions of the approximate Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system.
Further, performing differential dynamic iteration on the high-order differential dynamic approximation according to the definitions of the approximate Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system, includes:
Step 1: setting initial parameter values and calculating an initial cost function value, where the initial parameter values include at least an initial control input and an initial state variable;
Step 2: calculating an iteration control input of the nonlinear system according to the initial parameter values;
Step 3: determining an iteration state variable of the nonlinear system according to the iteration control input and the state equation;
Step 4: calculating an iteration weight approximation according to the target algebraic matrix equation and the iteration count, and judging whether a first convergence condition is met; if not, returning to Step 3; if so, proceeding to Step 5;
Step 5: calculating an iteration cost function value according to the iteration state variable and the optimal cost function, and judging whether the iteration cost function value meets a second convergence condition; if not, feeding the iteration control input back into Step 2 for a further control input iteration; if so, determining the iteration control input as the target control input.
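Steps 1 to 5 can be read as two nested loops: an inner loop that refines the weight approximation (Steps 3 and 4) and an outer loop that updates the control input (Steps 2 and 5). The following Python sketch shows only this control flow. The callables `update_control`, `rollout`, `solve_weights`, and `cost`, the tolerances, and the iteration caps are hypothetical placeholders introduced for illustration and are not defined in this document.

```python
import numpy as np

def ddp_iteration(u0, x0, update_control, rollout, solve_weights, cost,
                  tol_w=1e-6, tol_v=1e-6, max_outer=50, max_inner=50):
    """Nested-loop skeleton of Steps 1 to 5 (all callables are hypothetical)."""
    # Step 1: initial parameters and initial cost function value.
    u = u0
    x = rollout(u, x0)
    v_prev = cost(x, u)
    w = None  # no weight approximation exists before the first inner loop
    for _ in range(max_outer):
        # Step 2: compute the iteration control input.
        u = update_control(u, x, w)
        for k in range(max_inner):
            # Step 3: iteration state variable from the control input.
            x = rollout(u, x0)
            # Step 4: iteration weight approximation, first convergence test.
            w_new = np.asarray(solve_weights(x, u, w, k), dtype=float)
            converged = w is not None and np.linalg.norm(w_new - w) < tol_w
            w = w_new
            if converged:
                break
        # Step 5: iteration cost function value, second convergence test.
        v = cost(x, u)
        if abs(v - v_prev) < tol_v:
            return u, v  # target control input and its cost
        v_prev = v
    return u, v_prev
```

In a model-unknown setting, `rollout` would stand for running the plant or replaying recorded data rather than integrating a known model; that reading is an assumption of this sketch.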
In another aspect, provided herein is an optimal control apparatus for a nonlinear system with an unknown model, the apparatus comprising:
a partial differential equation determination module, configured to establish an optimal cost function of a nonlinear system with an unknown model and to determine a partial differential equation for solving the optimal cost function;
a high-order ordinary differential equation determination module, configured to expand the partial differential equation according to the empirical data set of the nonlinear system to obtain a high-order ordinary differential equation;
a data-driven model determination module, configured to introduce function approximation into the high-order ordinary differential equation to obtain a data-driven model of the nonlinear system;
and an optimal control module, configured to iteratively process the data-driven model according to the set constraints to determine the optimal control of the nonlinear system.
In another aspect, provided herein is optimal control equipment for a nonlinear system with an unknown model, the equipment comprising:
an input interface configured to receive a state trajectory of a nonlinear system;
a memory;
a processor configured to perform the method described above and generate control instructions;
an output interface configured to send the control instructions to actuators of the nonlinear system to control operation of the system.
Finally, a computer readable storage medium is provided herein, which stores a computer program which, when executed by a processor, implements a method as described above.
With the above technical solution, the optimal control method and equipment for a nonlinear system with an unknown model establish an optimal cost function of the system for the nonlinear system with an unknown model, and determine a partial differential equation for solving the optimal cost function; expand the partial differential equation according to the empirical data set of the nonlinear system to obtain a high-order ordinary differential equation; introduce function approximation into the high-order ordinary differential equation to obtain a data-driven model of the nonlinear system; and iteratively process the data-driven model according to the set constraints to determine the optimal control of the nonlinear system. The method effectively addresses the curse of dimensionality caused by the heavy computational load, converges quickly, and improves the efficiency of nonlinear system control.
The foregoing and other objects, features and advantages will be apparent from the following more particular description of preferred embodiments, as illustrated in the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments herein or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments herein and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
Fig. 1 shows a schematic overview of the principles used by some embodiments for controlling the operation of a system.
FIG. 2 shows a schematic diagram of the steps of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein;
FIG. 3 shows a flow diagram of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein;
FIG. 4 shows a comparison of the state trajectories of the system under the initial control input and the optimal control input in an embodiment herein;
FIG. 5 shows a comparison of the state trajectories of the system under the initial control input and the optimal control input in another embodiment herein;
FIG. 6 shows the initial cost function V_0 and the optimal cost function V_17 of the system in embodiments herein;
FIG. 7 shows a schematic diagram of the optimal control apparatus for a nonlinear system with an unknown model provided by embodiments herein;
FIG. 8 shows a schematic structural diagram of a control device provided by embodiments herein.
Description of reference numerals:
100. control device; 102. system; 104. model; 106. control instruction;
701. partial differential equation determination module; 702. high-order ordinary differential equation determination module; 703. data-driven model determination module; 704. optimal control module.
Detailed Description
The following description of the embodiments of the present disclosure will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the disclosure. All other embodiments, based on the embodiments herein, which a person of ordinary skill in the art would obtain without undue burden, are within the scope of protection herein.
It should be noted that the terms "first," "second," and the like in the description and claims herein and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or device.
Fig. 1 shows a schematic overview of the principles used by some embodiments for controlling the operation of a system. Some embodiments provide a control device 100 configured to control a system 102. For example, the apparatus 100 may be configured to control a continuously operating dynamic system 102 in engineering processes and machines. Hereinafter, "control device" and "device" are used interchangeably and have the same meaning, as are "continuously operating dynamic system" and "system". Examples of the system 102 include HVAC systems, LIDAR systems, condensing units, production lines, self-tuning machines, smart grids, automotive engines, robots, numerically controlled machining, motors, satellites, generators, traffic networks, and the like. Some embodiments are based on the recognition that the apparatus 100 develops control instructions 106 that control the system 102 with control actions in an optimal manner, without delay or overshoot, while ensuring control stability.
In some implementations, the apparatus 100 uses model-based and/or optimization-based control and estimation techniques, such as Model Predictive Control (MPC), to develop the control commands 106 for the system 102. Model-based techniques may be advantageous for control of dynamic systems. For example, MPC allows for a model-based design framework in which the dynamics and constraints of the system 102 can be directly considered. The MPC develops control commands 106 based on the model 104 of the system. The model 104 of the system refers to the dynamics of the system 102 described using differential equations. In some implementations, the model 104 is non-linear and may be difficult to design and/or difficult to use in real-time. For example, even if a nonlinear model is fully available, estimating the optimal control commands 106 is inherently a challenging task because it is computationally challenging to solve a Partial Differential Equation (PDE) (known as the Hamilton-Jacobi-Bellman (HJB) equation) that describes the dynamics of the system 102.
Some embodiments use data-driven control techniques to design the model 104. The data-driven technique utilizes operational data generated by the system 102 in order to build a feedback control strategy that stabilizes the system 102.
Further, the embodiments provide an optimal control method for a nonlinear system with an unknown model, which effectively addresses the curse of dimensionality caused by the heavy computational load and converges quickly. FIG. 2 is a schematic diagram of the steps of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein. The specification provides the method steps as described in the embodiments or flowcharts, but more or fewer steps may be included based on conventional or non-inventive effort. The order of steps recited in the embodiments is only one of many possible orders and does not represent the only order of execution. When an actual system or apparatus product executes the method, the steps may be executed sequentially or in parallel according to the method shown in the embodiments or the drawings. As shown in FIG. 2, the method may include:
S201: for a nonlinear system with an unknown model, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;
S202: expanding the partial differential equation according to an empirical data set of the nonlinear system to obtain a high-order ordinary differential equation;
S203: introducing function approximation into the high-order ordinary differential equation to obtain a data-driven model of the nonlinear system;
S204: iteratively processing the data-driven model according to the set constraints to determine the optimal control of the nonlinear system.
It will be appreciated that, for a nonlinear system, the Hamilton-Jacobi-Bellman (HJB) partial differential equation is expanded into high-order ordinary differential equations using Differential Dynamic Programming (DDP) techniques; this is referred to as the DDP expansion. Function approximation is then introduced into the DDP expansion to form an actor-critic structure, from which a data-driven model is constructed. Based on the data-driven model, a DDP iterative algorithm with a rigorous convergence proof is developed. The algorithm proposed herein overcomes this technical obstacle and handles the time-varying behavior of the HJB partial differential equation under a finite-horizon cost function.
In the embodiment of the present specification, for a nonlinear system whose model is unknown, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function, including:
establishing a state equation of the nonlinear system with an unknown model:
Figure BDA0004234609100000081
x(t_0) = x_0, where
Figure BDA0004234609100000082
is the state variable of the system,
Figure BDA0004234609100000083
is the control input of the system,
Figure BDA0004234609100000084
is the system dynamic equation, and
Figure BDA0004234609100000085
is the system input state equation;
determining an optimal cost function of the nonlinear system according to the dynamic constraint of the nonlinear system and the state equation of the nonlinear system:
Figure BDA0004234609100000086
Figure BDA0004234609100000087
where t ∈ [t_0, t_f], and u[t, t_f] indicates that the control input u is restricted to the time interval [t, t_f];
determining a partial differential equation that solves the optimal cost function:
Figure BDA0004234609100000088
Illustratively, the nonlinear system may be as follows:
Figure BDA0004234609100000089
where
Figure BDA00042346091000000810
is the state variable of the system,
Figure BDA00042346091000000811
is the control input of the system,
Figure BDA00042346091000000812
is the system dynamic equation, and
Figure BDA00042346091000000813
is the system input state equation. Assume that f(x) + g(x)u satisfies the Lipschitz continuity condition, and that
Figure BDA00042346091000000814
is the closed bounded set of all saturated inputs, where γ is the constraint bound. For a fixed time interval T = [t_0, t_f], the cost function associated with the system in equation (1) is defined as:
Figure BDA00042346091000000815
where Q(x) is a positive definite function, W(u) is a non-negative function, and τ is the integration variable.
The objective of the optimal control problem is to design a constrained optimal control
Figure BDA00042346091000000816
such that the cost function (2) satisfies:
J(x_0, t_0, u) ≥ J(x_0, t_0, u*)   (3)
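As a rough illustration of how a cost of this kind can be evaluated from sampled data, the Python sketch below estimates the integral of Q(x) + W(u) along a recorded trajectory with the trapezoidal rule and compares two inputs in the sense of inequality (3). It assumes a running cost only, with no terminal term; the toy plant dx/dt = -x + u, the quadratic `Q` and `W`, and all names are illustrative assumptions, since equation (2) is available here only as a figure.

```python
import numpy as np

def running_cost(ts, xs, us, Q, W):
    """Trapezoidal estimate of the integral of Q(x) + W(u) over sampled data."""
    integrand = np.array([Q(x) + W(u) for x, u in zip(xs, us)])
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(ts)))

# Illustrative data: closed-form trajectories of the toy plant dx/dt = -x + u
# with x(0) = 1 under two constant inputs (stand-ins for recorded data).
ts = np.linspace(0.0, 5.0, 5001)
xs_a, us_a = np.exp(-ts)[:, None], np.zeros((ts.size, 1))
xs_b, us_b = (0.5 + 0.5 * np.exp(-ts))[:, None], np.full((ts.size, 1), 0.5)

Q = lambda x: float(x @ x)  # positive definite state penalty (assumed form)
W = lambda u: float(u @ u)  # simple non-negative input penalty (assumed form)

J_a = running_cost(ts, xs_a, us_a, Q, W)  # cost under u = 0
J_b = running_cost(ts, xs_b, us_b, Q, W)  # cost under u = 0.5
```

Neither input is claimed to be optimal; the comparison only shows that the evaluation needs sampled states and inputs, not the model f.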
Under the dynamic constraints of system (1), the following generalized non-quadratic function is employed to handle the input constraints:
Figure BDA0004234609100000091
where r_i > 0, i = 1, 2, …, m, are positive weighting factors.
Equation (4) can be rewritten in the following compact form:
Figure BDA0004234609100000092
where R = diag(r_1, r_2, …, r_m), v = (v_1, v_2, …, v_m)^T, and tanh^{-1}(v/γ) = (tanh^{-1}(v_1/γ), tanh^{-1}(v_2/γ), …, tanh^{-1}(v_m/γ))^T.
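To make the non-quadratic input penalty concrete, the Python sketch below evaluates it in two ways. It assumes the form commonly used for tanh-saturated inputs, W(u) = 2 Σ_i r_i ∫_0^{u_i} γ tanh^{-1}(v/γ) dv; equations (4) and (5) are available here only as figures, so this exact form, and the function names, are assumptions. Under that assumption the integral has the closed form 2γ r_i u_i tanh^{-1}(u_i/γ) + γ² r_i ln(1 − u_i²/γ²) per component.

```python
import numpy as np

def w_closed_form(u, r, gamma):
    """Closed form of the assumed penalty, valid for |u_i| < gamma."""
    u, r = np.asarray(u, dtype=float), np.asarray(r, dtype=float)
    s = u / gamma
    return float(np.sum(2.0 * gamma * r * u * np.arctanh(s)
                        + gamma**2 * r * np.log1p(-s**2)))

def w_quadrature(u, r, gamma, n=20001):
    """Same penalty by trapezoidal integration of gamma * atanh(v / gamma)."""
    total = 0.0
    for ui, ri in zip(np.atleast_1d(u), np.atleast_1d(r)):
        v = np.linspace(0.0, ui, n)           # integration path from 0 to u_i
        y = gamma * np.arctanh(v / gamma)
        total += 2.0 * ri * np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(v))
    return float(total)

u_demo, r_demo, gamma_demo = [0.5, -0.3], [1.0, 2.0], 1.0
w_exact = w_closed_form(u_demo, r_demo, gamma_demo)
w_numeric = w_quadrature(u_demo, r_demo, gamma_demo)
```

The penalty is zero at u = 0, close to the quadratic Σ r_i u_i² for small inputs, and grows without bound as any |u_i| approaches γ, which is what keeps the minimizing control inside the saturation limits.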
Describing the optimal control problem with the following optimal cost function:
Figure BDA0004234609100000093
where t ∈ [t_0, t_f], and u[t, t_f] indicates that the control input u is restricted to the time interval [t, t_f]. Assuming that V(x, t) is continuously differentiable, the optimal cost function satisfies the HJB partial differential equation:
Figure BDA0004234609100000094
for all
Figure BDA0004234609100000095
The optimal control strategy
Figure BDA0004234609100000096
can be obtained by differentiating the HJB equation with respect to the control input u, as follows:
Figure BDA0004234609100000097
The Hamiltonian of the optimal control problem is defined as:
H(x, u, λ) = Q(x) + W(u) + λ^T (f(x) + g(x)u)   (9)
where
Figure BDA0004234609100000098
is a vector parameter. The HJB equation (7) can then be rewritten as:
Figure BDA0004234609100000099
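The link between the Hamiltonian (9) and a saturated control law can be checked numerically. The Python sketch below assumes the tanh-type penalty W(u) = 2 Σ_i r_i ∫_0^{u_i} γ tanh^{-1}(v/γ) dv; setting ∂H/∂u = 0 then gives u = −γ tanh(R^{-1} g(x)^T λ / (2γ)). Equation (8) is available here only as a figure, so this formula, the toy scalar system, and all function names are assumptions made for illustration.

```python
import numpy as np

def w_nonquadratic(u, r, gamma):
    """Assumed tanh-type input penalty in closed form (|u_i| < gamma)."""
    s = u / gamma
    return float(np.sum(2.0 * gamma * r * u * np.arctanh(s)
                        + gamma**2 * r * np.log1p(-s**2)))

def hamiltonian(x, u, lam, f, g, Q, r, gamma):
    """H(x, u, lam) = Q(x) + W(u) + lam^T (f(x) + g(x) u), as in equation (9)."""
    return Q(x) + w_nonquadratic(u, r, gamma) + float(lam @ (f(x) + g(x) @ u))

def saturated_control(x, lam, g, r, gamma):
    """Stationary point of H in u under the assumed penalty."""
    return -gamma * np.tanh((g(x).T @ lam) / (2.0 * gamma * r))

# Toy scalar example (illustrative values only).
f = lambda x: -x
g = lambda x: np.array([[1.0]])
Q = lambda x: float(x @ x)
x, lam, r, gamma = np.array([1.0]), np.array([2.0]), np.array([1.0]), 1.0

u_star = saturated_control(x, lam, g, r, gamma)

# Brute-force check: minimize H over a fine grid of admissible inputs.
grid = np.linspace(-0.999, 0.999, 19981)
h_vals = [hamiltonian(x, np.array([u]), lam, f, g, Q, r, gamma) for u in grid]
u_grid = grid[int(np.argmin(h_vals))]
```

Note that the control law uses only g and λ, not f, and its magnitude never exceeds γ.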
In the embodiment of the present specification, expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a high-order ordinary differential equation includes:
establishing the empirical data set according to historical input and output data of the nonlinear system;
determining, from the empirical data set and the partial differential equation, a saturated state equation and a saturated partial differential equation based on the state trajectory and the control input, as follows:
Figure BDA0004234609100000101
where
Figure BDA00042346091000001020
is the state trajectory with initial value
Figure BDA0004234609100000102
and
Figure BDA0004234609100000103
is defined as the unknown optimal control;
and obtaining, from the saturated state equation and the saturated partial differential equation together with the state equation and the cost function, a high-order ordinary differential equation of the optimal cost function through a differential dynamic programming algorithm, as follows:
Figure BDA0004234609100000104
where
Figure BDA0004234609100000105
Figure BDA0004234609100000106
Figure BDA0004234609100000107
S_12 = V_xx g,
Figure BDA0004234609100000108
with boundary condition
Figure BDA0004234609100000109
Illustratively, a trial control input
Figure BDA00042346091000001010
is first selected. Let
Figure BDA00042346091000001011
be the state trajectory with initial value
Figure BDA00042346091000001012
For the initial value x_0,
Figure BDA00042346091000001013
is defined as the unknown optimal control. Thus, under any saturated input u(t) ∈ U, t ∈ [t_0, t_f], the state trajectory x(t) can be parameterized by
Figure BDA00042346091000001014
and expressed as:
Figure BDA00042346091000001015
where
Figure BDA00042346091000001016
is the state error of the system and
Figure BDA00042346091000001017
is the error of the control input. The state equation and the HJB equation can then be written as follows:
Figure BDA00042346091000001018
Expanding the above equations around
Figure BDA00042346091000001019
yields the following DDP expansion:
DDP expansion: based on the state equation (1) and the cost function (2), let d_i be the column vector
Figure BDA0004234609100000111
Figure BDA0004234609100000112
and G = ((g_1)_x, (g_2)_x, …, (g_m)_x), where g_i is the i-th column vector of g, i = 1, 2, …, m. Then the optimal cost function V and its partial derivatives V_x and V_xx satisfy:
Figure BDA0004234609100000113
where
Figure BDA0004234609100000114
Figure BDA0004234609100000115
S_12 = V_xx g,
Figure BDA0004234609100000116
S_22 = W_uu,
Figure BDA0004234609100000117
with boundary condition
Figure BDA0004234609100000118
In equations (14) to (16), the functions V, V_x, V_xx, f, f_x, g, G, Q, Q_x, Q_xx, W, and W_uu are all evaluated at
Figure BDA0004234609100000119
and their arguments are omitted for brevity.
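For orientation, DDP-style expansions rest on a Taylor expansion of the cost function about a nominal trajectory. The generic second-order form is sketched below, with the nominal state written as \bar{x} and the state error as \delta x = x - \bar{x}; this notation is assumed, and the block is the textbook expansion rather than a reproduction of equations (14) to (16), whose exact terms appear only in the figures.

```latex
V(\bar{x} + \delta x,\, t) \;\approx\;
V(\bar{x}, t)
+ V_x(\bar{x}, t)^{T}\, \delta x
+ \tfrac{1}{2}\, \delta x^{T}\, V_{xx}(\bar{x}, t)\, \delta x
```

Because V, V_x, and V_xx are evaluated along a single trajectory, they depend on time only, which is why the expansion leads to ordinary differential equations in t instead of a partial differential equation in (x, t).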
In the embodiment of the present specification, introducing function approximation into the high-order ordinary differential equation to obtain a data-driven model of the nonlinear system includes:
determining basis functions for approximating the optimal cost function, as follows:
Figure BDA00042346091000001110
determining an estimation function of the optimal control according to the basis functions;
and substituting the estimation function into the high-order ordinary differential equation to obtain a high-order differential dynamic approximation.
Illustratively, to obtain the optimal cost function V when the system model f is unknown, the unknown functions are approximated with sets of linearly independent basis functions, defined as follows:
Figure BDA00042346091000001111
where
Figure BDA0004234609100000121
and
Figure BDA0004234609100000122
are sets of basis functions;
Figure BDA0004234609100000123
and
Figure BDA0004234609100000124
are sets of weights; N_a and N_b are the numbers of basis functions in the respective sets;
Figure BDA0004234609100000125
and
Figure BDA0004234609100000126
are approximation errors. As N_a and N_b approach infinity, the approximation errors e_ia and e_b converge uniformly to zero. Since the exact values of the weights are unknown, the estimation functions are defined as
Figure BDA0004234609100000127
Figure BDA0004234609100000128
where
Figure BDA0004234609100000129
is the set of weight estimates.
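The idea of approximating an unknown function by a weighted sum of basis functions can be sketched in a few lines of Python. The quadratic polynomial basis, the sample data, and the names `poly_basis` and `fit_weights` are illustrative assumptions; the document's own basis sets appear only in the figures, and its weights may vary with time over the finite horizon, whereas this sketch uses constant weights for simplicity.

```python
import numpy as np

def poly_basis(x):
    """Hypothetical basis: quadratic monomials of a two-dimensional state."""
    x1, x2 = x
    return np.array([x1**2, x1 * x2, x2**2])

def fit_weights(samples, targets, basis):
    """Least-squares weight estimates for a linear-in-parameters approximator."""
    phi = np.array([basis(x) for x in samples])   # one row of features per sample
    weights, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return weights

# Illustrative target lying in the span of the basis: V(x) = x1^2 + 0.5 x2^2.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(50, 2))
targets = samples[:, 0]**2 + 0.5 * samples[:, 1]**2

w_hat = fit_weights(samples, targets, poly_basis)
v_hat = lambda x: float(w_hat @ poly_basis(x))    # estimated function
```

When the target lies in the span of the basis, as here, the approximation error is zero and the weights are recovered exactly; otherwise the residual plays the role of the approximation error described above.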
For simplicity, the following compact form is defined:
Figure BDA00042346091000001210
Figure BDA00042346091000001211
Figure BDA00042346091000001212
Further, with the symbols defined above, the following compact expression can be obtained:
Figure BDA00042346091000001213
Based on equation (8),
Figure BDA00042346091000001214
can be obtained as:
Figure BDA00042346091000001215
Letting
Figure BDA00042346091000001216
the above equation can be rewritten as:
Figure BDA00042346091000001217
Substituting the estimation functions into the second-order expansions (14)-(15) or the third-order expansions (14)-(16) yields a second-order or third-order DDP approximation, respectively.
In this embodiment, after substituting the estimation function into the high-order ordinary differential equation to obtain the high-order differential dynamic approximation, the method further includes:
determining, according to the high-order differential dynamic approximation, an algebraic matrix equation satisfied by the weight approximation;
and, when the persistent excitation condition is met, optimizing the algebraic matrix equation to obtain a target algebraic matrix equation for calculating the weight approximation.
Illustratively, and without loss of generality, a second-order DDP approximation is given as follows.
Second-order DDP approximation: based on the second-order DDP expansions (14)-(15), the weight estimates
Figure BDA00042346091000001218
and
Figure BDA00042346091000001219
satisfy, on any time interval
Figure BDA00042346091000001221
the following algebraic matrix equation:
Figure BDA00042346091000001220
where
Figure BDA0004234609100000131
Figure BDA0004234609100000132
Figure BDA0004234609100000133
Figure BDA0004234609100000134
Figure BDA0004234609100000135
Figure BDA0004234609100000136
Figure BDA0004234609100000137
Furthermore, if the persistent excitation (PE) condition is satisfied, that is, if there exist a constant $\rho > 0$ and a number of time intervals
Figure BDA0004234609100000138
such that
Figure BDA0004234609100000139
then the following is obtained:
Figure BDA00042346091000001310
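The role of the PE condition can be illustrated with a small least-squares sketch in Python. It assumes, as a simplification not stated in the source, that the algebraic matrix equation reduces to one scalar regression row per time interval, `theta_k . w = xi_k`; the names `thetas`, `xis`, and `solve_weights` are hypothetical. The PE check tests whether the averaged Gram matrix minus `rho` times the identity is positive definite, using a Cholesky factorization.

```python
import math
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def cholesky(m: Matrix) -> Optional[Matrix]:
    """Return lower-triangular L with m = L L^T, or None if m is not positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def solve_weights(thetas: Sequence[Sequence[float]],
                  xis: Sequence[float], rho: float) -> List[float]:
    """Least-squares weights from stacked interval data, guarded by a PE check."""
    n, count = len(thetas[0]), len(thetas)
    gram = [[sum(t[i] * t[j] for t in thetas) for j in range(n)] for i in range(n)]
    rhs = [sum(t[i] * xi for t, xi in zip(thetas, xis)) for i in range(n)]
    # PE check (assumed form): (1/K) * sum(theta theta^T) - rho * I must be positive definite.
    shifted = [[gram[i][j] / count - (rho if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    if cholesky(shifted) is None:
        raise ValueError("persistent excitation check failed")
    low = cholesky(gram)  # positive definite because the PE check passed
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = rhs
        y[i] = (rhs[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T w = y
        w[i] = (y[i] - sum(low[k][i] * w[k] for k in range(i + 1, n))) / low[i][i]
    return w
```

If the data rows all point in the same direction, the check fails and no weights are returned, which mirrors the requirement that the collected data be sufficiently rich before the weights are computed.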
Further, performing iterative processing on the data-driven model according to the set constraint to determine the optimal control of the nonlinear system includes:
defining an approximately estimated Hamiltonian and a control input;
and performing differential dynamic iteration on the high-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system.
Illustratively, the following definitions are given first:
Definition 2: The approximately estimated Hamiltonian is defined as:
Figure BDA00042346091000001311
Furthermore, the exact weights of the basis functions for the system $f(x)$, whose model is unknown, are denoted $A^*$, i.e., $f(x) = A^* \Psi(x)$.
Definition 3: set to any of
Figure BDA00042346091000001312
Is a constraint operator. />
Figure BDA00042346091000001313
Where γ is the constraint of the control input.
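The exact form of the constraint operator is given only in the figure above. As a hedged illustration, the Python sketch below uses elementwise clipping to the interval [-gamma, gamma], which is one common choice of saturation and is an assumption here; the name `sat` is hypothetical.

```python
from typing import List, Sequence


def sat(u: Sequence[float], gamma: float) -> List[float]:
    """Clip every component of the control input to [-gamma, gamma] (assumed form)."""
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")
    return [max(-gamma, min(gamma, ui)) for ui in u]
```

For example, with `gamma = 0.5` the input `[0.8, -0.2, -3.0]` is mapped to `[0.5, -0.2, -0.5]`: components inside the bound pass through unchanged.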
Further, performing differential dynamic iteration on the high-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system, includes:
Step 1: setting initial parameter values and calculating an initial cost function value, wherein the initial parameter values include at least an initial control input and an initial state variable;
Step 2: calculating an iteration control input of the nonlinear system according to the initial parameter values;
Step 3: determining an iteration state variable of the nonlinear system according to the iteration control input and the state equation;
Step 4: calculating an iteration weight approximation according to the target algebraic matrix equation and the number of iterations, and judging whether a first convergence condition is satisfied; if not, returning to step 3; if so, proceeding to step 5;
Step 5: calculating an iteration cost function value according to the iteration state variable and the optimal cost function, and judging whether the iteration cost function value satisfies a second convergence condition; if not, feeding the iteration control input back into step 2 for a further control-input iteration; if so, determining the iteration control input as the target control input.
Illustratively, the iterative process is as follows:
S301: Select an initial value
Figure BDA0004234609100000141
and a convergence accuracy $\varepsilon > 0$. Let $x_0(t)$, $t \in [t_0, t_f]$, be the state of system (2) corresponding to the initially given control input
Figure BDA00042346091000001413
which satisfies the following equation:
Figure BDA0004234609100000142
where the initial control is
Figure BDA0004234609100000143
and
Figure BDA0004234609100000144
Calculate the initial cost function
Figure BDA0004234609100000145
and set $i = 0$.
S302: calculation of
Figure BDA0004234609100000146
And lambda (lambda) i By solving the following formula:
Figure BDA0004234609100000147
S303: Calculate $x_{i+1}$ by solving the following formula:
Figure BDA0004234609100000148
and calculate the cost function
Figure BDA0004234609100000149
S304: calculation of
Figure BDA00042346091000001410
And->
Figure BDA00042346091000001411
Based on the following formula:
Figure BDA00042346091000001412
If
Figure BDA0004234609100000151
set $k_i \to 2k_i$ and go to S303. Otherwise, go to S305.
S305: If $J_{i+1} - J_i \ge 0$, set $k_i \to 2k_i$ and go to S302. Otherwise, set
Figure BDA0004234609100000152
and return to S302, repeating until
Figure BDA0004234609100000153
S306: setting a cost function
Figure BDA0004234609100000154
And the control input +.>
Figure BDA0004234609100000155
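The control flow of the iteration (an inner check on the weight estimates and an outer check on the cost) can be sketched in Python as below. This is only a skeleton under stated assumptions: the callables `update_control`, `rollout`, `fit_weights`, `weights_ok`, and `cost` are hypothetical placeholders for the formulas shown in the figures, the scalar types are a simplification, and the counter `k` that is doubled when the first condition fails is an interpretation of the doubling of $k_i$ in the steps above.

```python
from typing import Callable, Tuple


def ddp_style_iteration(
    u0: float,
    update_control: Callable[[float], float],
    rollout: Callable[[float], float],
    fit_weights: Callable[[float, int], float],
    weights_ok: Callable[[float], bool],
    cost: Callable[[float], float],
    eps: float,
    max_iter: int = 100,
    max_doublings: int = 10,
) -> Tuple[float, float, int]:
    """Return (control, cost, iterations) once the cost change drops below eps."""
    u = u0
    j_prev = cost(rollout(u))              # step 1: initial cost
    for i in range(1, max_iter + 1):
        u = update_control(u)              # step 2: new control input
        k = 1                              # hypothetical data-amount counter
        for _ in range(max_doublings):
            x = rollout(u)                 # step 3: new state
            w = fit_weights(x, k)          # step 4: weight estimate
            if weights_ok(w):              # first convergence condition
                break
            k *= 2                         # use more data and repeat step 3
        else:
            raise RuntimeError("weight estimate did not converge")
        j = cost(x)                        # step 5: new cost
        if abs(j - j_prev) < eps:          # second convergence condition
            return u, j, i
        j_prev = j
    raise RuntimeError("cost did not converge")


# Toy usage: a scalar "control" that should approach 3, with made-up callables.
u_star, j_star, n_iter = ddp_style_iteration(
    u0=0.0,
    update_control=lambda u: u - 0.5 * (u - 3.0),
    rollout=lambda u: u,
    fit_weights=lambda x, k: 1.0 / k,
    weights_ok=lambda w: w <= 0.25,
    cost=lambda x: (x - 3.0) ** 2,
    eps=1e-6,
)
```

The toy callables carry no control-theoretic meaning; they only exercise the two nested convergence checks so the skeleton can be run end to end.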
Illustratively, the nonlinear system in this embodiment is selected as follows:
Figure BDA0004234609100000156
The basis functions of the system equations are selected as:
Figure BDA0004234609100000157
The cost function is defined as:
Figure BDA0004234609100000158
where $\gamma = 0.5$ and $r = 1$. The optimal cost function of the system is approximated as:
Figure BDA0004234609100000159
According to step S204, the data-driven differential dynamic programming algorithm is run with the initial condition of the system set to $x_0 = [2, 1]^T$ and the control input constrained to $|u| < 0.5$.
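To make the setup concrete, the following Python sketch rolls a system forward from $x_0 = [2, 1]^T$ under a control bounded by 0.5, using forward Euler integration. The dynamics `toy_dynamics`, the feedback policy, the horizon, and the step size are hypothetical stand-ins chosen only for illustration; they are not the system, controller, or settings of this embodiment.

```python
from typing import Callable, List, Sequence, Tuple


def rollout(
    f: Callable[[Sequence[float], float], List[float]],
    policy: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    gamma: float,
    t_f: float,
    dt: float,
) -> Tuple[List[List[float]], List[float]]:
    """Forward-Euler simulation with the control clipped to [-gamma, gamma]."""
    x = list(x0)
    xs, us = [list(x0)], []
    for _ in range(int(round(t_f / dt))):
        u = max(-gamma, min(gamma, policy(x)))   # enforce the input bound
        dx = f(x, u)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        xs.append(x)
        us.append(u)
    return xs, us


def toy_dynamics(x: Sequence[float], u: float) -> List[float]:
    """Hypothetical two-state system used only to exercise the rollout."""
    return [x[1], -x[0] - x[1] + u]


# Initial condition and bound taken from the text; everything else is assumed.
xs, us = rollout(toy_dynamics, lambda x: -2.0 * x[1], [2.0, 1.0], 0.5, 1.0, 0.01)
```

The returned lists hold the sampled state trajectory and the applied (clipped) control sequence, which is the kind of trajectory data the data-driven algorithm consumes.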
Starting from the optimal cost function
Figure BDA00042346091000001510
Figure BDA00042346091000001511
after 17 iterations, the weight vector is obtained as:
Figure BDA00042346091000001512
Based on equation (24), the optimal control input at the 17th iteration is obtained as:
Figure BDA00042346091000001513
Figs. 3 and 4 compare the state trajectories of the system under the initial control input and the optimal control input in this embodiment, respectively, and Fig. 5 shows the initial cost function $V_0$ and the optimal cost function $V_{17}$ of the system in this embodiment.
Compared with the prior art, the embodiments of this specification have the following advantages and effects:
1. the invention expands the HJB partial differential equation into high-order ordinary differential equations based on a differential dynamic programming algorithm, and builds a new data-driven model;
2. the method effectively mitigates the curse of dimensionality caused by the large computational load, and the algorithm converges quickly;
3. the invention overcomes the technical obstacle posed by the time-varying behavior of the HJB equation that results from the finite-horizon cost function.
On the basis of the method provided above, an embodiment of this specification further provides an optimal control apparatus for a nonlinear system with an unknown model. As shown in Fig. 7, the apparatus includes:
a partial differential equation determining module 701, configured to establish an optimal cost function of a nonlinear system with an unknown model, and determine a partial differential equation for solving the optimal cost function;
a high-order ordinary differential equation determining module 702, configured to expand the partial differential equation according to the empirical data set of the nonlinear system, so as to obtain a high-order ordinary differential equation;
a data-driven model determining module 703, configured to introduce the high-order ordinary differential equation into a function approximation to obtain a data-driven model of the nonlinear system;
and an optimal control module 704, configured to perform iterative processing on the data-driven model according to the set constraint, so as to determine the optimal control of the nonlinear system.
The effects obtained by the apparatus are consistent with the beneficial effects obtained by the method and are not repeated here.
Further, the present specification also provides an apparatus for optimal control of a nonlinear system whose model is unknown, the apparatus comprising:
an input interface configured to receive a state trajectory of the nonlinear system;
a memory;
a processor configured to perform the method described above and generate control instructions;
an output interface configured to send the control instructions to actuators of the nonlinear system to control operation of the system.
For example, an internal structural diagram of the device may be as shown in Fig. 8. The device includes a processor, a memory, and a network interface connected by a system bus. The processor of the device provides computing and control capabilities. The memory of the device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements the optimal control method for a nonlinear system with an unknown model described above.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of a portion of the structure associated with the present application and does not constitute a limitation of the apparatus to which the present application is applied, and that a particular apparatus may include more or less components than those shown in the drawings, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, etc., without being limited thereto.
It should also be understood that in embodiments herein, the term "and/or" is merely one relationship that describes an associated object, meaning that three relationships may exist. For example, a and/or B may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided herein, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or elements, or may be an electrical, mechanical, or other form of connection.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the elements may be selected according to actual needs to achieve the objectives of the embodiments herein.
Specific examples are set forth herein to illustrate the principles and embodiments herein and are merely illustrative of the methods herein and their core ideas; also, as will be apparent to those of ordinary skill in the art in light of the teachings herein, many variations are possible in the specific embodiments and in the scope of use, and nothing in this specification should be construed as a limitation on the invention.

Claims (10)

1. An optimal control method for a nonlinear system with an unknown model, the method comprising:
for a nonlinear system with an unknown model, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;
expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a high-order ordinary differential equation;
introducing the high-order ordinary differential equation into a function approximation to obtain a data-driven model of the nonlinear system;
and performing iterative processing on the data-driven model according to the set constraint so as to determine the optimal control of the nonlinear system.
2. The method of claim 1, wherein establishing an optimal cost function for the system for a nonlinear system for which the model is unknown and determining a partial differential equation that solves the optimal cost function comprises:
establishing a state equation of a nonlinear system with unknown model:
Figure FDA0004234609090000011
$x(t_0) = x_0$, where
Figure FDA0004234609090000012
is the state variable of the system,
Figure FDA0004234609090000013
is the control input of the system,
Figure FDA0004234609090000014
is the dynamic equation of the system, and
Figure FDA0004234609090000015
is the input state equation of the system;
determining an optimal cost function of the nonlinear system according to the dynamic constraint of the nonlinear system and the state equation of the nonlinear system:
Figure FDA0004234609090000016
Figure FDA0004234609090000017
where $t \in [t_0, t_f]$, and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$;
determining a partial differential equation that solves the optimal cost function:
Figure FDA0004234609090000018
3. The method of claim 1, wherein expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a high-order ordinary differential equation comprises:
establishing the experience data set according to the historical input and output data of the nonlinear system;
from the empirical data set, and the partial differential equation, a saturated state equation and a saturated partial differential equation based on the state trajectory and the control input are determined as follows:
Figure FDA0004234609090000021
where
Figure FDA0004234609090000022
is the state trajectory with initial value
Figure FDA0004234609090000023
and
Figure FDA0004234609090000024
is defined as the unknown optimal control;
and obtaining a high-order ordinary differential equation of the optimal cost function through a differential dynamic programming algorithm according to the saturated state equation and the saturated partial differential equation, together with the state equation and the cost function, wherein the high-order ordinary differential equation is as follows:
Figure FDA0004234609090000025
where
Figure FDA0004234609090000026
Figure FDA0004234609090000027
Figure FDA0004234609090000028
$S_{12} = V_{xx} g$,
Figure FDA0004234609090000029
$S_{22} = W_{uu}$,
Figure FDA00042346090900000210
and the boundary conditions are
Figure FDA00042346090900000211
4. The method of claim 1, wherein introducing the high-order ordinary differential equation into a function approximation to obtain a data-driven model of the nonlinear system comprises:
determining basis functions used to approximate the optimal cost function, as follows:
Figure FDA00042346090900000212
determining an estimation function of the optimal control according to the basis functions;
and substituting the estimation function into the high-order ordinary differential equation to obtain a high-order differential dynamic approximation.
5. The method of claim 4, wherein, after substituting the estimation function into the high-order ordinary differential equation to obtain the high-order differential dynamic approximation, the method further comprises:
determining, according to the high-order differential dynamic approximation, an algebraic matrix equation satisfied by the weight approximation;
and, when the persistent excitation condition is met, optimizing the algebraic matrix equation to obtain a target algebraic matrix equation for calculating the weight approximation.
6. The method of claim 4, wherein performing iterative processing on the data-driven model according to the set constraint to determine the optimal control of the nonlinear system comprises:
defining an approximately estimated Hamiltonian and a control input;
and performing differential dynamic iteration on the high-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system.
7. The method of claim 6, wherein performing differential dynamic iteration on the high-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system, comprises:
Step 1: setting initial parameter values and calculating an initial cost function value, wherein the initial parameter values include at least an initial control input and an initial state variable;
Step 2: calculating an iteration control input of the nonlinear system according to the initial parameter values;
Step 3: determining an iteration state variable of the nonlinear system according to the iteration control input and the state equation;
Step 4: calculating an iteration weight approximation according to the target algebraic matrix equation and the number of iterations, and judging whether a first convergence condition is satisfied; if not, returning to step 3; if so, proceeding to step 5;
Step 5: calculating an iteration cost function value according to the iteration state variable and the optimal cost function, and judging whether the iteration cost function value satisfies a second convergence condition; if not, feeding the iteration control input back into step 2 for a further control-input iteration; if so, determining the iteration control input as the target control input.
8. An optimal control apparatus for a nonlinear system with an unknown model, the apparatus comprising:
a partial differential equation determining module, configured to establish an optimal cost function of a nonlinear system with an unknown model, and determine a partial differential equation for solving the optimal cost function;
a high-order ordinary differential equation determining module, configured to expand the partial differential equation according to the empirical data set of the nonlinear system, so as to obtain a high-order ordinary differential equation;
a data-driven model determining module, configured to introduce the high-order ordinary differential equation into a function approximation to obtain a data-driven model of the nonlinear system;
and an optimal control module, configured to perform iterative processing on the data-driven model according to the set constraint, so as to determine the optimal control of the nonlinear system.
9. An optimal control device for a nonlinear system whose model is unknown, said device comprising:
an input interface configured to receive a state trajectory of the nonlinear system;
a memory;
a processor configured to perform the method of any one of claims 1 to 7 and generate control instructions;
an output interface configured to send the control instructions to actuators of the nonlinear system to control operation of the system.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.
CN202310559968.7A 2023-05-18 2023-05-18 Optimal control method and equipment for nonlinear system with unknown model Pending CN116382093A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310559968.7A CN116382093A (en) 2023-05-18 2023-05-18 Optimal control method and equipment for nonlinear system with unknown model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310559968.7A CN116382093A (en) 2023-05-18 2023-05-18 Optimal control method and equipment for nonlinear system with unknown model

Publications (1)

Publication Number Publication Date
CN116382093A true CN116382093A (en) 2023-07-04

Family

ID=86973521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310559968.7A Pending CN116382093A (en) 2023-05-18 2023-05-18 Optimal control method and equipment for nonlinear system with unknown model

Country Status (1)

Country Link
CN (1) CN116382093A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117290965A (en) * 2023-11-22 2023-12-26 中汽研汽车检验中心(广州)有限公司 Vehicle model simulation test simulation method, equipment and medium of vehicle simulation software

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117290965A (en) * 2023-11-22 2023-12-26 中汽研汽车检验中心(广州)有限公司 Vehicle model simulation test simulation method, equipment and medium of vehicle simulation software
CN117290965B (en) * 2023-11-22 2024-04-09 中汽研汽车检验中心(广州)有限公司 Vehicle model simulation test simulation method, equipment and medium of vehicle simulation software

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination