CN116382093A

CN116382093A - Optimal control method and equipment for nonlinear system with unknown model

Info

Publication number: CN116382093A
Application number: CN202310559968.7A
Authority: CN
Inventors: 张斌; 马聪辉
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2023-05-18
Filing date: 2023-05-18
Publication date: 2023-07-04

Abstract

An optimal control method and equipment for a nonlinear system with an unknown model are provided. For the nonlinear system with an unknown model, an optimal cost function of the system is established, and a partial differential equation for solving the optimal cost function is determined; the partial differential equation is expanded according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation; function approximation is introduced into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system; and the data-driven model is processed iteratively according to set constraints to determine the optimal control of the nonlinear system. The method mitigates the curse of dimensionality caused by the heavy computational load, converges quickly, and improves the efficiency of nonlinear system control.

Description

Optimal control method and equipment for nonlinear system with unknown model

Technical Field

The invention belongs to the field of control planning, and particularly relates to an optimal control method and equipment for a nonlinear system with an unknown model.

Background

Control theory, as applied in control systems engineering, is a branch of mathematics that deals with the control of continuously operating dynamical systems in engineered processes and machines. Its aim is to develop a control strategy that drives such a system optimally through control actions, without delay or overshoot, while ensuring control stability.

For example, optimization-based control and estimation techniques such as model predictive control (MPC) provide a model-based design framework in which system dynamics and constraints can be considered directly. MPC is used in many applications to control dynamical systems of varying complexity. Examples of such systems include production lines, automotive engines, robots, numerically controlled machining, motors, satellites, and generators. In many cases, however, the model of the controlled system is nonlinear and may be difficult to design, difficult to use in real time, or inaccurate. Such situations are common in robotics, building control (HVAC), smart grids, factory automation, transportation, self-tuning machines, and traffic networks. In addition, even when the nonlinear model is fully available, designing an optimal controller is inherently challenging because it requires solving a partial differential equation known as the Hamilton-Jacobi-Bellman (HJB) equation.

Finding the optimal control law of a general nonlinear system requires solving the HJB equation. Various conventional solutions exist for the optimal control problem of a dynamic system with a performance index, or cost function, but they have two drawbacks. On the one hand, solving the HJB equation has inherent computational complexity that grows exponentially with the state dimension, the so-called curse of dimensionality. On the other hand, conventional solutions depend on an accurate system model and cannot be applied to systems that are difficult to model. Optimal control that does not depend on a mathematical model therefore remains an active research topic.

Disclosure of Invention

In view of the foregoing problems of the prior art, it is an object of the present invention to provide a method and apparatus for optimal control of a nonlinear system with unknown model, which can improve the calculation efficiency of optimal control of the nonlinear system.

In order to solve the technical problems, the specific technical scheme is as follows:

In one aspect, provided herein is an optimal control method for a nonlinear system with an unknown model, the method comprising:

establishing, for the nonlinear system with an unknown model, an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;

expanding the partial differential equation according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;

introducing function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system;

and performing iterative processing on the data-driven model according to set constraints to determine the optimal control of the nonlinear system.

Further, establishing the optimal cost function of the system for the nonlinear system with an unknown model, and determining the partial differential equation for solving the optimal cost function, includes:

establishing a state equation of the nonlinear system with an unknown model:

$\dot{x}(t) = f(x(t)) + g(x(t))\,u(t), \quad x(t_0) = x_0$

where $x$ is the state variable of the system, $u$ is the control input of the system, $f(x)$ is the system dynamics, and $g(x)$ is the input dynamics;

determining the optimal cost function of the nonlinear system according to the dynamic constraint of the nonlinear system and the state equation of the nonlinear system:

$V(x, t) = \min_{u[t, t_f]} \int_t^{t_f} \left( Q(x(\tau)) + W(u(\tau)) \right) d\tau$

where $t \in [t_0, t_f]$ and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$;

determining the partial differential equation that solves the optimal cost function:

$-V_t(x, t) = \min_{u} \left[ Q(x) + W(u) + V_x^T \left( f(x) + g(x)u \right) \right]$

Further, expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain the higher-order ordinary differential equation includes:

establishing the empirical data set according to historical input and output data of the nonlinear system;

determining, from the empirical data set and the partial differential equation, a saturated state equation and a saturated partial differential equation based on the state trajectory and the control input [equations missing in source], where the state trajectory is the one generated from the initial value $x_0$, and $u^*$ is defined as the unknown optimal control;

and obtaining the higher-order ordinary differential equation of the optimal cost function through a differential dynamic programming algorithm according to the saturated state equation, the saturated partial differential equation, the state equation and the cost function [equations missing in source], where $S_{12} = V_{xx} g$ and $S_{22} = W_{uu}$, together with a boundary condition [equation missing in source].

Further, introducing function approximation into the higher-order ordinary differential equation to obtain the data-driven model of the nonlinear system includes:

determining basis functions for approximating the optimal cost function [equation missing in source];

determining an estimation function of the optimal control according to the basis functions;

and substituting the estimation function into the higher-order ordinary differential equation to obtain a higher-order differential dynamic approximation.

Further, after the estimation function is substituted into the higher-order ordinary differential equation to obtain the higher-order differential dynamic approximation, the method further includes:

determining, according to the higher-order differential dynamic approximation, an algebraic matrix equation satisfied by the weight estimates;

and, when the persistent excitation condition is met, optimizing the algebraic matrix equation to obtain a target algebraic matrix equation for calculating the weight estimates.

Further, performing iterative processing on the data-driven model according to the set constraints to determine the optimal control of the nonlinear system includes:

defining an approximately estimated Hamiltonian and a control input;

and performing differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system.

Further, performing the differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input includes:

step 1: setting initial parameter values and calculating an initial cost function value, wherein the initial parameter values include at least an initial control input and an initial state variable;

step 2: calculating an iteration control input of the nonlinear system according to the initial parameter values;

step 3: determining an iteration state variable of the nonlinear system according to the iteration control input and the state equation;

step 4: calculating iteration weight estimates according to the target algebraic matrix equation and the iteration count, and judging whether a first convergence condition is met; if not, returning to step 3; if so, proceeding to step 5;

step 5: calculating an iteration cost function value according to the iteration state variable and the optimal cost function, and judging whether the iteration cost function value meets a second convergence condition; if not, feeding the iteration control input back into step 2 for a further control input iteration; if so, determining the iteration control input as the target control input.
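The control flow of steps 1 to 5 can be sketched as a nested loop. This is a minimal sketch, not the patented algorithm itself: the callables `update_control`, `rollout`, `update_weights` and `cost` are hypothetical placeholders, every quantity is reduced to a scalar, and both convergence tests (absolute change below a tolerance) are assumptions, since the text does not state the two convergence conditions.

```python
from typing import Callable, Tuple


def iterate_control(
    u0: float,
    w0: float,
    update_control: Callable[[float, float], float],  # step 2 (hypothetical)
    rollout: Callable[[float], float],                # step 3 (hypothetical)
    update_weights: Callable[[float, float], float],  # step 4 (hypothetical)
    cost: Callable[[float, float], float],            # step 5 (hypothetical)
    tol_w: float = 1e-9,
    tol_j: float = 1e-12,
    max_outer: int = 200,
    max_inner: int = 200,
) -> Tuple[float, float, int]:
    """Return (control, cost, outer iterations used)."""
    u, w = u0, w0
    x = rollout(u)              # step 1: initial state from the initial control
    j_prev = cost(x, u)         # step 1: initial cost function value
    for outer in range(1, max_outer + 1):
        u = update_control(u, w)            # step 2: iteration control input
        for _ in range(max_inner):
            x = rollout(u)                  # step 3: iteration state variable
            w_new = update_weights(w, x)    # step 4: iteration weight estimate
            converged = abs(w_new - w) < tol_w
            w = w_new
            if converged:                   # first convergence condition (assumed form)
                break
        j = cost(x, u)                      # step 5: iteration cost function value
        if abs(j - j_prev) < tol_j:         # second convergence condition (assumed form)
            return u, j, outer
        j_prev = j
    return u, j_prev, max_outer
```

With toy callables such as `rollout = lambda u: u`, `update_weights = lambda w, x: 0.5 * (w + x)`, `update_control = lambda u, w: u - 0.5 * (w - 1.0)` and `cost = lambda x, u: (x - 1.0) ** 2`, the loop drives the control toward 1 and the cost toward 0, which is enough to exercise both convergence checks.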

In another aspect, provided herein is an optimal control apparatus for a nonlinear system with an unknown model, the apparatus comprising:

a partial differential equation determining module, configured to establish an optimal cost function of the nonlinear system with an unknown model and to determine a partial differential equation for solving the optimal cost function;

a higher-order ordinary differential equation determining module, configured to expand the partial differential equation according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;

a data-driven model determining module, configured to introduce function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system;

and an optimal control module, configured to perform iterative processing on the data-driven model according to set constraints to determine the optimal control of the nonlinear system.

In another aspect, provided herein is optimal control equipment for a nonlinear system with an unknown model, the equipment comprising:

an input interface configured to receive a state trajectory of the nonlinear system;

a memory;

a processor configured to perform the method described above and generate control commands;

an output interface configured to send the control commands to actuators of the nonlinear system to control operation of the system.

Finally, a computer readable storage medium is provided herein, which stores a computer program which, when executed by a processor, implements a method as described above.

With the above technical scheme, the optimal control method and equipment for a nonlinear system with an unknown model establish an optimal cost function of the system and determine a partial differential equation for solving the optimal cost function; expand the partial differential equation according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation; introduce function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system; and process the data-driven model iteratively according to set constraints to determine the optimal control of the nonlinear system. The method mitigates the curse of dimensionality caused by the heavy computational load, converges quickly, and improves the efficiency of nonlinear system control.

The foregoing and other objects, features and advantages will be apparent from the following more particular description of preferred embodiments, as illustrated in the accompanying drawings.

Drawings

To illustrate the embodiments herein or the technical solutions in the prior art more clearly, the drawings required for describing them are briefly introduced below. The drawings described below show only some embodiments herein, and a person skilled in the art may derive other drawings from them without inventive effort.

FIG. 1 shows a schematic overview of the principles used by some embodiments for controlling the operation of a system;

FIG. 2 shows a schematic diagram of the steps of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein;

FIG. 3 shows a flow diagram of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein;

FIG. 4 shows a comparison of the state trajectories of a system under the initial control input and the optimal control input in an embodiment herein;

FIG. 5 shows a comparison of the state trajectories of a system under the initial control input and the optimal control input in another embodiment herein;

FIG. 6 shows the initial cost function $V_0$ and the optimal cost function $V_{17}$ of a system in embodiments herein;

FIG. 7 shows a schematic diagram of the optimal control apparatus for a nonlinear system with an unknown model provided by embodiments herein;

FIG. 8 shows a schematic structural diagram of a control device provided by embodiments herein.

Description of reference numerals:

100. control device; 102. system; 104. model; 106. control commands;

701. partial differential equation determining module; 702. higher-order ordinary differential equation determining module; 703. data-driven model determining module; 704. optimal control module.

Detailed Description

The embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort fall within the scope of protection herein.

It should be noted that the terms "first," "second," and the like in the description and claims herein and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or device.

FIG. 1 shows a schematic overview of the principles used by some embodiments for controlling the operation of a system. Some embodiments provide a control device 100 configured to control a system 102. For example, the device 100 may be configured to control a continuously operating dynamic system 102 in engineered processes and machines. Hereinafter, "control device" and "device" are used interchangeably and have the same meaning, as do "continuously operating dynamic system" and "system". Examples of the system 102 are HVAC systems, LIDAR systems, condensing units, production lines, self-tuning machines, smart grids, automotive engines, robots, numerically controlled machining, motors, satellites, generators, traffic networks, and the like. Some embodiments are based on the following recognition: the device 100 develops control commands 106 that control the system 102 optimally through control actions, without delay or overshoot, while ensuring control stability.

In some implementations, the apparatus 100 uses model-based and/or optimization-based control and estimation techniques, such as Model Predictive Control (MPC), to develop the control commands 106 for the system 102. Model-based techniques may be advantageous for control of dynamic systems. For example, MPC allows for a model-based design framework in which the dynamics and constraints of the system 102 can be directly considered. The MPC develops control commands 106 based on the model 104 of the system. The model 104 of the system refers to the dynamics of the system 102 described using differential equations. In some implementations, the model 104 is non-linear and may be difficult to design and/or difficult to use in real-time. For example, even if a nonlinear model is fully available, estimating the optimal control commands 106 is inherently a challenging task because it is computationally challenging to solve a Partial Differential Equation (PDE) (known as the Hamilton-Jacobi-Bellman (HJB) equation) that describes the dynamics of the system 102.

Some embodiments use data-driven control techniques to design the model 104. The data-driven technique utilizes operational data generated by the system 102 in order to build a feedback control strategy that stabilizes the system 102.

Further, the embodiment provides an optimal control method for a nonlinear system with an unknown model, which mitigates the curse of dimensionality caused by the heavy computational load and converges quickly. FIG. 2 is a schematic diagram of the steps of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein. The present specification provides the method steps as described in the examples or flowcharts, but more or fewer steps may be included based on conventional or non-inventive effort. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent the only one. When an actual system or apparatus product executes the method, the steps may be executed sequentially or in parallel according to the method shown in the embodiments or the drawings. As shown in FIG. 2, the method may include:

S201: establishing, for the nonlinear system with an unknown model, an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;

S202: expanding the partial differential equation according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;

S203: introducing function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system;

S204: performing iterative processing on the data-driven model according to set constraints to determine the optimal control of the nonlinear system.

It will be appreciated that, for nonlinear systems, the Hamilton-Jacobi-Bellman (HJB) partial differential equation is expanded into a higher-order ordinary differential equation, referred to as the DDP expansion, by means of differential dynamic programming (DDP). Function approximation is then introduced into the DDP expansion to form an actor-critic structure and to build a data-driven model. Based on the data-driven model, a DDP iterative algorithm with a rigorous convergence proof is developed. The algorithm overcomes the technical obstacle posed by the time-varying behavior of the HJB partial differential equation under a finite-horizon cost function.

In the embodiment of the present specification, establishing the optimal cost function of the system for the nonlinear system with an unknown model, and determining the partial differential equation for solving the optimal cost function, includes establishing a state equation of the nonlinear system with an unknown model.

Illustratively, the nonlinear system may be as follows:

$\dot{x}(t) = f(x(t)) + g(x(t))\,u(t), \quad x(t_0) = x_0$ (1)

where $x$ is the state variable of the system, $u$ is the control input of the system, $f(x)$ is the system dynamics, and $g(x)$ is the input dynamics. Assume that $f(x) + g(x)u$ satisfies the Lipschitz continuity condition, and let $U = \{u : |u_i| \le \gamma,\ i = 1, 2, \dots, m\}$ be the closed bounded set of all saturated inputs, where $\gamma$ is the constraint bound. For a fixed time interval $T = [t_0, t_f]$, the cost function associated with the system shown in equation (1) is defined as:

$J(x_0, t_0, u) = \int_{t_0}^{t_f} \left( Q(x(\tau)) + W(u(\tau)) \right) d\tau$ (2)

where $Q(x)$ is a positive definite function, $W(u)$ is a non-negative definite function, and $\tau$ is the integration variable.

The objective of the optimal control problem is to design a constrained optimal control $u^* \in U$ such that the cost function (2) satisfies:

$J(x_0, t_0, u) \ge J(x_0, t_0, u^*)$ (3)

Under the dynamic constraints of the system (1), the following generalized non-quadratic function is employed to handle the input constraints:

$W(u) = 2 \sum_{i=1}^{m} \int_0^{u_i} \gamma\, r_i \tanh^{-1}(v_i/\gamma)\, dv_i$ (4)

where $r_i > 0$, $i = 1, 2, \dots, m$, are positive weight factors.

Formula (4) can be rewritten in the following compact form:

$W(u) = 2 \int_0^{u} \gamma \left( \tanh^{-1}(v/\gamma) \right)^T R\, dv$ (5)

where $R = \mathrm{diag}(r_1, r_2, \dots, r_m)$, $v = (v_1, v_2, \dots, v_m)^T$, and $\tanh^{-1}(v/\gamma) = (\tanh^{-1}(v_1/\gamma), \tanh^{-1}(v_2/\gamma), \dots, \tanh^{-1}(v_m/\gamma))^T$.
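To make the non-quadratic input penalty concrete, the following minimal sketch evaluates it for a single input channel. It assumes the commonly used form $W(u) = 2\int_0^u \gamma\, r \tanh^{-1}(v/\gamma)\, dv$, which is consistent with the symbols $r_i$, $\gamma$ and $\tanh^{-1}(v/\gamma)$ defined above but is not reproduced verbatim in the text; the function names are illustrative. Under that assumption the integral has the closed form $2\gamma r u \tanh^{-1}(u/\gamma) + \gamma^2 r \ln(1 - u^2/\gamma^2)$, which the sketch cross-checks against Simpson quadrature.

```python
import math


def input_penalty(u: float, gamma: float, r: float) -> float:
    """Closed form of W(u) = 2 * int_0^u gamma * r * atanh(v / gamma) dv (assumed form)."""
    if abs(u) >= gamma:
        raise ValueError("closed form is evaluated only for |u| < gamma")
    s = u / gamma
    return 2.0 * gamma * r * u * math.atanh(s) + gamma ** 2 * r * math.log(1.0 - s * s)


def input_penalty_simpson(u: float, gamma: float, r: float, steps: int = 2000) -> float:
    """The same integral by composite Simpson's rule (steps must be even)."""
    h = u / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * 2.0 * gamma * r * math.atanh(k * h / gamma)
    return total * h / 3.0
```

The penalty is zero at $u = 0$, symmetric in $u$, and grows faster than the quadratic $r u^2$ as $|u|$ approaches $\gamma$, which is how it discourages inputs near the saturation bound.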

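The optimal control problem is described by the following optimal cost function:

$V(x, t) = \min_{u[t, t_f]} \int_t^{t_f} \left( Q(x(\tau)) + W(u(\tau)) \right) d\tau$ (6)

where $t \in [t_0, t_f]$ and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$. Assuming that $V(x, t)$ is continuously differentiable, the optimal cost function satisfies the HJB partial differential equation:

$-V_t(x, t) = \min_{u} \left[ Q(x) + W(u) + V_x^T \left( f(x) + g(x)u \right) \right]$ (7)

for all $x$ and $t \in [t_0, t_f]$. The optimal control strategy $u^*$ is obtained by differentiating the HJB equation with respect to the control input $u$:

$u^*(x, t) = -\gamma \tanh\left( \frac{1}{2\gamma} R^{-1} g(x)^T V_x \right)$ (8)

The Hamiltonian of the optimal control problem is defined as:

$H(x, u, \lambda) = Q(x) + W(u) + \lambda^T \left( f(x) + g(x)u \right)$ (9)

where $\lambda$ is a vector parameter. The HJB equation (7) can then be rewritten as:

$-V_t(x, t) = \min_{u} H(x, u, V_x)$ (10)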
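As a worked illustration of the Hamiltonian (9), the sketch below evaluates $H(x, u, \lambda) = Q(x) + W(u) + \lambda^T(f(x) + g(x)u)$ for a scalar system and checks numerically that the bounded control $u = -\gamma \tanh(g\lambda / (2\gamma r))$ minimizes it. The form of $W(u)$ and the closed-form minimizer are assumptions (the standard result for a tanh-type input penalty), and the function names and numeric values are purely illustrative.

```python
import math


def input_penalty(u: float, gamma: float, r: float) -> float:
    """Assumed penalty W(u) = 2 * int_0^u gamma * r * atanh(v / gamma) dv, in closed form."""
    s = u / gamma
    return 2.0 * gamma * r * u * math.atanh(s) + gamma ** 2 * r * math.log(1.0 - s * s)


def hamiltonian(q_x: float, u: float, lam: float, f_x: float, g_x: float,
                gamma: float, r: float) -> float:
    """Scalar version of H = Q(x) + W(u) + lam * (f(x) + g(x) * u), equation (9)."""
    return q_x + input_penalty(u, gamma, r) + lam * (f_x + g_x * u)


def minimizing_control(lam: float, g_x: float, gamma: float, r: float) -> float:
    """Stationary point of H in u: W'(u) + g * lam = 0 gives u = -gamma * tanh(g * lam / (2 * gamma * r))."""
    return -gamma * math.tanh(g_x * lam / (2.0 * gamma * r))
```

Because $W$ is strictly convex on $(-\gamma, \gamma)$, the stationary point is the unique minimizer, and the tanh keeps it strictly inside the input bound without any explicit clipping.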

In the embodiment of the present specification, expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain the higher-order ordinary differential equation is carried out as follows.

Illustratively, a trial control input is first selected, and the state trajectory generated by it from the initial value $x_0$ is taken as the reference trajectory. For the initial value $x_0$, $u^*$ is defined as the unknown optimal control. The state trajectory $x(t)$ under any saturated input $u(t) \in U$, $t \in [t_0, t_f]$, is then parameterized in terms of the state error of the system and the error of the control input [equation missing in source]. The state equation and the HJB equation can be rewritten accordingly [equations missing in source]. Expanding these equations around the reference trajectory yields the following DDP expansion.

DDP expansion: based on the state equation (1) and the cost function (2), let $d_i$ be a column vector [definition missing in source] and $G = ((g_1)_x, (g_2)_x, \dots, (g_m)_x)$, where $g_i$ is the $i$-th column vector of $g$, $i = 1, 2, \dots, m$. Then the optimal cost function $V$ and its partial derivatives $V_x$ and $V_{xx}$ satisfy equations (14) to (16) [equations missing in source], where $S_{12} = V_{xx} g$ and $S_{22} = W_{uu}$, together with a boundary condition [equation missing in source].

In equations (14) to (16), the functions $V, V_x, V_{xx}, f, f_x, g, G, Q, Q_x, Q_{xx}, W, W_{uu}$ are all evaluated along the reference trajectory; their arguments are omitted for simplicity.

In the embodiment of the present specification, introducing function approximation into the higher-order ordinary differential equation to obtain the data-driven model of the nonlinear system is carried out as follows.

Illustratively, to obtain the optimal cost function $V$ when the system model $f$ is unknown, the unknown functions are approximated with sets of independent basis functions [equations missing in source]. Each approximation consists of a set of basis functions, a set of weights, and an approximation error; $N_a$ and $N_b$ are the numbers of basis functions in the respective sets. As $N_a$ and $N_b$ tend to infinity, the approximation errors $e_{ia}$ and $e_b$ converge uniformly to zero. Since the exact values of the weights are unknown, estimation functions are defined in which the weights are replaced by sets of weight estimates [equations missing in source].
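The idea of replacing an unknown function by a weighted sum of basis functions, with the weights estimated from data, can be sketched as an ordinary least-squares fit. This is only an illustration of basis-function approximation in general: the quadratic basis, the sample grid and the helper names `solve_linear` and `fit_weights` are hypothetical and are not taken from the patent, which obtains its weights from the algebraic matrix equation described later.

```python
from typing import Callable, List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_weights(
    basis: Sequence[Callable[[Sequence[float]], float]],
    samples: Sequence[Sequence[float]],
    targets: Sequence[float],
) -> List[float]:
    """Least-squares weights w with sum_j w[j] * basis[j](x) ~ target, via the normal equations."""
    n = len(basis)
    phi = [[b(x) for b in basis] for x in samples]
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(n)] for i in range(n)]
    rhs = [sum(row[i] * t for row, t in zip(phi, targets)) for i in range(n)]
    return solve_linear(gram, rhs)
```

For example, sampling $V(x) = 0.5x_1^2 + 0.2x_1x_2 + 1.5x_2^2$ on a small grid and fitting with the basis $(x_1^2, x_1x_2, x_2^2)$ recovers the three coefficients, because the target lies exactly in the span of the basis and the approximation error is zero.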

For simplicity, a compact form of these quantities is defined [equations missing in source], with which the estimation functions can be written compactly [equations missing in source]. Based on formula (8), the estimate of the optimal control is obtained [equation missing in source], and after a substitution it can be rewritten in a shorter form [equation missing in source].

Substituting the estimation functions into the second-order expansion (14)-(15) or the third-order expansion (14)-(16) yields a second-order or third-order DDP approximation.

In this embodiment of the present specification, after the estimation functions are substituted into the higher-order ordinary differential equation to obtain the higher-order differential dynamic approximation, the weight estimates are determined as follows.

Illustratively, without loss of generality, the second-order DDP approximation is given.

Second-order DDP approximation: based on the second-order DDP expansion (14)-(15), the weight estimates satisfy, on any time interval, an algebraic matrix equation [equation missing in source].

Furthermore, if the persistent excitation (PE) condition is satisfied, that is, if there exist a constant $\rho > 0$ and a plurality of time intervals for which the excitation bound holds [condition missing in source], then the target algebraic matrix equation for computing the weight estimates is obtained [equation missing in source].
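A persistent excitation check of this kind can be sketched as follows. The exact PE inequality is not reproduced in the text, so the sketch assumes one common form: the averaged Gram matrix of the regressor vectors collected over the time intervals, $\frac{1}{l}\sum_k \theta_k \theta_k^T$, must exceed $\rho I$. The names `averaged_gram`, `is_positive_definite` and `satisfies_pe` are illustrative. Positive definiteness of the shifted matrix is tested by attempting a Cholesky factorization, which succeeds exactly when the matrix is positive definite.

```python
import math
from typing import List, Sequence


def averaged_gram(regressors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Average of theta_k * theta_k^T over all collected regressor vectors."""
    n, count = len(regressors[0]), len(regressors)
    return [[sum(t[i] * t[j] for t in regressors) / count for j in range(n)]
            for i in range(n)]


def is_positive_definite(m: List[List[float]]) -> bool:
    """Attempt a Cholesky factorization; it fails iff m is not positive definite."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                chol[i][i] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True


def satisfies_pe(regressors: Sequence[Sequence[float]], rho: float) -> bool:
    """Assumed PE test: averaged Gram matrix minus rho * I is positive definite."""
    gram = averaged_gram(regressors)
    n = len(gram)
    shifted = [[gram[i][j] - (rho if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    return is_positive_definite(shifted)
```

Regressors that sweep all directions, such as $(\cos t, \sin t)$ over a full period, pass the test for a moderate $\rho$, whereas repeated copies of one vector fail it for any $\rho > 0$ because the Gram matrix has rank one and the weights cannot be identified uniquely.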

The iterative processing begins by defining an approximately estimated Hamiltonian and a control input.

Illustratively, the following definitions are given first.

Definition 2: the approximately estimated Hamiltonian is defined as follows [equation missing in source].

Furthermore, the exact weights of the basis functions of the system dynamics $f(x)$ with unknown model are denoted $A^*$, i.e., $f(x) = A^* \Psi(x)$.

Definition 3: a constraint operator is defined for any argument [equation missing in source], where $\gamma$ is the constraint bound of the control input.

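Illustratively, the iterative process is as follows:

S301: select an initial value [symbol missing in source] and a convergence accuracy $\varepsilon > 0$. Let $x^0(t)$, $t \in [t_0, t_f]$, be the state of the system (1) corresponding to the initially given control input, which satisfies the state equation under the initial control [equations missing in source]. Calculate the initial cost function [equation missing in source] and set $i = 0$.

S302: calculate the control input of the current iteration and $\lambda^i$ by solving the corresponding equations [equations missing in source].

S303: calculate $x^{i+1}$ by solving the state equation under the updated control input [equation missing in source], and calculate the cost function [equation missing in source].

S304: calculate the weight estimates based on the target algebraic matrix equation [equations missing in source]. If the first convergence condition is not met [condition missing in source], set $k^i \to 2k^i$ and go to S303. Otherwise, go to S305.

S305: if $J^{i+1} - J^i \ge 0$, set $k^i \to 2k^i$ and go to S302. Otherwise, update the iteration quantities [equation missing in source] and go to S302 until the second convergence condition is met [condition missing in source].

S306: set the resulting cost function [equation missing in source] and control input [equation missing in source] as the output.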
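The rollout and cost evaluation used in S301 and S303 (state trajectory under a given control input, followed by the cost function) can be sketched as below. This is a minimal sketch under stated assumptions: the cost is taken as the integral of a running cost $Q(x) + W(u)$, the integrator is a classical fourth-order Runge-Kutta scheme, and the known `dynamics` callable merely stands in for the physical system, whose trajectories would be measured rather than simulated in the data-driven setting. All names are illustrative.

```python
from typing import Callable, List, Sequence, Tuple

State = List[float]


def rollout_with_cost(
    dynamics: Callable[[State, float], State],    # x_dot = dynamics(x, u), stand-in for the plant
    control: Callable[[float], float],            # open-loop control input u(t)
    running_cost: Callable[[State, float], float],
    x0: Sequence[float],
    t0: float,
    tf: float,
    steps: int = 1000,
) -> Tuple[List[State], float]:
    """Integrate the state with RK4, carrying the accumulated cost as an extra state."""

    def rhs(t: float, z: State) -> State:
        x, u = z[:-1], control(t)
        return dynamics(x, u) + [running_cost(x, u)]

    h = (tf - t0) / steps
    z, t = list(x0) + [0.0], t0
    trajectory = [list(x0)]
    for _ in range(steps):
        k1 = rhs(t, z)
        k2 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)])
        k3 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)])
        k4 = rhs(t + h, [zi + h * ki for zi, ki in zip(z, k3)])
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        t += h
        trajectory.append(z[:-1])
    return trajectory, z[-1]
```

For the scalar test system $\dot{x} = -x + u$ with $u \equiv 0$, $x_0 = 2$ and running cost $x^2 + u^2$ on $[0, 1]$, the returned cost should agree with the analytic value $2(1 - e^{-2})$, which gives a quick sanity check before the routine is used inside an iteration.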

Illustratively, the nonlinear system in this embodiment is selected as follows [equation missing in source]. The basis functions of the system equations are selected [equation missing in source], and the cost function is defined [equation missing in source] with $\gamma = 0.5$ and $r = 1$. The optimal cost function of the system is approximated with a set of basis functions [equation missing in source].

According to step S204, a data-driven differential dynamic programming algorithm is developed. The initial condition of the system is set to $x_0 = [2, 1]^T$, and the constraint of the control input is set to $|u| < 0.5$.

Starting from the initial cost function [equation missing in source], the values of the weight vector are obtained after 17 iterations [values missing in source]. Based on equation (24), the control input of the 17th iteration is then obtained [equation missing in source].

FIG. 4 and FIG. 5 compare the state trajectories of the system under the initial control input and the optimal control input in the present embodiment, and FIG. 6 shows the initial cost function $V_0$ and the optimal cost function $V_{17}$ of the system in the present embodiment.

Compared with the prior art, the embodiment of the specification has the following advantages and effects:

1. The invention expands the HJB partial differential equation into a higher-order ordinary differential equation based on a differential dynamic programming algorithm, and builds a new data-driven model.

2. The method mitigates the curse of dimensionality caused by the heavy computational load, and the algorithm converges quickly.

3. The invention overcomes the technical obstacle posed by the time-varying behavior of the HJB equation under a finite-horizon cost function.

On the basis of the method provided above, the embodiment of the present specification further provides an optimal control apparatus for a nonlinear system with an unknown model. As shown in FIG. 7, the apparatus includes:

a partial differential equation determining module 701, configured to establish an optimal cost function of the nonlinear system with an unknown model and to determine a partial differential equation for solving the optimal cost function;

a higher-order ordinary differential equation determining module 702, configured to expand the partial differential equation according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;

a data-driven model determining module 703, configured to introduce function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system;

and an optimal control module 704, configured to perform iterative processing on the data-driven model according to set constraints to determine the optimal control of the nonlinear system.

The effects obtained by the apparatus are consistent with the beneficial effects of the method and are not repeated here.

Further, the present specification also provides an apparatus for optimal control of a nonlinear system with an unknown model, the apparatus comprising:

an input interface configured to receive a state trajectory of the nonlinear system;

a memory;

a processor configured to perform the method described above and generate control commands;

an output interface configured to send the control commands to actuators of the nonlinear system to control operation of the system.

For example, the internal structure of the device may be as shown in FIG. 8. The device includes a processor, a memory, and a network interface connected by a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements the optimal control method for a nonlinear system with an unknown model described above.

It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of a portion of the structure associated with the present application and does not constitute a limitation of the apparatus to which the present application is applied, and that a particular apparatus may include more or less components than those shown in the drawings, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.

In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the program may carry out the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take a variety of forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum-computing-based data processing logic units, and the like.

It should also be understood that, in the embodiments herein, the term "and/or" merely describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B both exist, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.

Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.

It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.

In the several embodiments provided herein, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical, mechanical, or of another form.

Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the embodiments herein.

Specific examples are set forth herein to illustrate the principles and embodiments herein and are merely illustrative of the methods herein and their core ideas; also, as will be apparent to those of ordinary skill in the art in light of the teachings herein, many variations are possible in the specific embodiments and in the scope of use, and nothing in this specification should be construed as a limitation on the invention.

Claims

1. A method for optimal control of a nonlinear system whose model is unknown, the method comprising:

2. The method of claim 1, wherein establishing an optimal cost function of the system for the nonlinear system whose model is unknown, and determining a partial differential equation for solving the optimal cost function, comprises:

establishing a state equation of the nonlinear system with unknown model, with initial condition $x(t_0) = x_0$, wherein $x$ is the state variable of the system, $u$ is the control input of the system, and the remaining terms of the state equation are the dynamic equation of the system and the system input state equation;

wherein $t \in [t_0, t_f]$, and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$;

3. The method of claim 1, wherein expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation comprises:

wherein the state trajectory is the one obtained from the given initial value, and the corresponding control is defined as the unknown optimal control;

wherein $S_{12} = V_{xx} g$ and $S_{22} = W_{uu}$;

Boundary conditions are

4. The method of claim 1, wherein applying function approximation to the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system comprises:

5. The method of claim 4, wherein, after the estimation function is substituted into the higher-order ordinary differential equation to obtain a higher-order differential dynamic approximation, the method further comprises:

6. The method of claim 4, wherein iteratively processing the data-driven model according to the set constraint to determine the optimal control of the nonlinear system comprises:

defining an approximate estimate of the Hamiltonian and of the control input;

7. The method of claim 6, wherein performing differential dynamic iterative processing on the higher-order differential dynamic approximation, based on the definitions of the approximate Hamiltonian and the control input, to determine the optimal control of the nonlinear system comprises:

8. An optimal control device for a nonlinear system whose model is unknown, said device comprising:

9. An optimal control device for a nonlinear system whose model is unknown, said device comprising:

an input interface configured to receive a state trajectory of the nonlinear system;

a memory;

a processor configured to perform the method of any one of claims 1 to 7 and generate control instructions;

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.