CN111352735A - Data acceleration method, device, storage medium and equipment - Google Patents

Data acceleration method, device, storage medium and equipment

Info

Publication number
CN111352735A
CN111352735A (application CN202010124620.1A)
Authority
CN
China
Prior art keywords
acceleration, units, data, unit, task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010124620.1A
Other languages
Chinese (zh)
Inventor
刘春江 (Liu Chunjiang)
段璞 (Duan Pu)
韩东升 (Han Dongsheng)
Current Assignee
Shanghai University Ding Tech Software Co ltd
Original Assignee
Shanghai University Ding Tech Software Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai University Ding Tech Software Co ltd filed Critical Shanghai University Ding Tech Software Co ltd
Priority to CN202010124620.1A
Publication of CN111352735A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources to service a request
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00: Digital computers in general; data processing equipment in general
    • G06F 15/76: Architectures of general purpose stored program computers
    • G06F 15/78: Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F 15/7867: Architectures comprising a single central processing unit with reconfigurable architecture
    • G06F 15/7871: Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS

Abstract

An embodiment of the invention discloses a data acceleration method, apparatus, storage medium, and device. The method comprises the following steps: acquiring acceleration requirement information, which includes the number of requested acceleration tasks and the acceleration service type corresponding to each acceleration task; determining a configuration strategy for a plurality of acceleration units according to the acceleration requirement information and the state information of the plurality of acceleration units; and controlling the plurality of acceleration units to execute the corresponding acceleration tasks according to the configuration strategy. With this technical scheme, a plurality of acceleration units are provided in the data acceleration device and are dynamically configured according to the number of requested acceleration tasks and the acceleration service type of each task, improving the flexibility of the data acceleration scheme.

Description

Data acceleration method, device, storage medium and equipment
Technical Field
Embodiments of the invention relate to the field of computer technology, and in particular to a data acceleration method, apparatus, storage medium, and device.
Background
With the advent of the big data era, decisions in business, economics, the Internet of Things, and other areas are increasingly made based on data and analytics rather than on experience and intuition. Accurately and efficiently mining potentially useful information from big data to support decision-making has therefore become increasingly important and a focus of attention in the field of data science. The processing rate of a Central Processing Unit (CPU) based on the x86 architecture no longer meets the demands of many businesses; for example, a deep learning model requires a very large amount of data and computing power. The concept of big data acceleration has thus been proposed.
Existing solutions use a Graphics Processing Unit (GPU). Although the GPU performs well for deep learning algorithms, its power consumption is so high that its applicability is limited. In the field of big data acceleration, a Field Programmable Gate Array (FPGA) platform can achieve a high performance-to-power ratio, and cases of FPGA-based big data acceleration are becoming more common. However, when there are multiple service demands, current data acceleration schemes are not flexible enough and need to be improved.
Disclosure of Invention
Embodiments of the invention provide a data acceleration method, apparatus, storage medium, and device, which can improve on existing data acceleration schemes.
In a first aspect, an embodiment of the present invention provides a data acceleration method applied to a data acceleration device, where the data acceleration device includes a plurality of acceleration units. The method includes:
acquiring acceleration requirement information, where the acceleration requirement information includes the number of requested acceleration tasks and the acceleration service type corresponding to each acceleration task;
determining a configuration strategy for the plurality of acceleration units according to the acceleration requirement information and the state information of the plurality of acceleration units; and
controlling the plurality of acceleration units to execute the corresponding acceleration tasks according to the configuration strategy for the acceleration requirement information.
In a second aspect, an embodiment of the present invention provides a data acceleration apparatus integrated in a data acceleration device, where the data acceleration device includes a plurality of acceleration units. The apparatus includes:
an acceleration requirement acquisition module, configured to acquire acceleration requirement information, which includes the number of requested acceleration tasks and the acceleration service type corresponding to each acceleration task;
a configuration strategy determination module, configured to determine a configuration strategy for the plurality of acceleration units according to the acceleration requirement information and the state information of the plurality of acceleration units; and
an acceleration control module, configured to control the plurality of acceleration units to execute the corresponding acceleration tasks according to the configuration strategy for the acceleration requirement information.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the data acceleration method provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a data acceleration device, including a power management unit, a communication interface, a processing unit, a storage unit, a plurality of acceleration units, and a computer program stored on the storage unit and executable on the processing unit;
the communication interface is used to communicate with a server applying for an acceleration task; and
the processing unit, when executing the computer program, implements the data acceleration method provided by the embodiments of the present invention.
According to the data acceleration scheme provided by the embodiments of the invention, acceleration requirement information is obtained, which includes the number of requested acceleration tasks and the acceleration service type corresponding to each task; a configuration strategy for the plurality of acceleration units is determined according to the acceleration requirement information and the state information of the acceleration units; and the acceleration units are controlled to execute the corresponding acceleration tasks according to the configuration strategy. With this technical scheme, a plurality of acceleration units are provided in the data acceleration device and dynamically configured according to the number of requested acceleration tasks and the acceleration service type of each task, which improves the flexibility of the data acceleration scheme.
Drawings
Fig. 1 is a schematic flowchart of a data acceleration method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another data acceleration method according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another data acceleration method according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of an efficiency trade-off algorithm according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of another data acceleration method according to an embodiment of the present invention;
Fig. 6 is a block diagram of a data acceleration apparatus according to an embodiment of the present invention;
Fig. 7 is a block diagram of a data acceleration device according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an acceleration unit according to an embodiment of the present invention.
Detailed Description
The technical solution of the invention is further explained below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit it. It should further be noted that, for convenience of description, the drawings show only some, not all, of the structures related to the present invention.
Before discussing the exemplary embodiments in more detail, it should be noted that some of them are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel or concurrently, and the order of the steps may be rearranged. A process may terminate when its operations are completed, but may also have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Fig. 1 is a flowchart of a data acceleration method according to an embodiment of the present invention. The method may be executed by a data acceleration apparatus, which may be implemented in software and/or hardware and may generally be integrated in a data acceleration device that includes a plurality of acceleration units. As shown in Fig. 1, the method includes:
Step 101: obtain acceleration requirement information, where the acceleration requirement information includes the number of requested acceleration tasks and the acceleration service type corresponding to each acceleration task.
For example, the device that applies for the data acceleration service may be a server, also called a host. In the prior art, a device providing data acceleration service, such as an FPGA accelerator card, is generally built into the server. However, because the reconfiguration interface of the FPGA accelerator card is not open to the user and the internal interfaces of the server host, such as the Peripheral Component Interconnect Express (PCIe) interface, are limited, the flexibility of acceleration service configuration is poor. The data acceleration device in the embodiments of the present invention can be an independent plug-in device that connects to the server without changing the server's original structure. Because it is an independent device, connecting it to the server quickly expands the available acceleration resources without any server hardware upgrade. Optionally, the data acceleration device may communicate with the server through an optical fiber interface to ensure high data transmission efficiency. Illustratively, when the host applies for acceleration, it sends the acceleration service type to the data acceleration device, and the host and the data acceleration device follow an agreed interface protocol. When a host submits a data acceleration service application, the data acceleration device determines the corresponding acceleration tasks according to the application; each acceleration task corresponds to a fixed acceleration service type, and acceleration tasks may also be called acceleration services.
In the embodiments of the present invention, the specific number of acceleration units is not limited and can be set according to actual requirements, flexibly accommodating users' concerns about cost or efficiency. An acceleration unit can be developed based on an FPGA; its specific structure is not limited. An acceleration unit can have a preset acceleration algorithm, and can also load a new acceleration algorithm through dynamic reconfiguration. The acceleration algorithm generally corresponds to an acceleration service type and can be set according to specific requirements. The type and number of preset acceleration algorithms are not limited; there may, for example, be four.
For example, the acceleration requirement information includes the number of requested acceleration tasks and the acceleration service type corresponding to each task (i.e., the type of acceleration service to be performed). The acceleration requirement information may be stored in a preset register of the data acceleration device, referred to in the embodiments of the present invention as the host register because it holds host-related data. Optionally, when the acceleration requirement information needs to be acquired, the host register can be read immediately.
In some embodiments, the acceleration requirement information may be obtained when a new acceleration task application is detected, or when an acceleration task already being executed is detected to have completed. The advantage of this is that the configuration strategy of the plurality of acceleration units is re-determined promptly: no acceleration unit sits idle after a service finishes, no newly applying host is left waiting, and higher efficiency is achieved.
Step 102: determine the configuration strategy of the plurality of acceleration units according to the acceleration requirement information and the state information of the plurality of acceleration units.
Illustratively, the state information may include the number of acceleration units that can participate in acceleration tasks, their position numbers, and the currently supported acceleration service types. Optionally, the state information may be updated through self-checks; self-checking avoids problems in the allocation of acceleration resources caused by a damaged acceleration unit, preparing for acceleration service. For example, the acceleration units may perform a self-check when a power-on reset is detected, or when no data acceleration task is present (i.e., when the acceleration units are idle). Illustratively, when no data acceleration task is detected, the plurality of acceleration units are self-checked based on a preset self-check rule to update their state information. The preset self-check rule may be, for example, a periodic self-check: when all acceleration services have finished and no new acceleration application has been received, an idle timer accumulates automatically, and a self-check is triggered when the timer reaches a threshold. If a new task application is received while the idle timer is accumulating, the timer is immediately cleared. The self-check process is short, so even if a new task application arrives during a self-check, the host's waiting time is short.
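The idle-timer rule described above can be sketched as follows. This is a minimal illustrative model, assuming a tick-based timer and an arbitrary threshold; the patent does not specify the timer granularity, the threshold value, or any of these names.

```python
# Illustrative sketch of the periodic self-check trigger: the idle timer
# accumulates only while no task runs, is cleared by a new application,
# and fires a self-check when it reaches the threshold.
class SelfCheckScheduler:
    def __init__(self, threshold: int):
        self.threshold = threshold
        self.idle_ticks = 0

    def tick(self, busy: bool, new_application: bool) -> bool:
        """Advance one tick; return True when a self-check should run."""
        if busy or new_application:
            self.idle_ticks = 0       # a new task application clears the timer
            return False
        self.idle_ticks += 1
        if self.idle_ticks >= self.threshold:
            self.idle_ticks = 0       # periodic: restart after each self-check
            return True
        return False
```

A unit damaged between self-checks would be caught at the next trigger, which is why the state information is refreshed before any new allocation.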
For example, the plurality of acceleration units in the embodiments of the present invention may all be static acceleration units, may all be dynamic acceleration units, or may include some static and some dynamic acceleration units. The acceleration service type of a static acceleration unit is fixed, while the acceleration service type of a dynamic acceleration unit supports reconfiguration. The configuration strategy of the plurality of acceleration units can be obtained by determining which acceleration unit or units execute each requested acceleration task, according to the acceleration requirement information and the state information. For example, the number of acceleration units assigned to each acceleration task may be decided according to the acceleration service type, and acceleration units may then be allocated to each task according to an efficiency trade-off algorithm. Dynamic acceleration units may need service-type adaptation, whereas static acceleration units can skip the adaptation step.
Step 103: control the plurality of acceleration units to execute the corresponding acceleration tasks according to the configuration strategy for the acceleration requirement information.
Illustratively, the data acceleration device receives the task data for an acceleration task sent by a host, distributes the task data to the corresponding acceleration units according to the configuration strategy, and controls those units to execute the task using the acceleration algorithm corresponding to the acceleration service type.
Optionally, this step may specifically include: partitioning the data corresponding to the acceleration tasks in parallel according to the configuration strategy, and distributing the partitioned data to the acceleration units so that they perform the corresponding acceleration processing in parallel on the received data. Multiple acceleration services are partitioned in parallel; the partitioning method depends on the acceleration service type, and the partitioned data are sent in sequence to the allocated acceleration units for parallel processing. The data of each service is sent to the data acceleration device through the optical fiber interface. Partitioning the service data lets each acceleration unit receive data blocks of the same size, so that all acceleration units allocated to a given acceleration service handle the same data volume, which makes it easy for the data acceleration device to splice the output data of the acceleration units back together.
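The equal-size partitioning and splicing described above can be sketched as follows, assuming byte-oriented task data, ceiling division, and zero padding; the actual block size and padding scheme are not specified by the patent, and all names here are illustrative.

```python
# Hypothetical sketch of the equal-size partitioning described above; the
# block-size and padding choices are assumptions, not taken from the patent.
def partition_for_units(payload: bytes, num_units: int) -> list[bytes]:
    """Split task data into num_units equal-size blocks (zero-padded)."""
    block_size = -(-len(payload) // num_units)  # ceiling division
    padded = payload.ljust(block_size * num_units, b"\x00")
    return [padded[i * block_size:(i + 1) * block_size]
            for i in range(num_units)]

def splice_outputs(blocks: list[bytes], original_len: int) -> bytes:
    """Recombine per-unit outputs and strip the padding."""
    return b"".join(blocks)[:original_len]

data = b"example task data for service A"
blocks = partition_for_units(data, 4)
assert len({len(b) for b in blocks}) == 1          # all blocks same size
assert splice_outputs(blocks, len(data)) == data   # splicing recovers input
```

Because every unit assigned to a service receives the same data volume, the splicing step reduces to simple concatenation in unit order.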
According to the data acceleration scheme provided by the embodiments of the invention, acceleration requirement information is obtained, which includes the number of requested acceleration tasks and the acceleration service type corresponding to each task; a configuration strategy for the plurality of acceleration units is determined according to the acceleration requirement information and the state information of the acceleration units; and the acceleration units are controlled to execute the corresponding acceleration tasks according to the configuration strategy. With this technical scheme, a plurality of acceleration units are provided in the data acceleration device and dynamically configured according to the number of requested acceleration tasks and the acceleration service type of each task, which improves the flexibility of the data acceleration scheme.
Fig. 2 is a schematic flowchart of another data acceleration method according to an embodiment of the present invention, optimized on the basis of the optional embodiments above.
Illustratively, the plurality of acceleration units include a plurality of static acceleration units and at least one dynamic acceleration unit; the acceleration service types of the static acceleration units are fixed, and the state information includes the total number of acceleration units capable of participating in acceleration. Determining the configuration strategy of the plurality of acceleration units according to the acceleration requirement information and the state information includes: acquiring a preset weight factor for the acceleration service type of each acceleration task, where the value of the weight factor is related to the complexity of the corresponding acceleration service type; and allocating to each acceleration task a corresponding set of acceleration units according to the preset weight factors and the total number, where each set includes at least one static acceleration unit. The advantage of this arrangement is that acceleration units can be allocated efficiently, pursuing the acceleration efficiency of each host to the greatest extent while reasonably handling the competition of multiple hosts for shared acceleration resources.
Optionally, the method includes:
step 201, when a new acceleration task application is detected or when the completion of an already performed acceleration task is detected, acquiring acceleration demand information.
The acceleration requirement information includes the number of the applied acceleration tasks and the acceleration service type corresponding to each acceleration task. When the completion of the executed acceleration task is detected, the acceleration demand information is acquired, the allocation scheme of the acceleration unit can still refer to the value of the host register, and allocation is performed under the guidance of the efficiency balancing algorithm, that is, allocation of the acceleration unit is performed again for the uncompleted acceleration task of the application form.
Step 202: acquire the preset weight factor of the acceleration service type corresponding to each acceleration task, where the value of the weight factor is related to the complexity of the corresponding acceleration service type.
Illustratively, the core of the efficiency trade-off algorithm in the embodiments of the present invention is static reservation plus dynamic scheduling. Its principle is to select a proportion according to the running frequency of each acceleration service and to divide all acceleration units into static and dynamic acceleration units accordingly. A fixed static acceleration unit can be pre-allocated to each acceleration service; static units can respond quickly and execute a new acceleration service, while dynamic units can rapidly expand the hardware resources available to a service through reconfiguration. Optionally, the running frequency of each acceleration service can be learned statistically from previously processed services: when multiple hosts accelerate simultaneously, each host runs many services, and statistics reveal that some services run more often than others, which gives the running frequency of each service. The efficiency trade-off algorithm assigns weight factors to the different acceleration services; the size of a weight factor is related to the complexity of the service. The longer a single acceleration unit takes to complete an acceleration task, the larger the weight factor, and the number of acceleration units allocated to each service depends on its weight factor. The specific values of the weight factors may be adjusted dynamically during use.
Step 203: allocate to each acceleration task a corresponding set of acceleration units according to the preset weight factors and the total number of acceleration units capable of participating in acceleration, obtaining the configuration strategy of the plurality of acceleration units, where each set includes at least one static acceleration unit.
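A minimal sketch of a proportional allocation consistent with step 203, assuming the available units are split in proportion to the weight factors with rounding leftovers handed to the heaviest services first; the patent only states that unit counts are related to the weight factors, so this exact scheme is an assumption.

```python
# Illustrative weight-factor allocation: split free_units among tasks in
# proportion to their weight factors (larger factor = more complex service).
def allocate_units(tasks: dict[str, float], free_units: int) -> dict[str, int]:
    total_weight = sum(tasks.values())
    counts = {name: int(free_units * w / total_weight)
              for name, w in tasks.items()}
    # Hand out any units lost to rounding, heaviest tasks first.
    leftover = free_units - sum(counts.values())
    for name in sorted(tasks, key=tasks.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts
```

With the weight factors 0.1 and 0.2 used in the later example, 12 dynamic units would split 4:8, matching the 1:2 ratio described there.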
For example, for a first acceleration task, if the set of acceleration units allocated to it includes a dynamic acceleration unit, and the acceleration service type currently supported by that dynamic unit does not match the first acceleration task, the dynamic unit is reconfigured. The first acceleration task may be any one of the requested acceleration tasks. To meet the acceleration requirements of the various services, an acceleration unit allocated to an acceleration task must be adapted to the task's acceleration service type before it starts working. Adaptation mainly checks whether the service type currently supported by the unit matches the required acceleration service type; if not, the unit is immediately reconfigured. Because the efficiency trade-off algorithm divides the units into static and dynamic ones, and every acceleration service has a corresponding static unit, a static unit allocated to a service is guaranteed to support it and can skip adaptation, whereas a dynamic unit must perform the adaptation.
Optionally, a dynamic acceleration unit includes an FPGA and a Complex Programmable Logic Device (CPLD). Reconfiguring the dynamic acceleration unit includes: acquiring, through the CPLD in the dynamic unit, the configuration file for the acceleration service type of the first acceleration task, and loading the configuration file into the corresponding FPGA internal storage space, thereby reconfiguring the dynamic unit. The advantage of this is that the reconfiguration can be completed quickly. Specifically, if no locally stored configuration file meets the new service requirement, the FPGA receives the configuration data through the PCIe interface and forwards it to the CPLD; the CPLD stores the configuration data in external RAM and FLASH and then independently completes the reconfiguration of the FPGA. If a locally stored configuration file does meet the requirement, the CPLD reads the appropriate configuration file from FLASH and reconfigures the FPGA.
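The adaptation-and-reconfiguration decision described above can be sketched as follows. The class and method names are illustrative, and the FLASH cache and PCIe fetch are modeled abstractly; only the decision flow (skip if the supported type matches, fetch over PCIe if not cached, otherwise load from FLASH) follows the text.

```python
# Hypothetical model of a dynamic acceleration unit's adaptation step.
class DynamicUnit:
    def __init__(self, flash_files: dict[str, bytes], supported: str):
        self.flash = flash_files      # service type -> bitstream cached in FLASH
        self.supported = supported    # service type currently loaded

    def adapt(self, service: str, fetch_over_pcie) -> bool:
        """Reconfigure for `service` if needed; return True if reconfigured."""
        if self.supported == service:
            return False              # already matches: skip adaptation
        if service not in self.flash:
            # New bitstream arrives over PCIe and is cached in FLASH by the CPLD.
            self.flash[service] = fetch_over_pcie(service)
        self._load_bitstream(self.flash[service])
        self.supported = service
        return True

    def _load_bitstream(self, bitstream: bytes) -> None:
        pass                          # stands in for the CPLD writing the FPGA
```

A static unit corresponds to the fast path in which `adapt` returns immediately, since its supported type is fixed to its service.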
Step 204: partition the data corresponding to the acceleration tasks in parallel according to the configuration strategy, and distribute the partitioned data to the plurality of acceleration units so that they perform the corresponding acceleration processing in parallel on the received data.
The data acceleration method provided by this embodiment of the invention sets up static and dynamic acceleration units based on the idea of static reservation and dynamic scheduling. The complexity of each acceleration service type is fully considered when configuring the acceleration units, the units are allocated reasonably, the various requirements of acceleration services can be met through dynamic reconfiguration, hardware resources are fully utilized, idle acceleration units are minimized, and both the flexibility and the efficiency of acceleration are improved.
Fig. 3 is a flowchart of another data acceleration method provided in an embodiment of the present invention. For ease of description, a big data acceleration service is taken as an example. Assume there are a service A corresponding to host A, a service B corresponding to host B, a service C corresponding to host C, and a service D corresponding to host D; the weight factors of the services are 0.1, 0.3, 0.2, and 0.4 respectively; the execution frequencies of the services are the same by default; one static acceleration unit is reserved for each service; and the remaining 12 acceleration units are all dynamic. The method in this embodiment may be executed by a processing unit (also called the resource scheduling processing unit) in the data acceleration device. As shown in Fig. 3, the method may include:
and 301, periodically performing self-check on the acceleration unit in the data acceleration equipment when no acceleration task exists.
Before providing acceleration service for a host, the resource scheduling processing unit self-checks the acceleration units; besides the self-check triggered by a detected power-on reset, it also self-checks periodically in the idle state, i.e., when there is no acceleration task. When all acceleration services have finished and no new acceleration application has been received, the idle timer accumulates automatically, and a self-check is triggered when the timer reaches a threshold. The self-check mainly obtains the number of acceleration units that can participate in acceleration tasks, their position numbers, the currently supported acceleration service types, and other information. In this example the self-check finds 16 acceleration units available, with position numbers 0-15. The units at positions 0-3 support acceleration of services A, B, C, and D respectively; the units at positions 4-6 support service A, positions 7-9 service B, positions 10-12 service C, and positions 13-15 service D. Positions 0-3 are static acceleration units, and positions 4-15 are dynamic acceleration units.
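The self-check result above can be written out as a simple in-memory table; the dictionary layout is illustrative, not a format defined by the patent.

```python
# Position -> {static?, supported service} for the 16 units found by the
# self-check: positions 0-3 are static (A, B, C, D), 4-6 support A,
# 7-9 support B, 10-12 support C, and 13-15 support D.
unit_state = {pos: {"static": pos < 4, "supports": svc}
              for pos, svc in enumerate("ABCD" + "AAA" + "BBB" + "CCC" + "DDD")}

assert len(unit_state) == 16
assert [p for p, s in unit_state.items() if s["static"]] == [0, 1, 2, 3]
assert all(unit_state[p]["supports"] == "A" for p in (4, 5, 6))
assert all(unit_state[p]["supports"] == "D" for p in (13, 14, 15))
```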
Step 302, refreshing the host registers, and determining the number of units that can participate in acceleration according to the acceleration service types.
The hosts and the data acceleration device of the embodiment of the present invention follow an agreed interface protocol. The resource scheduling processing unit receives acceleration applications from host A and host C through the optical fiber interface. After the host registers are read, the number M of hosts that have applied for acceleration is 2, and the services to be accelerated are service A and service C, so the total number of acceleration units that can participate in service A and service C is 14.
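The count of 14 can be reproduced by excluding the static units fixed to services that have not applied (positions 1 and 3 here). A minimal sketch under an assumed unit-table representation:

```python
def eligible_units(units, applied_services):
    """A unit can participate if it is dynamic (reconfigurable to any service)
    or if it is a static unit fixed to one of the applied services."""
    return [u for u in units
            if u["kind"] == "dynamic" or u["service"] in applied_services]

# 16 units: static units at positions 0-3 fixed to A, B, C, D; 4-15 dynamic.
units = ([{"pos": p, "kind": "static", "service": s}
          for p, s in enumerate("ABCD")] +
         [{"pos": p, "kind": "dynamic", "service": None} for p in range(4, 16)])

print(len(eligible_units(units, {"A", "C"})))  # 14
```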
Step 303, allocating the acceleration units to each acceleration service according to an efficiency trade-off algorithm.
Fig. 4 is a schematic diagram of an efficiency trade-off algorithm provided in an embodiment of the present invention. As shown in Fig. 4, the core idea is as follows. The algorithm starts in a waiting state. When the resource scheduling processing unit detects a new service application or the end of a service, it first identifies the static and dynamic acceleration units, reserves static or dynamic units according to the algorithm, and configures the dynamic units according to an efficiency-first principle; once scheduling is complete, the algorithm returns to the waiting state. After a period of time, the algorithm counts the running frequency of each service and decides, based on those frequencies, whether to change the allocation scheme of the static and dynamic units. The algorithm pursues the greatest possible acceleration efficiency for each host while reasonably handling the competition of multiple hosts for the shared acceleration resources. In this example, the acceleration units at positions 1 and 3 do not participate in any acceleration task and stand ready for new acceleration applications at any time; the unit at position 0 participates in accelerating service A and the unit at position 2 in accelerating service C. The remaining acceleration units are distributed between service A and service C in the ratio 1:2, so 5 acceleration units in total participate in accelerating service A and 9 in accelerating service C: the units at positions 4-7 are assigned to service A and the units at positions 8-15 to service C.
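The 1:2 split of the 12 dynamic units follows the weight factors of the applied services (0.1 and 0.2). One way to compute such a split is a largest-remainder proportional division; this sketch is an assumption about how the proportion could be computed, not the patent's own implementation:

```python
def split_dynamic_units(n_dynamic, weights):
    """Divide n_dynamic units among services in proportion to their weight
    factors, with largest-remainder rounding so the counts sum to n_dynamic."""
    total = sum(weights.values())
    quotas = {s: n_dynamic * w / total for s, w in weights.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = n_dynamic - sum(alloc.values())
    # hand leftover units to the services with the largest fractional parts
    for s in sorted(quotas, key=lambda s: quotas[s] - alloc[s],
                    reverse=True)[:leftover]:
        alloc[s] += 1
    return alloc

print(split_dynamic_units(12, {"A": 0.1, "C": 0.2}))  # {'A': 4, 'C': 8}
```

With the static units at positions 0 and 2 added, this yields the 5 and 9 units of the example.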
Step 304, the acceleration unit performs service adaptation.
Adaptation mainly checks whether the service type currently supported by each acceleration unit matches the acceleration service assigned to it; if not, the unit is immediately reconfigured to meet the requirement. The units at positions 0 and 2 are static acceleration units, so adaptation can be skipped for them; the units at positions 4-15 need adaptation. During adaptation, the units at positions 4-6 and 10-12 need no reconfiguration, the unit at position 7 is reconfigured to support service A, and the units at positions 8-9 and 13-15 are reconfigured to support service C.
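Step 304 amounts to comparing each unit's currently supported service with its assignment; a sketch using the position numbers of this example (the table representation is an assumption):

```python
def adapt(assignments, supported):
    """assignments: {position: required service}; supported: {position:
    currently loaded service}. Returns positions needing reconfiguration."""
    return sorted(p for p, svc in assignments.items() if supported[p] != svc)

# Services loaded after the self-check of step 301 (dynamic units only).
supported = {4: "A", 5: "A", 6: "A", 7: "B", 8: "B", 9: "B",
             10: "C", 11: "C", 12: "C", 13: "D", 14: "D", 15: "D"}
# Assignments from step 303: positions 4-7 for A, 8-15 for C.
need = {**{p: "A" for p in range(4, 8)}, **{p: "C" for p in range(8, 16)}}

print(adapt(need, supported))  # [7, 8, 9, 13, 14, 15]
```

The result matches the text: position 7 and positions 8-9 and 13-15 are the only units reconfigured.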
Step 305, the acceleration unit processes the service in parallel.
The resource scheduling processing unit segments the data of service A and service C in parallel, and the segmented data are sent in turn to the allocated acceleration units for parallel processing. The data of each service reach the resource scheduling processing unit through the optical fiber interface. The segmentation ensures that every acceleration unit receives data blocks of the same size, so each unit allocated to service A carries the same processing load, and likewise for service C. The resource scheduling processing unit receives the returned data from the units participating in service A and service C in parallel, and then splices the returned data back together in time-sequence order.
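The equal-block segmentation and time-ordered splicing can be illustrated as follows; the block size and the round-robin lane scheme are assumptions for illustration:

```python
def segment(data, n_units, block):
    """Cut data into equal-sized blocks and deal them round-robin to n_units
    per-unit work queues ("lanes")."""
    blocks = [data[i:i + block] for i in range(0, len(data), block)]
    return [blocks[i::n_units] for i in range(n_units)]

def splice(lanes):
    """Interleave the per-unit results back into original sequence order."""
    out = []
    for i in range(max(len(lane) for lane in lanes)):
        for lane in lanes:
            if i < len(lane):
                out.append(lane[i])
    return b"".join(out)

lanes = segment(b"abcdefgh", 2, 2)   # [[b'ab', b'ef'], [b'cd', b'gh']]
print(splice(lanes))                  # b'abcdefgh': original order restored
```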
Step 306, detecting that a service is finished or a new service application is received, and re-allocating the acceleration units.
The allocation scheme is not fixed until service A and service C finish; if the resource scheduling processing unit detects that service B applies for acceleration, it immediately re-plans the allocation of the acceleration units. The number M of hosts that have applied for acceleration changes from 2 to 3. To reduce the waiting time of host B, the acceleration unit at position 1 joins the acceleration of service B immediately, while before the next data segmentation the numbers of dynamic acceleration units participating in service A and service C are adjusted to 2 and 4, i.e., reduced by 2 and 4 respectively. The acceleration units at positions 6-7 and 12-15 are assigned to service B, and after these units complete adaptation, the data segmentation of service B is adjusted so that its data are distributed over 7 acceleration units.
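The re-planning can be pictured as a set difference between the old and new ownership of the dynamic units. The position numbers follow the example above, while the set representation is an assumption:

```python
# Dynamic units before service B applies (ratio A:C = 1:2 over 12 units) ...
old = {"A": set(range(4, 8)), "C": set(range(8, 16))}
# ... and after (ratio A:B:C = 0.1:0.3:0.2 = 1:3:2, i.e. 2, 6, and 4 units).
new = {"A": set(range(4, 6)), "B": {6, 7, 12, 13, 14, 15},
       "C": set(range(8, 12))}

# Units whose owner changed must be re-adapted before the next data split.
to_reconfigure = {s: new[s] - old.get(s, set()) for s in new}
print(sorted(to_reconfigure["B"]))  # [6, 7, 12, 13, 14, 15]
```

Only the six units handed to service B need reconfiguration; the units kept by A and C continue unchanged.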
According to the data acceleration method provided by the embodiment of the present invention, the acceleration units can be dynamically reconfigured to meet a variety of acceleration service requirements, so hardware changes can be avoided when new service requirements arise. When the end of a service or a new service application is detected, the acceleration units are redistributed, which makes full use of the hardware resources and minimizes acceleration-unit idle time. An efficiency trade-off algorithm is used when allocating the acceleration units so that users obtain the best acceleration experience, improving both the flexibility and the efficiency of acceleration.
Fig. 5 is a schematic flowchart of another data acceleration method according to an embodiment of the present invention, and the method can be further understood with reference to Fig. 5. The host registers are refreshed to obtain the number M of hosts that have applied for acceleration. If M is 0, the device is in the idle state and the acceleration units in the device are self-checked periodically. If M is not 0, the number n of acceleration units that can participate in the acceleration tasks is determined according to the acceleration service types, the acceleration units are allocated to each acceleration task according to the efficiency trade-off algorithm, and after the units are adapted the services are processed in parallel. M is decremented when a service finishes and incremented when a new host applies for acceleration, and the flow then returns to checking whether M is 0.
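The loop around the counter M in Fig. 5 can be condensed into a toy step function; the event encoding is an illustrative assumption:

```python
def step(m, events):
    """m: number of hosts currently applying for acceleration; events: host
    register updates, each ('apply', host) or ('finish', host)."""
    for kind, _host in events:
        m += 1 if kind == "apply" else -1
    action = ("idle: periodic self-check" if m == 0
              else f"allocate units for {m} host(s)")
    return m, action

print(step(0, [("apply", "A"), ("apply", "C")]))   # M goes 0 -> 2, allocate
print(step(2, [("apply", "B")]))                    # M goes 2 -> 3, re-plan
print(step(3, [("finish", "A"), ("finish", "B"), ("finish", "C")]))  # idle
```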
Fig. 6 is a block diagram of a data acceleration apparatus according to an embodiment of the present invention. The apparatus may be implemented by software and/or hardware, is typically integrated in a data acceleration device, and performs data acceleration by executing a data acceleration method. As shown in Fig. 6, the apparatus includes:
an acceleration requirement obtaining module 601, configured to obtain acceleration requirement information, where the acceleration requirement information includes the number of requested acceleration tasks and an acceleration service type corresponding to each acceleration task;
a configuration policy determining module 602, configured to determine configuration policies of the plurality of acceleration units according to the acceleration demand information and the state information of the plurality of acceleration units;
an acceleration control module 603, configured to control the plurality of acceleration units to execute corresponding acceleration tasks according to the configuration policy for the acceleration requirement information.
The data acceleration apparatus provided in the embodiment of the present invention obtains acceleration demand information, where the acceleration demand information includes the number of applied acceleration tasks and the acceleration service type corresponding to each acceleration task; determines the configuration policies of the plurality of acceleration units according to the acceleration demand information and the state information of the plurality of acceleration units; and controls the plurality of acceleration units to execute the corresponding acceleration tasks for the acceleration demand information according to the configuration policies. With this technical solution, a plurality of acceleration units are arranged in the data acceleration device and are dynamically configured according to the number of applied acceleration tasks and the acceleration service type corresponding to each task, which improves the flexibility of the data acceleration scheme.
Optionally, the acquiring the acceleration requirement information includes:
Acquiring acceleration demand information when a new acceleration task application is detected and/or an ongoing acceleration task is detected to have completed.
Optionally, the apparatus further comprises:
A self-checking module, configured to self-check the plurality of acceleration units based on a preset self-check rule when no data acceleration task exists, so as to update the state information of the plurality of acceleration units.
Optionally, the plurality of acceleration units include a plurality of static acceleration units and at least one dynamic acceleration unit, acceleration service types corresponding to the plurality of static acceleration units are fixed, and the state information includes a total number of acceleration units that can participate in acceleration;
the determining the configuration strategies of the plurality of acceleration units according to the acceleration demand information and the state information of the plurality of acceleration units comprises:
acquiring a preset weight factor of an acceleration service type corresponding to each acceleration task, wherein the numerical value of the preset weight factor is related to the complexity of the corresponding acceleration service type;
and distributing a corresponding acceleration unit set for each acceleration task according to the preset weight factor and the total number to obtain a configuration strategy of the multiple acceleration units, wherein the acceleration unit set comprises at least one static acceleration unit.
Optionally, the apparatus further comprises:
A configuration module, configured to, after the corresponding set of acceleration units is allocated to each acceleration task according to the preset weight factor and the total number: for a first acceleration task, if the set of acceleration units corresponding to the first acceleration task includes a dynamic acceleration unit, reconfigure the included dynamic acceleration unit when it is determined that the acceleration service type currently supported by that unit does not match the first acceleration task.
Optionally, the dynamic acceleration unit includes a field programmable gate array FPGA and a complex programmable logic device CPLD;
the reconfiguring the included dynamic acceleration unit comprises:
Acquiring, through the CPLD in the included dynamic acceleration unit, a configuration file of the acceleration service type corresponding to the first acceleration task, and loading the configuration file into the corresponding FPGA internal storage space to realize the reconfiguration of the included dynamic acceleration unit.
Optionally, the controlling, according to the configuration policy, the plurality of acceleration units to execute corresponding acceleration tasks according to the acceleration requirement information includes:
Segmenting, in parallel and according to the configuration strategy, the data corresponding to the acceleration tasks, and distributing the segmented data to the corresponding acceleration units so as to instruct the acceleration units to perform the corresponding acceleration processing in parallel based on the received data.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a data acceleration method, the method comprising:
acquiring acceleration demand information, wherein the acceleration demand information comprises the number of applied acceleration tasks and the acceleration service type corresponding to each acceleration task;
determining configuration strategies of the plurality of acceleration units according to the acceleration demand information and the state information of the plurality of acceleration units;
and controlling the plurality of acceleration units to execute corresponding acceleration tasks according to the configuration strategy aiming at the acceleration demand information.
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROMs, floppy disks, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory and magnetic media (e.g., a hard disk) or optical storage; registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network (such as the Internet). The second computer system may provide program instructions to the first computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the data acceleration operation described above, and may also perform related operations in the data acceleration method provided by any embodiment of the present invention.
An embodiment of the present invention provides a data acceleration device, and the data acceleration apparatus provided by the embodiment of the present invention may be integrated in the data acceleration device. The data acceleration device comprises a power management unit, a communication interface, a processing unit, a storage unit, a plurality of acceleration units, and a computer program stored on the storage unit and executable on the processing unit. The communication interface is used to communicate with servers applying for acceleration tasks; when the processing unit executes the computer program, the data acceleration method according to the embodiment of the present invention is implemented.
The data acceleration equipment provided by the embodiment of the invention is provided with a plurality of acceleration units, and the plurality of acceleration units are dynamically configured according to the number of the applied acceleration tasks and the acceleration service type corresponding to each acceleration task, so that the flexibility of a data acceleration scheme is improved.
Fig. 7 is a block diagram of a data acceleration device according to an exemplary embodiment of the present invention. The data acceleration device may include: a power management unit 701, an air cooling unit 702, an optical fiber interface 703, a resource scheduling processing unit 704, storage units, a PCIe interface 707, and acceleration units 708, where the resource scheduling processing unit 704 is configured to execute the data acceleration method according to the embodiment of the present invention.
Specifically, the power management unit 701 may include an AC-DC circuit, a DC-DC circuit, and a power filter circuit, and mainly provides the power characteristics required for the operation of the device. The air cooling unit 702 dissipates the heat of the high-power components. The optical fiber interface 703 enables data exchange with the hosts. The resource scheduling processing unit 704 can be developed on an FPGA platform, fully exploiting the advantages of parallel processing to pursue high-speed data processing; it mainly handles the scheduling of the acceleration units, the segmentation of the service data to be accelerated, the splicing of the processed data, and so on. The storage units are mainly a DDR memory 705 and a FLASH 706, which provide data caching and data records for the resource scheduling processing unit 704, respectively. The PCIe interface 707 implements data exchange between the resource scheduling processing unit 704 and the acceleration units 708. The acceleration units 708 implement the acceleration of the different acceleration services. The whole device is light and convenient, can serve the acceleration needs of four hosts simultaneously, and can be scaled up proportionally for schemes with more hosts.
Fig. 8 is a block diagram of the structure of an acceleration unit according to an embodiment of the present invention. As shown in Fig. 8, the acceleration unit is developed on an FPGA and supports dynamic reconfiguration. An acceleration algorithm designed around a parallel processing mechanism is the core of the acceleration unit; the acceleration algorithm and the acceleration service must be adapted to each other, and different acceleration services need different acceleration algorithms. To meet the acceleration requirements of various services, four acceleration algorithms are stored locally in the acceleration unit, and a new acceleration algorithm can be loaded through dynamic reconfiguration. The reconfiguration of the FPGA is completed through a CPLD. If no locally stored configuration file matches the new service requirement, the FPGA first receives the configuration data through the PCIe interface and forwards them to the CPLD; the CPLD stores the configuration data to external RAM and FLASH, and finally the CPLD independently completes the reconfiguration of the FPGA. If a locally stored configuration file already meets the requirement, the CPLD reads the suitable configuration file from FLASH and then reconfigures the FPGA. In addition, the number of acceleration units loaded in the device can be changed freely within a certain range, so the device can flexibly accommodate users' priorities of cost or performance. A change in the number of acceleration units has no impact on the resource scheduling processing unit, which responds freely to such changes, fully meeting users' usage requirements.
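The two reconfiguration paths (a locally stored configuration file that fits, versus a download over PCIe that is stored to FLASH first) can be sketched as follows; the function names and the dictionary model of the FLASH are assumptions for illustration:

```python
def reconfigure(required_service, flash, fetch_over_pcie):
    """flash: {service: bitstream} of locally stored configuration files;
    fetch_over_pcie: callable returning a new bitstream for a service.
    Returns the load command the CPLD would carry out on the FPGA."""
    if required_service not in flash:
        # no suitable local file: download over PCIe and store to FLASH first
        flash[required_service] = fetch_over_pcie(required_service)
    # a suitable file is now in FLASH: the CPLD loads it into the FPGA
    return ("load", required_service, flash[required_service])

flash = {"A": b"bit-A", "B": b"bit-B", "C": b"bit-C", "D": b"bit-D"}
print(reconfigure("E", flash, lambda s: b"bit-" + s.encode()))
# a new algorithm is fetched, stored, then loaded: ('load', 'E', b'bit-E')
```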
The data acceleration device, the storage medium and the data acceleration apparatus provided in the above embodiments may execute the data acceleration method provided in any embodiment of the present invention, and have corresponding functional modules and beneficial effects for executing the method. For technical details that are not described in detail in the above embodiments, reference may be made to a data acceleration method provided in any embodiment of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A data acceleration method is applied to a data acceleration device, wherein the data acceleration device comprises a plurality of acceleration units, and the method comprises the following steps:
acquiring acceleration demand information, wherein the acceleration demand information comprises the number of applied acceleration tasks and the acceleration service type corresponding to each acceleration task;
determining configuration strategies of the plurality of acceleration units according to the acceleration demand information and the state information of the plurality of acceleration units;
and controlling the plurality of acceleration units to execute corresponding acceleration tasks according to the configuration strategy aiming at the acceleration demand information.
2. The method of claim 1, wherein the obtaining acceleration demand information comprises:
acquiring acceleration demand information when a new acceleration task application is detected and/or an ongoing acceleration task is detected to have completed.
3. The method of claim 1, further comprising:
and when detecting that no data acceleration task exists, performing self-checking on the plurality of acceleration units based on a preset self-checking rule so as to update the state information of the plurality of acceleration units.
4. The method according to claim 1, wherein the plurality of acceleration units include a plurality of static acceleration units and at least one dynamic acceleration unit, acceleration service types corresponding to the plurality of static acceleration units are fixed, and the state information includes a total number of acceleration units capable of participating in acceleration;
the determining the configuration strategies of the plurality of acceleration units according to the acceleration demand information and the state information of the plurality of acceleration units comprises:
acquiring a preset weight factor of an acceleration service type corresponding to each acceleration task, wherein the numerical value of the preset weight factor is related to the complexity of the corresponding acceleration service type;
and distributing a corresponding acceleration unit set for each acceleration task according to the preset weight factor and the total number to obtain a configuration strategy of the multiple acceleration units, wherein the acceleration unit set comprises at least one static acceleration unit.
5. The method of claim 4, further comprising, after said assigning a corresponding set of acceleration units to each of said acceleration tasks according to said preset weighting factor and said total number:
for a first acceleration task, if the set of acceleration units corresponding to the first acceleration task includes a dynamic acceleration unit, reconfiguring the included dynamic acceleration unit when it is determined that the acceleration service type currently supported by that unit does not match the first acceleration task.
6. The method according to claim 5, wherein the dynamic acceleration unit comprises a Field Programmable Gate Array (FPGA) and a Complex Programmable Logic Device (CPLD);
the reconfiguring the included dynamic acceleration unit comprises:
acquiring, through the CPLD in the included dynamic acceleration unit, a configuration file of the acceleration service type corresponding to the first acceleration task, and loading the configuration file into the corresponding FPGA internal storage space to realize the reconfiguration of the included dynamic acceleration unit.
7. The method according to claim 1, wherein said controlling the plurality of acceleration units to execute the corresponding acceleration tasks according to the configuration policy with respect to the acceleration requirement information comprises:
segmenting, in parallel and according to the configuration strategy, the data corresponding to the acceleration tasks, and distributing the segmented data to the corresponding acceleration units so as to instruct the acceleration units to perform the corresponding acceleration processing in parallel based on the received data.
8. A data acceleration apparatus integrated into a data acceleration device, the data acceleration device including a plurality of acceleration units therein, the apparatus comprising:
the system comprises an acceleration demand acquisition module, a service processing module and a service processing module, wherein the acceleration demand acquisition module is used for acquiring acceleration demand information which comprises the number of applied acceleration tasks and the type of an acceleration service corresponding to each acceleration task;
the configuration strategy determining module is used for determining the configuration strategies of the plurality of accelerating units according to the accelerating demand information and the state information of the plurality of accelerating units;
and the acceleration control module is used for controlling the plurality of acceleration units to execute corresponding acceleration tasks according to the configuration strategy aiming at the acceleration demand information.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. A data acceleration apparatus, comprising a power management unit, a communication interface, a processing unit, a storage unit, a plurality of acceleration units, and a computer program stored on the storage unit and executable on the processing unit;
the communication interface is used for communicating with a server applying for an acceleration task;
the processing unit, when executing the computer program, implements the method of any of claims 1-7.
CN202010124620.1A 2020-02-27 2020-02-27 Data acceleration method, device, storage medium and equipment Pending CN111352735A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010124620.1A CN111352735A (en) 2020-02-27 2020-02-27 Data acceleration method, device, storage medium and equipment


Publications (1)

Publication Number Publication Date
CN111352735A true CN111352735A (en) 2020-06-30

Family

ID=71192491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010124620.1A Pending CN111352735A (en) 2020-02-27 2020-02-27 Data acceleration method, device, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN111352735A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112822051A (en) * 2021-01-06 2021-05-18 贵阳迅游网络科技有限公司 Service acceleration method based on service perception
CN113703976A (en) * 2021-08-27 2021-11-26 苏州浪潮智能科技有限公司 FPGA resource allocation method, device, equipment and readable storage medium
CN114301756A (en) * 2021-12-23 2022-04-08 广州亿电邦科智能网络科技有限公司 IOT equipment management system, method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105577801A (en) * 2014-12-31 2016-05-11 华为技术有限公司 Business acceleration method and device
CN106445876A (en) * 2015-08-13 2017-02-22 阿尔特拉公司 Application-based dynamic heterogeneous many-core systems and methods
CN107102824A (en) * 2017-05-26 2017-08-29 华中科技大学 A kind of Hadoop isomery method and systems based on storage and acceleration optimization
CN107273331A (en) * 2017-06-30 2017-10-20 山东超越数控电子有限公司 A kind of heterogeneous computing system and method based on CPU+GPU+FPGA frameworks
US20180052709A1 (en) * 2016-08-19 2018-02-22 International Business Machines Corporation Dynamic usage balance of central processing units and accelerators
CN110231986A (en) * 2019-06-18 2019-09-13 北京邮电大学 Dynamic based on more FPGA reconfigurable multi-task scheduling and laying method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG YIMIN: "E-Government Planning and Design (《电子政务规划与设计》)", Chinese Academy of Governance Press, page 212 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112822051A (en) * 2021-01-06 2021-05-18 贵阳迅游网络科技有限公司 Service acceleration method based on service perception
CN112822051B (en) * 2021-01-06 2022-09-16 贵阳迅游网络科技有限公司 Service acceleration method based on service perception
CN113703976A (en) * 2021-08-27 2021-11-26 苏州浪潮智能科技有限公司 FPGA resource allocation method, device, equipment and readable storage medium
CN113703976B (en) * 2021-08-27 2023-05-19 苏州浪潮智能科技有限公司 FPGA resource allocation method, device, equipment and readable storage medium
CN114301756A (en) * 2021-12-23 2022-04-08 广州亿电邦科智能网络科技有限公司 IOT equipment management system, method and device

Similar Documents

Publication Publication Date Title
CN107690622B9 (en) Method, equipment and system for realizing hardware acceleration processing
CN102958166A (en) Resource allocation method and resource management platform
US9619263B2 (en) Using cooperative greedy ballooning to reduce second level paging activity
KR101553649B1 (en) Multicore apparatus and job scheduling method thereof
CN111352735A (en) Data acceleration method, device, storage medium and equipment
US20130167152A1 (en) Multi-core-based computing apparatus having hierarchical scheduler and hierarchical scheduling method
US20150112965A1 (en) Database management system, computer, and database management method
CN105190567A (en) System and method for managing storage system snapshots
US10140161B1 (en) Workload aware dynamic CPU processor core allocation
CN109684074A (en) Physical machine resource allocation methods and terminal device
US8213461B2 (en) Method of designating slots in a transmission frame for controlling transmission of data over an interconnect coupling a plurality of master units with a plurality of slave units
EP3274859B1 (en) Cluster computing service assurance apparatus and method
KR101392584B1 (en) Apparatus for dynamic data processing using resource monitoring and method thereof
JP2021026659A (en) Storage system and resource allocation control method
KR20220002547A (en) Task Scheduling Method and Apparatus
US9104496B2 (en) Submitting operations to a shared resource based on busy-to-success ratios
US10185384B2 (en) Reducing power by vacating subsets of CPUs and memory
US11093291B2 (en) Resource assignment using CDA protocol in distributed processing environment based on task bid and resource cost
JP6007516B2 (en) Resource allocation system, resource allocation method, and resource allocation program
CN114721818A (en) Kubernetes cluster-based GPU time-sharing method and system
JP5810918B2 (en) Scheduling apparatus, scheduling method and program
CN107430510A (en) Data processing method, device and system
US20160110221A1 (en) Scheduling system, scheduling method, and recording medium
CN116302327A (en) Resource scheduling method and related equipment
US20160077882A1 (en) Scheduling system, scheduling method, and recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination