CN101645052A - Quick direct memory access (DMA) ping-pong caching method - Google Patents
- Publication number
- CN101645052A (application CN200810142301A; granted publication CN101645052B)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention relates to a quick direct memory access (DMA) ping-pong caching method for moving data blocks in which adjacent blocks share part of their data. The method comprises the following steps: the DMA first moves into the target buffer a data block of the size the CPU can process at a time, and then successively moves further data blocks into the target buffer until the target buffer is completely filled and the current round of data moving is complete, the size of each successively moved block being equal to the differing portion of adjacent data blocks among the blocks to be moved. The method reduces the redundancy of processing the identical portion of adjacent data, decreasing the amount of data the DMA moves each time and thereby shortening the CPU's wait time.
Description
Technical field
The present invention relates to the technical field of information processing, and in particular to a quick DMA (Direct Memory Access) ping-pong buffering method.
Background technology
In mainstream chip processors (DSP, FPGA, ASIC, etc.), on-chip memory is limited, and the access-speed gap between on-chip and off-chip memory is considerable. When processing data-intensive signals such as audio and video, the data volume is so large that the slower off-chip memory must be used; moreover, having the CPU (central processing unit) access off-chip memory directly causes read/write stalls and low processing efficiency. The DMA is a component that can work independently of the CPU, and it is indispensable in mainstream processors.
To reduce read/write stalls, a common approach is the DMA ping-pong procedure: while the CPU processes the data in one region of memory, the DMA moves the data the CPU will need next from external memory into the region the CPU will process next. Because the DMA runs in the background, its transfer time is hidden behind the CPU's current processing. The CPU can then fetch the next batch of data directly from on-chip memory, reducing the time lost to read/write stalls.
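A minimal sketch of this classic ping-pong procedure in Python (a sequential simulation only; in hardware the DMA fill of one buffer genuinely overlaps the CPU's work on the other, and the function and variable names here are illustrative, not from the patent):

```python
def pingpong_process(blocks, process):
    """Simulate classic DMA ping-pong buffering: while the CPU works on
    one buffer, the 'DMA' fills the other; the roles swap each round."""
    buffers = [None, None]          # BUFFER_A and BUFFER_B
    results = []
    buffers[0] = blocks[0]          # initial DMA fill of BUFFER_A
    for i in range(len(blocks)):
        cur = i % 2                 # buffer the CPU processes this round
        nxt = (i + 1) % 2           # buffer the 'DMA' fills meanwhile
        if i + 1 < len(blocks):
            buffers[nxt] = blocks[i + 1]       # background transfer
        results.append(process(buffers[cur]))  # CPU work on current buffer
    return results
```

For example, `pingpong_process([[1, 2], [3, 4], [5, 6]], sum)` returns `[3, 7, 11]`: each block is processed out of the buffer that was filled while the previous block was being handled.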
In some cases, the data of two consecutive DMA transfers overlap. For example, in video compression, the existing ping-pong buffering method starts a DMA transfer of the 48 rows of reference data needed by the next macroblock row into on-chip BUFFER_B while motion estimation runs on the current macroblock row, as shown in Figure 1; the current macroblock row uses BUFFER_A, which was filled by DMA during motion estimation of the previous macroblock row. From the data structure of the reference frame used by motion estimation in video coding, it can be seen that 2/3 of the data (32 rows of pixels) is identical between two consecutive DMA transfers, so the existing DMA ping-pong scheme introduces a great deal of redundancy.
Summary of the invention
The technical problem to be solved by the invention is to provide a quick DMA ping-pong buffering method that reduces the redundancy of processing the identical portion of adjacent data blocks, decreasing the amount of data each DMA transfer moves and thereby shortening the CPU's wait time.
A quick direct memory access (DMA) ping-pong caching method, used for moving data blocks in which adjacent blocks share part of their data. In the method, the DMA first moves into the target buffer a data block of the size the CPU can process at a time, and then successively moves, into the following part of the target buffer, data blocks whose size equals the differing portion of adjacent data blocks among the blocks to be moved, until the target buffer is completely filled, which completes the current round of data moving.
After each DMA transfer finishes, the CPU begins processing the data in the target buffer and simultaneously starts the next DMA transfer; this continues until all data in the target buffer have been processed.
In each round of data moving, the destination address of the first transfer is the base address of the target buffer, and the destination address of each subsequent transfer is the previous destination address offset by the byte count of the previously moved data block.
The start address of the data the CPU processes first is the base address of the target buffer, and the start address of each subsequent batch of data is the previous start address offset by the byte count of the data block the DMA moved previously.
Compared with the existing ping-pong buffering method, for adjacent data blocks containing identical data and using the same memory space, the method of the invention reduces the DMA data-transfer volume by 50%.
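As a rough check of this claim, the sketch below (illustrative names, not from the patent; M and B follow the notation of the embodiment: M bytes per CPU-processable block, B identical bytes shared by consecutive transfers) compares the bytes moved over one round by conventional ping-pong buffering, which re-transfers the full block every time, with this method, which transfers the full block once and only the differing M - B bytes thereafter:

```python
def transfer_volumes(M, B, steps):
    """Bytes moved over `steps` transfers: conventional scheme vs. the
    overlap-aware scheme (full block once, then only the M - B delta)."""
    conventional = steps * M
    overlap_aware = M + (steps - 1) * (M - B)
    return conventional, overlap_aware

conv, fast = transfer_volumes(M=16896, B=11264, steps=4)
```

With the video example's values (M = 16896, B = 11264, four transfers per round), the conventional scheme moves 67584 bytes against 33792 here, the 50% reduction stated above.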
Description of drawings
Fig. 1 is a schematic diagram of the existing ping-pong buffer structure;
Fig. 2 is a schematic diagram of the ping-pong buffer structure of the present invention;
Fig. 3 is a flowchart of the method of the present invention.
Embodiment
The method of the present invention is described in further detail below with reference to the accompanying drawings.
Referring to Fig. 2 and Fig. 3: for simplicity, assume that the size of the data block the CPU can process at a time is M bytes, the base address of the target buffer is Addr and its size is N bytes, the amount of identical data between the blocks moved by two consecutive DMA transfers in ping-pong buffering is B bytes, and the ratio of B to M is a = B/M. Then the number of DMA transfers required to fill the N-byte target buffer is
L = floor((N - M) / (M*(1 - a))) + 1, where floor() denotes rounding down.
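A quick numeric check of this formula (a hypothetical helper, not part of the patent; it takes B rather than a so the division stays in exact integer arithmetic, using M*(1 - a) = M - B since a = B/M):

```python
def num_transfers(N, M, B):
    """L = floor((N - M) / (M*(1 - a))) + 1, computed as integers."""
    return (N - M) // (M - B) + 1
```

With the embodiment's values below (N = 2M = 33792, M = 16896, B = 2M/3 = 11264) this yields L = 4, matching the worked example.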
The invention proceeds as follows:
Step 1: start a DMA transfer of an M-byte data block; the destination address is Addr, the base address of the target buffer, so the interval transferred into the target buffer is [Addr, Addr+M).
Step 2: wait for the DMA transfer of Step 1 to finish. While the CPU processes the data in [Addr, Addr+M), start the DMA again to move a block of M*(1-a) bytes; the destination address is Addr+M, and the interval transferred into the target buffer is [Addr+M, Addr+M*(1-a)+M).
Step 3: wait for the DMA transfer of Step 2 to finish. While the CPU processes the data in [Addr+M*(1-a), Addr+M*(1-a)+M), start the DMA to move the next M*(1-a)-byte block; the destination address is Addr+M*(1-a)+M, and the interval transferred into the target buffer is [Addr+M*(1-a)+M, Addr+2M*(1-a)+M).
Step 4: keep repeating Step 3: each started DMA transfer moves an M*(1-a)-byte block, each destination address is offset by M*(1-a), and the start address of the data the CPU processes is likewise offset by M*(1-a). This continues until the CPU begins processing the data of the (L-1)-th DMA transfer, whose interval is [Addr+(L-2)*M*(1-a), Addr+M+(L-2)*M*(1-a)); meanwhile the L-th DMA transfer is started, with destination address Addr+M+(L-2)*M*(1-a) and transferred interval [Addr+M+(L-2)*M*(1-a), Addr+M+(L-1)*M*(1-a)), i.e. [Addr+M+(L-2)*M*(1-a), Addr+2M) when N = 2M.
Step 5: wait for the L-th DMA transfer to finish. The CPU begins processing the data in [Addr+(L-1)*M*(1-a), Addr+M+(L-1)*M*(1-a)), and the current round of filling the target buffer is complete. Meanwhile, start the DMA to move the next M-byte block, with the destination address wrapping back to Addr to begin a new round of filling the target buffer; this repeats until all data have been processed.
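The five steps above can be condensed into a small simulation (illustrative names, not from the patent) that lists each DMA transfer's destination address and size together with the address window the CPU processes once that transfer has landed:

```python
def schedule(addr, M, B, N):
    """One round of the transfer plan: returns (transfers, cpu_windows),
    where transfers are (destination, size) pairs and cpu_windows are
    [start, end) intervals, both sliding by M - B = M*(1 - a) each step."""
    delta = M - B                      # bytes per follow-up transfer
    L = (N - M) // delta + 1           # transfers needed to fill the buffer
    transfers, windows = [], []
    dest = addr
    for i in range(L):
        size = M if i == 0 else delta  # full block first, deltas afterwards
        transfers.append((dest, size))
        windows.append((addr + i * delta, addr + i * delta + M))
        dest += size
    return transfers, windows
```

With toy values addr = 0, M = 3, B = 2, N = 6 this gives transfers (0, 3), (3, 1), (4, 1), (5, 1) and CPU windows [0, 3), [1, 4), [2, 5), [3, 6): every window after the first reuses B bytes already present in the buffer.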
A concrete application of the invention is illustrated below using motion estimation in video coding as an example; other applications can be handled similarly.
Motion estimation requires reference-frame data. The usual practice is to use DMA to move into on-chip memory the 48 rows of pixels of the reference frame located at the positions of the previous, current, and next macroblock rows relative to the macroblock row currently being coded. From the data structure of the reference frame used by motion estimation in video coding, 2/3 of the 48 rows of pixels to be moved (32 rows) are identical between two consecutive DMA transfers.
The invention is illustrated below with the motion-estimation process for one frame at CIF resolution. In this embodiment M = 48 x 352 = 16896 bytes, the base address of the target buffer is Addr, N = 2M = 33792 bytes, B = 2M/3, and a = B/M = 2/3, so L = 4.
S01: use DMA to move the M bytes of reference data needed by the first macroblock row; the destination address is Addr, and the interval transferred into the target buffer is [Addr, Addr+M).
S02: wait for the DMA transfer of S01 to finish. The CPU performs motion estimation on the first macroblock row (the one of S01), using reference data in [Addr, Addr+M); meanwhile, start the DMA to move M/3 bytes of reference data, i.e. 16 rows of pixels; the destination address is Addr+M, and the transferred interval is [Addr+M, Addr+4M/3).
S03: wait for the DMA transfer of S02 to finish. The CPU performs motion estimation on the second macroblock row (counting from S02), using reference data in [Addr+M/3, Addr+4M/3); meanwhile, start the DMA to move the next M/3 bytes of reference data; the destination address is Addr+4M/3, and the transferred interval is [Addr+4M/3, Addr+5M/3).
S04: wait for the DMA transfer of S03 to finish. The CPU performs motion estimation on the third macroblock row (counting from S02), using reference data in [Addr+2M/3, Addr+5M/3); meanwhile, start the DMA to move the next M/3 bytes of reference data; the destination address is Addr+5M/3, and the transferred interval is [Addr+5M/3, Addr+2M).
S05: wait for the DMA transfer of S04 to finish. The CPU performs motion estimation on the fourth macroblock row (counting from S02), using reference data in [Addr+M, Addr+2M); the current round of filling the target buffer is complete. Meanwhile, start the DMA to move an M-byte block of reference data, with the destination address wrapping back to Addr; the macroblock row of S02 is offset by four macroblock rows and taken as the first macroblock row of the next round. Repeat S02 to S04 until motion estimation for all macroblock rows of the whole frame is complete.
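The intervals of S01 to S05 can be reproduced numerically; taking Addr as 0 is an assumption for this sketch only:

```python
# Reproduce the S01-S05 numbers: M = 16896 bytes (48 rows of 352 pixels),
# B = 11264 identical bytes (32 rows), so each follow-up transfer moves
# delta = M - B = 5632 bytes (16 rows).
M, B = 16896, 11264
delta = M - B
# DMA transfers started in S01..S04: (destination address, size)
transfers = [(0, M)] + [(M + k * delta, delta) for k in range(3)]
# Reference windows used by the CPU in S02..S05: [start, end)
windows = [(k * delta, k * delta + M) for k in range(4)]
```

This yields transfers (0, 16896), (16896, 5632), (22528, 5632), (28160, 5632) and windows [0, 16896), [5632, 22528), [11264, 28160), [16896, 33792), matching the intervals above with Addr = 0.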
Compared with the existing ping-pong buffering method, the caching method of the invention, using the same amount of memory, reduces the DMA data-transfer volume by 50% when performing video motion estimation.
Claims (4)
1. A quick direct memory access (DMA) ping-pong caching method for moving data blocks in which adjacent blocks share part of their data, characterized in that the DMA first moves into the target buffer a data block of the size the CPU can process at a time, and then successively moves, into the following part of the target buffer, data blocks whose size equals the differing portion of adjacent data blocks among the blocks to be moved, until the target buffer is completely filled, which completes the current round of data moving.
2. The quick direct memory access (DMA) ping-pong caching method of claim 1, characterized in that after each DMA transfer finishes, the CPU begins processing the data in the target buffer and simultaneously starts the next DMA transfer; this continues until all data in the target buffer have been processed.
3. The quick direct memory access (DMA) ping-pong caching method of claim 1, characterized in that in each round of data moving, the destination address of the first transfer is the base address of the target buffer, and the destination address of each subsequent transfer is the previous destination address offset by the byte count of the previously moved data block.
4. The quick direct memory access (DMA) ping-pong caching method of claim 2, characterized in that the start address of the data the CPU processes first is the base address of the target buffer, and the start address of each subsequent batch of data processed is the previous start address offset by the byte count of the data block the DMA moved previously.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008101423012A CN101645052B (en) | 2008-08-06 | 2008-08-06 | Quick direct memory access (DMA) ping-pong caching method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101645052A (en) | 2010-02-10 |
CN101645052B (en) | 2011-10-26 |
Family
ID=41656941
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2008101423012A Expired - Fee Related CN101645052B (en) | 2008-08-06 | 2008-08-06 | Quick direct memory access (DMA) ping-pong caching method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101645052B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101820543A (en) * | 2010-03-30 | 2010-09-01 | 北京蓝色星河软件技术发展有限公司 | Ping-pong structure fast data access method combined with direct memory access (DMA) |
CN105516547A (en) * | 2015-12-10 | 2016-04-20 | 中国科学技术大学 | Video dehazing optimization method based on DSP (Digital Signal Processor) |
CN110399322A (en) * | 2019-06-28 | 2019-11-01 | 苏州浪潮智能科技有限公司 | A kind of data transmission method and DMA framework of rattling |
CN111615692A (en) * | 2019-05-23 | 2020-09-01 | 深圳市大疆创新科技有限公司 | Data transfer method, calculation processing device, and storage medium |
CN112506437A (en) * | 2020-12-10 | 2021-03-16 | 上海阵量智能科技有限公司 | Chip, data moving method and electronic equipment |
CN115802236A (en) * | 2023-01-04 | 2023-03-14 | 成都市安比科技有限公司 | Method for shortening delay of earphone with auxiliary hearing |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020016873A1 (en) * | 2000-07-31 | 2002-02-07 | Gray Donald M. | Arbitrating and servicing polychronous data requests in direct memory access |
KR20050000927A (en) * | 2003-06-25 | 2005-01-06 | 삼성전자주식회사 | a video data control unit and a loading/storing method of video data thereof |
CN101043282A (en) * | 2006-03-24 | 2007-09-26 | 中兴通讯股份有限公司 | Data storage means for multi-channel voice process |
CN101060627A (en) * | 2007-04-13 | 2007-10-24 | 深圳安凯微电子技术有限公司 | A high definition signal decoder |
Also Published As
Publication number | Publication date |
---|---|
CN101645052B (en) | 2011-10-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20111026; Termination date: 20160806 |