CN101645052A - Quick direct memory access (DMA) ping-pong caching method - Google Patents


Info

Publication number: CN101645052A (application CN200810142301A; granted publication CN101645052B)
Authority: CN (China)
Prior art keywords: data, dma, moving, buffer memory, ping
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN200810142301A
Other languages: Chinese (zh)
Other versions: CN101645052B (en)
Inventor: 陈晨航
Current Assignee: ZTE Corp
Original Assignee: ZTE Corp
Application filed by ZTE Corp; priority to CN2008101423012A
Publication of CN101645052A; application granted; publication of CN101645052B
Status: Expired - Fee Related

Landscapes

  • Image Input (AREA)

Abstract

The invention relates to a quick direct memory access (DMA) ping-pong caching method for transferring data blocks that share part of their data with adjacent blocks. The method works as follows: the DMA first moves a data block of the size the CPU can process at one time into the destination buffer, and then successively moves further blocks, each as large as the non-shared portion of two adjacent blocks, until the destination buffer is fully covered and the current round of transfers is complete. The method removes the redundancy of repeatedly handling the shared portion of adjacent data, reducing the amount of data the DMA moves each time and thereby the time the CPU spends waiting.

Description

Quick direct memory access (DMA) ping-pong caching method
Technical field
The present invention relates to the field of information processing, and in particular to a quick DMA (Direct Memory Access) ping-pong caching method.
Background technology
In mainstream processor chips (DSPs, FPGAs, ASICs, and the like), on-chip memory is limited, and the gap in access speed between on-chip and off-chip memory is considerable. When processing data-intensive signals such as audio and video, the large data volume forces the use of the slower off-chip memory; moreover, if the CPU (central processing unit) accesses off-chip memory directly, read/write stalls occur and processing efficiency drops. DMA is a component that can work independently of the CPU, and it is an indispensable part of every mainstream processor.
A common way to reduce read/write stalls is DMA ping-pong processing: while the CPU processes the data in one memory region, the DMA engine moves the data the CPU will need next from external memory into the region the CPU will process next. Because the DMA runs in the background, its transfer time is hidden behind the CPU's current processing, so the CPU can fetch the next batch of data directly from on-chip memory, which ultimately reduces the time lost to read/write stalls.
In some cases, the data moved by two adjacent DMA transfers are related. In video compression, for example, the existing ping-pong caching method starts a DMA transfer of the 48 rows of reference data needed by the next macroblock row into on-chip buffer BUFFER_B while motion estimation runs on the current macroblock row, as shown in Fig. 1; the current row uses BUFFER_A, which was filled by DMA during motion estimation of the previous row. From the data layout of the reference frame used by motion estimation in video coding, 2/3 of the data, namely 32 rows of pixels, is identical between two adjacent DMA transfers, so the existing DMA ping-pong scheme introduces a great deal of redundancy.
Summary of the invention
The technical problem to be solved by the invention is to provide a quick DMA ping-pong caching method that reduces the redundancy of processing adjacent data with a shared portion, so that the amount of data moved by each DMA transfer shrinks and the CPU's waiting time is reduced.
In the quick direct memory access (DMA) ping-pong caching method, used to transfer data blocks that share part of their data with adjacent blocks, the DMA first moves a data block of the size the CPU can process at one time into the destination buffer; it then successively moves blocks, each as large as the non-shared portion of two adjacent blocks among the blocks to be transferred, into the following positions of the destination buffer, until the destination buffer is fully covered and the current round of transfers is complete.
After each DMA transfer finishes, the CPU begins processing the data in the destination buffer and simultaneously starts the next DMA transfer; this continues until all data in the destination buffer have been processed.
Within each round of transfers, the destination address of the first transfer is the start address of the destination buffer, and the destination address of every subsequent transfer is the previous destination address offset by the size of the previously moved block.
The start address of the data the CPU processes first is the start address of the destination buffer, and the start address of each subsequently processed batch is the previous start address offset by the size of the block the DMA moved before.
Compared with the existing ping-pong caching method, when adjacent data blocks contain identical data and the same amount of memory is used, the method of the invention reduces the DMA transfer volume by 50%.
Description of drawings
Fig. 1 is a schematic of the existing ping-pong buffer structure;
Fig. 2 is a schematic of the ping-pong buffer structure of the invention;
Fig. 3 is a flowchart of the method of the invention.
Embodiment
The method of the invention is described in further detail below with reference to the drawings.
With reference to Figs. 2 and 3, and for simplicity, assume that the CPU can process a data block of M bytes at a time, that the destination buffer starts at address Addr and is N bytes long, that B bytes of data are identical between the blocks of two adjacent DMA transfers in the ping-pong buffer, and that the shared ratio is a = B/M. The number of DMA transfers needed to cover the N-byte destination buffer is then

L = floor((N − M) / (M·(1 − a))) + 1,

where floor() rounds down.
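The transfer count can be checked with a small helper. This is an illustrative sketch, not part of the patent: the function name is invented, and the shared byte count B = a·M is used instead of the ratio a so that the floor becomes exact integer division.

```c
/* Number of DMA transfers L needed to cover an N-byte destination buffer
 * when the first transfer moves M bytes and every later transfer moves
 * only the non-shared M - B bytes (B = a*M bytes are shared).
 * Equivalent to L = floor((N - M) / (M*(1 - a))) + 1. */
long transfer_count(long n_bytes, long m_bytes, long b_shared)
{
    long step = m_bytes - b_shared;        /* new bytes per extra transfer */
    return (n_bytes - m_bytes) / step + 1; /* integer division floors for positive values */
}
```

For the CIF example used later (N = 2M = 33792 B, B = 11264 B) this yields L = 4.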
The invention proceeds in the following steps:
Step 1: start a DMA transfer of an M-byte block to destination address Addr, the start of the destination buffer; the filled address range is [Addr, Addr+M).
Step 2: wait for the DMA transfer of step 1 to finish. While the CPU processes the data in [Addr, Addr+M), start a DMA transfer of an M·(1−a)-byte block to destination address Addr+M; the filled range is [Addr+M, Addr+M·(1−a)+M).
Step 3: wait for the DMA transfer of step 2 to finish. While the CPU processes the data in [Addr+M·(1−a), Addr+M·(1−a)+M), start a DMA transfer of the next M·(1−a)-byte block to destination address Addr+M·(1−a)+M; the filled range is [Addr+M·(1−a)+M, Addr+2M·(1−a)+M).
Step 4: keep repeating step 3: each time the DMA is started it moves an M·(1−a)-byte block, each destination address advances by M·(1−a), and the start address of the data the CPU processes also advances by M·(1−a). Continue until the CPU begins processing the data of the (L−1)-th DMA transfer, whose address range is [Addr+(L−2)·M·(1−a), Addr+M+(L−2)·M·(1−a)); at the same time the L-th DMA transfer starts, with destination address Addr+M+(L−2)·M·(1−a) and filled range [Addr+M+(L−2)·M·(1−a), Addr+M+(L−1)·M·(1−a)), which ends at Addr+N when the block sizes divide evenly.
Step 5: wait for the L-th DMA transfer to finish. The CPU processes the data in [Addr+(L−1)·M·(1−a), Addr+M+(L−1)·M·(1−a)), completing this round of covering the destination buffer. At the same time, start a DMA transfer of M bytes with the destination address wrapped back to Addr, beginning a new round of covering the destination buffer; repeat until all signals have been processed.
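Steps 1 to 5 can be sketched as a self-contained C simulation. The names `dma_start`, `dma_wait`, and `process`, and the synchronous memcpy-based "DMA", are illustrative stand-ins assumed for this sketch, not APIs from the patent.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the platform's DMA engine and CPU work. */
static const unsigned char *g_src;   /* "external memory" source */
static unsigned char *g_dst;         /* destination of the in-flight transfer */
static size_t g_off, g_len;          /* source offset and length of that transfer */
static size_t g_processed;           /* total bytes handed to the CPU */

static void dma_start(unsigned char *dst, size_t src_off, size_t len)
{
    g_dst = dst; g_off = src_off; g_len = len;   /* kick off a background copy */
}

static void dma_wait(void)           /* block until the pending copy is done */
{
    if (g_dst) { memcpy(g_dst, g_src + g_off, g_len); g_dst = NULL; }
}

static void process(const unsigned char *data, size_t len)
{
    (void)data;                      /* real CPU work would go here */
    g_processed += len;
}

/* One round of the quick ping-pong scheme (steps 1-5): the first transfer
 * moves M bytes, each later one only the non-shared step = M - B bytes,
 * while the CPU's M-byte window slides forward by `step`. */
static void pingpong_round(unsigned char *buf, const unsigned char *src,
                           size_t n, size_t m, size_t b_shared)
{
    size_t step = m - b_shared;
    size_t L = (n - m) / step + 1;   /* transfers needed to cover the buffer */

    g_src = src;
    dma_start(buf, 0, m);            /* step 1: fill [Addr, Addr+M) */
    for (size_t i = 1; i < L; i++) {
        dma_wait();                  /* previous transfer finished */
        dma_start(buf + m + (i - 1) * step, m + (i - 1) * step, step);
        process(buf + (i - 1) * step, m);  /* CPU work overlaps the new transfer */
    }
    dma_wait();                      /* step 5: L-th transfer finished */
    process(buf + (L - 1) * step, m);
}

/* Tiny driver: one round over a 96-byte buffer with M = 48, B = 32
 * (a = 2/3, so L = 4). Returns total bytes processed, or -1 if the
 * buffer was not covered correctly. */
int demo_round(void)
{
    static unsigned char src[96], buf[96];
    for (int i = 0; i < 96; i++) src[i] = (unsigned char)i;
    g_processed = 0;
    pingpong_round(buf, src, sizeof buf, 48, 32);
    return memcmp(buf, src, sizeof buf) == 0 ? (int)g_processed : -1;
}
```

In the simulation the "DMA" copy is performed inside `dma_wait`, so the overlap of CPU processing and transfer is only modeled, not real; on actual hardware the copy would run concurrently between `dma_start` and `dma_wait`.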
A concrete application of the invention is illustrated below with motion estimation in video coding; other applications can be handled analogously.
Motion estimation uses reference-frame data; the usual practice is to use DMA to move the 48 rows of pixels formed by the macroblock rows above, at, and below the position of the macroblock row currently being encoded from the reference frame into on-chip memory. From the data layout of the reference frame used by motion estimation in video coding, 32 of the 48 rows of pixels to be moved, i.e. 2/3 of the data, are identical between adjacent transfers.
The invention is illustrated below on the motion-estimation flow for one frame at CIF resolution. In this embodiment M = 48 × 352 = 16896 B, the destination buffer starts at Addr, N = 2M = 33792 B, B = (2/3)·M, and a = B/M = 2/3, so L = 4.
S01: use DMA to move the M bytes of reference data needed by the first macroblock row to destination address Addr; the filled range is [Addr, Addr+M).
S02: wait for the DMA transfer of S01 to finish. The CPU runs motion estimation on the first macroblock row (the row of S01) using reference data in [Addr, Addr+M); at the same time, start a DMA transfer of (1/3)·M bytes (16 rows of pixels) of reference data to destination address Addr+M; the filled range is [Addr+M, Addr+(4/3)·M).
S03: wait for the DMA transfer of S02 to finish. The CPU runs motion estimation on the second macroblock row (relative to S02) using reference data in [Addr+(1/3)·M, Addr+(4/3)·M); at the same time, start a DMA transfer of the next (1/3)·M bytes of reference data to destination address Addr+(4/3)·M; the filled range is [Addr+(4/3)·M, Addr+(5/3)·M).
S04: wait for the DMA transfer of S03 to finish. The CPU runs motion estimation on the third macroblock row (relative to S02) using reference data in [Addr+(2/3)·M, Addr+(5/3)·M); at the same time, start a DMA transfer of the next (1/3)·M bytes of reference data to destination address Addr+(5/3)·M; the filled range is [Addr+(5/3)·M, Addr+2M).
S05: wait for the DMA transfer of S04 to finish. The CPU runs motion estimation on the fourth macroblock row (relative to S02) using reference data in [Addr+M, Addr+2M); this round of covering the destination buffer is complete. At the same time, start a DMA transfer of M bytes with the destination address wrapped back to Addr, and advance the macroblock-row address of S02 by four macroblock rows to serve as the first row of the next round of motion estimation. Repeat from S02 until motion estimation has finished for every macroblock row of the frame.
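The window offsets used in S02 to S05 can be verified numerically. The helper below is hypothetical (not from the patent); it takes Addr as 0 and returns the start of the CPU's reference-data window for the k-th macroblock row of a round, k = 0..3.

```c
/* Start offset (Addr = 0) of the CPU's reference window for row k of a
 * round in the CIF example: M = 48*352 = 16896 B, step = M/3 = 5632 B.
 * The window is [k*step, k*step + M); the last one (k = 3) is [M, 2M). */
long window_start(long k)
{
    const long M = 48L * 352;   /* bytes per 48-row reference block */
    const long step = M / 3;    /* non-shared third moved per transfer */
    return k * step;
}
```

The four windows slide by 5632 B each time: [0, 16896), [5632, 22528), [11264, 28160), and [16896, 33792), matching S02 through S05.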
Compared with the existing ping-pong caching method, the caching method of the invention uses the same amount of memory yet reduces the DMA transfer volume by 50% when performing video motion estimation.

Claims (4)

1. A quick direct memory access (DMA) ping-pong caching method, used to transfer data blocks that share part of their data with adjacent blocks, characterized in that the DMA first moves a data block of the size the CPU can process at one time into the destination buffer, and then successively moves blocks, each as large as the non-shared portion of two adjacent blocks among the blocks to be transferred, into the following positions of the destination buffer, until the destination buffer is fully covered and the current round of transfers is complete.
2. The quick direct memory access (DMA) ping-pong caching method of claim 1, characterized in that after each DMA transfer finishes, the CPU begins processing the data in the destination buffer and simultaneously starts the next DMA transfer; this continues until all data in the destination buffer have been processed.
3. The quick direct memory access (DMA) ping-pong caching method of claim 1, characterized in that within each round of transfers, the destination address of the first transfer is the start address of the destination buffer, and the destination address of every subsequent transfer is the previous destination address offset by the size of the previously moved block.
4. The quick direct memory access (DMA) ping-pong caching method of claim 2, characterized in that the start address of the data the CPU processes first is the start address of the destination buffer, and the start address of each subsequently processed batch is the previous start address offset by the size of the block the DMA moved before.
CN2008101423012A 2008-08-06 2008-08-06 Quick direct memory access (DMA) ping-pong caching method Expired - Fee Related CN101645052B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101423012A CN101645052B (en) 2008-08-06 2008-08-06 Quick direct memory access (DMA) ping-pong caching method


Publications (2)

Publication Number Publication Date
CN101645052A true CN101645052A (en) 2010-02-10
CN101645052B CN101645052B (en) 2011-10-26

Family

ID=41656941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101423012A Expired - Fee Related CN101645052B (en) 2008-08-06 2008-08-06 Quick direct memory access (DMA) ping-pong caching method

Country Status (1)

Country Link
CN (1) CN101645052B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020016873A1 (en) * 2000-07-31 2002-02-07 Gray Donald M. Arbitrating and servicing polychronous data requests in direct memory access
KR20050000927A (en) * 2003-06-25 2005-01-06 삼성전자주식회사 a video data control unit and a loading/storing method of video data thereof
CN101043282A (en) * 2006-03-24 2007-09-26 中兴通讯股份有限公司 Data storage means for multi-channel voice process
CN101060627A (en) * 2007-04-13 2007-10-24 深圳安凯微电子技术有限公司 A high definition signal decoder

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101820543A (en) * 2010-03-30 2010-09-01 北京蓝色星河软件技术发展有限公司 Ping-pong structure fast data access method combined with direct memory access (DMA)
CN105516547A (en) * 2015-12-10 2016-04-20 中国科学技术大学 Video dehazing optimization method based on DSP (Digital Signal Processor)
CN111615692A (en) * 2019-05-23 2020-09-01 深圳市大疆创新科技有限公司 Data transfer method, calculation processing device, and storage medium
CN110399322A (en) * 2019-06-28 2019-11-01 苏州浪潮智能科技有限公司 A kind of data transmission method and DMA framework of rattling
CN110399322B (en) * 2019-06-28 2021-03-09 苏州浪潮智能科技有限公司 Data transmission method and ping-pong DMA framework
CN112506437A (en) * 2020-12-10 2021-03-16 上海阵量智能科技有限公司 Chip, data moving method and electronic equipment
CN115802236A (en) * 2023-01-04 2023-03-14 成都市安比科技有限公司 Method for shortening delay of earphone with auxiliary hearing
CN115802236B (en) * 2023-01-04 2023-04-14 成都市安比科技有限公司 Method for shortening delay of earphone with auxiliary hearing

Also Published As

Publication number Publication date
CN101645052B (en) 2011-10-26

Similar Documents

Publication Publication Date Title
CN101645052B (en) Quick direct memory access (DMA) ping-pong caching method
US7132963B2 (en) Methods and apparatus for processing variable length coded data
CN108022269B (en) Modeling system for GPU (graphics processing Unit) compression texture storage Cache
KR101329517B1 (en) Digital still camera architecture with reduced delay between subsequent image acquisitions
US20050262276A1 (en) Design method for implementing high memory algorithm on low internal memory processor using a direct memory access (DMA) engine
CN103827818B (en) FIFO loading instructions
US7885336B2 (en) Programmable shader-based motion compensation apparatus and method
CN1305313C (en) System for discrete cosine transforms/inverse discrete cosine transforms based on pipeline architecture
WO2013149132A1 (en) System and method for multi-core hardware video encoding and decoding
CN102446087A (en) Instruction prefetching method and device
Fan et al. A hardware-oriented IME algorithm for HEVC and its hardware implementation
CN1112654C (en) Image processor
JP2009211494A (en) Information processor, and information processing method
CN103997648B (en) A kind of JPEG2000 standard picture rapid decompression compression systems and method based on DSP
CN1852442A (en) Layering motion estimation method and super farge scale integrated circuit
US20130235272A1 (en) Image processing apparatus and image processing method
US10440359B2 (en) Hybrid video encoder apparatus and methods
CN1520187A (en) System and method for video data compression
CN104967856A (en) Coding method and corresponding device
JP2009522698A (en) Memory organization scheme and controller architecture for image and video processing
JP2004046499A (en) Data processing system
JP2006042331A (en) Coefficient variable length coding method of four steps pipe line system and coefficient variable length coding machine
TWI586144B (en) Multiple stream processing for video analytics and encoding
JP2003153283A (en) Method for performing motion estimation in video encoding, a video encoding system, and a video encoding device
US20140185928A1 (en) Hardware-supported huffman coding of images

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111026

Termination date: 20160806