Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved is that the prior art only optimizes individual functional aspects of the register file, such as increasing the number of read/write ports or reducing access latency, and does not optimize register access performance from the standpoint of data locality.
(2) Technical solution
The present invention proposes a self-indexed register file device comprising a register bank and the peripheral logic of the register bank. The register bank is configured into a self-indexed area and a general area. The size and starting register number of the self-indexed area can be configured flexibly, and the general area is indexed in the conventional register manner. When a read or write enable signal is issued to the register file device, the device automatically calculates the currently required index number, and all such reads and writes fall within the self-indexed area; once a read or write reaches the boundary of the self-indexed area, subsequent operations automatically return to the start position of the self-indexed area.
According to a specific embodiment of the present invention, the self-indexed register file device is connected to an instruction queue storage device and obtains an instruction queue from it; the instruction queue storage device stores the control commands for the register file device.
According to a specific embodiment of the present invention, the register file device further comprises a self-index write control unit, a configuration register, a self-index read control unit, a write-index selector and a read-index selector, wherein: the self-index write control unit calculates the write index number needed for the next access to the self-indexed area; the configuration register stores the configuration information of the self-indexed area; the self-index read control unit mainly calculates the read index number needed for the next access to the self-indexed area; the write-index selector selects the write index currently in use, determining whether the index used to write the register bank is the index of the self-indexed area or the index of the general area; and the read-index selector selects the read index currently in use, determining whether the index used to read the register bank is the index of the self-indexed area or the index of the general area.
According to a specific embodiment of the present invention, the read/write instructions of the self-indexed register file device are divided into read/write instructions for the general area and read/write instructions for the self-indexed area, and the definition format of the read/write instructions for the self-indexed area includes a special identifier.
According to a specific embodiment of the present invention, the configuration information is stored in the form of a configuration instruction, and the configuration information represented by the configuration instruction mainly includes the starting index number StartID of the self-indexed area in the register bank, the size Depth of the self-indexed area, and the step size of each read of the register bank.
According to a specific embodiment of the present invention, when the register bank contains N registers, StartID+Depth < N.
According to a specific embodiment of the present invention, the self-index write control unit and the self-index read control unit each include two adders, a comparator and a register, wherein: the first adder calculates the index value after each self-indexed write operation and writes the calculated value InnerID into the register; the register holds this intermediate index value InnerID, and its initial value is the StartID value of the configuration register; the second adder calculates the bottom boundary value of the self-indexed area; and the comparator compares the bottom boundary value of the self-indexed area with the intermediate index value InnerID. When the two are equal, the write-index selector selects the first option, StartID, as the output; when they are not equal, the write-index selector selects the second option and outputs the intermediate index value InnerID as the result.
According to a specific embodiment of the present invention, when a write instruction for the self-indexed area is present, the write-index selector selects the output of the self-index write control unit as the effective write index number; when a write instruction for the general area is present, it selects the index carried by the general-area write instruction as the effective write index number.
According to a specific embodiment of the present invention, when a read instruction for the self-indexed area is present, the read-index selector selects the output of the self-index read control unit as the effective read index number; when a read instruction for the general area is present, it selects the index carried by the general-area read instruction as the effective read index number.
(3) Beneficial effects
By configuring the register bank of the register file device into a "self-indexed area" and a "general area", the present invention has the following beneficial effects:
1) Convenience of programming. When using the device of the present invention, the programmer only needs to configure the usage at the initial stage and need not subsequently specify which register is currently in use, which avoids the programming complexity of register allocation.
2) Reduced processor power consumption. By making full use of the self-indexed area in combination with the algorithm, the present invention can exploit data locality and reduce the number of memory accesses, thereby reducing the power consumption of the processor.
Detailed description of the invention
The present invention proposes a special register file device, which comprises a register file unit (the register bank) and the peripheral logic of the register bank.
This register file device is applicable to a variety of computation-intensive algorithms. Depending on the algorithm, the register bank can be configured into a "self-indexed area" and a "general area". Information such as the size and starting register number of the self-indexed area can be configured flexibly, and the general area is indexed in the conventional register manner. Once the self-indexed area has been configured, the user only needs to issue a read or write enable signal; the register file device automatically calculates the currently required index number, and all such reads and writes fall within the specified self-indexed area. After a read or write reaches the boundary of the self-indexed area, subsequent operations automatically return to the start position of the self-indexed area.
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.
Fig. 1 is a structural schematic diagram of the self-indexed register file device of the present invention. As shown in Fig. 1, the self-indexed register file device 20 mainly comprises the following components: self-index write control unit 30, configuration register 40, self-index read control unit 50, register bank 60, write-index selector 70 and read-index selector 80.
The self-indexed register file device 20 is generally connected to an instruction queue storage device 10 and obtains an instruction queue from it; the instruction queue storage device 10 stores the control commands for the register file device 20. In addition, the instruction queue storage device 10 also stores processor instructions such as arithmetic and logic operations.
Register bank 60 is the physical storage element of the register file device 20 of the present invention and includes read decoder 601, storage array 602, write decoder 603, write port 604 and read port 605. Register bank 60 can be a conventional register file: read decoder 601 decodes the read index delivered at read port 605, selects the corresponding register of storage array 602, and outputs the value of that register; write decoder 603 decodes the write index delivered at write port 604, selects the corresponding register of storage array 602, and writes the value supplied at the write port into that register.
The register file device 20 of the present invention provides a self-indexing function for the register file. The programmer configures a segment of register bank 60 as the self-indexed area by means of a configuration instruction, and operates the device by writing self-indexed read/write commands for the register file device. In general, if the user needs certain data to be reused, the self-indexing function can be selected.
The instructions that operate the register file device 20 of the present invention comprise two kinds: configuration instructions and read/write instructions. A configuration instruction is needed only when the self-indexing function of the register file is used. The configuration information represented by the configuration instruction mainly includes the starting index number StartID of the self-indexed area in register bank 60, the size Depth of the self-indexed area, and the step size Step of each read of register bank 60. The starting index number StartID denotes the start position of the self-indexed area within register bank 60; the size Depth denotes the Depth consecutive registers starting at index number StartID, which form the currently configured self-indexed area; and the step size Step is the difference between the index numbers of two adjacent reads of the register file.
Configuration register 40 (ConfigRegister, hereinafter also referred to as the CR register) stores the configuration information of the self-indexed area, and the configuration information is stored in the form of a configuration instruction. Assuming register bank 60 contains N registers, the following relation must hold: StartID+Depth < N.
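As an illustration only, the relation above can be checked with a few lines of Python. The names SelfIndexConfig and is_valid_config, and the extra sanity checks (non-negative StartID, positive Depth and Step), are assumptions made for this sketch and are not taken from the described device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfIndexConfig:
    """Hypothetical container for the three configuration fields."""
    start_id: int  # StartID: first register of the self-indexed area
    depth: int     # Depth: number of consecutive registers in the area
    step: int = 1  # Step: index difference between adjacent accesses


def is_valid_config(cfg: SelfIndexConfig, n_registers: int) -> bool:
    """Return True if cfg satisfies StartID + Depth < N.

    The non-negativity and positivity checks are assumptions of this
    sketch; the text only states the StartID + Depth < N relation.
    """
    return (
        cfg.start_id >= 0
        and cfg.depth > 0
        and cfg.step > 0
        and cfg.start_id + cfg.depth < n_registers
    )


if __name__ == "__main__":
    print(is_valid_config(SelfIndexConfig(3, 8, 1), 128))    # area 3..10 in 128 registers
    print(is_valid_config(SelfIndexConfig(120, 8, 1), 128))  # 120 + 8 is not below 128
```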
Fig. 2 shows an exemplary format of the configuration instruction, containing the configuration information, that is stored in the configuration register 40 of the present invention. Fig. 2 shows the logical structure of configuration register 40, whose width is 32 bits; Immediate32 in the configuration instruction "CR=Immediate32" denotes a 32-bit immediate, i.e. the value written into configuration register 40. It should be noted that Fig. 2 is only an example; as long as the above configuration information is included, the present invention is not limited to a specific format.
The read/write instructions of the present invention are divided into read/write instructions for the general area and read/write instructions for the self-indexed area. The read/write instructions for the general area are similar to conventional register reads and writes; their definition format may, for example, be as shown in the table below, but the present invention is not limited to this format.
The definition format of the read/write instructions for the self-indexed area includes a special identifier, "(I++)" in this example, which indicates to the register file device that the instruction is a self-indexed-area instruction. An example is shown in the table below, but the present invention is not limited to this format.
The function of self-index write control unit 30 is mainly to calculate the write index number needed for the next access to the self-indexed area. The index number used by the first write operation to the self-indexed area is StartID in configuration register 40, the initial value configured by the programmer. Thereafter, whenever a self-indexed-area write instruction is valid, self-index write control unit 30 automatically performs the write operation within the range given by the size Depth of the self-indexed area; when a write operation reaches the boundary of the self-indexed area, the next write operation jumps back to StartID of the self-indexed area, and so on in a loop.
Fig. 3 shows one implementation circuit of the self-index write control unit. As shown in Fig. 3, it includes two adders 301 and 302, a comparator 303 and a register 304. CR denotes the logical structure of configuration register 40. The first adder 301 automatically calculates the index value after each self-indexed write operation and writes the calculated value InnerID into register 304; register 304 holds this intermediate index value InnerID, and its initial value is the StartID value of the configuration register. The second adder 302 calculates the bottom boundary value of the self-indexed area, and comparator 303 compares this value with the intermediate index value InnerID held in register 304. When the two are equal, the index-select enable signal IndexSelEn is valid, and the write-index selector 70 in the figure selects the first option, StartID. If IndexSelEn is invalid, i.e. the two inputs of comparator 303 are not equal, write-index selector 70 selects the second option and outputs the intermediate index value InnerID as the result. The final output value, the self-index identifier SelfIndexID, is the output of self-index write control unit 30.
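The behaviour of the circuit of Fig. 3 can be sketched in software as follows. This is a behavioural model under stated assumptions, not a cycle-accurate description: it assumes that adder 301 adds Step to the current index, that adder 302 computes the boundary as StartID + Depth, and that Depth is a multiple of Step so that the equality comparison is eventually satisfied. The class name SelfIndexUnit and the method next_index are hypothetical.

```python
class SelfIndexUnit:
    """Behavioural sketch of a self-index write (or read) control unit."""

    def __init__(self, start_id: int, depth: int, step: int = 1):
        self.start_id = start_id
        self.step = step
        # Adder 302 (assumed): boundary value of the self-indexed area.
        self.bound = start_id + depth
        # Register 304: holds InnerID, initialised to StartID.
        self.inner_id = start_id

    def next_index(self) -> int:
        """Return SelfIndexID for the current access and update InnerID."""
        # Comparator 303: IndexSelEn is valid when InnerID equals the boundary.
        index_sel_en = self.inner_id == self.bound
        # Selection: StartID when IndexSelEn is valid, otherwise InnerID.
        self_index_id = self.start_id if index_sel_en else self.inner_id
        # Adder 301 (assumed): index value after this access.
        self.inner_id = self_index_id + self.step
        return self_index_id


if __name__ == "__main__":
    unit = SelfIndexUnit(start_id=3, depth=8, step=1)
    print([unit.next_index() for _ in range(10)])
```

With StartID=3 and Depth=8 the model walks through registers 3 to 10 and then returns to 3, which matches the wrap-around behaviour described above.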
The function of self-index read control unit 50 is mainly to calculate the read index number needed for the next access to the self-indexed area. The index number used by the first read operation from the self-indexed area is StartID of the configuration register, the initial value configured by the programmer. Thereafter, whenever a self-indexed-area read instruction is valid, the self-index read control unit automatically performs the read operation within the range given by the size Depth of the self-indexed area; when a read operation reaches the boundary of the self-indexed area, the next read operation jumps back to StartID of the self-indexed area, and so on in a loop. One implementation circuit of self-index read control unit 50 is similar to Fig. 3, and its components play the same roles; the final output value, the self-index identifier SelfIndexID, is the output of self-index read control unit 50.
Write-index selector 70 selects the write index currently in use, determining whether the index used to write register bank 60 is the index of the self-indexed area or the index of the general area. When a write instruction for the self-indexed area is present, write-index selector 70 selects the output of self-index write control unit 30 as the effective write index number; when a write instruction for the general area is present, it selects the index of the general-area write instruction as the effective write index number, which can be obtained directly from the encoding of the general-area write instruction in the instruction queue.
Read-index selector 80 selects the read index currently in use, determining whether the index used to read register bank 60 is the index of the self-indexed area or the index of the general area. When a read instruction for the self-indexed area is present, read-index selector 80 selects the output of self-index read control unit 50 as the effective read index number; when a read instruction for the general area is present, it selects the index of the general-area read instruction as the effective read index number, which can be obtained directly from the encoding of the general-area read instruction in the instruction queue.
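Functionally, each of the two selectors behaves like a two-input multiplexer. The following minimal sketch illustrates this; the function name select_index and its parameter names are hypothetical, and the flag is_self_index stands in for whatever signal the decoder derives from the special identifier in the instruction.

```python
def select_index(is_self_index: bool, self_index_id: int, general_index: int) -> int:
    """Two-input index selector (sketch of selector 70 or 80).

    is_self_index  -- True when the current instruction targets the
                      self-indexed area (assumed decoded from the instruction)
    self_index_id  -- output of the self-index control unit
    general_index  -- register index encoded in a general-area instruction
    """
    return self_index_id if is_self_index else general_index


if __name__ == "__main__":
    # A self-indexed access uses the control unit output.
    print(select_index(True, self_index_id=3, general_index=5))
    # A general-area access uses the index carried by the instruction.
    print(select_index(False, self_index_id=3, general_index=5))
```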
The above describes the basic implementation structure of the self-indexed register file device of the present invention. On the basis of this structure, the function of the register file device can be extended to a structure containing N self-indexed areas, i.e. register bank 60 is divided into multiple self-indexed areas. The support needed for this extension is as follows:
1) Extended support in the configuration instructions and read/write instructions. N configuration registers CR are needed; the definition and bit width of each field can be determined flexibly by the designer.
2) Hardware support in the read/write control units. N sets of the above self-index write control unit and N sets of the above self-index read control unit are needed.
3) Extension of the read/write index selectors. Each read/write index selector is extended to an (N+1)-to-1 selector, where N stands for the index numbers output by the N read/write control units and 1 stands for the general-area read/write index in the current read/write instruction; the output of the selector goes to the register bank.
The N configuration registers hold the configuration information of the N self-indexed areas. When a read operation is required, the instruction must indicate the number of the self-indexed area to be read, for example by an Ix++ indicator, where x=[0, N-1]; the read control unit of that self-indexed area then operates and yields the index address of the next read.
All of the above extensions fall within the protection scope of this patent.
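The extension to N self-indexed areas can be sketched as follows. This is an illustrative model only: the class Region, the function select_index, the use of None to mark a general-area access and the modulo-based wrap-around are assumptions of the sketch rather than details given in the text.

```python
from typing import List, Optional


class Region:
    """One self-indexed area with its own configuration and counter."""

    def __init__(self, start_id: int, depth: int, step: int = 1):
        self.start_id, self.depth, self.step = start_id, depth, step
        self.offset = 0  # position inside the area, relative to StartID

    def next_index(self) -> int:
        index = self.start_id + self.offset
        # Wrap back to the start of the area (modulo form is an assumption).
        self.offset = (self.offset + self.step) % self.depth
        return index


def select_index(regions: List[Region], region_id: Optional[int],
                 general_index: Optional[int]) -> int:
    """(N+1)-to-1 selector: N control-unit outputs plus the general index.

    region_id is the x of an Ix++ indicator, or None for a general access.
    """
    if region_id is None:
        return general_index
    return regions[region_id].next_index()


if __name__ == "__main__":
    regions = [Region(0, 4), Region(16, 2)]  # N = 2 areas
    print(select_index(regions, 0, None))    # I0++
    print(select_index(regions, 1, None))    # I1++
    print(select_index(regions, None, 100))  # general-area access
    print(select_index(regions, 1, None))    # I1++
    print(select_index(regions, 1, None))    # I1++ wraps to its StartID
```

Only the control unit of the addressed area advances, so each area keeps its own position independently of the others.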
An implementation and the use of the present invention are described in detail below in conjunction with a specific embodiment.
In this embodiment, register bank 60 has 128 registers of 128 bits each and is named the M register file. The user configures the control information of the self-indexed area by setting configuration register 40 (32 bits wide), performs read and write operations on the self-indexed area using the self-indexed-area read/write instructions, and performs conventional reads and writes of the general area using the general-area read/write instructions.
In this embodiment, bits 0~6 of the configuration register are designed as the StartID field, bits 8~11 as the Step field, and bits 12~18 as the Depth field. The self-indexed area is designed as the 8 registers with index numbers 3~10 in the register file, and the index difference between two adjacent reads is 1; that is, StartID=3, Step=1, Depth=8. The structure of register bank 60 and configuration register 40 is shown schematically in Fig. 4. Bit 7 and bits 19~31 are reserved and currently have no meaning. The configuration instruction is: CR=0x8103.
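The value 0x8103 follows directly from the field layout of this embodiment, as the short sketch below shows. The helper names encode_cr and decode_cr are hypothetical; only the bit positions and the example values come from the embodiment.

```python
def encode_cr(start_id: int, step: int, depth: int) -> int:
    """Pack the fields into a 32-bit CR value using this embodiment's layout.

    bits 0~6   : StartID
    bits 8~11  : Step
    bits 12~18 : Depth
    bit 7 and bits 19~31 are reserved and left at zero.
    """
    if not (0 <= start_id < 1 << 7 and 0 <= step < 1 << 4 and 0 <= depth < 1 << 7):
        raise ValueError("field value does not fit its bit field")
    return start_id | (step << 8) | (depth << 12)


def decode_cr(cr: int) -> dict:
    """Unpack a CR value into its three fields."""
    return {
        "StartID": cr & 0x7F,
        "Step": (cr >> 8) & 0xF,
        "Depth": (cr >> 12) & 0x7F,
    }


if __name__ == "__main__":
    cr = encode_cr(start_id=3, step=1, depth=8)
    print(hex(cr))        # 3 + (1 << 8) + (8 << 12)
    print(decode_cr(cr))
```

Here 8 << 12 contributes 0x8000, 1 << 8 contributes 0x0100 and StartID contributes 0x0003, giving 0x8103.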
The write instruction for this self-indexed area is: Ri=R.wp (I++). In a sequence of consecutive write instructions, the initial write index value is StartID (3); thereafter, each time a self-indexed-area write instruction occurs, the index number is automatically incremented by 1. When the index value equals 10, after the write operation on the register with index number 10 has been performed, the index value automatically becomes 3, and write operations continue in this loop.
The read instruction for this self-indexed area is: R.rp=R (I++). In a sequence of consecutive read instructions, the initial read index value is StartID (3); thereafter, each time a self-indexed-area read instruction occurs, the index number is automatically incremented by Step (1). When the index value equals 10, after the read operation on the register with index number 10 has been performed, the index value automatically becomes 3, and read operations continue in this loop.
If ordinary reads or writes occur during the above self-indexed read/write process, they do not interfere with the reads and writes of the self-indexed area. The ordinary write instruction format is Ri=R.wp; to write the register with register number 4, the instruction is: R4=R.wp. Correspondingly, the read instruction format is R.rp=Ri; to read the register with register number 5, the instruction is: R.rp=R5.
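The interplay of self-indexed and ordinary accesses in this embodiment can be illustrated with a small software model. It is a sketch under assumptions: it interprets "do not interfere" as meaning that ordinary accesses leave the self-index read and write positions unchanged, it models register contents as plain integers, and the class and method names are hypothetical. Register 20 is used for the ordinary write purely to keep the example easy to follow.

```python
class SelfIndexedRegisterFile:
    """Toy model: 128 registers, self-indexed area 3..10, Step 1."""

    def __init__(self, n: int = 128, start_id: int = 3, depth: int = 8, step: int = 1):
        self.regs = [0] * n
        self.start_id, self.step = start_id, step
        self.bound = start_id + depth  # first index past the area
        self.wr_id = start_id          # next self-indexed write position
        self.rd_id = start_id          # next self-indexed read position

    def _advance(self, idx: int) -> int:
        nxt = idx + self.step
        return self.start_id if nxt >= self.bound else nxt

    def write_self(self, value: int) -> int:
        """Model of Ri=R.wp (I++); returns the register index written."""
        idx = self.wr_id
        self.regs[idx] = value
        self.wr_id = self._advance(idx)
        return idx

    def read_self(self) -> int:
        """Model of R.rp=R (I++); returns the value read."""
        idx = self.rd_id
        self.rd_id = self._advance(idx)
        return self.regs[idx]

    def write(self, i: int, value: int) -> None:
        """Model of an ordinary write such as R4=R.wp."""
        self.regs[i] = value

    def read(self, i: int) -> int:
        """Model of an ordinary read such as R.rp=R5."""
        return self.regs[i]


if __name__ == "__main__":
    rf = SelfIndexedRegisterFile()
    print([rf.write_self(v) for v in range(100, 104)])  # indices written
    rf.write(20, 7)               # ordinary write in between
    print(rf.write_self(104))     # self-indexed sequence continues
    print(rf.read(5))             # ordinary read of register 5
    print([rf.read_self() for _ in range(3)])
```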
The specific embodiments described above further explain the objects, technical solutions and beneficial effects of the present invention. It should be understood that the foregoing are only specific embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.