JP2824484B2

JP2824484B2 - Pipeline processing computer

Info

Publication number: JP2824484B2
Application number: JP22164692A
Authority: JP
Inventors: 達己中田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1992-08-20
Filing date: 1992-08-20
Publication date: 1998-11-11
Anticipated expiration: 2013-11-11
Also published as: JPH0667879A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a data processing apparatus, and more particularly to a pipeline processing computer provided with a scoreboard that checks data dependencies between a preceding instruction and an instruction executed after that preceding instruction.

[0002] [Description of the Related Art] In a computer that performs pipeline processing, when there is a data dependency between a preceding instruction and the instruction executed after it, for example when the result of the first instruction's operation is used as operand data by the next instruction, processing must not violate that dependency. One control method for managing such interdependencies among the data used in operations is to use a scoreboard.

[0003] In this method, a scoreboard is provided as storage that indicates, for each general-purpose register, whether the register is in use or not in use. When an instruction that must write to a general-purpose register is executed, the scoreboard bit corresponding to that register number is set to indicate that a write to the register will take place. When an instruction's result is written to the general-purpose register, the scoreboard bit corresponding to that register number is reset to indicate that the write has completed.

[0004] To check whether a preceding instruction still in execution will rewrite a register, a subsequent instruction reads the scoreboard bit corresponding to the number of each register that holds its operand data. If the bit is set, the rewrite by the preceding instruction has not completed, so the subsequent instruction must wait before starting its operation. If the bit is reset, the preceding instruction will not rewrite the register contents; that is, the subsequent instruction has waited and the rewrite by the preceding instruction has completed, so the operation of the subsequent instruction can start.
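The set, reset, and check behavior described in paragraphs [0003] and [0004] can be sketched as follows. This is a minimal illustration, not the circuit of the patent: the class name `Scoreboard`, the method names, and the register count of 16 are all assumptions made for the example.

```python
class Scoreboard:
    """One busy bit per general-purpose register (hypothetical model)."""

    def __init__(self, num_regs=16):  # register count is an assumption
        self.busy = [False] * num_regs

    def set(self, reg):
        # A preceding instruction is going to write this register.
        self.busy[reg] = True

    def reset(self, reg):
        # The write to this register has completed.
        self.busy[reg] = False

    def must_wait(self, source_regs):
        # A later instruction waits while any operand register is busy.
        return any(self.busy[r] for r in source_regs)


sb = Scoreboard()
sb.set(4)                             # mult gr2, gr3 -> gr4 starts executing
waits_before = sb.must_wait([4, 7])   # add gr4, gr7 -> gr8 has to wait
sb.reset(4)                           # product written to gr4
waits_after = sb.must_wait([4, 7])    # the add may now start
```

The sketch keeps only the bookkeeping; pipeline timing is deliberately left out.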

[0005] The need for scoreboard processing is explained with reference to FIGS. 7 and 8. FIG. 7 shows that data dependencies may not be handled correctly when completion of a register write by a preceding instruction is not awaited. In the figure, the first instruction, a multiply (mult), multiplies the contents of register 2 (gr2) by the contents of register 3 (gr3) and stores the result in register 4. The second instruction, an add (add), adds the contents of register 5 and register 6 and stores the result in register 4. The third instruction, an add located slightly later, adds the contents of register 4, which should hold the result of the second instruction, to the contents of register 7 and stores the result in register 8. If this sequence is executed without a scoreboard, then, as shown in the figure, because the execute (E) stage of the multiply takes three stages, the third instruction reads register 4 while it holds the result of the first instruction's multiplication and performs the addition on it, which gives a wrong result.
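The failure in FIG. 7 can be reproduced with a small timing sketch. The cycle counts used here (one decode cycle, three execute stages for the multiply, one for the add, one instruction issued per cycle) are assumptions chosen to match the description; the function name and register values are hypothetical.

```python
def gr8_without_interlock(gr, mult_e_stages=3, add_e_stages=1):
    """Value the third instruction stores in gr8 when nothing waits."""
    gr = dict(gr)
    # (issue cycle, execute stages, destination, result)
    program = [
        (0, mult_e_stages, 4, gr[2] * gr[3]),   # mult gr2, gr3 -> gr4
        (1, add_e_stages, 4, gr[5] + gr[6]),    # add  gr5, gr6 -> gr4
    ]
    # Write-back cycle = issue + 1 decode cycle + execute stages (assumed).
    writes = sorted((issue + 1 + e, dest, value)
                    for issue, e, dest, value in program)
    for _cycle, dest, value in writes:
        gr[dest] = value            # the later write-back overwrites gr4
    return gr[4] + gr[7]            # third instruction: add gr4, gr7 -> gr8


regs = {2: 2, 3: 3, 5: 10, 6: 20, 7: 100}
wrong_gr8 = gr8_without_interlock(regs)
intended_gr8 = (regs[5] + regs[6]) + regs[7]
```

Because the multiply writes back after the add, `wrong_gr8` is built from the product rather than from the sum the program order calls for.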

[0006] FIG. 8 shows the processing when a subsequent instruction waits for completion of a data write to a register by a preceding instruction. In the figure, the scoreboard bit for register 4 is set to '1' in the first execute (E) stage of the multiply, which is the first instruction. Execution of the second add instruction is interlocked until the multiply's result is written to register 4 and the scoreboard bit becomes '0', that is, until the bit is reset, so the second add instruction is processed with the correct data.

[0007] Setting a bit in the scoreboard indicates that the value of the register cannot be read (or written). Most basic instructions, such as add, generally have only one execute stage, and the result generated in the execute stage can be read as data through the bypass function. An instruction whose execute stage finishes in one stage cannot be overtaken by a subsequent instruction and operate incorrectly as in FIG. 7. Therefore, for basic instructions such as add, that is, the instructions with the shortest execution time, which execute in one flow and need only one execute stage, data interference can be prevented without setting and resetting the scoreboard.

[0008] FIG. 9 is a block diagram of a conventional pipeline computer that uses a scoreboard. In the figure, the pipeline computer comprises an instruction decoder 1 that decodes instruction codes; a general-purpose register file (GR) 2 that stores operand data and operation results; an arithmetic unit 3, for example an adder; an arithmetic unit 4, for example a shifter; operand registers 3a and 3b that hold the operand data, that is, the operands, for arithmetic unit 3; operand registers 4a and 4b that hold the operands for arithmetic unit 4; a scoreboard 5; a pipeline controller 6 that controls pipeline operation; a pipeline tag register 7 that holds pipeline tags under the control of pipeline controller 6; and an AND circuit 8.

[0009] In FIG. 9, instruction decoder 1 decodes the instruction code and yields source register address 0 and source register address 1 as read register numbers and a destination register address as the write register number. The read register numbers are input to the read register ports of GR 2, and read data 0 and read data 1 are held as source operands in operand registers 3a, 3b, 4a, and 4b. The outputs of these operand registers are input to arithmetic units 3 and 4, and the operation result is input as write data to the write data port of GR 2. At the same time as this result, the destination register address serving as the write register number, with its timing aligned by pipeline tag register 7, is input as the write register address to the write register port of GR 2. A write control signal is also generated by instruction decoder 1, timing-aligned by pipeline tag register 7, and input to GR 2, but this control signal is not shown.

[0010] The above is the operation when there is no data interference. When there is data interference, the read register numbers and the write register number output from instruction decoder 1 are input to the read register check ports (RD0 CHK, RD1 CHK) and the write register check port (WR CHK) of scoreboard 5, and each register number is used as a selection signal that selects the register in scoreboard 5 indicating in use or not in use. The signals SRC0 REG BUSY, SRC1 REG BUSY, and WR REG BUSY from the selected registers are input to pipeline controller 6 and are used to decide whether to interlock the decode stage as shown in FIG. 8.

[0011] If any of the three signals to pipeline controller 6 indicates that a scoreboard bit is '1', output of the D-stage release signal, that is, the signal indicating that the decode stage may complete and the instruction may proceed to the execute stage, is suppressed.

[0012] When, for a preceding instruction, the write to the register completes, for example, and the in-use/not-in-use register in scoreboard 5 is reset, so that none of the three signals to pipeline controller 6 indicates a set scoreboard bit, the D-stage release signal is output to the four operand registers. The same signal is applied to the clock-enable terminals of the D flip-flops in pipeline tag register 7 and to AND circuit 8 in front of scoreboard 5.

[0013] Execute-stage processing for the subsequent instruction is thereby performed. At this time, the destination register address from instruction decoder 1, supplied as the other input of AND circuit 8, is applied to the WR SET terminal of scoreboard 5, and the scoreboard bit corresponding to the write register number is set. Further, when the E-stage release signal output from pipeline controller 6 is input to pipeline tag register 7, the destination register address is applied as the write register address to the WR RES terminal of the scoreboard, the corresponding scoreboard bit is reset, and, as described above, this write register address is applied to GR 2.
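The conventional control of paragraphs [0010] to [0013] can be summarized in a short sketch. It collapses all pipeline timing into a single call, so it shows only the order of the check, set, and reset actions; the function names and the list-based `busy` representation are assumptions, and the comments map each step to the signal names used in the text.

```python
def d_stage_release(busy, src0, src1, dest):
    """True when none of SRC0 REG BUSY, SRC1 REG BUSY, WR REG BUSY is '1'."""
    return not (busy[src0] or busy[src1] or busy[dest])


def trace_instruction(busy, src0, src1, dest):
    """List the scoreboard events for one instruction (timing collapsed)."""
    events = []
    if not d_stage_release(busy, src0, src1, dest):
        events.append("interlock in D stage")   # release signal suppressed
        return events
    busy[dest] = True       # WR SET, gated by AND circuit 8 on D-stage release
    events.append("set")
    busy[dest] = False      # WR RES, applied on E-stage release
    events.append("reset")
    return events


busy = [False] * 16
busy[4] = True                                   # earlier write to gr4 pending
blocked = trace_instruction(busy, 4, 7, 8)       # add gr4, gr7 -> gr8
busy[4] = False                                  # the pending write completes
released = trace_instruction(busy, 4, 7, 8)
```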

[0014]

[Problems to Be Solved by the Invention] Before the problems to be solved by the invention are described, multiflow processing and the use of a scoreboard during parallel execution of instructions are explained. Multiflow processing, also called stage expansion processing, is processing in which one instruction is divided into multiple flows and executed as if it were multiple instructions.

[0015] For example, consider an instruction that transfers the values of two 4-byte general-purpose registers (GR) forming a pair to an 8-byte floating-point register (FR). Multiflow processing in this case uses only one port on the general-purpose register side, reads 4 bytes at a time, and transfers them to the floating-point register, doing so twice. FIG. 10 is an example of this processing and shows the contents of the register pair denoted r1 and r1+1 being transferred in two flows.
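The two-flow transfer of FIG. 10 can be sketched as below. The text does not say which register of the pair supplies the upper half of the 8-byte value, so the ordering used here (r1 as the upper 4 bytes) is an assumption, as are the function name and the sample values.

```python
def transfer_pair_to_fr(gr, r1):
    """Move gr[r1] and gr[r1 + 1] into one 8-byte value in two flows."""
    fr = 0
    for flow in range(2):
        # One read port: each flow reads a single 4-byte register.
        word = gr[r1 + flow] & 0xFFFFFFFF
        # Assumed layout: the register read first becomes the upper half.
        fr = (fr << 32) | word
    return fr


gr = {2: 0x3FF00000, 3: 0x00000000}
fr_value = transfer_pair_to_fr(gr, 2)
```

The loop index plays the role of the flow number, which selects a different read register number in each flow.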

[0016] FIG. 11 is an example of multiflow processing for a multiply instruction. Multiply instructions are generally executed infrequently, and a drop in their performance has little effect on the performance of the system as a whole. In FIG. 11, therefore, the two operands for the multiplication are transferred to the multiplier in two flows, which reduces the width of the data bus that carries operand data.

[0017] However, FIG. 12 shows an example in which a multiply instruction that uses such multiflow processing hangs when it sets the scoreboard. FIG. 12 shows a multiply instruction that multiplies the contents of register 2 by the contents of register 3 and stores the result in register 3, executed as two flows. In the first flow, the contents of register 2 are sent to the multiplier, and because the product is to be stored in register 3, the scoreboard bit for register 3 is set in the execute stage of the first flow. Consequently, when the second flow tries to read register 3, it cannot, because the scoreboard bit is set, and it interlocks. The scoreboard bit is reset by the write to register 3, but it is not reset until the result of the second flow's execute stage is available. Because the second flow is interlocked in the decode stage, the bit can never be reset, and the machine hangs. Such a hang is a hardware failure state, also called a deadlock, that is never resolved. By contrast, in the interlock state shown in FIG. 8, processing resumes when the scoreboard bit is reset.
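The hang of FIG. 12 can be illustrated with a bounded loop. This is only a sketch: the cycle limit `max_cycles` and the function name are assumptions, and the loop stands in for a decode stage that retries every cycle.

```python
def first_flow_set_hangs(max_cycles=50):
    """mult gr2, gr3 -> gr3 in two flows, bit set in the first flow."""
    busy = {3: False}
    # First flow: reads gr2, then sets the bit for destination gr3
    # in its execute stage.
    busy[3] = True
    # Second flow: needs to read gr3, but only its own write-back
    # would reset the bit, and it never gets past the decode stage.
    for _cycle in range(max_cycles):
        if not busy[3]:
            return False    # second flow left the decode stage
        # Decode-stage interlock: nothing in flight can reset busy[3].
    return True             # still interlocked after max_cycles


hung = first_flow_set_hangs()
```

No iteration changes `busy[3]`, which is exactly why the interlock never clears.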

[0018] Next, a problem that arises during parallel execution of instructions is described with reference to FIG. 13. In this example an add instruction and a multiply instruction are executed in parallel. The add instruction adds the contents of register 1 and register 2 and stores the sum in register 3; the multiply instruction multiplies the contents of register 2 by the contents of register 3 and stores the product in register 4. The add instruction finishes in the first three stages, whereas the multiply instruction is divided into two flows, that is, it is executed as multiflow processing.

[0019] To avoid the problem described with FIG. 12, suppose the scoreboard bit is set when the second flow of the multiply instruction reads register 3. Even then, the result of the add instruction has already been stored in register 3 by that time, and the multiply instruction is processed using that sum. Thus, for parallel execution of instructions, the scoreboard bit must be set with data interdependencies taken into account, and execution of basic instructions, the instructions with the shortest execution time, must be delayed with data interdependencies taken into account.

[0020] An object of the present invention is to make it possible to execute an instruction that requires multiflow processing while the scoreboard correctly maintains data interdependencies, and to execute processing correctly, without violating data interdependencies, when an instruction executable in a single flow and an instruction that requires multiflow processing are executed in parallel.

[0021]

[Means for Solving the Problems] FIG. 1 is a block diagram of the principle of the present invention. It shows the principle of a pipeline processing computer that has a scoreboard for checking data dependencies between a preceding instruction and the instruction executed after it, and that can execute one instruction divided into multiple flows.

[0022] In FIG. 1, final flow detecting means 10 is, for example, a multiflow controller, and detects the final flow of multiflow processing, in which one instruction is executed as multiple flows. Scoreboard update control means 12 is, for example, two AND circuits provided on the input side of scoreboard 11, and causes the contents of scoreboard 11 corresponding to the data dependency described above to be updated in the final flow of the multiflow processing.

[0023]

[Operation] In the present invention, updating of the scoreboard contents, that is, set/reset processing, is performed only in the final flow of multiflow processing, in which one instruction is executed as multiple flows. The operation of the invention is described with reference to FIG. 2.

[0024] As described with FIG. 1, in the present invention, when final flow detecting means 10 detects the final flow of multiflow processing, the scoreboard bit corresponding to, for example, the register to which the result of that flow's instruction is written is set. This set is performed by scoreboard update control means 12. FIG. 2 processes the same instruction as FIG. 12. In FIG. 12, the scoreboard bit was set to '1' in the execute stage of the first flow, as soon as it was known in the first flow that the product would be stored in register 3. In FIG. 2, the scoreboard is set in the first execute stage of the second flow, that is, the final flow. The contents of register 3 are therefore transferred to the general-purpose register GR in the first execute stage of the second flow; the multiplication is then performed, and the scoreboard bit is reset to '0' when the product is stored in register 3.

[0025] As described above, in the present invention scoreboard set/reset processing is performed only in the final flow of multiflow processing. As a result, processing that does not violate the data dependency between a preceding instruction and a subsequent instruction is guaranteed by processing that includes pipeline interlocks.

[0026] When an instruction that requires multiflow processing and an instruction that does not are executed simultaneously, scoreboard set/reset processing is performed in the final flow of the multiflow processing, and execution of the instruction that does not require multiflow processing is also delayed until that final flow.

[0027]

[Embodiment] FIG. 3 is a block diagram of the overall configuration of a pipeline processing computer according to the present invention. Parts that are the same as in the conventional example of FIG. 9 bear the same reference numerals. FIG. 3 differs from the conventional example of FIG. 9 in that a multiflow controller 20 for detecting the final flow of multiflow processing is added, and in that, on the input side of scoreboard 5, an AND circuit 21 is provided in place of AND circuit 8 and an AND circuit 22 is newly provided.

[0028] Multiflow controller 20 receives from instruction decoder 1 a two-flow operation signal, indicating that the multiflow processing is divided into two flows, or a three-flow operation signal, indicating that it is divided into three flows, and also receives the D-stage release signal from pipeline controller 6. The last-flow detection signal from multiflow controller 20 is input to the two AND circuits 21 and 22, and the flow counter signal is output to instruction decoder 1. AND circuit 21 receives, together with the last-flow detection signal, the destination register address indicating the write register number and the D-stage release signal, as in FIG. 9; when these signals are all present, the write register set input is applied to scoreboard 5. AND circuit 22 receives, together with the last-flow detection signal, the write register address from pipeline tag register 7; when these signals are all present, the write register reset signal is input to scoreboard 5.

[0029] An outline of the processing in each stage of FIG. 3 follows. In the decode (D) stage, the instruction is decoded to determine whether it is a multiflow instruction, how many flows are needed to execute it, and whether it requires a scoreboard update. The decode results, the multiflow counter value, and related information are held in the pipeline tag and used in the subsequent stages.

[0030] The scoreboard bit corresponding to the write register number is checked, and if it is set, the pipeline interlocks in the instruction decode stage. The scoreboard bit corresponding to each read register number is checked, and if it is set, the pipeline interlocks in the instruction decode stage.

[0031] The output of the multiflow counter, which indicates which flow of a multiflow instruction is currently being processed, always shows a value equivalent to executing the first flow when the instruction is not a multiflow instruction. To detect the final flow, the multiflow counter value is compared with the number of flows needed to execute the instruction. For an instruction that is not a multiflow instruction, the multiflow counter always shows a value equivalent to executing the first flow, so the flow is detected as the final flow.

[0032] In the execute (E) stage, in addition to performing the operation, if the instruction in the execute stage requires a scoreboard update and is executing its final flow, the scoreboard bit corresponding to the write register number is set. Some instructions also read registers in this stage.

[0033] In the write (W) stage, in addition to processing such as writing the result obtained in the execute stage to a register, if the instruction in the write (W) stage requires a scoreboard update and is executing its final flow, the scoreboard bit corresponding to the write register number is reset.
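The stage rules of paragraphs [0030], [0032], and [0033] can be sketched for the multiply of FIG. 2 (gr2 times gr3, product to gr3, two flows). The flows are run one after another here, which ignores pipeline overlap; that simplification, the function names, and the list-based scoreboard are all assumptions of this example.

```python
def e_stage(busy, updates_scoreboard, last_flow, dest):
    # Set only for a scoreboard-updating instruction in its final flow.
    if updates_scoreboard and last_flow:
        busy[dest] = True


def w_stage(busy, updates_scoreboard, last_flow, dest):
    # Reset under the same condition, when the result is written.
    if updates_scoreboard and last_flow:
        busy[dest] = False


def run_mult_gr2_gr3_to_gr3():
    busy = [False] * 8
    trace = []
    for flow, src in enumerate([2, 3]):   # flow 0 reads gr2, flow 1 reads gr3
        last_flow = flow == 1
        if busy[src] or busy[3]:          # D-stage check: read and write regs
            trace.append((flow, "interlock"))
            break
        e_stage(busy, True, last_flow, 3)
        trace.append((flow, "busy" if busy[3] else "free"))
        w_stage(busy, True, last_flow, 3)
    return trace, busy[3]


trace, still_busy = run_mult_gr2_gr3_to_gr3()
```

Since the first flow leaves the bit for gr3 untouched, the second flow passes its decode-stage check and the instruction completes.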

[0034] FIG. 4 is a block diagram of an embodiment of the multiflow controller. In the figure, multiflow controller 20 comprises an OR circuit 23 that receives the two-flow operation and three-flow operation signals from instruction decoder 1; an inverter 24 that receives the output of OR circuit 23; an AND circuit 25 that receives the D-stage release signal output by pipeline controller 6 and a clock signal; a flow counter 26; an AND circuit 27 that receives the output of flow counter 26 and the two-flow operation signal; an AND circuit 28 that receives the output of flow counter 26 and the three-flow operation signal; an OR circuit 29 that receives the outputs of inverter 24 and AND circuits 27 and 28; and an inverter 30 that inverts the output of OR circuit 29.

[0035] In FIG. 3, the instruction decoder decodes how many flows the instruction to be executed consists of. For a multiflow instruction, the value of FLOW COUNTER indicates the flow number of the instruction currently being executed. The final flow is detected (LAST FLOW signal), and when the flow is the final one, update control of the program counter and selection of an instruction from the instruction buffer (queue) are performed.

[0036] The instruction decoder uses the value of FLOW COUNTER to change part of the decode result from flow to flow, even for the same instruction. For example, in the GR-to-FR transfer described with FIG. 10, a different read register number is supplied in the first flow and in the second flow, so that two registers are read.

[0037] In FIG. 4, for an instruction processed in a single flow rather than by multiflow processing, instruction decoder 1 outputs neither the two-flow operation signal nor the three-flow operation signal. The output of OR circuit 23 is therefore '0', the output of inverter 24 is '1', and the output of OR circuit 29 indicates that the flow is the last flow.

【００３８】これに対してマルチフロー処理が２つのフ
ローから成る場合には、命令デコーダ１からツーフロー
オペレーション信号がアンド回路２７の最も上の入力端
子に与えられる。そこでアンド回路２７の出力は、上か
ら２番目の入力端子に‘１’、３番目の入力端子に
‘０’が与えられた時に‘１’となり、オア回路２９の
出力はラストフローを示すことになる。すなわちこの時
フローカウンタの出力は‘０１’である。アンド回路２
５に対しては、クロック信号と共にパイプラインコント
ローラ６からのＤステージリリース信号が入力されてお
り、クロック信号の立ち上がり時にＤステージリリース
信号が与えられていればフローカウンタ２６の出力がイ
ンクリメントされる。マルチフロー処理におけるフロー
が‘２’の場合には、最初のフローに対してはフローカ
ウンタ２６の出力は‘０’となっており、第２のフロー
のＤステージが完了したことを示すＤステージリリース
信号の入力時にフローカウンタ２６の出力が‘１’とな
る。これによってオア回路２９からラストフロー検出信
号が出力される。On the other hand, when the multi-flow processing consists of two flows, the instruction decoder 1 supplies the two-flow operation signal to the uppermost input terminal of the AND circuit 27. The output of the AND circuit 27 then becomes '1' when '1' is applied to its second input terminal from the top and '0' to its third input terminal, and the output of the OR circuit 29 indicates the last flow. That is, the output of the flow counter at this time is '01'. The AND circuit 25 receives the clock signal together with the D-stage release signal from the pipeline controller 6; if the D-stage release signal is asserted at the rising edge of the clock signal, the output of the flow counter 26 is incremented. When the multi-flow processing has two flows, the output of the flow counter 26 is '0' for the first flow and becomes '1' when the D-stage release signal indicating that the D stage of the second flow has been completed is input. As a result, the OR circuit 29 outputs the last flow detection signal.

【００３９】マルチフロー処理が３つのフローから成る
場合には、スリーフローオペレーション信号がアンド回
路２８の１番上の入力端子に与えられ、アンド回路２８
の出力はその第２の入力端子への入力が‘０’第３の入
力端子への入力が‘１’となった時に‘１’となる。す
なわちフローカウンタ２６の出力は第３のフローに対し
てＤステージリリース信号が入力された時に‘２’すな
わち‘１０’となっており、この時点でオア回路２９か
らラストフロー検出信号が出力される。When the multi-flow processing consists of three flows, the three-flow operation signal is applied to the uppermost input terminal of the AND circuit 28, and the output of the AND circuit 28 becomes '1' when the input to its second input terminal is '0' and the input to its third input terminal is '1'. That is, the output of the flow counter 26 is '2', i.e. '10', when the D-stage release signal is input for the third flow, and at this point the OR circuit 29 outputs the last flow detection signal.

【００４０】また図４において、クロック信号の立ち上
がり時にＤステージリリース信号が‘１’でラストフロ
ー検出信号が‘１’である時、すなわちラストフローが
検出された後、次のマルチフロー処理の最初のフローに
対してＤステージリリース信号が出力された時にはフロ
ーカウンタがリセットされ、クロック信号の立ち上がり
時にＤステージリリース信号が‘１’、ラストフロー検
出信号が‘０’である時にフローカウンタはインクリメ
ントされ、クロック信号の立ち上がり時にＤステージリ
リース信号が‘０’である時にはフローカウンタの出力
は変化しないことが示されている。この作用はインバー
タ３０によって行われる。FIG. 4 also shows how the flow counter behaves at the rising edge of the clock signal. When the D-stage release signal is '1' and the last flow detection signal is '1', that is, when the D-stage release signal is output for the first flow of the next multi-flow processing after the last flow has been detected, the flow counter is reset. When the D-stage release signal is '1' and the last flow detection signal is '0', the flow counter is incremented. When the D-stage release signal is '0', the output of the flow counter does not change. This behavior is realized by the inverter 30.
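The last-flow detection and counter update described for FIG. 4 can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the function names `last_flow`, `next_counter` and `trace` are hypothetical, and the sketch assumes that the counter holds the zero-based index of the flow currently in the D stage and that the D stage is released on every clock.

```python
def last_flow(counter, two_flow, three_flow):
    """Model of the last flow detection signal (output of OR circuit 29)."""
    single = not (two_flow or three_flow)            # OR circuit 23 + inverter 24
    last_of_two = two_flow and counter == 0b01       # AND circuit 27
    last_of_three = three_flow and counter == 0b10   # AND circuit 28
    return single or last_of_two or last_of_three


def next_counter(counter, d_release, is_last):
    """Counter value after a rising clock edge."""
    if not d_release:
        return counter            # no D-stage release: hold
    return 0 if is_last else counter + 1   # reset on last flow, else increment


def trace(two_flow, three_flow, cycles):
    """Assumption: the D stage is released on every cycle (no interlock)."""
    counter, out = 0, []
    for _ in range(cycles):
        is_last = last_flow(counter, two_flow, three_flow)
        out.append((counter, is_last))
        counter = next_counter(counter, True, is_last)
    return out


if __name__ == "__main__":
    print(trace(False, True, 4))   # three-flow instruction followed by the next one
```

Under these assumptions a three-flow instruction is flagged as last only when the counter reads 2, after which the counter returns to 0 for the next instruction.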

【００４１】図５は本発明におけるスコアボードの実施
例構成ブロック図である。同図において、スコアボード
は２つの４入力１６出力デコーダ３３，３４，１６個の
ＳＲフリップフロップ３５、３個の１６入力１出力セレ
クタ３６，３７および３８から構成されている。FIG. 5 is a block diagram showing the configuration of an embodiment of the scoreboard according to the present invention. In the figure, the scoreboard comprises two 4-input, 16-output decoders 33 and 34, sixteen SR flip-flops 35, and three 16-input, 1-output selectors 36, 37 and 38.

【００４２】図５において、デコーダ３３はアンド回路
２１からの４ビットのＷＲＳＥＴ信号の内容に従っ
て、Ｄステージライトイネーブル信号（ＤＷＥ）の入
力時に１６個のＳＲフリップフロップ３５のいずれかを
セットするものであり、またデコーダ３４はアンド回路
２２の出力する４ビットのＷＲＲＥＳ信号の内容に従
って、Ｗステージライトイネーブル信号（ＷＷＥ）の
入力時にＳＲフリップフロップ３５のいずれかをリセッ
トするものである。In FIG. 5, the decoder 33 sets one of the sixteen SR flip-flops 35, selected by the contents of the 4-bit WR SET signal from the AND circuit 21, when the D-stage write enable signal (DWE) is input. The decoder 34 resets one of the SR flip-flops 35, selected by the contents of the 4-bit WR RES signal output from the AND circuit 22, when the W-stage write enable signal (WWE) is input.

【００４３】次にセレクタ３６は、命令デコーダ１から
出力される読み出しレジスタ番号ＳＲＣＲＥＧＡＤ
Ｒ０信号としてのスコアボード読み出しレジスタ検査ポ
ートへの入力信号ＲＤ０ＣＨＫ４ビットの内容に従っ
て、１６個のＳＲフリップフロップ３５のいずれか１つ
の出力をパイプラインコントローラ６に与えるＳＲＣ０
ＲＥＧＢＵＳＹ信号として出力するものである。同
様にレジスタ３７は読み出しレジスタ検査ポートへの入
力信号ＲＤ１ＣＨＫ４ビットの内容に従って、１６個
のフリップフロップ３５のいずれかの出力をＳＲＣ１
ＲＥＧＢＵＳＹ信号としてパイプラインコントローラ
６に出力し、またセレクタ３８は書き込みレジスタ検査
ポートへの入力信号ＷＲＣＨＫ４ビットの内容に従っ
て、１６個のフリップフロップ３５の出力のいずれかを
ＷＲＲＥＧＢＵＳＹ信号としてパイプラインコント
ローラ６に出力するものである。Next, the selector 36 selects the output of one of the sixteen SR flip-flops 35 according to the contents of the 4-bit input signal RD0 CHK, which is applied to the scoreboard read register check port as the read register number signal SRC_REG_ADR0 output from the instruction decoder 1, and outputs it to the pipeline controller 6 as the SRC0 REG BUSY signal. Similarly, the selector 37 selects the output of one of the sixteen flip-flops 35 according to the contents of the 4-bit input signal RD1 CHK applied to the read register check port and outputs it to the pipeline controller 6 as the SRC1 REG BUSY signal. The selector 38 selects the output of one of the sixteen flip-flops 35 according to the contents of the 4-bit input signal WR CHK applied to the write register check port and outputs it to the pipeline controller 6 as the WR REG BUSY signal.
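As a rough software analogue of the scoreboard of FIG. 5, the sketch below models the sixteen SR flip-flops as a list of booleans, the two decoders as `set` and `reset` methods gated by their write enable signals, and the three selectors as a single `check` method. The class and method names are hypothetical and chosen only for illustration.

```python
class Scoreboard:
    """Illustrative model of the FIG. 5 scoreboard (names are assumptions)."""

    def __init__(self, n_regs=16):
        self.busy = [False] * n_regs      # stands in for SR flip-flops 35

    def set(self, wr_set, d_we):
        """Decoder 33: mark register wr_set busy when the D-stage write enable is on."""
        if d_we:
            self.busy[wr_set] = True

    def reset(self, wr_res, w_we):
        """Decoder 34: clear register wr_res when the W-stage write enable is on."""
        if w_we:
            self.busy[wr_res] = False

    def check(self, rd0_chk, rd1_chk, wr_chk):
        """Selectors 36, 37, 38: busy bits for two source registers and one destination."""
        return self.busy[rd0_chk], self.busy[rd1_chk], self.busy[wr_chk]


if __name__ == "__main__":
    sb = Scoreboard()
    sb.set(3, d_we=True)            # an instruction will write register 3
    print(sb.check(3, 1, 3))        # register 3 is busy as source 0 and as destination
    sb.reset(3, w_we=True)          # the write to register 3 has completed
    print(sb.check(3, 1, 3))
```

The three values returned by `check` play the role of the SRC0, SRC1 and WR busy signals that are passed to the pipeline controller.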

【００４４】図６はパイプラインコントローラの実施例
の構成ブロック図である。図３において、パイプライン
コントローラ６にはスコアボード５からの３つのＢＵＳ
Ｙ信号とその他のインターロック条件信号が与えられる
が、３つのＢＵＳＹ信号とその他のＤステージインター
ロック条件信号はオア回路４０に入力される。アンド回
路４１には、オア回路４０の出力とＤステージバリッド
信号、すなわちＤステージを実行している命令があるこ
とを示す信号と、後述するアンド回路４５の出力が入力
される。アンド回路４１の出力はＤステージを実行して
いる命令が完了したことを示すＤステージリリース信号
であり、その値はＤステージバリッド信号が‘１’であ
り、オア回路４０、及びアンド回路４５の出力が共に
‘０’である時に‘１’となる。すなわちＤステージバ
リッド信号が‘１’であり、Ｄステージが有効であっ
ても、オア回路４０が‘１’を出力し、Ｄステージイン
ターロック条件がある場合、またはアンド回路４５が
‘１’を出力し、Ｅステージで実行中の命令がインター
ロックしている場合にはアンド回路４１の出力は‘１’
とならない。これは例えばＤステージで実行される命令
がＥステージに進んでしまうとＥステージでインターロ
ックしている命令が完了しないまま、例えばレジスタへ
の上書きが行われてしまうためであり、このようなスコ
アボード検査によるインターロックはＤインターロック
条件の１つと考えられる。FIG. 6 is a block diagram showing the configuration of an embodiment of the pipeline controller. In FIG. 3, the pipeline controller 6 receives the three BUSY signals from the scoreboard 5 and other interlock condition signals; the three BUSY signals and the other D-stage interlock condition signals are input to the OR circuit 40. The AND circuit 41 receives the output of the OR circuit 40, the D-stage valid signal (a signal indicating that there is an instruction executing the D stage), and the output of the AND circuit 45 described later. The output of the AND circuit 41 is the D-stage release signal, which indicates that the instruction executing the D stage has completed that stage. It is '1' when the D-stage valid signal is '1' and the outputs of the OR circuit 40 and the AND circuit 45 are both '0'. In other words, even if the D-stage valid signal is '1' and the D stage is valid, the output of the AND circuit 41 does not become '1' when the OR circuit 40 outputs '1' because a D-stage interlock condition exists, or when the AND circuit 45 outputs '1' because the instruction being executed in the E stage is interlocked. This is because, if the instruction in the D stage advanced to the E stage, a register could, for example, be overwritten before the instruction interlocked in the E stage has completed. The interlock resulting from such a scoreboard check is regarded as one of the D-stage interlock conditions.

【００４５】アンド回路４１の出力としてのＤステージ
リリース信号は、フリップフロップ４２を介してＥステ
ージバリッド信号、すなわちＤステージの終了によって
Ｅステージを実行している命令があることを示す信号と
してアンド回路４３、及び４５に与えられる。アンド回
路４３には、図３におけるパイプラインコントローラ６
への他のインターロックコンディション信号としてのＥ
ステージインターロックコンディション信号と、後述す
るアンド回路４８の出力とが与えられており、またこれ
らの２つの信号はオア回路４４を介してアンド回路４５
に与えられている。すなわちアンド回路４５の出力は、
前述のようにＥステージで実行されている命令がＥステ
ージインターロックコンディションによってインターロ
ックしていることを示している。The D-stage release signal output from the AND circuit 41 is supplied via the flip-flop 42 to the AND circuits 43 and 45 as the E-stage valid signal, that is, a signal indicating that, following the end of the D stage, there is an instruction executing the E stage. The AND circuit 43 also receives the E-stage interlock condition signal, which is one of the other interlock condition signals supplied to the pipeline controller 6 in FIG. 3, and the output of the AND circuit 48 described later. These two signals are also supplied to the AND circuit 45 via the OR circuit 44. The output of the AND circuit 45 thus indicates, as described above, that the instruction being executed in the E stage is interlocked.

【００４６】アンド回路４３の出力は、Ｅステージバリ
ッド信号が‘１’でありＥステージインターロックコン
ディション信号とアンド回路４８の出力とが共に‘０’
である時に‘１’となる。アンド回路４３の出力はＥス
テージリリース信号、すなわちＥステージを実行してい
る命令が完了したことを示すものであり、Ｄステージリ
リース信号と同様にＥステージが有効であってもＥステ
ージインターロックコンディションがあるか、またはＷ
ステージで実行されるべき命令がインターロックしてい
る場合にはその値は‘１’とならない。すなわちＥステ
ージで実行されている命令がＷステージに進んでしまう
と、Ｗステージでインターロックしている命令が完了し
ないまま、例えばレジスタへの上書きが行われてしまう
ことになる。The output of the AND circuit 43 is '1' when the E-stage valid signal is '1' and the E-stage interlock condition signal and the output of the AND circuit 48 are both '0'. The output of the AND circuit 43 is the E-stage release signal, which indicates that the instruction executing the E stage has completed that stage. As with the D-stage release signal, even if the E stage is valid, its value does not become '1' when an E-stage interlock condition exists or when the instruction in the W stage is interlocked. Otherwise, the instruction in the E stage would advance to the W stage and a register could, for example, be overwritten before the instruction interlocked in the W stage has completed.

【００４７】アンド回路４３の出力、すなわちＥステー
ジリリース信号はフリップフロップ４６を介してＷステ
ージを実行している命令があることを示すＷステージバ
リッド信号としてアンド回路４７に与えられる。アンド
回路４７には、Ｗステージインターロックコンディショ
ン信号が与えられており、Ｗステージバリッド信号が
‘１’であり、Ｗステージインターロックコンディショ
ン信号が‘０’である時にアンド回路４７からＷステー
ジリリース信号、すなわちＷステージを実行している命
令が完了したことを示す信号が出力される。一方アンド
回路４８には、Ｗステージバリッド信号とＷステージイ
ンターロックコンディション信号とが入力されており、
これらが‘１’である時にはアンド回路４８の出力が
‘１’となり、その出力は前述のようにアンド回路４３
及びオア回路４４に与えられる。The output of the AND circuit 43, i.e. the E-stage release signal, is supplied via the flip-flop 46 to the AND circuit 47 as the W-stage valid signal, which indicates that there is an instruction executing the W stage. The AND circuit 47 also receives the W-stage interlock condition signal. When the W-stage valid signal is '1' and the W-stage interlock condition signal is '0', the AND circuit 47 outputs the W-stage release signal, i.e. a signal indicating that the instruction executing the W stage has completed. The AND circuit 48, on the other hand, receives the W-stage valid signal and the W-stage interlock condition signal; when both are '1', the output of the AND circuit 48 becomes '1', and this output is supplied to the AND circuit 43 and the OR circuit 44 as described above.
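The combinational part of the pipeline controller of FIG. 6 can be summarized by the following sketch. It is a simplified model under stated assumptions: the function name `release_signals` is hypothetical, the flip-flops 42 and 46 are omitted (the three valid signals are passed in directly), and `d_interlock` stands for the already ORed result of the three BUSY signals and the other D-stage interlock conditions.

```python
def release_signals(d_valid, e_valid, w_valid,
                    d_interlock, e_interlock, w_interlock):
    """Return (d_release, e_release, w_release) for one clock cycle.

    d_interlock models the output of OR circuit 40; the valid inputs model
    the stage valid signals that the real circuit derives via flip-flops.
    """
    w_stalled = w_valid and w_interlock                        # AND circuit 48
    w_release = w_valid and not w_interlock                    # AND circuit 47
    e_release = e_valid and not e_interlock and not w_stalled  # AND circuit 43
    e_stalled = e_valid and (e_interlock or w_stalled)         # OR 44 + AND 45
    d_release = d_valid and not d_interlock and not e_stalled  # AND circuit 41
    return d_release, e_release, w_release


if __name__ == "__main__":
    # A W-stage interlock holds back every earlier stage that is occupied.
    print(release_signals(True, True, True, False, False, True))
```

In this model a stall propagates backwards only through occupied stages: if the E stage is empty, a W-stage interlock does not prevent the D stage from releasing.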

【００４８】次に並列命令実行時における本発明の実施
例について説明する。図１３で説明した問題点を解決す
るためには、マルチフロー処理を必要とする命令と、単
一のフローのみの命令とを並列実行する場合には、マル
チフロー処理の最終フローでスコアボードのセット／リ
セットを行うと共に、単一のフローのみの命令の実行を
その最終フローの時点まで遅延させることが必要にな
る。Next, an embodiment of the present invention for parallel instruction execution will be described. To solve the problem described with reference to FIG. 13, when an instruction requiring multi-flow processing and a single-flow instruction are executed in parallel, the scoreboard must be set/reset in the final flow of the multi-flow processing, and execution of the single-flow instruction must be delayed until that final flow.

【００４９】従って、例えばロード命令（メモリアクセ
ス命令）と乗算命令とを同時に実行する場合には単一フ
ローとしてのロード命令の実行とスコアボードのセット
／リセットを乗算命令の最終フローまで遅延させること
によってデータの相互干渉を防止する処理が可能とな
る。Therefore, for example, when a load instruction (memory access instruction) and a multiplication instruction are executed simultaneously, mutual interference of data can be prevented by delaying both the execution of the load instruction, which is a single-flow instruction, and its scoreboard set/reset until the final flow of the multiplication instruction.
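The delay rule for parallel execution can be illustrated with a small scheduling sketch. It is a toy model with hypothetical names (`parallel_schedule`, the instruction labels `"mul"` and `"load"`), and it assumes exactly one multi-flow instruction issued together with any number of single-flow instructions, with one flow per slot.

```python
def parallel_schedule(multi_flows, single_names, multi_name="mul"):
    """List, per flow slot, which instructions execute and whether the
    scoreboard is updated. Single-flow instructions are held back until
    the final flow of the multi-flow instruction."""
    schedule = []
    for slot in range(multi_flows):
        is_final = slot == multi_flows - 1
        executing = [multi_name] + (list(single_names) if is_final else [])
        schedule.append({
            "slot": slot,
            "executing": executing,
            "scoreboard_update": is_final,   # set/reset only in the final flow
        })
    return schedule


if __name__ == "__main__":
    for entry in parallel_schedule(3, ["load"]):
        print(entry)
```

With a three-flow multiplication and a load, the load appears only in the last slot, which is also the only slot in which the scoreboard is updated.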

【００５０】以上説明した実施例においては、演算に用
いるデータは２個であり、従って読み出しレジスタが２
個、書き込みレジスタが１個の場合を説明したが、これ
らのデータ及びレジスタの数がこれに限定されないこと
は当然である。またこれらのレジスタ番号（アドレス）
が４ビットであり、従って図５におけるスコアボード内
のフリップフロップが１６個の場合を説明したが、これ
らのビット数及びフリップフロップの数もこれに限定さ
れないことは当然である。更にマルチフローのフローの
数を２、または３として図４のマルチフローコントロー
ラを説明したが、マルチフロー処理のフローの数がこれ
に限定されないことは当然である。In the embodiment described above, two data items are used in an operation, so the case of two read registers and one write register has been described; the numbers of data items and registers are of course not limited to these. Likewise, the register numbers (addresses) are 4 bits wide, so the case of sixteen flip-flops in the scoreboard of FIG. 5 has been described, but the number of bits and the number of flip-flops are of course not limited to these. Furthermore, the multi-flow controller of FIG. 4 has been described for multi-flow processing with two or three flows, but the number of flows is of course not limited to these.

【００５１】[0051]

【発明の効果】以上詳細に説明したように、本発明によ
れば１つの命令を複数のフローに分けて処理するマルチ
フロー処理に対してもスコアボードを使用してデータの
相互依存関係を崩すことなく処理を行うことが可能とな
り、また命令を並列に実行する場合にもマルチフロー処
理を必要とする命令に対してはスコアボードのセット／
リセットを最終フローにおいて行い、マルチフロー処理
を必要としない命令をその最終フローまで実行を遅延さ
せることによってデータの相互干渉を解決することが可
能となる。As described above in detail, according to the present invention, even for multi-flow processing in which one instruction is divided into a plurality of flows, processing can be performed using a scoreboard without breaking the interdependency of data. Moreover, when instructions are executed in parallel, mutual interference of data can be resolved by setting/resetting the scoreboard in the final flow for an instruction requiring multi-flow processing and by delaying the execution of an instruction that does not require multi-flow processing until that final flow.

[Brief description of the drawings]

【図１】本発明の原理ブロック図である。FIG. 1 is a principle block diagram of the present invention.

【図２】本発明における乗算命令の実行の様子を示す図
である。FIG. 2 is a diagram showing a state of execution of a multiplication instruction in the present invention.

【図３】本発明におけるパイプライン処理計算機の実施
例の構成を示すブロック図である。FIG. 3 is a block diagram illustrating a configuration of an embodiment of a pipeline processing computer according to the present invention.

【図４】マルチフローコントローラの実施例の構成を示
すブロック図である。FIG. 4 is a block diagram illustrating a configuration of an embodiment of a multi-flow controller.

【図５】スコアボードの実施例の構成を示すブロック図
である。FIG. 5 is a block diagram showing a configuration of an embodiment of a scoreboard.

【図６】パイプラインコントローラの実施例の構成を示
すブロック図である。FIG. 6 is a block diagram illustrating a configuration of an embodiment of a pipeline controller.

【図７】後続の命令が先行命令の完了を待たない場合の
動作を示す図である。FIG. 7 is a diagram illustrating an operation when a subsequent instruction does not wait for completion of a preceding instruction.

【図８】スコアボードを用いた処理の動作を示す図であ
る。FIG. 8 is a diagram illustrating an operation of a process using a scoreboard.

【図９】パイプライン処理計算機の従来例の構成を示す
ブロック図である。FIG. 9 is a block diagram showing a configuration of a conventional example of a pipeline processing computer.

【図１０】マルチフロー処理の例を示す図である。FIG. 10 is a diagram illustrating an example of multiflow processing.

【図１１】乗算命令のマルチフロー処理の例を示す図で
ある。FIG. 11 is a diagram illustrating an example of multiflow processing of a multiplication instruction.

【図１２】乗算命令のハングアップ状態の例を示す図で
ある。FIG. 12 is a diagram illustrating an example of a hang-up state of a multiplication instruction.

【図１３】命令の並列実行における問題点を説明する図
である。FIG. 13 is a diagram illustrating a problem in parallel execution of instructions.

[Explanation of symbols]

１ 命令デコーダ
２ 汎用レジスタ（ＧＲ）
３，４ 演算器
５，１１ スコアボード
６ パイプラインコントローラ
７ パイプラインタグレジスタ
１０ 最終フロー検出手段
１２ スコアボード更新制御手段
２０ マルチフローコントローラ

1 Instruction decoder
2 General-purpose register (GR)
3, 4 Operation unit
5, 11 Scoreboard
6 Pipeline controller
7 Pipeline tag register
10 Final flow detection means
12 Scoreboard update control means
20 Multi-flow controller

Claims

(57) [Claims]

1. A pipeline processing computer which includes a scoreboard for checking a data dependency between a preceding instruction and an instruction executed subsequent to the preceding instruction, and which can execute a single instruction by dividing it into a plurality of flows, the pipeline processing computer comprising: final flow detection means (10) for detecting a final flow of the plurality of flows; and scoreboard update control means (12) for updating, in the detected final flow, the stored contents of the scoreboard (11) corresponding to the dependency, whereby processing that does not disturb the dependency is guaranteed.

2. The pipeline processing computer according to claim 1, wherein, when the instructions executed in parallel by the pipeline processing computer include at least one instruction executed in a plurality of flows, the stored contents of the scoreboard are updated, for the instruction executed in a plurality of flows, in the final flow of the plurality of flows, and, for an instruction that can be executed in a single flow, the execution of the instruction and the update of the stored contents of the scoreboard are delayed until the final flow of the instruction executed in a plurality of flows.