JPS6134186B2

Info

Publication number: JPS6134186B2
Application number: JP56047774A
Authority: JP
Inventors: Shigeaki Okuya; Tetsuo Okagata
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1981-03-30
Filing date: 1981-03-30
Publication date: 1986-08-06
Also published as: JPS57161938A

Description

[Detailed description of the invention]

The present invention relates to an instruction control system for a data processing device, for example a vector processing device, that allows a later instruction to be executed ahead of an earlier instruction.

A data processing device that operates on the corresponding elements of a second operand A (a₀, a₁, ..., aₙ₋₁) and an operand B (b₀, b₁, ..., bₙ₋₁), each having a plurality of elements, to obtain a first operand C (c₀, c₁, ..., cₙ₋₁) as the result is called a vector processing device. By contrast, a conventional general-purpose processing device in which the number of elements is limited to one (n = 1) is called a scalar processing device.

FIG. 1 shows an overview of a vector processing device, in which 1 denotes a main memory, 2 a main memory control device, 3 the vector processing device, 4 a store processing unit, 5 a load processing unit, 6 vector registers, 7 a multiplier, 8 an adder, and 9 an instruction control unit. Solid arrows indicate the flow of data, and dotted arrows indicate the flow of control signals.
The store processing unit 4 stores data from the vector registers into the main memory 1, and the load processing unit 5 reads data from the main memory 1 and stores it in the vector registers 6. The vector registers 6 comprise a plurality of vector registers, each holding vector data consisting of a plurality of elements. The store processing unit 4, the load processing unit 5, the multiplier 7, and the adder 8 have a pipeline structure. The instruction control unit 9 controls the vector registers 6, the store processing unit 4, the load processing unit 5, the multiplier 7, the adder 8, and so on. The present invention relates to this instruction control unit. In the present invention, the store processing unit 4, the load processing unit 5, the multiplier 7, and the adder 8 are collectively referred to as arithmetic processing units.

A vector instruction has an instruction code, a first operand designation, a second operand designation, and a third operand designation. A vector multiplication instruction is written, for example, as VM 1,2,3. It multiplies the contents of vector register 2 by the contents of vector register 3 and places the result in vector register 1. A vector addition instruction is written, for example, as VA 4,5,1. It adds the contents of vector register 5 and vector register 1 and places the result in vector register 4.

When vector instructions are processed, the multiplier 7, the adder 8, and the like are pipelined so that a subsequent element is fed in before the operation on the preceding element has completed. FIG. 2 shows how a vector addition instruction is processed in the adder 8, which forms a six-stage pipeline:

(1) reading the data (READ)
(2) comparing the exponents of the two operands (COMPARE)
(3) shifting to align the exponents (PRE-SHIFT)
(4) adding (ADD)
(5) shifting to normalize the result (POST-SHIFT)
(6) writing the data (WRITE)

The processing of an instruction is therefore represented by a parallelogram, as shown in FIG. 2.

Next, the problems with the conventional instruction control system are described. Suppose there is a vector instruction sequence

VM 1,2,3
VA 4,5,1
VA 7,8,9

FIG. 3 shows how this sequence is processed conventionally, assuming that the vector multiplication instruction is an 11-stage pipeline and that the number of elements is 8. In FIG. 3, VA 4,5,1 starts at time 11 because it uses the result data of VM 1,2,3.

VA 7,8,9, on the other hand, has no vector register interference with the preceding instructions, so the instruction processing time can be shortened by executing VA 7,8,9 before VA 4,5,1. FIG. 4 shows this case. As is clear from FIGS. 3 and 4, the instruction processing time in FIG. 4 is nine cycles shorter than in FIG. 3. No vector processing device that changes the order of the vector instructions specified by the program in this way has existed so far.
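
The timing argument for the VM 1,2,3 / VA 4,5,1 / VA 7,8,9 example can be illustrated with a small model. The Python sketch below is an illustration only and is not taken from the patent. It assumes that a pipeline of depth p needs p + n - 1 cycles for n elements, that a dependent instruction may start once the first result of its producer has been written (which matches VA 4,5,1 starting at time 11), and that a unit accepts a new instruction every n cycles. All names (`Instr`, `finish_time`, and so on) are hypothetical.

```python
from dataclasses import dataclass

N_ELEMENTS = 8                   # elements per vector, as in the example
STAGES = {"VM": 11, "VA": 6}     # pipeline depths given in the text


@dataclass(frozen=True)
class Instr:
    op: str     # "VM" or "VA"; also names the unit (multiplier or adder)
    dst: int    # first operand: result vector register
    src1: int   # second operand
    src2: int   # third operand


def pipeline_cycles(stages: int, n: int) -> int:
    """Cycles a pipeline of the given depth needs to process n elements."""
    return stages + n - 1


def conflicts(earlier: Instr, later: Instr) -> bool:
    """True if `later` must not overtake `earlier` (shared result or source register)."""
    return (earlier.dst in (later.src1, later.src2)
            or later.dst in (earlier.dst, earlier.src1, earlier.src2))


def finish_time(program, reorder: bool) -> int:
    """Cycle by which the last instruction has completed, under the assumed model."""
    unit_free = {"VM": 0, "VA": 0}   # earliest start cycle per unit
    first_write = {}                 # register -> cycle from which it can be read
    pending = list(program)
    last_start = 0
    end = 0

    def ready(i: Instr) -> int:
        sources = [first_write.get(r, 0) for r in (i.src1, i.src2)]
        return max(unit_free[i.op], *sources)

    while pending:
        if reorder:
            # any instruction that conflicts with no earlier pending one may go
            candidates = [i for k, i in enumerate(pending)
                          if not any(conflicts(e, i) for e in pending[:k])]
            pick = min(candidates, key=ready)     # ties keep program order
            start = ready(pick)
        else:
            pick = pending[0]
            start = max(ready(pick), last_start)  # never start before a predecessor
        pending.remove(pick)
        unit_free[pick.op] = start + N_ELEMENTS
        first_write[pick.dst] = start + STAGES[pick.op]
        last_start = start
        end = max(end, start + pipeline_cycles(STAGES[pick.op], N_ELEMENTS))
    return end
```

For the three-instruction example, `finish_time(program, reorder=False)` gives 32 and `finish_time(program, reorder=True)` gives 24 under these assumptions. The model is too coarse to reproduce the nine-cycle difference read from FIGS. 3 and 4; it only shows why letting the independent addition overtake the dependent one shortens the total time.
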
For scalar processing devices, there is a scheme in which each arithmetic processing unit is provided with operand registers that hold the operands of a plurality of instructions, together with an operand reservation unit having flag registers that identify the result operand of a preceding instruction for the operands of subsequent instructions that use that result, and a common bus that sends the result data of each arithmetic processing unit to the operand reservation units. Applying this scheme to a vector processing device would make the operand reservation unit enormous and needlessly increase the amount of hardware, so it is not advisable.

The present invention is based on the above considerations, and its object is to realize, with a small amount of hardware, an instruction control system that executes a later instruction ahead of an earlier one and thereby shortens the instruction processing time. To that end, the instruction control system of the present invention is characterized in that a vector processing device having vector registers and a plurality of pipelined arithmetic processing units that process the vector data in the vector registers as operand data is provided with: an instruction register in which a vector instruction is set; a plurality of waiting registers; an input control circuit that stores the vector instruction information of the instruction register in a waiting register; executing-instruction information holding registers that hold the vector instruction information of vector instructions being executed; an instruction issue control circuit that issues the vector instruction information in a waiting register and stores the issued vector instruction information in the corresponding executing-instruction information holding register; and comparison means that compares the vector instruction information in the waiting registers with the vector instruction information in the executing-instruction information holding registers; and in that the instruction issue control circuit issues, from among the vector instruction information in the plurality of waiting registers, the vector instruction information that the comparison means has determined to have no factor preventing the start of execution.

The present invention is described below with reference to the drawings. FIG. 5 is a block diagram of a first embodiment of the instruction control unit of the present invention, FIG. 6 is a time chart for the case where an instruction is overtaken, and FIGS. 7 and 8 are block diagrams of one embodiment of the register interference check circuits.

In FIG. 5, 10 denotes a fetch register,
11 denotes a waiting-register input control circuit, 12-1 and 12-2 waiting registers, 13-1 and 13-2 register interference check circuits, 14 a selector, 15 an instruction issue control circuit, 16 a multiplication register, 17 an addition register, 18-1 and 18-2 register interference check circuits, and 19 a selection control circuit.

Instruction information fetched from the main memory 1 is set in the fetch register 10. The waiting-register input control circuit 11 moves the instruction information in the fetch register 10 to the waiting register 12-1 or 12-2 on condition, among other things, that the register interference check circuits 13-1 and 13-2 indicate no register interference. The register interference check circuits 13-1 and 13-2 are described in detail later. The selector 14 selects the waiting register 12-1 or 12-2 according to a control signal from the selection control circuit 19. The selection control circuit 19 inverts the value of the control signal every cycle. On condition, among other things, that both register interference check circuits 18-1 and 18-2 indicate no register interference, the instruction issue control circuit 15 moves the output of the selector 14 to the addition register 17 when it is a vector addition instruction, or to the multiplication register 16 when it is a vector multiplication instruction, and at the same time sends start-up information to the arithmetic processing unit. The register interference check circuits 18-1 and 18-2 are described in detail later.

FIG. 6 is a time chart for the case where an instruction is overtaken. The instruction information VM 1,2,3 is set in the fetch register 10 at time T = 0, moved to the waiting register 12-1 at T = 1, and moved to the multiplication register 16 at T = 2, whereupon multiplication starts. At the same time as multiplication starts, a WRITE-not-started flag is turned on. This flag is turned off when writing of the operation result to the vector register begins. The instruction information VA 4,5,1 is set in the fetch register 10 at T = 1 and in the waiting register 12-2 at T = 2. Because the first operand register of VM 1,2,3 held in the multiplication register 16 interferes with vector register 1 used by VA 4,5,1 held in the waiting register 12-2, issue of the instruction in the waiting register 12-2 is held up. At T = 2, VA 7,8,9 is set in the fetch register 10. Because there is no register interference between VA 4,5,1 held in the waiting register 12-2 and VA 7,8,9 held in the fetch register 10, VA 7,8,9 is moved to the waiting register 12-1 at T = 3. Since nothing prevents its issue, VA 7,8,9 is moved to the addition register 17 at T = 4, and the adder 8 is started.
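
The register moves of FIG. 6 can be reproduced with a small cycle-by-cycle simulation. The Python sketch below is only an illustration under stated assumptions: the class and function names are invented here, the two interference checks are reduced to the register-number comparisons the text attributes to circuits 13-1/13-2 and 18-1/18-2, the selector is assumed to point at waiting register 12-1 on odd cycles, and only the first few cycles are simulated, so the flag that stays on until WRITE begins is treated as on throughout and no unit ever becomes free again.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    op: str   # "VM" (multiplication) or "VA" (addition)
    r1: int   # first operand: result register
    r2: int   # second operand
    r3: int   # third operand


def blocks_input(fetched: Instr, waiting: Optional[Instr]) -> bool:
    """Check 13-x: fetched result register vs. all operands of a waiting instruction."""
    return waiting is not None and fetched.r1 in (waiting.r1, waiting.r2, waiting.r3)


def blocks_issue(candidate: Instr, executing: Optional[Instr]) -> bool:
    """Check 18-x, with the WRITE-not-started flag assumed on for the whole run."""
    return executing is not None and executing.r1 in (candidate.r2, candidate.r3)


def simulate(program, cycles):
    """Return a log of (time, instruction, location) for the FIG. 5 structure."""
    todo = list(program)
    fetch = None
    wait = [None, None]                  # waiting registers 12-1 and 12-2
    unit = {"VM": None, "VA": None}      # multiplication register 16, addition register 17
    log = []
    for t in range(cycles):
        # State at time t: the fetch register takes the next instruction if empty.
        if fetch is None and todo:
            fetch = todo.pop(0)
            log.append((t, fetch, "fetch"))
        # Decisions for the transition t -> t+1 are based on the state at time t.
        sel = 0 if t % 2 == 1 else 1     # assumed phase of selection control circuit 19
        cand = wait[sel]
        issue = (cand is not None and unit[cand.op] is None
                 and not any(blocks_issue(cand, e) for e in unit.values()))
        free = [k for k in (0, 1) if wait[k] is None]
        move = (fetch is not None and bool(free)
                and not any(blocks_input(fetch, w) for w in wait))
        if issue:
            unit[cand.op] = cand
            wait[sel] = None
            log.append((t + 1, cand, "exec"))
        if move:
            wait[free[0]] = fetch
            log.append((t + 1, fetch, f"wait{free[0] + 1}"))
            fetch = None
    return log
```

Running `simulate` on VM 1,2,3, VA 4,5,1, VA 7,8,9 for four cycles places VM 1,2,3 in the multiplication register at T = 2, holds VA 4,5,1 in the second waiting register, and moves VA 7,8,9 to the addition register at T = 4, in line with the walk-through of FIG. 6.
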
FIG. 7 shows one embodiment of the register interference check circuit 18-1. The register interference check circuits 18-1 and 18-2 have the same configuration. In FIG. 7, 20 and 21 denote matching circuits, 22 and 23 AND circuits, and 24 an OR circuit. The matching circuit 20 receives the first operand register number of the instruction being executed and the second operand register number of the instruction to be issued, and the matching circuit 21 receives the first operand register number of the instruction being executed and the third operand register number of the instruction to be issued. The AND circuit 22 receives the WRITE-not-started flag of the instruction being executed and the output of the matching circuit 20, and the AND circuit 23 receives the same WRITE-not-started flag and the output of the matching circuit 21. The outputs of the AND circuits 22 and 23 are input to the OR circuit 24. An output of logic "1" from the OR circuit 24 indicates register interference.
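
The logic described for FIG. 7 is small enough to write out gate by gate. The following Python sketch mirrors that description; the function and parameter names are illustrative and do not come from the patent.

```python
def interference_with_executing(exec_r1: int,
                                write_not_started: bool,
                                cand_r2: int,
                                cand_r3: int) -> bool:
    """Register interference check 18-x as described for FIG. 7.

    exec_r1            first operand (result) register of the executing instruction
    write_not_started  flag of the executing instruction, on until WRITE begins
    cand_r2, cand_r3   second and third operand registers of the instruction to be issued
    """
    match_20 = exec_r1 == cand_r2                # matching circuit 20
    match_21 = exec_r1 == cand_r3                # matching circuit 21
    and_22 = write_not_started and match_20      # AND circuit 22
    and_23 = write_not_started and match_21      # AND circuit 23
    return and_22 or and_23                      # OR circuit 24: True = interference
```

With VM 1,2,3 executing, `interference_with_executing(1, True, 5, 1)` reports interference for VA 4,5,1, and the same call with the flag off does not, which is why VA 4,5,1 can start as soon as the multiplier begins writing its result.
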
FIG. 8 shows one embodiment of the register interference check circuit 13-1. In FIG. 8, 25 to 27 denote matching circuits and 28 an OR circuit. The first operand register number of the instruction register (fetch register) 10 is input to one input terminal of each of the matching circuits 25, 26, and 27. The first operand register number of the waiting register is input to the other input terminal of the matching circuit 25, the second operand register number of the waiting register to the other input terminal of the matching circuit 26, and the third operand register number of the waiting register to the other input terminal of the matching circuit 27. The outputs of the matching circuits 25, 26, and 27 are input to the OR circuit 28. An output of logic "1" from the OR circuit 28 indicates register interference.
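
The check described for FIG. 8 can be sketched in the same way. The Python function below follows the text literally, comparing only the first operand register of the fetched instruction with the three operand registers of one waiting instruction; the names are illustrative, and the handling of an empty waiting register is not modelled.

```python
def interference_with_waiting(fetch_r1: int, waiting_regs: tuple) -> bool:
    """Register interference check 13-x as described for FIG. 8.

    fetch_r1      first operand register number of the instruction in fetch register 10
    waiting_regs  (first, second, third) operand register numbers of the
                  instruction held in one waiting register
    """
    w1, w2, w3 = waiting_regs
    match_25 = fetch_r1 == w1      # matching circuit 25
    match_26 = fetch_r1 == w2      # matching circuit 26
    match_27 = fetch_r1 == w3      # matching circuit 27
    return match_25 or match_26 or match_27   # OR circuit 28: True = interference
```

For example, `interference_with_waiting(7, (4, 5, 1))` is false, so VA 7,8,9 may be placed in a waiting register while VA 4,5,1 is still waiting.
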
FIG. 9 is a block diagram of a second embodiment of the present invention, FIG. 10 is a flowchart showing the logic of the priority setting circuit, and FIG. 11 is a block diagram of one embodiment of the register interference check circuit.

In FIG. 9, 110 denotes an instruction fetch register, 111 a waiting-register input control circuit, 112-1 and 112-2 waiting registers, 113 a register interference check circuit, 114 a selector, 115 an instruction issue control circuit, 116 a multiplication register, 117 an addition register, 118-1 and 118-2 register interference check circuits, 120 a priority setting circuit, and 121 a priority flip-flop. The elements denoted by 110, 112-1, 112-2, 114, 116, 117, 118-1, and 118-2 are the same as those denoted by 10, 12-1, 12-2, 14, 16, 17, 18-1, and 18-2, respectively.

The second embodiment of FIG. 9 assigns priorities to the instruction information stored in the waiting registers 112-1 and 112-2 and, when there is register interference between these pieces of instruction information, executes the instruction with the higher priority first. In this second embodiment, when the waiting register 112-1 or 112-2 is empty, the waiting-register input control circuit 111 moves the instruction information in the fetch register 110 to the empty waiting register without checking for register interference.

FIG. 10 is a flowchart showing the logic of the priority setting circuit 120, in which Q₁ denotes the waiting register 112-1 and Q₂ denotes the waiting register 112-2. The priority is set as follows.

(a) Check whether instruction information is set in Q₁. If yes, go to (b); if no, go to (e).
(b) Check whether Q₂ is valid. If no, go to (d); if yes, go to (g).
(c) Check whether Q₁ is released. Release means that the instruction issue control circuit 115 takes in the instruction in Q₁ and sets it in the multiplication register 116 or the addition register 117. If no, go to (d); if yes, go to (h).
(d) Make the priority of Q₁ higher than that of Q₂.
(e) Check whether instruction information is set in Q₂. If no, go to (i); if yes, go to (f).
(f) Check whether Q₁ is valid. If no, go to (h); if yes, go to (c).
(g) Check whether Q₂ is released. If no, go to (h); if yes, go to (d).
(h) Make the priority of Q₂ higher than that of Q₁.
(i) Check whether the priority of Q₁ is higher than that of Q₂. If yes, go to (d); if no, go to (h).
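
The decision steps of the FIG. 10 flowchart collapse into a short function. The Python sketch below is one way to express them; the function name, the boolean encoding of the inputs, and the assumption that the priority flip-flop 121 stores the previous result as "Q₁ higher" are illustrative choices, not details given in the patent.

```python
def q1_has_priority(set_q1: bool, set_q2: bool,
                    q1_valid: bool, q2_valid: bool,
                    q1_released: bool, q2_released: bool,
                    q1_was_higher: bool) -> bool:
    """One evaluation of the FIG. 10 flowchart.

    Returns True when Q1 gets the higher priority (step (d)) and False when
    Q2 gets it (step (h)). q1_was_higher is the current priority state
    (assumed encoding: True = Q1 higher).
    """
    if set_q1:                     # (a) a new instruction enters Q1
        if not q2_valid:           # (b) Q2 holds nothing
            return True            # (d)
        return q2_released         # (g) released -> (d), otherwise (h)
    if set_q2:                     # (e) a new instruction enters Q2
        if not q1_valid:           # (f) Q1 holds nothing
            return False           # (h)
        return not q1_released     # (c) not released -> (d), otherwise (h)
    return q1_was_higher           # (i) nothing new: keep the current order
```

The effect is that the register holding the older instruction keeps the higher priority until that instruction is released to an arithmetic unit.
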
FIG. 11 is a block diagram of one embodiment of the register interference check circuit 113. In FIG. 11, 122 to 126 denote matching circuits, 127 an OR circuit, 128 and 129 AND circuits, and 130 an OR circuit.

The first operand register number of the waiting register 112-2 is input to one input terminal of each of the matching circuits 122, 123, and 124. The second operand register number of the waiting register 112-1 is input to the other input terminal of the matching circuit 122, the third operand register number of the waiting register 112-1 to the other input terminal of the matching circuit 123, and the first operand register number of the waiting register 112-1 to the other input terminal of the matching circuit 124. The first operand register number of the waiting register 112-1 is input to one input terminal of each of the matching circuits 125 and 126. The second operand register number of the waiting register 112-2 is input to the other input terminal of the matching circuit 125, and the third operand register number of the waiting register 112-2 to the other input terminal of the matching circuit 126. The outputs of the matching circuits 122 to 126 are input to the OR circuit 127. A logic "1" on the negated output of the OR circuit 127 indicates no register interference, and a logic "1" on its affirmative output indicates register interference. The AND circuit 128 outputs logic "1" on condition that there is register interference, that the waiting register 112-1 is selected, and that the waiting register 112-1 has the higher priority. The AND circuit 129 outputs logic "1" on condition that there is register interference, that the waiting register 112-2 is selected, and that the waiting register 112-2 has the higher priority. The negated output of the OR circuit 127, the output of the AND circuit 128, and the output of the AND circuit 129 are input to the OR circuit 130. The output of the OR circuit 130 serves as an instruction-executable signal, which is sent to the instruction issue control circuit 115. On receiving an instruction-executable signal of logic "1", the instruction issue control circuit 115 takes in the instruction information output by the selector 114 and performs the register interference check and so on.

In the above description, the priority is determined by the priority setting circuit 120 having the logic shown in FIG. 10, but it is also possible to assign an instruction number when an instruction is moved from the fetch register 110 into the waiting register 112-1 or 112-2 and to determine the priority from that number.
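
The combinational logic described for the register interference check circuit 113 of FIG. 11 can be written out as follows. This Python sketch follows the five comparisons and the AND/OR structure given in the text; the function name, the tuple layout, and the reduction of "selected" and "higher priority" to two booleans are assumptions made for illustration, and both waiting registers are assumed to hold an instruction.

```python
def executable(q1: tuple, q2: tuple, q1_selected: bool, q1_higher: bool) -> bool:
    """Instruction-executable signal of circuit 113 as described for FIG. 11.

    q1, q2       (first, second, third) operand register numbers held in
                 waiting registers 112-1 and 112-2
    q1_selected  True if the selector currently selects 112-1, else 112-2
    q1_higher    True if 112-1 has the higher priority, else 112-2
    """
    q1_r1, q1_r2, q1_r3 = q1
    q2_r1, q2_r2, q2_r3 = q2
    matches = (
        q2_r1 == q1_r2,    # matching circuit 122
        q2_r1 == q1_r3,    # matching circuit 123
        q2_r1 == q1_r1,    # matching circuit 124
        q1_r1 == q2_r2,    # matching circuit 125
        q1_r1 == q2_r3,    # matching circuit 126
    )
    interference = any(matches)                                        # OR circuit 127
    and_128 = interference and q1_selected and q1_higher               # AND circuit 128
    and_129 = interference and (not q1_selected) and (not q1_higher)   # AND circuit 129
    return (not interference) or and_128 or and_129                    # OR circuit 130
```

When the two waiting instructions share no registers in the compared positions, either may be issued; when they interfere, only the selected instruction that also holds the higher priority is passed on to the instruction issue control circuit.
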
FIG. 12 is a block diagram showing a third embodiment of the present invention, and FIG. 13 is a time chart showing its operation. In FIG. 12, 210 denotes a fetch register, 211 a waiting-register input control circuit, 212 a waiting register, 213-1 and 213-2 further waiting registers, 224 a selector, 225 an instruction issue control circuit, 226 a multiplication register, 227 an addition register, and 228-1 and 228-2 register interference check circuits. The elements denoted by 210, 224, 225, 226, 227, 228-1, and 228-2 are substantially the same as those denoted by 10, 14, 15, 16, 17, 18-1, and 18-2, respectively.

Multiplication instructions are stored in the waiting register 212, and addition instructions are stored in the waiting registers 213-1 and 213-2. An addition instruction placed in the waiting register 213-1 is moved to the waiting register 213-2 when the waiting register 213-2 becomes empty. As described earlier, the waiting-register input control circuit 211 performs a register interference check before placing instruction information in the waiting register 212 or 213-1.
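
The grouping of waiting registers in the third embodiment behaves like one first-in, first-out queue per arithmetic unit. The Python sketch below models that idea only; the class name and methods are hypothetical, the queue depths (one for multiplication, two for addition) follow FIG. 12, and the register-to-register move from 213-1 to 213-2 is abstracted into a `deque`.

```python
from collections import deque


class GroupedWaitingRegisters:
    """Waiting registers split into per-unit FIFO groups (third embodiment)."""

    def __init__(self):
        # 212 holds one multiplication instruction; 213-1 and 213-2 hold two additions
        self.depth = {"VM": 1, "VA": 2}
        self.queues = {"VM": deque(), "VA": deque()}

    def can_accept(self, op: str) -> bool:
        """True if the group for this instruction type has a free register."""
        return len(self.queues[op]) < self.depth[op]

    def put(self, op: str, instr) -> None:
        """Place an instruction at the tail of its group."""
        if not self.can_accept(op):
            raise RuntimeError(f"no free waiting register for {op}")
        self.queues[op].append(instr)

    def head(self, op: str):
        """Oldest waiting instruction of a group; only this one is a candidate for issue."""
        return self.queues[op][0] if self.queues[op] else None

    def release(self, op: str):
        """Remove the head once the issue control circuit has taken it."""
        return self.queues[op].popleft()
```

Because only the head of each group is compared with the executing instructions, instructions of the same type keep their program order, while a multiplication can still overtake a blocked addition and vice versa.
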
FIG. 13 is a time chart showing the operation of the embodiment of FIG. 12. Here F denotes the fetch register 210, AQB the waiting register 213-1, AQ the waiting register 213-2, AR the addition register 227, MQ the waiting register 212, and MR the multiplication register 226. FIG. 13 shows the case where the instruction sequence

(1) VA 1,2,3
(2) VA 4,5,1
(3) VA 7,8,4
(4) VM A,B,C
(5) VM D,E,A

is executed. First, instruction (1) is set in the fetch register F, moved to the waiting register AQB in the next cycle, moved to the waiting register AQ in the cycle after that, and moved to the addition register AR at T = 0. Instruction (2) is set in the fetch register F one cycle later, moved to the waiting register AQB in the next cycle, and moved to the waiting register AQ at T = 0; when instruction (1) finishes, it is moved to the addition register AR. Instruction (3) is set in the fetch register F one cycle after instruction (2), moved to the waiting register AQB in the next cycle (T = 0), and moved to the waiting register AQ when AQ becomes free; when instruction (2) finishes, it is moved to the addition register AR. Instruction (4) is set in the fetch register F at T = 0, one cycle after instruction (3), moved to the waiting register MQ in the next cycle, and moved to the multiplication register MR in the cycle after that. Instruction (5) is set in the fetch register F at T = 1, moved to the waiting register MQ in the next cycle, and moved to the multiplication register MR when instruction (4) finishes.

FIG. 14 shows the operation when the same instruction sequence is executed without the waiting register AQB. Executing instructions (1) to (5) takes 55 cycles in the case of FIG. 14 but only 39 cycles in the case of FIG. 13. In the embodiment of FIG. 12, it is also possible to arrange the control so that, when neither of the waiting registers 213-1 and 213-2 holds an instruction, an instruction is placed directly in the waiting register 213-2.

As is clear from the above description, the present invention makes it possible to shorten the instruction processing time substantially with a small amount of hardware.

[Brief explanation of the drawing]

FIG. 1 is a diagram showing an overview of a vector processing device.
FIG. 2 is a diagram showing how a vector addition instruction is processed in the adder.
FIG. 3 is a diagram showing the conventional processing of a vector instruction sequence.
FIG. 4 is a diagram explaining the reduction in instruction processing time when a later vector instruction is executed before an earlier vector instruction.
FIG. 5 is a block diagram of the first embodiment of the present invention.
FIG. 6 is a time chart for the case where an instruction is overtaken.
FIGS. 7 and 8 are block diagrams of one embodiment of the register interference check circuits.
FIG. 9 is a block diagram of the second embodiment of the present invention.
FIG. 10 is a flowchart showing the logic of the priority setting circuit.
FIG. 11 is a block diagram of one embodiment of the register interference check circuit in the second embodiment.
FIG. 12 is a block diagram of the third embodiment of the present invention.
FIG. 13 is a time chart showing its operation.
FIG. 14 is a time chart for comparison.

10: fetch register; 11: waiting-register input control circuit; 12-1, 12-2: waiting registers; 13-1, 13-2: register interference check circuits; 14: selector; 15: instruction issue control circuit; 16: multiplication register; 17: addition register; 18-1, 18-2: register interference check circuits; 19: selection control circuit; 110: fetch register; 111: waiting-register input control circuit; 112-1, 112-2: waiting registers; 113: register interference check circuit; 120: priority setting circuit; 121: priority flip-flop; 212: waiting register; 213-1, 213-2: waiting registers.

Claims

[Claims]

1. An instruction control system for a vector processing device having vector registers and a plurality of pipelined arithmetic processing units that process vector data in the vector registers as operand data, comprising: an instruction register in which a vector instruction is set; a plurality of waiting registers; an input control circuit that stores the vector instruction information of the instruction register in a waiting register; executing-instruction information holding registers that hold the vector instruction information of vector instructions being executed; an instruction issue control circuit that issues the vector instruction information in the waiting registers and stores the issued vector instruction information in the corresponding executing-instruction information holding register; and comparison means that compares the vector instruction information in the waiting registers with the vector instruction information in the executing-instruction information holding registers, wherein the instruction issue control circuit issues, from among the vector instruction information in the plurality of waiting registers, vector instruction information for which the comparison means has determined that there is no factor preventing the start of instruction execution.

2. The instruction control system according to claim 1, further comprising comparison means that compares the vector instruction information in the instruction register with the vector instruction information in the waiting registers, wherein, when this comparison means determines that the order of the vector instructions need not be distinguished between the vector instruction information in the instruction register and that in the waiting registers, the input control circuit moves the vector instruction information in the instruction register to a waiting register.

3. The instruction control system according to claim 1 or 2, further comprising a selection circuit that selects one of the plurality of waiting registers, wherein the comparison means compares the vector instruction information of the waiting register selected by the selection circuit with the vector instruction information in the executing-instruction information holding registers.

4. The instruction control system according to claim 1, further comprising order identification means that identifies the order among the waiting registers and comparison means that compares vector instruction information between the waiting registers, wherein, when this comparison means determines that there is no register interference between the waiting registers, all vector instructions in the waiting registers are regarded as ready to start execution, and, when it determines that there is register interference between the waiting registers, the vector instruction identified by the order identification means as having the higher priority is regarded as ready to start execution.

5. The instruction control system according to claim 1, wherein the plurality of waiting registers are divided into a plurality of groups, one piece of vector instruction information is selected within each group on a first-in, first-out basis, and the comparison means compares the not-yet-executed vector instruction information selected for each group with the vector instruction information in the executing-instruction information holding registers.