JP2010134713A

JP2010134713A - Arithmetic processing apparatus and conversion device

Info

Publication number: JP2010134713A
Application number: JP2008310153A
Authority: JP
Inventors: Katsunori Hirase; 勝典平瀬; Makoto Kosone; 真小曽根; Kazuhisa Iizuka; 和久飯塚
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2008-12-04
Filing date: 2008-12-04
Publication date: 2010-06-17

Abstract

PROBLEM TO BE SOLVED: To efficiently execute multiplication in an arithmetic processing apparatus with an arithmetic logical unit including a plurality of computing elements.

SOLUTION: The arithmetic processing apparatus includes an arithmetic logical unit 10 which is capable of changing a function in accordance with setting data supplied from the outside. The arithmetic logical unit 10 includes first computing elements 11 to 46 capable of selectively executing various arithmetic logical operations except multiplication and second computing elements 61 and 71 capable of executing multiplication independently. The first computing elements 11 to 46 may constitute a first computing element array of (x rows)×(y columns) wherein x is an integer equal to or larger than 2 and y is an integer equal to or larger than 2. The second computing elements 61 and 71 may constitute a second computing element column or a second computing element array of (m rows)×(n columns) wherein m is a natural number equal to or smaller than x and n is a natural number.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an arithmetic processing device having a plurality of computing units, and to a conversion device that generates, from a source program, the setting data to be set in that arithmetic processing device.

In recent years, arithmetic processing devices that include an arithmetic unit having a plurality of computing units (hereinafter referred to as ALUs (Arithmetic Logic Units) where appropriate) have been under development. In such an arithmetic processing device, a control unit supplies setting data to the arithmetic unit. This controls the ALUs and connection units within the arithmetic unit so that the arithmetic unit as a whole forms the intended circuit.

Conventionally, the individual ALUs included in such an arithmetic unit have generally not been given a multiplication function (see, for example, Patent Documents 1 to 3 and Non-Patent Document 1). This is because giving an ALU a multiplication function increases its circuit scale and power consumption.

Patent Document 1: JP 2004-220377 A
Patent Document 2: JP 2007-213594 A
Patent Document 3: JP 2005-275698 A
Non-Patent Document 1: Kazuhisa Iizuka, Katsunori Hirase, Makoto Kosone, Tatsuo Hiramatsu, "Realization of Broadcast Receiver Using ALU Array Architecture LSI", IEICE Technical Report, Vol. 108, No. 172, pp. 43-47 (SR2008-24), Jul. 2008

In the configuration described above, multiplication was executed by expanding the multiplication expression. For example, a multiplication expression was converted into a combination of shift operations and additions/subtractions. However, this approach was inefficient for applications that use multiplication heavily, such as filter processing. Because a multiplication could only be executed through multiple operation steps using multiple ALUs, the approach increased processing time and lowered the utilization efficiency of the ALUs in the arithmetic unit.
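As a rough illustration of the expansion described above, the following Python sketch rewrites a multiplication by a non-negative integer constant as one shift per set bit of the constant, followed by additions. The function name `shift_add_multiply` and the way operations are counted are assumptions made for this illustration; the patent does not give an algorithm.

```python
def shift_add_multiply(x, c):
    """Multiply x by a non-negative integer constant c using shifts and adds.

    Returns (product, shift_count, add_count). Hypothetical helper, not
    taken from the patent.
    """
    if c < 0:
        raise ValueError("this sketch handles non-negative constants only")
    partials = []
    bit = 0
    while c >> bit:
        if (c >> bit) & 1:
            partials.append(x << bit)  # one shift operation per set bit
        bit += 1
    product = sum(partials)            # len(partials) - 1 additions
    add_count = max(len(partials) - 1, 0)
    return product, len(partials), add_count


# 38 = 2^5 + 2^2 + 2^1, so x * 38 needs three shifts and two additions.
print(shift_add_multiply(7, 38))  # (266, 3, 2)
```

Each shift and each addition occupies one ALU, which is why a constant with many set bits ties up several ALUs over several steps.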

The present invention has been made in view of these circumstances, and its object is to provide a technique for efficiently executing multiplication in an arithmetic processing device that includes an arithmetic unit containing a plurality of computing units.

An arithmetic processing device according to one aspect of the present invention includes an arithmetic unit whose function can be changed according to setting data supplied from outside. The arithmetic unit includes a plurality of first computing units capable of selectively executing a plurality of types of arithmetic and logic operations other than multiplication, and at least one second computing unit capable of executing multiplication by itself.

Another aspect of the present invention is a conversion device. This device converts a source program into setting data to be processed by the arithmetic processing device, and includes: a determination unit that determines whether a multiplication included in the source program is to be executed using the shift function of the first computing units or using the multiplication function of the second computing unit; and a setting data generation unit that converts the multiplication into setting data according to the determination result of the determination unit.

Yet another aspect of the present invention is also a conversion device. This device converts a source program into setting data to be processed by the arithmetic processing device, and includes: a determination unit that determines whether a multiplication included in the source program is to be executed using the shift function of the first computing units or using the multiplication function of the second computing unit; and a setting data generation unit that converts the multiplication into setting data according to the determination result of the determination unit. The determination unit chooses between execution using the shift function of the first computing units and execution using the multiplication function of the second computing unit, selecting whichever can execute the multiplication in fewer rows, counted in rows of the first computing unit array.

Any combination of the above components, and any conversion of the expression of the present invention between a method, a device, a system, a recording medium, a computer program, and the like, is also valid as an aspect of the present invention.

According to the present invention, multiplication can be executed efficiently in an arithmetic processing device that includes an arithmetic unit containing a plurality of computing units.

FIG. 1 is a block diagram showing the configuration of an arithmetic processing device 100 according to Embodiment 1 of the present invention. The arithmetic processing device 100 includes an arithmetic unit 10, a control unit 20, and a storage unit 30.

The arithmetic unit 10 executes predetermined operations based on setting data (hereinafter referred to as command data where appropriate) supplied from the control unit 20. The arithmetic unit 10 forms a reconfigurable circuit whose function can be changed dynamically according to the command data. The detailed configuration of the arithmetic unit 10 is described later.

The control unit 20 holds command data input from outside and supplies the held command data to the arithmetic unit 10 in sequence. The command data may be data converted from a source program by a conversion device 200 (see FIG. 9), described later.

The storage unit 30 holds the operation data processed by the arithmetic unit 10. This operation data is not limited to data representing final results; it also includes intermediate data. The operation data may be variables or constants.

FIG. 2 is a diagram showing a first configuration example of the arithmetic unit 10 according to the related art. The arithmetic unit 10 of this first configuration example includes a plurality of first computing units 11 to 46 (labeled ALU in FIGS. 2 to 8) and a plurality of connection units 51 to 54 (labeled SW in FIGS. 2, 4, and 6 to 8). The first computing units 11 to 46 form a first computing unit array of x rows × y columns (x and y are each an integer of 2 or more); in FIG. 2 the array has 4 rows × 6 columns. Hereinafter, a group of computing units forming one row, containing y first computing units, is referred to as a first computing unit row where appropriate. The settings of the first computing units 11 to 46 and the connection units 51 to 54 are changed dynamically by the command data.

Each of the connection units 51 to 54 is provided between two adjacent stages of first computing unit rows. Each connection unit sets the connections between the outputs of the first computing units in the preceding row and the inputs of the first computing units in the following row. The bottom first computing unit row is connected to the top first computing unit row via the connection unit 51 in the same way as the other stages.

Processing in the first computing unit array proceeds stage by stage: the result processed by the first-stage first computing unit row is passed via the connection unit 52 to the second-stage first computing unit row, where it is then processed. Each stage performs its processing once per clock. When a four-stage ALU array is used, four independent processes (hereinafter referred to as threads) can run. For example, after thread 1 has been processed by the first-stage first computing unit row, at the next clock thread 1 is processed by the second-stage row while thread 2 is processed by the first-stage row.
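The stage-by-stage flow of threads can be sketched as a small Python simulation. This is only an illustration under stated assumptions: thread t (numbered from 1) is assumed to enter the first stage at clock t - 1 and advance one stage per clock, wrapping from the last stage back to the first. The function name `stage_schedule` is hypothetical.

```python
def stage_schedule(num_stages, num_threads, num_clocks):
    """Return, for each clock, the thread occupying each stage (None if idle).

    Assumption for this sketch: thread t enters stage 1 at clock t - 1 and
    moves down one stage per clock, wrapping around after the last stage.
    """
    schedule = []
    for clock in range(num_clocks):
        row = [None] * num_stages
        for thread in range(1, num_threads + 1):
            start = thread - 1
            if clock >= start:
                row[(clock - start) % num_stages] = thread
        schedule.append(row)
    return schedule


# Four stages, four threads: the array is fully occupied from clock 3 onward.
for clock, row in enumerate(stage_schedule(4, 4, 5)):
    print(clock, row)
```

At clock 1 the list reads `[2, 1, None, None]`: thread 1 is in the second stage while thread 2 occupies the first, matching the example in the text.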

Each first computing unit is an arithmetic logic circuit capable of selectively executing a plurality of types of multi-bit operations. By setting, it can selectively execute operations such as addition/subtraction, comparison, logic operations, shift operations, selection, and an auxiliary operation for multiplication. The auxiliary operation for multiplication is described in detail later.

The first computing units cannot execute multiplication. Here, multiplication means multiplication of decimal numbers and does not include multiplication of binary numbers. Because each first computing unit has a shift function, when the multiplier written in decimal is a power of 2 the unit can in effect complete the multiplication by itself; when the multiplier is not a power of 2, it cannot complete the multiplication by itself. Taking this into account, this specification defines the first computing units as having no multiplication function.

FIG. 3 is a diagram showing a second configuration example of the arithmetic unit 10 according to the related art. The arithmetic unit 10 of this second configuration example includes a plurality of first computing units 11 to 46 and a plurality of connection units 51a to 54a (shown only as arrows in FIGS. 3 and 5). In FIG. 3 as well, the first computing units 11 to 46 form a first computing unit array of 4 rows × 6 columns.

The configuration of the arithmetic unit 10 in the second prior-art configuration example is basically the same as in the first prior-art configuration example, except that the connection units 51a to 54a restrict the connections between first computing units. Specifically, the connection units 51a to 54a of the second configuration example limit the output destinations of each first computing unit in the preceding stage to three: the first computing unit directly below it in the following stage, and the first computing units to the left and right of that one. By contrast, the connection units 51 to 54 of the first configuration example place no limit on output destinations: the output of a first computing unit in the preceding stage can be input to any first computing unit in the following stage. Because of this restriction, the arithmetic unit 10 of the second configuration example needs far fewer connections between first computing units than that of the first configuration example.
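To give a feel for the reduction, the sketch below counts output-to-input connections between two adjacent rows for both schemes. It assumes that a unit in an edge column simply has two destinations rather than three (no wrap-around); the patent text does not say how the edges are handled, and both function names are hypothetical.

```python
def full_connections(columns):
    """Every unit in the preceding row can feed every unit in the next row."""
    return columns * columns


def restricted_connections(columns):
    """Each unit feeds only the unit directly below and its left/right
    neighbours. Assumption: edge columns have no wrap-around neighbour."""
    total = 0
    for col in range(columns):
        targets = [c for c in (col - 1, col, col + 1) if 0 <= c < columns]
        total += len(targets)
    return total


# Six columns, as in the 4 x 6 arrays of FIGS. 2 and 3.
print(full_connections(6), restricted_connections(6))  # 36 16
```

Under these assumptions the per-stage count grows quadratically with the number of columns in the unrestricted scheme and only linearly in the restricted one.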

The arithmetic unit 10 according to Embodiment 1 of the present invention is described below. In addition to a plurality of first computing units and a plurality of connection units, the arithmetic unit 10 according to Embodiment 1 includes at least one second computing unit (labeled "multiplier" in FIGS. 4 to 8). The second computing units form a second computing unit column or a second computing unit array of m rows × n columns (m is a natural number not exceeding x, and n is a natural number). For example, one second computing unit may be provided for every several rows of the first computing unit array.

A second computing unit is a computing unit that can execute multiplication by itself. It may be a unit dedicated to multiplication, or a unit that can selectively execute other types of operations in addition to multiplication. Here, multiplication means multiplication of decimal numbers.

FIG. 4 is a diagram showing a first configuration example of the arithmetic unit 10 according to Embodiment 1 of the present invention. This arithmetic unit 10 is obtained by adding second computing units 61 and 71 to the arithmetic unit 10 of the first prior-art configuration example. In this first configuration example of Embodiment 1, the first computing units 11 to 46 form a first computing unit array of 4 rows × 6 columns, and the second computing units 61 and 71 form a second computing unit column of 2 rows × 1 column. Here, one second computing unit is provided for every two rows of the first computing unit array.

FIG. 5 is a diagram showing a second configuration example of the arithmetic unit 10 according to Embodiment 1 of the present invention. This arithmetic unit 10 is obtained by adding second computing units 61 and 71 to the arithmetic unit 10 of the second prior-art configuration example. In this second configuration example of Embodiment 1 as well, the first computing units 11 to 46 form a first computing unit array of 4 rows × 6 columns, and the second computing units 61 and 71 form a second computing unit column of 2 rows × 1 column. Again, one second computing unit is provided for every two rows of the first computing unit array. In FIG. 5 the output destinations of the second computing unit 61 are drawn as limited to the second computing unit 71 and the first computing unit 36, but in practice it can be connected to any of the first computing units 31 to 36 in the third-stage first computing unit row.

FIG. 6 is a diagram showing a third configuration example of the arithmetic unit 10 according to Embodiment 1 of the present invention. This arithmetic unit 10 is obtained by adding second computing units 61, 62, 71, and 72 to the arithmetic unit 10 of the first prior-art configuration example. In this third configuration example of Embodiment 1, the first computing units 11 to 46 form a first computing unit array of 4 rows × 6 columns, and the second computing units 61, 62, 71, and 72 form a second computing unit array of 2 rows × 2 columns. Here, two second computing units are provided for every two rows of the first computing unit array.

FIG. 7 is a diagram showing a fourth configuration example of the arithmetic unit 10 according to Embodiment 1 of the present invention. This arithmetic unit 10 is obtained by adding a second computing unit 61 to the arithmetic unit 10 of the first prior-art configuration example. In this fourth configuration example of Embodiment 1, the first computing units 11 to 46 form a first computing unit array of 4 rows × 6 columns, and a single second computing unit is provided.

FIG. 8 is a diagram showing a fifth configuration example of the arithmetic unit 10 according to Embodiment 1 of the present invention. This arithmetic unit 10 is obtained by adding second computing units 61, 71, 81, and 91 to the arithmetic unit 10 of the first prior-art configuration example. In this fifth configuration example of Embodiment 1, the first computing units 11 to 46 form a first computing unit array of 4 rows × 6 columns, and the second computing units 61, 71, 81, and 91 form a second computing unit column of 4 rows × 1 column. Here, one second computing unit is provided for each row of the first computing unit array.

As shown in FIGS. 4 to 7, one second computing unit corresponds to multiple rows of the first computing unit array because multiplication in a second computing unit takes longer than an arithmetic or logic operation in a first computing unit. If one second computing unit were assigned to each row of the first computing unit array, the operating speed of the first computing units would have to be matched to that of the second computing units, lowering the maximum operating speed of the arithmetic unit 10 as a whole. In addition, because a second computing unit has a larger circuit scale than a first computing unit, it is desirable to keep the number of second computing units small. Nevertheless, some reduction in the maximum operating speed of the arithmetic unit 10 may be tolerated, and one second computing unit may be assigned to each row of the first computing unit array as shown in FIG. 8. For applications with a very large number of multiplications, the configuration of FIG. 8 may shorten the processing time of the application as a whole.

FIG. 9 is a block diagram showing the configuration of a conversion device 200 according to Embodiment 2 of the present invention. The conversion device 200 converts a given source program into command data to be processed by the arithmetic processing device 100 according to Embodiment 1. In other words, it compiles the source program into command data.

The conversion device 200 includes an extraction unit 210, a determination unit 220, a data flow graph generation unit 230, and a command data generation unit 240. In hardware terms, these components can be realized by the CPU, memory, and other LSIs of any computer; in software terms, they are realized by programs loaded into memory and the like. What is depicted here are functional blocks realized by their cooperation. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms by hardware alone, software alone, or a combination of the two.

The extraction unit 210 extracts multiplications from the source program to be compiled. More specifically, it extracts program code in which a multiplication expression is written. The determination unit 220 determines whether a multiplication extracted by the extraction unit 210 is to be executed using the shift function of the first computing units in the arithmetic unit 10 of the arithmetic processing device 100, or using the multiplication function of the second computing unit in that arithmetic unit 10.

The determination unit 220 chooses between execution using the shift function of the first computing units and execution using the multiplication function of the second computing unit, selecting whichever can execute the multiplication in fewer rows, counted in rows of the first computing unit array. The determination unit 220 has the data flow graph generation unit 230 compile the multiplication and generate its data flow graph. In doing so, it has two data flow graphs generated: one for execution using the former function and one for execution using the latter. The determination unit 220 then selects whichever of the two data flow graphs has fewer rows, because fewer rows means a shorter processing time for the multiplication.
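The selection between two candidate data flow graphs can be sketched as follows. Everything here is an assumption for illustration: each graph is modelled as a plain list of rows, the multiplier is assumed to occupy two rows (as when one second computing unit serves two rows of the array), and ties are resolved in favour of the multiplier, which the patent does not specify. The name `select_graph` is hypothetical.

```python
def select_graph(shift_graph, multiplier_graph):
    """Pick the candidate data flow graph that occupies fewer rows.

    Each graph is a list of rows; each row is a list of operation labels.
    Ties go to the multiplier graph (an arbitrary choice in this sketch).
    """
    if len(shift_graph) < len(multiplier_graph):
        return "shift", shift_graph
    return "multiplier", multiplier_graph


# Assumed: the second computing unit spans two rows of the array.
MULTIPLIER_GRAPH = [["mul (first row)"], ["mul (second row)"]]

# x * 8 is a single left shift and fits in one row.
SHIFT_X8 = [["x << 3"]]
# x * 38 = (x << 5) + (x << 2) + (x << 1): one row of shifts, two rows of adds.
SHIFT_X38 = [["x << 5", "x << 2", "x << 1"], ["a + b"], ["(a + b) + c"]]

print(select_graph(SHIFT_X8, MULTIPLIER_GRAPH)[0])   # shift
print(select_graph(SHIFT_X38, MULTIPLIER_GRAPH)[0])  # multiplier
```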

Here, a data flow graph expresses the dependencies in execution order between operations and shows, as a graph structure, the flow of operations on input variables and constants. In this specification, the term mainly refers to the graph after each operation has been assigned to a first or second computing unit in the arithmetic unit 10.

The above describes a method in which the determination unit 220 actually has the data flow graph generation unit 230 generate two data flow graphs and then selects which function to use. The following describes three methods in which the determination unit 220 selects which function to use by identifying the nature of the multiplication, without having the data flow graph generation unit 230 generate any data flow graph.

First, the first method. When the multiplication is a multiplication of a variable by a constant and the constant is a power of 2, the determination unit 220 selects execution using the shift function of a first computing unit; otherwise, it selects execution using the multiplication function of the second computing unit. When the constant is 2, 4, 8, 16, 32, ..., the multiplication is completed simply by shifting the variable left by the appropriate number of bits; when the constant is ..., 1/32, 1/16, 1/8, 1/4, 1/2, it is completed simply by shifting the variable right by the appropriate number of bits. Since one second computing unit is usually provided for multiple rows of the first computing unit array (see FIGS. 4 to 7), a multiplication that can be executed by the shift function of a single first computing unit takes fewer rows than one executed by the multiplication function of the second computing unit.
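A minimal Python sketch of the first method is shown below. It treats the constant as an exact fraction so that both 2, 4, 8, ... and 1/2, 1/4, 1/8, ... can be recognised. The function names and the use of `fractions.Fraction` are choices made for this illustration, not part of the patent.

```python
from fractions import Fraction


def _is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def shift_amount(constant):
    """Return a signed bit shift (positive = left, negative = right) if the
    constant is a power of 2 or its reciprocal; otherwise return None."""
    c = Fraction(constant)
    if c.denominator == 1 and _is_power_of_two(c.numerator):
        return c.numerator.bit_length() - 1
    if c.numerator == 1 and _is_power_of_two(c.denominator):
        return -(c.denominator.bit_length() - 1)
    return None


def first_method(constant):
    """Decide which unit executes `variable * constant` under method 1."""
    return "shift" if shift_amount(constant) is not None else "multiplier"


def apply_shift(x, amount):
    # A right shift discards the low-order bits of x.
    return x << amount if amount >= 0 else x >> -amount


for c in (16, Fraction(1, 8), 12):
    print(c, first_method(c))
```

For example, `apply_shift(5, shift_amount(16))` gives 80, and `apply_shift(64, shift_amount(Fraction(1, 4)))` gives 16.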

In the case of multiplication of a variable by a variable, the values of the variables depend on the execution of the program, so the determination unit 220 selects execution using the multiplication function of the second computing unit. Alternatively, it may select execution using the auxiliary multiplication function of the first computing units. This function is described in detail later.

Next, the second method. When the multiplication is a multiplication of a variable by a constant and the number of 1s in the binary representation of the constant is no greater than a predetermined set value, the determination unit 220 selects execution using the shift function of the first computing units; otherwise, it selects execution using the multiplication function of the second computing unit. The set value can be chosen by the designer based on knowledge gained from experiments and simulations.

An example in which the set value is 3 is described below. Consider the cases where the constant is 12, 38, and 8200.
12 (binary 1100) = 2^3 + 2^2
38 (binary 100110) = 2^5 + 2^2 + 2^1
8200 (binary 10000000001000) = 2^13 + 2^3

When expressed in binary, each of these constants has three or fewer bits set to 1, so the determination unit 220 selects the shift function of the first computing units. Multiplication by any of these constants can be executed with at most three shift operations and at most two additions/subtractions, so it can be executed in a relatively small number of rows even when the first computing unit array is used.
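The second method amounts to counting the 1 bits of the constant and comparing the count with the set value. The sketch below uses the set value 3 and the constants 12, 38, and 8200 from the example; the constant 255 and the function name `second_method` are additions made for this illustration.

```python
def second_method(constant, max_ones=3):
    """Decide which unit executes `variable * constant` under method 2.

    `max_ones` plays the role of the designer-chosen set value.
    """
    if constant <= 0:
        raise ValueError("this sketch handles positive integer constants only")
    ones = bin(constant).count("1")
    return "shift" if ones <= max_ones else "multiplier"


for c in (12, 38, 8200, 255):
    print(c, bin(c), second_method(c))
```

With the set value 3, the constants 12, 38, and 8200 all map to the shift function, while 255 (eight 1 bits) goes to the multiplier.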

Next, the third method will be described. When the multiplication process is a multiplication of a variable and a constant, and the constant can be expressed by a combination of shift operations whose number is equal to or less than a predetermined set value, the determination unit 220 selects execution using the shift operation function of the first arithmetic unit; otherwise, it selects execution using the multiplication function of the second arithmetic unit. In other words, the determination unit 220 selects the shift operation function of the first arithmetic unit when the number of terms obtained by expanding the constant into a polynomial is equal to or less than the set value. This set value can likewise be set by the designer on the basis of knowledge obtained from experiments or simulations.

Hereinafter, an example in which the set value is set to 2 will be described. Consider the cases where the constant is "252" and "8190".
252 (binary 11111100) = (2^8 - 2^2)
8190 (binary 1111111111110) = (2^13 - 2^1)

For both constants, the number of terms in the polynomial expansion is two or fewer, so the determination unit 220 selects use of the shift operation function of the first arithmetic unit. A multiplication by either constant can be executed with at most two shift operations and one addition or subtraction, so it can be executed in a relatively small number of rows even when the first arithmetic unit array is used.
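One way to count the terms of such an expansion is sketched below in C. The text does not say how the expansion is obtained; this sketch assumes the non-adjacent form, a signed-digit representation with digits +1, 0, and -1, as a stand-in. The names signed_term_count, use_shift_by_terms, and max_terms are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of nonzero digits when c is written as a sum or difference of
 * powers of two, using the non-adjacent form (an assumption of this sketch). */
static unsigned signed_term_count(uint32_t c)
{
    uint64_t v = c;          /* widened so that v + 1 cannot overflow */
    unsigned terms = 0;
    while (v != 0) {
        if (v & 1u) {
            if ((v & 3u) == 3u) {
                v += 1;      /* digit -1: borrow from the next power of two */
            } else {
                v -= 1;      /* digit +1 */
            }
            terms++;
        }
        v >>= 1;
    }
    return terms;
}

/* Hypothetical decision helper for the third method.
 * max_terms is the set value chosen by the designer (assumed, e.g. 2). */
static bool use_shift_by_terms(uint32_t constant, unsigned max_terms)
{
    return signed_term_count(constant) <= max_terms;
}
```

Under this assumption, 252 and 8190 each count as two terms (2^8 - 2^2 and 2^13 - 2^1), although their plain binary representations contain six and twelve 1 bits respectively.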

In FIG. 9, the data flow graph generation unit 230 converts the extracted multiplication process into a data flow graph in accordance with the determination result of the determination unit 220. That is, the data flow graph generation unit 230 generates a data flow graph that uses the function selected by the determination unit 220. The command data generation unit 240 generates command data from that data flow graph.

Hereinafter, the operation of the conversion device 200 will be described concretely with reference to actual source program examples and the corresponding data flow graph examples. The source program examples are written in the C language.

FIG. 10 shows source program example 1. This source program describes the multiplication of a variable in1 and a variable in2.
FIG. 11 shows an example of a data flow graph for executing source program example 1 on the arithmetic unit 10 shown in FIG. 2 or FIG. 3. The nodes drawn with dotted lines in the data flow graph of FIG. 11 represent first arithmetic units included in the arithmetic unit 10. This example uses the auxiliary operation function for multiplication provided in the first arithmetic unit.

In FIG. 11, the "<<" command shifts the input data left by a number of bits. The "And" command takes the logical AND of a plurality of input data. The "mov" command outputs the input data unchanged to the next node. The "neg" command inverts the sign of the input data.

The "mul_t" command is a multiplication auxiliary command that performs the following processing.
out = (a >> 1) + ((a & 1) ? b : 0)
(a and b are input data)
The first term on the right-hand side is the value of a shifted right by 1 bit. The second term is b when the least significant bit of a is 1, and 0 when it is 0. Therefore, when the least significant bit of a is 0, out is a shifted right by 1 bit; when it is 1, out is the sum of a shifted right by 1 bit and b.

The data flow graph of FIG. 11 shows an example in which the multiplication of the 8-bit variable in1 and the 8-bit variable in2 is executed using the long multiplication (pencil-and-paper) algorithm. At one node in the first stage, the "<<" command shifts the variable in1 left by 7 bits. At another node in the first stage, the "And" command masks any data that may exist in the variable in2 from the ninth bit upward. At the nodes in the second and subsequent stages, bitwise multiplication is executed starting from the least significant bit of the variable in2, and the partial results are accumulated stage by stage.
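The flow described for FIG. 11 can be sketched in C as follows. The mul_t function follows the formula given in the text; the function name mul8_by_mul_t, the use of uint32_t, and the loop that stands in for the chain of eight mul_t nodes are assumptions made for illustration, since the figure itself is not reproduced here.

```c
#include <stdint.h>

/* Multiplication auxiliary command as defined in the text:
 * out = (a >> 1) + ((a & 1) ? b : 0) */
static uint32_t mul_t(uint32_t a, uint32_t b)
{
    return (a >> 1) + ((a & 1u) ? b : 0u);
}

/* Sketch of an 8-bit by 8-bit multiplication built from mul_t.
 * in1 is assumed to fit in 8 bits. */
static uint32_t mul8_by_mul_t(uint32_t in1, uint32_t in2)
{
    uint32_t b = in1 << 7;      /* stage 1: "<<" by 7 bits */
    uint32_t a = in2 & 0xFFu;   /* stage 1: "And" keeps the low 8 bits of in2 */

    /* stages 2 to 9: one mul_t per bit of in2, least significant bit first */
    for (int stage = 2; stage <= 9; stage++) {
        a = mul_t(a, b);
    }
    return a;                   /* the product in1 * (in2 & 0xFF) */
}
```

Each mul_t step consumes one bit of in2 from the bottom of a while the partial product grows in the upper bits, which is why eight such stages follow the first stage in this sketch, for nine stages in total.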

FIG. 12 shows an example of a data flow graph for executing source program example 1 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. The nodes drawn as ellipses in the data flow graph of FIG. 12 represent second arithmetic units included in the arithmetic unit 10. The "x" command multiplies a plurality of input data. In the data flow graph of FIG. 12, the variable in1 and the variable in2 are multiplied by the "x" command at the first-stage node of the second arithmetic unit (which corresponds to the first and second stages of the first arithmetic units).

Executing a multiplication of two variables with the multiplication function of the second arithmetic unit yields fewer rows in the data flow graph than executing it with the auxiliary operation function for multiplication of the first arithmetic unit. Comparing the data flow graphs of FIG. 11 and FIG. 12, the former needs nine rows, whereas the latter needs only two. The wider the variables, the more rows the former needs, and the greater the benefit of using the latter.

FIG. 13 shows source program example 2. This source program describes the multiplication of the variable in1 and the constant "8192".
FIG. 14 shows an example of a data flow graph for executing source program example 2 on the arithmetic unit 10 shown in FIG. 2 or FIG. 3. At the first-stage node, the "<<" command shifts the variable in1 left by 13 bits. Because the constant "8192" is 2^13, shifting the variable in1 left by 13 bits realizes the multiplication.
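The single-shift case can be illustrated with a short C sketch. The helper single_shift_amount is hypothetical and is not part of the conversion device described here; it merely shows how a constant such as 8192 can be recognized as a power of two and mapped to one "<<" node.

```c
#include <stdint.h>

/* Returns the shift amount s such that c == 2^s, or -1 when c is not a
 * power of two (hypothetical helper for illustration). */
static int single_shift_amount(uint32_t c)
{
    if (c == 0 || (c & (c - 1)) != 0) {
        return -1;               /* zero or more than one bit set */
    }
    int s = 0;
    while ((c >>= 1) != 0) {
        s++;
    }
    return s;
}

/* Source program example 2 mapped to one shift: in1 * 8192 == in1 << 13. */
static uint32_t mul_8192_by_shift(uint32_t in1)
{
    return in1 << 13;
}
```

Because unsigned arithmetic in C wraps modulo 2^32, the shift and the multiplication agree for every uint32_t input, including inputs whose product does not fit in 32 bits.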

FIG. 15 shows an example of a data flow graph for executing source program example 2 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. In the data flow graph of FIG. 15, the multiplication of the variable in1 and the constant "8192" is executed by the "x" command at a single node corresponding to the second arithmetic unit.

When a multiplication of a variable and a constant can be realized with a single shift operation, executing it with the shift operation function of the first arithmetic unit yields fewer rows in the data flow graph than executing it with the multiplication function of the second arithmetic unit. Comparing the data flow graphs of FIG. 14 and FIG. 15, the former needs only one row, whereas the latter needs two.

FIG. 16 shows source program example 3. This source program describes the multiplication of the variable in1 and the constant "12345".
FIG. 17 shows an example of a data flow graph for executing source program example 3 on the arithmetic unit 10 shown in FIG. 2 or FIG. 3. The "-" command subtracts one input datum from another. The "+" command adds a plurality of input data. In the data flow graph of FIG. 17, the multiplication of the variable in1 and the constant "12345" is expanded into four left bit shifts, one subtraction, and three additions.

At the second node in the first stage, the "<<" command shifts the variable in1 left by 3 bits, multiplying it by 8. At the first node in the second stage, the "-" command subtracts the variable in1 from the value input from the preceding node. In parallel, at the second node in the second stage, the "<<" command shifts the value input from the preceding node left by 3 bits, multiplying it by 8. At the first node in the third stage, the "+" command adds the values input from the two nodes of the preceding stage. In parallel, at the second node in the third stage, the "<<" command shifts the value input from the second node of the preceding stage left by 6 bits, multiplying it by 64. At the first node in the fourth stage, the "+" command adds the values input from the two nodes of the preceding stage. In parallel, at the second node in the fourth stage, the "<<" command shifts the value input from the second node of the preceding stage left by 1 bit, doubling it. At the first node in the fifth stage, the values input from the two nodes of the preceding stage are added, completing the multiplication.
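The five stages can be traced with the C sketch below. One assumption needs to be stated plainly: read literally, subtracting in1 from 8 x in1 gives 7 x in1, and the stages would then sum to 12359 x in1. The sketch therefore assumes that the subtraction node computes in1 minus 8 x in1, which makes the total come to 12345 x in1. FIG. 17 is not reproduced here, so this operand order is an assumption, as is the function name mul_12345_by_shifts.

```c
#include <stdint.h>

/* Stage-by-stage trace of in1 * 12345 using four shifts, one subtraction,
 * and three additions. Unsigned arithmetic wraps modulo 2^32, so the
 * intermediate value -7 * in1 is handled without undefined behavior. */
static uint32_t mul_12345_by_shifts(uint32_t in1)
{
    /* stage 1 */
    uint32_t s1 = in1 << 3;    /*    8 * in1 */
    /* stage 2 */
    uint32_t d2 = in1 - s1;    /*   -7 * in1 (assumed operand order) */
    uint32_t s2 = s1 << 3;     /*   64 * in1 */
    /* stage 3 */
    uint32_t a3 = d2 + s2;     /*   57 * in1 */
    uint32_t s3 = s2 << 6;     /* 4096 * in1 */
    /* stage 4 */
    uint32_t a4 = a3 + s3;     /* 4153 * in1 */
    uint32_t s4 = s3 << 1;     /* 8192 * in1 */
    /* stage 5 */
    return a4 + s4;            /* 12345 * in1 */
}
```

The decomposition used here is 12345 = 8192 + 4096 + 64 - 8 + 1.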

FIG. 18 shows an example of a data flow graph for executing source program example 3 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. In the data flow graph of FIG. 18, the variable in1 and the constant "12345" are multiplied by the "x" command at the first-stage node of the second arithmetic unit (which corresponds to the first and second stages of the first arithmetic units).

When expanding a multiplication of a variable and a constant produces a combination of many shift operations and many additions or subtractions, executing the multiplication with the multiplication function of the second arithmetic unit yields fewer rows in the data flow graph than executing it with the shift operation function of the first arithmetic units. Comparing the data flow graphs of FIG. 17 and FIG. 18, the former needs five rows, whereas the latter needs only two.

FIG. 19 shows source program example 4. This source program describes the addition of the product of the variable in1 and the constant "8208" and the product of the variable in1 and the constant "8200". "8208" can be expanded as "2^13 + 2^4", and "8200" as "2^13 + 2^3".

FIG. 20 shows an example of a data flow graph for executing source program example 4 on the arithmetic unit 10 shown in FIG. 2 or FIG. 3. In the data flow graph of FIG. 20, at the second node in the first stage, the "<<" command shifts the variable in1 left by 4 bits, multiplying it by 16. In parallel, at the third node in the first stage, the "<<" command shifts the variable in1 left by 3 bits, multiplying it by 8. At the first node in the second stage, the "<<" command shifts the value input from the second node of the preceding stage left by 9 bits, multiplying it by 512. The second node in the second stage passes through the value input from the second node of the preceding stage using the "mov" command. In parallel, at the third node in the second stage, the "<<" command shifts the value input from the third node of the preceding stage left by 10 bits, multiplying it by 1024. The fourth node in the second stage passes through the value input from the third node of the preceding stage using the "mov" command.

At the second node in the third stage, the "+" command adds the values input from the first and second nodes of the preceding stage. In parallel, at the third node in the third stage, the "+" command adds the values input from the third and fourth nodes of the preceding stage. At the third node in the fourth stage, the "+" command adds the values input from the second and third nodes of the preceding stage. This completes the addition of the product of the variable in1 and the constant "8208" and the product of the variable in1 and the constant "8200".
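The four stages described for FIG. 20 can be traced in C as follows. The function name example4_by_shifts and the variable names are hypothetical; the shift amounts and the stage assignment follow the description above.

```c
#include <stdint.h>

/* Stage-by-stage trace of in1 * 8208 + in1 * 8200 using only
 * shift, mov, and add nodes of the first arithmetic units. */
static uint32_t example4_by_shifts(uint32_t in1)
{
    /* stage 1: two shifts in parallel */
    uint32_t x16 = in1 << 4;      /*   16 * in1 */
    uint32_t x8  = in1 << 3;      /*    8 * in1 */
    /* stage 2: two further shifts and two pass-through ("mov") nodes */
    uint32_t a = x16 << 9;        /* 8192 * in1 */
    uint32_t b = x16;             /* mov */
    uint32_t c = x8 << 10;        /* 8192 * in1 */
    uint32_t d = x8;              /* mov */
    /* stage 3: two additions in parallel */
    uint32_t p = a + b;           /* 8208 * in1 */
    uint32_t q = c + d;           /* 8200 * in1 */
    /* stage 4: final addition */
    return p + q;                 /* 16408 * in1 */
}
```

Both constant multiplications proceed side by side in stages 1 to 3, which is what keeps the graph at four rows.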

FIG. 21 shows an example of a data flow graph for executing source program example 4 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. In the data flow graph of FIG. 21, the variable in1 and the constant "8208" are multiplied by the "x" command at the first-stage node of the second arithmetic unit (which corresponds to the first and second stages of the first arithmetic units). At the second-stage node of the second arithmetic unit (which corresponds to the third and fourth stages of the first arithmetic units), the variable in1 and the constant "8200" are multiplied by the "x" command. In parallel, the third-stage node of the first arithmetic units passes through the value input from the first-stage node of the second arithmetic unit using the "mov" command, and the fourth-stage node of the first arithmetic units likewise passes through the value input from the third-stage node of the first arithmetic units using the "mov" command.

At the fifth-stage node of the first arithmetic units, the value input from the third-stage node of the first arithmetic units and the value input from the second-stage node of the second arithmetic unit are added. This completes the addition of the product of the variable in1 and the constant "8208" and the product of the variable in1 and the constant "8200".

When an expression contains multiple multiplications of a variable and a constant, and the number of shift operations obtained by expanding each multiplication is smaller than the set value, executing those multiplications with the shift operation function of the first arithmetic units often yields fewer rows in the data flow graph than executing them with the multiplication function of the second arithmetic unit. This is because the multiplications can be executed in parallel using the shift operation function of the first arithmetic units. When a plurality of second arithmetic units are provided in one stage, as shown in FIG. 6, execution using the multiplication function of the second arithmetic units often yields fewer rows. Comparing the data flow graphs of FIG. 20 and FIG. 21, the former needs only four rows, whereas the latter needs five.

FIG. 22 shows source program example 5. This source program describes the addition of the product of the variable in1 and the variable in2, the product of the variable in1 and the constant "8208", and the product of the variable in2 and the constant "8200".

FIG. 23 shows data flow graph example 1 for executing source program example 5 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. Data flow graph example 1 executes all three multiplications using the multiplication function of the second arithmetic units. In the data flow graph of FIG. 23, the variable in1 and the variable in2 are multiplied by the "x" command at the first-stage node of the second arithmetic unit (which corresponds to the first and second stages of the first arithmetic units). At the second-stage node of the second arithmetic unit (which corresponds to the third and fourth stages of the first arithmetic units), the variable in1 and the constant "8208" are multiplied by the "x" command. In parallel, the third-stage node of the first arithmetic units passes through the value input from the first-stage node of the second arithmetic unit using the "mov" command, and the fourth-stage node of the first arithmetic units likewise passes through the value input from the third-stage node of the first arithmetic units using the "mov" command.

At the third-stage node of the second arithmetic unit (which corresponds to the fifth and sixth stages of the first arithmetic units), the variable in2 and the constant "8200" are multiplied by the "x" command. In parallel, the fifth-stage node of the first arithmetic units uses the "+" command to add the value input from the fourth-stage node of the first arithmetic units and the value input from the first-stage node of the second arithmetic unit, and the sixth-stage node of the first arithmetic units passes through the value input from the fifth-stage node of the first arithmetic units using the "mov" command. At the seventh-stage node of the first arithmetic units, the value input from the sixth-stage node of the first arithmetic units and the value input from the third-stage node of the second arithmetic unit are added. This completes the addition of the product of the variable in1 and the variable in2, the product of the variable in1 and the constant "8208", and the product of the variable in2 and the constant "8200".

FIG. 24 shows data flow graph example 2 for executing source program example 5 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. Data flow graph example 2 executes the three multiplications using both the shift operation function of the first arithmetic units and the multiplication function of the second arithmetic unit. In the data flow graph of FIG. 24, at the fourth node in the first stage of the first arithmetic units, the "<<" command shifts the variable in1 left by 4 bits, multiplying it by 16. In parallel, at the fifth node in the first stage of the first arithmetic units, the "<<" command shifts the variable in2 left by 3 bits, multiplying it by 8.

At the third node in the second stage of the first arithmetic units, the "<<" command shifts the value input from the fourth node of the preceding stage left by 9 bits, multiplying it by 512. The fourth node in the second stage of the first arithmetic units passes through the value input from the fourth node of the preceding stage using the "mov" command. In parallel, at the fifth node in the second stage of the first arithmetic units, the "<<" command shifts the value input from the fifth node of the preceding stage left by 10 bits, multiplying it by 1024. The sixth node in the second stage of the first arithmetic units passes through the value input from the fifth node of the preceding stage using the "mov" command.

At the fourth node in the third stage of the first arithmetic units, the "+" command adds the values input from the third and fourth nodes of the preceding stage. In parallel, at the fifth node in the third stage of the first arithmetic units, the "+" command adds the values input from the fifth and sixth nodes of the preceding stage. Also in parallel, at the second-stage node of the second arithmetic unit (which corresponds to the third and fourth stages of the first arithmetic units), the variable in1 and the variable in2 are multiplied by the "x" command.

At the sixth node in the fifth stage of the first arithmetic units, the "+" command adds the value input from the fifth node in the fourth stage of the first arithmetic units and the value input from the second-stage node of the second arithmetic unit. This completes the addition of the product of the variable in1 and the variable in2, the product of the variable in1 and the constant "8208", and the product of the variable in2 and the constant "8200".
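The combined use of shifts and the multiplier can be traced with the C sketch below. The function name example5_combined is hypothetical. The fourth-stage addition of the two shift-based products is not spelled out in the description and is inferred here from the fifth-stage node taking a single value from the fourth stage, so treat that step as an assumption.

```c
#include <stdint.h>

/* Trace of in1 * in2 + in1 * 8208 + in2 * 8200, with the two constant
 * multiplications on shift/add nodes and the variable-by-variable
 * multiplication on the multiplier of a second arithmetic unit. */
static uint32_t example5_combined(uint32_t in1, uint32_t in2)
{
    /* stage 1 */
    uint32_t x16 = in1 << 4;       /*   16 * in1 */
    uint32_t y8  = in2 << 3;       /*    8 * in2 */
    /* stage 2 */
    uint32_t a = x16 << 9;         /* 8192 * in1 */
    uint32_t b = x16;              /* mov */
    uint32_t c = y8 << 10;         /* 8192 * in2 */
    uint32_t d = y8;               /* mov */
    /* stage 3 (the multiplier works on in1 * in2 in parallel) */
    uint32_t p = a + b;            /* 8208 * in1 */
    uint32_t q = c + d;            /* 8200 * in2 */
    uint32_t m = in1 * in2;        /* "x" node of the second arithmetic unit */
    /* stage 4 (assumed): combine the two shift-based products */
    uint32_t r = p + q;
    /* stage 5: add the multiplier result */
    return r + m;
}
```

Because the shift-based products and the multiplier result are produced side by side, the whole expression fits in five stages in this sketch.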

As source program example 5 shows, for an expression that mixes a multiplication of two variables with multiplications of a variable and a constant, executing the multiplications with a combination of the shift operation function of the first arithmetic units and the multiplication function of the second arithmetic unit often yields fewer rows in the data flow graph than executing them with the multiplication function of the second arithmetic unit alone. This is because combining the two functions allows multiple multiplications to be executed in parallel. Comparing the data flow graphs of FIG. 23 and FIG. 24, the former needs seven rows, whereas the latter needs only five.

FIG. 25 shows source program example 6, which is similar to source program example 5. Source program example 6 describes the addition of the product of the variable in1 and the variable in2, the product of the variable in1 and the constant "252", and the product of the variable in2 and the constant "8190".

FIG. 26 shows data flow graph example 1 for executing source program example 6 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. Data flow graph example 1 executes all three multiplications using the multiplication function of the second arithmetic units. The data flow graph of FIG. 26 has basically the same structure as that of FIG. 23. The only difference is that the constants multiplied by the "x" command are changed at the second-stage node of the second arithmetic unit (which corresponds to the third and fourth stages of the first arithmetic units) and at its third-stage node (which corresponds to the fifth and sixth stages of the first arithmetic units).

FIG. 27 shows data flow graph example 2 for executing source program example 6 on the arithmetic unit 10 shown in FIG. 4 or FIG. 5. Data flow graph example 2 executes the three multiplications using both the shift operation function of the first arithmetic units and the multiplication function of the second arithmetic unit. The data flow graph of FIG. 27 has basically the same structure as that of FIG. 24, and only the differences are described below. At the fourth node in the first stage of the first arithmetic units, the "<<" command shifts the variable in1 left by 2 bits, multiplying it by 4. In parallel, at the fifth node in the first stage of the first arithmetic units, the "<<" command shifts the variable in2 left by 1 bit, doubling it.

At the third node in the second stage of the first arithmetic units, the "<<" command shifts the value input from the fourth node of the preceding stage left by 6 bits, multiplying it by 64. The fourth node in the second stage of the first arithmetic units inverts the sign of the value input from the fourth node of the preceding stage using the "neg" command. In parallel, at the fifth node in the second stage of the first arithmetic units, the "<<" command shifts the value input from the fifth node of the preceding stage left by 12 bits, multiplying it by 4096. The sixth node in the second stage of the first arithmetic units inverts the sign of the value input from the fifth node of the preceding stage using the "neg" command. The subsequent processing is the same as in the data flow graph of FIG. 23.
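The shift and sign-inversion steps for the two constants can be checked with the C sketch below. The function names are hypothetical, and the sign inversion is modeled as unsigned negation modulo 2^32, which is an assumption about how the "neg" command behaves on fixed-width data.

```c
#include <stdint.h>

/* in1 * 252 as (in1 << 2 << 6) + neg(in1 << 2), i.e. 256*in1 - 4*in1. */
static uint32_t mul_252_by_shift_neg(uint32_t in1)
{
    uint32_t x4 = in1 << 2;    /* stage 1:   4 * in1 */
    uint32_t hi = x4 << 6;     /* stage 2: 256 * in1 */
    uint32_t lo = 0u - x4;     /* stage 2: "neg", -4 * in1 modulo 2^32 */
    return hi + lo;            /* stage 3: 252 * in1 */
}

/* in2 * 8190 as (in2 << 1 << 12) + neg(in2 << 1), i.e. 8192*in2 - 2*in2. */
static uint32_t mul_8190_by_shift_neg(uint32_t in2)
{
    uint32_t y2 = in2 << 1;    /* stage 1:    2 * in2 */
    uint32_t hi = y2 << 12;    /* stage 2: 8192 * in2 */
    uint32_t lo = 0u - y2;     /* stage 2: "neg", -2 * in2 modulo 2^32 */
    return hi + lo;            /* stage 3: 8190 * in2 */
}
```

Negating in the second stage lets the third stage remain an addition node, so the graph keeps the same shape as the one used for the constants 8208 and 8200.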

Comparing the data flow graphs of FIG. 24 and FIG. 27, in the former each multiplication of a variable and a constant is expanded into two shift operations and one addition, whereas in the latter it is expanded into two shift operations and one subtraction. In source program example 6 as well, executing the multiplications with a combination of the shift operation function of the first arithmetic units and the multiplication function of the second arithmetic unit yields fewer rows in the data flow graph than executing them with the multiplication function of the second arithmetic unit alone. Comparing the data flow graphs of FIG. 26 and FIG. 27, the former needs seven rows, whereas the latter needs only five.

As described above, according to the present embodiment, in an arithmetic processing device provided with an arithmetic unit that includes a plurality of first arithmetic units, mounting a second arithmetic unit capable of executing multiplication on its own allows multiplication to be executed efficiently. Mounting the second arithmetic unit increases the scale of the arithmetic unit accordingly, but shortens the processing time required for multiplication. In applications with many multiplications, mounting the second arithmetic unit shortens the overall processing time and also raises the utilization efficiency of the first arithmetic units included in the first arithmetic unit array. In that case, the circuit scale of the first arithmetic unit array can be reduced, which also leads to a reduction in the circuit scale of the arithmetic unit as a whole.

Furthermore, when the conversion device compiles a source program, it decides as appropriate whether each multiplication is to be executed using the shift operation function of the first arithmetic unit or the multiplication function of the second arithmetic unit, so that the processing time required for multiplication can be optimized. Power consumption can thereby be reduced as well. For example, when a variable is multiplied by a constant and the multiplication can be realized by combining a small number of shift operations, using the shift operation of the first arithmetic unit often gives a shorter processing time than using the multiplication function of the second arithmetic unit.
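One possible form of this decision can be sketched as follows, assuming that the number of 1 bits in the binary representation of the constant is used as a proxy for the number of shift operations required. The function names, the return labels, and the default threshold `max_ones = 2` are hypothetical; the text only requires that the choice be made as appropriate.

```python
def count_ones(c: int) -> int:
    """Number of 1 bits in the binary representation of a non-negative integer."""
    return bin(c).count("1")


def choose_unit(constant: int, max_ones: int = 2) -> str:
    """Return "shift" if the constant multiplication is mapped to shift
    operations of the first arithmetic units, otherwise "multiplier"
    (the second arithmetic unit). The threshold is an assumed tuning value."""
    if constant > 0 and count_ones(constant) <= max_ones:
        return "shift"
    return "multiplier"


print(choose_unit(8))               # 0b1000, one 1 bit
print(choose_unit(10))              # 0b1010, two 1 bits
print(choose_unit(7))               # 0b111, three 1 bits
print(choose_unit(10, max_ones=1))  # stricter rule: powers of 2 only
```

With `max_ones = 1` the rule reduces to "use shifts only when the constant is a power of 2". A criterion based on 1 bits alone does not detect constants that are cheap through subtraction (for example 7 = 8 - 1), so a fuller implementation might compare candidate expansions directly.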

The present invention has been described above based on several embodiments. These embodiments are illustrative, and those skilled in the art will understand that various modifications are possible in the combinations of their constituent elements and processing steps, and that such modifications also fall within the scope of the present invention.

In the embodiments described above, the second arithmetic unit executes only multiplication, but the second arithmetic unit may also execute other arithmetic and logic operations.

FIG. 1 is a block diagram showing the configuration of the arithmetic processing device according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing a first configuration example of an arithmetic section according to the prior art.
FIG. 3 is a diagram showing a second configuration example of an arithmetic section according to the prior art.
FIG. 4 is a diagram showing a first configuration example of the arithmetic section according to Embodiment 1 of the present invention.
FIG. 5 is a diagram showing a second configuration example of the arithmetic section according to Embodiment 1 of the present invention.
FIG. 6 is a diagram showing a third configuration example of the arithmetic section according to Embodiment 1 of the present invention.
FIG. 7 is a diagram showing a fourth configuration example of the arithmetic section according to Embodiment 1 of the present invention.
FIG. 8 is a diagram showing a fifth configuration example of the arithmetic section according to Embodiment 1 of the present invention.
FIG. 9 is a block diagram showing the configuration of the conversion device according to Embodiment 2 of the present invention.
FIG. 10 is a diagram showing source program example 1.
FIG. 11 is a diagram showing an example of a data flow graph when source program example 1 is executed by the arithmetic section shown in FIG. 2 or FIG. 3.
FIG. 12 is a diagram showing an example of a data flow graph when source program example 1 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.
FIG. 13 is a diagram showing source program example 2.
FIG. 14 is a diagram showing an example of a data flow graph when source program example 2 is executed by the arithmetic section shown in FIG. 2 or FIG. 3.
FIG. 15 is a diagram showing an example of a data flow graph when source program example 2 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.
FIG. 16 is a diagram showing source program example 3.
FIG. 17 is a diagram showing an example of a data flow graph when source program example 3 is executed by the arithmetic section shown in FIG. 2 or FIG. 3.
FIG. 18 is a diagram showing an example of a data flow graph when source program example 3 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.
FIG. 19 is a diagram showing source program example 4.
FIG. 20 is a diagram showing an example of a data flow graph when source program example 4 is executed by the arithmetic section shown in FIG. 2 or FIG. 3.
FIG. 21 is a diagram showing an example of a data flow graph when source program example 4 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.
FIG. 22 is a diagram showing source program example 5.
FIG. 23 is a diagram showing data flow graph example 1 when source program example 5 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.
FIG. 24 is a diagram showing data flow graph example 2 when source program example 5 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.
FIG. 25 is a diagram showing source program example 6.
FIG. 26 is a diagram showing data flow graph example 1 when source program example 6 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.
FIG. 27 is a diagram showing data flow graph example 2 when source program example 6 is executed by the arithmetic section shown in FIG. 4 or FIG. 5.

Explanation of symbols

10 arithmetic section, 11 first arithmetic unit, 20 control unit, 30 storage unit, 51 connection unit, 61 second arithmetic unit, 100 arithmetic processing device, 200 conversion device, 210 extraction unit, 220 determination unit, 230 data flow graph generation unit, 240 command data generation unit.

Claims

An arithmetic processing device comprising an arithmetic section whose function can be changed according to setting data supplied from outside,
wherein the arithmetic section includes:
a plurality of first arithmetic units each capable of selectively executing a plurality of types of arithmetic and logic operations other than multiplication; and
at least one second arithmetic unit capable of executing multiplication by itself.

The arithmetic processing device according to claim 1, wherein
the first arithmetic units constitute a first arithmetic unit array of x rows × y columns (x is an integer of 2 or more, and y is an integer of 2 or more), and
the second arithmetic unit constitutes a second arithmetic unit column or a second arithmetic unit array of m rows × n columns (m is a natural number equal to or less than x, and n is a natural number).

The arithmetic processing device according to claim 1, wherein
the first arithmetic units constitute a first arithmetic unit array of x rows × y columns (x is an integer of 2 or more, and y is an integer of 2 or more), and
the second arithmetic unit is provided for each of a plurality of rows of the first arithmetic unit array.

A conversion device that converts a source program into setting data to be processed by the arithmetic processing device according to any one of claims 1 to 3, the conversion device comprising:
a determination unit that determines whether multiplication processing included in the source program is to be executed using the shift operation function of the first arithmetic unit or using the multiplication function of the second arithmetic unit; and
a setting data generation unit that converts the multiplication processing into setting data according to the determination result of the determination unit.

The conversion device according to claim 4, wherein, when the multiplication processing is multiplication of a variable by a constant, the determination unit selects execution using the shift operation function of the first arithmetic unit if the constant is a power of 2, and selects execution using the multiplication function of the second arithmetic unit otherwise.

The conversion device according to claim 4, wherein, when the multiplication processing is multiplication of a variable by a constant, the determination unit selects execution using the shift operation function of the first arithmetic unit if the number of 1s in the binary representation of the constant is equal to or less than a predetermined set value, and selects execution using the multiplication function of the second arithmetic unit otherwise.

A conversion device that converts a source program into setting data to be processed by the arithmetic processing device according to claim 2, the conversion device comprising:
a determination unit that determines whether multiplication processing included in the source program is to be executed using the shift operation function of the first arithmetic unit or using the multiplication function of the second arithmetic unit; and
a setting data generation unit that converts the multiplication processing into setting data according to the determination result of the determination unit,
wherein the determination unit compares execution using the shift operation function of the first arithmetic unit with execution using the multiplication function of the second arithmetic unit and, with the number of rows of the first arithmetic unit array as the criterion, selects whichever can execute the multiplication processing in fewer rows.