JP4123884B2 - Signal processing circuit

Info

Publication number: JP4123884B2
Application number: JP2002273472A
Authority: JP
Inventors: 喜孝柏木; 龍一祖田; 清太郎大田
Original assignee: Yaskawa Electric Corp
Current assignee: Yaskawa Electric Corp
Priority date: 2002-09-19
Filing date: 2002-09-19
Publication date: 2008-07-23
Anticipated expiration: 2022-09-19
Also published as: JP2004110528A

Description

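[0001]
[Technical Field of the Invention]
The present invention relates to a signal processing circuit that generates, from an input signal, an output signal serving as a command to a circuit connected at the subsequent stage, and in particular to a signal processing circuit for embedded devices.
[0002]
[Prior Art]
As a representative example of the prior art, the following discussion centers on a system that controls a motor. FIG. 10 shows a system configuration that performs signal processing using a microprocessor (MPU). An MPU 101, a ROM 102 storing programs, a RAM 103 used as a work area, and a timer 104 that issues an interrupt signal to the MPU 101 at fixed intervals are connected via a bus B1. This configuration is also common in signal processing other than motor control, such as video and audio processing.
The part enclosed by the dotted line in FIG. 10 differs depending on the signal processing target. In the motor control example, it consists of a motor 107 to be controlled, a driver 106 that drives the motor 107, a sensor 108 that observes the motion of the motor 107, and a counter 105 that acquires the output of the sensor 108.
The ROM 102 stores an operating system (OS) and an application program, and control of the motor 107 is executed by the application program as one task on the OS. Because embedded systems require real-time behavior, a real-time OS is generally used.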
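[0003]
This application program is the core of the signal processing; in the motor control example, the control algorithm is written as a program. Examples include proportional control with a proportional element K_P as shown in FIG. 11, proportional-integral control with a proportional element K_P and an integral element ∫ having an integral constant K_I as shown in FIG. 12, and proportional-integral-derivative control with a proportional element K_P, an integral element ∫ having an integral constant K_I, and a derivative element d/dt having a derivative constant K_D as shown in FIG. 13.
Thus several algorithms can be devised for the same purpose to improve control performance, and they are used in practice. However, raising performance makes the signal processing more complex. The use of internal states as variables and of multiple inputs and multiple outputs, again for the sake of performance, makes the processing ever more complex.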
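The three control laws named above can be sketched in discrete time as follows. This is a minimal illustration, not code from the patent: the class name `DiscretePID`, the fixed sampling period `dt`, and the rectangular-sum and backward-difference approximations are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DiscretePID:
    """Discrete-time PID controller (illustrative sketch, hypothetical names)."""
    kp: float                 # proportional gain K_P
    ki: float                 # integral constant K_I
    kd: float                 # derivative constant K_D
    dt: float                 # sampling period in seconds (assumed fixed)
    integral: float = 0.0     # running sum that replaces the integral
    prev_error: float = 0.0   # previous error, needed for the difference

    def step(self, setpoint: float, measured: float) -> float:
        error = setpoint - measured
        self.integral += error * self.dt                  # integration as a sum
        derivative = (error - self.prev_error) / self.dt  # differentiation as a difference
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# ki = kd = 0 gives proportional control; kd = 0 gives proportional-integral control.
p_only = DiscretePID(kp=2.0, ki=0.0, kd=0.0, dt=0.01)
command = p_only.step(setpoint=1.0, measured=0.25)
```

Every term in `step` is a multiplication followed by an addition, which is why the later paragraphs focus on multiply-accumulate hardware.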
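[0004]
In signal processing based on digital processing, differentiation is handled as a difference and integration as a summation, so multiply-accumulate operations tend to be used heavily. For this reason, a digital signal processor (DSP), which is designed to execute multiply-accumulate operations at high speed, is often used instead of an MPU. Data dependencies on the arithmetic processing section of the DSP are eliminated so that continuously executed signal processing runs at high speed. Some MPUs also include a multiplier, or even a multiply-accumulate unit, in addition to the ALU (arithmetic logic unit) in order to perform signal processing at high speed.
Furthermore, once an algorithm no longer needs to be changed and can be fixed, the algorithm that was implemented as an application program on an MPU or DSP can be turned into hardware as a dedicated LSI. This addresses requirements such as signal processing speed and cost.
However, conventional signal processing circuits that perform processing by program on an MPU or DSP have several problems.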
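[0005]
The first problem is that supporting multiple applications requires as many signal processing algorithms as there are applications. Even within motor control, a wide variety of algorithms exists depending on the type of motor load, so multiple application programs are needed. In an embedded device, however, available resources such as the MPU, ROM, and RAM are limited, and only two or three algorithms can be stored in the ROM as programs. This makes it difficult to customize the device for each application. As a result, many system variants are produced in small quantities, maintenance becomes burdensome both in development and in the field, and system reliability suffers. Because each case is handled individually, manufacturing cost also increases.
The second problem is the processing time of signal processing. MPUs and DSPs basically process sequentially, so faster processing depends on the clock frequency. Shortening the signal processing time therefore requires a higher clock frequency. Raising the clock frequency, however, increases power consumption and unwanted radiated noise. This makes system development difficult and further impairs system reliability. Signal processing, moreover, is both time-consuming and hard to prioritize. It also has the property that the next process cannot start until the signal processing has finished and its result is available. For this reason, while signal processing is running, interrupts are disabled and the MPU or DSP is occupied exclusively by the signal processing. Even high-priority processing is therefore blocked, which is a major cause of degraded real-time performance.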
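[0006]
A further problem is that, although the algorithm itself is common to all systems, it must be turned into a program in a system-dependent form. The algorithm is written in a high-level language such as C, translated into a machine-language program that runs on the MPU or DSP, and finally stored in the ROM. If any part of the system changes, such as the OS or the compiler, let alone the MPU or DSP, the translated machine-language program will not run as it is, even if the system configuration is otherwise the same. This problem is more pronounced in embedded systems, which tend to be special-purpose products.
Because the program depends on the system even though the algorithm is the same, it is difficult to accumulate and reuse algorithms. In addition, customization requires replacing the entire program, and program size becomes an issue when the program is obtained over a network, in particular the Internet. Although resources in an embedded system are limited, a program still amounts to several hundred kilobytes, which is very large for PPP (Point-to-Point Protocol) over a telephone line.
One approach to the processing time problem is an LSI that implements the algorithm in hardware. Because the hardware is dedicated, however, the problem remains that it cannot be customized for other applications. If customization is made possible, a microcode scheme is adopted, and the same problems as with an MPU or DSP reappear.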
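[0007]
To solve these problems, a signal processing arithmetic unit has been proposed (see, for example, Patent Document 1) that comprises: a DSP (digital signal processor) core that performs basic arithmetic processing such as multiplication, addition, data transfer, and sequence control and outputs various data and command signals; a data bus connected to the DSP core; a plurality of functional blocks that share the data bus and each execute special processing, other than the basic arithmetic processing, corresponding to a command signal; and a selection circuit that selects one or more functional blocks according to an internal status signal and enables the special processing corresponding to the command signal. This yields a signal processing arithmetic circuit that can be adapted to various applications without major hardware changes that would raise cost.
[Patent Document 1]
JP-A-8-106375 (pages 3-4, FIGS. 1 and 2)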
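[0008]
[Problems to Be Solved by the Invention]
However, in the signal processing disclosed in Patent Document 1, roles are divided so that the DSP core performs the basic arithmetic processing and the functional blocks execute special processing that is difficult for the DSP core to handle. The speed of the basic arithmetic processing, as opposed to the special processing, is therefore limited by the memory capacity of the DSP core and the speed of its arithmetic circuits, and further speed-up is difficult.
Accordingly, an object of the present invention is to solve the above problems by providing a signal processing circuit that can realize a wide variety of signal processing algorithms in a single system, that offers customizability, flexibility, and high speed, and whose architecture reconciles high speed with a lower operating frequency so as to reduce power consumption and unwanted radiated noise at the same time.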
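[0009]
[Means for Solving the Problems]
To solve the above problems, a signal processing circuit according to a first configuration of the present invention applies arbitrary arithmetic processing to an input signal supplied from a preceding processor or circuit and generates an output signal serving as a command to a subsequent processor or circuit, and comprises:
a content storage unit that stores content obtained by extracting the features of signal processing and converting them into digital data, and that is accessible from an external circuit; a plurality of basic operation blocks that are accessible from an external circuit, are composed of arithmetic circuits having basic arithmetic functions, perform execution of operations, input control, and output control based on the content of the content storage unit, and can operate in parallel; a wiring network that can connect the input signal, the output signal, the content storage unit, and the basic operation blocks in any combination; and a global sequential circuit that generates, based on the content of the content storage unit, control signals that control the combination of the wiring network and the operation order of the basic operation blocks,
wherein each basic operation block comprises a basic operation unit and a local sequential circuit,
the basic operation unit comprises: an arithmetic processing unit that implements in wired logic at least one arithmetic operation circuit using the content of the content storage unit as one of its inputs and a shift circuit using the content of the content storage unit as its shift amount; a result register unit that is connected to the wiring network, holds the result of the arithmetic processing unit, and is accessible from an external circuit; and an input register unit that holds the input to the arithmetic processing unit received via the wiring network from the input signal and from any other basic operation block, and is accessible from an external circuit,
the local sequential circuit uses the control signal generated by the global sequential circuit as a start signal, uses the content of the content storage unit for its own control, and generates a control signal that controls the basic operation unit,
the wiring network comprises an n-port shared memory accessible from any of the basic operation blocks,
the global sequential circuit comprises a shared memory control unit that generates addresses and control signals for the shared memory based on the content of the content storage unit, and a block control unit that generates control signals controlling the operation order of the basic operation blocks based on the content of the content storage unit,
the data structure of the content stored in the content storage unit has a global sequential circuit data section and a plurality of basic operation block data sections,
the global sequential circuit data section stores global sequential circuit data that is used by the global sequential circuit to determine the combination of the wiring network and the operation order of the basic operation blocks, and that represents a data flow graph whose vertices are the basic operation block data sections, and
each basic operation block data section has a tag section that associates it with a vertex of the flow graph in the global sequential circuit data section, an arithmetic processing data section used for execution of operations, input control, and output control by the arithmetic processing unit, and a local sequential circuit data section used to control the local sequential circuit.
In the signal processing circuit of the first configuration, signal processing is carried out among the content storage unit, the plurality of basic operation blocks, the wiring network, and the global sequential circuit. A wide variety of algorithms can thus be realized on a single architecture, and the content, in which the features of each algorithm are extracted as data, configures the architecture to suit the corresponding algorithm. Operating several basic operation blocks in parallel allows high-speed signal processing and a lower system operating frequency.
Although the processing circuit of the arithmetic processing unit is itself fixed, feeding the content to the multiplication, addition, and shifter inputs changes the arithmetic expression that the circuit represents. The data path and control circuit of the arithmetic processing unit can also be kept simple, which makes implementation easy.
Because communication uses only the addresses and strobe signals supplied to the shared memories, simple control suffices for communication among the basic operation blocks, the content storage unit, the global sequential circuit, the input, and the output.
Finally, the qualitative information of a signal processing algorithm is captured as digital data in the form of decisions about how the architecture operates and adjustment values for optimization. Algorithms can therefore be stored in a database as system-independent data (content), which makes them easy to reuse.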
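[0010]
A signal processing circuit according to a second configuration of the present invention is the first configuration in which: the input register unit comprises an input vector register unit consisting of n registers accessible from the external circuit via the wiring network, so that a single arithmetic processing unit can be reused to perform arithmetic processing n (n > 1) times, and a multiplexer unit that multiplexes the outputs of the input vector register unit n-to-1 as the input to the arithmetic processing unit; the result register unit comprises a demultiplexer unit that demultiplexes each result of the arithmetic processing unit 1-to-n, and a result vector register unit consisting of n registers that hold the outputs of the demultiplexer unit, are connected to the wiring network, and are accessible from an external circuit; and the local sequential circuit, using the control signal generated by the global sequential circuit as a start signal and the content of the content storage unit for its own control, comprises an input control unit that generates a signal controlling the input register unit, an output control unit that generates a signal controlling the result register unit, and an operation control unit that generates a signal causing the arithmetic processing unit to operate repeatedly N (n ≥ N) times.
In the signal processing circuit of the second configuration, the arithmetic processing unit of any basic operation block is reused within the signal processing. This reduces the circuit scale required for implementation and speeds up processing that requires repetition, such as matrix operations.
[0011]
A signal processing circuit according to a third configuration of the present invention is the first or second configuration in which the basic operation unit of a basic operation block includes a buffer memory that is accessible from an external circuit, stores the operation result of any basic operation block at any point in time, and makes future predicted values available to subsequent operations.
In the third configuration, feedback processing using operation results from any point in time and feedforward processing using predicted values are possible, so highly accurate signal processing can be performed.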
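[0015]
[Embodiments of the Invention]
Embodiments of the present invention are described below with reference to FIGS. 1 to 9.
<First Embodiment>
The first embodiment, shown in FIG. 1, illustrates the basic concept of the signal processing circuit of the present invention. The following resources are connected to a wiring network 15: basic operation blocks 12, 13, 14, a content storage unit 11, an input signal S1 from a microprocessor (MPU) 1, and an output signal S2 to a circuit 2.
The content storage unit 11 stores content obtained by extracting the features of signal processing and converting them into digital data, and is accessible from an external circuit via the wiring network 15.
[0016]
The basic operation blocks 12, 13, 14 are accessible from an external circuit via the wiring network 15 and are composed of arithmetic circuits having basic arithmetic functions. Based on content S3 of the content storage unit 11, they adjust their arithmetic function, that is, they perform execution of operations, input control, and output control, and they can operate in parallel. The basic operation blocks 12, 13, 14 provide various operation patterns and serve as the basic parts from which the arithmetic expressions of signal processing are built. The relationship between operation patterns and arithmetic expressions can be likened to that between prime numbers and composite numbers: for any chosen upper limit, every composite number below it can be expressed using a set of smaller primes. By choosing basic operation blocks that play the role of this set of primes, a small number of blocks can be combined to form a wide variety of arithmetic expressions. The combination is determined by the content S3 of the content storage unit 11.
A global sequential circuit 16 generates, based on the content S3 of the content storage unit 11, a control signal S4 that controls the combination of the wiring network 15 and the operation order of the basic operation blocks 12, 13, 14.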
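[0017]
The operation of the first embodiment is as follows.
When a given input signal S1, for example a command signal for driving a motor in a prescribed motion, is input to the signal processing circuit from the MPU 1, which is an external circuit, the signal is passed to the content storage unit 11 via the wiring network 15. The content storage unit 11 outputs content S3, which represents the signal processing operations and their order for the input signal S1, to the basic operation blocks 12, 13, 14 and the global sequential circuit 16.
Each of the basic operation blocks 12, 13, 14 performs its operation based on the content S3 of the content storage unit 11. The global sequential circuit 16, based on the same content S3, determines the state transitions of the operation order of the basic operation blocks 12, 13, 14 and generates the control signal S4 that determines the combination of the wiring network 15. The output signal S2 to the circuit 2 is output from the wiring network 15.
With the signal processing circuit of the first embodiment, a wide variety of signal processing algorithms can be realized on a single architecture, and the content held as data in the content storage unit 11, in which the features of each algorithm are extracted, configures the architecture to suit the corresponding algorithm. Operating the basic operation blocks 12, 13, 14 in parallel allows high-speed signal processing and a lower system operating frequency.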
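[0018]
<Second Embodiment>
A signal processing circuit according to a second embodiment of the present invention is described with reference to FIG. 2, which shows a specific configuration of the basic operation blocks 12, 13, 14.
In this embodiment, the basic operation blocks 12, 13, 14, the content storage unit 11, the input signal S1, and the output signal S2 are connected to the wiring network 15. The global sequential circuit 16 generates, based on the content S3 of the content storage unit 11, the control signal S4 that controls the combination of the wiring network 15 and the operation order of the basic operation blocks 12, 13, 14. The control signal S4 serves as the start signal of a local sequential circuit 122.
Each of the basic operation blocks 12, 13, 14 consists of a basic operation unit 121 and the local sequential circuit 122. The basic operation unit 121 in turn consists of input register units 1212, 1213, a result register unit 1214, and an arithmetic processing unit 1211. The arithmetic processing unit 1211 provides a basic pattern for signal processing operations.
The local sequential circuit 122 uses the control signal S4 generated by the global sequential circuit 16 as its start signal, uses the content of the content storage unit 11 for its own control, and generates a control signal S10 that controls the basic operation unit 121.
[0019]
In this embodiment the basic pattern is an arithmetic processing unit 1211 that implements a multiply-accumulate operation in wired logic. The inputs of the basic operation unit 121 are stored in the input register units 1212, 1213 and used as the multiplicands of multipliers 1215, 1216, while the other operand of each multiplication (the multiplier value) is taken from the content S3 of the content storage unit 11. The outputs of the multipliers 1215, 1216 are fed to barrel shifters 1217, 1218 to adjust the arithmetic precision. The shift amounts of the barrel shifters 1217, 1218 are likewise taken from the content of the content storage unit 11. The outputs of the barrel shifters 1217, 1218 are added by an adder 1219, and the result is stored in the result register unit 1214. Because the arithmetic processing unit 1211, as an operation pattern, draws on the content S3 of the content storage unit 11, the arithmetic processing can be optimized. For example, setting one of the two multiplier values to 0 turns the arithmetic processing unit 1211 into a simple multiplier. Setting one multiplier value to 1 and the other to -1 yields the difference of the inputs.
In the second embodiment, although the processing circuit of the arithmetic processing unit 1211 is itself fixed, feeding the content to the multiplication, addition, and shifter inputs readily changes the arithmetic expression that the circuit represents.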
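The fixed multiply, shift, and add data path of the second embodiment can be modeled in a few lines. This is a sketch under assumptions: the function name `basic_operation`, the use of unbounded Python integers, and the choice of an arithmetic right shift are illustrative; the source only states that the shifters adjust precision and that coefficients and shift amounts come from the content.

```python
def basic_operation(x1: int, x2: int, k1: int, k2: int, s1: int = 0, s2: int = 0) -> int:
    """Model of the fixed data path: two multipliers, two shifters, one adder.

    x1, x2 : values held in the two input registers
    k1, k2 : multiplier values supplied by the content (hypothetical names)
    s1, s2 : shift amounts supplied by the content (right shift assumed)
    """
    return ((x1 * k1) >> s1) + ((x2 * k2) >> s2)


# One coefficient set to 0: the unit behaves as a plain multiplier.
product = basic_operation(6, 9, k1=5, k2=0)        # 6 * 5

# Coefficients 1 and -1: the unit returns the difference of its inputs.
difference = basic_operation(7, 3, k1=1, k2=-1)    # 7 - 3

# Fixed-point gain of 0.75 expressed as 192 / 2**8, using the shifter for scaling.
scaled = basic_operation(100, 0, k1=192, k2=0, s1=8)
```

The circuit never changes between the three calls; only the content values do, which is the point the paragraph above makes.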
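[0020]
<Third Embodiment>
A signal processing circuit according to a third embodiment of the present invention is described with reference to FIG. 3, which shows an example in which the arithmetic processing unit 1211 is reused repeatedly. This configuration makes matrices easy to handle. The basic operation blocks 12, 13, 14, the content storage unit 11, the input signal S1, and the output signal S2 are connected to the wiring network 15. The global sequential circuit 16 generates, based on the content of the content storage unit 11, the control signal S4 that controls the operation order of the basic operation blocks 12, 13, 14 and controls each resource. This control signal serves as the start signal of the local sequential circuit 122.
The local sequential circuit 122 consists of an input control unit 1221, an output control unit 1222, and an operation control unit 1223.
[0021]
The basic operation unit 121 consists of the input register units 1212, 1213, the arithmetic processing unit 1211, and the result register unit 1214.
The input register units 1212, 1213 consist of input vector register units 12110, 12111, 12112, 12113, 12114, 12115 and multiplexer units 12116, 12117, and the result register unit 1214 consists of a demultiplexer unit 12118 and result vector register units 12119, 12120, 12121. Based on the content stored in the content storage unit 11, the local sequential circuit 122 generates three control signals: the input control unit 1221 generates a control signal S101 that controls input switching and storage in the input register units 1212, 1213; the operation control unit 1223 generates a control signal S103 that controls the operation on the values of the input register units 1212, 1213; and the output control unit 1222 generates a control signal S102 that stores the result of the arithmetic processing unit 1211 in the appropriate register of the result register unit 1214. By controlling the input register units 1212, 1213, the arithmetic processing unit 1211, and the result register unit 1214 in step with one another, correct results are obtained even when the arithmetic processing unit 1211 is reused.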
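The reuse of a single arithmetic processing unit over vector registers can be sketched as a loop in which the loop index plays the role of the three control signals. The function names, the per-pass coefficient list, and the 2x2 matrix-vector example are assumptions chosen for illustration and do not appear in the source.

```python
from typing import List, Sequence, Tuple


def mac(x1: int, x2: int, k1: int, k2: int) -> int:
    """The single shared arithmetic processing unit (shifters omitted for brevity)."""
    return x1 * k1 + x2 * k2


def run_time_multiplexed(in_a: Sequence[int], in_b: Sequence[int],
                         coeffs: Sequence[Tuple[int, int]], n_ops: int) -> List[int]:
    """Reuse one MAC unit n_ops times over n input and n result registers (n_ops <= n)."""
    n = len(in_a)
    if len(in_b) != n or len(coeffs) < n_ops or not 0 < n_ops <= n:
        raise ValueError("need n registers per input and 0 < n_ops <= n")
    results = [0] * n                       # result vector registers
    for i in range(n_ops):                  # operation control: N passes through one unit
        x1, x2 = in_a[i], in_b[i]           # input control: n-to-1 multiplexers pick register i
        k1, k2 = coeffs[i]                  # per-pass coefficients taken from the content
        results[i] = mac(x1, x2, k1, k2)    # output control: 1-to-n demultiplexer stores result i
    return results


# Rows [1, 2] and [3, 4] of a matrix applied to the vector (5, 6), using 2 of 3 register slots.
y = run_time_multiplexed([5, 5, 0], [6, 6, 0], [(1, 2), (3, 4)], n_ops=2)
```

Keeping the read index, the coefficient index, and the write index identical is the software analogue of controlling the three units in step, as the paragraph above requires.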
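[0022]
<Fourth Embodiment>
A signal processing circuit according to a fourth embodiment of the present invention is described with reference to FIG. 4, which shows an example in which a buffer memory 12122 is used in the basic operation unit 121 of the basic operation blocks 12, 13, 14. The basic operation block 12, the content storage unit 11, the input signal S1, and the output signal S2 are connected to the wiring network 15. The global sequential circuit 16 generates the control signal S4 that controls each resource. This control signal serves as the start signal of the local sequential circuit 122. The local sequential circuit 122 generates the control signal S10 that controls the basic operation block 12. Both the global sequential circuit 16 and the local sequential circuit 122 work from the content S3 of the content storage unit 11, and the control signal S4 controls the operation order of the basic operation blocks 12, 13, 14.
If the buffer memory 12122 is used to store the operation results of the basic operation blocks 12, 13, 14, the signal processing algorithm can form a feedback system. If the buffer memory 12122 is used to store predicted future values, the signal processing algorithm can form a feedforward system. The buffer memory 12122 need not hold only one kind of data; both may be stored together.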
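As one illustration of the feedback use of the buffer memory, the sketch below stores each result and feeds it back into the next multiply-accumulate, which gives a first-order recursive filter. The class name `BufferedBlock`, the single-entry buffer, and the filter itself are assumptions for the example; the source does not prescribe a particular algorithm or buffer depth.

```python
class BufferedBlock:
    """Multiply-accumulate block with a one-entry buffer memory (illustrative sketch)."""

    def __init__(self, k_in: float, k_fb: float) -> None:
        self.k_in = k_in      # coefficient applied to the new input
        self.k_fb = k_fb      # coefficient applied to the buffered previous result
        self.buffer = 0.0     # buffer memory: result saved for the next sample

    def step(self, x: float) -> float:
        y = self.k_in * x + self.k_fb * self.buffer   # y[k] = k_in*x[k] + k_fb*y[k-1]
        self.buffer = y                               # store the result for feedback
        return y


block = BufferedBlock(k_in=0.5, k_fb=0.5)
outputs = [block.step(8.0) for _ in range(3)]   # approaches the constant input 8.0
```

A feedforward variant would preload `buffer` with predicted values instead of past results; that case is not shown here.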
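[0023]
<Fifth Embodiment>
A signal processing circuit according to a fifth embodiment of the present invention is described with reference to FIG. 5, which shows an example in which a crossbar switch 151 is used as the wiring network 15. The basic operation blocks 12, 13, 14, the content storage unit 11, the input signal S1, and the output signal S2 are connected to the crossbar switch 151.
A crossbar switch control unit 162 in the global sequential circuit 16 generates a control signal S42 that opens and closes the crosspoint switches: it closes those needed for communication between resources connected to the crossbar switch 151 and opens those that were used for earlier communication and are no longer needed. A block control unit 161 generates a control signal S41 that serves as the reference signal for the control performed in the basic operation blocks 12, 13, 14.
The content S3 of the content storage unit 11 is used by both the block control unit 161 and the crossbar switch control unit 162 to control the state transitions of the operation order. Owing to the characteristics of the crossbar switch 151, communication between the input and the basic operation block 12 can take place at the same time as communication between the basic operation blocks 13 and 14.
Because several communications between resources can proceed at once, the delay from input to signal processing output can be minimized.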
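The simultaneous transfers that a crossbar permits can be shown with a small behavioral model. The class `Crossbar`, its method names, and the resource labels are hypothetical; the model only captures that each sink is driven by at most one source and that all closed crosspoints transfer in the same cycle.

```python
from typing import Dict, Hashable, Iterable


class Crossbar:
    """Behavioral model of a crossbar switch (illustrative sketch, hypothetical API)."""

    def __init__(self, sources: Iterable[Hashable], sinks: Iterable[Hashable]) -> None:
        self.sources = set(sources)
        self.sinks = set(sinks)
        self.closed: Dict[Hashable, Hashable] = {}   # sink -> source for each closed crosspoint

    def connect(self, source: Hashable, sink: Hashable) -> None:
        if source not in self.sources or sink not in self.sinks:
            raise KeyError("unknown resource")
        if sink in self.closed:
            raise RuntimeError("sink is already driven by another source")
        self.closed[sink] = source                   # close the crosspoint

    def release(self, sink: Hashable) -> None:
        self.closed.pop(sink, None)                  # open a crosspoint no longer needed

    def transfer(self, values: Dict[Hashable, int]) -> Dict[Hashable, int]:
        """Move one value across every closed crosspoint within the same cycle."""
        return {sink: values[src] for sink, src in self.closed.items()}


xbar = Crossbar(sources=["input", "block13"], sinks=["block12", "block14"])
xbar.connect("input", "block12")      # input to block 12
xbar.connect("block13", "block14")    # block 13 to block 14, at the same time
delivered = xbar.transfer({"input": 3, "block13": 7})
```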
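[0024]
<Sixth Embodiment>
A signal processing circuit according to a sixth embodiment of the present invention is described with reference to FIG. 6, which shows an example in which a shared memory 152 is used as the wiring network 15. The basic operation blocks 12, 13, 14, the content storage unit 11, the input signal S1, and the output signal S2 are connected to the shared bus of the shared memory 152.
A shared memory control unit 163 in the global sequential circuit 16 generates a control signal S43, consisting of addresses and strobe signals, that controls communication between the resources connected to the shared memory 152.
The block control unit 161 generates the control signal S41 that serves as the reference signal for the control performed in the basic operation blocks 12, 13, 14.
The content S3 of the content storage unit 11 is used by both the block control unit 161 and the shared memory control unit 163 to control the state transitions of the operation order.
As with the crossbar switch 151 of the fifth embodiment, the shared memory 152 allows several communications between resources to proceed at once, so the delay from input to signal processing output can be kept small. Because a memory is interposed as one buffering stage, however, the delay is somewhat larger than with the crossbar switch. On the other hand, communication can be controlled by address, which simplifies the control circuit.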
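Communication through a multi-port shared memory using only an address and a strobe can be modeled as below. The class `SharedMemory`, the `cycle` method, and the rule that reads observe values written in earlier cycles are assumptions for the sketch; the last of these is what makes the extra buffering stage mentioned above visible.

```python
from typing import List, Sequence, Tuple


class SharedMemory:
    """n-port shared memory driven by addresses and strobes (illustrative sketch)."""

    def __init__(self, size: int, ports: int) -> None:
        self.cells = [0] * size
        self.ports = ports                      # number of accesses allowed per cycle

    def cycle(self, writes: Sequence[Tuple[int, int, bool]] = (),
              reads: Sequence[int] = ()) -> List[int]:
        """Perform one cycle. writes: (address, data, strobe); reads: addresses."""
        if len(writes) + len(reads) > self.ports:
            raise ValueError("more accesses than ports in one cycle")
        out = [self.cells[a] for a in reads]    # reads see values stored before this cycle
        for address, data, strobe in writes:
            if strobe:                          # a write takes effect only when strobed
                self.cells[address] = data
        return out


mem = SharedMemory(size=4, ports=2)
mem.cycle(writes=[(0, 42, True), (1, 7, False)])   # producer strobes address 0 only
seen = mem.cycle(reads=[0, 1])                      # consumer reads one cycle later
```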
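[0025]
<Seventh Embodiment>
A signal processing circuit according to a seventh embodiment of the present invention is described with reference to FIG. 7, which shows an example in which a shared bus 153 is used as the wiring network 15. The basic operation blocks 12, 13, 14, the content storage unit 11, the input signal S1, and the output signal S2 are connected to the shared bus 153.
A shared bus control unit 164 in the global sequential circuit 16 generates a control signal S44 that selects which resources connected to the shared bus 153 communicate and in which direction.
The block control unit 161 generates the control signal S41 that serves as the reference signal for the control performed in the basic operation blocks 12, 13, 14.
The content S3 of the content storage unit 11 is used by both the block control unit 161 and the shared bus control unit 164 to control the state transitions of the operation order of the circuit. Unlike the crossbar switch and the shared memory, the shared bus cannot carry several communications between resources at once, but it is easy to control and requires only a small circuit.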
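[0026]
<Eighth Embodiment>
A signal processing circuit according to an eighth embodiment of the present invention is described with reference to FIG. 8, which shows a configuration with more than one wiring network, two in this example. Wiring networks 15-1 and 15-2 exist independently of each other, and the basic operation blocks 12, 13, 14, the content storage unit 11, the global sequential circuit 16, the input signal S1, and the output signal S2 are each connected to both wiring networks 15-1 and 15-2. Communications can therefore proceed in parallel, for example two inputs, two outputs, or an input and an output.
The content S3 of the content storage unit 11 is supplied to the basic operation blocks 12, 13, 14, the wiring networks 15-1 and 15-2, and the global sequential circuit 16, and adjusts and optimizes each block. The global sequential circuit 16 generates control signals that determine the combination and operation of the basic operation blocks 12, 13, 14 and the wiring networks 15-1 and 15-2.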
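[0027]
<Ninth Embodiment>
A signal processing circuit according to a ninth embodiment of the present invention is described with reference to FIG. 9, which shows the structure of content data D1.
The content data D1 has a global sequential circuit data section D11 and basic operation block data sections D12, D13, D14, D15.
The global sequential circuit data section D11 represents the data flow graph of the input signal, the output signal, and the basic operation blocks. The graph also expresses the time-axis component, including parallelism. The vertices of this data flow graph correspond to the individual basic operation blocks.
The basic operation block data sections D12, D13, D14, D15 each have a tag section (D121, D131, D141, D151) that indicates the relationship to a vertex of the data flow graph in the global sequential circuit data section D11, an arithmetic processing data section (D122, D132, D142, D152) that tunes the circuit of the arithmetic processing unit 1211, and a local sequential circuit data section (D123, D133, D143, D153) used to determine the operation of the local sequential circuit 122, which performs control within the basic operation unit 121.
[0028]
The number of basic operation block data sections D12, D13, ... equals the number of vertices of the data flow graph in the global sequential circuit data section D11.
In the ninth embodiment, the qualitative information of a signal processing algorithm can be captured as digital data in the form of decisions about how the architecture operates and adjustment values for optimization. Algorithms can therefore be stored in a database as system-independent data (content), which makes them easy to reuse. Moreover, unlike a program that expresses an algorithm, the content is system-independent data, so with this signal processing circuit the algorithm runs in any system configuration without modifying the data (content). The content can also be made very small, which makes it well suited to transfer over the Internet.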
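The content layout described above, one graph section plus one tagged data section per vertex, can be sketched with plain data classes. All field names and the step-by-step scheduler are assumptions for illustration; the source specifies only the three parts of each block data section and that the graph determines operation order and parallelism.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class BlockData:                       # one basic operation block data section (D12, D13, ...)
    tag: str                           # tag section: names the graph vertex it belongs to
    coefficients: Tuple[int, int]      # arithmetic processing data (hypothetical layout)
    shifts: Tuple[int, int]            # arithmetic processing data (hypothetical layout)
    repeat: int = 1                    # local sequential circuit data (hypothetical field)


@dataclass
class Content:                         # content data D1
    edges: List[Tuple[str, str]]       # global sequential circuit data: data flow graph edges
    blocks: Dict[str, BlockData]       # one entry per vertex of the graph


def parallel_schedule(content: Content) -> List[List[str]]:
    """Group block tags into steps; blocks within one step do not depend on each other."""
    pending = {tag: set() for tag in content.blocks}
    for src, dst in content.edges:
        if src in pending and dst in pending:     # edges to "in"/"out" impose no ordering
            pending[dst].add(src)
    steps: List[List[str]] = []
    done: set = set()
    while pending:
        ready = sorted(t for t, deps in pending.items() if deps <= done)
        if not ready:
            raise ValueError("data flow graph contains a cycle")
        steps.append(ready)
        done.update(ready)
        for t in ready:
            del pending[t]
    return steps


content = Content(
    edges=[("in", "b12"), ("in", "b13"), ("b12", "b14"), ("b13", "b14"), ("b14", "out")],
    blocks={t: BlockData(tag=t, coefficients=(1, 1), shifts=(0, 0)) for t in ("b12", "b13", "b14")},
)
order = parallel_schedule(content)     # b12 and b13 can run together, then b14
```

Because the whole description is a handful of integers and edge pairs, it stays small and independent of any processor or compiler, which is the reuse argument made in paragraph [0028]. Feedback paths through a buffer memory would need extra handling, since this sketch rejects cyclic graphs.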
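[0029]
[Effects of the Invention]
The present invention provides the following effects.
(1) According to the first configuration of the present invention, a wide variety of signal processing algorithms can be realized on a single architecture, and the content, in which the features of each algorithm are extracted as data, configures the architecture to suit the corresponding algorithm. Operating several basic operation blocks in parallel allows high-speed signal processing and a lower system operating frequency.
A qualitative algorithm can thus be expressed as content, which is quantitative digital data, and the architecture can be configured by the content, providing a signal processing circuit with excellent customizability and flexibility. In addition, parallel processing both speeds up processing and allows a lower operating frequency, which reduces power consumption and unwanted radiated noise, improves reliability, and lowers system cost.
(2) Although the processing circuit of the arithmetic processing unit is itself fixed, feeding the content to the multiplication, addition, and shifter inputs changes the arithmetic expression that the circuit represents. The data path and control circuit of the arithmetic processing unit can also be kept simple, which makes implementation easy.
(3) Because communication uses only the addresses and strobe signals supplied to the memory, simple control suffices for communication among the basic operation blocks, the content storage unit, the global sequential circuit, the input, and the output.
(4) The qualitative information of a signal processing algorithm can be captured as digital data in the form of decisions about how the architecture operates and adjustment values for optimization. Algorithms can therefore be stored in a database as system-independent data (content), which makes them easy to reuse. Moreover, unlike a program that expresses an algorithm, the content is system-independent data, so with this signal processing circuit the algorithm runs in any system configuration without modifying the data (content). The content can also be made very small, which makes it well suited to transfer over the Internet.
[0030]
(5) According to the second configuration of the present invention, any basic operation block can be reused within the signal processing, which reduces the circuit scale required for implementation and speeds up processing that requires repetition, such as matrix operations.
[0031]
(6) According to the third configuration of the present invention, the basic operation unit of a basic operation block includes a buffer memory that is accessible from an external circuit and makes the operation results of any basic operation block at any point in time, as well as future predicted values, available to subsequent operations. Feedback processing using operation results from any point in time and feedforward processing using predicted values are therefore possible, so highly accurate signal processing can be performed.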
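[Brief Description of the Drawings]
[FIG. 1] A block diagram showing the configuration of the first embodiment of the present invention.
[FIG. 2] A block diagram showing the configuration of the second embodiment of the present invention.
[FIG. 3] A block diagram showing the configuration of the third embodiment of the present invention.
[FIG. 4] A block diagram showing the configuration of the fourth embodiment of the present invention.
[FIG. 5] A block diagram showing the configuration of the fifth embodiment of the present invention.
[FIG. 6] A block diagram showing the configuration of the sixth embodiment of the present invention.
[FIG. 7] A block diagram showing the configuration of the seventh embodiment of the present invention.
[FIG. 8] A block diagram showing the configuration of the eighth embodiment of the present invention.
[FIG. 9] A block diagram showing the configuration of the ninth embodiment of the present invention.
[FIG. 10] A block diagram showing the configuration of an example of a conventional signal processing circuit.
[FIG. 11] A block diagram showing the configuration of conventional proportional control for motor control.
[FIG. 12] A block diagram showing the configuration of conventional proportional-integral control for motor control.
[FIG. 13] A block diagram showing the configuration of conventional proportional-integral-derivative control for motor control.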
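[Explanation of Reference Signs]
1: MPU
2: Circuit
11: Content storage unit
12: Basic operation block
121: Basic operation unit
1211: Arithmetic processing unit
1212, 1213: Input register unit
1214: Result register unit
1215, 1216: Multiplier
1217, 1218: Barrel shifter
1219: Adder
12110 to 12115: Input vector register unit
12116, 12117: Multiplexer unit
12118: Demultiplexer unit
12119 to 12121: Result vector register unit
12122: Buffer memory
122: Local sequential circuit
1221: Input control unit
1222: Output control unit
1223: Operation control unit
13, 14: Basic operation block
15, 15-1, 15-2: Wiring network
151: Crossbar switch
152: Shared memory
153: Shared bus
16: Global sequential circuit
161: Block control unit
162: Crossbar switch control unit
163: Shared memory control unit
164: Shared bus control unit
D1: Content data
D11: Global sequential circuit data section
D12 to D15: Basic operation block data section
D121, D131, D141, D151: Tag section
D122, D132, D142, D152: Arithmetic processing data section
D123, D133, D143, D153: Local sequential circuit data section
[0001]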
[Technical Field of the Invention]
  The present invention relates to a signal processing circuit that generates an output signal serving as a command to a circuit connected to a subsequent stage based on an input signal, and more particularly to a signal processing circuit in the field of embedded devices.
[0002]
[Prior art]
  As a representative example of the prior art, the following discussion centers on a system that controls a motor. FIG. 10 shows a system configuration that performs signal processing using a microprocessor (MPU). An MPU 101, a ROM 102 storing programs, a RAM 103 used as a work area, and a timer 104 that issues an interrupt signal to the MPU 101 at fixed intervals are connected via a bus B1. This configuration is also common in signal processing other than motor control, such as video and audio processing.
  The part enclosed by the dotted line in FIG. 10 differs depending on the signal processing target. In the motor control example, it consists of a motor 107 to be controlled, a driver 106 that drives the motor 107, a sensor 108 that observes the motion of the motor 107, and a counter 105 that acquires the output of the sensor 108.
  The ROM 102 stores an operating system (OS) and an application program, and the control of the motor 107 is executed by the application program as one task on the OS. In the case of an embedded system, since a real-time property is required, a real-time OS is generally used.
[0003]
  This application program is the core of the signal processing; in the motor control example, the control algorithm is written as a program. Examples include proportional control with a proportional element K_P as shown in FIG. 11, proportional-integral control with a proportional element K_P and an integral element ∫ having an integral constant K_I as shown in FIG. 12, and proportional-integral-derivative control with a proportional element K_P, an integral element ∫ having an integral constant K_I, and a derivative element d/dt having a derivative constant K_D as shown in FIG. 13.
  Thus several algorithms can be devised for the same purpose to improve control performance, and they are used in practice. However, raising performance makes the signal processing more complex. The use of internal states as variables and of multiple inputs and multiple outputs, again for the sake of performance, makes the processing ever more complex.
[0004]
  In signal processing based on digital processing, differentiation is handled as a difference and integration as a summation, so multiply-accumulate operations tend to be used heavily. For this reason, a digital signal processor (DSP), which is designed to execute multiply-accumulate operations at high speed, is often used instead of an MPU. Data dependencies on the arithmetic processing section of the DSP are eliminated so that continuously executed signal processing runs at high speed. Some MPUs also include a multiplier, or even a multiply-accumulate unit, in addition to the ALU (arithmetic logic unit) in order to perform signal processing at high speed.
  Furthermore, once an algorithm no longer needs to be changed and can be fixed, the algorithm that was implemented as an application program on an MPU or DSP can be turned into hardware as a dedicated LSI. This addresses requirements such as signal processing speed and cost.
  However, the conventional signal processing circuit that performs processing by a program using an MPU or DSP has several problems.
[0005]
  The first problem is that supporting multiple applications requires as many signal processing algorithms as there are applications. Even within motor control, a wide variety of algorithms exists depending on the type of motor load, so multiple application programs are needed. In an embedded device, however, available resources such as the MPU, ROM, and RAM are limited, and only two or three algorithms can be stored in the ROM as programs. This makes it difficult to customize the device for each application. As a result, many system variants are produced in small quantities, maintenance becomes burdensome both in development and in the field, and system reliability suffers. Because each case is handled individually, manufacturing cost also increases.
  The second problem is the processing time of signal processing. MPUs and DSPs basically process sequentially, so faster processing depends on the clock frequency. Shortening the signal processing time therefore requires a higher clock frequency. Raising the clock frequency, however, increases power consumption and unwanted radiated noise. This makes system development difficult and further impairs system reliability. Signal processing, moreover, is both time-consuming and hard to prioritize. It also has the property that the next process cannot start until the signal processing has finished and its result is available. For this reason, while signal processing is running, interrupts are disabled and the MPU or DSP is occupied exclusively by the signal processing. Even high-priority processing is therefore blocked, which is a major cause of degraded real-time performance.
[0006]
  A further problem is that, although the algorithm itself is common to all systems, it must be turned into a program in a system-dependent form. The algorithm is written in a high-level language such as C, translated into a machine-language program that runs on the MPU or DSP, and finally stored in the ROM. If any part of the system changes, such as the OS or the compiler, let alone the MPU or DSP, the translated machine-language program will not run as it is, even if the system configuration is otherwise the same. This problem is more pronounced in embedded systems, which tend to be special-purpose products.
  Because the program depends on the system even though the algorithm is the same, it is difficult to accumulate and reuse algorithms. In addition, customization requires replacing the entire program, and program size becomes an issue when the program is obtained over a network, in particular the Internet. Although resources in an embedded system are limited, a program still amounts to several hundred kilobytes, which is very large for PPP (Point-to-Point Protocol) over a telephone line.
  One approach to the processing time problem is an LSI that implements the algorithm in hardware. Because the hardware is dedicated, however, the problem remains that it cannot be customized for other applications. If customization is made possible, a microcode scheme is adopted, and the same problems as with an MPU or DSP reappear.
[0007]
  To solve these problems, a signal processing arithmetic unit has been proposed (see, for example, Patent Document 1) that comprises: a DSP (digital signal processor) core that performs basic arithmetic processing such as multiplication, addition, data transfer, and sequence control and outputs various data and command signals; a data bus connected to the DSP core; a plurality of functional blocks that share the data bus and each execute special processing, other than the basic arithmetic processing, corresponding to a command signal; and a selection circuit that selects one or more functional blocks according to an internal status signal and enables the special processing corresponding to the command signal. This yields a signal processing arithmetic circuit that can be adapted to various applications without major hardware changes that would raise cost.
[Patent Document 1]
          JP-A-8-106375 (page 3-4, FIGS. 1 and 2)
[0008]
[Problems to be solved by the invention]
  However, in the signal processing disclosed in Patent Document 1, roles are divided so that the DSP core performs the basic arithmetic processing and the functional blocks execute special processing that is difficult for the DSP core to handle. The speed of the basic arithmetic processing, as opposed to the special processing, is therefore limited by the memory capacity of the DSP core and the speed of its arithmetic circuits, and further speed-up is difficult.
  Accordingly, an object of the present invention is to solve the above problems by providing a signal processing circuit that can realize a wide variety of signal processing algorithms in a single system, that offers customizability, flexibility, and high speed, and whose architecture reconciles high speed with a lower operating frequency so as to reduce power consumption and unwanted radiated noise at the same time.
[0009]
[Means for Solving the Problems]
  To solve the above problems, a signal processing circuit according to a first configuration of the present invention applies arbitrary arithmetic processing to an input signal supplied from a preceding processor or circuit and generates an output signal serving as a command to a subsequent processor or circuit, and comprises:
  a content storage unit that stores content obtained by extracting the features of signal processing and converting them into digital data, and that is accessible from an external circuit; a plurality of basic operation blocks that are accessible from an external circuit, are composed of arithmetic circuits having basic arithmetic functions, perform execution of operations, input control, and output control based on the content of the content storage unit, and can operate in parallel; a wiring network that can connect the input signal, the output signal, the content storage unit, and the basic operation blocks in any combination; and a global sequential circuit that generates, based on the content of the content storage unit, control signals that control the combination of the wiring network and the operation order of the basic operation blocks,
  wherein each basic operation block comprises a basic operation unit and a local sequential circuit,
  the basic operation unit comprises: an arithmetic processing unit that implements in wired logic at least one arithmetic operation circuit using the content of the content storage unit as one of its inputs and a shift circuit using the content of the content storage unit as its shift amount; a result register unit that is connected to the wiring network, holds the result of the arithmetic processing unit, and is accessible from an external circuit; and an input register unit that holds the input to the arithmetic processing unit received via the wiring network from the input signal and from any other basic operation block, and is accessible from an external circuit,
  the local sequential circuit uses the control signal generated by the global sequential circuit as a start signal, uses the content of the content storage unit for its own control, and generates a control signal that controls the basic operation unit,
  the wiring network comprises an n-port shared memory accessible from any of the basic operation blocks,
  the global sequential circuit comprises a shared memory control unit that generates addresses and control signals for the shared memory based on the content of the content storage unit, and a block control unit that generates control signals controlling the operation order of the basic operation blocks based on the content of the content storage unit,
  the data structure of the content stored in the content storage unit has a global sequential circuit data section and a plurality of basic operation block data sections,
  the global sequential circuit data section stores global sequential circuit data that is used by the global sequential circuit to determine the combination of the wiring network and the operation order of the basic operation blocks, and that represents a data flow graph whose vertices are the basic operation block data sections, and
  each basic operation block data section has a tag section that associates it with a vertex of the flow graph in the global sequential circuit data section, an arithmetic processing data section used for execution of operations, input control, and output control by the arithmetic processing unit, and a local sequential circuit data section used to control the local sequential circuit.
  In the signal processing circuit of the first configuration, signal processing is carried out among the content storage unit, the plurality of basic operation blocks, the wiring network, and the global sequential circuit. A wide variety of algorithms can thus be realized on a single architecture, and the content, in which the features of each algorithm are extracted as data, configures the architecture to suit the corresponding algorithm. Operating several basic operation blocks in parallel allows high-speed signal processing and a lower system operating frequency.
  Although the processing circuit of the arithmetic processing unit is itself fixed, feeding the content to the multiplication, addition, and shifter inputs changes the arithmetic expression that the circuit represents. The data path and control circuit of the arithmetic processing unit can also be kept simple, which makes implementation easy.
  Because communication uses only the addresses and strobe signals supplied to the shared memories, simple control suffices for communication among the basic operation blocks, the content storage unit, the global sequential circuit, the input, and the output.
  Finally, the qualitative information of a signal processing algorithm is captured as digital data in the form of decisions about how the architecture operates and adjustment values for optimization. Algorithms can therefore be stored in a database as system-independent data (content), which makes them easy to reuse.
[0010]
  A signal processing circuit according to a second configuration of the present invention is the first configuration in which: the input register unit comprises an input vector register unit consisting of n registers accessible from the external circuit via the wiring network, so that a single arithmetic processing unit can be reused to perform arithmetic processing n (n > 1) times, and a multiplexer unit that multiplexes the outputs of the input vector register unit n-to-1 as the input to the arithmetic processing unit; the result register unit comprises a demultiplexer unit that demultiplexes each result of the arithmetic processing unit 1-to-n, and a result vector register unit consisting of n registers that hold the outputs of the demultiplexer unit, are connected to the wiring network, and are accessible from an external circuit; and the local sequential circuit, using the control signal generated by the global sequential circuit as a start signal and the content of the content storage unit for its own control, comprises an input control unit that generates a signal controlling the input register unit, an output control unit that generates a signal controlling the result register unit, and an operation control unit that generates a signal causing the arithmetic processing unit to operate repeatedly N (n ≥ N) times.
  In the signal processing circuit of the second configuration, the arithmetic processing unit of any basic operation block is reused within the signal processing. This reduces the circuit scale required for implementation and speeds up processing that requires repetition, such as matrix operations.
[0011]
  A signal processing circuit according to a third configuration of the present invention is the first or second configuration in which the basic operation unit of a basic operation block includes a buffer memory that is accessible from an external circuit, stores the operation result of any basic operation block at any point in time, and makes future predicted values available to subsequent operations.
  In the third configuration, feedback processing using operation results from any point in time and feedforward processing using predicted values are possible, so highly accurate signal processing can be performed.
[0015]
DETAILED DESCRIPTION OF THE INVENTION
  Hereinafter, embodiments of the present invention will be described with reference to FIGS.
<First Embodiment>
  The first embodiment, shown in FIG. 1, explains the basic concept of the signal processing circuit of the present invention. The basic operation blocks 12, 13, and 14, the content storage unit 11, the input signal S1 from the microprocessor (MPU) 1, and the output signal S2 to the circuit 2 are connected to the wiring network 15.
  The content storage unit 11 stores content obtained by extracting the features of the signal processing and converting them into digital data, and is accessible from an external circuit via the wiring network 15.
[0016]
  The basic operation blocks 12, 13, and 14 are accessible from an external circuit via the wiring network 15 and are composed of arithmetic circuits having basic arithmetic functions. Based on the content S3 stored in the content storage unit 11, they adjust their arithmetic function, that is, they execute operations, input control, and output control, and they can operate in parallel. The basic operation blocks 12, 13, and 14 provide various operation patterns and serve as the basic parts from which the arithmetic expressions of signal processing are built. The relationship between an operation pattern and an arithmetic expression of signal processing can be compared to that between prime numbers and composite numbers. That is, for any chosen upper bound, every composite number below it (every number other than a prime) can be expressed by a set of primes smaller than itself. By choosing basic operation blocks that play the role of this set of primes, a small number of basic operation blocks can be combined to form a wide variety of arithmetic expressions. This combination is determined by the content S3 stored in the content storage unit 11.
  The global sequential circuit 16 has the function of generating, based on the content S3 stored in the content storage unit 11, a control signal S4 that controls the combination of connections in the wiring network 15 and the operation order of the basic operation blocks 12, 13, and 14.
[0017]
  Next, the operation of the first embodiment will be described.
  When a predetermined input signal S1, for example a command signal for driving the motor through a predetermined operation, is input to the signal processing circuit from the MPU 1, which is an external circuit, the signal is transmitted to the content storage unit 11 via the wiring network 15. The content storage unit 11 outputs the content S3, which represents the content and order of the signal processing corresponding to the input signal S1, to the basic operation blocks 12, 13, and 14 and to the global sequential circuit 16.
  Each of the basic operation blocks 12, 13, and 14 performs its operation based on the content S3 of the content storage unit 11. The global sequential circuit 16, also based on the content S3 of the content storage unit 11, determines the state transitions of the operation sequence of the basic operation blocks 12, 13, and 14 and generates the control signal S4 that determines the combination of connections in the wiring network 15. The output signal S2 to the circuit 2 is output from the wiring network 15.
  According to the signal processing circuit of the first embodiment, a wide variety of signal processing algorithms can be realized on a single architecture. The features of each algorithm are handled by the content, extracted as data and held in the content storage unit 11, so the architecture can be optimized for the algorithm corresponding to that content. In addition, operating the plurality of basic operation blocks 12, 13, and 14 in parallel enables high-speed signal processing and allows the operating frequency of the system to be lowered.
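  As an illustration only, and not as part of the disclosed circuit, the way a data flow description can yield groups of basic operation blocks that operate in the same step might be modeled in Python as follows. The vertex names `B12`, `B13`, `B14`, and `OUT`, the function `parallel_stages`, and the dictionary encoding of dependencies are all assumptions made for this sketch; they stand in for the part of the content S3 that the global sequential circuit 16 reads.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_stages(dependencies):
    """Group vertices into stages whose members can operate in parallel.

    `dependencies` maps each vertex to the set of vertices whose results
    it needs. This encoding is an assumption of the sketch, not the
    content format of the patent.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # vertices whose inputs are all available
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


# Hypothetical flow: B12 feeds B13 and B14, whose results are combined at OUT.
example = {"B13": {"B12"}, "B14": {"B12"}, "OUT": {"B13", "B14"}}
stages = parallel_stages(example)
```

  In this hypothetical flow, `B13` and `B14` land in the same stage, which corresponds to the parallel operation of basic operation blocks described above.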
[0018]
<Second Embodiment>
  A signal processing circuit according to a second embodiment of the present invention will be described with reference to FIG. FIG. 2 illustrates a specific configuration of the basic arithmetic blocks 12, 13, and 14.
  In the present embodiment, the basic operation blocks 12, 13, and 14, the content storage unit 11, the input signal S1, and the output signal S2 are connected to the wiring network 15. Based on the content S3 stored in the content storage unit 11, the global sequential circuit 16 generates a control signal S4 that controls the combination of connections in the wiring network 15 and the operation sequence of the basic operation blocks 12, 13, and 14. This control signal S4 serves as the activation signal for the local sequential circuit 122.
  The basic operation blocks 12, 13, and 14 include a basic operation unit 121 and a local sequential circuit 122. Further, the basic arithmetic unit 121 includes input register units 1212 and 1213, a result register unit 1214, and an arithmetic processing unit 1211. The arithmetic processing unit 1211 provides a basic pattern for performing signal processing arithmetic.
  The local sequential circuit 122 uses the control signal S4 generated by the global sequential circuit 16 as its activation signal and, using the content S3 of the content storage unit 11, generates a control signal S10 for controlling the basic arithmetic unit 121.
[0019]
  In the present embodiment, an arithmetic processing unit 1211 that implements a product-sum operation in wired logic is described as a basic pattern. The inputs of the basic arithmetic unit 121 are stored in the input register units 1212 and 1213 and used as the multiplicands of the multipliers 1215 and 1216, while the content S3 stored in the content storage unit 11 is used as the multipliers. The outputs of the multipliers 1215 and 1216 are input to the barrel shifters 1217 and 1218 in order to adjust the calculation accuracy. The shift amounts of the barrel shifters 1217 and 1218 are likewise taken from the content of the content storage unit 11. The outputs of the barrel shifters 1217 and 1218 are added by the adder 1219, and the result is stored in the result register unit 1214. By applying the content S3 of the content storage unit 11 to the arithmetic processing unit 1211, which embodies one operation pattern, the arithmetic processing can be optimized. For example, by setting one of the multipliers to 0, the arithmetic processing unit 1211 can be used as a simple multiplier. Further, by setting one of the multipliers to 1 and the other to -1, the difference of the inputs can be obtained.
  In the second embodiment, although the processing circuit of the arithmetic processing unit 1211 is itself fixed, the arithmetic expression that the circuit realizes can easily be changed by supplying content as the inputs to the multiplication, addition, and shift stages.
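  The product-sum pattern can be pictured with a small Python model. This is a minimal sketch under stated assumptions rather than the circuit itself: the names `MacContent` and `product_sum` are hypothetical, integers stand in for register values, and the barrel shifters are assumed to shift right (the text says only that they adjust calculation accuracy).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacContent:
    """Hypothetical slice of content S3 for one product-sum unit."""
    k_a: int          # multiplier applied to the first input register
    k_b: int          # multiplier applied to the second input register
    shift_a: int = 0  # shift amount for the first barrel shifter
    shift_b: int = 0  # shift amount for the second barrel shifter


def product_sum(a, b, content):
    """Model of the fixed data path: multiply, shift, then add.

    The right-shift direction is an assumption of this sketch.
    """
    shifted_a = (a * content.k_a) >> content.shift_a
    shifted_b = (b * content.k_b) >> content.shift_b
    return shifted_a + shifted_b


# Same data path, three different arithmetic expressions chosen by content:
as_multiplier = product_sum(7, 3, MacContent(k_a=5, k_b=0))   # 5 * a
as_difference = product_sum(7, 3, MacContent(k_a=1, k_b=-1))  # a - b
scaled_sum = product_sum(6, 2, MacContent(k_a=3, k_b=4, shift_a=1, shift_b=2))
```

  Only the `MacContent` values change between the three calls, which mirrors the point that the circuit is fixed while the expression it computes is selected by content.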
[0020]
<Third Embodiment>
  A signal processing circuit according to a third embodiment of the present invention will be described with reference to FIG. 3. FIG. 3 shows an embodiment in which the arithmetic processing unit 1211 is operated in an overlapped (repeated) manner. With this configuration, matrix operations can be handled easily. Each resource, namely the basic operation blocks 12, 13, and 14, the content storage unit 11, the input signal S1, and the output signal S2, is connected to the wiring network 15. Based on the content of the content storage unit 11, the global sequential circuit 16 generates a control signal S4 that controls the operation sequence of the basic operation blocks 12, 13, and 14 and controls each resource. This control signal serves as the activation signal for the local sequential circuit 122.
  The local sequential circuit 122 includes an input control unit 1221, an output control unit 1222, and an arithmetic control unit 1223.
[0021]
  The basic arithmetic unit 121 includes input register units 1212 and 1213, an arithmetic processing unit 1211, and a result register unit 1214.
  Further, the input register units 1212 and 1213 include input vector register units 12110, 12111, 12112, 12113, 12114, and 12115 and multiplexer units 12116 and 12117, and the result register unit 1214 includes the demultiplexer unit 12118 and the result vector register units 12119, 12120, and 12121. Based on the content stored in the content storage unit 11, within the local sequential circuit 122 the input control unit 1221 generates a control signal S101 that controls input switching and storage in the input register units 1212 and 1213, the arithmetic control unit 1223 generates a control signal S103 that controls the operation performed on the values of the input register units 1212 and 1213, and the output control unit 1222 generates a control signal S102 that controls storing the result of the arithmetic processing unit 1211 in the appropriate register of the result register unit 1214. By controlling the input register units 1212 and 1213, the arithmetic processing unit 1211, and the result register unit 1214 in this way, a correct result is obtained even when the arithmetic processing unit 1211 is used in an overlapped manner.
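  The overlapped use of a single arithmetic processing unit can be sketched as a loop in which the loop index plays the role of the multiplexer and demultiplexer select signals. The function name `overlapped_product_sum`, the per-pass coefficient pairs, and the use of Python lists for the vector registers are assumptions of this sketch; shifters are omitted for brevity.

```python
def overlapped_product_sum(regs_a, regs_b, coeffs, passes):
    """Reuse one product-sum unit for `passes` passes over n register pairs.

    regs_a, regs_b: the two input vector register units (n entries each).
    coeffs: one (k_a, k_b) pair per pass, standing in for content.
    passes: N in the text, with 1 <= N <= n.
    """
    n = len(regs_a)
    if len(regs_b) != n or len(coeffs) < passes or not 1 <= passes <= n:
        raise ValueError("register sizes or pass count are inconsistent")
    results = [0] * n  # result vector register unit
    for step in range(passes):
        a = regs_a[step]               # multiplexer selects one register, n-to-1
        b = regs_b[step]
        k_a, k_b = coeffs[step]
        value = a * k_a + b * k_b      # the single shared arithmetic unit
        results[step] = value          # demultiplexer routes the result, 1-to-n
    return results


full = overlapped_product_sum([1, 2, 3], [4, 5, 6], [(1, 1), (2, 0), (1, -1)], 3)
partial = overlapped_product_sum([1, 2, 3], [4, 5, 6], [(1, 1), (2, 0), (1, -1)], 2)
```

  With three register pairs, three passes of the same unit fill all three result registers, which is the pattern a row-by-row matrix operation would follow.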
[0022]
<Fourth Embodiment>
  A signal processing circuit according to a fourth embodiment of the present invention will be described with reference to FIG. 4. FIG. 4 shows an example in which a buffer memory 12122 is used as the basic arithmetic unit 121 in the basic operation blocks 12, 13, and 14. Each resource, namely the basic operation block 12, the content storage unit 11, the input signal S1, and the output signal S2, is connected to the wiring network 15. The global sequential circuit 16 generates a control signal S4 for controlling each resource. This control signal serves as the activation signal for the local sequential circuit 122. The local sequential circuit 122 generates a control signal S10 that controls the basic operation block 12. Both the global sequential circuit 16 and the local sequential circuit 122 operate on the basis of the content S3 stored in the content storage unit 11, and the control signal S4 that controls the operation sequence of the basic operation blocks 12, 13, and 14 is output.
  Here, if the buffer memory 12122 is used to store the operation results of the basic operation blocks 12, 13, and 14, the signal processing algorithm can be made a feedback system. Further, if the buffer memory 12122 is used to store future predicted values, the signal processing algorithm can be made a feedforward system. The contents stored in the buffer memory 12122 are not limited to one or the other and may be a mixture of both.
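  A rough software analogy of the two uses of the buffer memory is given below. It is a sketch under assumptions, not the disclosed design: the function `run_with_buffer`, the gains `k_fb` and `k_ff`, the one-entry feedback history, and the dictionary of predicted values keyed by step are all hypothetical.

```python
from collections import deque


def run_with_buffer(inputs, predicted, k_fb=1, k_ff=1):
    """Combine each input with a stored past result and a stored prediction.

    inputs: sequence of input samples.
    predicted: dict mapping a step index to a predicted future value.
    k_fb, k_ff: assumed feedback and feedforward gains.
    """
    buffer = deque([0], maxlen=1)  # holds the most recent result (feedback path)
    outputs = []
    for step, x in enumerate(inputs):
        feedback = buffer[-1]                        # result saved at the previous step
        feedforward = predicted.get(step + 1, 0)     # prediction for the next step, if any
        y = x + k_fb * feedback + k_ff * feedforward
        buffer.append(y)                             # save the result for the next step
        outputs.append(y)
    return outputs


feedback_only = run_with_buffer([1, 2, 3], predicted={})
mixed = run_with_buffer([1, 2, 3], predicted={1: 10})
```

  With no predictions stored, the loop behaves as a running sum (pure feedback). Storing a predicted value adds a feedforward term, and both kinds of data share the same buffer, matching the mixed use mentioned above.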
[0023]
<Fifth Embodiment>
  A signal processing circuit according to a fifth embodiment of the present invention will be described with reference to FIG. 5. FIG. 5 shows an example in which a crossbar switch 151 is used as the wiring network 15. Each resource, namely the basic operation blocks 12, 13, and 14, the content storage unit 11, the input signal S1, and the output signal S2, is connected to the crossbar switch 151.
  The crossbar switch control unit 162 in the global sequential circuit 16 generates a control signal S42 that closes the crosspoint switches needed for communication between resources connected to the crossbar switch 151 and opens the switches that were used for earlier communication but are no longer needed. In addition, the block control unit 161 generates a control signal S41 that serves as the reference signal for the control performed in the basic operation blocks 12, 13, and 14.
  The content S3 of the content storage unit 11 is used by both the block control unit 161 and the crossbar switch control unit 162 to control the state transitions of the operation sequence. Owing to the characteristics of the crossbar switch 151, communication between the input and the basic operation block 12 and communication between the basic operation blocks 13 and 14 can be performed simultaneously.
  As described above, since a plurality of communications between resources can be performed at the same time, the delay from input to output of the signal processing can be minimized.
[0024]
<Sixth Embodiment>
  A signal processing circuit according to a sixth embodiment of the present invention will be described with reference to FIG. 6. FIG. 6 shows an example in which a shared memory 152 is used as the wiring network 15. Each resource, namely the basic operation blocks 12, 13, and 14, the content storage unit 11, the input signal S1, and the output signal S2, is connected to the shared bus of the shared memory 152.
  The shared memory control unit 163 in the global sequential circuit 16 generates an address and a strobe signal to perform communication between resources connected to the shared memory 152, and generates a control signal S43 for controlling communication.
  The block control unit 161 generates a control signal S41 that is a reference signal for control performed in the basic arithmetic blocks 12, 13, and 14.
  The content S3 of the content storage unit 11 is used by both the block control unit 161 and the shared memory control unit 163 to control the state transitions of the operation sequence.
  With the shared memory 152, as with the crossbar switch 151 of the fifth embodiment, a plurality of communications between resources can be performed, so the delay from input to output of the signal processing can be reduced. Because a buffering stage, namely the memory, is interposed, the delay is slightly larger than with the crossbar switch. On the other hand, since communication can be controlled by address, the control circuit can be simplified.
[0025]
<Seventh Embodiment>
  A signal processing circuit according to a seventh embodiment of the present invention will be described with reference to FIG. 7. FIG. 7 shows an example in which a shared bus 153 is used as the wiring network 15. Each resource, namely the basic operation blocks 12, 13, and 14, the content storage unit 11, the input signal S1, and the output signal S2, is connected to the shared bus 153.
  To perform communication between resources connected to the shared bus 153, the shared bus control unit 164 in the global sequential circuit 16 generates a control signal S44 that controls which resources communicate and in which direction.
  The block control unit 161 generates a control signal S41 that is a reference signal for control performed in the basic arithmetic blocks 12, 13, and 14.
  The content S3 of the content storage unit 11 is used by both the block control unit 161 and the shared bus control unit 164 to control the state transitions of the operation sequence of the circuit. Unlike the crossbar switch and the shared memory, the shared bus cannot carry a plurality of communications between resources at the same time. However, the shared bus method is easy to control and allows the circuit scale to be reduced.
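  The difference in concurrency between a crossbar switch and a shared bus can be illustrated with a toy scheduler. This is only a sketch: the greedy grouping, the port names, and the rule that each source port and each destination port carries at most one transfer per cycle are assumptions introduced here, not details given in the text.

```python
def crossbar_cycles(transfers):
    """Greedily pack (source, destination) transfers into cycles.

    Assumed rule: within one cycle, a source port and a destination
    port may each appear in at most one transfer.
    """
    cycles = []
    for src, dst in transfers:
        for cycle in cycles:
            if all(src != s and dst != d for s, d in cycle):
                cycle.append((src, dst))  # no port conflict: share this cycle
                break
        else:
            cycles.append([(src, dst)])   # conflict everywhere: open a new cycle
    return cycles


def bus_cycles(transfers):
    """A shared bus carries one transfer per cycle."""
    return [[transfer] for transfer in transfers]


# Hypothetical traffic: input to block 12, and block 13 to block 14.
traffic = [("IN", "B12"), ("B13", "B14")]
on_crossbar = crossbar_cycles(traffic)
on_bus = bus_cycles(traffic)
```

  For this traffic the crossbar model needs one cycle and the bus model needs two, which is the latency trade-off against control simplicity and circuit scale described for the fifth and seventh embodiments.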
[0026]
<Eighth Embodiment>
  A signal processing circuit according to an eighth embodiment of the present invention will be described with reference to FIG. 8. FIG. 8 shows a configuration having a plurality of wiring networks, two in this example. The wiring network 15-1 and the wiring network 15-2 exist independently, and the basic operation blocks 12, 13, and 14, the content storage unit 11, the global sequential circuit 16, the input signal S1, and the output signal S2 are each connected to both wiring networks 15-1 and 15-2. Therefore, communications can be performed in parallel, for example two inputs, two outputs, or an input and an output.
  The content S3 of the content storage unit 11 is supplied to the basic operation blocks 12, 13, and 14, the wiring networks 15-1 and 15-2, and the global sequential circuit 16, and is used to adjust and optimize each of them. Further, a control signal is generated by the global sequential circuit 16, and the combination and operation of the basic operation blocks 12, 13, and 14 and the wiring networks 15-1 and 15-2 are thereby determined.
[0027]
<Ninth Embodiment>
  A signal processing circuit according to a ninth embodiment of the present invention will be described with reference to FIG. FIG. 9 shows the structure of the content data D1.
  The content data D1 has a global sequential circuit data part D11 and basic operation block data parts D12, D13, D14, D15.
  The global sequential circuit data part D11 represents the input signal, the output signal, and a data flow graph of the basic operation blocks. It also represents time-axis components, including parallelism. Each vertex of this data flow graph corresponds to a basic operation block.
  The basic operation block data parts D12, D13, D14, and D15 respectively comprise tag parts D121, D131, D141, and D151, which indicate the correspondence with the vertices of the data flow graph in the global sequential circuit data part D11; arithmetic processing data parts D122, D132, D142, and D152, which are used to adjust the circuit of the arithmetic processing unit 1211 optimally; and local sequential circuit data parts D123, D133, D143, and D153, which are used to determine the operation of the local sequential circuit 122 that performs control within the basic arithmetic unit 121.
[0028]
  The number of basic operation block data parts D12, D13, ... equals the number of vertices in the data flow graph of the global sequential circuit data part D11.
  In the ninth embodiment, qualitative information such as a signal processing algorithm can be converted into digital data by extracting its features in the form of values that determine the operation of the architecture and adjustment values for optimization. A database of algorithms can thereby be built in the form of data (content) that is independent of the system, which simplifies reuse of the algorithms. Furthermore, unlike a program that expresses an algorithm, the data does not depend on the system. Therefore, with this signal processing circuit, an algorithm can be run in any system configuration without modifying the data (content). Moreover, since the content can be made very small, it is well suited to communication via the Internet.
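  One way to picture the content data D1 is as a small record structure with a consistency check between tags and graph vertices. The following Python sketch is illustrative only: the class and field names, the edge-list encoding of the data flow graph, and the reserved terminal names `IN` and `OUT` are assumptions, since the text does not specify a concrete encoding.

```python
from dataclasses import dataclass

TERMINALS = {"IN", "OUT"}  # assumed names for the input and output signals


@dataclass
class BlockData:
    """Stand-in for one basic operation block data part (D12, D13, ...)."""
    tag: str               # tag part: names the graph vertex it belongs to
    arithmetic: dict       # arithmetic processing data, e.g. multipliers and shifts
    local_sequence: tuple  # local sequential circuit data, e.g. ordered steps


@dataclass
class ContentData:
    """Stand-in for content data D1."""
    edges: tuple   # global sequential circuit data: (source, destination) pairs
    blocks: tuple  # the basic operation block data parts


def block_vertices(content):
    """Vertices of the data flow graph that are basic operation blocks."""
    return {v for edge in content.edges for v in edge} - TERMINALS


def is_consistent(content):
    """Check that there is exactly one block data part per block vertex."""
    tags = [block.tag for block in content.blocks]
    return len(tags) == len(set(tags)) and set(tags) == block_vertices(content)


d1 = ContentData(
    edges=(("IN", "B12"), ("B12", "B13"), ("B13", "OUT")),
    blocks=(
        BlockData("B12", {"k_a": 1, "k_b": -1}, ("load", "compute", "store")),
        BlockData("B13", {"k_a": 5, "k_b": 0}, ("load", "compute", "store")),
    ),
)
```

  The check in `is_consistent` reflects the statement that the number of basic operation block data parts equals the number of vertices in the data flow graph, with the tag tying each data part to its vertex.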
[0029]
[Effects of the Invention]
  The present invention has the following effects.
(1) According to the first configuration of the present invention, various algorithms for performing signal processing can be realized as a single architecture, and features of various algorithms can be dealt with by contents extracted as data. It is possible to make the architecture optimal for the algorithm. Then, by operating a plurality of basic arithmetic blocks in parallel, high-speed signal processing can be performed and the operating frequency of the system can be lowered.
  As a result, a qualitative algorithm can be expressed as content, which is quantitative digital data, and the optimum architecture can be determined by that content, providing a signal processing circuit that is extremely well suited to customization and highly flexible. In addition, because parallel processing raises the processing speed while lowering the operating frequency, power consumption and unwanted radiated noise can be reduced, reliability can be improved, and system cost can be reduced.
(2) Although the processing circuit of the arithmetic processing unit is itself fixed, the arithmetic expression that the circuit realizes can easily be changed by supplying content as the inputs to the multiplication, addition, and shift stages. Further, since the data path and the control circuit of the arithmetic processing unit can be simplified, implementation becomes easy.
(3) Since communication is possible using only the address and the strobe signal given to the memory, communication among the basic operation blocks, the content storage unit, the global sequential circuit, the input, and the output can be performed with simple control.
(4) Further, the qualitative information of a signal processing algorithm can be converted into digital data by extracting its features in the form of values that determine the operation of the architecture and adjustment values for optimization. A database of algorithms can thereby be built in the form of data (content) that is independent of the system, which simplifies reuse of the algorithms. Furthermore, unlike a program that expresses an algorithm, the data does not depend on the system. Therefore, with this signal processing circuit, an algorithm can be run in any system configuration without modifying the data (content). Furthermore, since the content can be made very small, it is well suited to communication via the Internet.
[0030]
(5) According to the second configuration of the present invention, the arithmetic processing unit of any basic operation block can be used repeatedly for signal processing, which reduces the circuit scale required for implementation and speeds up processing that requires repetition, such as matrix operations.
[0031]
(6) According to the third configuration of the present invention, a buffer memory is provided as the basic arithmetic unit of a basic operation block. It can be accessed from an external circuit and makes the operation results of any basic operation block at any point in time, as well as future predicted values, available for subsequent operations. Feedback processing using an operation result from an arbitrary point in time and feedforward processing using a predicted value can therefore be performed, enabling highly accurate signal processing.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a first embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of a second exemplary embodiment of the present invention.
FIG. 3 is a block diagram showing a configuration of a third exemplary embodiment of the present invention.
FIG. 4 is a block diagram showing a configuration of a fourth embodiment of the present invention.
FIG. 5 is a block diagram showing a configuration of a fifth exemplary embodiment of the present invention.
FIG. 6 is a block diagram showing a configuration of a sixth embodiment of the present invention.
FIG. 7 is a block diagram showing a configuration of a seventh exemplary embodiment of the present invention.
FIG. 8 is a block diagram showing a configuration of an eighth embodiment of the present invention.
FIG. 9 is a block diagram showing a configuration of a ninth embodiment of the present invention.
FIG. 10 is a block diagram showing a configuration of an example of a conventional signal processing circuit.
FIG. 11 is a block diagram showing a configuration of proportional control in conventional motor control.
FIG. 12 is a block diagram showing a configuration of proportional-integral control in conventional motor control.
FIG. 13 is a block diagram showing a configuration of proportional-integral-derivative control in conventional motor control.
[Explanation of symbols]
1: MPU
2: Circuit
11: Content storage unit
12: Basic operation block
121: Basic arithmetic unit
1211: Arithmetic processing unit
1212, 1213: Input register unit
1214: Result register unit
1215, 1216: Multiplier
1217, 1218: Barrel shifter
1219: Adder
12110-12115: Input vector register unit
12116, 12117: Multiplexer unit
12118: Demultiplexer unit
12119-12121: Result vector register unit
12122: Buffer memory
122: Local sequential circuit
1221: Input control unit
1222: Output control unit
1223: Arithmetic control unit
13, 14: Basic operation block
15, 15-1, 15-2: Wiring network
151: Crossbar switch
152: Shared memory
153: Shared bus
16: Global sequential circuit
161: Block control unit
162: Crossbar switch control unit
163: Shared memory control unit
164: Shared bus control unit
D1: Content data
D11: Global sequential circuit data part
D12 to D15: Basic operation block data part
D121, D131, D141, D151: Tag part
D122, D132, D142, D152: Arithmetic processing data part
D123, D133, D143, D153: Local sequential circuit data part

Claims

In the signal processing circuit that performs arbitrary arithmetic processing on the input signal given from the processor or circuit in the previous stage and generates an output signal that is a command to be given to the processor or circuit in the subsequent stage
A content storage unit that extracts the characteristics of signal processing and stores the digitalized content, and is accessible from an external circuit;
A plurality of basic units that can be accessed from an external circuit and are composed of an arithmetic circuit having a basic arithmetic function, and that perform arithmetic operations, input control, and output control based on the content of the content storage unit, and can perform parallel operations. An arithmetic block;
A wiring network capable of connecting the input signal, the output signal, the content storage unit, and the basic arithmetic block in any combination;
A global sequential circuit that generates, based on the content of the content storage unit, a control signal for controlling the combination of connections in the wiring network and the operation order of the basic operation blocks,
The basic operation block includes a basic operation unit and a local sequential circuit,
The basic arithmetic unit comprises: an arithmetic processing unit in which at least one arithmetic circuit that takes the content of the content storage unit as one of its inputs and a shift circuit that takes the content of the content storage unit as its shift amount are configured in wired logic; a result register unit that is connected to the wiring network, holds the result of the arithmetic processing unit, and is accessible from an external circuit; and an input register unit that holds the input to the arithmetic processing unit received via the wiring network from the input signal or any other basic operation block, and is accessible from an external circuit,
The local sequential circuit uses the control signal generated by the global sequential circuit as a start signal, uses the content in the content storage unit for control of the local sequential circuit, and generates a control signal for controlling the basic arithmetic unit,
The wiring network includes an n-port shared memory accessible from any of the basic operation blocks,
The global sequential circuit is:
A shared memory control unit that generates an address and a control signal to the shared memory based on the content of the content storage unit;
A block control unit that generates a control signal for controlling the operation order of the basic operation blocks based on the content of the content storage unit;
The data structure of the content stored in the content storage unit has a global sequential circuit data part and a plurality of basic operation block data parts,
The global sequential circuit data part stores global sequential circuit data that is used in the global sequential circuit to determine the combination of connections in the wiring network and the operation order of the basic operation blocks, and that expresses a data flow graph having the basic operation block data parts as vertex elements,
The basic operation block data part has:
A tag part for establishing correspondence with each vertex of the data flow graph of the global sequential circuit data part;
An arithmetic processing data part used for performing operations, input control, and output control in the arithmetic processing unit;
And a local sequential circuit data part used to control the local sequential circuit.

The signal processing circuit according to claim 2, wherein:
The input register unit comprises an input vector register unit consisting of n registers that can be accessed from the external circuit via the wiring network, so that one arithmetic processing unit can be used repeatedly to perform arithmetic processing n (n > 1) times, and a multiplexer unit that multiplexes the outputs of the input vector register unit n-to-1 to form the input to the arithmetic processing unit,
The result register unit comprises a demultiplexer unit that demultiplexes the result of each pass of the arithmetic processing unit 1-to-n, and a result vector register unit consisting of n registers that is connected to the output of the demultiplexer unit and to the wiring network and is accessible from an external circuit,
The local sequential circuit uses the control signal generated by the global sequential circuit as a start signal and uses the content in the content storage unit for control of the local sequential circuit, and comprises an input control unit that generates a signal for controlling the input register unit, an output control unit that generates a signal for controlling the result register unit, and an arithmetic control unit that generates a signal for controlling the arithmetic processing unit to perform the overlapped operation N (n ≧ N) times.

The signal processing circuit according to claim 1, further comprising, as the basic arithmetic unit of the basic operation block, a buffer memory that is accessible from an external circuit, stores the operation results of any basic operation block at any point in time, and makes future predicted values available for subsequent operations.