CN107817708B - High-compatibility programmable neural network acceleration array - Google Patents
- Publication number: CN107817708B (application CN201711131564.9A)
- Authority
- CN
- China
- Prior art keywords
- programmable
- neural network
- unit
- multiply
- network computing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/04—Programme control other than numerical control, i.e. in sequence controllers or logic controllers
Abstract
The invention belongs to the technical field of integrated circuits, and particularly relates to a high-compatibility programmable neural network acceleration array. The array adopts a reconfigurable architecture and comprises a central controller, a feature vector transmitter and a plurality of neural network computing unit chips. Each computing unit chip contains the basic neural network computing modules, such as a programmable multiply-add unit, a programmable activation unit and a unit chip controller, and any two unit chips in the array can communicate through a programmable communication route. The programmable neural network acceleration array is compatible with a variety of neural network algorithms without sacrificing high energy efficiency, and is suitable for a wide range of deep learning intelligent systems.
Description
Technical Field
The invention belongs to the technical field of integrated circuits, and particularly relates to a high-compatibility programmable neural network acceleration array.
Background
The development of customized deep learning acceleration chips for mobile devices is a very active area today. The challenge is that a chip's performance is tied to the type of deep learning network it runs, such as a CNN (convolutional neural network) or an RNN (recurrent neural network). To achieve high energy efficiency, a customized accelerator is usually optimized for particular networks: it performs well on those networks but poorly on others. Because the deep learning field is evolving rapidly, improved CNN or RNN variants, or entirely new deep learning algorithms, may appear in the future that existing special-purpose accelerators cannot serve with the required performance, which fundamentally limits the development of deep learning intelligence.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, it is an object of the present invention to provide a highly compatible programmable neural network acceleration array.
The invention provides a high-compatibility programmable neural network acceleration array, which adopts a reconfigurable architecture and comprises a central controller, a feature vector transmitter and a plurality of neural network computing units. Wherein:
the central controller is responsible for the global control of the deep learning neural network;
the feature vector transmitter is responsible for transmitting the required feature vectors to all the neural network computing units;
the neural network computing unit chip comprises basic neural network computing modules, including but not limited to a programmable multiply-add unit, a programmable activation unit, a unit chip controller and an optional cache;
the neural network computing unit chips can communicate with one another through a programmable communication route.
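The division of labor among these components can be pictured as a minimal behavioral model; all class and function names below (`ComputeUnit`, `AccelerationArray`, etc.) are illustrative assumptions for the sketch, not terms from the patent.

```python
# Behavioral sketch: a feature-vector transmitter broadcasts each input
# vector to every compute unit; each unit applies its own weights and a
# programmable activation function.

class ComputeUnit:
    """One neural network computing unit chip: multiply-add, then activation."""
    def __init__(self, weights, activation):
        self.weights = weights          # per-unit weight vector
        self.activation = activation    # programmable activation function

    def process(self, feature_vec):
        acc = sum(w * x for w, x in zip(self.weights, feature_vec))
        return self.activation(acc)

class AccelerationArray:
    """Central controller + feature vector transmitter + unit chip array."""
    def __init__(self, units):
        self.units = units

    def broadcast(self, feature_vec):
        # The transmitter sends the same feature vector to all units
        # ("in a broadcast mode", claim 1); each unit computes one output.
        return [u.process(feature_vec) for u in self.units]

relu = lambda v: max(0, v)
array = AccelerationArray([ComputeUnit([1, -2], relu), ComputeUnit([3, 1], relu)])
print(array.broadcast([2, 1]))  # → [0, 7]
```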
The technical effect of the invention is that the reconfigurability of the acceleration array architecture, the programmability of the computing modules within each unit chip, and the programmability of the inter-chip communication together provide great flexibility. The computing mode, data storage and data flow can be combined freely through programming, so the array is compatible with a variety of deep learning network topologies as well as new algorithms that may appear in the future, while maintaining high energy efficiency. The invention therefore has broad application prospects in artificial intelligence systems that employ deep learning algorithms.
Drawings
FIG. 1 is a schematic diagram of a high-compatibility programmable neural network acceleration array architecture according to the present invention.
Fig. 2 is a schematic structural diagram of a neural network computing unit chip of the present invention.
FIG. 3 is a schematic diagram of the inter-chip communication routing between the neural network computing units of the present invention.
FIG. 4 is a schematic diagram of a neural network computing unit in accordance with an embodiment of the present invention.
Reference numbers in the figures: 11 is the central controller, 12 is the feature vector transmitter, and 13 is a neural network computing unit chip; 21 is the programmable multiply-add unit, 22 is the programmable activation unit, 23 is the unit chip controller, and 24 is the cache; 31 is the inter-chip communication route.
Detailed Description
The present invention will be described more fully hereinafter with reference to the accompanying drawings, which illustrate preferred embodiments of the invention; the invention is not to be considered limited to the embodiments set forth herein.
Fig. 1 is a schematic diagram of an architecture of a high-compatibility programmable neural network acceleration array of the present invention, in which a central controller 11 is responsible for global control of a deep learning neural network, and a feature vector transmitter 12 is responsible for transmitting required feature vectors to all neural network computing units 13.
The neural network computing unit chip of the present invention comprises all the basic deep learning computing modules. As shown in fig. 2, the computing unit chip comprises a programmable multiply-add unit 21, a programmable activation unit 22, a unit chip controller 23, and a cache 24. The programmable multiply-add unit 21 performs the vector multiply and add calculations generally required in deep learning algorithms; its programmability can be realized by a switch array, so that multiply-add operations of different precisions can be selected by online programming. The programmable activation unit 22 performs the activation calculations commonly required in deep learning algorithms, such as nonlinear functions like sigmoid and ReLU. The unit chip controller 23 is responsible for control within the unit chip. The cache 24 is optional and may be used to store intermediate computation values.
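As a rough behavioral analogue of the activation unit's online programmability (not the hardware implementation), one can picture a nonlinearity being selected by a loaded program word; the names below are illustrative assumptions.

```python
import math

# Sketch: a programmable activation unit selecting among the nonlinearities
# the text mentions (sigmoid, ReLU). In hardware this selection would
# reconfigure a datapath; here it just swaps the applied function.

ACTIVATIONS = {
    "relu":     lambda x: max(0.0, x),
    "sigmoid":  lambda x: 1.0 / (1.0 + math.exp(-x)),
    "identity": lambda x: x,
}

class ProgrammableActivationUnit:
    def __init__(self):
        self.fn = ACTIVATIONS["identity"]

    def program(self, name):
        # Online (re)programming: pick a different nonlinearity at run time.
        self.fn = ACTIVATIONS[name]

    def apply(self, x):
        return self.fn(x)

pau = ProgrammableActivationUnit()
pau.program("relu")
print(pau.apply(-3.0), pau.apply(2.5))   # → 0.0 2.5
pau.program("sigmoid")
print(pau.apply(0.0))                    # → 0.5
```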
Fig. 4 illustrates an embodiment of the computing unit chip. The programmable multiply-add unit is divided into a multiply part and an add part. The multiply part is composed of a switch array and multipliers, and can perform vector multiplication at two precisions, 8 bits and 4 bits; each precision has a plurality of identical computing modules that multiply a feature vector with a weight vector. The result is processed by a programmable accumulation stage and finally enters the programmable activation unit, which comprises a controller, a shift-and-piecewise-linear calculation block, an arithmetic logic unit (ALU) and a multiplexer (MUX). To accommodate the special transitive operations required by recurrent neural networks, combinations of nonlinear calculations and basic operations on other variables can be realized by online programming. The computing unit chip thus provides, at the architectural level, a flexible computing mode for deep learning algorithms.
FIG. 3 illustrates the inter-chip communication routing between the neural network computing units. The inter-chip communication route 31, used in conjunction with the unit chips 13, allows communication between any two chips, giving the acceleration array architecture a flexible data flow.
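A minimal sketch of what such a programmable route might look like behaviorally, assuming a simple source-to-destinations route table; the patent does not specify the routing mechanism, so the table-based scheme and all names here are illustrative.

```python
# Sketch: a route table, loaded at program time, forwards any unit chip's
# output to any other unit chip, so layer-to-layer data flow can be
# reconfigured without rewiring.

class ProgrammableRouter:
    def __init__(self, num_units):
        self.num_units = num_units
        self.routes = {}          # src unit id -> list of dst unit ids

    def program(self, src, dsts):
        assert src < self.num_units and all(d < self.num_units for d in dsts)
        self.routes[src] = list(dsts)

    def deliver(self, src, value, mailboxes):
        # Push src's output into the inbox of every programmed destination.
        for dst in self.routes.get(src, []):
            mailboxes[dst].append(value)

router = ProgrammableRouter(num_units=4)
router.program(0, [2, 3])                 # unit 0 feeds units 2 and 3
mailboxes = {i: [] for i in range(4)}
router.deliver(0, 1.5, mailboxes)
print(mailboxes)                          # unit 2 and unit 3 received 1.5
```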
While the embodiments of the present invention have been described with reference to specific examples, those skilled in the art will readily appreciate that many other embodiments, advantages and features are possible. The invention is capable of other and different embodiments, and its details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.
Claims (1)
1. A high-compatibility programmable neural network acceleration array is characterized in that a reconfigurable architecture is adopted, and the high-compatibility programmable neural network acceleration array comprises a central controller, a feature vector transmitter and a plurality of neural network computing unit chips; wherein:
the central controller is responsible for the global control of the deep learning neural network;
the feature vector transmitter is responsible for transmitting the required feature vectors to all the neural network computing units in a broadcast mode;
the neural network computing unit chip comprises a basic neural network computing module;
the neural network computing unit chip carries out communication between any two units through a programmable communication route;
the basic neural network computing module comprises a programmable multiply-add unit, a programmable activation unit and a unit chip controller;
the basic neural network computing module also comprises a cache used for storing and computing the intermediate value;
the programmable multiply-add unit is used for completing vector multiply and add calculations in a deep learning algorithm, the programmability of the programmable multiply-add unit is realized by a switch array, and multiply-add operations of different precisions are realized in an online programmable mode; the programmable activation unit is used for completing activation calculations in a deep learning algorithm; the unit chip controller is responsible for the control function in the unit chip;
the programmable multiply-add unit is divided into a multiply part and an add part, the multiply part is composed of a switch array and multipliers and realizes vector multiplication at 8-bit and 4-bit precision, each precision is provided with a plurality of identical computing modules realizing the product of a feature vector and a weight vector, programmable accumulation calculation is carried out on the obtained result, and finally the result enters the programmable activation unit, which comprises a controller, a shift-and-piecewise-linear calculation block, an arithmetic logic unit (ALU) and a multiplexer (MUX).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711131564.9A CN107817708B (en) | 2017-11-15 | 2017-11-15 | High-compatibility programmable neural network acceleration array |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107817708A CN107817708A (en) | 2018-03-20 |
CN107817708B true CN107817708B (en) | 2020-07-07 |
Family
ID=61609167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711131564.9A Active CN107817708B (en) | 2017-11-15 | 2017-11-15 | High-compatibility programmable neural network acceleration array |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107817708B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106558146A (en) * | 2015-09-29 | 2017-04-05 | 广东工业大学 | A kind of network transmission pattern of Intelligent bus card accounts information |
US20200034699A1 (en) * | 2018-07-24 | 2020-01-30 | SK Hynix Inc. | Accelerating appratus of neural network and operating method thereof |
CN110572593B (en) * | 2019-08-19 | 2022-03-04 | 上海集成电路研发中心有限公司 | 3D heap image sensor |
CN111126580B (en) * | 2019-11-20 | 2023-05-02 | 复旦大学 | Multi-precision weight coefficient neural network acceleration chip arithmetic device adopting Booth coding |
CN111062471B (en) * | 2019-11-23 | 2023-05-02 | 复旦大学 | Deep learning accelerator for accelerating BERT neural network operation |
CN114722751B (en) * | 2022-06-07 | 2022-09-02 | 深圳鸿芯微纳技术有限公司 | Framework selection model training method and framework selection method for operation unit |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102736684A (en) * | 2011-04-05 | 2012-10-17 | 温保成 | Programmable hard accelerator chip array |
CN105740946B (en) * | 2015-07-29 | 2019-02-12 | 上海磁宇信息科技有限公司 | A kind of method that application cell array computation system realizes neural computing |
CN105488565A (en) * | 2015-11-17 | 2016-04-13 | 中国科学院计算技术研究所 | Calculation apparatus and method for accelerator chip accelerating deep neural network algorithm |
CN107239824A (en) * | 2016-12-05 | 2017-10-10 | 北京深鉴智能科技有限公司 | Apparatus and method for realizing sparse convolution neutral net accelerator |
CN106940815B (en) * | 2017-02-13 | 2020-07-28 | 西安交通大学 | Programmable convolutional neural network coprocessor IP core |
CN107169560B (en) * | 2017-04-19 | 2020-10-16 | 清华大学 | Self-adaptive reconfigurable deep convolutional neural network computing method and device |
- 2017-11-15: CN application CN201711131564.9A filed; granted as patent CN107817708B (status: Active)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||