CN107817708B - High-compatibility programmable neural network acceleration array - Google Patents


Info

Publication number: CN107817708B (application number CN201711131564.9A)
Authority: CN (China)
Prior art keywords: programmable, neural network, unit, multiply, network computing
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN107817708A
Inventors: 陈迟晓, 史传进, 张怡云
Original and current assignee: Fudan University
Application filed by Fudan University
Priority patent application: CN201711131564.9A

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B 19/00: Programme-control systems
    • G05B 19/02: Programme-control systems electric
    • G05B 19/04: Programme control other than numerical control, i.e. in sequence controllers or logic controllers

Abstract

The invention belongs to the technical field of integrated circuits and relates to a high-compatibility programmable neural network acceleration array. The array adopts a reconfigurable architecture and comprises a central controller, a feature vector transmitter, and a plurality of neural network compute units. Each compute unit chip contains the basic neural network computing modules, such as a programmable multiply-add unit, a programmable activation unit, and a unit chip controller, and any two unit chips in the array can communicate over a programmable communication route. The programmable neural network acceleration array is compatible with a wide range of neural network algorithms without sacrificing energy efficiency, and is suitable for many deep learning intelligent systems.

Description

High-compatibility programmable neural network acceleration array
Technical Field
The invention belongs to the technical field of integrated circuits, and particularly relates to a high-compatibility programmable neural network acceleration array.
Background
The development of customized deep learning acceleration chips for mobile devices is an increasingly active area. The challenge is that a chip's performance depends on the type of deep learning network it runs, such as a CNN (convolutional neural network) or an RNN (recurrent neural network). To achieve high energy efficiency, a customized accelerator is typically optimized for particular networks: performance is high on those networks but poor on others. Because the deep learning field is evolving rapidly, improved CNN or RNN variants, and even entirely new deep learning algorithms, may appear in the future. Existing special-purpose accelerators would then fail to meet the required performance, fundamentally limiting the development of deep learning intelligence.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, it is an object of the present invention to provide a highly compatible programmable neural network acceleration array.
The invention provides a high-compatibility programmable neural network acceleration array, which adopts a reconfigurable architecture and comprises a central controller, a feature vector transmitter and a plurality of neural network computing units. Wherein:
the central controller is responsible for the global control of the deep learning neural network;
the feature vector transmitter is responsible for broadcasting the required feature vectors to all the neural network compute units;
each neural network compute unit chip comprises the basic neural network computing modules, including but not limited to a programmable multiply-add unit, a programmable activation unit, a unit chip controller, and an optional cache;
any two neural network compute units can communicate with each other through a programmable communication route.
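The array organization described above can be sketched behaviorally in software. This is an illustrative model, not the patented hardware: the class names, the default tanh activation, and the unit count are assumptions made for demonstration only.

```python
import numpy as np

class ComputeUnit:
    """One neural network compute unit chip: multiply-add followed by activation."""
    def __init__(self, weights, activation=np.tanh):
        self.weights = weights          # programmed weight vector
        self.activation = activation    # programmable activation function

    def run(self, feature_vec):
        # vector multiply-accumulate, then the activation nonlinearity
        return self.activation(np.dot(self.weights, feature_vec))

class AccelerationArray:
    """Central controller + feature vector transmitter + N compute units."""
    def __init__(self, units):
        self.units = units

    def broadcast(self, feature_vec):
        # the feature vector transmitter sends the same vector to every unit
        return [u.run(feature_vec) for u in self.units]

rng = np.random.default_rng(0)
array = AccelerationArray([ComputeUnit(rng.standard_normal(4)) for _ in range(8)])
outputs = array.broadcast(np.ones(4))
assert len(outputs) == 8
```

The point of the sketch is the fan-out: one broadcast feature vector, many independently programmed units, matching the controller/transmitter/unit split in the claim.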
The technical effect of the invention is that the reconfigurability of the acceleration array architecture, the programmability of the computing modules within each unit chip, and the programmability of the inter-chip communication together provide great flexibility. The computation mode, data storage, and data flow can be combined arbitrarily through programming, so the array is compatible with a wide variety of deep learning network topologies and with new algorithms that may appear in the future, while maintaining high energy efficiency. The invention therefore has broad application prospects in artificial intelligence systems that use deep learning algorithms.
Drawings
FIG. 1 is a schematic diagram of a high-compatibility programmable neural network acceleration array architecture according to the present invention.
Fig. 2 is a schematic structural diagram of a neural network computing unit chip of the present invention.
FIG. 3 is a communication diagram of the neural network computing unit inter-die communication routing of the present invention.
FIG. 4 is a schematic diagram of a neural network computing unit in accordance with an embodiment of the present invention.
Reference numbers in the figures: 11 is the central controller, 12 is the feature vector transmitter, and 13 is a neural network compute unit chip; 21 is the programmable multiply-add unit, 22 is the programmable activation unit, 23 is the unit chip controller, and 24 is the cache; 31 is the inter-chip communication route.
Detailed Description
The present invention will be described more fully hereinafter with reference to the accompanying drawings, which show preferred embodiments of the invention; the invention is not to be considered limited to the embodiments set forth herein.
Fig. 1 is a schematic diagram of an architecture of a high-compatibility programmable neural network acceleration array of the present invention, in which a central controller 11 is responsible for global control of a deep learning neural network, and a feature vector transmitter 12 is responsible for transmitting required feature vectors to all neural network computing units 13.
The neural network compute unit chip of the present invention contains all the basic computing modules needed by deep learning algorithms. As shown in Fig. 2, the compute unit chip comprises a programmable multiply-add unit 21, a programmable activation unit 22, a unit chip controller 23, and a cache 24. The programmable multiply-add unit 21 performs the vector multiply and add computations generally required in deep learning algorithms; its programmability can be realized by a switch array, so that multiply-add operations of different precisions can be selected by online programming. The programmable activation unit 22 performs the activation computations commonly required in deep learning algorithms, such as the nonlinear sigmoid and ReLU functions. The unit chip controller 23 is responsible for control within the unit. The cache 24 is optional and may be used to store intermediate computation values.
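The behavior of the programmable activation unit can be illustrated with a small sketch in which a mode setting selects the nonlinearity to apply. The dictionary-based selection and function names are illustrative assumptions, not the hardware implementation.

```python
import numpy as np

# Illustrative model: a "mode" value (in hardware, a programmed register)
# selects which activation the unit applies to incoming results.
ACTIVATIONS = {
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "relu":    lambda x: np.maximum(x, 0.0),
}

def activation_unit(x, mode):
    """Apply the currently programmed nonlinearity to a vector of values."""
    return ACTIVATIONS[mode](x)

assert activation_unit(np.array([-1.0, 2.0]), "relu").tolist() == [0.0, 2.0]
```

Switching `mode` at run time mirrors the online programmability the description attributes to unit 22.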
Fig. 4 illustrates an embodiment of the compute unit chip. The programmable multiply-add unit is divided into a multiply part and an add part. The multiply part consists of a switch array and multipliers and can perform vector multiplication at two precisions, 8 bits and 4 bits; each precision has several identical computing modules that multiply the feature vector by a weight vector. The results are processed by programmable accumulation and finally enter the programmable activation unit, which contains a controller, shift and piecewise-linear calculation logic, an arithmetic logic unit (ALU), and a multiplexer (MUX). To accommodate the special recurrent operations required by recurrent neural networks, combinations of nonlinear calculations and basic operations on other variables can be realized by online programming. The compute unit chip thus provides architectural flexibility in the computation mode of deep learning algorithms.
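The online-programmable precision of the multiply part can be illustrated as follows. This sketch simply masks operands to the selected width before multiplying and accumulating; the real switch array reconfigures the multiplier datapath, so this is only a behavioral approximation with made-up names.

```python
def packed_multiply(features, weights, bits):
    """Multiply-accumulate a feature vector against a weight vector
    at a programmed precision of 8 or 4 bits (behavioral model only)."""
    assert bits in (8, 4)
    mask = (1 << bits) - 1
    acc = 0
    for f, w in zip(features, weights):
        acc += (f & mask) * (w & mask)   # quantize both operands to `bits`
    return acc

# 8-bit mode uses the full operand; 4-bit mode keeps only the low nibble
assert packed_multiply([255, 16], [2, 3], bits=8) == 255 * 2 + 16 * 3
assert packed_multiply([255, 16], [2, 3], bits=4) == 15 * 2 + 0 * 3
```

In the described embodiment the same datapath serves both precisions; here the shared `mask` plays the role of the switch array's mode selection.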
FIG. 3 illustrates the inter-chip communication routing of the neural network compute units. The inter-chip communication route 31 works together with the unit chips 13 to allow communication between any two chips, giving the acceleration array architecture flexibility in its data flow.
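A behavioral sketch of the programmable inter-chip routing: a writable routing table determines where each unit's outputs are delivered, so any unit can be linked to any other by reprogramming. Class and method names are illustrative, not from the patent.

```python
class Router:
    """Programmable point-to-point routing between compute unit chips."""
    def __init__(self, n_units):
        self.table = {}                              # programmed src -> dst links
        self.mailbox = {i: [] for i in range(n_units)}

    def program(self, src, dst):
        # reprogrammable at run time: redirect unit `src` to unit `dst`
        self.table[src] = dst

    def send(self, src, value):
        # deliver `value` along the currently programmed route
        self.mailbox[self.table[src]].append(value)

r = Router(4)
r.program(0, 2)        # unit 0 forwards its partial results to unit 2
r.send(0, 42)
assert r.mailbox[2] == [42]
```

Reprogramming the table between layers is what lets the same array realize different network topologies and data flows.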
While the embodiments of the present invention have been described with reference to specific examples, those skilled in the art will readily appreciate that many other embodiments are possible. The invention is capable of other and different embodiments, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.

Claims (1)

1. A high-compatibility programmable neural network acceleration array, characterized in that it adopts a reconfigurable architecture and comprises a central controller, a feature vector transmitter, and a plurality of neural network compute unit chips; wherein:
the central controller is responsible for the global control of the deep learning neural network;
the feature vector transmitter is responsible for broadcasting the required feature vectors to all the neural network compute units;
the neural network computing unit chip comprises a basic neural network computing module;
the neural network computing unit chip carries out communication between any two units through a programmable communication route;
the basic neural network computing module comprises a programmable multiply-add unit, a programmable activation unit and a unit chip controller;
the basic neural network computing module also comprises a cache used for storing intermediate computation values;
the programmable multiply-add unit is used for completing vector multiply and add calculations in a deep learning algorithm; its programmability is realized by a switch array, and multiply-add operations of different precisions are realized by online programming; the programmable activation unit is used for completing the activation calculation in a deep learning algorithm; the unit chip controller is responsible for the control functions within the unit chip;
the programmable multiply-add unit is divided into a multiply part and an add part; the multiply part consists of a switch array and multipliers and realizes vector multiplication at 8-bit and 4-bit precision, each precision having a plurality of identical computing modules that realize the product of a feature vector and a weight vector; the results undergo programmable accumulation calculation and finally enter the programmable activation unit, which comprises a controller, shift and piecewise-linear calculation logic, an arithmetic logic unit (ALU), and a multiplexer (MUX).
CN201711131564.9A 2017-11-15 2017-11-15 High-compatibility programmable neural network acceleration array Active CN107817708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711131564.9A CN107817708B (en) 2017-11-15 2017-11-15 High-compatibility programmable neural network acceleration array


Publications (2)

Publication Number / Publication Date
CN107817708A (en): 2018-03-20
CN107817708B (en): 2020-07-07

Family

ID=61609167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711131564.9A Active CN107817708B (en) 2017-11-15 2017-11-15 High-compatibility programmable neural network acceleration array

Country Status (1)

Country Link
CN (1) CN107817708B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106558146A (en) * 2015-09-29 2017-04-05 广东工业大学 A kind of network transmission pattern of Intelligent bus card accounts information
US20200034699A1 (en) * 2018-07-24 2020-01-30 SK Hynix Inc. Accelerating appratus of neural network and operating method thereof
CN110572593B (en) * 2019-08-19 2022-03-04 上海集成电路研发中心有限公司 3D heap image sensor
CN111126580B (en) * 2019-11-20 2023-05-02 复旦大学 Multi-precision weight coefficient neural network acceleration chip arithmetic device adopting Booth coding
CN111062471B (en) * 2019-11-23 2023-05-02 复旦大学 Deep learning accelerator for accelerating BERT neural network operation
CN114722751B (en) * 2022-06-07 2022-09-02 深圳鸿芯微纳技术有限公司 Framework selection model training method and framework selection method for operation unit

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102736684A (en) * 2011-04-05 2012-10-17 温保成 Programmable hard accelerator chip array
CN105740946B (en) * 2015-07-29 2019-02-12 上海磁宇信息科技有限公司 A kind of method that application cell array computation system realizes neural computing
CN105488565A (en) * 2015-11-17 2016-04-13 中国科学院计算技术研究所 Calculation apparatus and method for accelerator chip accelerating deep neural network algorithm
CN107239824A (en) * 2016-12-05 2017-10-10 北京深鉴智能科技有限公司 Apparatus and method for realizing sparse convolution neutral net accelerator
CN106940815B (en) * 2017-02-13 2020-07-28 西安交通大学 Programmable convolutional neural network coprocessor IP core
CN107169560B (en) * 2017-04-19 2020-10-16 清华大学 Self-adaptive reconfigurable deep convolutional neural network computing method and device

Also Published As

Publication number Publication date
CN107817708A (en) 2018-03-20

Similar Documents

Publication Publication Date Title
CN107817708B (en) High-compatibility programmable neural network acceleration array
Ma et al. ALAMO: FPGA acceleration of deep learning algorithms with a modularized RTL compiler
CN105892989B (en) Neural network accelerator and operational method thereof
Al Bahou et al. XNORBIN: A 95 TOp/s/W hardware accelerator for binary convolutional neural networks
US11451229B1 (en) Application specific integrated circuit accelerators
CN111860815A (en) Convolution operation method and device
US11080593B2 (en) Electronic circuit, in particular capable of implementing a neural network, and neural system
Denk et al. Real-time interface board for closed-loop robotic tasks on the SpiNNaker neural computing system
CN107430586B (en) Adaptive chip and configuration method
CN109783412A (en) A kind of method that deeply study accelerates training
Park et al. A multi-mode 8k-MAC HW-utilization-aware neural processing unit with a unified multi-precision datapath in 4-nm flagship mobile SoC
CN108960414B (en) Method for realizing single broadcast multiple operations based on deep learning accelerator
CN110543936B (en) Multi-parallel acceleration method for CNN full-connection layer operation
US20220129320A1 (en) Schedule-aware dynamically reconfigurable adder tree architecture for partial sum accumulation in machine learning accelerators
Höppner et al. Spinnaker2-towards extremely efficient digital neuromorphics and multi-scale brain emulation
CN106294278A (en) The pre-configured controller of adaptive hardware of system is calculated for dynamic reconfigurable array
CN109615061B (en) Convolution operation method and device
CN108647780A (en) Restructural pond operation module structure towards neural network and its implementation
CN112132276A (en) Compatible programmable micro-neuron network acceleration array
Dazzi et al. 5 parallel prism: A topology for pipelined implementations of convolutional neural networks using computational memory
KR20220051367A (en) On-Chip Operation Initialization
CN108804974B (en) Method and system for estimating and configuring resources of hardware architecture of target detection algorithm
Ayhan et al. Approximate fully connected neural network generation
JP6888073B2 (en) Chip equipment and related products
JP6888074B2 (en) Chip equipment and related products

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant