WO2022083536A1 - Neural network construction method and apparatus - Google Patents

Neural network construction method and apparatus Download PDF

Info

Publication number
WO2022083536A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
neural
network
target
neural networks
Prior art date
Application number
PCT/CN2021/124360
Other languages
French (fr)
Chinese (zh)
Inventor
韩凯
王云鹤
张秋林
张维
许春景
钱莉
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2022083536A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

A neural network construction method and apparatus in the field of artificial intelligence are used for accurately and efficiently constructing a target neural network better adapted to hardware under a given hardware constraint condition. The method comprises: performing sampling in a preset search space to obtain at least one set of first parameter combinations, wherein the search space comprises value ranges of various parameters used in constructing a neural network; constructing a plurality of first neural networks according to the at least one set of first parameter combinations; producing a mapping relationship, wherein the mapping relationship comprises relationships between the various parameters and evaluation results of the plurality of first neural networks; acquiring a constraint range, wherein the constraint range comprises a numerical range that identifies the computing capabilities of a computing device; obtaining a second parameter combination corresponding to the constraint range according to the mapping relationship; and obtaining a target neural network according to the second parameter combination.

Description

Neural Network Construction Method and Apparatus
This application claims priority to Chinese Patent Application No. 202011131423.9, entitled "Neural network construction method and apparatus", filed with the China Patent Office on October 21, 2020, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of artificial intelligence, and in particular to a neural network construction method and apparatus.
Background
In the field of artificial intelligence, neural networks, and deep neural networks in particular, have achieved great success in computer vision in recent years. Benefiting from growing computing power and an ever-wider choice of building blocks, neural network architectures have become increasingly complex.
To adapt a neural network to run on different hardware devices, its depth, width, and other properties can be adjusted. However, if multiple aspects of the network must all be tuned manually to obtain a network that better matches the hardware device, the adjustment is inefficient. How to efficiently obtain a neural network adapted to a given hardware device has therefore become an urgent problem to be solved.
Summary of the Invention
The present application provides a neural network construction method and apparatus for accurately and efficiently constructing, under a given hardware constraint, a target neural network that is better adapted to the hardware.
In view of this, in a first aspect, the present application provides a neural network construction method, including: sampling a preset search space to obtain at least one set of first parameter combinations, where the search space includes value ranges of multiple parameters used in constructing a neural network, and each first parameter combination in the at least one set includes a value for each of the multiple parameters; constructing a plurality of first neural networks according to the at least one set of first parameter combinations; obtaining a constraint range, where the constraint range includes a numerical range identifying the computing capability of a computing device and may be determined from information about that computing capability; obtaining, according to a mapping relationship, a second parameter combination corresponding to the constraint range, where the mapping relationship includes relationships between the multiple parameters and evaluation results of the plurality of first neural networks, and an evaluation result is obtained by evaluating the structure of each of the first neural networks; and obtaining a target neural network according to the second parameter combination.
Therefore, in the embodiments of the present application, at least one set of parameter combinations can be obtained by searching the search space, a plurality of neural networks can be built from those combinations, and a mapping relationship can be generated from the parameter combinations and the evaluation results of the structures of those networks. A parameter combination corresponding to the constraint range of the computing device can then be obtained from the mapping relationship, yielding an optimal model adapted to the hardware. In other words, the model can be scaled with the computing capability of the hardware as a constraint to obtain an optimal model adapted to that hardware.
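The end-to-end flow described above (sample, construct, map, then select under a constraint) can be sketched as follows. All parameter names, ranges, and the FLOPs-style cost model are illustrative assumptions for this sketch, not values taken from the application; a real system would construct and profile actual networks.

```python
import random

# Hypothetical search space: value ranges for each architecture parameter.
SEARCH_SPACE = {
    "depth": range(4, 25),         # number of network layers
    "width": range(16, 257, 16),   # basic units per layer
    "resolution": (96, 128, 160, 192, 224),
    "kernel_size": (3, 5, 7),
}

def sample_combinations(n, seed=0):
    """Step 1: sample n first parameter combinations from the search space."""
    rng = random.Random(seed)
    return [{k: rng.choice(list(v)) for k, v in SEARCH_SPACE.items()}
            for _ in range(n)]

def evaluate(params):
    """Stand-in for evaluating a constructed network's structure
    (a rough FLOPs-like cost; a real system would build and measure)."""
    return (params["depth"] * params["width"] ** 2
            * params["kernel_size"] ** 2 * params["resolution"] ** 2)

def best_under_constraint(mapping, budget):
    """Step 2: pick the second parameter combination whose evaluation
    result fits the constraint range (here, an upper cost bound),
    choosing the largest feasible model as one plausible heuristic."""
    feasible = [(p, e) for p, e in mapping if e <= budget]
    return max(feasible, key=lambda pe: pe[1])[0] if feasible else None

combos = sample_combinations(50)
mapping = [(p, evaluate(p)) for p in combos]   # parameters -> evaluation
target_params = best_under_constraint(mapping, budget=5 * 10**11)
```

The target neural network would then be constructed from `target_params`, the second parameter combination that satisfies the device's constraint range.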
In a possible implementation, before the mapping relationship is generated, the method may further include: training the plurality of first neural networks with a preset first data set to obtain a plurality of trained first neural networks; and selecting at least one second neural network from the plurality of first neural networks according to the evaluation result of each trained first neural network or the output accuracy of each trained first neural network. The mapping relationship is then the relationship between the parameter combination corresponding to each second neural network and the evaluation result of each second neural network.
Therefore, in the embodiments of the present application, the plurality of first neural networks can be trained, and several better-performing second neural networks can be selected from them, so that the mapping relationship is subsequently derived from these better networks. As a result, the target neural network corresponding to a parameter combination selected according to the mapping relationship performs better in structure or output accuracy and is better adapted to the hardware.
In a possible implementation, the neural networks with the best evaluation results or the highest output accuracy can be selected from the plurality of first neural networks as the second neural networks, so that the mapping relationship is subsequently generated from second neural networks that perform better in structure or output accuracy. The target neural network corresponding to a parameter combination selected according to that mapping relationship then performs better in structure or output accuracy and is better adapted to the hardware.
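A minimal sketch of this screening step, using made-up accuracy values and a hypothetical dictionary representation of each trained first neural network:

```python
def select_second_networks(trained_nets, k):
    """Keep the k trained networks with the highest output accuracy
    as the second neural networks."""
    return sorted(trained_nets, key=lambda n: n["accuracy"], reverse=True)[:k]

# Illustrative candidates: id and (made-up) validation accuracy.
candidates = [{"id": i, "accuracy": a}
              for i, a in enumerate([0.71, 0.68, 0.74, 0.70, 0.66, 0.73])]
second_nets = select_second_networks(candidates, k=3)
# second_nets holds the networks with accuracies 0.74, 0.73, 0.71
```

The same function could instead sort on an evaluation-result field (e.g. FLOPs, ascending) when screening by structure rather than accuracy.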
In a possible implementation, the mapping relationship may be obtained by fitting the relationship between the parameter combination corresponding to each second neural network and the evaluation result of each second neural network.
Therefore, in the embodiments of the present application, the mapping relationship can be fitted from second neural networks that perform better in structure or output accuracy, so that the target neural network corresponding to a parameter combination selected according to the mapping relationship performs better in structure or output accuracy and is better adapted to the hardware.
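As one way such a fit could be carried out, the sketch below fits an ordinary least-squares line between a single architecture parameter and an evaluation result. The data points and the choice of a linear model are illustrative assumptions; the application does not prescribe a particular fitting method.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit y ≈ a*x + b, a stand-in for fitting
    the parameter -> evaluation-result mapping relationship."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Illustrative data: network width vs measured cost (in GFLOPs).
widths = [16, 32, 64, 128]
gflops = [0.5, 1.0, 2.0, 4.0]
a, b = fit_linear(widths, gflops)

def predict(width):
    """Predict the evaluation result for an unseen width."""
    return a * width + b
```

Given such a fitted curve, the second parameter combination can be found by inverting the prediction against the constraint range instead of exhaustively evaluating every candidate.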
In a possible implementation, the parameters included in the search space include one or more of the following: width, depth, resolution, or convolution kernel size, where the width is the number of basic units included in each layer of the neural network, the depth is the number of layers of the neural network, and the resolution is the resolution of the images input to the neural network.
Therefore, in the embodiments of the present application, a neural network can be constructed based on the depth, width, resolution, convolution kernel size, or convolution kernel group size, or a base neural network can be adjusted accordingly to obtain a new first neural network.
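For illustration, a sampled (depth, width, kernel size) combination could be expanded into a per-layer specification like this; the dictionary layout is a hypothetical stand-in for real network construction:

```python
def build_network_spec(depth, width, kernel_size):
    """Expand a parameter combination into a per-layer specification:
    depth gives the number of layers, width the basic units per layer."""
    return [{"layer": i, "units": width, "kernel": kernel_size}
            for i in range(depth)]

spec = build_network_spec(depth=4, width=32, kernel_size=3)
```

A framework-specific builder (e.g. one emitting convolution layers) would replace the dictionaries, but the parameter-to-structure expansion follows the same shape.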
In a possible implementation, the evaluation result includes one or more of: the total number of floating-point operations (FLOPs) of each first neural network, the forward-inference running time of each first neural network, the amount of memory occupied when running each first neural network, or the number of parameters of each first neural network.
Therefore, in the embodiments of the present application, the result of evaluating the structure of a first neural network may include FLOPs, running time, memory usage, or the number of parameters of the network. One or more of these can be used to measure the quality of the structure of the first neural network, so that a neural network better adapted to the hardware can be selected.
Specifically, if the evaluation result includes data such as running time or memory usage, the first neural network can be run on the computing device, which may also be a simulated device, to obtain that data; different structures generally correspond to different running times and memory footprints. If the evaluation result includes data such as FLOPs or the number of parameters, that data can be obtained directly from statistics over the structure of the first neural network.
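For example, FLOPs and parameter counts of a convolution layer can be computed analytically from the structure alone, without running the network. The formulas below are the standard closed forms for a stride-1, same-padding convolution with bias; they are an illustration, not text from the application:

```python
def conv2d_stats(c_in, c_out, k, h, w):
    """Analytic evaluation results for one convolution layer:
    - params: one k*k*c_in filter plus bias per output channel
    - flops:  each output position does c_in*k*k multiply-adds,
              counted here as 2 operations each"""
    params = c_out * (c_in * k * k + 1)
    flops = 2 * h * w * c_out * c_in * k * k
    return params, flops

# A typical first layer: 3-channel 224x224 input, 64 output channels, 3x3 kernel.
params, flops = conv2d_stats(c_in=3, c_out=64, k=3, h=224, w=224)
```

Summing these per-layer figures over a whole network yields the total FLOPs and parameter count used as structural evaluation results.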
In a possible implementation, the aforementioned obtaining of the constraint range may include: receiving user input data and obtaining the constraint range from the user input data. Thus, in the embodiments of the present application, the constraint range can be derived from data input by the user, allowing the target neural network to be built according to the user's needs and improving the user experience.
In a possible implementation, the aforementioned obtaining of the constraint range from the user input data may include: obtaining identification information of the computing device from the user input data, and obtaining the constraint range according to that identification information. Therefore, in the embodiments of the present application, the constraint range can be determined from the identifier of the computing device: the user only needs to provide the device identifier, without supplying further information, which improves the interactive experience.
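One way this device-ID lookup might work is a simple table from identifier to constraint range; the identifiers and budgets below are invented for illustration:

```python
# Hypothetical table mapping a device identifier to its constraint range,
# expressed here as a (min, max) compute budget in GFLOPs.
DEVICE_CONSTRAINTS = {
    "phone-npu-a": (0.0, 0.6),
    "edge-gpu-b": (0.0, 4.0),
    "server-gpu-c": (0.0, 32.0),
}

def constraint_from_user_input(user_input):
    """Resolve the constraint range from the device ID in the user input."""
    device_id = user_input["device_id"]
    return DEVICE_CONSTRAINTS[device_id]

lo, hi = constraint_from_user_input({"device_id": "edge-gpu-b"})
```

With such a table, the user supplies only the device identifier; the system resolves the numerical constraint range internally, as the implementation above suggests.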
In a possible implementation, the type of the target neural network may also be determined from the user input data. Therefore, in the embodiments of the present application, a target neural network that is adapted to the computing capability of the computing device and meets the user's needs can be constructed for the user, greatly improving the user experience.
In a possible implementation, the method may further include: training the target neural network with a preset second data set to obtain a trained target neural network. In the embodiments of the present application, the finally obtained neural network may be the trained target neural network, so that it can be deployed directly on the computing device to implement the corresponding functions.
In a possible implementation, the target neural network is used to perform at least one of feature extraction, semantic segmentation, classification, super-resolution, or object detection. Therefore, in the embodiments of the present application, the target neural network can implement one or more of these functions, adapting to more scenarios with strong generalization ability.
In a second aspect, the present application provides a neural network construction apparatus, including:
a sampling module, configured to sample a preset search space to obtain at least one set of first parameter combinations, where the search space includes value ranges of multiple parameters used in constructing a neural network, and each first parameter combination in the at least one set includes a value for each of the multiple parameters;
a construction module, configured to construct a plurality of first neural networks according to the at least one set of first parameter combinations;
an obtaining module, configured to obtain a constraint range, where the constraint range includes a numerical range identifying the computing capability of a computing device, may be determined from information about that computing capability, and the data type of the evaluation results includes the data type corresponding to the constraint range;
a computation module, configured to obtain, according to a mapping relationship, a second parameter combination corresponding to the constraint range, where the mapping relationship includes relationships between the at least one set of parameter combinations and the evaluation results of the plurality of first neural networks, and an evaluation result is obtained by evaluating the structure of each of the first neural networks;
the construction module being further configured to obtain a target neural network according to the second parameter combination.
For the effects of the second aspect and any implementation thereof, reference may be made to the foregoing first aspect; details are not repeated here.
In a possible implementation, the apparatus may further include:
a first training module, configured to train the plurality of first neural networks with a preset first data set to obtain a plurality of trained first neural networks;
a screening module, configured to select at least one second neural network from the plurality of first neural networks according to the evaluation result of each trained first neural network or the output accuracy of each trained first neural network; the mapping relationship may include the mapping between the parameter combination corresponding to each second neural network and the evaluation result of each second neural network.
In a possible implementation, the mapping relationship is obtained by fitting the relationship between the parameter combination corresponding to each second neural network and the evaluation result of each second neural network.
In a possible implementation, the parameters included in the search space include one or more of the following: width, depth, resolution, or convolution kernel size, where the width is the number of basic units included in each layer of the neural network, the depth is the number of layers of the neural network, and the resolution is the resolution of the images input to the neural network.
In a possible implementation, the evaluation result may include one or more of: the total number of floating-point operations (FLOPs) of each first neural network, the forward-inference running time of each first neural network, the amount of memory occupied when running each first neural network, or the number of parameters of each first neural network.
In a possible implementation, the obtaining module is specifically configured to receive user input data and obtain the constraint range from the user input data.
In a possible implementation, the obtaining module is specifically configured to: obtain identification information of the computing device from the user input data; and obtain the constraint range according to the identification information of the computing device.
In a possible implementation, the apparatus may further include a second training module, configured to train the target neural network with a preset second data set to obtain a trained target neural network.
In a possible implementation, the target neural network is used to perform at least one of feature extraction, semantic segmentation, classification, super-resolution, or object detection.
In a third aspect, an embodiment of the present application provides a neural network construction apparatus that has the function of implementing the neural network construction method of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the function.
In a fourth aspect, an embodiment of the present application provides a neural network construction apparatus, including a processor and a memory interconnected by a line, where the processor invokes program code in the memory to perform the processing-related functions of the neural network construction method of any implementation of the first aspect. Optionally, the neural network construction apparatus may be a chip.
In a fifth aspect, an embodiment of the present application provides a neural network construction apparatus, which may also be called a digital processing chip or simply a chip. The chip includes a processing unit and a communication interface; the processing unit obtains program instructions through the communication interface, and the instructions are executed by the processing unit to perform the processing-related functions of the first aspect or any optional implementation thereof.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform the method of the first aspect or any optional implementation thereof.
In a seventh aspect, an embodiment of the present application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method of the first aspect or any optional implementation thereof.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an artificial intelligence main framework to which the present application applies;
FIG. 2 is a schematic diagram of a system architecture provided by the present application;
FIG. 3 is a schematic structural diagram of a convolutional neural network provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of another system architecture provided by the present application;
FIG. 5 is a schematic flowchart of a neural network construction method provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of another neural network construction method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of another application scenario provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of another application scenario provided by an embodiment of the present application;
FIG. 10 is a schematic flowchart of another neural network construction method provided by an embodiment of the present application;
FIG. 11A is a schematic diagram of a mapping relationship provided by an embodiment of the present application;
FIG. 11B is a schematic diagram of another mapping relationship provided by an embodiment of the present application;
FIG. 11C is a schematic diagram of another mapping relationship provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a neural network construction apparatus provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of another neural network construction apparatus provided by an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a chip provided by an embodiment of the present application.
Detailed Description of Embodiments
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The overall workflow of an artificial intelligence system is described first. FIG. 1 shows a schematic structural diagram of an artificial intelligence main framework, which is explained below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The intelligent information chain reflects the sequence of processes from data acquisition to processing, for example the general flow of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output; in this process, data undergoes a refinement from "data" to "information" to "knowledge" to "wisdom". The IT value chain, spanning from the underlying infrastructure of artificial intelligence and information (technologies for providing and processing it) up to the industrial ecology of the system, reflects the value that artificial intelligence brings to the information technology industry.
(1) Infrastructure
The infrastructure provides computing-power support for the artificial intelligence system, enables communication with the outside world, and is supported by a base platform. Communication with the outside is performed through sensors. Computing power is provided by intelligent chips, that is, hardware acceleration chips such as central processing units (CPU), neural-network processing units (NPU), graphics processing units (GPU), application-specific integrated circuits (ASIC), or field-programmable gate arrays (FPGA). The base platform includes platform assurance and support related to distributed computing frameworks and networks, and may include cloud storage and computing, interconnection networks, and so on. For example, sensors communicate with the outside to acquire data, and the data is provided to the intelligent chips in the distributed computing system offered by the base platform for computation.
(2) Data
Data at the layer above the infrastructure represents the data sources of the artificial intelligence field. The data involves graphics, images, speech, and text, as well as Internet-of-Things data from conventional devices, including business data of existing systems and sensed data such as force, displacement, liquid level, temperature, and humidity.
(3) Data processing
Data processing usually includes manners such as data training, machine learning, deep learning, searching, reasoning, and decision-making.
Machine learning and deep learning may perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, and the like on the data.
Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system: using formalized information to perform machine thinking and problem solving according to a reasoning control strategy. Typical functions are searching and matching.
Decision-making refers to the process of making decisions after intelligent information has been reasoned about, and usually provides functions such as classification, ranking, and prediction.
(4) General capabilities
After the data undergoes the data processing mentioned above, some general capabilities may further be formed based on the results of the data processing, for example, an algorithm or a general-purpose system, such as translation, text analysis, computer vision processing, speech recognition, image recognition, and the like.
(5) Intelligent products and industry applications
Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They are the encapsulation of overall artificial intelligence solutions, productizing intelligent-information decision-making and implementing practical applications. The application fields mainly include intelligent terminals, intelligent transportation, intelligent healthcare, autonomous driving, smart cities, and the like.
The embodiments of this application involve a large number of neural-network-related applications. For a better understanding of the solutions of the embodiments of this application, related terms and concepts of neural networks that may be involved in the embodiments of this application are first introduced below.
(1) Neural network
A neural network may be composed of neural units. A neural unit may be an operation unit that takes $x_s$ and an intercept of 1 as inputs, and the output of the operation unit may be as shown in formula (1-1):

$$h_{W,b}(x) = f(W^{T}x) = f\Big(\sum_{s=1}^{n} W_{s}x_{s} + b\Big) \qquad (1\text{-}1)$$

where $s = 1, 2, \ldots, n$, $n$ is a natural number greater than 1, $W_s$ is the weight of $x_s$, and $b$ is the bias of the neural unit. $f$ is the activation function of the neural unit, used to introduce a nonlinear characteristic into the neural network so as to convert the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer, and the activation function may be a sigmoid function. A neural network is a network formed by joining many such single neural units together; that is, the output of one neural unit may be the input of another neural unit. The input of each neural unit may be connected to a local receptive field of the previous layer to extract features of the local receptive field, where the local receptive field may be a region composed of several neural units.
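As an illustrative sketch only (the function name and the example values here are ours, not part of the application), a single neural unit computing formula (1-1) with a sigmoid activation can be written as:

```python
import math

def neural_unit(xs, weights, bias):
    """Output of one neural unit: f(sum_s W_s * x_s + b), as in formula (1-1),
    where the activation f is a sigmoid introducing the nonlinearity."""
    s = sum(w * x for w, x in zip(weights, xs)) + bias
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid activation

# Example: two inputs with made-up weights and bias
out = neural_unit([1.0, 2.0], [0.5, -0.25], 0.1)
```

Because the sigmoid squashes its input, the output of the unit always lies strictly between 0 and 1.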
(2) Deep neural network
A deep neural network (DNN), also referred to as a multi-layer neural network, may be understood as a neural network having multiple intermediate layers. Dividing the DNN by the positions of its layers, the layers inside the DNN can be classified into three categories: the input layer, the intermediate layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all layers in between are intermediate layers, also called hidden layers. The layers are fully connected; that is, any neuron of the i-th layer is necessarily connected to any neuron of the (i+1)-th layer.
Although a DNN looks complicated, each of its layers can be expressed as the linear relational expression:

$$\vec{y} = \alpha(W\vec{x} + \vec{b})$$

where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector (also called the bias parameter), $W$ is the weight matrix (also called the coefficients), and $\alpha(\cdot)$ is the activation function. Each layer simply performs this operation on the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because the DNN has many layers, there are also many coefficients $W$ and offset vectors $\vec{b}$. These parameters are defined in the DNN as follows, taking the coefficient $w$ as an example. Suppose that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as $W^{3}_{24}$: the superscript 3 represents the layer in which the coefficient $W$ is located, and the subscripts correspond to the output index 2 of the third layer and the input index 4 of the second layer.

In summary, the coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as $W^{L}_{jk}$.
It should be noted that the input layer has no parameter $W$. In a deep neural network, more intermediate layers make the network better able to describe complex situations in the real world. In theory, a model with more parameters has higher complexity and larger "capacity", which means it can accomplish more complex learning tasks. Training the deep neural network is the process of learning the weight matrices, and its ultimate goal is to obtain the weight matrices of all layers of the trained deep neural network (the weight matrices formed by the vectors $W$ of the many layers).
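The layer-wise expression $\vec{y} = \alpha(W\vec{x} + \vec{b})$ can be sketched as a forward pass. The shapes and weights below are invented for illustration and are not part of the application:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(x, W, b):
    """One DNN layer: y_j = sigmoid(sum_k W[j][k] * x[k] + b[j])."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def forward(x, net):
    """Apply y = alpha(W x + b) layer by layer; the input layer has no W."""
    for W, b in net:
        x = layer(x, W, b)
    return x

# A three-layer DNN (made-up weights): 2 inputs -> 3 hidden units -> 1 output
net = [
    ([[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]], [0.0, 0.0, 0.0]),
    ([[0.5, -0.5, 0.25]], [0.1]),
]
y = forward([1.0, 2.0], net)
```

Training would then adjust every entry of the weight matrices and offset vectors; the forward pass itself stays this simple.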
(3) Convolutional neural network
A convolutional neural network (CNN) is a deep neural network with a convolutional structure. A convolutional neural network contains a feature extractor composed of convolutional layers and sub-sampling layers, and the feature extractor can be regarded as a filter. A convolutional layer is a layer of neurons in the convolutional neural network that performs convolution processing on the input signal. In a convolutional layer of a convolutional neural network, one neuron may be connected to only some neurons of the neighboring layers. A convolutional layer usually contains several feature planes, and each feature plane may be composed of some rectangularly arranged neural units. Neural units in the same feature plane share weights, and the shared weights here are the convolution kernel. Weight sharing can be understood as meaning that the manner of extracting image information is independent of position. The convolution kernel may be initialized in the form of a matrix of random size, and during training of the convolutional neural network the convolution kernel can obtain reasonable weights through learning. In addition, a direct benefit of weight sharing is reducing the connections between the layers of the convolutional neural network while also reducing the risk of overfitting.
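Weight sharing can be made concrete with a minimal sketch (our own toy example, not the application's implementation): one kernel, one weight matrix, slid over every position of the image with stride 1.

```python
def conv2d(image, kernel):
    """Slide one shared weight matrix (the convolution kernel) over the
    image with stride 1; the same weights are applied at every position."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(ow)]
            for r in range(oh)]

# A 3x3 horizontal-edge kernel applied to a 4x4 image gives a 2x2 feature map
image = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
kernel = [[1, 1, 1],
          [0, 0, 0],
          [-1, -1, -1]]
fmap = conv2d(image, kernel)
```

Each feature-map value is produced by the same nine weights, which is exactly the position-independent extraction described above.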
(4) Recurrent neural networks (RNNs), also referred to as recursive neural networks, are used to process sequence data. In a conventional neural network model, data flows from the input layer to the intermediate layers and then to the output layer; the layers are fully connected, while the nodes within each layer are unconnected. Although such an ordinary neural network has solved many difficult problems, it remains powerless for many others. For example, to predict the next word of a sentence, the preceding words are generally needed, because the words in a sentence are not independent of each other. The reason an RNN is called a recurrent neural network is that the current output of a sequence is also related to the previous outputs. Concretely, the network memorizes the preceding information and applies it to the computation of the current output; that is, the nodes within the intermediate layer are no longer unconnected but connected, and the input of the intermediate layer includes not only the output of the input layer but also the output of the intermediate layer at the previous moment. In theory, an RNN can process sequence data of any length. Training an RNN is the same as training a conventional CNN or DNN.
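The key point, that the intermediate layer's input includes both the current input and the previous moment's hidden output, can be sketched with a scalar recurrent step (the weights and sequence are invented for illustration):

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent step: the hidden state depends on the current input
    x_t AND on the hidden output h_prev of the previous time step."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

# Process a short sequence; the hidden state carries memory across steps
h = 0.0
for x_t in [1.0, 0.5, -1.0]:
    h = rnn_step(x_t, h, w_x=0.8, w_h=0.5, b=0.0)
```

The same weights w_x and w_h are reused at every time step, which is why an RNN can in principle process a sequence of any length.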
(5) Residual neural network (ResNet)
The residual neural network was proposed to solve the degradation problem that arises when a neural network has too many hidden layers. The degradation problem refers to the following: as the network's hidden layers increase, the accuracy of the network saturates and then degrades sharply, and this degradation is not caused by overfitting; rather, during backpropagation, the gradients are weakly correlated by the time they propagate to the bottom layers and the gradient updates are insufficient, so the accuracy of the predicted labels of the resulting model decreases. When the neural network degrades, a shallow network can achieve a better training effect than a deep one. In that case, if the low-layer features are passed to the high layers, the effect should be at least no worse than that of the shallow network, and this can be achieved through an identity mapping. Such an identity mapping is called a residual connection (shortcut), and optimizing this residual mapping is easier than optimizing the original mapping.
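A hedged sketch of the shortcut idea (our own minimal form, not the application's network): the block outputs layer_fn(x) + x, so when the learned residual is zero the block reduces exactly to the identity mapping.

```python
def residual_block(x, layer_fn):
    """y = layer_fn(x) + x: the shortcut passes x through unchanged, so the
    block only needs to learn the residual mapping layer_fn."""
    return [f + xi for f, xi in zip(layer_fn(x), x)]

# If the learned residual is zero, the block is an identity mapping, so
# stacking it can never make a shallower network's features worse
identity_like = residual_block([1.0, 2.0], lambda v: [0.0 for _ in v])
```

This is why optimizing the residual mapping is easier: pushing layer_fn toward zero is enough to recover the shallow network's behavior.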
(6) Loss function
In the process of training a deep neural network, because it is hoped that the output of the deep neural network is as close as possible to the value that is actually desired to be predicted, the predicted value of the current network can be compared with the actually desired target value, and the weight vector of each layer of the neural network can then be updated according to the difference between the two (of course, there is usually an initialization process before the first update, that is, parameters are preconfigured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vectors are adjusted to make the prediction lower, and the adjustment continues until the deep neural network can predict the actually desired target value or a value very close to it. Therefore, "how to compare the difference between the predicted value and the target value" needs to be predefined. This is the loss function or objective function: an important equation used to measure the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so training the deep neural network becomes the process of reducing this loss as much as possible.
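A common concrete choice, shown here purely as an example (the application does not prescribe this particular loss), is the mean squared error: predictions far from the targets give a large loss, close predictions give a small one.

```python
def mse_loss(predictions, targets):
    """Mean squared error: one example of a loss function measuring how far
    the network's predictions are from the desired target values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

loss_far = mse_loss([0.9, 0.1], [0.0, 1.0])   # predictions far from targets
loss_near = mse_loss([0.1, 0.9], [0.0, 1.0])  # predictions close to targets
```

Training then amounts to adjusting the weights so that this number shrinks.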
(7) Backpropagation algorithm
A neural network may use the error backpropagation (BP) algorithm to correct the values of the parameters of the initial neural network model during training, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, forward propagation of the input signal up to the output produces an error loss, and the parameters of the initial neural network model are updated by backpropagating the error-loss information, so that the error loss converges. The backpropagation algorithm is a backpropagation movement dominated by the error loss, aiming to obtain the optimal parameters of the neural network model, for example, the weight matrices.
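The forward-then-backward cycle can be sketched on the smallest possible model, y = w*x + b with a squared-error loss (a toy of ours; the gradient formulas are the hand-derived backprop derivatives for this one-parameter-pair case):

```python
def train_step(w, b, x, target, lr=0.1):
    """One forward pass plus one gradient update for y = w*x + b with a
    squared-error loss; the gradients are the backprop formulas."""
    y = w * x + b                 # forward propagation
    loss = (y - target) ** 2
    grad_y = 2 * (y - target)     # dL/dy, propagated backwards
    w -= lr * grad_y * x          # dL/dw = dL/dy * x
    b -= lr * grad_y              # dL/db = dL/dy
    return w, b, loss

w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=1.0, target=2.0)
    losses.append(loss)
```

Repeating the step makes the error loss converge toward zero, which is exactly the movement the BP algorithm performs on every weight matrix of a real network.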
A CNN is a commonly used neural network. For ease of understanding, the structure of a convolutional neural network is introduced below by way of example.
The structure of a CNN is described in detail below by way of example with reference to Figure 2. As described in the introduction to basic concepts above, a convolutional neural network is a deep neural network with a convolutional structure and is a deep learning architecture; a deep learning architecture refers to performing multiple levels of learning at different levels of abstraction through machine learning algorithms. As a deep learning architecture, a CNN is a feed-forward artificial neural network in which each neuron can respond to the image input into it.
As shown in Figure 2, a convolutional neural network (CNN) 200 may include an input layer 210, a convolutional layer/pooling layer 220 (where the pooling layer is optional), and a neural network layer 230. In the following embodiments of this application, for ease of understanding, each layer is referred to as a stage. These layers are described in detail below.
Convolutional layer/pooling layer 220:
Convolutional layer:
As shown in Figure 2, the convolutional layer/pooling layer 220 may include, for example, layers 221 to 226. In one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer. In another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
The following takes the convolutional layer 221 as an example to introduce the internal working principle of one convolutional layer.
The convolutional layer 221 may include many convolution operators. A convolution operator, also called a kernel, plays a role in image processing equivalent to a filter that extracts specific information from the input image matrix. A convolution operator may essentially be a weight matrix, and this weight matrix is usually predefined. In the process of performing a convolution operation on an image, the weight matrix is usually processed over the input image along the horizontal direction one pixel after another (or two pixels after two pixels, depending on the value of the stride), so as to complete the work of extracting a specific feature from the image. The size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image; during the convolution operation, the weight matrix extends over the entire depth of the input image. Therefore, convolving with a single weight matrix produces a convolutional output of a single depth dimension. In most cases, however, a single weight matrix is not used; instead, multiple weight matrices of the same size (rows × columns), that is, multiple matrices of the same shape, are applied. The outputs of the weight matrices are stacked to form the depth dimension of the convolutional image, where this dimension can be understood as being determined by the "multiple" described above. Different weight matrices can be used to extract different features of the image; for example, one weight matrix is used to extract image edge information, another weight matrix is used to extract a specific color of the image, and yet another weight matrix is used to blur unwanted noise in the image. The multiple weight matrices have the same size (rows × columns), the feature maps extracted by these weight matrices of the same size also have the same size, and the multiple extracted feature maps of the same size are then combined to form the output of the convolution operation.
In practical applications, the weight values in these weight matrices need to be obtained through a large amount of training, and each weight matrix formed by the weight values obtained through training can be used to extract information from the input image, so that the convolutional neural network 200 makes correct predictions.
When the convolutional neural network 200 has multiple convolutional layers, the initial convolutional layers (for example, 221) often extract more general features, which may also be called low-level features. As the depth of the convolutional neural network 200 increases, the features extracted by the later convolutional layers (for example, 226) become more and more complex, for example, features with high-level semantics; features with higher-level semantics are more applicable to the problem to be solved.
Pooling layer:
Since it is often necessary to reduce the number of training parameters, a pooling layer often needs to be introduced periodically after a convolutional layer; a pooling layer may also be called a down-sampling layer. In the layers 221 to 226 of the part 220 exemplified in Figure 2, one convolutional layer may be followed by one pooling layer, or multiple convolutional layers may be followed by one or more pooling layers. In image processing, the sole purpose of the pooling layer is to reduce the spatial size of the image. The pooling layer may include an average pooling operator and/or a max pooling operator, for sampling the input image to obtain an image of a smaller size. The average pooling operator may compute the pixel values of the image within a specific range to produce an average value as the result of average pooling. The max pooling operator may take, within a specific range, the pixel with the largest value in that range as the result of max pooling. In addition, just as the size of the weight matrix in a convolutional layer should be related to the image size, the operators in the pooling layer should also be related to the image size. The size of the image output after processing by the pooling layer may be smaller than the size of the image input to the pooling layer, and each pixel of the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer.
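Both operators can be sketched in a few lines (our toy example with an invented 4x4 input): each output pixel summarizes one sub-region of the input, so the output is smaller than the input.

```python
def pool2d(image, size, op=max):
    """Downsample by taking the max (or average) of each size x size
    sub-region; the output image is smaller than the input image."""
    out = []
    for r in range(0, len(image), size):
        row = []
        for c in range(0, len(image[0]), size):
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(op(window))
        out.append(row)
    return out

image = [[1, 2, 5, 6],
         [3, 4, 7, 8],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
max_pooled = pool2d(image, 2)                                # max pooling
avg_pooled = pool2d(image, 2, op=lambda w: sum(w) / len(w))  # average pooling
```

A 4x4 input with 2x2 pooling yields a 2x2 output, reducing the spatial size by a factor of four while keeping no trainable parameters.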
Neural network layer 230:
After processing by the convolutional layer/pooling layer 220, the convolutional neural network 200 is not yet sufficient to output the required output information. As described above, the convolutional layer/pooling layer 220 only extracts features and reduces the parameters brought by the input image. To generate the final output information (the required class information or other relevant information), however, the convolutional neural network 200 needs to use the neural network layer 230 to generate one output, or a group of outputs whose number equals the number of required classes. Therefore, the neural network layer 230 may include multiple intermediate layers (231, 232 to 23n as shown in Figure 2) and an output layer 240; the output layer may also be called a fully connected (FC) layer. The parameters contained in the multiple intermediate layers may be obtained through pre-training according to relevant training data of a specific task type; for example, the task type may include image recognition, image classification, image super-resolution reconstruction, and the like.
After the multiple intermediate layers of the neural network layer 230, that is, as the last layer of the entire convolutional neural network 200, comes the output layer 240. The output layer 240 has a loss function similar to categorical cross-entropy, which is specifically used to compute the prediction error. Once the forward propagation of the entire convolutional neural network 200 is completed (in Figure 2, propagation in the direction from 210 to 240 is forward propagation), backpropagation (in Figure 2, propagation in the direction from 240 to 210 is backpropagation) begins to update the weight values and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 200 and the error between the result output by the convolutional neural network 200 through the output layer and the ideal result.
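For illustration of what "a loss function similar to categorical cross-entropy" computes (our own minimal sketch, with invented logits): the output layer's scores are turned into class probabilities and the loss is the negative log-probability of the correct class.

```python
import math

def softmax(logits):
    """Turn raw output-layer scores into class probabilities."""
    exps = [math.exp(z - max(logits)) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_class):
    """Categorical cross-entropy: -log of the probability assigned
    to the correct class."""
    return -math.log(softmax(logits)[true_class])

good = cross_entropy([4.0, 0.0, 0.0], true_class=0)  # confident and correct
bad = cross_entropy([0.0, 4.0, 0.0], true_class=0)   # confident and wrong
```

A confident correct prediction yields a small loss and a confident wrong one a large loss, which is the error signal that backpropagation then pushes from layer 240 back toward layer 210.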
It should be noted that the convolutional neural network 200 shown in Figure 2 is only an example of a convolutional neural network; in specific applications, the convolutional neural network may also exist in the form of other network models.
In this application, the convolutional neural network 200 shown in Figure 2 may be used to process an image to be processed so as to obtain a classification result of the image. As shown in Figure 2, the image to be processed is processed by the input layer 210, the convolutional layer/pooling layer 220, and the neural network layer 230, and the classification result of the image is output.
The deep learning training method for a computing device provided by the embodiments of this application may be executed on a server, and may also be executed on a terminal device. The terminal device may be a mobile phone with an image processing function, a tablet personal computer (TPC), a media player, a smart TV, a laptop computer (LC), a personal digital assistant (PDA), a personal computer (PC), a camera, a video camera, a smart watch, a wearable device (WD), an autonomous vehicle, or the like, which is not limited in the embodiments of this application.
Referring to Figure 3, an embodiment of this application provides a system architecture 300. The system architecture includes a database 330 and a client device 340. A data collection device 360 is used to collect data and store it in the database 330, and a building module 302 generates a target model/rule 301 based on the data maintained in the database 330. How the building module 302 obtains the target model/rule 301 based on the data is described in more detail below; the target model/rule 301 is the neural network constructed in the following embodiments of this application. For details, refer to the relevant descriptions of Figures 5 to 11C below.
The computing module may include the building module 302, and the target model/rule obtained by the building module 302 may be applied in different systems or devices. In Figure 3, the execution device 310 is configured with a transceiver 312, which may be a wireless transceiver, an optical transceiver, a wired interface (such as an I/O interface), or the like, for data interaction with external devices. A "user" may input data to the transceiver 312 through the client device 340. For example, in the following embodiments of this application, the client device 340 may send a base network, constraint conditions, and the like to the execution device 310, requesting the execution device to construct a target neural network based on the base network under the constraints of the constraint conditions. Optionally, the client device 340 may also send to the execution device 310 a database used for constructing the target neural network, that is, the data set mentioned below in this application, which is not described again here.
The execution device 310 may invoke data, code, and the like in a data storage system 350, and may also store data, instructions, and the like into the data storage system 350.
The computation module 311 processes the input data. Specifically, the computation module 311 is configured to: sample at least one group of first parameter combinations from a preset search space, where the search space includes the value ranges of multiple parameters used in constructing a neural network, and each first parameter combination in the at least one group of first parameter combinations includes a value of each of the multiple parameters; construct multiple first neural networks according to the at least one group of first parameter combinations; generate a mapping relationship, where the mapping relationship includes the relationship between the multiple parameters and the evaluation results of the multiple first neural networks, and an evaluation result is a result obtained by evaluating the structure of each first neural network of the multiple first neural networks; obtain a constraint range, where the constraint range includes a numerical range determined according to information about the computing capability of a computing apparatus and may include a numerical range identifying the computing capability of the computing apparatus; obtain, according to the mapping relationship, a second parameter combination corresponding to the constraint range; and obtain a target neural network according to the second parameter combination, where the target neural network is the target model/rule 301 shown in Figure 3.
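The flow of the computation module 311 (sample parameter combinations from a search space, evaluate the resulting candidate networks, then pick a combination that fits the constraint range) can be sketched as follows. This is a heavily simplified toy of ours: the search space, the cost proxy, and the score proxy are all invented stand-ins, not the application's actual evaluation procedure.

```python
import random

def build_and_select(search_space, constraint, n_samples=20, seed=0):
    """Sample first parameter combinations, 'evaluate' each candidate
    network with toy proxies, and return the best-scoring combination
    whose cost falls within the constraint range."""
    rng = random.Random(seed)
    mapping = []  # (parameter combination, evaluation result, cost)
    for _ in range(n_samples):
        combo = {name: rng.choice(values)
                 for name, values in search_space.items()}
        cost = combo["depth"] * combo["width"]          # proxy for compute cost
        score = combo["depth"] + 0.01 * combo["width"]  # proxy for accuracy
        mapping.append((combo, score, cost))
    feasible = [m for m in mapping if m[2] <= constraint]
    return max(feasible, key=lambda m: m[1])[0] if feasible else None

# Toy search space: value ranges of parameters used to build the network
space = {"depth": [4, 8, 16], "width": [32, 64, 128], "kernel": [3, 5, 7]}
best = build_and_select(space, constraint=1024)
```

The returned combination plays the role of the "second parameter combination" from which the target neural network would then be built.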
The associated function module 312 and the associated function module 314 are optional modules that may be used to construct networks, other than the backbone network, that are associated with the target neural network, such as a region proposal network (RPN) and a feature pyramid network (FPN).
Finally, the transceiver 312 returns the constructed target neural network to the client device 340, so that the target neural network can be deployed in the client device 340 or another device.
At a deeper level, the building module 302 can obtain corresponding target models/rules 301 based on different candidate sets for different target tasks, so as to provide users with better results.
In the case shown in FIG. 3, the data input into the execution device 310 may be determined according to input data of the user; for example, the user may operate on an interface provided by the transceiver 312. In another case, the client device 340 may automatically input data to the transceiver 312 and obtain the result; if automatic input by the client device 340 requires the user's authorization, the user may set corresponding permissions on the client device 340. The user may view the result output by the execution device 310 on the client device 340, and the specific presentation form may be display, sound, action, or the like. The client device 340 may also serve as a data collection terminal and store collected data in the database 330.
It should be noted that FIG. 3 is merely an exemplary schematic diagram of a system architecture provided by an embodiment of this application, and the positional relationships among the devices, components, modules, and the like shown in the figure do not constitute any limitation. For example, in FIG. 3 the data storage system 350 is an external memory relative to the execution device 310; in other scenarios, the data storage system 350 may instead be placed inside the execution device 310.
The target model/rule 101 constructed by the building module 302 may be applied to different systems or devices, such as a mobile phone, a tablet computer, a laptop computer, an augmented reality (AR)/virtual reality (VR) device, or a vehicle-mounted terminal, and may also be a server, a cloud device, or the like.
In the embodiments of this application, the target model/rule 101 may be the target neural network of this application. Specifically, the target neural network provided by the embodiments of this application may be a CNN, a deep convolutional neural network (DCNN), a recurrent neural network (RNN), or the like.
Referring to FIG. 4, an embodiment of this application further provides a system architecture 400. The execution device 310 is implemented by one or more servers and, optionally, cooperates with other computing devices such as data storage, routers, and load balancers; the execution device 310 may be arranged on one physical site or distributed across multiple physical sites. The execution device 310 may use data in the data storage system 350, or call program code in the data storage system 350, to implement the steps of the deep learning training method for a computing device corresponding to FIG. 6 below in this application.
A user may operate respective user devices (for example, the local device 401 and the local device 402) to interact with the execution device 310. Each local device may represent any computing device, such as a personal computer, a computer workstation, a smartphone, a tablet computer, a smart camera, a smart car or other type of cellular phone, a media consumption device, a wearable device, a set-top box, or a game console.
Each user's local device may interact with the execution device 310 through a communication network of any communication mechanism/standard. The communication network may be a wide area network, a local area network, a point-to-point connection, or any combination thereof. Specifically, the communication network may include a wireless network, a wired network, or a combination of the two. The wireless network includes, but is not limited to, any one or a combination of: a fifth-generation (5G) mobile communication system, a long term evolution (LTE) system, a global system for mobile communication (GSM), a code division multiple access (CDMA) network, a wideband code division multiple access (WCDMA) network, wireless fidelity (WiFi), Bluetooth, Zigbee, radio frequency identification (RFID), long range (Lora) wireless communication, and near field communication (NFC). The wired network may include an optical fiber communication network, a network composed of coaxial cables, or the like.
In another implementation, one or more aspects of the execution device 310 may be implemented by each local device; for example, the local device 401 may provide local data to, or feed back calculation results to, the execution device 310. The local device may also be referred to as a computing device.
It should be noted that all functions of the execution device 310 may also be implemented by a local device. For example, the local device 401 may implement the functions of the execution device 310 and provide services to its own users, or provide services to users of the local device 402.
Neural networks have achieved great success in many visual tasks such as image recognition and object detection. Advances in neural network architecture have greatly improved the performance of network models and promoted the effective deployment and development of neural networks in practical applications with different requirements.
To apply large neural networks on different hardware devices, parameters such as the network depth, the network width, the input image resolution, the convolution kernel size, and the number of convolution kernel groups can be adjusted to reduce the memory footprint and running latency of the model. However, tuning these parameters manually requires considerable manpower, is an inefficient way to obtain an optimal model, and may fail to produce a neural network with balanced performance. Therefore, this application provides a neural network construction method that builds a mapping relationship between the parameters of a neural network and hardware optimization indicators (such as the amount of computation, the number of parameters, or the amount of memory occupied), so that, given a hardware constraint range, suitable parameters can be obtained and a better neural network can be obtained efficiently.
The neural network construction method provided by this application is described in detail below with reference to the aforementioned neural networks and system architectures.
Referring to FIG. 5, a schematic flowchart of a neural network construction method provided by this application is described as follows.
501. Sample at least one set of first parameter combinations from a preset search space.
The search space may include value ranges of a plurality of parameters used in constructing a neural network. Specifically, the search space may include one or more of: width, depth, resolution, convolution kernel size, or the size of a group of convolution kernels. The width is the number of basic units in each layer of the neural network, the depth is the number of network layers of the neural network, and the resolution is the resolution of the image input to the neural network. Each set of first parameter combinations may include one or more of these parameters.
Sampling in the search space may be random, or may follow a distribution, and may be adjusted according to the actual application scenario; this is not limited in this application. For example, m sets of parameter combinations may be randomly sampled from the search space, where each set may include values of one or more of depth, width, resolution, convolution kernel size, or the size of a group of convolution kernels. As another example, taking the sampling of width, a sampling probability may be determined according to the distribution of widths: width ranges with a denser distribution have a correspondingly higher sampling probability, and width ranges with a sparser distribution have a correspondingly lower sampling probability.
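As an illustrative sketch of the random-sampling case, the snippet below draws m first parameter combinations from a small search space; the parameter names and value ranges here are hypothetical examples, not values taken from the embodiments.

```python
import random

# Hypothetical search space: value ranges of the parameters used when
# constructing a neural network (depth, width, input resolution, kernel size).
SEARCH_SPACE = {
    "depth": list(range(4, 25)),             # number of network layers
    "width": list(range(8, 65)),             # basic units per layer
    "resolution": [96, 128, 160, 192, 224],  # input image resolution
    "kernel_size": [3, 5, 7],                # convolution kernel size
}

def sample_first_parameter_combinations(m, seed=0):
    """Randomly sample m first parameter combinations, one value per parameter."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
        for _ in range(m)
    ]

combos = sample_first_parameter_combinations(5)
print(len(combos))  # 5 sampled combinations
```

Distribution-aware sampling would replace `rng.choice` with a weighted choice whose weights follow the observed distribution of each parameter.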
502. Construct a plurality of first neural networks according to the at least one set of first parameter combinations.
After the at least one set of first parameter combinations is obtained by sampling, a plurality of first neural networks are constructed based on the parameters included in the first parameter combinations.
Specifically, based on the parameters included in a first parameter combination, preset basic units may be stacked to obtain a first neural network, or a given base network may be adjusted based on those parameters.
For example, a first parameter combination may include values of depth, width, and resolution, where the depth is 10, the width is 20, and the resolution is 100*200. A first neural network is then constructed with 10 network layers, each layer including 20 basic units. The resolution may be the size of the input image that the first neural network can process, and the size of the pooling layer may be constructed according to the resolution.
As another example, a base network may be given in advance, which may be a CNN, a ResNet, an RNN, or the like. Taking a CNN as an example: after a first parameter combination is obtained by sampling, the number of network layers of the CNN and the number of basic units in each layer are adjusted to the values in the first parameter combination according to the depth and width included in that combination, yielding an adjusted CNN.
Usually, the number of first neural networks is not less than the number of first parameter combinations. In some scenarios, the number of first parameter combinations equals the number of first neural networks; for example, if 5 sets of parameter combinations are obtained by sampling, 5 first neural networks can be constructed accordingly. In other scenarios, the number of first neural networks may exceed the number of first parameter combinations; for example, if 2 sets of first parameter combinations are obtained by sampling, different basic units or operators may be used when constructing the first neural networks according to the depth and width included in those combinations, so that, for the same depth and width, 2 or more first neural networks with different structures can be obtained.
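The stacking of preset basic units can be sketched as follows; the network is represented only as a structural description (a stand-in for a framework-level model), and all field names are illustrative assumptions.

```python
def build_first_network(params):
    """Construct a first-neural-network description by stacking a preset
    basic unit: `depth` layers, each containing `width` basic units."""
    layers = [
        {"layer": i, "units": params["width"], "kernel": params.get("kernel_size", 3)}
        for i in range(params["depth"])
    ]
    return {"layers": layers, "input_resolution": params["resolution"]}

# One first parameter combination, using the values from the example above:
# depth 10, width 20, input resolution 100*200.
net = build_first_network({"depth": 10, "width": 20, "resolution": (100, 200)})
print(len(net["layers"]))  # 10 layers, each with 20 basic units
```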
503. Generate a mapping relationship, where the mapping relationship is a mapping relationship between the plurality of parameters and the evaluation results of the plurality of first neural networks.
After the plurality of first neural networks are obtained, their structures may be evaluated to obtain an evaluation result for each first neural network, and a mapping relationship between the parameters and the evaluation results is generated according to the parameter combination and evaluation result of each first neural network; usually, neural networks with different structures may have different evaluation results. For example, the parameter combination of a first neural network may include depth and width, and the evaluation result may include the flops of the first neural network, in which case the mapping relationship may include the relationship between depth, width, and flops.
In a possible implementation, the evaluation result may include one or more of the following: the flops of each first neural network, the running duration of one forward inference of each first neural network, the amount of memory occupied by running each first neural network, or the number of parameters of each first neural network. The flops, forward-inference running duration, or occupied memory of each first neural network may be obtained by running the first neural network on a computing device. For example, if the computing device is a terminal device, the first neural network may be run on that terminal device, and the process and results of running it may be measured, so as to obtain the flops of the first neural network when running on the terminal device, the running duration of one forward inference, the amount of terminal-device memory occupied, and so on. The number of parameters of the first neural network, or other quantities directly related to its structure, can be obtained by directly evaluating the first neural network, for example by directly counting the parameters it includes.
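For structure-derived evaluation results such as the parameter count and flops, a rough cost model can be computed directly from the network description. The plain conv-stack model below is a deliberate simplification for illustration, not the evaluation procedure of the embodiments.

```python
def evaluate_structure(depth, width, resolution, kernel_size=3):
    """Rough static evaluation of a plain convolutional stack: each of
    `depth` layers maps `width` channels to `width` channels with a
    kernel_size x kernel_size kernel, so one layer holds
    width * width * k * k weights and performs that many multiply-accumulates
    per output pixel (counted as 2 flops per multiply-accumulate)."""
    weights_per_layer = width * width * kernel_size * kernel_size
    n_params = depth * weights_per_layer
    flops = 2 * n_params * resolution * resolution
    return {"params": n_params, "flops": flops}

result = evaluate_structure(depth=10, width=20, resolution=128)
print(result["params"])  # 36000
```

Runtime-dependent results (running duration, occupied memory) cannot be derived this way; as the text notes, they are measured by actually running the network on the computing device.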
In a possible implementation, the plurality of first neural networks may also be trained using a preset first data set to obtain a plurality of trained first neural networks. The results of the trained first neural networks may then be evaluated to obtain an evaluation result for each trained first neural network; alternatively, a data set may be used as the input of each trained first neural network to calculate its output accuracy. At least one second neural network is then selected from the plurality of first neural networks according to the evaluation result or output accuracy of each trained first neural network, and the mapping relationship is generated according to the parameter combination and evaluation result corresponding to each second neural network.
Specifically, the relationship between the parameter combination corresponding to each second neural network in the at least one second neural network and the evaluation result of that second neural network may be fitted to obtain the mapping relationship. The mapping relationship may be linear or non-linear.
For example, after m parameter combinations are obtained, m first neural networks are constructed based on them (m ≥ 2), and the m first neural networks are trained using a data set to obtain m trained first neural networks. From these m first neural networks, the n second neural networks with the best evaluation results or the best output accuracy may be selected (n ≤ m). The relationship between the parameter combination and the evaluation result of each of the n second neural networks is then fitted to obtain a curve relationship; alternatively, a mapping table between the parameter combination and the evaluation result of each second neural network may be generated.
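A minimal fitting sketch, assuming a linear relationship between a single structural parameter (depth) and a measured evaluation result (running duration); the measurement values below are made up for illustration.

```python
def fit_linear_mapping(xs, ys):
    """Ordinary least squares for y ≈ a*x + b, in pure Python."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical (depth, measured running duration in ms) pairs for the
# selected second neural networks.
depths = [4, 8, 12, 16, 20]
latency_ms = [2.1, 4.0, 6.2, 7.9, 10.1]

a, b = fit_linear_mapping(depths, latency_ms)
# Inverting the fitted mapping answers: what depth still fits an 8 ms budget?
max_depth_for_8ms = (8.0 - b) / a
```

A non-linear mapping would replace the closed-form line fit with, for example, a polynomial or log-linear fit over the same data.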
504. Obtain a constraint range corresponding to the computing device.
The constraint range may include a numerical range that identifies the computing capability of the computing device, in other words, a value range of an indicator associated with the computing capability of the computing device; the constraint range may affect the structure of the neural network that the computing device can carry. The constraint range may be a range related to the data types included in the aforementioned evaluation results, that is, data for an indicator representing the computing capability of the computing device. For example, if the constraint range is a numerical range not exceeding the maximum running duration of a neural network that the computing device can carry, the aforementioned evaluation results may include the running duration of each first neural network. As another example, if the constraint range is a flops range, the aforementioned evaluation results may include the flops value of each first neural network.
It should be noted that the computing device mentioned in this application may be a physical device (also referred to as hardware), or a virtual device, such as a device obtained through simulation. For ease of understanding, the following takes hardware as an example of a computing device; the hardware mentioned below may equally be replaced by a computing device or a virtual device, which will not be repeated below.
The computing capability of hardware can be measured by a variety of parameters, such as the amount of available memory, the running duration, the supportable amount of computation, or the supportable number of parameters. For example, when a target neural network needs to be constructed for certain hardware, the constraint range may be the value range of the amount of computation of the target neural network that the hardware can carry, the value range of the running duration, the amount of available memory, or the like, where the amount of computation can be measured in flops or multiply-accumulate (MACC) operations.
In a possible implementation, if the neural network construction method provided by this application is executed by a cloud-side device or a device-side device, that device may extract information about its own hardware to learn its own computing capability, and thereby obtain the constraint range for the neural network that needs to run on it. For example, a cloud-side device may extract the flops range supported by its own hardware, the amount of available memory, and so on, and use these as the hardware constraint range.
In a possible implementation, if the neural network construction method provided by this application is executed by a cloud-side device, the server may receive user input data sent by a device-side device, and then obtain the constraint range based on that user input data. The user input data may be information entered by the user through the interactive interface of the device-side device, requesting the cloud-side device to construct, according to the entered information, a target neural network adapted to the computing capability of the device-side device or of another device. For example, the user may enter, in the interactive interface provided by the device-side device, one or more ranges within the computing capability of the hardware, such as a flops range or an amount of occupied memory, so that the cloud-side device uses these ranges as the constraint range for the hardware. Alternatively, the user may directly enter hardware identification information such as a hardware model, a hardware identification number, or a hardware name, so that the cloud-side device can identify the computing capability of the hardware according to that identification information and use the range of that computing capability as the constraint range.
In addition, if the constraint information is obtained according to user input data, the type of target neural network to be constructed may also be determined according to that user input; for example, the user may request, through the user input data, construction of a neural network for image classification or object detection.
Specifically, for example, the user input information may directly include the constraint range, or may include the identification information of the hardware; after receiving the user input data, the cloud-side device identifies the computing capability of the hardware according to the identification information and determines the corresponding constraint range according to that computing capability. For example, the user may enter, in the interactive interface of the device-side device, the model of the central processing unit (CPU) of the terminal on which the target neural network needs to run, and this is sent to the cloud-side device through the device-side device. After receiving the user input information, the cloud-side device extracts, from a local database, the flops range corresponding to that CPU model, that is, the constraint range.
It should also be noted that step 504 in this embodiment of the application may be performed before or after step 501, and may be adjusted according to the actual application scenario; this is not limited in this application. When step 501 is performed first, a mapping relationship can be constructed between the parameters that affect the structure of the neural network, such as depth, width, or the resolution of the input image, and the parameters related to hardware performance, such as running duration, memory occupancy, or flops; the constraint range can then be any range of those hardware-performance-related parameters, such as a range of running durations, a range of memory occupancy, or a range of flops. This allows the mapping relationship to be reused whenever a neural network that runs on hardware subsequently needs to be constructed, improving the efficiency of constructing neural networks. If step 504 is performed first, a mapping relationship can be constructed only between the parameters that affect the structure of the neural network and the parameters corresponding to the constraint range. For example, if the constraint range is a constraint range of running duration, a mapping relationship is constructed between the structural parameters (depth, width, or the resolution of the input image) and the running duration; if the constraint range is a range of flops, a mapping relationship is constructed between the structural parameters and the flops. Mapping relationships between additional parameters then need not be constructed, reducing the workload.
505. Obtain, according to the mapping relationship, a second parameter combination corresponding to the constraint range.
After the mapping relationship and the constraint range are determined, the parameter combination corresponding to the constraint range is calculated through the mapping relationship. For example, the mapping relationship may be the relationship between parameters such as depth, width, and the resolution of the image input to the first neural network, and flops; the constraint range may include the flops range of the terminal set by the user. Substituting the terminal's flops range into the mapping relationship yields one or more sets of parameter combinations, including parameters such as depth, width, and the resolution of the image input to the first neural network, corresponding to that flops range.
For ease of understanding, the mapping relationship can be understood as a relationship between independent and dependent variables. The independent variables may include parameters such as depth, width, and the resolution of the image input to the first neural network, and the dependent variables may include quantities such as flops, running duration, or occupied memory. After the constraint range is obtained, it can be regarded as the dependent variable; substituting it into the mapping relationship allows the independent variables to be derived in reverse, that is, a parameter combination including parameters such as depth, width, and the resolution of the image input to the first neural network. Of course, flops, running duration, occupied memory, and the like may instead serve as independent variables, with depth, width, and the resolution of the input image as dependent variables; this can be adjusted according to the actual application scenario, and the above is merely an exemplary illustration, not a limitation.
In a possible implementation, the mapping relationship may also be a mapping table, for example a table that records the mapping of parameters such as depth, width, and the resolution of the image input to the first neural network to values such as flops, running duration, or occupied memory. After the constraint range is determined, one or more sets of parameter combinations corresponding to the constraint range can be looked up in the mapping table.
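The mapping-table form of the lookup can be sketched as follows; the table entries and the flops budget are hypothetical values for illustration.

```python
# Hypothetical mapping table: parameter combination -> evaluated flops.
MAPPING_TABLE = [
    ({"depth": 8,  "width": 16, "resolution": 128}, 0.4e9),
    ({"depth": 12, "width": 24, "resolution": 160}, 1.6e9),
    ({"depth": 16, "width": 32, "resolution": 192}, 4.8e9),
    ({"depth": 20, "width": 48, "resolution": 224}, 12.5e9),
]

def lookup_second_parameter_combinations(max_flops):
    """Return every parameter combination whose evaluated flops fall within
    the constraint range (here: at most max_flops)."""
    return [params for params, flops in MAPPING_TABLE if flops <= max_flops]

within_budget = lookup_second_parameter_combinations(max_flops=2e9)
print(len(within_budget))  # the first two entries satisfy the constraint
```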
506、根据第二参数组合得到目标神经网络。506. Obtain the target neural network according to the second parameter combination.
在得到一组或者多组第二参数组合之后,即可基于该一组或者多组第二参数组合来构建神经网络,得到目标神经网络。After one or more sets of second parameter combinations are obtained, a neural network can be constructed based on the one or more sets of second parameter combinations to obtain a target neural network.
具体地,第二参数组合中可以包括深度、宽度、输入至第一神经网络的图像的分辨率、卷积核的大小或者卷积核的group的数量等,可以基于该第二参数组合中所包括的参数,以及给定的基础单元,构建得到目标神经网络。Specifically, the second parameter combination may include depth, width, the resolution of the image input to the first neural network, the size of the convolution kernel, or the number of groups of the convolution kernel, etc., which may be based on the second parameter combination. The parameters included, and given the base unit, are constructed to obtain the target neural network.
In one scenario, multiple third neural networks can be constructed from one or more sets of second parameter combinations, and all of them can be trained using a preset second data set to obtain trained third neural networks. The optimal neural network can then be selected from the trained third neural networks as the target neural network. For example, the neural network with the highest output accuracy may be selected as the target neural network, or a neural network whose ratio of output accuracy to computation amount falls within a certain range may be selected, and so on.
In a possible scenario, multiple third neural networks can be constructed from one or more sets of second parameter combinations, and the optimal neural network can be selected directly from them as the target neural network. For example, the neural network with the lowest FLOPs may be selected, or the neural network with the shortest running time for a single forward inference, or the neural network that occupies the least memory, and so on.
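The selection criteria described in the two scenarios above can be sketched as follows; the candidate records and their numbers are illustrative assumptions:

```python
# Each candidate third neural network is summarized by its evaluation results.
# All numbers are invented for illustration.
candidates = [
    {"name": "net_a", "accuracy": 0.91, "flops": 600e6, "latency_ms": 30.0},
    {"name": "net_b", "accuracy": 0.88, "flops": 300e6, "latency_ms": 14.0},
    {"name": "net_c", "accuracy": 0.93, "flops": 900e6, "latency_ms": 52.0},
]

# Criterion 1: highest output accuracy among the trained candidates.
best_by_accuracy = max(candidates, key=lambda c: c["accuracy"])

# Criterion 2: lowest FLOPs, selected directly without training.
best_by_flops = min(candidates, key=lambda c: c["flops"])

# Criterion 3: shortest running time for a single forward inference.
best_by_latency = min(candidates, key=lambda c: c["latency_ms"])
```

Which criterion applies depends on the deployment goal: criterion 1 favors quality, while criteria 2 and 3 favor the hardware budget.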
In another scenario, if only one set of second parameter combinations is obtained and a single neural network is constructed, that neural network can be used directly as the target neural network.
Optionally, when the target neural network is obtained, if it has not yet been trained, it can also be trained using a preset second data set to obtain a trained target neural network.
More specifically, the target neural network mentioned in this application can be used for one or more of feature extraction, semantic segmentation, classification, super-resolution, or object detection. For example, after the backbone network is obtained by scaling the basic network, different headers can be added to implement different functions, so that the resulting neural network can provide one or more functions.
In a possible implementation, if a basic network is preset, the second parameter combination can be used to adjust that basic network to obtain the final target neural network. For example, the user may specify a basic network through input data, such as a CNN, an RNN, or a ResNet; the second parameter combination may include parameters such as the depth, the width, or the input image resolution, and the width, the depth, or the pooling-layer size of the CNN, RNN, or ResNet is adjusted according to those parameters to obtain the adjusted network, that is, the target neural network. Therefore, in the embodiments of this application, the model can be scaled with the computing capability of the hardware as a constraint to obtain an optimal model adapted to the hardware.
Therefore, in the embodiments of this application, at least one set of parameter combinations can be obtained by searching the search space, multiple neural networks can be obtained based on those parameter combinations, and a mapping relationship can be generated based on the parameter combinations and the evaluation results of the structures of the multiple neural networks. A parameter combination corresponding to the hardware constraint range can then be obtained from the mapping relationship, so that an optimal model adapted to the hardware is obtained from that parameter combination. In other words, the model can be scaled with the computing capability of the hardware as a constraint to obtain an optimal model adapted to the hardware.
The foregoing describes the flow of the neural network construction method provided by this application. The method is described in more detail below with reference to the flow in FIG. 5 and specific application scenarios.
In a possible implementation, the neural network construction method provided by this application may be executed by a server, a terminal, or another device. Taking a server as an example, if the server itself needs to build a target neural network to be deployed on the server, it can obtain its own hardware information, such as the CPU model, the amount of available memory, or the required running time, generate a hardware constraint range from that information, and build a model within that constraint range, or adjust the basic network, to obtain a target neural network adapted to the hardware. As another example, a terminal-side device may adjust the model based on the terminal-side hardware constraints and use the adjusted model on the terminal side, so that hardware resources are used more reasonably. After obtaining the basic network and the corresponding parameter information, the terminal-side device obtains the relationship between FLOPs and the hyperparameters for that family of basic networks, and also obtains the local hardware information; according to the hardware model, it retrieves a lookup table from a database to map the latency requirement to target FLOPs. Based on the basic model and the target FLOPs value, the hyperparameter values, such as the width, the depth, and the input resolution, are determined. Based on the basic model and the hyperparameter values, the basic network is scaled to obtain a model adapted to the local hardware, which is then trained and used locally.
In other possible scenarios, the neural network construction method provided by this application may be executed by a cloud-side device. The constraint range for the target neural network may be sent by the terminal-side device to the cloud-side device; the cloud-side device then generates the mapping relationship from the data sent by the terminal-side device, computes the optimal parameter combination, and constructs a neural network adapted to the hardware according to that combination.
For ease of understanding, referring to FIG. 6, the flow of the neural network construction method provided by this application in this scenario is described by way of example. Parts similar to the flow in FIG. 5 above are not repeated; only the additional details are explained below.
601. The terminal-side device sends user input information to the cloud-side device.
The terminal-side device may be a mobile phone, a tablet personal computer (TPC), a media player, a smart TV, a laptop computer (LC), a personal digital assistant (PDA), a personal computer (PC), a camera, a video camera, a smart watch, a wearable device (WD), a self-driving vehicle, or the like.
The terminal-side device may generate the user input information from its own hardware information, or may provide an interactive interface through which the user enters the input data. For example, if the terminal-side device needs a neural network for classifying a photo album, it can generate the user input information based on the type of neural network to be built and its own hardware information, such as the CPU model, the NPU model, the amount of available memory, or the required running time of the neural network, and then send the information to the cloud-side device over a wired or wireless network. Alternatively, the user may enter, in the interactive interface of the terminal-side device, the hardware information of another terminal device on which the network is to be deployed, together with the type of neural network. For example, the terminal-side device may be the user's personal computer, and the user may enter the hardware information of a mobile phone in its interactive interface to request that a neural network adapted to the phone's hardware be built for classifying the phone's photo album.
602. The cloud-side device extracts the constraint range.
The cloud-side device may include a server, a personal computer, a computer workstation, or the like. After receiving the user input data, the cloud-side device extracts information from it to obtain the constraint range. The constraint range lies within the computing capability of the hardware; generally, the higher the computing capability of the hardware, the higher the upper limit of the constraint range, that is, the upper limit of the constraint range is positively correlated with the computing capability of the hardware. The computing capability of the hardware can be measured by FLOPs, the supported running time of a neural network, the amount of available memory, the number of neural network parameters that can be carried, and so on.
Specifically, the user input information may directly include a value range, which can be used directly as the constraint range. For example, the user input information may directly include ranges of hardware-related parameters such as a FLOPs range, a running-time range, or a memory-occupancy range, and the cloud-side device can extract the constraint range directly from the user input data.
The user input information may also include hardware identification information, such as the model of the CPU or the NPU of the terminal-side device. After receiving the user input information, the cloud-side device can extract, from locally stored data and according to the hardware identification information, the constraint range for that hardware, such as the FLOPs range the CPU can support and the supported memory occupancy, that is, information representing the computing capability of the CPU.
603. The cloud-side device samples the preset search space to obtain m groups of first parameter combinations.
After extracting the hardware constraint from the user input information, the cloud-side device samples the preset search space to obtain m groups of first parameter combinations. Each group includes parameters that affect the structure of the neural network, such as the depth, the width, or the resolution of the input image, so that the first neural networks can be constructed subsequently.
For example, if the search space includes the numerical range [5, 59], a set of width, depth, and resolution values, such as [15.2, 34.5, 55.6], can be sampled from that range.
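A minimal sketch of this sampling step, following the example range above; the use of independent uniform sampling is an assumption:

```python
import random

SEARCH_RANGE = (5.0, 59.0)   # value range of the search space, as in the example
PARAMS = ("width", "depth", "resolution")

def sample_combination(rng):
    """Sample one first parameter combination, e.g. [15.2, 34.5, 55.6]."""
    return {p: round(rng.uniform(*SEARCH_RANGE), 1) for p in PARAMS}

def sample_m_combinations(m, seed=0):
    """Draw m parameter combinations from the search space."""
    rng = random.Random(seed)
    return [sample_combination(rng) for _ in range(m)]

combos = sample_m_combinations(4)
```

In practice each parameter could have its own range inside the search space; a single shared range is used here only to keep the sketch short.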
604. The cloud-side device constructs m first neural networks.
After the m groups of first parameter combinations are obtained, at least m first neural networks can be constructed.
Specifically, if a basic network exists, it can be adjusted directly on the basis of that network to obtain the m first neural networks. If there is no basic network, the basic units can be used directly for construction to obtain the m first neural networks. For details, refer to step 502 above, which is not repeated here.
605. The cloud-side device selects n second neural networks from the m first neural networks.
After the m first neural networks are obtained, they can be trained, for example using a large amount of collected data or the ImageNet data set, to obtain m trained first neural networks. The structures or output accuracies of the trained networks are then evaluated, and the optimal n second neural networks are selected. For example, the n neural networks with the highest output accuracy may be selected from the m first neural networks, or the n neural networks with the optimal structure, or the n neural networks with the smallest ratio of FLOPs to output accuracy, and so on.
606. The cloud-side device fits the relationship between the parameter combinations of the n second neural networks and the evaluation results to obtain the mapping relationship.
After the cloud-side device selects the optimal n second neural networks from the m first neural networks, it fits the relationship between the parameter combinations of those n networks and the evaluation results, thereby obtaining the mapping relationship.
Specifically, a linear relationship may be used for fitting, or Gaussian fitting may be performed, among other options; this can be adjusted according to the actual application scenario and is not limited in this application.
For ease of understanding, the embodiments of this application can be understood as follows: several different parameter combinations are sampled from the model parameters included in the search space, and the parameters of the basic model are changed according to those combinations to construct several new models. These models are then trained and tested on a data set to obtain indicators such as output accuracy or computation amount. According to those indicators, the models that meet the requirements (for example, high accuracy with low computation) are selected, giving the relationship between the model parameters and the computation amount. This relationship is fitted with a formula (for example, Gaussian process regression) to obtain a fitted curve.
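A minimal sketch of this select-then-fit step. The evaluation numbers and the requirement threshold are invented, and a quadratic least-squares fit stands in for the Gaussian process regression mentioned above:

```python
import numpy as np

# Evaluation results for the sampled models (all numbers invented):
# each row is (parameter value, FLOPs in millions, validation accuracy).
models = np.array([
    [0.50, 150.0, 0.70],
    [0.75, 260.0, 0.74],
    [1.00, 400.0, 0.77],
    [1.25, 580.0, 0.79],
    [1.50, 800.0, 0.66],  # over-scaled model: high cost, poor accuracy
])

# Select models that meet the requirement "high accuracy, low computation":
# here, accuracy per unit of computation above a hypothetical threshold.
threshold = 0.75 / 580.0
keep = models[models[:, 2] / models[:, 1] > threshold]

# Fit the relationship between the kept models' parameter value and
# computation amount. The application mentions e.g. Gaussian process
# regression; a quadratic least-squares fit stands in for it in this sketch.
coeffs = np.polyfit(keep[:, 1], keep[:, 0], deg=2)

def param_for_flops(flops):
    """Fitted curve: input a computation budget, output a parameter value."""
    return float(np.polyval(coeffs, flops))
```

Once fitted, the curve is queried in the opposite direction of the evaluation: a computation budget goes in, a structural parameter value comes out.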
Generally, some common neural network structures, such as ResNet and MobileNet, can be added to a basic model library, and the corresponding curves can be fitted in advance through the flow provided by this application. When the basic model input by the user is already in the basic model library, the fitted curve can be retrieved directly without repeating the above flow, which accelerates the service.
607. The cloud-side device obtains the parameter combination corresponding to the constraint range according to the mapping relationship, and constructs the target neural network.
After the fitted curve between the parameters and the computation amount is obtained, the computation amount required for building the model (that is, the target neural network) is input and the parameter values are output. According to the hardware resource limits of the actual application (such as the model's computation budget), the computation amount is fed into the above formula, which outputs one or more parameter values; the basic model is then scaled according to those values, or the model is rebuilt, to obtain the target neural network.
608. The cloud-side device sends the target neural network to the terminal-side device.
After the cloud-side device obtains the target neural network, the network can be deployed on the terminal-side device. For example, the terminal-side device can store the target neural network in a storage medium, so that it can use the target network to perform related tasks.
Specifically, the cloud-side device can send the hyperparameters and weight parameters of the target neural network to the terminal-side device, so that the terminal-side device can construct the target neural network locally from those hyperparameters and weight parameters, completing the deployment of the target neural network and enabling the terminal-side device to run a target neural network adapted to the computing capability of its hardware.
Optionally, the target neural network may be trained by the cloud-side device, and the trained target neural network deployed on the terminal-side device, so that the terminal-side device can use it directly without training again, improving the working efficiency of the terminal-side device.
Therefore, this application provides a model scaling service adapted to the user's hardware constraints. The user only needs to input the required model function (or provide a basic model), the hardware model, and the speed requirement, and a new model satisfying the hardware constraints is output; a model better adapted to the hardware can thus be obtained efficiently, improving the user experience. For example, the user equipment sends the model function (or a specified model), the hardware model, or the latency requirement to a cloud-side device such as a cloud server. The cloud server determines the basic model from the input, for example by selecting it from a model library. According to the hardware model (which selects the computation range), a lookup table is retrieved from the database, the latency requirement is mapped to a specific FLOPs interval, and the upper limit or the mean of the interval is taken as the input FLOPs value. Based on the basic model and the FLOPs value, parameter values such as the depth, the width, or the resolution of the input image, also called hyperparameter values, are determined. Based on the basic model and the hyperparameter values, an optimized model is obtained through training and fed back to the user equipment.
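The lookup-table step that maps a latency requirement to a FLOPs interval for a given hardware model might look like the following sketch; the hardware identifier and the table entries are invented for illustration:

```python
# Hypothetical per-hardware lookup tables: latency bound (ms) -> FLOPs interval.
# Intervals are (lower, upper) in FLOPs; entries are sorted by latency bound.
LOOKUP_TABLES = {
    "soc_x1": [  # hypothetical hardware model identifier
        (10.0, (50e6, 150e6)),
        (25.0, (150e6, 400e6)),
        (60.0, (400e6, 900e6)),
    ],
}

def latency_to_flops(hardware_model, latency_ms, use_mean=False):
    """Map a latency requirement to an input FLOPs value: the upper limit of
    the matched interval, or the mean of the interval."""
    for max_latency, (lo, hi) in LOOKUP_TABLES[hardware_model]:
        if latency_ms <= max_latency:
            return (lo + hi) / 2 if use_mean else hi
    raise ValueError("latency requirement exceeds table range")

flops_budget = latency_to_flops("soc_x1", 20.0)       # upper limit: 400e6
flops_mean = latency_to_flops("soc_x1", 20.0, True)   # interval mean: 275e6
```

The resulting FLOPs value is what the fitted curve consumes in the next step to produce the hyperparameter values.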
The neural network construction method provided by this application can be applied to a variety of scenarios, some of which are described below by way of example.
FIG. 7 is a schematic diagram of an application scenario of the neural network construction method provided by this application.
A basic network and hardware constraints can be determined. The basic network may be a neural network provided by the user, or a neural network selected from a variety of neural networks, such as a CNN, a ResNet, or an RNN.
The hardware constraints may be determined according to the computing capability of the hardware. For example, a hardware constraint may be the FLOPs range of the hardware, the running time of the neural network it can carry, or the amount of available memory.
The basic network and the hardware constraints are used as the input of a model scaling server, which outputs one or more models adapted to the hardware. The model scaling server can be used to execute the flow in FIG. 5 above to obtain the final model or models adapted to the hardware. For example, the neural network models of different sizes obtained by the model scaling server can be deployed, according to hardware resource constraints, on mobile terminals, control devices, self-driving vehicles, and so on.
Specifically, the model scaling server can sample multiple groups of parameter combinations from the search space and then deform the basic network based on those combinations, for example by adjusting its depth, its width, or the number or size of its pooling layers, to obtain multiple neural networks. These neural networks can then be run on the hardware, and their execution evaluated to obtain their running time on the hardware, the amount of memory they occupy, their FLOPs, and so on. A mapping relationship is then built between the parameters in the combinations, such as the depth, the width, the resolution of the input image, or the size of the convolution kernel, and the running time, the memory occupancy, the FLOPs, and so on. The hardware constraints may include one or more of running time, memory occupancy, FLOPs, and the like; substituting the hardware constraints into the mapping relationship yields one or more groups of parameter combinations, which may include values for parameters such as the depth, the width, the resolution, or the size of the convolution kernel. The basic network is then scaled based on those groups of parameter combinations to obtain one or more neural networks.
If multiple neural networks are obtained by scaling the basic network, the optimal neural network can be selected from them as the optimal model. For example, the neural network with the highest output accuracy may be selected as the optimal model, or, at the same output accuracy, the neural network with less computation or a shorter running time may be selected. Therefore, in this application scenario, the hardware constraints and the basic network can be input and a model adapted to the hardware is output, completing the model scaling quickly and accurately and efficiently obtaining the optimal model adapted to the hardware.
In another specific scenario, the obtained optimal model can be used for image recognition, for example to classify images. The optimal model can be applied to a terminal. As shown in FIG. 8, the terminal used by the user may contain multiple images, and a model deployed on the terminal and adapted to its hardware can be used to classify those images and arrange them by category. As shown in FIG. 8, the images of dolphins and the images of cats are classified separately and displayed in the display interface by category, so that the user can quickly find images in the album. For example, users store a large number of pictures on mobile phones and cloud disks, and classified management of the photo album improves the user experience. As shown in FIG. 8, by using, from the series of convolutional neural network models provided by this application, the network model that matches the computing resources of the current mobile phone, the phone can classify and manage the different categories of pictures in its album through image recognition, making searching convenient for the user, saving the user's management time, and improving the efficiency of album management.
In another specific application scenario, as shown in FIG. 9, the optimal model obtained by this application may include a feature extraction network. Mask RCNN can be regarded as an instance segmentation architecture, into which the feature extraction network can be embedded for feature extraction. The architecture may also include a region proposal network (RPN) and a pooling layer (the region-of-interest pooling layer RoIPool shown in FIG. 9). The purpose of RoIPool is to derive smaller feature maps from the regions of interest (ROIs) determined by the RPN. After an input image is fed into the Mask RCNN, the targets in the image can be detected; for example, the animals in the input image shown in FIG. 9 can be recognized efficiently. Therefore, through the method provided by this application, a feature extraction network adapted to the hardware can be obtained, so that feature extraction is performed efficiently within the carrying capacity of the hardware.
More specifically, taking a specific scenario as an example, a more detailed flow of the neural network construction method provided by this application may be as shown in FIG. 10.
First, the search space 1001 may include the value ranges of parameters such as the depth, the width, the resolution of the input image, the size of the convolution kernel, or the number of groups of the convolution kernel.
Sampling is performed from the search space 1001 to obtain m groups of parameter combinations. Each group may include values for parameters such as the depth, the width, the resolution of the input image, the size of the convolution kernel, or the number of groups of the convolution kernel.
After the m groups of parameter combinations are obtained, if a basic network exists, it can be adjusted based on those combinations, changing its depth, its width, the size of its pooling layers, the size of its convolution kernels, or the number of its convolution-kernel groups, to obtain m first neural networks 1003. For example, if the initial depth of the basic network, that is, the number of layers, is 10 and the depth value in a parameter combination is 15, 5 layers can be added to the basic network to obtain a neural network with 15 layers. Or, if the width of each layer of the basic network is 8, that is, each layer includes 8 basic units, and the width in the parameter combination is 16, 8 basic units can be added to each layer to obtain a basic network with a width of 16. If there is no basic network, the m first neural networks 1003 can be constructed directly.
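The adjustments in the example above (depth 10 → 15 adds 5 layers; width 8 → 16 adds 8 basic units per layer) can be sketched as follows; the dict-based description of a network's structure is an assumption made for illustration:

```python
# Structural description of the basic network (hypothetical representation).
base_network = {"depth": 10, "width": 8, "pool_size": 2}

def scale_network(base, combo):
    """Return a copy of the basic network with the sampled combination applied.
    E.g. depth 10 -> 15 adds 5 layers; width 8 -> 16 adds 8 units per layer."""
    scaled = dict(base)
    scaled.update({k: v for k, v in combo.items() if k in scaled})
    return scaled

# Each sampled parameter combination yields one first neural network.
combos = [{"depth": 15, "width": 16}, {"depth": 12, "width": 8}]
first_networks = [scale_network(base_network, c) for c in combos]
```

Parameters not present in a combination (here, the pooling-layer size) are kept from the basic network unchanged.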
After the m first neural networks are obtained, they can be trained using the preset data set 1004 to obtain the m trained neural networks 1005.
Then, from the m trained neural networks 1005, the n second neural networks with the best computational performance are selected. The best performance may mean high output accuracy with low computation, high output accuracy with a short running time, and so on.
Gaussian regression fitting is then performed on the relationship between the parameters of the n neural network structures and the evaluation results, such as the computation amount, the running time, or the memory occupancy, to obtain the mapping relationship 1007.
Then, after the constraint range 1008 is obtained, one or more groups of parameter combinations 1009 are computed based on the mapping relationship 1007, and the target neural network 1010 is constructed based on the one or more groups of parameter combinations 1009.
Optionally, after the target neural network 1010 is obtained, it can also be trained to obtain the trained target neural network, which is then deployed on the hardware.
For example, GhostNet, whose computation amount is 591M FLOPs, can be used as the basic network for an image classification task. The search space includes the width w (number of channels), the depth d (number of layers), and the input image resolution r of the neural network. w, d, and r are randomly sampled from the interval [0.25, 4] to obtain m parameter combinations, such as (w=0.33, d=0.78, r=2.33). The width, the depth, and the input resolution of the basic model are changed according to each parameter combination to obtain m new models, and the computation amount of each new model is calculated at the same time. Model training is then performed on the ImageNet data set or a subset of it; after training, the recognition accuracy is tested on the validation set, giving the accuracies corresponding to the m new models.
According to the computation cost and the accuracy, m models on the Pareto front (m ≤ n) are selected from the n new models; these m models are the relatively strong ones, and the relationship between their (w, d, r) parameters and their computation cost is fitted from them. For example, the computation cost can be measured in FLOPs, and the relationships between w, d, r and FLOPs can be constructed separately. Illustratively, the relationship between width and FLOPs may be as shown in FIG. 11A, the relationship between depth and FLOPs may be as shown in FIG. 11B, and the relationship between resolution and FLOPs may be as shown in FIG. 11C. Gaussian process regression can be used to fit the above relationships between w, d, r and FLOPs. Taking the resolution r as an example, the horizontal and vertical coordinates of the m points in the figure are the FLOPs values $c = (c_1, \ldots, c_m)$ and the resolutions $r = (r_1, \ldots, r_m)$, respectively. These m points serve as the training data, and the joint distribution of the training data and a test point $c_*$ is the Gaussian distribution

$$\begin{bmatrix} r \\ r_* \end{bmatrix} \sim \mathcal{N}\!\left(\mathbf{0},\ \begin{bmatrix} K(c, c) + \sigma^2 I & K(c, c_*) \\ K(c_*, c) & K(c_*, c_*) \end{bmatrix}\right)$$

where $K(\cdot, \cdot)$ is a kernel function (such as an inner-product function), $K(c, c)$ is the $m \times m$ kernel matrix over the training inputs, $K(c_*, c) = K(c, c_*)^{\top}$ is the vector of kernel values between the test point and the training inputs, and $\sigma$ is the standard deviation of r. By derivation, the predicted value $r_*$, i.e. the fitted curve formula, is obtained:

$$r_* = K(c_*, c)\left[K(c, c) + \sigma^2 I\right]^{-1} r$$

with the corresponding predictive variance

$$\operatorname{var}(r_*) = K(c_*, c_*) - K(c_*, c)\left[K(c, c) + \sigma^2 I\right]^{-1} K(c, c_*).$$

The relationships of width and depth to FLOPs are fitted in the same way as for the resolution r, and the details are not repeated here.
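A minimal numerical sketch of this fit is shown below, assuming a squared-exponential kernel for K(·,·) (the text only says the kernel may be, for example, an inner-product function) and synthetic (FLOPs, resolution) training pairs; the posterior-mean line implements r* = K(c*, c)[K(c, c) + σ²I]⁻¹ r:

```python
import numpy as np

def rbf(a, b, length=1.0):
    # Squared-exponential kernel between two 1-D input arrays (an assumption;
    # the patent does not fix the kernel).
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def gp_predict(c_train, r_train, c_test, sigma=0.1):
    # Posterior mean: K(c*, c) [K(c, c) + sigma^2 I]^(-1) r
    K = rbf(c_train, c_train) + sigma ** 2 * np.eye(len(c_train))
    k_star = rbf(c_test, c_train)
    return k_star @ np.linalg.solve(K, r_train)

c = np.linspace(0.5, 4.0, 20)   # synthetic normalized-FLOPs training inputs
r = 0.5 * c + 0.1               # synthetic resolution multipliers (targets)
r_star = gp_predict(c, r, np.array([2.0]))
```

With dense, low-noise training data the posterior mean closely interpolates the underlying curve, which is what allows a user-supplied FLOPs budget to be mapped back to a resolution multiplier.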
After the curve formulas fitted in the above manner are obtained, the user only needs to input a computation cost c* to obtain the predicted values r*, d*, and w*. The width, depth, and input resolution of the base model GhostNet-A are changed according to (r*, d*, w*) to obtain a new model whose computation cost is c*. After being trained, the new model can be deployed and used on different hardware devices, for example on mobile phones, control devices, and the like.
For ease of understanding, the output results on the ImageNet dataset of the new model obtained in this embodiment by scaling the GhostNet base network are compared below with the output results of some commonly used neural networks (such as EfficientNet and MobileNetV3).
Table 1
As can be seen from Table 1, when the widths and FLOPs of the compared neural networks are close or similar, the new model obtained by scaling the GhostNet model as provided in this application achieves better output accuracy. Therefore, on the basis of being constructed efficiently, the neural network obtained by the neural network construction method provided in this application achieves a better balance between structure and output accuracy and can better fit the hardware.
In addition, the neural network construction method provided in this application can also be used to scale an arbitrary network. For example, EfficientNet-B0 can be used as the base model to obtain a series of smaller models. Illustratively, the output results on the ImageNet dataset of the new models obtained by scaling with EfficientNet-B0 as the base model are compared with the output results of some commonly used neural networks, and the comparison may be as shown in Table 2, where RA denotes random data augmentation (RandAugment).
Table 2
Similarly to the output results of the new model obtained by scaling the GhostNet model described above, it can be seen from Table 2 that when the widths and FLOPs of the compared neural networks are close or similar, the output accuracy is better. Therefore, on the basis of being constructed efficiently, the neural network obtained by the neural network construction method provided in this application achieves a better balance between structure and output accuracy and can better fit the hardware.
The flow of the method provided in this application has been explained in detail above; the apparatus provided in this application is described in detail below.
Referring to FIG. 12, this application provides a neural network construction apparatus, including:
a sampling module 1201, configured to sample at least one set of first parameter combinations from a preset search space, where the search space includes the value ranges of multiple parameters used when constructing a neural network, and each first parameter combination in the at least one set of first parameter combinations includes a value for each of the multiple parameters;
a construction module 1202, configured to construct multiple first neural networks according to the at least one set of first parameter combinations;
an obtaining module 1204, configured to obtain a constraint range, where the constraint range includes a numerical range identifying the computing capability of a computing device and may be a numerical range determined according to information on the computing capability of the computing device;
a calculation module 1205, configured to obtain, according to a mapping relationship, a second parameter combination corresponding to the constraint range, where the mapping relationship includes the relationship between the at least one parameter combination and the evaluation results of the multiple first neural networks, and an evaluation result is a result obtained by evaluating the structure of each first neural network in the multiple first neural networks;
the construction module 1202 is further configured to obtain a target neural network according to the second parameter combination.
In a possible implementation, the apparatus may further include:
a mapping module 1203, configured to generate the mapping relationship according to the relationship between the at least one parameter combination and the evaluation results of the multiple first neural networks.
In a possible implementation, the apparatus may further include:
a first training module 1206, configured to train the multiple first neural networks using a preset first data set to obtain multiple trained first neural networks;
a screening module 1207, configured to screen out at least one second neural network from the multiple first neural networks according to the evaluation result of each trained first neural network or the output accuracy of each trained first neural network;
where the mapping relationship may be generated according to the relationship between the parameter combination corresponding to each second neural network in the at least one second neural network and the evaluation result of that second neural network.
In a possible implementation, the mapping module 1203 is specifically configured to fit the relationship between the parameter combination corresponding to each second neural network in the at least one second neural network and the evaluation result of that second neural network, to obtain the mapping relationship.
In a possible implementation, the parameters included in the search space include one or more of the following: width, depth, resolution, or convolution kernel size, where the width is the number of basic units included in each layer of the neural network, the depth is the number of network layers of the neural network, and the resolution is the resolution of an image input to the neural network.
In a possible implementation, the evaluation result may include one or more of: the total number of floating-point operations (FLOPs) of each first neural network, the running time of forward inference of each first neural network, the amount of memory occupied by running each first neural network, or the number of parameters of each first neural network.
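As an illustration of such evaluation results, the sketch below computes a parameter count, a multiply-add count standing in for FLOPs, and a forward-inference runtime for a small made-up fully connected network; none of the layer sizes come from the application:

```python
import time
import numpy as np

# Hypothetical tiny network: three fully connected layers with ReLU.
layers = [(224, 128), (128, 64), (64, 10)]
weights = [np.random.randn(i, o) for i, o in layers]

def evaluate(weights, x):
    """Return parameter count, multiply-add count, and forward runtime."""
    params = sum(w.size for w in weights)
    flops = sum(w.shape[0] * w.shape[1] * 2 for w in weights)  # mul + add per weight
    start = time.perf_counter()
    for w in weights:
        x = np.maximum(x @ w, 0.0)      # linear layer followed by ReLU
    runtime = time.perf_counter() - start
    return {"params": params, "flops": flops, "runtime_s": runtime}

result = evaluate(weights, np.random.randn(1, 224))
```

In the method of this application, any one of these quantities (or their combination) could serve as the evaluation result that the mapping relationship is fitted against.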
In a possible implementation, the obtaining module 1204 is specifically configured to receive user input data and obtain the constraint range according to the user input data.
In a possible implementation, the obtaining module 1204 is specifically configured to: obtain identification information of the computing device from the user input data; and obtain the constraint range according to the identification information of the computing device.
In a possible implementation, the apparatus may further include a second training module 1208, configured to train the target neural network using a preset second data set to obtain a trained target neural network.
In a possible implementation, the target neural network is used to perform at least one of feature extraction, semantic segmentation, classification, super-resolution, or object detection.
Referring to FIG. 13, a schematic structural diagram of another neural network construction apparatus provided in this application is described below.
The neural network construction apparatus may include a processor 1301 and a memory 1302, which are interconnected by a line; the memory 1302 stores program instructions and data.
The memory 1302 stores the program instructions and data corresponding to the steps in the aforementioned FIG. 5 to FIG. 11C.
The processor 1301 is configured to perform the method steps performed by the neural network construction apparatus shown in any of the embodiments of FIG. 5 to FIG. 11C.
Optionally, the neural network construction apparatus may further include a transceiver 1303 for receiving or sending data.
An embodiment of this application further provides a computer-readable storage medium storing a program that, when run on a computer, causes the computer to perform the steps in the methods described in the embodiments shown in FIG. 5 to FIG. 11C.
Optionally, the aforementioned neural network construction apparatus shown in FIG. 13 is a chip.
An embodiment of this application further provides a neural network construction apparatus, which may also be referred to as a digital processing chip or simply a chip. The chip includes a processing unit and a communication interface; the processing unit obtains program instructions through the communication interface, the program instructions are executed by the processing unit, and the processing unit is configured to perform the method steps performed by the neural network construction apparatus shown in any of the embodiments of FIG. 5 to FIG. 11C.
An embodiment of this application further provides a digital processing chip. Circuits implementing the processor 1301, or the functions of the processor 1301, and one or more interfaces are integrated in the digital processing chip. When a memory is integrated in the digital processing chip, the chip can perform the method steps of any one or more of the foregoing embodiments. When no memory is integrated in the digital processing chip, it can be connected to an external memory through a communication interface, and it implements, according to the program code stored in the external memory, the actions performed by the neural network construction apparatus in the foregoing embodiments.
An embodiment of this application further provides a computer program product that, when run on a computer, causes the computer to perform the steps performed by the neural network construction apparatus in the methods described in the embodiments shown in FIG. 5 to FIG. 11C.
The neural network construction apparatus provided in the embodiments of this application may be a chip, including a processing unit and a communication unit; the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit can execute the computer-executable instructions stored in a storage unit, so that the chip in the server performs the neural network construction method described in the embodiments shown in FIG. 5 to FIG. 11C. Optionally, the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip in the radio access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
Specifically, the aforementioned processing unit or processor may be a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor or any conventional processor.
Illustratively, referring to FIG. 14, FIG. 14 is a schematic structural diagram of a chip provided in an embodiment of this application. The chip may be embodied as a neural-network processing unit NPU 140. The NPU 140 is mounted on a host CPU as a coprocessor, and the host CPU allocates tasks. The core part of the NPU is the arithmetic circuit 1403; the controller 1404 controls the arithmetic circuit 1403 to fetch matrix data from memory and perform multiplication operations.
In some implementations, the arithmetic circuit 1403 internally includes multiple processing engines (PEs). In some implementations, the arithmetic circuit 1403 is a two-dimensional systolic array; it may also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 1403 is a general-purpose matrix processor.
For example, suppose there are an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to matrix B from the weight memory 1402 and caches it on each PE in the arithmetic circuit. The arithmetic circuit fetches the data of matrix A from the input memory 1401, performs the matrix operation with matrix B, and stores the partial or final result of the matrix in the accumulator 1408.
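The accumulate-partial-results behavior described for the arithmetic circuit can be mimicked (not cycle-accurately implemented) with a tiled matrix multiply; the tile size and matrix shapes below are arbitrary:

```python
import numpy as np

def tiled_matmul(A, B, tile=4):
    """Multiply A @ B tile by tile, summing partial results into an accumulator."""
    m, k = A.shape
    _, n = B.shape
    acc = np.zeros((m, n))                   # plays the role of the accumulator
    for start in range(0, k, tile):
        a_tile = A[:, start:start + tile]    # streamed slice of input matrix A
        b_tile = B[start:start + tile, :]    # cached slice of weight matrix B
        acc += a_tile @ b_tile               # partial result accumulates
    return acc

A = np.random.randn(5, 10)
B = np.random.randn(10, 3)
C = tiled_matmul(A, B)
```

Summing tile-level partial products into a single buffer is the same decomposition a systolic array exploits: each pass touches only a slice of the operands, yet the accumulator ends up holding the full product.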
The unified memory 1406 is used to store input data and output data. The weight data is transferred to the weight memory 1402 directly through the direct memory access controller (DMAC) 1405; the input data is also transferred to the unified memory 1406 through the DMAC.
A bus interface unit (BIU) 1410 is used for the interaction of the AXI bus with the DMAC and with the instruction fetch buffer (IFB) 1409.
The bus interface unit 1410 is used by the instruction fetch buffer 1409 to obtain instructions from the external memory, and is also used by the direct memory access controller 1405 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 1406, to transfer the weight data to the weight memory 1402, or to transfer the input data to the input memory 1401.
The vector calculation unit 1407 includes multiple arithmetic processing units and, where needed, performs further processing on the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operations, logarithmic operations, and magnitude comparison. It is mainly used for computation of non-convolution/non-fully-connected layers in neural networks, such as batch normalization, pixel-wise summation, and upsampling of feature maps.
In some implementations, the vector calculation unit 1407 can store a vector of processed outputs to the unified memory 1406. For example, the vector calculation unit 1407 may apply a linear function and/or a nonlinear function to the output of the arithmetic circuit 1403, for example performing linear interpolation on the feature maps extracted by a convolutional layer or, as another example, on a vector of accumulated values, so as to generate activation values. In some implementations, the vector calculation unit 1407 generates normalized values, pixel-wise summed values, or both. In some implementations, the vector of processed outputs can be used as an activation input to the arithmetic circuit 1403, for example for use in a subsequent layer of the neural network.
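One of the non-convolution operations listed for the vector calculation unit, batch normalization, can be written out in plain numpy for illustration (the γ, β, and ε values are the usual defaults, not taken from the application):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.random.randn(32, 16) * 3.0 + 5.0   # a batch of 32 feature vectors
y = batch_norm(x)
```

After normalization each feature has near-zero mean and near-unit variance over the batch, which is exactly the element-wise pattern (subtract, divide, scale, shift) a vector unit handles well.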
The instruction fetch buffer 1409 connected to the controller 1404 is used to store the instructions used by the controller 1404.
The unified memory 1406, the input memory 1401, the weight memory 1402, and the instruction fetch buffer 1409 are all on-chip memories; the external memory is private to this NPU hardware architecture.
The operations of the layers of a recurrent neural network may be performed by the arithmetic circuit 1403 or the vector calculation unit 1407.
The processor mentioned in any of the above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the programs of the methods of FIG. 5 to FIG. 11C.
It should additionally be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solutions of the embodiments. In addition, in the drawings of the apparatus embodiments provided in this application, the connection relationship between modules indicates that they have a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
From the description of the above implementations, a person skilled in the art can clearly understand that this application can be implemented by software plus the necessary general-purpose hardware, and certainly also by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function completed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structures used to implement the same function can also be diverse, such as analog circuits, digital circuits, or dedicated circuits. For this application, however, a software program implementation is in most cases the better implementation. Based on such an understanding, the technical solutions of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of this application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (SSD)).
The terms "first", "second", "third", "fourth", and the like (if any) in the specification, claims, and the above drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described here. Furthermore, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product, or device.
Finally, it should be noted that the above are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (21)

  1. A neural network construction method, characterized by comprising:
    sampling at least one set of first parameter combinations from a preset search space, the search space comprising value ranges of multiple parameters used in constructing a neural network, each first parameter combination in the at least one set of first parameter combinations comprising a value of each of the multiple parameters;
    constructing a plurality of first neural networks according to the at least one set of first parameter combinations;
    obtaining a constraint range, the constraint range comprising a numerical range identifying a computing capability of a computing device;
    obtaining, according to a mapping relationship, a second parameter combination corresponding to the constraint range, the mapping relationship comprising a relationship between the at least one set of parameter combinations and evaluation results of the plurality of first neural networks, an evaluation result being a result obtained by evaluating a structure of each first neural network in the plurality of first neural networks;
    obtaining a target neural network according to the second parameter combination.
  2. The method according to claim 1, further comprising:
    training the plurality of first neural networks using a preset first data set to obtain a plurality of trained first neural networks;
    screening out at least one second neural network from the plurality of first neural networks according to the evaluation result or the output accuracy of each trained first neural network;
    wherein the mapping relationship includes a relationship between the parameter combination corresponding to each second neural network in the at least one second neural network and the evaluation result of that second neural network.
  3. The method according to claim 2, wherein:
    the mapping relationship is obtained by fitting the relationship between the parameter combination corresponding to each second neural network in the at least one second neural network and the evaluation result of that second neural network.
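Claim 3 states only that the mapping relationship is obtained by fitting. One conventional way to realize such a fit — sketched here as an assumption, with made-up measurements — is ordinary least squares in log space, since cost metrics tend to scale multiplicatively with width, depth, and resolution:

```python
import numpy as np

# Hypothetical records for screened second neural networks:
# (width, depth, resolution) -> measured evaluation result (e.g., FLOPs).
params = np.array([
    [16, 4, 96],
    [32, 8, 128],
    [64, 12, 160],
    [32, 4, 96],
    [16, 8, 128],
], dtype=float)
results = np.array([0.6e6, 4.2e6, 19.7e6, 1.2e6, 2.1e6])

# Fit log(result) = a*log(w) + b*log(d) + c*log(r) + e by least squares.
X = np.hstack([np.log(params), np.ones((len(params), 1))])
coef, *_ = np.linalg.lstsq(X, np.log(results), rcond=None)

def predict_result(width, depth, resolution):
    """Evaluate the fitted mapping for an unseen parameter combination."""
    x = np.array([np.log(width), np.log(depth), np.log(resolution), 1.0])
    return float(np.exp(x @ coef))
```

Inverting such a fitted mapping — searching for the combination whose predicted result lands inside the constraint range — yields a second parameter combination without building every candidate network.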
  4. The method according to any one of claims 1-3, wherein the parameters included in the search space include one or more of the following:
    width, depth, resolution, or convolution kernel size, wherein the width is the number of basic units included in each layer of the neural network, the depth is the number of network layers of the neural network, and the resolution is the resolution of images input to the neural network.
  5. The method according to any one of claims 1-4, wherein the evaluation result includes one or more of:
    the total number of floating-point operations (FLOPs) of each first neural network, the running time of forward inference of each first neural network, the amount of memory occupied by running each first neural network, or the number of parameters of each first neural network.
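Most of the metrics in claim 5 can be obtained without training: FLOPs and parameter count follow from layer shapes alone. A sketch for a plain convolutional stack, where the three-layer configuration is an invented example rather than anything from the disclosure:

```python
def conv_stats(c_in, c_out, k, h, w, stride=1):
    """MACs and parameter count of one k x k conv layer without bias.

    Assumes 'same' padding, so the spatial size shrinks only by the stride.
    FLOPs are conventionally reported as roughly 2x the MAC count.
    """
    h_out, w_out = h // stride, w // stride
    params = c_in * c_out * k * k
    macs = params * h_out * w_out        # one MAC per weight per output position
    return macs, params, (c_out, h_out, w_out)

def network_stats(layers, in_shape):
    """Total MACs and parameters of a sequential conv network."""
    c, h, w = in_shape
    total_macs = total_params = 0
    for c_out, k, stride in layers:
        macs, params, (c, h, w) = conv_stats(c, c_out, k, h, w, stride)
        total_macs += macs
        total_params += params
    return total_macs, total_params

# Invented 3-layer network evaluated on a 3 x 32 x 32 input.
layers = [(16, 3, 1), (32, 3, 2), (64, 3, 2)]  # (out_channels, kernel, stride)
macs, params = network_stats(layers, (3, 32, 32))
```

Runtime and peak memory, by contrast, generally have to be measured on the target computing device, which is why the claimed evaluation results naturally form a per-device mapping.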
  6. The method according to any one of claims 1-5, wherein obtaining the constraint range comprises:
    receiving user input data, and obtaining the constraint range according to the user input data.
  7. The method according to claim 6, wherein obtaining the constraint range according to the user input data comprises:
    obtaining identification information of the computing device from the user input data;
    obtaining the constraint range according to the identification information of the computing device.
  8. The method according to any one of claims 1-7, further comprising:
    training the target neural network using a preset second data set to obtain a trained target neural network.
  9. The method according to any one of claims 1-8, wherein the target neural network is used to perform at least one of feature extraction, semantic segmentation, classification, super-resolution, or object detection.
  10. A neural network construction apparatus, comprising:
    a sampling module, configured to sample at least one set of first parameter combinations from a preset search space, wherein the search space includes value ranges of a plurality of parameters used in constructing a neural network, and each first parameter combination in the at least one set of first parameter combinations includes a value for each of the plurality of parameters;
    a construction module, configured to construct a plurality of first neural networks according to the at least one set of first parameter combinations;
    an obtaining module, configured to obtain a constraint range, wherein the constraint range includes a numerical range that identifies the computing capability of a computing device;
    a computation module, configured to obtain, according to a mapping relationship, a second parameter combination corresponding to the constraint range, wherein the mapping relationship includes a relationship between the at least one set of parameter combinations and evaluation results of the plurality of first neural networks, and each evaluation result is obtained by evaluating the structure of a corresponding first neural network;
    the construction module being further configured to obtain a target neural network according to the second parameter combination.
  11. The apparatus according to claim 10, further comprising:
    a first training module, configured to train the plurality of first neural networks using a preset first data set to obtain a plurality of trained first neural networks;
    a screening module, configured to screen out at least one second neural network from the plurality of first neural networks according to the evaluation result or the output accuracy of each trained first neural network;
    wherein the mapping relationship includes a relationship between the parameter combination corresponding to each second neural network in the at least one second neural network and the evaluation result of that second neural network.
  12. The apparatus according to claim 11, wherein:
    the mapping relationship is obtained by fitting the relationship between the parameter combination corresponding to each second neural network in the at least one second neural network and the evaluation result of that second neural network.
  13. The apparatus according to any one of claims 10-12, wherein the parameters included in the search space include one or more of the following:
    width, depth, resolution, or convolution kernel size, wherein the width is the number of basic units included in each layer of the neural network, the depth is the number of network layers of the neural network, and the resolution is the resolution of images input to the neural network.
  14. The apparatus according to any one of claims 10-13, wherein the evaluation result includes one or more of:
    the total number of floating-point operations (FLOPs) of each first neural network, the running time of forward inference of each first neural network, the amount of memory occupied by running each first neural network, or the number of parameters of each first neural network.
  15. The apparatus according to any one of claims 10-14, wherein:
    the obtaining module is specifically configured to receive user input data and obtain the constraint range according to the user input data.
  16. The apparatus according to claim 15, wherein the obtaining module is specifically configured to:
    obtain identification information of the computing device from the user input data;
    obtain the constraint range according to the identification information of the computing device.
  17. The apparatus according to any one of claims 10-16, further comprising:
    a second training module, configured to train the target neural network using a preset second data set to obtain a trained target neural network.
  18. The apparatus according to any one of claims 10-17, wherein the target neural network is used to perform at least one of feature extraction, semantic segmentation, classification, super-resolution, or object detection.
  19. A neural network construction apparatus, comprising a processor coupled to a memory, wherein the memory stores a program, and the program instructions stored in the memory, when executed by the processor, implement the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium comprising a program which, when executed by a processing unit, performs the method according to any one of claims 1 to 9.
  21. A neural network construction apparatus, comprising a processing unit and a communication interface, wherein the processing unit obtains program instructions through the communication interface, and the program instructions, when executed by the processing unit, implement the method according to any one of claims 1 to 9.
PCT/CN2021/124360 2020-10-21 2021-10-18 Neural network construction method and apparatus WO2022083536A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011131423.9 2020-10-21
CN202011131423.9A CN112418392A (en) 2020-10-21 2020-10-21 Neural network construction method and device

Publications (1)

Publication Number Publication Date
WO2022083536A1 (en)

Family

ID=74841631

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124360 WO2022083536A1 (en) 2020-10-21 2021-10-18 Neural network construction method and apparatus

Country Status (2)

Country Link
CN (1) CN112418392A (en)
WO (1) WO2022083536A1 (en)


Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112418392A (en) * 2020-10-21 2021-02-26 华为技术有限公司 Neural network construction method and device
CN113128680B (en) * 2021-03-12 2022-06-10 山东英信计算机技术有限公司 Neural network training method, system, device and medium
CN114969636B (en) * 2021-03-23 2023-10-03 华为技术有限公司 Model recommendation method and device and computer equipment
CN113128682B (en) * 2021-04-14 2022-10-21 北京航空航天大学 Automatic neural network model adaptation method and device
CN113240109B (en) * 2021-05-17 2023-06-30 北京达佳互联信息技术有限公司 Data processing method and device for network training, electronic equipment and storage medium
US20230064692A1 (en) * 2021-08-20 2023-03-02 Mediatek Inc. Network Space Search for Pareto-Efficient Spaces
CN113485848B (en) * 2021-09-08 2021-12-17 深圳思谋信息科技有限公司 Deep neural network deployment method and device, computer equipment and storage medium
CN113902099B (en) * 2021-10-08 2023-06-02 电子科技大学 Neural network design and optimization method based on software and hardware joint learning
CN116090512A (en) * 2021-10-29 2023-05-09 华为技术有限公司 Neural network construction method and device
CN116560731A (en) * 2022-01-29 2023-08-08 华为技术有限公司 Data processing method and related device thereof
CN114548384A (en) * 2022-04-28 2022-05-27 之江实验室 Method and device for constructing impulse neural network model with abstract resource constraint
CN117035018A (en) * 2022-04-29 2023-11-10 中兴通讯股份有限公司 Beam measurement parameter feedback method and receiving method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062587A (en) * 2017-12-15 2018-05-22 清华大学 The hyper parameter automatic optimization method and system of a kind of unsupervised machine learning
CN110991658A (en) * 2019-11-28 2020-04-10 重庆紫光华山智安科技有限公司 Model training method and device, electronic equipment and computer readable storage medium
US20200125945A1 (en) * 2018-10-18 2020-04-23 Drvision Technologies Llc Automated hyper-parameterization for image-based deep model learning
CN111768004A (en) * 2020-06-10 2020-10-13 中国人民解放军军事科学院国防科技创新研究院 Model self-adaption method and system based on intelligent computing framework
CN112418392A (en) * 2020-10-21 2021-02-26 华为技术有限公司 Neural network construction method and device


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024016739A1 (en) * 2022-07-20 2024-01-25 华为技术有限公司 Method for training neural network model, electronic device, cloud, cluster, and medium
WO2024046460A1 (en) * 2022-09-02 2024-03-07 深圳忆海原识科技有限公司 Port model, construction method, system, and neural network construction platform
CN116707851A (en) * 2022-11-21 2023-09-05 荣耀终端有限公司 Data reporting method and terminal equipment
CN116707851B (en) * 2022-11-21 2024-04-23 荣耀终端有限公司 Data reporting method and terminal equipment
CN116523045A (en) * 2023-03-13 2023-08-01 之江实验室 Deep learning reasoning simulator oriented to multi-core chip
CN116523045B (en) * 2023-03-13 2023-11-07 之江实验室 Deep learning reasoning simulator oriented to multi-core chip
CN117235464A (en) * 2023-11-14 2023-12-15 华东交通大学 Fourier near infrared interference signal virtual generation evaluation method and system
CN117235464B (en) * 2023-11-14 2024-02-23 华东交通大学 Fourier near infrared interference signal virtual generation evaluation method and system

Also Published As

Publication number Publication date
CN112418392A (en) 2021-02-26

Similar Documents

Publication Publication Date Title
WO2022083536A1 (en) Neural network construction method and apparatus
WO2021238366A1 (en) Neural network construction method and apparatus
WO2020221200A1 (en) Neural network construction method, image processing method and devices
WO2022042713A1 (en) Deep learning training method and apparatus for use in computing device
WO2021120719A1 (en) Neural network model update method, and image processing method and device
WO2022116933A1 (en) Model training method, data processing method and apparatus
US20220215227A1 (en) Neural Architecture Search Method, Image Processing Method And Apparatus, And Storage Medium
WO2022001805A1 (en) Neural network distillation method and device
US20230082597A1 (en) Neural Network Construction Method and System
CN113705769A (en) Neural network training method and device
WO2021218517A1 (en) Method for acquiring neural network model, and image processing method and apparatus
CN110222718B (en) Image processing method and device
WO2022111617A1 (en) Model training method and apparatus
CN112529146B (en) Neural network model training method and device
WO2021218470A1 (en) Neural network optimization method and device
WO2022012668A1 (en) Training set processing method and apparatus
CN111797992A (en) Machine learning optimization method and device
CN113807399A (en) Neural network training method, neural network detection method and neural network detection device
CN115081588A (en) Neural network parameter quantification method and device
CN113536970A (en) Training method of video classification model and related device
CN113627163A (en) Attention model, feature extraction method and related device
CN115018039A (en) Neural network distillation method, target detection method and device
WO2022156475A1 (en) Neural network model training method and apparatus, and data processing method and apparatus
CN113128285A (en) Method and device for processing video
US20220130142A1 (en) Neural architecture search method and image processing method and apparatus

Legal Events

Date Code Title Description
121  EP: The EPO has been informed by WIPO that EP was designated in this application (ref document number: 21881953; country of ref document: EP; kind code of ref document: A1)
NENP  Non-entry into the national phase (ref country code: DE)
122  EP: PCT application non-entry in European phase (ref document number: 21881953; country of ref document: EP; kind code of ref document: A1)