CN113920097A

CN113920097A - Power equipment state detection method and system based on multi-source image

Info

Publication number: CN113920097A
Application number: CN202111199043.3A
Authority: CN
Inventors: 尚博文; 冯光; 徐铭铭; 孙芊; 王鹏; 徐恒博; 牛荣泽; 王倩; 李宗峰; 张建宾; 陈明; 谢芮芮; 李丰君; 董轩
Original assignee: Electric Power Research Institute of State Grid Henan Electric Power Co Ltd
Current assignee: Electric Power Research Institute of State Grid Henan Electric Power Co Ltd
Priority date: 2021-10-14
Filing date: 2021-10-14
Publication date: 2022-01-11

Abstract

A method and a system for detecting the state of power equipment based on multi-source images are disclosed. In view of the limited resources of mobile devices, a D-YOLOv4 model is proposed for identifying and locating power equipment in a visible light image. Pixel mapping between the thermal image and the visible light image is completed through an edge-detection-based image registration method and an affine transformation, so that the power equipment region in the infrared image is obtained, and the power grid operation safety standard is used as the basis for judging the state of each piece of power equipment. The method applies image fusion and deep learning, combines the infrared thermal image with the visible light image, establishes a pixel mapping relation, and finds the key regions and key points of attention, so that the inspection robot can obtain the hot spot temperature of each target device during power inspection and perform device-type-specific analysis, which improves the intelligence level of inspection.

Description

Power equipment state detection method and system based on multi-source image

Technical Field

The invention belongs to the technical field of power equipment state detection, and particularly relates to a power equipment state detection method and system based on a multi-source image.

Background

Compared with manual inspection, a power inspection robot system offers a high level of automation and high safety. At present, the visible light camera and the thermal imager are common detection instruments on inspection robots and can inspect power equipment in a non-contact manner without a power outage. Real-time fault diagnosis based on image processing is therefore of great significance and provides decision support for equipment inspection.

However, the massive measurement information generated by uninterrupted measurement poses a great challenge to traditional fault diagnosis methods, and data processing can hardly keep pace with acquisition. The number of pictures generated by uninterrupted inspection is huge, so manual review is time-consuming and labor-intensive and omissions occur easily, while the traditional approach of feature extraction plus a classifier suffers from low accuracy and weak generalization capability. Through the continuous exploration of scholars, convolutional neural networks have achieved breakthrough results in computer vision. In view of the strong data analysis capability of deep learning, how to use a deep convolutional neural network for fault diagnosis, so as to improve detection efficiency and reduce potential safety hazards, has become a hot problem that urgently needs to be solved.

In the prior art, changes in the overall electrical performance and insulation level of equipment, such as equipment defects, oxidation and corrosion of contact surfaces, loosening of bolts and strand scattering of leads, are detected from the temperature change of the electrical equipment and can be reflected by thermal imaging. Target detection based on a convolutional neural network is a major branch of computer vision and aims to find specific targets in an image and label them, but directly applying thermal images to target detection is subject to many limiting factors. Firstly, an infrared image has little texture information, low resolution and heavy noise. Secondly, the color values of thermal imaging pixels are not stably distributed: a pseudo-color image is generated according to temperature, and areas with similar temperatures have nearly the same color, so that the detail information of the equipment fades and its features are not obvious. Pictures of the same equipment in different states differ greatly; in particular, when the equipment is faulty, the faulty part has a strong outline while the overall outline information is weak, so the power equipment cannot be reliably identified and located, and missed detection or even false detection easily occurs. The more serious the equipment fault, the worse the detection effect, which runs contrary to the detection requirement. Thirdly, safety and stability are basic requirements for power system operation, so infrared images under the various fault conditions are difficult to acquire. If infrared images are identified directly, the demand of a convolutional neural network for a large fault data set is difficult to meet, faults not contained in the training set are easily missed or falsely detected, and the training effect cannot be guaranteed.

By contrast, visible light imaging has obvious advantages in object detection: it has high resolution and abundant texture information, is widely applied in various monitoring fields, and is not affected by the running state of the equipment. The fact that a visible light image is not affected by the running state of the equipment, however, also means that it lacks the thermal fault information of the power equipment. Therefore, considering the complementarity of the infrared image and the visible light image, their information advantages can be combined, the correspondence between them can be established, and multi-information matching, fusion and comprehensive utilization can be realized. Compared with single-source information, multi-source information gives a more comprehensive perception of the target.

Disclosure of Invention

In order to overcome the defects in the prior art, the invention aims to provide a power equipment state detection method and system based on a multi-source image.

The invention adopts the following technical scheme.

A multi-source image-based power equipment state detection method comprises the following steps:

step 1, collecting thermal imaging pictures and visible light imaging pictures of power equipment;

step 2, establishing an improved single-stage detector model based on a convolutional neural network; outputting a pixel area of the power equipment by taking a visible light imaging picture as input data of the improved single-stage detector model;

step 3, respectively extracting the edge outline of the power equipment from the thermal imaging picture and the visible light imaging picture; extracting stable characteristic point pairs between the thermal imaging picture and the visible light imaging picture according to the edge profile;

step 4, registering the thermal imaging picture and the visible light imaging picture by utilizing affine transformation according to the stable characteristic point pairs, namely establishing a mapping relation between pixel points of the thermal imaging picture and pixel points of the visible light imaging picture;

step 5, according to the mapping relation, positioning a pixel region of the power equipment in the visible light imaging picture to a temperature region of the power equipment in the thermal imaging picture, and extracting a target temperature region in the pixel region;

step 6, traversing the equipment temperature in the target temperature area, and taking the maximum value of the equipment temperature in the target temperature area as the hot spot temperature of the power equipment;

step 7, comparing the hot spot temperature with a set hot spot temperature limit value; if the hot spot temperature is greater than or equal to the hot spot temperature limit value, determining that the state of the power equipment is abnormal and giving a temperature alarm.

Preferably, in step 1, when the thermal imaging picture and the visible light picture of the power equipment are collected, the position, the shooting direction and the shooting angle of the picture shooting device are all consistent.

Preferably, in step 2, the model is improved on the basis of the convolutional-neural-network-based single-stage detector model YOLOv4, and the improvement comprises the following steps:

step 2.1, improving the backbone structure CSPDarknet53 of the single-stage detector model YOLOv4 by replacing the residual network (ResNet) structure in the convolutional layers corresponding to the preselected feature maps in CSPDarknet53 with a dense module from DenseNet; the dense module comprises dense blocks and transition layers which are alternately connected; the current dense block takes as input the feature information output by every dense block preceding it, and the feature information output by the current dense block is an input of every dense block following it; the feature information $s_n$ output by the n-th dense block satisfies the following relational expression:

$$s_n = H_n\left([s_0, s_1, s_2, \ldots, s_{n-1}]\right)$$

in the formula, $s_0, s_1, s_2, \ldots, s_{n-1}$ respectively represent the feature information of the 0th layer, the 1st layer, the 2nd layer, …, and the (n−1)-th layer,

$H_n$ represents the combined operation function of batch normalization, activation function and convolution applied to the input data by the n-th dense block;

step 2.2, clustering the number and the sizes of the candidate boxes of the power equipment pixel region of the single-stage detector model YOLOv4 based on the k-means++ algorithm, and selecting the sizes of the candidate boxes according to the following relational expression:

$$\mathrm{avg\_IoU}_k = \frac{1}{n}\sum_{i=1}^{n}\max_{1\le j\le k}\mathrm{IoU}(X_i, cen_j)$$

in the formula,

$n$ represents the total number of candidate boxes,

$X_i$ represents the i-th candidate box, i = 1, 2, …, n,

$k$ represents the number of cluster centers,

$cen_j$ represents the j-th cluster center, i.e., the size of the j-th candidate box, j = 1, 2, …, k,

$\mathrm{IoU}(\cdot,\cdot)$ denotes the intersection over union of two boxes,

$\mathrm{avg\_IoU}_k$ represents the matching degree between the anchor boxes and the candidate boxes when the number of cluster centers is k, with a value range of 0 to 1;

in the set $A = \{\Delta\mathrm{avg\_IoU}_k \mid \mathrm{avg\_IoU}_k \ge 70\%,\ k \in [K_1, K_2]\}$, the point corresponding to the maximum element is taken as the clustering result, and the k sizes corresponding to the clustering result are taken as the candidate box sizes of the power equipment pixel region of the single-stage detector model YOLOv4; wherein $\Delta\mathrm{avg\_IoU}_k = \mathrm{avg\_IoU}_k - \mathrm{avg\_IoU}_{k-1}$ represents the increase of $\mathrm{avg\_IoU}_k$, and the number k of cluster centers takes every integer from $K_1$ to $K_2$;

step 2.3, improving the neck structure PANet of the single-stage detector model YOLOv4, and fusing the feature information $s_n$ into the network detection layer through lateral connection and downsampling;

step 2.4, performing sparse training on the single-stage detector model YOLOv4 improved in steps 2.1 to 2.3 according to a loss function Loss with a sparse regularization penalty, wherein the loss function Loss of the sparse training satisfies the following relational expression:

$$Loss = loss_{YOLOv4} + \lambda \sum_{\gamma} g(\gamma)$$

in the formula,

$loss_{YOLOv4}$ represents the loss function of normal training,

$g(\gamma)$ represents the regularization penalty function on the scale factor γ,

$\lambda$ represents a balance factor;

step 2.5, determining, according to the model accuracy, a pruning proportion within the value range of the scale factor γ, and performing channel pruning; the value range of the scale factor γ is 1% to 99%.
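Steps 2.4 and 2.5 can be illustrated with a minimal sketch. It assumes that the penalty is the L1 norm, g(γ) = |γ|, and that channels are ranked by |γ| so that the smallest fraction is pruned; neither choice is fixed by the text above, and the function names `sparse_loss` and `prune_mask` are hypothetical.

```python
def sparse_loss(base_loss, gammas, lam):
    """Sparse-training loss: base loss plus lam times the summed penalty.

    Assumption: g(gamma) = |gamma| (L1 penalty); the source only states
    that g is a regularization penalty on the scale factor.
    """
    return base_loss + lam * sum(abs(g) for g in gammas)


def prune_mask(gammas, prune_ratio):
    """Return a keep-mask that drops the prune_ratio fraction of channels
    with the smallest |gamma| (assumed ranking criterion)."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_prune = int(len(gammas) * prune_ratio)
    # Channel indices sorted from least to most important.
    order = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]))
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(gammas))]
```

In practice the pruning proportion would be swept and the largest value that keeps the model accuracy acceptable would be retained, which is how "according to the model accuracy" is read here.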

Preferably, in step 2.1, the sizes of the preselected feature maps in the trunk structure CSPDarknet53 are 19 × 19, 38 × 38, and 76 × 76, respectively.

Preferably, in step 2, the improved single-stage detector model outputs the position coordinate $(x_0, y_0)$ of the power equipment and the width w and the height h of the pixel region of the power equipment; wherein the position coordinate $(x_0, y_0)$ is the center point coordinate of the pixel region of the power equipment, and the pixel region is the rectangular region whose diagonal runs from point $(x_0 - w/2,\ y_0 - h/2)$ to point $(x_0 + w/2,\ y_0 + h/2)$.

Preferably, step 3 comprises:

step 3.1, respectively extracting the edge profiles of the power equipment in the thermal imaging picture and the visible light imaging picture by using an edge detection method; wherein, the edge detection method adopts a Sobel operator;

step 3.2, extracting thermal imaging feature points and thermal imaging local feature descriptors from the edge contour of the thermal imaging picture by using the speeded-up robust features (SURF) algorithm to form a thermal imaging feature point set, and extracting visible light imaging feature points and visible light imaging local feature descriptors from the edge contour of the visible light imaging picture to form a visible light imaging feature point set; wherein each local feature descriptor is a 64-dimensional feature vector;

step 3.3, matching the thermal imaging local feature descriptors and the visible light imaging local feature descriptors by using a k-dimensional tree (k-d tree) algorithm and a k-nearest neighbor algorithm to obtain a plurality of feature point pairs;

step 3.4, executing a random sample consensus (RANSAC) algorithm to filter the feature point pairs and screen out the stable feature point pairs.

Preferably, in step 3.4, the stable characteristic point pair includes a pixel point position coordinate (x, y) in the thermal imaging picture and a pixel point position coordinate (x ', y') in the visible light imaging picture.
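The edge extraction of step 3.1 can be sketched as follows. This is a minimal illustration that assumes the standard 3 × 3 Sobel kernels and a grayscale image stored as a list of rows; the function name `sobel_magnitude` is hypothetical, and border pixels are simply left at zero.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a 2D grayscale image (list of equal-length rows).

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    p = img[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * p
                    gy += SOBEL_Y[dr][dc] * p
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out
```

Thresholding the returned magnitude yields an edge contour on which the feature points of step 3.2 can then be extracted.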

Preferably, step 4 comprises:

step 4.1, calculating the confidence coefficient of each stable characteristic point pair, and selecting 3 stable characteristic point pairs with the highest confidence coefficient;

step 4.2, registering the thermal imaging picture and the visible light imaging picture by using the selected 3 stable feature point pairs and an affine transformation, wherein the affine transformation between the thermal imaging picture and the visible light imaging picture satisfies the following relational expression:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = a \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$

in the formula,

$a$ denotes the scaling factor of the object caused by the difference between the thermal imaging and visible light image acquisition devices,

$\theta$ denotes the rotation angle of the object caused by the difference between the thermal imaging and visible light image acquisition devices,

$t_x$ and $t_y$ respectively represent the translation amounts of the object in the horizontal direction and in the vertical direction relative to the visible light image, caused by the difference between the acquisition devices;

$x$ and $y$ represent the position coordinates of a pixel point in the thermal imaging picture,

and $x'$ and $y'$ represent the position coordinates of the corresponding pixel point in the visible light imaging picture.
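How three stable feature point pairs determine the registration of step 4.2 can be sketched as follows. The example assumes a general six-parameter affine matrix solved exactly from three non-collinear point pairs by Cramer's rule; the function names are hypothetical, `src` holds thermal-image points (x, y) and `dst` the matching visible-light points (x', y').

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m @ u = b by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("point pairs are collinear; affine map is undetermined")
    sol = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = b[r]
        sol.append(det3(mc) / d)
    return sol


def affine_from_pairs(src, dst):
    """Return ((a, b, tx), (c, d, ty)) with x' = a*x + b*y + tx, y' = c*x + d*y + ty."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [p[0] for p in dst])
    row_y = solve3(m, [p[1] for p in dst])
    return tuple(row_x), tuple(row_y)


def apply_affine(A, pt):
    """Map one point through the affine transform A."""
    (a, b, tx), (c, d, ty) = A
    x, y = pt
    return (a * x + b * y + tx, c * x + d * y + ty)
```

Because the two cameras are fixed relative to each other, the matrix only needs to be solved once and can then be reused for every picture pair.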

Preferably, step 5 comprises:

step 5.1, according to the mapping relation, positioning the pixel region of the power equipment in the visible light imaging picture to the temperature region of the power equipment in the thermal imaging picture, namely converting the pixel region of the power equipment in the visible light imaging picture into the temperature region $T_{m\times n}$ of the power equipment in the thermal imaging picture, which satisfies the following relational expression:

$$T_{m\times n} = \begin{bmatrix} T_{11} & \cdots & T_{1n} \\ \vdots & \ddots & \vdots \\ T_{m1} & \cdots & T_{mn} \end{bmatrix}$$

in the formula, $T_{cd}$ represents the temperature corresponding to each pixel in the pixel region, where $1 \le c \le m$ and $1 \le d \le n$;

step 5.2, extracting, from the temperature region $T_{m\times n}$ of the power equipment in the thermal imaging picture, the target temperature region $T_{p\times q}$ of the power equipment in the thermal imaging picture, which satisfies the following relational expression:

$$T_{p\times q} = \begin{bmatrix} T_{11} & \cdots & T_{1q} \\ \vdots & \ddots & \vdots \\ T_{p1} & \cdots & T_{pq} \end{bmatrix}$$

in the formula, $T_{ij} \in T_{m\times n}$, where $1 \le i \le p \le m$ and $1 \le j \le q \le n$.

Preferably, in step 7, the improved single-stage detector model further outputs a power equipment type, and different hotspot temperature limits are determined according to different power equipment types.
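Steps 5 to 7 can be sketched as follows. The example assumes the thermal picture is already available as a matrix of temperatures and that the device region has already been mapped into thermal-image coordinates as `(row0, col0, row1, col1)` with exclusive end indices; the function names and the limit value in the usage note are hypothetical and are not taken from any standard.

```python
def hot_spot(temps, box):
    """Maximum temperature inside the target region of a temperature matrix.

    temps: 2D list of temperatures (one value per thermal pixel).
    box:   (row0, col0, row1, col1), end indices exclusive (assumed layout).
    """
    r0, c0, r1, c1 = box
    region = [row[c0:c1] for row in temps[r0:r1]]
    return max(max(row) for row in region)


def check_state(temps, box, device_type, limits):
    """Return (hot spot temperature, abnormal flag) for one detected device.

    limits maps a device type to its hot spot temperature limit; the state
    is abnormal when the hot spot temperature is >= the limit.
    """
    t = hot_spot(temps, box)
    return t, t >= limits[device_type]
```

For example, `check_state(temps, box, "disconnecting_switch", {"disconnecting_switch": 80.0})` would raise an alarm for any region whose hottest pixel reaches 80.0, where both the key and the value are placeholders chosen for illustration.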

A multi-source image-based power equipment state detection system comprises: the system comprises an image acquisition module, an image area processing module, an image registration module, a hot spot temperature detection module and an equipment state early warning module;

the image acquisition module is used for acquiring a thermal imaging image and a visible light imaging image of the power equipment and respectively inputting the images into the image area processing module and the image registration module;

the image area processing module comprises a pixel area processing unit and a temperature area processing unit; the pixel area processing unit is used for outputting the pixel region of the power equipment by taking a visible light imaging picture as input data based on the improved single-stage detector model; the temperature area processing unit is used for extracting the target temperature region of the power equipment in the thermal imaging picture from the pixel region of the power equipment in the visible light imaging picture according to the registration result provided by the image registration module;

the image registration module is used for respectively extracting the edge outline of the power equipment from the thermal imaging image and the visible light imaging image; extracting stable characteristic point pairs between the thermal imaging picture and the visible light imaging picture according to the edge profile; registering the thermal imaging picture and the visible light imaging picture by utilizing affine transformation according to the stable characteristic point pairs; the image registration module outputs a registration result which is input data of the temperature area processing unit;

the hot spot temperature detection module is used for traversing the equipment temperature in the target temperature area, and taking the maximum value of the equipment temperature in the target temperature area as the hot spot temperature of the power equipment; the hot spot temperature output by the hot spot temperature detection module is input data of the equipment state early warning module;

and the equipment state early warning module is used for comparing the hot spot temperature with a set hot spot temperature limit value, and if the hot spot temperature is greater than or equal to the hot spot temperature limit value, determining that the state of the electric equipment is abnormal and giving a temperature alarm.

Compared with the prior art, the method has the advantages that the method applies image fusion and deep learning simultaneously, combines the infrared thermal imaging image and the visible light image, establishes a pixel mapping relation, finds out key attention areas and key attention points, enables the inspection robot to obtain the maximum temperature value of each target device during power inspection, performs specific analysis on specific devices, and improves inspection intelligence level.

The beneficial effects of the invention include:

1. temperature analysis at the individual device level is achieved: the specific temperature of each device is analyzed according to the safe operating temperature threshold specified for that device in the power grid operation defect grade standard, which raises the intelligence level of monitoring and avoids judging the equipment only by the highest temperature in the whole infrared image, so that the sensitivity of equipment fault detection is improved; at the same time, refining the temperature analysis to the individual level highlights the devices that have problems, which obviously reduces the workload of manual processing and improves working efficiency;

2. the method is resistant to interference: the rich texture information of the visible light image, which strengthens the anti-interference performance of multi-device identification, and the pixel temperature information provided by the infrared image are both fully exploited, and the combination of the two data sources ensures the reliability and accuracy of equipment fault early warning;

3. the requirements of real-time detection can be met: the YOLOv4 model is improved so that a better detection effect is achieved with 25.8% of the parameters of the original model, and a lightweight model suitable for mobile applications and real-time detection is constructed, achieving the goals of few parameters and high accuracy.

Drawings

FIG. 1 is a block diagram of the steps of a method for detecting the status of an electrical device based on multi-source images according to the present invention;

FIG. 2 is a schematic diagram of a 3-layer dense block in an improved single-stage detector model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the structure of a PANET in an improved single-stage detector model according to an embodiment of the present invention;

FIG. 4 is a schematic flow chart of network pruning in an improved single-stage detector model according to an embodiment of the present invention;

FIG. 5 is a flowchart illustrating image registration according to an embodiment of the present invention;

FIG. 6 shows a convolution kernel of the Sobel operator according to an embodiment of the present invention.

Detailed Description

The present application is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present application is not limited thereby.

As shown in fig. 1, a method for detecting a state of an electrical device based on a multi-source image includes steps 1 to 7, which are as follows:

step 1, collecting thermal imaging pictures and visible light imaging pictures of the power equipment.

In the preferred embodiment of the invention, a 500 kV substation disconnecting switch is taken as the research object, and 517 thermal imaging pictures and visible light imaging pictures obtained during the inspection of the inspection robot are used to construct a training data set, which serves as the input data source for detection.

Specifically, in step 1, when a thermal imaging picture and a visible light picture of the power equipment are collected, the position, the shooting direction and the shooting angle of the picture shooting device are all consistent.

In the preferred embodiment of the invention, in order to avoid differences between the thermal imaging picture and the visible light imaging picture in size, viewing angle, field of view and the like, the pictures are taken by an infrared imager and a visible light camera respectively during inspection, with the two cameras located at the same position and facing the same shooting direction. Moreover, as long as the positions of the infrared camera and the visible light camera remain unchanged during inspection, the geometric transformation relation (translation, rotation and scaling) between the thermal imaging picture and the visible light picture stays valid and need not be solved again, so the real-time performance of detection is not affected.

Step 2, establishing an improved single-stage detector model based on a convolutional neural network; and outputting a pixel area of the power equipment by taking the visible light imaging picture as input data of the improved single-stage detector model.

The convolutional neural network is a common deep learning framework which, through distinctive structures such as convolution kernels and pooling, mines deep features of a picture with a small amount of computation. Compared with traditional object detection methods, the convolutional neural network has the great advantages of self-learning and strong detection capability. In target detection, such networks are divided into single-stage detectors and two-stage detectors. Two-stage detectors, such as Faster R-CNN, achieve high accuracy but typically require more inference time. By contrast, single-stage detectors mainly pursue computational efficiency and are suitable for real-time detection tasks; therefore, the single-stage detector model YOLOv4, which performs well in both real-time performance and accuracy, is selected as the base network. The application object of the preferred embodiment of the invention is the inspection robot, so the focus is on detection precision, mobile deployment and real-time monitoring. Because YOLOv4 has many parameters and a large model size, it is difficult to deploy on mobile terminals, and real-time detection performance is closely related to model size. Therefore, YOLOv4 is improved so that the model parameters are reduced while the model accuracy is maintained, and the D-YOLOv4 model is proposed for target detection of power equipment.

YOLOv4 uses CSPDarknet53 as the backbone network (Backbone), in which the CSPNet structure improves the learning capability of the network. The neck network (Neck) adopts a path aggregation network (PANet) and a spatial pyramid pooling network (SPP) to perform multi-scale feature extraction and feature map fusion. The head network (Head) is responsible for predicting the class and bounding box of each object. YOLOv4 has excellent detection performance and is widely applied in industry and academia, but it places high demands on computing power and hardware configuration. The model size of YOLOv4 is 256 MB and it requires 127 billion floating-point operations (127 BFLOPs), which means that its large number of parameters and complex computation become the main obstacle when it is applied to Internet of Things devices, embedded devices or other mobile devices for edge computing or real-time detection.

In the preferred embodiment of the invention, during substation inspection, data processing is preferentially completed at the distributed monitoring nodes inside the inspection robot, which avoids the delay caused by data transmission in remote data processing. Therefore, in order to reduce model parameters, improve feature expression capability and shorten computation time, the anchor box parameters and the network architecture of the model are optimized, and the final optimized model is determined through sparse training and channel pruning.

Specifically, in step 2, on the basis of the convolution neural network-based single-stage detector model YOLOv4, the model is improved, including:

step 2.1, improving the backbone structure CSPDarknet53 of the single-stage detector model YOLOv4 by replacing the residual network (ResNet) structure in the convolutional layers corresponding to the preselected feature maps in CSPDarknet53 with a dense block from DenseNet;

The dense convolutional network (DenseNet) effectively enhances feature reuse and alleviates the vanishing gradient problem; it uses denser skip connections and replaces the "add" operation with the "concatenate" operation, achieving better performance with fewer parameters than the residual network (ResNet).

The backbone network of YOLOv4 is CSPDarknet53, which is mainly built on residual blocks. In order to improve model efficiency, in the preferred embodiment of the invention the ResNet structure in the convolutional layers corresponding to the 19 × 19, 38 × 38 and 76 × 76 feature maps is changed into dense blocks; the dense module comprises dense blocks and transition layers which are alternately connected. As shown in FIG. 2, the current dense block takes as input the feature information output by every dense block preceding it, and the feature information output by the current dense block is an input of every dense block following it, wherein the feature information $s_n$ output by the n-th dense block satisfies the following relation:

$$s_n = H_n\left([s_0, s_1, s_2, \ldots, s_{n-1}]\right)$$

in the formula,

$s_0, s_1, s_2, \ldots, s_{n-1}$ respectively represent the feature information of the 0th layer, the 1st layer, the 2nd layer, …, and the (n−1)-th layer,

$H_n$ represents the combined operation function of batch normalization, activation function and convolution applied to the input data by the n-th dense block; in the preferred embodiment of the present invention, the combined operation function is BN-Mish-Conv(1 × 1)-BN-Mish-Conv(3 × 3); the Conv(1 × 1) convolution kernel aims to reduce the number of parameters and realize cross-layer information integration, and the Conv(3 × 3) convolution kernel is used for feature extraction. The specific parameter settings for these dense blocks are shown in Table 1.

TABLE 1 dense Block setup parameters in the trunk section (input size: 608X 608 pixels)

To describe the data in detail, the dense block at 76 × 76 resolution is taken as an example. The numbers of 1 × 1 and 3 × 3 convolution kernels are 64 and 32 respectively, so the 1 × 1 convolution reduces the input of the 3 × 3 convolution to 76 × 76 × 64, and the 3 × 3 convolution outputs a feature map of size 76 × 76 × 32. Therefore, the input sizes of the 5 layers in this dense block are, in order, 76 × 76 × 128, 76 × 76 × 160, 76 × 76 × 192, 76 × 76 × 224 and 76 × 76 × 256, and every layer outputs 76 × 76 × 32.
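The channel arithmetic of this example can be reproduced with a short sketch. It assumes that each layer's input is the concatenation of the block input with the outputs of all preceding layers, so the input grows by the number of 3 × 3 kernels per layer; the function name `dense_block_channels` is hypothetical.

```python
def dense_block_channels(in_channels, growth, layers, bottleneck):
    """List the channel counts seen by each layer of a dense block.

    in_channels: channels entering the block (128 in the 76x76 example).
    growth:      number of 3x3 kernels, i.e. channels each layer adds (32).
    layers:      number of layers in the block (5).
    bottleneck:  number of 1x1 kernels placed before the 3x3 convolution (64).
    """
    rows = []
    c = in_channels
    for n in range(1, layers + 1):
        rows.append({"layer": n, "input": c,
                     "after_1x1": bottleneck, "output": growth})
        c += growth  # concatenation: the next layer also sees this output
    return rows
```

Calling `dense_block_channels(128, 32, 5, 64)` yields the input channel counts 128, 160, 192, 224 and 256 listed above.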

Dense connections enhance feature reuse and make parameters more efficient.

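Step 2.2, clustering the number and the sizes of the candidate boxes of the power equipment pixel region of the single-stage detector model YOLOv4 based on the k-means++ algorithm, and selecting the sizes of the candidate boxes according to the following relational expression:

$$\mathrm{avg\_IoU}_k = \frac{1}{n}\sum_{i=1}^{n}\max_{1\le j\le k}\mathrm{IoU}(X_i, cen_j)$$

in the formula,

$n$ represents the total number of candidate boxes,

$X_i$ represents the i-th candidate box, i = 1, 2, …, n,

$k$ represents the number of cluster centers,

$cen_j$ represents the j-th cluster center, i.e., the size of the j-th candidate box, j = 1, 2, …, k,

$\mathrm{IoU}(\cdot,\cdot)$ denotes the intersection over union of two boxes,

$\mathrm{avg\_IoU}_k$ represents the matching degree between the anchor boxes and the candidate boxes when the number of cluster centers is k, with a value range of 0 to 1; a larger $\mathrm{avg\_IoU}_k$ indicates a better clustering result.

Since the present invention must balance computational complexity against accuracy, the set A is defined as $A = \{\Delta\mathrm{avg\_IoU}_k \mid \mathrm{avg\_IoU}_k \ge 70\%,\ k \in [K_1, K_2]\}$; the point corresponding to the maximum element of A is taken as the clustering result, and the k sizes corresponding to the clustering result are taken as the candidate box sizes of the power equipment pixel region of the single-stage detector model YOLOv4; wherein $\Delta\mathrm{avg\_IoU}_k = \mathrm{avg\_IoU}_k - \mathrm{avg\_IoU}_{k-1}$ represents the increase of $\mathrm{avg\_IoU}_k$, and the number k of cluster centers takes every integer from $K_1$ to $K_2$.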

The original candidate box sizes of the single-stage detector model YOLOv4 were set for the general-purpose COCO dataset and are therefore not targeted at power equipment. Since most power equipment has characteristic aspect ratios, the k-means++ algorithm is used to cluster the sizes and number of anchor boxes, which improves the network's adaptability to power equipment sizes.
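The selection rule for the number of cluster centers can be sketched as follows. The avg_IoU values in the example are hypothetical placeholders, not measured results, and the function name is an assumption; the sketch also assumes that a value of k is skipped when avg_IoU for k-1 is not available, since the increment cannot be computed for it.

```python
def select_cluster_count(avg_iou, threshold=0.70):
    """Pick k with the largest increment avg_IoU_k - avg_IoU_{k-1}
    among all k whose avg_IoU_k is at least `threshold`."""
    best_k, best_gain = None, None
    for k in sorted(avg_iou):
        if k - 1 not in avg_iou or avg_iou[k] < threshold:
            continue  # no increment defined, or matching degree too low
        gain = avg_iou[k] - avg_iou[k - 1]
        if best_gain is None or gain > best_gain:
            best_k, best_gain = k, gain
    return best_k


# Hypothetical avg_IoU values for k = 5..10, for illustration only.
example_avg_iou = {5: 0.66, 6: 0.69, 7: 0.72, 8: 0.76, 9: 0.77, 10: 0.775}
chosen_k = select_cluster_count(example_avg_iou)
print(chosen_k)
```

In this illustrative curve the largest qualifying increment occurs at k = 8; beyond that point additional anchors add computation but little matching quality, which is the trade-off the set A is meant to capture.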

Step 2.3, improving the neck structure PANet of the single-stage detector model YOLOv4: the feature information s_n is fused into the network detection layer through lateral connection and downsampling; a skip connection mechanism is added to improve the direction of information transfer between scale layers, which reduces the computation of the detection layer and strengthens information interaction between different scales.

For the recognition task, higher-resolution features provide more accurate localization signals, which are important for small objects, while deeper, lower-resolution features carry richer semantic information. Based on this idea, PANet establishes top-down and bottom-up paths, extracts semantic feature information at different scales from the up-sampled and down-sampled features respectively, and integrates it to facilitate information flow between the multi-scale features.

Integrating semantic feature information lets the target detection layer predict better. PANet is the neck portion of YOLOv4; for a 608 × 608 input, its final outputs are 1/8 (76 × 76), 1/16 (38 × 38) and 1/32 (19 × 19) of the original image size, that is, the minimum receptive field is 8 × 8. Dense grids help with localization signals and small-object prediction, but add significant computation. For example, the 76 × 76 grid has 4 times as many cells as the 38 × 38 grid and 16 times as many as the 19 × 19 grid, with a corresponding increase in computation. In the preferred embodiment of the present invention, infrared information is used throughout, but infrared readings become inaccurate when the target is too far away. Furthermore, a device that is far from the robot can still be detected once the robot is close to it, so detecting small objects is unnecessary. In summary, the neck structure is improved as shown in fig. 3. Multiple convolutional layers are stacked to form a feature hierarchy, and the layers are connected by downsampling and upsampling. To further simplify the model, the level shown by the dotted line in the figure is removed; so that the feature information extracted at that level can still be used in target prediction, it is fused into the next depth scale through downsampling and lateral connection. The prediction part therefore retains accurate localization information without bearing the heavy computational burden of high resolution at the prediction stage.
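The relative cost of the three detection scales can be verified with a few lines of arithmetic. This sketch assumes, as a simplification, that the computation of a detection layer is proportional to its number of grid cells; the variable names are illustrative.

```python
input_size = 608
strides = (8, 16, 32)

# Number of grid cells per detection scale: (608 / stride) squared.
cells = {s: (input_size // s) ** 2 for s in strides}

ratio_76_to_38 = cells[8] / cells[16]
ratio_76_to_19 = cells[8] / cells[32]
print(cells, ratio_76_to_38, ratio_76_to_19)
```

Under that assumption the 76 × 76 scale costs 4 times the 38 × 38 scale and 16 times the 19 × 19 scale, which is the burden the improved neck avoids at the prediction stage.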

Step 2.4, the sparse training is realized through a scale factor of each channel in the BN layer, that is, the single-stage detector model YOLOv4 improved in steps 2.1 to 2.3 is sparsely trained according to a Loss function with sparse regularization penalty, wherein the Loss function Loss of the sparse training satisfies the following relational expression:

In the formula,

loss_YOLOv4 represents the loss function of normal training,

g(γ) represents the regularization penalty function on the scale factor γ,

λ represents the balance factor;

In the preferred embodiment of the present invention, L1 regularization is chosen to achieve network sparsity. The penalty function drives the scale factors of unimportant channels toward zero. The value of the scale factor is used to measure the importance of a channel, and channel pruning is performed accordingly.

In the preferred embodiment of the invention, the purpose of network pruning is to remove the parts that contribute little and obtain a more efficient, simplified model. The main steps of network pruning are shown in fig. 4 and comprise sparse training, channel pruning and fine-tuning. Channel pruning, which is performed on the basis of sparse training, determines the final network channel parameters.

After pruning, the accuracy of the model decreases, so fine-tuning is required to restore detection performance. On the basis of sparse training, pruning rates of 80%, 60%, 40% and 20% were tried. To balance model size and accuracy, 40% was chosen as the pruning rate. After fine-tuning, the model reaches an mAP@0.5 of 92.86% and is used as the final optimized network model, D-YOLOv4.
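The role of the BN scale factors in sparse training and channel pruning can be illustrated with a minimal sketch. It assumes the L1 penalty g(γ) = |γ| summed over all scale factors and weighted by λ, and a single global threshold taken at the chosen pruning rate; the scale factor values and the value of λ are hypothetical, and the function names are not from the original.

```python
def l1_penalty(gammas, lam):
    """Sparse regularization term: lam * sum(|gamma|) over all BN channels."""
    return lam * sum(abs(g) for g in gammas)


def channel_keep_mask(gammas, prune_rate):
    """Mark channels to keep: those whose |gamma| exceeds the global
    threshold located at `prune_rate` of the sorted magnitudes."""
    ranked = sorted(abs(g) for g in gammas)
    num_pruned = int(len(ranked) * prune_rate)
    if num_pruned == 0:
        return [True] * len(gammas)
    threshold = ranked[num_pruned - 1]
    # Channels tied with the threshold are pruned as well.
    return [abs(g) > threshold for g in gammas]


# Hypothetical scale factors after sparse training, for illustration only.
gammas = [0.91, 0.02, 0.45, 0.001, 0.30, 0.07, 0.62, 0.003, 0.15, 0.80]
penalty = l1_penalty(gammas, lam=1e-4)
mask = channel_keep_mask(gammas, prune_rate=0.4)
print(penalty, mask)
```

With a 40% pruning rate the four channels with the smallest scale factors are removed and six are kept; the pruned network is then fine-tuned to recover accuracy.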

In the preferred embodiment of the present invention, the single-stage detector model YOLOv4 is improved to obtain the power equipment model D-YOLOv4. Table 2 compares this model with other typical target detection networks on the substation disconnector dataset collected for the present design.

Table 2 Comparison of evaluation indices for the D-YOLOv4 model

Experiments show that the D-YOLOv4 model has clear advantages on the power equipment dataset, achieving better detection performance with 25.8% of the parameters of the original model.

Further, in step 2, the improved single-stage detector model outputs the position coordinate (x₀, y₀) of the power equipment and the width w and height h of the power equipment pixel region; wherein the position coordinate (x₀, y₀) is the center point coordinate of the pixel region, and the pixel region is the rectangular area whose diagonal runs from point (x₀ - w/2, y₀ - h/2) to point (x₀ + w/2, y₀ + h/2).
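The conversion from the detector output to the two diagonal points of the pixel region can be written directly from the expressions above. The function name and the example numbers are illustrative assumptions.

```python
def box_corners(x0, y0, w, h):
    """Return the two diagonal points of a box given its center and size."""
    top_left = (x0 - w / 2, y0 - h / 2)
    bottom_right = (x0 + w / 2, y0 + h / 2)
    return top_left, bottom_right


corners = box_corners(x0=100, y0=80, w=40, h=20)
print(corners)
```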

Step 3, respectively extracting the edge outline of the power equipment from the thermal imaging picture and the visible light imaging picture; and extracting stable characteristic point pairs between the thermal imaging picture and the visible light imaging picture according to the edge profile.

Preferably, as shown in fig. 5, step 3 comprises:

Step 3.1, respectively extracting the edge contours of the power equipment in the thermal imaging picture and the visible light imaging picture by using an edge detection method; wherein the edge detection method adopts the Sobel operator. The convolution kernels of the Sobel operator are shown in fig. 6.

It should be noted that the use of the Sobel operator for edge detection in the preferred embodiment of the present invention is a non-limiting preferred choice.
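A minimal sketch of Sobel edge extraction on a grayscale image is given below. It uses the standard 3 × 3 Sobel kernels and the gradient magnitude; fig. 6 is not reproduced here, so the kernels are assumed to be the standard ones. The image is a plain list of rows, border pixels are left at zero, and the function name is illustrative.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient


def sobel_magnitude(img):
    """Gradient magnitude of a 2D grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    p = img[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * p
                    gy += SOBEL_Y[i][j] * p
            out[r][c] = math.hypot(gx, gy)
    return out


# Synthetic image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2])
```

The response is non-zero only in the two columns that straddle the intensity step, which is the kind of contour shared by a visible light picture and a thermal picture of the same device.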

Because visible light imaging and thermal imaging rely on different imaging principles, and the thermal imaging picture is color-filled according to temperature, the RGB values of the two pictures differ greatly and their pixels are essentially uncorrelated, so extracting and matching feature points directly on the raw pictures produces large errors. The overall outline of the power equipment, however, is common to both pictures. The invention therefore extracts the edge contour of the object with an edge detection algorithm to highlight the information the two pictures share, which solves the feature matching problem while reducing the computation of the subsequent feature detection and speeding up processing.

In the edge detection step, the edge contours of the visible light image and the infrared image are extracted with the Sobel operator; the quality of the edge detection result directly affects the subsequent image analysis.

Step 3.2, extracting thermal imaging feature points and thermal imaging local feature descriptors from the edge contour of the thermal imaging picture by using the speeded-up robust features (SURF) algorithm to form a thermal imaging feature point set, and extracting visible light imaging feature points and visible light imaging local feature descriptors from the edge contour of the visible light imaging picture to form a visible light imaging feature point set; wherein each local feature descriptor is a 64-dimensional feature vector.

The SURF algorithm is a local feature description operator that remains invariant to image scaling, rotation, and even affine transformations. It mainly comprises scale-space extremum detection, key point localization, orientation assignment and key point description. SURF uses box filters to approximate the Gaussian filtering in the computation of the Hessian matrix, which greatly increases the image processing speed.

Moreover, after the edge detection is added, the number of key points is obviously increased, which means that the matching process has more choices and is more robust. More importantly, the number of key points in the visible image is significantly greater than in the infrared image, which also demonstrates the advantage of using visible images rather than thermographic images for object detection.

Step 3.3, carrying out preliminary matching on the thermal imaging local feature descriptor and the visible light imaging local feature descriptor by using a k-dimensional-tree algorithm and a k-nearest neighbor algorithm to obtain a plurality of feature point pairs;

Feature matching compares feature descriptors and selects the most similar feature points from the key point sets of the two pictures as key point pairs. The SURF descriptor is a 64-dimensional feature vector, and the k-dimensional tree (KD-tree) and k-nearest neighbor (k-NN) algorithms are often combined with SURF to achieve preliminary feature matching.

Further, in step 3.4, the stable characteristic point pair includes a pixel point position coordinate (x, y) in the thermal imaging picture and a pixel point position coordinate (x ', y') in the visible light imaging picture.

Applying the KD-tree and k-NN algorithms to the detected feature descriptors yields preliminary matching pairs, but mismatched pairs are inevitable. A random sample consensus (RANSAC) algorithm is therefore executed to filter out the mismatched pairs and screen out the stable characteristic point pairs used to solve the affine transformation.
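The RANSAC screening of matched pairs can be sketched as follows. This is an illustrative sketch under stated assumptions, not the embodiment's implementation: it assumes a general six-parameter affine model, solves it exactly from three randomly sampled pairs with Cramer's rule, and keeps the model with the most pairs whose reprojection error is within a pixel tolerance. The point coordinates, the tolerance, the iteration count and all function names are hypothetical.

```python
import random


def solve_affine(src, dst):
    """Exact affine (a, b, tx, c, d, ty) from 3 point pairs, None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None

    def row(v1, v2, v3):  # Cramer's rule for one output coordinate
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        t = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, t

    return row(*(p[0] for p in dst)) + row(*(p[1] for p in dst))


def apply_affine(m, pt):
    a, b, tx, c, d, ty = m
    return a * pt[0] + b * pt[1] + tx, c * pt[0] + d * pt[1] + ty


def ransac_affine(src, dst, iters=200, tol=1.0, seed=0):
    """Return (best model, indices of the stable point pairs)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        idx = rng.sample(range(len(src)), 3)
        model = solve_affine([src[i] for i in idx], [dst[i] for i in idx])
        if model is None:
            continue
        inliers = []
        for i, (p, q) in enumerate(zip(src, dst)):
            px, py = apply_affine(model, p)
            if (px - q[0]) ** 2 + (py - q[1]) ** 2 <= tol ** 2:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Six consistent pairs generated from a known transform plus two mismatches.
true_model = (1.2, 0.1, 10.0, -0.1, 1.2, 5.0)
thermal_pts = [(0, 0), (100, 0), (0, 100), (100, 100), (50, 20), (20, 70),
               (30, 30), (80, 60)]
visible_pts = [apply_affine(true_model, p) for p in thermal_pts[:6]]
visible_pts += [(200.0, 200.0), (0.0, 0.0)]  # deliberately wrong matches

model, stable = ransac_affine(thermal_pts, visible_pts)
print(stable)
```

The two deliberately wrong matches are rejected and the six consistent pairs are retained as the stable characteristic point pairs.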

Step 4, registering the thermal imaging picture and the visible light imaging picture by using affine transformation according to the stable characteristic point pairs, namely establishing the mapping relation between the pixel points of the thermal imaging picture and the pixel points of the visible light imaging picture.

Translation, rotation, scaling and similar differences exist between the pictures, all of which are affine transformations. The registration of the two images from different sources is therefore completed by solving an affine transformation matrix. Affine transformation is a basic and widely applied linear transformation model.

Specifically, step 4 includes:

In the formula,

t_x and t_y respectively represent the horizontal translation and the vertical translation of an object relative to the visible light image caused by the difference between the acquisition devices,

The affine transformation model is solved to obtain the mapping relation between the pixel point temperature values of the thermal imaging picture and the pixel point coordinates of the visible light imaging picture.

The final goal of image registration is to establish a pixel mapping relationship. If the positions of the visible light camera and the thermal imager are unchanged, the geometric transformation relation is always applicable. Therefore, the registration operation is not required to be carried out on each frame, and the real-time performance of the equipment state evaluation is not influenced.
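Once the affine matrix has been solved, it can be stored and reused for every frame. The sketch below maps a detected box from visible light coordinates to thermal coordinates with a fixed 2 × 3 matrix. The matrix values, the mapping direction and the function names are hypothetical assumptions chosen for illustration; they are not values from the embodiment.

```python
def map_point(m, x, y):
    """Apply a 2x3 affine matrix ((a, b, tx), (c, d, ty)) to one point."""
    (a, b, tx), (c, d, ty) = m
    return a * x + b * y + tx, c * x + d * y + ty


def map_box(m, x0, y0, w, h):
    """Map a center-format box and return axis-aligned bounds
    (x_min, y_min, x_max, y_max) in the target picture."""
    corners = [(x0 + sx * w / 2, y0 + sy * h / 2)
               for sx in (-1, 1) for sy in (-1, 1)]
    mapped = [map_point(m, x, y) for x, y in corners]
    xs = [p[0] for p in mapped]
    ys = [p[1] for p in mapped]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical visible-to-thermal matrix, solved once and then cached.
VISIBLE_TO_THERMAL = ((0.5, 0.0, -20.0), (0.0, 0.5, -10.0))

thermal_box = map_box(VISIBLE_TO_THERMAL, x0=200, y0=160, w=80, h=40)
print(thermal_box)
```

Mapping all four corners and taking their bounds keeps the result a valid rectangle even when the transform includes rotation.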

Step 5, positioning the pixel area of the power equipment in the visible light imaging picture to the temperature area of the power equipment in the thermal imaging picture according to the mapping relation, and extracting the target temperature area in the pixel area.

A thermal imaging picture is in essence a temperature matrix: the temperature value of each pixel point is generally calculated from the raw data acquired by the thermal imaging sensor using the data calculation method given in the technical manual. In this way, the temperature corresponding to each pixel point in the visible light picture can also be obtained.

The pixel area of the power equipment in the visible light imaging picture is represented by a pixel point coordinate matrix, the temperature area of the power equipment in the thermal imaging picture is represented by a pixel point temperature value matrix, and the corresponding relation between the pixel point coordinate matrix and the pixel point temperature value matrix is obtained by establishing the mapping relation between the pixel point temperature value of the thermal imaging picture and the pixel point coordinate of the visible light imaging picture.

Specifically, step 5 comprises:

Step 6, traversing the device temperatures in the target temperature area to obtain the maximum value T_max of the device temperature in the target temperature area as the hot spot temperature of the power equipment.
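The traversal of the target temperature area can be sketched as a search for the maximum inside the mapped region of the temperature matrix. The temperature values, the region bounds and the function name are hypothetical; bounds are treated as inclusive pixel indices and clipped to the matrix.

```python
def hot_spot(temps, x_min, y_min, x_max, y_max):
    """Return (T_max, x, y) inside the region, or None if the region is empty.
    `temps` is a temperature matrix indexed as temps[row][col]."""
    rows, cols = len(temps), len(temps[0])
    r0, r1 = max(0, y_min), min(rows - 1, y_max)
    c0, c1 = max(0, x_min), min(cols - 1, x_max)
    best = None
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            if best is None or temps[r][c] > best[0]:
                best = (temps[r][c], c, r)
    return best


# Hypothetical temperature matrix in degrees Celsius.
temps = [
    [21.0, 22.5, 23.0, 95.0, 24.0],
    [21.5, 35.0, 41.2, 30.0, 23.5],
    [22.0, 38.4, 57.8, 33.1, 23.0],
    [21.0, 24.0, 25.5, 24.2, 22.0],
]
device_hot_spot = hot_spot(temps, x_min=1, y_min=1, x_max=3, y_max=2)
image_max = max(max(row) for row in temps)
print(device_hot_spot, image_max)
```

In this example the hottest pixel of the whole picture (95.0) lies outside the equipment region, so the hot spot reported for the device is 57.8 rather than the picture-wide maximum.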

In the preferred embodiment of the present invention, the hot spot temperatures in 3 scenes are detailed in table 3.

Table 3 Hot spot temperature comparison in 3 scenes

Step 7, comparing the hot spot temperature with a set hot spot temperature limit; if the hot spot temperature is greater than or equal to the limit, the power equipment is determined to be in an abnormal state, a temperature alarm is issued, and staff are prompted to check in time.

Specifically, in step 7, the improved single-stage detector model further outputs the type of the electrical equipment, and determines different hotspot temperature limits according to different types of the electrical equipment.

The abnormal state of the power equipment comprises a general fault, a major fault and an emergency fault, and the faults of different levels respectively correspond to different thresholds.

In the preferred embodiment of the invention, the hot spot temperature limit is set according to the requirements of the State Grid safety regulations.
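The type-specific judgment can be sketched as a lookup of per-type limits followed by a comparison from the most severe level down. The device type names and every threshold value below are hypothetical placeholders for illustration only; real limits would come from the applicable safety regulations and defect grade standard.

```python
# Hypothetical (general, major, emergency) limits in degrees Celsius.
HOT_SPOT_LIMITS = {
    "disconnector": (80.0, 105.0, 130.0),
    "bushing": (55.0, 80.0, 100.0),
}


def evaluate_state(device_type, t_max, limits=HOT_SPOT_LIMITS):
    """Classify the device state from its hot spot temperature."""
    general, major, emergency = limits[device_type]
    if t_max >= emergency:
        return "emergency fault"
    if t_max >= major:
        return "major fault"
    if t_max >= general:
        return "general fault"
    return "normal"


print(evaluate_state("disconnector", 57.8))
print(evaluate_state("bushing", 57.8))
```

The same hot spot temperature can be normal for one device type and a fault for another, which is why the device type output by the detector is needed before the comparison.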

In the preferred embodiment of the invention, the defect grade standard for hot spot temperature of power equipment used by power grid enterprises serves as the basis for state evaluation, and the operation state is judged in combination with the detected hot spot temperature of the specific equipment. When the hot spot temperature of the target equipment area differs greatly from the maximum temperature of the whole infrared image, that is, when the image maximum does not lie in the target power equipment area, the proposed algorithm is more targeted and accurate. The hot spot temperature of an individual piece of power equipment can thus be measured and its running state evaluated, improving the intelligence of inspection.

the picture area processing module comprises a pixel area processing unit and a temperature area processing unit; the pixel area processing unit is used for outputting a pixel area of the power equipment by taking a visible light imaging picture as input data based on the improved single-stage detector model; the temperature area processing unit is used for extracting a target temperature area of the electric power equipment in the thermal imaging picture from a pixel area of the electric power equipment in the visible light imaging picture according to a registration result provided by the picture registration module;

the image registration module is used for respectively extracting the edge outline of the power equipment from the thermal imaging image and the visible light imaging image; extracting stable characteristic point pairs between the thermal imaging picture and the visible light imaging picture according to the edge profile; registering the thermal imaging picture and the visible light imaging picture by using affine transformation according to the stable characteristic point pairs; the image registration module outputs a registration result which is input data of the temperature area processing unit;

Compared with the prior art, the method has the advantages that the image registration and the deep learning are simultaneously applied, the infrared thermal imaging image and the visible light image are combined with each other, the pixel mapping relation is established, the key attention area and the key attention point are found out, so that the inspection robot obtains the maximum temperature value of each target device during power inspection, specific analysis is carried out on specific devices, and the inspection intelligence level is improved.

The beneficial effects of the invention include:

1. Temperature analysis at the individual-device level is achieved: the temperature of each device is analyzed against the safe operating temperature threshold specified for that device in the power grid operation defect grade standard, which raises the intelligence of monitoring and avoids judging equipment solely by the highest temperature in the whole infrared image. This improves the sensitivity of equipment fault detection; refining the temperature to the individual-device level also makes equipment problems stand out, which markedly reduces the manual processing workload and improves working efficiency;

2. Interference resistance: the method makes full use of both the rich texture information of the visible light image, which makes the identification of devices such as isolating switches more robust to interference, and the per-pixel temperature information provided by the infrared image; combining the two data sources ensures reliable and accurate early warning of equipment faults;

The present applicant has described and illustrated embodiments of the present invention in detail with reference to the accompanying drawings, but it should be understood by those skilled in the art that the above embodiments are merely preferred embodiments of the present invention, and the detailed description is only for the purpose of helping the reader to better understand the spirit of the present invention, and not for limiting the scope of the present invention, and on the contrary, any improvement or modification made based on the spirit of the present invention should fall within the scope of the present invention.

Claims

1. A power equipment state detection method based on multi-source images is characterized in that,

the method comprises the following steps:

2. The multi-source image-based power equipment state detection method according to claim 1,

in the step 1, when the thermal imaging picture and the visible light picture of the power equipment are collected, the position, the shooting direction and the shooting angle of the picture shooting device are consistent.

3. The multi-source image-based power equipment state detection method according to claim 1,

in step 2, on the basis of a single-stage detector model YOLOv4 based on a convolutional neural network, improving the model, including:

step 2.1, improving the trunk structure CSPDarknet53 of the single-stage detector model YOLOv4: the residual network (ResNet) structures in the convolutional layers corresponding respectively to the preselected feature maps in the trunk structure CSPDarknet53 are replaced with dense modules from DenseNet; the dense module comprises dense blocks and transition layers which are alternately connected; the current dense block takes the feature information output by every preceding dense block as input, and the feature information output by the current dense block is input to every subsequent dense block; the feature information s_n output by the n-th dense block satisfies the following relation:

s_n = H_n[s₀, s₁, s₂, ..., s_{n-1}]

in the formula, s₀, s₁, s₂, ..., s_{n-1} respectively represent the feature information of the 0th layer, the 1st layer, the 2nd layer, …, and the (n-1)-th layer,

in the formula,

n represents the total number of candidate boxes,

X_i represents the i-th candidate box, i = 1, 2, …, n,

k represents the number of cluster centers;

the point corresponding to the maximum element of the set A = {Δavg_IoU_k | avg_IoU_k ≥ 70%, k ∈ [K₁, K₂]} is taken as the clustering result, and the k sizes corresponding to the clustering result are used as the power equipment pixel area candidate box sizes of the single-stage detector model YOLOv4; wherein Δavg_IoU_k = avg_IoU_k - avg_IoU_{k-1} represents the increment of avg_IoU_k, and the number of cluster centers k takes all integers from K₁ to K₂;

step 2.3, improving the neck structure PANet of the single-stage detector model YOLOv4, and fusing the feature information s_n into the network detection layer through lateral connection and downsampling;

in the formula,

loss_YOLOv4 represents the loss function of normal training,

g(γ) represents the regularization penalty function on the scale factor γ,

λ represents the balance factor;

4. The multi-source image-based power equipment state detection method according to claim 3,

in step 2.1, the sizes of the feature maps preselected in the trunk structure CSPDarknet53 are 19 × 19, 38 × 38, and 76 × 76, respectively.

5. The multi-source image-based power equipment state detection method according to claim 1,

in step 2, the improved single-stage detector model outputs the position coordinate (x₀, y₀) of the power equipment and the width w and height h of the power equipment pixel region; wherein the position coordinate (x₀, y₀) is the center point coordinate of the pixel region, and the pixel region is the rectangular area whose diagonal runs from point (x₀ - w/2, y₀ - h/2) to point (x₀ + w/2, y₀ + h/2).

6. The multi-source image-based power equipment state detection method according to claim 1,

the step 3 comprises the following steps:

7. The multi-source image-based power equipment state detection method according to claim 6,

in step 3.4, the stable characteristic point pair includes a pixel point position coordinate (x, y) in the thermal imaging picture and a pixel point position coordinate (x ', y') in the visible light imaging picture.

8. The multi-source image-based power equipment state detection method according to claim 6,

step 4 comprises the following steps:

in the formula,

t_x and t_y respectively represent the horizontal translation and the vertical translation of an object relative to the visible light image caused by the difference between the acquisition devices;

9. The multi-source image-based power equipment state detection method according to claim 8,

in step 5,

step 5.1, according to the mapping relation, the pixel area of the power equipment in the visible light imaging picture is positioned to the temperature area of the power equipment in the thermal imaging picture, that is, the pixel area of the power equipment in the visible light imaging picture is converted into the temperature area T_{m×n} of the power equipment in the thermal imaging picture, which satisfies the following relational expression:

10. The multi-source image-based power equipment state detection method according to claim 1,

in step 7, the improved single-stage detector model further outputs the types of the electrical equipment, and different hotspot temperature limit values are determined according to different types of the electrical equipment.

11. A multi-source image-based power equipment state detection system realized by using the multi-source image-based power equipment state detection method of any one of claims 1 to 10,

the system comprises: the system comprises an image acquisition module, an image area processing module, an image registration module, a hot spot temperature detection module and an equipment state early warning module;