CN116468894A

CN116468894A - Distance self-adaptive mask generation method for supervised learning of lithium battery pole piece

Info

Publication number: CN116468894A
Application number: CN202310459680.2A
Authority: CN
Inventors: 左嘉铭; 姜南; 庞有伟; 赵骁骐
Original assignee: Wuxi Yuangong Sanqian Technology Co ltd
Current assignee: Wuxi Yuangong Sanqian Technology Co ltd
Priority date: 2023-04-26
Filing date: 2023-04-26
Publication date: 2023-07-21

Abstract

A distance self-adaptive mask generation method for supervised learning of lithium battery pole pieces is disclosed. First, an acquired X-ray image of a lithium battery to be detected is divided into a sequence of image blocks. Each image block in the sequence is then passed through a ViT model to obtain a plurality of image block context feature vectors, which are arranged into a matrix to obtain a decoding feature matrix. Feature distribution optimization is performed on the decoding feature matrix to obtain an optimized decoding feature matrix, and image semantic segmentation is performed on the optimized decoding feature matrix to obtain an image semantic segmentation result. Finally, the pole piece spacing is determined based on the image semantic segmentation result, and a distance self-adaptive mask value is set based on the pole piece spacing. In this way, the learning difficulty can be reduced.

Description

Distance self-adaptive mask generation method for supervised learning of lithium battery pole piece

Technical Field

The application relates to the field of intelligent mask generation, and more particularly, to a distance adaptive mask generation method for supervised learning of lithium battery pole pieces.

Background

Lithium batteries are among the most widely used batteries at present, with applications in mobile electronic equipment, electric vehicles, energy storage, and other fields. Quality inspection is a very important step in lithium battery production. The pole piece is an important component of a lithium battery, and detecting it accurately helps ensure the quality and performance of the battery. The traditional way of inspecting lithium battery pole pieces relies mainly on X-ray equipment to acquire X-ray images of the battery, after which the images are segmented manually to locate the pole pieces. This approach suffers from heavy manual intervention, low efficiency, and low precision. With the development of deep learning, pole piece detection methods based on deep learning offer automation, high efficiency, and high precision, and have become a current research hotspot.

However, deep-learning-based detection of lithium battery pole pieces also has problems. For example, existing schemes generate the image mask at a fixed scale, but a fixed-threshold strategy cannot adapt simultaneously to X-ray images with different magnification ratios: the mask generated for pole pieces with a smaller diameter and a larger spacing is small, while the masks for pole pieces with a smaller diameter and a closer spacing become very dense or even stick together. How to generate a learnable mask label, and how to balance the relationship between dense and sparse pole pieces, therefore becomes a challenge.

Therefore, an optimized distance adaptive mask generation scheme for lithium battery pole piece supervised learning is desired.

Disclosure of Invention

The present application has been made in order to solve the above technical problems. Embodiments of the application provide a distance self-adaptive mask generation method for supervised learning of lithium battery pole pieces. First, an acquired X-ray image of a lithium battery to be detected is divided into a sequence of image blocks. Each image block in the sequence is then passed through a ViT model to obtain a plurality of image block context feature vectors, which are arranged into a matrix to obtain a decoding feature matrix. Feature distribution optimization is performed on the decoding feature matrix to obtain an optimized decoding feature matrix, and image semantic segmentation is performed on the optimized decoding feature matrix to obtain an image semantic segmentation result. Finally, the pole piece spacing is determined based on the image semantic segmentation result, and a distance self-adaptive mask value is set based on the pole piece spacing. In this way, the learning difficulty can be reduced.

According to one aspect of the application, a distance adaptive mask generation method for supervised learning of a lithium battery pole piece is provided, which comprises the following steps:

Acquiring an X-ray image of a lithium battery to be detected;

performing image block division on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks;

passing each image block in the sequence of image blocks through a ViT model containing an embedding layer to obtain a plurality of image block context feature vectors;

arranging the context feature vectors of the plurality of image blocks in a matrix manner to obtain a decoding feature matrix;

performing feature distribution optimization on the decoding feature matrix to obtain an optimized decoding feature matrix;

performing image semantic segmentation on the optimized decoding feature matrix to obtain an image semantic segmentation result;

determining the distance between pole pieces based on the image semantic segmentation result; and

setting a distance self-adaptive mask value based on the pole piece spacing.

In the above distance adaptive mask generation method for supervised learning of lithium battery pole pieces, performing image block division on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks includes:

dividing the X-ray image of the lithium battery to be detected into a sequence of image blocks along the width direction of the X-ray image of the lithium battery to be detected.
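The division along the width direction can be illustrated with a minimal Python sketch. It assumes the image is a 2-D grid of pixel values held as a list of rows, that dividing "along the width direction" means cutting the image into equal-width vertical strips, and that the width is divisible by the number of blocks. The function name `split_into_blocks` and the parameter `num_blocks` are hypothetical and do not appear in the application.

```python
def split_into_blocks(image, num_blocks):
    """Split a 2-D image (list of rows) into equal-width vertical strips.

    Assumption: the image width is divisible by num_blocks.
    """
    width = len(image[0])
    if width % num_blocks != 0:
        raise ValueError("image width must be divisible by num_blocks")
    block_width = width // num_blocks
    blocks = []
    for b in range(num_blocks):
        start = b * block_width
        # Each block keeps the full height and a slice of the columns.
        blocks.append([row[start:start + block_width] for row in image])
    return blocks


# Toy 4x8 "image" whose pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(4)]
blocks = split_into_blocks(image, num_blocks=4)
print(len(blocks), "blocks of size", len(blocks[0]), "x", len(blocks[0][0]))
```

Plain lists are used only to keep the sketch dependency-free; an array library would normally be used for real X-ray images.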

In the above distance adaptive mask generation method for supervised learning of lithium battery pole pieces, passing each image block in the sequence of image blocks through a ViT model containing an embedding layer to obtain a plurality of image block context feature vectors includes:

using the embedding layer of the ViT model to perform embedding encoding on each image block in the sequence of image blocks to obtain a sequence of image block embedding vectors; and

performing global context semantic encoding on the sequence of image block embedding vectors using a context encoder of the ViT model to obtain the plurality of image block context feature vectors.

In the above distance adaptive mask generation method for supervised learning of lithium battery pole pieces, using the embedding layer of the ViT model to perform embedding encoding on each image block in the sequence of image blocks to obtain a sequence of image block embedding vectors includes:

flattening each image block in the sequence of image blocks into a one-dimensional pixel input vector to obtain a plurality of one-dimensional pixel input vectors; and

performing full-connection encoding on each one-dimensional pixel input vector in the plurality of one-dimensional pixel input vectors using the embedding layer of the ViT model to obtain the sequence of image block embedding vectors.
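The two embedding sub-steps (flattening a block, then full-connection encoding) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the applicant's implementation: the weights are random stand-ins for a learned embedding matrix, and the names `flatten_block`, `make_embedding_layer`, `embed`, and the embedding dimension of 4 are hypothetical.

```python
import random


def flatten_block(block):
    """Arrange all pixel values of a 2-D block into one 1-D vector (row-major)."""
    return [float(p) for row in block for p in row]


def make_embedding_layer(in_dim, embed_dim, seed=0):
    """Create a fully connected layer; random weights stand in for learned ones."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(embed_dim)]
    bias = [0.0] * embed_dim
    return weights, bias


def embed(vector, weights, bias):
    """Full-connection encoding: output[j] = sum_i weights[j][i] * vector[i] + bias[j]."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


# Two toy 2x3 image blocks, embedded into 4-dimensional vectors.
blocks = [[[1, 2, 3], [4, 5, 6]], [[6, 5, 4], [3, 2, 1]]]
pixel_vectors = [flatten_block(b) for b in blocks]
weights, bias = make_embedding_layer(in_dim=6, embed_dim=4)
embeddings = [embed(v, weights, bias) for v in pixel_vectors]
print(len(embeddings), "embedding vectors of length", len(embeddings[0]))
```

The same layer is applied to every block, so all blocks share one embedding matrix.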

In the above distance adaptive mask generation method for supervised learning of lithium battery pole pieces, performing feature distribution optimization on the decoding feature matrix to obtain an optimized decoding feature matrix includes:

Performing feature affinity space affine learning optimization on the decoding feature matrix by using the following optimization formula to obtain the optimized decoding feature matrix;

wherein, the optimization formula is:

wherein $M$ represents the decoding feature matrix, $\|M\|_2$ represents the two-norm of the decoding feature matrix, $\|M\|_*$ represents the nuclear norm of the decoding feature matrix, $n$ is the scale of the decoding feature matrix, $\log$ represents the base-2 logarithm, $\exp(\cdot)$ represents the exponential operation on a matrix, that is, computing the natural exponential function raised to the power of the feature value at each position in the matrix, the multiplication is position-wise (element-wise), and $M'$ represents the optimized decoding feature matrix.

In the above method for generating a distance adaptive mask for supervised learning of lithium battery pole pieces, setting a distance adaptive mask value based on the pole piece spacing includes:

the distance adaptive mask value is 0.3 times the pole piece spacing.

Compared with the prior art, the distance self-adaptive mask generation method for supervised learning of lithium battery pole pieces provided by the application first divides an acquired X-ray image of a lithium battery to be detected into a sequence of image blocks. Each image block in the sequence is then passed through a ViT model to obtain a plurality of image block context feature vectors, which are arranged into a matrix to obtain a decoding feature matrix. Feature distribution optimization is performed on the decoding feature matrix to obtain an optimized decoding feature matrix, and image semantic segmentation is performed on the optimized decoding feature matrix to obtain an image semantic segmentation result. Finally, the pole piece spacing is determined based on the image semantic segmentation result, and a distance self-adaptive mask value is set based on the pole piece spacing. In this way, the learning difficulty can be reduced.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. The following drawings are not intended to be drawn to scale, with emphasis instead being placed upon illustrating the principles of the present application.

Fig. 1 is a graph of mask generation effects under different policies and parameters according to an embodiment of the present application.

Fig. 2 is an application scenario diagram of a distance adaptive mask generation method for supervised learning of lithium battery pole pieces according to an embodiment of the application.

Fig. 3 is a flowchart of a distance adaptive mask generation method for supervised learning of lithium battery pole pieces according to an embodiment of the present application.

Fig. 4 is a schematic architecture diagram of a distance adaptive mask generating method for supervised learning of lithium battery pole pieces according to an embodiment of the present application.

Fig. 5 is a flowchart of substep S130 of a distance adaptive mask generation method for lithium battery pole piece supervised learning according to an embodiment of the present application.

Fig. 6 is a flowchart of substep S131 of the distance adaptive mask generation method for lithium battery pole piece supervised learning according to an embodiment of the present application.

Fig. 7 is a block diagram of a distance adaptive mask generation system for lithium battery pole piece supervised learning according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some, but not all embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present application without making any inventive effort, are also within the scope of the present application.

As used in this application and in the claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may also include other steps or elements.

Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative, and different aspects of the systems and methods may use different modules.

Flowcharts are used in this application to describe the operations performed by systems according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously, as desired. Other operations may also be added to these processes, or one or more operations may be removed from them.

Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application and not all of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.

As described above, the traditional detection mode of the pole piece of the lithium battery mainly relies on X-ray equipment to collect X-ray images of the lithium battery, and then the images are manually divided, so that the positioning detection of the pole piece is realized. The method has the problems of more manual intervention, low efficiency, low precision and the like. With the development of the deep learning technology, the lithium battery pole piece detection method based on the deep learning has the advantages of automation, high efficiency, high precision and the like, and becomes a current research hot spot. However, the lithium battery pole piece detection method based on deep learning has some problems, and how to generate a learnable mask label and how to balance the relationship between the dense pole piece and the sparse pole piece becomes a technical problem.

In the actual quality inspection of lithium batteries, the task generally includes counting the pole pieces and determining their positions. The existing detection mode relies on X-ray equipment to acquire X-ray images of the lithium battery; after image enhancement, the enhanced images are inspected by eye and segmented manually. The development of deep learning offers an intelligent solution for detecting lithium battery pole pieces. Specifically, in the technical scheme of the application, the pole piece detection task is modeled as an image segmentation task, and for image segmentation based on deep learning, a learnable mask label is the basis of the whole training process. On this basis, a distance self-adaptive mask generation strategy is proposed by observing the attributes of the pole pieces. The strategy balances the relationship between dense and sparse pole pieces, provides a larger mask area while keeping individual pole pieces distinguishable as far as possible, and thereby reduces the learning difficulty.

Specifically, in the technical scheme of the application, generating a trainable mask is the basis for building the image segmentation model based on deep learning. A straightforward idea, the fixed-scale mask, is considered first. As shown in Fig. 1, const-1, const-3, and const-5 denote the mask generation results obtained by directly using diameters of 1, 3, and 5 pixels, respectively. It can be seen that the fixed-threshold strategy cannot adapt simultaneously to X-ray images with different magnification ratios: the masks generated for pole pieces with smaller diameters at larger distances are small, whereas the masks for pole pieces with smaller diameters at closer distances are very dense or even stick together. On this basis, the technical scheme of the application proposes a mask generation strategy that adapts to the spacing between lithium battery pole pieces. As shown in Fig. 1, ada-0.1, ada-0.3, and ada-0.5 denote the generation strategies that use the pole piece spacing multiplied by 0.1, 0.3, and 0.5, respectively, as the radius. It can be seen that the mask automatically shrinks for dense pole pieces and automatically expands for sparse pole pieces. Moreover, since sparse pole pieces are harder to find and detect, a larger mask size also indicates the importance of the current pole piece, so that it receives more attention when the loss function value is calculated during training. Experiments found that ada-0.3 works best.
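The contrast between the fixed-scale and the distance-adaptive strategies can be sketched in Python. The sketch works on 1-D pole piece centre positions and assumes that the "pole piece spacing" of a pole piece is the distance to its nearest neighbour; the application does not spell out this detail, so that choice, like the names `fixed_radii` and `adaptive_radii`, is an assumption made for illustration.

```python
def fixed_radii(centers, diameter):
    """const-k strategy: every pole piece gets the same radius (diameter / 2)."""
    return [diameter / 2.0 for _ in centers]


def adaptive_radii(centers, ratio=0.3):
    """ada-r strategy: radius = ratio * spacing to the nearest neighbour.

    Assumption: spacing is taken as the nearest-neighbour distance.
    The radii are returned in ascending order of centre position.
    """
    if len(centers) < 2:
        raise ValueError("at least two pole pieces are needed to define a spacing")
    ordered = sorted(centers)
    radii = []
    for i, c in enumerate(ordered):
        gaps = []
        if i > 0:
            gaps.append(c - ordered[i - 1])
        if i < len(ordered) - 1:
            gaps.append(ordered[i + 1] - c)
        radii.append(ratio * min(gaps))
    return radii


# Three dense pole pieces followed by two sparse ones.
centers = [10, 14, 18, 40, 70]
const_3 = fixed_radii(centers, diameter=3)     # same radius everywhere
ada_03 = adaptive_radii(centers, ratio=0.3)    # approximately [1.2, 1.2, 1.2, 6.6, 9.0]
print(const_3)
print(ada_03)
```

With the adaptive strategy the dense pole pieces receive small masks that do not touch, while the sparse ones receive larger masks, which is the behaviour described for ada-0.1, ada-0.3, and ada-0.5.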

Specifically, in the technical scheme of the application, an X-ray image of the lithium battery to be detected is first acquired. The detail features of the lithium battery pole pieces in the X-ray image are small-scale, fine-grained information. To improve the expression of these features, and thereby the accuracy of pole piece detection and of evaluating the importance of each pole piece, the X-ray image is divided into image blocks. Specifically, the X-ray image of the lithium battery to be detected is divided into a sequence of image blocks along the width direction of the image. It should be appreciated that each image block is smaller than the original image, so the small-sized pole piece details in the X-ray image are no longer small-sized objects within each image block, which facilitates the subsequent detection of the positions between pole pieces and of the importance of the pole pieces.

Each image block in the sequence of image blocks is then passed through a ViT model containing an embedding layer to obtain a plurality of image block context feature vectors. The sequence of image blocks is first input to the embedding layer to obtain a sequence of image block embedding vectors; here, the embedding layer linearly projects each image block into a one-dimensional embedding vector by means of a learnable embedding matrix. The embedding is implemented by first arranging the pixel values at all pixel positions of each image block into a one-dimensional vector, and then applying full-connection encoding to that vector with a fully connected layer.

Further, each image block in the sequence is image data, and the implicit feature information about the lithium battery pole pieces in each image block is correlated across the whole image. A convolutional neural network (CNN) performs well at extracting implicit image features, but because of the inherent limitations of the convolution operation, a pure CNN method has difficulty learning explicit global and long-range semantic interactions. Therefore, in the technical scheme of the application, the sequence of image block embedding vectors is encoded by the ViT model to extract the implicit context semantic association features of each image block with respect to the lithium battery pole pieces, thereby obtaining a plurality of image block context feature vectors. It should be understood that ViT, like a Transformer, can process the image blocks directly through a self-attention mechanism to extract the implicit context semantic association feature information about the lithium battery pole pieces in each image block, which benefits the detection of pole piece positions and of pole piece importance.
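How a self-attention mechanism lets every image block draw on all other blocks can be illustrated with a minimal single-head sketch in Python. It is deliberately simplified and is not the ViT model of the application: the query, key, and value projections are taken as the identity, and positional embeddings, multiple heads, layer normalization, and the feed-forward sublayer are omitted. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Single-head scaled dot-product self-attention.

    Assumption: identity query/key/value projections, for illustration only.
    Each output vector is a weighted average of all input vectors.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in embeddings:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in embeddings]
        weights = softmax(scores)
        outputs.append([sum(w * value[d] for w, value in zip(weights, embeddings))
                        for d in range(dim)])
    return outputs


# Two toy block embeddings; each output mixes information from both blocks.
block_embeddings = [[1.0, 0.0], [0.0, 1.0]]
context_vectors = self_attention(block_embeddings)
print(context_vectors)
```

Because every output is computed from all inputs at once, global interactions between blocks are available in a single layer, in contrast to the local receptive field of a convolution.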

And then, after obtaining the implicit context semantic association characteristic information about the lithium battery pole piece in each image block, integrating the characteristic information in each image block into an image global characteristic representation, namely, arranging the context characteristic vectors of the plurality of image blocks in a matrix to obtain a decoding characteristic matrix. Then, in order to perform positioning detection on the lithium battery pole piece, image semantic segmentation is required to be performed on the decoding feature matrix, so that corresponding masking operation is performed after the position area of the lithium battery pole piece is detected, and an image semantic segmentation result is obtained. In this way, based on the image semantic segmentation result, the pole piece distance is determined, and the distance adaptive mask value is set, so that the mask value is reduced in a self-adaptive manner for dense pole pieces and the mask value is increased in a self-adaptive manner for sparse pole pieces by utilizing a mask generation strategy of the distance adaptation. Also, since the sparse pole pieces are more difficult to find and detect, providing a larger mask size also represents the importance of the current pole piece, giving more attention in training to calculate the loss function value. In particular, here, the distance adaptive mask value is 0.3 times the pole piece pitch.
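The step from a segmentation result to a distance-adaptive mask value can be sketched in Python. The sketch assumes the segmentation result is available as a single row of binary labels (1 for pole piece, 0 for background) and that the pole piece spacing is measured between the centres of consecutive runs of 1s; both assumptions, and the names `run_centers` and `spacings`, are illustrative and not taken from the application.

```python
def run_centers(labels):
    """Return the centre index of every run of consecutive 1s in a label row."""
    centers = []
    start = None
    for i, value in enumerate(labels):
        if value == 1 and start is None:
            start = i                               # a run begins
        elif value != 1 and start is not None:
            centers.append((start + i - 1) / 2.0)   # run ended at i - 1
            start = None
    if start is not None:                           # run reaches the row end
        centers.append((start + len(labels) - 1) / 2.0)
    return centers


def spacings(centers):
    """Distances between consecutive pole piece centres."""
    return [b - a for a, b in zip(centers, centers[1:])]


MASK_RATIO = 0.3  # the mask value is 0.3 times the pole piece spacing

row = [0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
centers = run_centers(row)      # [1.5, 5.5, 13.0]
pitches = spacings(centers)     # [4.0, 7.5]
mask_values = [MASK_RATIO * p for p in pitches]
print(centers, pitches, mask_values)
```

The closer pair of pole pieces yields the smaller mask value and the wider pair the larger one, in line with the adaptive behaviour described above.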

In particular, in the technical solution of the present application, for the decoding feature matrix obtained by arranging the context feature vectors of the plurality of image blocks, since the context feature vectors of the plurality of image blocks are obtained when the context correlation encoding of the image semantics is performed on each image block by the ViT model, and the image semantics of each image block has an explicit difference, the difference is distributed and diffused in the feature space by the ViT model, so that the feature distribution of the decoding feature matrix itself is more discrete, that is, the correlation between the image semantics of each part of the decoding feature matrix is insufficient, which affects the accuracy of image semantic segmentation of the decoding feature matrix.

Based on this, in the technical solution of the present application, it is preferable to first perform feature affinity space affine learning on the decoding feature matrix to perform optimization, expressed as:

wherein $M$ is a diagonal matrix obtained by performing a linear transformation on the decoding feature matrix, $\|M\|_2$ represents the two-norm of the matrix, i.e. the maximum eigenvalue of $M^{T}M$, $\|M\|_*$ represents the nuclear norm of the matrix, i.e. the sum of the eigenvalues of the matrix, $n$ is the scale of the matrix, i.e. the width times the height, and $M'$ is the optimized diagonal matrix.
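For reference, the standard textbook definitions of the two norms are written out below. This is not the optimization formula itself, which is not reproduced in this text. Under the standard definitions the two-norm is the largest singular value, i.e. the square root of the largest eigenvalue of $M^{T}M$, and for a diagonal matrix with non-negative entries the nuclear norm reduces to the sum of its eigenvalues. The symbols $\sigma_i$, $\lambda_{\max}$, and $d_i$ are introduced here for illustration only.

```latex
\|M\|_2 = \sigma_{\max}(M) = \sqrt{\lambda_{\max}\!\left(M^{T} M\right)},
\qquad
\|M\|_* = \sum_{i} \sigma_i(M)

% Special case M = \mathrm{diag}(d_1, \dots, d_k) with d_i \ge 0:
\|M\|_2 = \max_i d_i,
\qquad
\|M\|_* = \sum_i d_i = \operatorname{tr}(M)
```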

Here, the feature affinity space affine learning performs affine migration based on spatial transformation with relatively low-resolution information characterization by performing detailed structured information expression in a low-dimensional eigensubspace on high-resolution information characterization in an image semantic distribution space of a diagonal matrix obtained by the decoding feature matrix transformation, so as to implement super-resolution (e.g., feature value-by-feature value or feature segment-by-feature segment) activation of the image semantic distribution at respective local parts of the global matrix feature distribution based on affinity (affine) dense simulation between the image semantic characterizations. And then, the optimized diagonal matrix M' is subjected to inverse linear transformation corresponding to the linear transformation to obtain an optimized decoding feature matrix, so that the relevance among the semantic distributions of each partial image of the decoding feature matrix can be improved, and the accuracy of image semantic segmentation of the decoding feature matrix is improved. Therefore, the image mask can be adaptively generated based on the actual distance and importance degree between the lithium battery pole pieces, so that the relationship between the dense pole pieces and the sparse pole pieces is balanced, the pole piece detection accuracy is improved, and the lithium battery quality detection efficiency and accuracy are optimized.

Fig. 2 is an application scenario diagram of a distance adaptive mask generation method for supervised learning of lithium battery pole pieces according to an embodiment of the application. As shown in fig. 2, in this application scenario, first, an X-ray image of a lithium battery to be detected (for example, D illustrated in fig. 2) is acquired, then, the X-ray image of the lithium battery to be detected is input into a server (for example, S illustrated in fig. 2) in which a distance adaptive mask generation algorithm for supervised learning of a lithium battery pole piece is deployed, where the server can process the X-ray image of the lithium battery to be detected using the distance adaptive mask generation algorithm for supervised learning of a lithium battery pole piece to obtain an image semantic segmentation result, then, a pole piece pitch is determined based on the image semantic segmentation result, and a distance adaptive mask value is set based on the pole piece pitch.

Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.

Fig. 3 is a flowchart of a distance adaptive mask generation method for supervised learning of lithium battery pole pieces according to an embodiment of the present application. As shown in Fig. 3, a distance adaptive mask generation method for supervised learning of lithium battery pole pieces according to an embodiment of the present application includes the steps of: S110, acquiring an X-ray image of a lithium battery to be detected; S120, performing image block division on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks; S130, passing each image block in the sequence of image blocks through a ViT model containing an embedding layer to obtain a plurality of image block context feature vectors; S140, arranging the plurality of image block context feature vectors into a matrix to obtain a decoding feature matrix; S150, performing feature distribution optimization on the decoding feature matrix to obtain an optimized decoding feature matrix; S160, performing image semantic segmentation on the optimized decoding feature matrix to obtain an image semantic segmentation result; S170, determining the pole piece spacing based on the image semantic segmentation result; and S180, setting a distance self-adaptive mask value based on the pole piece spacing.

Fig. 4 is a schematic architecture diagram of a distance adaptive mask generating method for supervised learning of lithium battery pole pieces according to an embodiment of the present application. As shown in fig. 4, in the network architecture, first, an X-ray image of a lithium battery to be detected is acquired; then, carrying out image block division on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks; then, each image block in the sequence of image blocks is respectively passed through a ViT model containing an embedded layer to obtain a plurality of image block context feature vectors; then, the context feature vectors of the image blocks are arranged in a matrix mode to obtain a decoding feature matrix; then, performing feature distribution optimization on the decoding feature matrix to obtain an optimized decoding feature matrix; then, performing image semantic segmentation on the optimized decoding feature matrix to obtain an image semantic segmentation result; then, determining the distance between pole pieces based on the image semantic segmentation result; finally, a distance adaptive mask value is set based on the pole piece spacing.

More specifically, in step S110, an X-ray image of the lithium battery to be detected is acquired. For example, an X-ray image of the lithium battery to be detected can be acquired by an X-ray device.

More specifically, in step S120, image block division is performed on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks. The detail features of the lithium battery pole pieces in the X-ray image are small-scale, fine-grained information; dividing the image into blocks improves the expression of these features and thereby the accuracy of pole piece detection and of evaluating pole piece importance.

It should be appreciated that the dimensions of each image block in the sequence of image blocks are reduced compared to the original image, and therefore, the hidden features of the small-sized lithium battery pole piece details in the X-ray image are no longer small-sized objects in each image block, so as to facilitate the subsequent inter-pole piece position detection and the importance level detection of the pole pieces.

Accordingly, in a specific example, performing image block division on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks includes: dividing the X-ray image of the lithium battery to be detected into a sequence of image blocks along the width direction of the X-ray image of the lithium battery to be detected.

More specifically, in step S130, each image block in the sequence of image blocks is passed through a ViT model containing an embedding layer to obtain a plurality of image block context feature vectors. The sequence of image blocks is first input to the embedding layer to obtain a sequence of image block embedding vectors; here, the embedding layer linearly projects each image block into a one-dimensional embedding vector by means of a learnable embedding matrix. The embedding is implemented by first arranging the pixel values at all pixel positions of each image block into a one-dimensional vector, and then applying full-connection encoding to that vector with a fully connected layer. It should be understood that ViT, like a Transformer, can process the image blocks directly through a self-attention mechanism to extract the implicit context semantic association feature information about the lithium battery pole pieces in each image block, which benefits the detection of pole piece positions and of pole piece importance.

Accordingly, in one specific example, as shown in Fig. 5, passing each image block in the sequence of image blocks through a ViT model containing an embedding layer to obtain a plurality of image block context feature vectors includes: S131, using the embedding layer of the ViT model to perform embedding encoding on each image block in the sequence of image blocks to obtain a sequence of image block embedding vectors; and S132, performing global-based context semantic encoding on the sequence of image block embedding vectors using a context encoder of the ViT model to obtain the plurality of image block context feature vectors.

It should be appreciated that, by means of a context encoder, the relationship between a given word segment and the other word segments in a vector representation sequence can be analyzed to obtain the corresponding feature information. The context encoder aims to mine hidden patterns between contexts in a word sequence. Optionally, the encoder comprises: a CNN (Convolutional Neural Network), a Recursive NN (Recursive Neural Network), a Language Model, and the like. CNN-based methods extract local features well but handle the long-term dependency problem in sentences poorly, so encoders based on Bi-LSTM (bidirectional Long Short-Term Memory) are widely used. A Recursive NN processes a sentence as a tree structure rather than as a sequence and in theory has stronger representation capability, but it suffers from high sample annotation difficulty, vanishing gradients in deep networks, and difficulty of parallel computation, so it is seldom used in practice. The Transformer is a widely used network structure that combines characteristics of CNNs and RNNs (Recurrent Neural Networks), extracts global features well, and has a certain advantage over RNNs in parallel computation.
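To make the global context encoding of step S132 more concrete, the following is a minimal sketch of single-head scaled dot-product self-attention in NumPy. It is not the full ViT encoder of the application (there is no multi-head attention, layer normalization, feed-forward sublayer, or position embedding), and the randomly initialized weights `w_q`, `w_k`, and `w_v` are placeholders for parameters that would be learned.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(embeddings, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over block embeddings."""
    q = embeddings @ w_q  # queries, one row per image block
    k = embeddings @ w_k  # keys
    v = embeddings @ w_v  # values
    # Every block attends to every other block, giving global context.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
num_blocks, dim = 4, 8  # hypothetical sizes
embeddings = rng.normal(size=(num_blocks, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
context, weights = self_attention(embeddings, w_q, w_k, w_v)
```

Each row of `context` is a weighted mixture of information from all image blocks, which is the sense in which the resulting feature vectors are context-aware.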

Accordingly, in a specific example, as shown in Fig. 6, using the embedding layer of the ViT model to perform embedding encoding on each image block in the sequence of image blocks to obtain a sequence of image block embedding vectors includes: S1311, expanding each image block in the sequence of image blocks into a one-dimensional pixel input vector to obtain a plurality of one-dimensional pixel input vectors; and S1312, performing fully connected encoding on each of the plurality of one-dimensional pixel input vectors using the embedding layer of the ViT model to obtain the sequence of image block embedding vectors.
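Steps S1311 and S1312 can be sketched as a flatten followed by a single linear (fully connected) projection. The names `embed_blocks`, `weight`, `bias`, and `embed_dim` are assumptions made for illustration, and the random weights stand in for the learnable embedding matrix.

```python
import numpy as np


def embed_blocks(blocks, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Flatten each image block and project it with one fully connected layer."""
    # S1311: expand each block into a one-dimensional pixel input vector.
    flat = np.stack([block.reshape(-1) for block in blocks])
    # S1312: fully connected encoding, i.e. a learnable linear projection.
    return flat @ weight + bias


rng = np.random.default_rng(0)
blocks = [rng.random((4, 2)) for _ in range(4)]  # four 4x2 image blocks
embed_dim = 6  # hypothetical embedding size
weight = rng.normal(size=(4 * 2, embed_dim))
bias = np.zeros(embed_dim)
embeddings = embed_blocks(blocks, weight, bias)
```

The result has one row per image block, so it can be fed directly to a sequence encoder.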

More specifically, in step S140, the plurality of image block context feature vectors are arranged into a matrix to obtain a decoding feature matrix. After the implicit context semantic association feature information about the lithium battery pole pieces in each image block has been obtained, the feature information of the individual image blocks is integrated into a global feature representation of the image; that is, the plurality of image block context feature vectors are arranged into a matrix to obtain the decoding feature matrix.
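As a minimal sketch of the matrix arrangement, the context feature vectors can be stacked so that each row corresponds to one image block. The row-wise layout is an assumption; the application does not state whether the vectors form rows or columns, and the function name `arrange_as_matrix` is hypothetical.

```python
import numpy as np


def arrange_as_matrix(context_vectors) -> np.ndarray:
    """Stack equal-length 1-D feature vectors into a 2-D matrix, one per row."""
    shapes = {vector.shape for vector in context_vectors}
    if len(shapes) != 1:
        raise ValueError("all context feature vectors must have the same shape")
    return np.stack(context_vectors, axis=0)


# Four toy context feature vectors of length 3.
vectors = [np.full(3, float(i)) for i in range(4)]
decoding_matrix = arrange_as_matrix(vectors)
```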

More specifically, in step S150, the feature distribution of the decoded feature matrix is optimized to obtain an optimized decoded feature matrix.

In particular, in the technical solution of the present application, the decoding feature matrix is obtained by arranging the plurality of image block context feature vectors, and these vectors are obtained when the ViT model performs context-correlated encoding of the image semantics of each image block. Because the image semantics of the image blocks differ explicitly, the ViT model spreads this difference through the feature space, so that the feature distribution of the decoding feature matrix is rather discrete. That is, the correlation between the image semantics of the different parts of the decoding feature matrix is insufficient, which affects the accuracy of image semantic segmentation of the decoding feature matrix. For this reason, in the technical solution of the present application, feature affinity space affine learning is preferably first performed on the decoding feature matrix to optimize it.

Accordingly, in a specific example, performing feature distribution optimization on the decoding feature matrix to obtain an optimized decoding feature matrix includes: performing feature affinity space affine learning optimization on the decoding feature matrix by using the following optimization formula to obtain the optimized decoding feature matrix; wherein, the optimization formula is:

Here, the feature affinity space affine learning expresses the high-resolution information characterization, in the image semantic distribution space of the diagonal matrix obtained by transforming the decoding feature matrix, as detailed structured information in a low-dimensional eigen-subspace, and performs affine migration based on spatial transformation with the relatively low-resolution information characterization. In this way, super-resolution activation of the image semantic distribution is achieved at each local part of the global matrix feature distribution, based on dense affinity simulation between image semantic characterizations. The optimized diagonal matrix is then subjected to the inverse linear transformation corresponding to the linear transformation to obtain the optimized decoding feature matrix, so that the correlation among the image semantic distributions of the parts of the decoding feature matrix is improved, and with it the accuracy of image semantic segmentation of the decoding feature matrix. Therefore, the image mask can be generated adaptively based on the actual distance between the lithium battery pole pieces and their importance, so that the relationship between dense pole pieces and sparse pole pieces is balanced, the pole piece detection accuracy is improved, and the efficiency and accuracy of lithium battery quality detection are optimized.

More specifically, in step S160, image semantic segmentation is performed on the optimized decoding feature matrix to obtain an image semantic segmentation result. To locate the lithium battery pole pieces, image semantic segmentation must be performed on the optimized decoding feature matrix, so that the corresponding masking operation can be carried out once the position regions of the pole pieces have been detected; this yields the image semantic segmentation result.

More specifically, in step S170, the pole piece spacing is determined based on the image semantic segmentation result. Once the pole piece spacing has been determined, a distance adaptive mask value can be set using a distance adaptive mask generation strategy, which adaptively decreases the mask value for dense pole pieces and increases it for sparse pole pieces.
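One way to derive spacing from a segmentation result is sketched below. It assumes the segmentation result is a binary mask in which pole piece pixels are 1, and that a single row of the mask crosses all pole pieces; the application does not describe how the spacing is measured, so the run-center approach and the function names are illustrative assumptions.

```python
import numpy as np


def pole_piece_centers(row: np.ndarray) -> np.ndarray:
    """Return the center index of every consecutive run of 1s in a binary row."""
    # Pad with zeros so runs touching either border are detected as well.
    padded = np.concatenate(([0], row.astype(np.int8), [0]))
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)   # first index of each run
    ends = np.flatnonzero(change == -1)    # index just past each run
    return (starts + ends - 1) / 2.0


def pole_piece_spacing(row: np.ndarray) -> np.ndarray:
    """Distances between the centers of neighboring pole pieces."""
    return np.diff(pole_piece_centers(row))


# Toy segmentation row with three pole pieces of width 2.
row = np.array([0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0])
spacing = pole_piece_spacing(row)  # array([4., 6.])
```

In this toy row the first two pole pieces are closer together (spacing 4) than the last two (spacing 6), which is the dense-versus-sparse distinction the mask strategy relies on.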

More specifically, in step S180, a distance adaptive mask value is set based on the pole piece spacing. A larger mask size also reflects the importance of the current pole piece: sparse pole pieces are harder to find and detect, so they are given more attention during training when the loss function value is calculated.

Accordingly, in one specific example, setting a distance adaptive mask value based on the pole piece spacing includes: setting the distance adaptive mask value to 0.3 times the pole piece spacing.
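The 0.3 ratio can be applied per pole piece as in the following sketch. The ratio comes from the example above; the choice of the nearest-neighbor gap as the spacing of each pole piece is an assumption, since the application does not say which neighbor is used, and the function name `adaptive_mask_values` is hypothetical.

```python
import numpy as np

MASK_RATIO = 0.3  # ratio stated in the specific example


def adaptive_mask_values(centers, ratio: float = MASK_RATIO) -> np.ndarray:
    """Mask value per pole piece: ratio times the gap to its nearest neighbor."""
    centers = np.sort(np.asarray(centers, dtype=float))
    if centers.size < 2:
        raise ValueError("at least two pole pieces are needed to define spacing")
    gaps = np.diff(centers)
    # Assumption: each pole piece uses the smaller of its left and right gaps.
    left = np.concatenate(([np.inf], gaps))
    right = np.concatenate((gaps, [np.inf]))
    return ratio * np.minimum(left, right)


centers = [1.5, 5.5, 11.5]  # toy pole piece centers, in pixels
mask_values = adaptive_mask_values(centers)
```

With these toy centers the two closely spaced pole pieces receive a mask value of about 1.2 and the isolated one about 1.8, so the sparser pole piece gets the larger mask.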

Fig. 7 is a block diagram of a distance adaptive mask generation system 100 for lithium battery pole piece supervised learning according to an embodiment of the present application. As shown in fig. 7, a distance adaptive mask generation system 100 for supervised learning of lithium battery pole pieces according to an embodiment of the present application includes: an image acquisition module 110, configured to acquire an X-ray image of a lithium battery to be detected; the image block dividing module 120 is configured to perform image block division on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks; an encoding module 130, configured to pass each image block in the sequence of image blocks through a ViT model including an embedded layer to obtain a plurality of image block context feature vectors; a matrix arrangement module 140, configured to perform matrix arrangement on the plurality of image block context feature vectors to obtain a decoding feature matrix; the optimizing module 150 is configured to perform feature distribution optimization on the decoding feature matrix to obtain an optimized decoding feature matrix; the image semantic segmentation module 160 is configured to perform image semantic segmentation on the optimized decoded feature matrix to obtain an image semantic segmentation result; a distance determining module 170, configured to determine a pole piece distance based on the image semantic segmentation result; and a distance adaptive mask value setting module 180, configured to set a distance adaptive mask value based on the pole piece pitch.

In one example, in the distance adaptive mask generation system 100 for supervised learning of lithium battery pole pieces, the image block division module 120 is configured to: dividing the X-ray image of the lithium battery to be detected into a sequence of image blocks along the width direction of the X-ray image of the lithium battery to be detected.

In one example, in the distance adaptive mask generation system 100 for lithium battery pole piece supervised learning, the encoding module 130 is configured to: using the embedding layer of the ViT model to respectively carry out embedded coding on each image block in the sequence of the image blocks so as to obtain a sequence of image block embedded vectors; and performing global-based context semantic coding on the sequence of image block embedded vectors using a context encoder of the ViT model to obtain the plurality of image block context feature vectors.

In one example, in the above-mentioned distance adaptive mask generation system 100 for supervised learning of lithium battery pole pieces, the embedding layer of the ViT model is used to perform embedding encoding on each image block in the sequence of image blocks to obtain a sequence of image block embedding vectors, which includes: respectively expanding each image block in the sequence of image blocks into one-dimensional pixel input vectors to obtain a plurality of one-dimensional pixel input vectors; and performing full-connection encoding on each one-dimensional pixel input vector in the plurality of one-dimensional pixel input vectors by using an embedding layer of the ViT model to obtain a sequence of the image block embedding vectors.

In one example, in the above-described distance adaptive mask generation system 100 for supervised learning of lithium battery pole pieces, the optimization module 150 is configured to: performing feature affinity space affine learning optimization on the decoding feature matrix by using the following optimization formula to obtain the optimized decoding feature matrix; wherein, the optimization formula is:

In one example, in the above-described lithium battery pole piece supervised learning oriented distance adaptive mask generation system 100, the distance adaptive mask value is 0.3 times the pole piece pitch.

Here, it will be understood by those skilled in the art that the specific functions and operations of the respective modules in the above-described lithium battery pole-piece-supervised learning-oriented distance adaptive mask generation system 100 have been described in detail in the above description of the lithium battery pole-piece-supervised learning-oriented distance adaptive mask generation method with reference to fig. 2 to 6, and thus, repetitive descriptions thereof will be omitted.

As described above, the distance adaptive mask generation system 100 for lithium battery pole piece supervised learning according to the embodiment of the present application may be implemented in various wireless terminals, for example, a server or the like having a distance adaptive mask generation algorithm for lithium battery pole piece supervised learning. In one example, the lithium battery pole-piece-oriented supervised learning distance adaptive mask generation system 100 according to embodiments of the present application may be integrated into a wireless terminal as one software module and/or hardware module. For example, the lithium battery pole piece supervised learning-oriented distance adaptive mask generation system 100 may be a software module in the operating system of the wireless terminal, or may be an application developed for the wireless terminal; of course, the lithium battery pole piece supervised learning-oriented distance adaptive mask generation system 100 may also be one of a number of hardware modules of the wireless terminal.

Alternatively, in another example, the lithium battery pole piece supervised learning-oriented distance adaptive mask generation system 100 and the wireless terminal may also be separate devices, and the lithium battery pole piece supervised learning-oriented distance adaptive mask generation system 100 may be connected to the wireless terminal through a wired and/or wireless network and transmit interaction information in an agreed data format.

According to another aspect of the present application, there is also provided a non-volatile computer-readable storage medium having stored thereon computer-readable instructions which, when executed by a computer, can perform a method as described above.

Program portions of the technology may be considered to be "products" or "articles of manufacture" in the form of executable code and/or associated data, embodied in or carried by a computer-readable medium. A tangible, persistent storage medium may include any memory or storage used by a computer, processor, or similar device or related module, such as various semiconductor memories, tape drives, disk drives, or the like, capable of providing storage functionality for software.

All or a portion of the software may sometimes communicate over a network, such as the internet or another communication network. Such communication may load software from one computer device or processor to another. For example, software may be loaded from a server or host computer of the video object detection device onto a hardware platform of a computer environment, onto another computer environment implementing the system, or onto a system of similar function related to providing the information needed for object detection. Thus, another medium capable of carrying software elements may also be used as a physical connection between local devices, such as optical, electrical, or electromagnetic waves propagating through cables, optical cables, air, and the like. Physical media used for carrier waves, such as electrical, wireless, or optical media, may also be considered to be software-bearing media. Unless limited to a tangible "storage" medium, other terms used herein to refer to a computer or machine "readable medium" mean any medium that participates in the execution of any instructions by a processor.

This application uses specific words to describe embodiments of the application. Reference to "a first/second embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present application. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present application may be combined as suitable.

Furthermore, those skilled in the art will appreciate that the various aspects of the invention are illustrated and described in the context of a number of patentable categories or circumstances, including any novel and useful procedures, machines, products, or materials, or any novel and useful modifications thereof. Accordingly, aspects of the present application may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.) or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present application may take the form of a computer product, comprising computer-readable program code, embodied in one or more computer-readable media.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

The foregoing is illustrative of the present invention and is not to be construed as limiting thereof. Although a few exemplary embodiments of this invention have been described, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of this invention. Accordingly, all such modifications are intended to be included within the scope of this invention as defined in the following claims. It is to be understood that the foregoing is illustrative of the present invention and is not to be construed as limited to the specific embodiments disclosed, and that modifications to the disclosed embodiments, as well as other embodiments, are intended to be included within the scope of the appended claims. The invention is defined by the claims and their equivalents.

Claims

1. A distance self-adaptive mask generation method for supervised learning of a lithium battery pole piece is characterized by comprising the following steps:

acquiring an X-ray image of a lithium battery to be detected;

2. The method for generating the distance adaptive mask for supervised learning of the lithium battery pole pieces according to claim 1, wherein the image block division is performed on the X-ray image of the lithium battery to be detected to obtain a sequence of image blocks, and the method comprises the following steps:

3. The method for generating a distance adaptive mask for supervised learning of lithium battery pole pieces of claim 2, wherein passing each image block in the sequence of image blocks through a ViT model comprising an embedded layer to obtain a plurality of image block context feature vectors, respectively, comprises:

4. The method for generating a distance adaptive mask for supervised learning of lithium battery pole pieces according to claim 3, wherein the embedding layer of the ViT model is used to respectively perform embedding encoding on each image block in the sequence of image blocks to obtain a sequence of image block embedding vectors, and the method comprises the following steps:

5. The method for generating a distance adaptive mask for supervised learning of lithium battery pole pieces of claim 4, wherein performing feature distribution optimization on the decoded feature matrix to obtain an optimized decoded feature matrix comprises:

wherein, the optimization formula is:

wherein M represents the decoding feature matrix, ‖M‖₂ represents the two-norm of the decoding feature matrix, ‖M‖∗ represents the nuclear norm of the decoding feature matrix, n is the scale of the decoding feature matrix, log represents the logarithm to base 2, exp(·) represents the exponential operation of a matrix, that is, the calculation of the natural exponential function raised to the power of the feature value at each position in the matrix, multiplication is performed position by position, and M' represents the optimized decoding feature matrix.

6. The method for generating a distance adaptive mask for supervised learning of lithium battery pole pieces of claim 5, wherein setting a distance adaptive mask value based on the pole piece pitch comprises:

The distance adaptive mask value is 0.3 times the pole piece spacing.