PSI - Issue 64
Jie Wang et al. / Procedia Structural Integrity 64 (2024) 1326–1333
2.1. Dense matching of feature points

First, dense matching of feature points was performed on the region of interest (ROI) in two image frames to track the displacement of feature points between two moments. The global displacement field was then evaluated from the displacements of these discrete points and visualized as a displacement cloud map. Because the crack opens and closes, a discontinuity appeared in the displacement field around the crack.

Feature point detection methods such as SIFT by Lowe et al. (2004) typically yield sparse feature points. In experiments conducted on a pair of images with a resolution of 3840×2160 (over 8 million pixels in total), the number of matched point pairs obtained by SIFT and brute-force matching was less than 1% of the pixel count. In addition, false matches were observed. For steel surfaces with sparse textures, both the number and the accuracy of the matched feature points decrease dramatically. A certain quantity or density of matched point pairs is necessary to evaluate the displacement field accurately. In the DIC technique, distinct features are introduced by speckles, which effectively increase both the quantity and the accuracy of matches. However, the technique is generally complex to operate.

In this study, a dense matching neural network model called LoFTR by Sun et al. (2021) was explored. The model calculated the correlation of all pixels between an image pair and selected high-confidence point pairs as the matched point pairs. The LoFTR model employed a convolutional neural network (CNN) to extract feature maps. Two feature maps, at 1/8 and 1/2 of the input size, were generated for each input image. All the feature maps were then encoded by a Transformer module with linear attention. Afterward, the correlation of all pixels was computed between the two feature maps of 1/8 size. Non-maximum suppression and threshold filtering were applied to obtain the integer-pixel coordinates of the matched point pairs. For each pair of matched points, a small window was extracted from the feature map of 1/2 size and further encoded by the Transformer module to refine the matching precision. Finally, sub-pixel coordinates of the matched point pairs were obtained.

To yield more matched point pairs, the original LoFTR model was fine-tuned on the steel structure dataset. Matching results for a steel image pair (described in detail in Section 3) obtained with the original model and the fine-tuned model are shown in Fig. 1.
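The coarse matching step described above (all-pairs correlation on the 1/8-size feature maps, followed by filtering to integer-pixel matches) can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study. It assumes the descriptors have already been extracted and flattened to one row per feature-map cell, and it uses a dual-softmax confidence with a mutual-nearest-neighbour check, as in the published LoFTR design, as a stand-in for the non-maximum suppression and threshold filtering mentioned in the text. The function names `coarse_match` and `to_pixels`, the temperature and the threshold are hypothetical choices for illustration.

```python
import numpy as np


def _softmax(x, axis):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def coarse_match(feat_a, feat_b, temperature=0.1, threshold=0.2):
    """Match two sets of coarse descriptors.

    feat_a: (N, C) array, one descriptor per cell of image A's 1/8 map.
    feat_b: (M, C) array, same for image B.
    temperature, threshold: assumed values, not taken from the paper.
    Returns indices into A, indices into B, and the match confidences.
    """
    # Correlation of every cell in A with every cell in B.
    sim = feat_a @ feat_b.T / temperature
    # Dual softmax: a pair scores high only if it dominates both its row
    # and its column of the correlation matrix.
    conf = _softmax(sim, axis=1) * _softmax(sim, axis=0)
    # Mutual nearest neighbour: keep (i, j) only if j is the best match
    # for i and i is the best match for j.
    row_best = conf.argmax(axis=1)
    col_best = conf.argmax(axis=0)
    idx_a = np.arange(conf.shape[0])
    mutual = col_best[row_best] == idx_a
    # Threshold filtering on the confidence.
    keep = mutual & (conf[idx_a, row_best] > threshold)
    return idx_a[keep], row_best[keep], conf[idx_a[keep], row_best[keep]]


def to_pixels(flat_idx, width, stride=8):
    """Convert flat cell indices of a row-major map of the given width
    into integer (x, y) pixel coordinates of the cell's top-left corner
    in the input image."""
    xs = (flat_idx % width) * stride
    ys = (flat_idx // width) * stride
    return np.stack([xs, ys], axis=1)
```

The sub-pixel refinement on the 1/2-size feature map is not covered by this sketch; it would take each integer-pixel match returned here as the centre of the small window described in the text.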
Fig. 1. Matching results of the LoFTR model: (a) original LoFTR model; (b) fine-tuned LoFTR model.
The original model detected 166 matched point pairs, while the fine-tuned model detected 1840 pairs. However, this quantity was still far from sufficient to assess the displacement field. Further improvements were therefore made to the model, including upsampling to a higher resolution, dividing images into overlapping patches, and applying a displacement smoothness constraint.
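One of the improvements listed above, dividing images into overlapping patches, can be sketched as follows. This is an illustrative outline under stated assumptions, not the procedure used in this study: the patch size and overlap are not given in the text, so the default values below are hypothetical, as are the helper names `patch_origins`, `tile` and `to_full_image`. The idea is to match each patch pair separately and then shift the matched coordinates back into the full-image frame.

```python
def patch_origins(length, patch, overlap):
    """Start offsets of patches of size `patch` that cover [0, length)
    with at least `overlap` pixels shared between neighbours."""
    if patch >= length:
        return [0]
    stride = patch - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the patch size")
    origins = list(range(0, length - patch, stride))
    # The last patch is placed flush with the image border so that no
    # pixels are left uncovered.
    origins.append(length - patch)
    return origins


def tile(width, height, patch=640, overlap=64):
    """Top-left corners (x, y) of all patches, in row-major order.
    The patch size and overlap are assumed values."""
    return [
        (x, y)
        for y in patch_origins(height, patch, overlap)
        for x in patch_origins(width, patch, overlap)
    ]


def to_full_image(point, origin):
    """Shift a matched point from patch coordinates to full-image
    coordinates by adding the patch's top-left corner."""
    return (point[0] + origin[0], point[1] + origin[1])
```

For example, `tile(3840, 2160)` yields the corners of the patches to be matched one by one, and `to_full_image` maps each sub-pixel match back to the original frame. Matches that fall in an overlap region may be reported by more than one patch, so some de-duplication would be needed before evaluating the displacement field.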