PSI - Issue 66
Andrii Kompanets et al. / Procedia Structural Integrity 66 (2024) 388–395
where k = 0.7 is the chosen parameter, N = 4 is the total number of ConvNeXt stages, and n is the stage for which the learning rate is being calculated. The value of k was selected empirically on validation patches to ensure good performance on the CSB dataset. The weight regularization parameter of the AdamW optimization algorithm is set to $10^{-5}$. Before each training session, the patches from the training subset are randomly divided into training and validation sets with a 90%/10% split. Additionally, during neural network training, data augmentation is applied each time a patch is selected from the training set. The following sequence of augmentation steps is performed on each patch, with each step being either applied or skipped based on a specified probability:

• Horizontal flip with 50% probability and vertical flip with 50% probability;
• Rotation by 0, 90, 180, or 270 degrees, each with 25% probability;
• Random adjustment of image brightness, contrast, and saturation in the range from 75% to 125%;
• Random image hue adjustment in the range from -10% to +10%;
• With 50% probability, addition of random image noise by multiplying each pixel by a value randomly sampled from the range 0.9 to 1.1;
• With 40% probability, one of the following:
  o random spatial reduction of the image with a scale factor in the range from 75% to 125%, with subsequent mirror padding to keep the patch dimensions constant;
  o zoom in on the image with a scale in the range from 75% to 100%, with subsequent interpolation to keep the patch size the same.

3.2. Evaluation

The proposed neural network was tested both on a set of 200 test patches and on entire test images of the CSB dataset. To produce a segmentation map for an entire image, the image is split into patches of 512×512 pixels and each patch is processed separately.
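The patch-wise processing of a full image can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `segment_image` and the callable `predict_patch` are hypothetical, and mirror padding of the image border up to a multiple of the patch size is an assumption, since the text does not say how images whose dimensions are not multiples of 512 are handled.

```python
import numpy as np

PATCH = 512  # patch size stated in the text


def segment_image(image, predict_patch, patch=PATCH):
    """Tile an H x W x C image, run predict_patch on each tile, and stitch the outputs.

    Assumptions (not from the paper): the image is mirror-padded on the bottom
    and right up to a multiple of the patch size, tiles do not overlap, and the
    padding is cropped away from the stitched map.
    predict_patch is a hypothetical callable mapping a (patch, patch, C) tile
    to a (patch, patch) map of values in [0, 1].
    """
    h, w = image.shape[:2]
    pad_h = (-h) % patch  # rows needed to reach a multiple of the patch size
    pad_w = (-w) % patch  # columns needed to reach a multiple of the patch size
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="reflect")

    stitched = np.zeros(padded.shape[:2], dtype=np.float32)
    for y in range(0, padded.shape[0], patch):
        for x in range(0, padded.shape[1], patch):
            tile = padded[y:y + patch, x:x + patch]
            # Each tile is processed independently and written back in place.
            stitched[y:y + patch, x:x + patch] = predict_patch(tile)

    return stitched[:h, :w]  # drop the padded border
```

In use, `predict_patch` would wrap the trained network's forward pass for a single patch; the returned array has the same height and width as the input image.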
Finally, the output segmentation maps of all patches from the test image are stitched together to produce the final segmentation map for the entire image. Common metrics used to evaluate crack segmentation performance include precision Pr, recall Re, F1-score F1, and Intersection over Union IoU. To compute these metrics, it is necessary to determine the number of True Positive TP, False Positive FP, and False Negative FN pixels in the segmentation output:

$$Pr = \frac{TP}{TP + FP} \tag{3}$$

$$Re = \frac{TP}{TP + FN} \tag{4}$$

$$F1 = \frac{2 \cdot Pr \cdot Re}{Pr + Re} \tag{5}$$

$$IoU = \frac{TP}{TP + FP + FN} \tag{6}$$

Precision may be interpreted as the fraction of correctly segmented crack pixels among all pixels segmented as crack by an algorithm. Recall may be interpreted as the fraction of correctly segmented crack pixels among all crack pixels according to the ground truth annotation. The F1-score is the harmonic mean of precision and recall. The IoU compares the predicted crack region with the ground truth crack region by dividing the area of their intersection by the area of their union. The F1-score and IoU are qualitatively the same indicator; therefore, we chose the F1-score as our primary metric for comparing neural network performance, while the IoU is also provided because it is common in the literature.

In this work, all neural network architectures output segmentation maps with values between 0 and 1. To generate the binary segmentation maps needed for metric calculation, a threshold of 0.5 is applied. Pixels with values greater than 0.5 are classified as 1, while those below are classified as 0. Typically, metrics such as precision, recall, F1-score, and Intersection over Union are calculated for each image in the test set, and the average of these values is used to represent overall performance. However, this approach can be misleading, especially in datasets like the CSB dataset, where the proportion of crack pixels varies significantly between images, and there is a large imbalance