PSI - Issue 62
Fabio Parisi et al. / Procedia Structural Integrity 62 (2024) 701–709
clustering on the Spearman rank-order correlations (Chok, 2010), which groups features according to a distance based on their Spearman correlation. In particular: (i) different distance thresholds were chosen, each yielding a different group of features; (ii) ten RF models were trained for each group using ten-fold cross-validation; (iii) the variation in prediction performance across these groups was investigated.
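Steps (i)-(iii) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the distance 1 - |rho|, the average linkage, the rule of keeping the first feature of each cluster, the threshold values and the helper name `spearman_feature_groups` are all assumptions introduced here.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def spearman_feature_groups(X, threshold):
    """Cluster the columns of X on a Spearman-based distance and return
    the index of one representative feature per cluster (hypothetical helper)."""
    corr, _ = spearmanr(X)                 # (n_features, n_features) matrix
    corr = (corr + corr.T) / 2.0           # remove tiny numerical asymmetries
    np.fill_diagonal(corr, 1.0)
    dist = 1.0 - np.abs(corr)              # assumed distance: 1 - |rho|
    # Average linkage is an assumption; the text does not name the method.
    Z = hierarchy.linkage(squareform(dist, checks=False), method="average")
    labels = hierarchy.fcluster(Z, t=threshold, criterion="distance")
    # Keep the first feature encountered in each cluster (assumed rule).
    return sorted(int(np.flatnonzero(labels == c)[0]) for c in np.unique(labels))


# Synthetic data: three independent signals, each duplicated with small noise.
rng = np.random.default_rng(0)
n = 300
base = rng.normal(size=(n, 3))
X = np.column_stack([
    base[:, 0], base[:, 0] + 0.01 * rng.normal(size=n),
    base[:, 1], base[:, 1] + 0.01 * rng.normal(size=n),
    base[:, 2], base[:, 2] + 0.01 * rng.normal(size=n),
])
y = base[:, 0] + 2.0 * base[:, 1] - base[:, 2] + 0.05 * rng.normal(size=n)

kept_by_threshold = {}
scores_by_threshold = {}
for threshold in (0.0, 0.5):               # (i) illustrative thresholds
    keep = spearman_feature_groups(X, threshold)
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    # (ii) ten-fold cross-validation fits ten RF models per feature group
    scores = cross_val_score(model, X[:, keep], y, cv=10, scoring="r2")
    kept_by_threshold[threshold] = keep
    scores_by_threshold[threshold] = scores
    # (iii) compare the spread of the scores across groups
    print(threshold, keep, round(scores.mean(), 3), round(scores.std(), 3))
```

Raising the threshold merges more features into each cluster, so fewer features are retained; the per-fold scores can then be compared across thresholds, for example with a box plot.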
Table 3. Comparison between RF and NN in predicting the EDPs of the pier P3.
Metric       rf 0     rf 1     rf 2
R²           0.984    0.984    0.984
mae [m]      0.0074   0.0075   0.0075
sae [m]      0.743    0.752    0.751
ae max [m]   0.052    0.053    0.053
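The four metrics in Table 3 can be computed from a vector of reference values and a vector of predictions as in the sketch below. The function name `summary_metrics` and the sample displacements are hypothetical, and the definitions used are the standard ones (coefficient of determination, mean, sum and maximum of the absolute errors), which are assumed here to match those intended in the table.

```python
import numpy as np


def summary_metrics(y_true, y_pred):
    """Return R^2, mean, sum and maximum of the absolute errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": abs_err.mean(),
        "sae": abs_err.sum(),
        "ae_max": abs_err.max(),
    }


# Made-up displacements in metres, for illustration only.
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.11, 0.19, 0.32, 0.40]
metrics = summary_metrics(y_true, y_pred)
print(metrics)
```

With these definitions sae equals mae multiplied by the number of samples; the ratios of sae to mae in Table 3 are close to 100.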
Figure 2. Performance in predicting the displacement 3̂ of the top of the P3 for rf 0 (green), rf 1 (blue) and rf 2 (red)
Figure 3. Performance scores box plot, obtained with the ten-fold cross-validation, in predicting 3̂ for rf 0 (green), rf 1 (blue) and rf 2 (red)
Table 4 reports the feature groups obtained by the hierarchical clustering (i), while Figure 4 shows the distribution of the predictions of the RF models trained on the different groups of features (ii-iii). Figure 4 shows that the largest change in score occurs between feature groups 4 and 5: RF performance worsens when switching from group 4 to group 5, so group 4 represents the best trade-off between the number of features and model performance. This evidence helps determine which portion of the features to use for each model.
Table 4. Considered feature groups and relative identification number (ID)
[Table body: feature group IDs 0 to 6 with the features selected for each group; the feature symbols are not recoverable.]
Figure 4. Prediction performance of the RF algorithms for the varying groups of features in Table 4.
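The selection rule described above (keep the last group before the largest drop in score) can be expressed in a few lines. The scores below are made-up placeholders, not the values plotted in Figure 4, and `largest_drop` is a hypothetical helper name.

```python
def largest_drop(scores):
    """Return (i, drop) such that scores[i] -> scores[i + 1] is the
    largest decrease between consecutive feature groups."""
    drops = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    i = max(range(len(drops)), key=drops.__getitem__)
    return i, drops[i]


# Placeholder mean scores for feature group IDs 0..6 (NOT the paper's results).
mean_score = [0.98, 0.98, 0.97, 0.97, 0.96, 0.90, 0.88]
group_id, drop = largest_drop(mean_score)
print(group_id, round(drop, 2))
```

With these placeholder values the largest drop occurs when moving from group 4 to group 5, so group 4 would be retained.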
5. Results
5.1. Performance of the model
Considering a test set of 100 instances from , Figure 5 and Figure 6 show the predictive performance of the model trained on feature group 4 in Table 4. Figure 5 compares the predicted values of the EDP of the first pier 3̂ (on the y-axis) with the NLTHA values 3 (on the x-axis), while Figure 6 refers to the fourth pier. In both figures, the red line represents the bisector of the graph, i.e. the ideal condition in which the predicted EDP perfectly matches the test set values. This comparison shows graphically the good performance of RF in both cases: the predictions lie close to the red line and are generally accurate. The performance shown in Figure 5 and Figure 6 is expressed, in terms of multiple summary metrics, in Table 5, where the coefficient of determination, R², the mean absolute error (mae), the sum of the absolute errors (sae) and the maximum of the