
Philippe AMUZUGA et al. / Procedia Structural Integrity 75 (2025) 53–64


• Moderate yet significant perturbation (P = 10 %, A = 30 %) paradoxically improves predictive performance, reducing RMSE to 0.041 and increasing accuracy to 80 % at ±1 %, outperforming the noise-free model.
• Conversely, low-amplitude noise (A = 10 %) with the same proportion (P = 10 %) degrades performance (RMSE of 0.085, 45 % accuracy at ±1 %).

These results suggest that controlled noise injection may act as implicit regularization, improving model generalization by mitigating overfitting. This raises the fundamental question of whether intentional noise injection can be beneficial in certain modeling contexts. However, this effect must be confirmed for reproducibility, as it could be highly dependent on the noise type and level. Exceeding a critical threshold may instead cause a sharp decline in performance. Several directions are proposed for further investigation:

• Extended exploration of the (P, A) noise parameter space: analyze finer and continuous ranges to identify transitions between beneficial and harmful effects and accurately characterize a critical threshold.
• Statistical analysis of noise effects: investigate how noise alters statistical moments and the correlation structure of the target, to better understand the observed regularization mechanisms.
• Robustness of the feature selection module: test its resilience to oriented perturbations (e.g., biased or directional noise), including within the context of Byzantine attacks [14, 15].

This study aligns with the broader challenge of machine learning robustness, an essential requirement for industrial deployment, especially in critical sectors such as pressure equipment design. Ensuring prediction stability under data imperfections is crucial for operational integration. These results also open a promising path, namely the possibility of mathematically formalizing the conditions under which noise injection improves the performance of a regularized GLM.
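The contamination scheme studied above, injecting Gaussian noise into a proportion P of the target values with amplitude A, can be sketched as follows. This is a minimal illustration, not the paper's implementation: in particular, scaling the noise standard deviation by A times the magnitude of each contaminated target is an assumption.

```python
import numpy as np

def inject_target_noise(y, P=0.10, A=0.30, seed=0):
    """Contaminate a randomly chosen proportion P of the targets with
    zero-mean Gaussian noise. Assumption: the noise standard deviation
    is A times the magnitude of each contaminated target value."""
    rng = np.random.default_rng(seed)
    y_noisy = np.asarray(y, dtype=float).copy()
    n_contaminated = int(round(P * len(y_noisy)))
    idx = rng.choice(len(y_noisy), size=n_contaminated, replace=False)
    y_noisy[idx] += rng.normal(0.0, A * np.abs(y_noisy[idx]))
    return y_noisy
```

Sweeping P and A over a grid with such a helper and recording RMSE and ±1 % accuracy for each setting would reproduce the kind of (P, A) exploration proposed above.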
Such a formalization, analogous to the Gauss–Markov theorem, would ground these empirical observations in a rigorous theoretical framework. Ultimately, this could lead to native integration of optimized noise injection modules in libraries such as scikit-learn, with P and A as hyperparameters automatically tuned via methods like GridSearchCV, or analytically determined if optimal conditions are proven.

5. Conclusion

This study assessed the robustness of a Generalized Linear Model (GLM) for predicting the fatigue life of T welded joints under simulated perturbations introduced by Gaussian noise injected into the target variable. Building on previous work that established the GLM's accuracy and interpretability on heterogeneous datasets, we quantitatively analyzed the impact of controlled contamination on the model's structural stability and predictive performance. The results highlight the GLM's notable resilience, both in retaining key variables (F, FAT, t2) and in maintaining high predictive accuracy. A counterintuitive phenomenon was observed: under moderate contamination (P = 10 %) and high noise amplitude (A = 30 %), performance improved significantly compared to the noise-free case (RMSE decreased from 0.075 to 0.041; accuracy at ±1 % increased from 56 % to 80 %). Conversely, weaker perturbation (P = 10 %, A = 10 %) degraded performance, suggesting a critical threshold in the (P, A) space beyond which noise acts as beneficial implicit regularization. Structurally, higher noise amplitude consistently increased model complexity (the number of terms rose from 6 to 8) without altering the nature of the selected variables, confirming the GLM's functional robustness. This is highly relevant for engineering, as it identifies robust design parameters that support reliable sizing under uncertain industrial conditions. However, this robustness is not unconditional: poorly calibrated noise may impair performance or destabilize the model. Thus, rigorous control of data quality and adherence to the model's validity domain remain essential.

These findings open several research avenues: (i) fine exploration of the critical threshold in the (P, A) space, (ii) investigation of the statistical mechanisms underlying the beneficial effect of noise, and (iii) evaluation of robustness against oriented or malicious perturbations (Byzantine attacks). Ultimately, a mathematical formalization of these effects could enable native integration of optimized noise injection mechanisms into machine learning libraries, with P and A as tunable hyperparameters, thereby enhancing the tools available to engineers and data scientists.
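The idea of treating P and A as tunable hyperparameters could be prototyped today with a small scikit-learn meta-estimator whose `fit` contaminates the training targets before delegating to a base model, so that GridSearchCV can search over P and A directly. The sketch below is hypothetical: the class name `NoisyTargetRegressor`, the relative-amplitude noise model, and the Ridge base estimator are illustrative assumptions, not the paper's GLM pipeline.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

class NoisyTargetRegressor(BaseEstimator, RegressorMixin):
    """Hypothetical wrapper: contaminates a proportion P of the training
    targets with Gaussian noise of relative amplitude A, then fits the
    base estimator. P and A thereby become ordinary hyperparameters."""

    def __init__(self, estimator=None, P=0.0, A=0.0, random_state=0):
        self.estimator = estimator
        self.P = P
        self.A = A
        self.random_state = random_state

    def fit(self, X, y):
        rng = np.random.default_rng(self.random_state)
        y = np.asarray(y, dtype=float).copy()
        n_contaminated = int(round(self.P * len(y)))
        idx = rng.choice(len(y), size=n_contaminated, replace=False)
        # Assumption: noise std is A times the contaminated target magnitude.
        y[idx] += rng.normal(0.0, self.A * np.abs(y[idx]))
        base = self.estimator if self.estimator is not None else Ridge()
        self.model_ = clone(base).fit(X, y)
        return self

    def predict(self, X):
        return self.model_.predict(X)

# P and A tuned by cross-validated grid search, as suggested in the text.
param_grid = {"P": [0.0, 0.05, 0.10], "A": [0.0, 0.10, 0.30]}
search = GridSearchCV(NoisyTargetRegressor(Ridge()), param_grid,
                      scoring="neg_root_mean_squared_error", cv=3)
```

Calling `search.fit(X, y)` would then select the (P, A) pair minimizing cross-validated RMSE; an analytically determined optimum, if one were proven, could replace the grid search entirely.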