Table 3. Random Forest Hyperparameters.

Hyperparameter           Value
Bootstrap                True
Max Features             log2
Minimum Samples Split    4
Criterion                Mean Squared Error
Number of Estimators     100
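For illustration, a minimal scikit-learn sketch of a Random Forest regressor configured with the Table 3 hyperparameters is given below. The feature matrix X and target y are placeholders standing in for the paper's dataset, and recent scikit-learn releases spell the MSE criterion "squared_error" (older releases used "mse").

```python
# Minimal sketch: Random Forest regressor with the Table 3 hyperparameters.
# X and y are placeholder arrays, not the paper's dataset.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X = np.random.rand(200, 5)   # placeholder feature matrix
y = np.random.rand(200)      # placeholder target vector
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

rf = RandomForestRegressor(
    bootstrap=True,             # Bootstrap = True
    max_features="log2",        # Max Features = log2
    min_samples_split=4,        # Minimum Samples Split = 4
    criterion="squared_error",  # Mean Squared Error ("mse" in older versions)
    n_estimators=100,           # Number of Estimators = 100
    random_state=42,
)
rf.fit(X_train, y_train)
print("Test R^2:", rf.score(X_test, y_test))
```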

2.3.2 Extreme Gradient Boosting Algorithm

Extreme Gradient Boosting (XGB), unlike Random Forest, is a boosting ensemble method. It is an optimized boosting algorithm that includes a regularization term, which makes it robust on most available datasets. The objective function is the sum of a loss function and a regularization term, as shown in the equation below:

$$\mathrm{obj}(\theta) = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k} \Omega(f_k) \qquad (4)$$

where $l$ is the loss function representing the difference between the actual target value and the predicted target value, and $\Omega(f)$ is the regularization term:

$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert \omega \rVert^{2} \qquad (5)$$

where $T$ is the number of leaves, $\omega$ is the vector of leaf weights, $\gamma$ is the pruning index, and $\lambda$ is the scaling factor of the weights (Mythreyi et al. 2021). This algorithm builds the regression trees in series, so that each tree is trained on the residuals of the preceding tree; the underlying patterns are eventually captured as the residuals approach zero. Its standout features are the same as those of Random Forest, but its handling of sparse data, achieved through the regularization term, makes it superior. The hyperparameters for this algorithm are tuned using grid search cross-validation (GridSearchCV). The best hyperparameters obtained for the XGB algorithm are shown in Table 4 below.

Table 4. XGB Hyperparameters.

Hyperparameter           Value
Booster                  gbtree
Lambda                   0.5
Learning Rate            0.2
Maximum Depth            2
Number of Estimators     100
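A minimal sketch of the grid search described above, using the xgboost Python package, is shown below. The parameter grid is illustrative, not the paper's exact search space, and X_train and y_train are the placeholder arrays from the earlier sketch.

```python
# Minimal sketch: tuning an XGBoost regressor with GridSearchCV and reading
# off the best hyperparameters (cf. Table 4). The grid is illustrative only.
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from xgboost import XGBRegressor

X = np.random.rand(200, 5)   # placeholder feature matrix
y = np.random.rand(200)      # placeholder target vector
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

param_grid = {
    "booster": ["gbtree"],
    "reg_lambda": [0.1, 0.5, 1.0],     # L2 regularization (Lambda in Table 4)
    "learning_rate": [0.1, 0.2, 0.3],
    "max_depth": [2, 4, 6],
    "n_estimators": [100, 200],
}
search = GridSearchCV(XGBRegressor(random_state=42), param_grid,
                      cv=5, scoring="neg_mean_squared_error")
search.fit(X_train, y_train)
print("Best hyperparameters:", search.best_params_)
xgb_model = search.best_estimator_  # fitted model with the best grid settings
```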

3. Model Validation

Once the models are developed using the above algorithms, the predicted results are compared with the actual values for the unseen data. For this, we make use of metrics that indicate how the model performs on the train and test data. For a regression problem, the degree of closeness between the experimental data and the predicted data establishes the goodness of fit of a machine learning (ML) model. Performance is measured by the mean squared error (MSE), mean absolute error (MAE), and R2 score:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2} \qquad (6)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert \qquad (7)$$

$$R^{2} = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^{2}}{\sum_{i} (y_i - \bar{y})^{2}} \qquad (8)$$

where $y_i$ is the experimental value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the experimental values. We have to ensure that, for these algorithms, the mean squared error and mean absolute error are minimal and the R2 score is nearing 1.
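A minimal sketch of these three metrics using scikit-learn follows; y_true and y_pred are placeholder values standing in for the experimental and predicted data.

```python
# Minimal sketch: computing MSE, MAE, and R^2 (Eqs. 6-8) for a fitted model.
# y_true and y_pred are placeholders, not the paper's measurements.
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = [2.1, 3.4, 1.8, 4.0]   # placeholder experimental values
y_pred = [2.0, 3.6, 1.7, 3.9]   # placeholder model predictions

print("MSE:", mean_squared_error(y_true, y_pred))    # Eq. (6)
print("MAE:", mean_absolute_error(y_true, y_pred))   # Eq. (7)
print("R^2:", r2_score(y_true, y_pred))              # Eq. (8)
```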
