Nithin Konda et al. / Procedia Structural Integrity 46 (2023) 87–93
independent variables, whereas fatigue life is the target variable. Each dataset has 125 data points across the 5 independent variables. The limited size of the dataset motivated a reduction in dimensionality by combining all processing parameters into a single metric, the energy density:

E = \frac{P}{h \cdot v \cdot t}  (1)

where E, P, h, v, and t are the energy density (J), laser power (W), hatch spacing (mm), scan velocity (mm/s), and layer thickness (microns), respectively. The target column (fatigue life) was scaled to a logarithmic dimension, and the two independent columns were normalized so that their values are distributed about the mean. The final dataset therefore contains 2 normalized independent variables and the log-scaled target variable (fatigue life), as shown in Table 2. The dataset was split into train (80%) and test (20%) sets; the model was developed on the training set, and the test set was used to verify how the developed model performs on unseen data. Performance was analysed using metrics such as mean squared error, mean absolute error, and the R2 score.
Table 2. Sample of the dataset.

Energy Density    Stress Amplitude    Fatigue Life (Log Scale)
-0.022842         1.459673            5.103804
-0.022842         0.601821            5.363612
-0.022842         0.187686            5.350248
-0.022842         0.601821            5.563481
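The preprocessing pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' actual script: the raw column names and sample values are hypothetical stand-ins, since the original dataset is not reproduced here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical raw data; column names and values are illustrative only.
df = pd.DataFrame({
    "P": [250.0, 280.0, 300.0, 320.0],      # laser power (W)
    "h": [0.14, 0.14, 0.14, 0.14],          # hatch spacing (mm)
    "v": [1200.0, 1200.0, 1200.0, 1200.0],  # scan velocity (mm/s)
    "t": [0.03, 0.03, 0.03, 0.03],          # layer thickness (mm)
    "stress_amplitude": [600.0, 450.0, 300.0, 450.0],   # MPa
    "fatigue_life": [127000.0, 231000.0, 224000.0, 366000.0],  # cycles
})

# Eq. (1): collapse the four processing parameters into one energy-density metric.
df["energy_density"] = df["P"] / (df["h"] * df["v"] * df["t"])

# Scale the target to a logarithmic dimension, as in the paper.
df["log_life"] = np.log10(df["fatigue_life"])

# Normalize the two remaining predictors (zero mean, unit variance).
X = df[["energy_density", "stress_amplitude"]]
X = (X - X.mean()) / X.std()
y = df["log_life"]

# 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
```

After this step only the two normalized predictors and the log-scaled target remain, matching the shape of Table 2.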
2.3 Model Development

The machine learning (ML) models were developed in the Python programming language: the Random Forest algorithm was adopted from the scikit-learn library, and the Extreme Gradient Boosting algorithm was utilized from the XGBoost library. Graphs were plotted using the Matplotlib and Seaborn libraries. Upon successful development of the ML models, the performance metrics were calculated and the best-performing algorithm was chosen for predicting the fatigue life of the Ti6Al4V alloy.

2.3.1 Random Forest Algorithm

Statistical methods like Random Forests have rarely been used for material property estimation, but have gained attention recently. Unlike other regression techniques, this is a decision-tree-based model. The Random Forest algorithm constructs multiple individual regression trees, and the predictions of all the regression trees are pooled to make the final prediction. Because the results of many learners are collected to make a final decision, such methods are referred to as ensemble techniques. Each regression tree is split by minimizing the function below:

\min_{j,s} \left[ \sum_{x_i \in A(j,s)} (y_i - \bar{y}_A)^2 + \sum_{x_i \in B(j,s)} (y_i - \bar{y}_B)^2 \right]  (2)

\bar{y}_A = \frac{1}{N_A} \sum_{x_i \in A} y_i \quad \text{and} \quad \bar{y}_B = \frac{1}{N_B} \sum_{x_i \in B} y_i  (3)

The split value is obtained by minimizing the total mean squared loss of the child nodes A and B, and the prediction in each child node is taken as the sample mean of that node. Such regression trees are built on bootstrap samples of the data, and the predictions of the multiple trees are averaged to make the final prediction. The main features of the algorithm that matter for prediction models are intrinsic multiclass support, robustness to outliers, scalability, prediction accuracy, and parameter tuning. The hyperparameters of the Random Forest regression model were tuned using GridSearchCV; hyperparameter tuning is required to identify which configuration yields the lowest errors and the best performance. The best hyperparameters estimated for this algorithm on this dataset are shown below:
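The model-development and tuning procedure described above can be sketched as follows. This is a hedged illustration, not the paper's actual code: the parameter grid and the synthetic stand-in data are assumptions, so the resulting "best" hyperparameters will not match those reported in the paper.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the (energy density, stress amplitude) -> log-life data.
X, y = make_regression(n_samples=125, n_features=2, noise=5.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Hypothetical search grid; the paper's actual search space is not given here.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 4],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate the tuned model on unseen data with the metrics used in the paper.
y_pred = search.best_estimator_.predict(X_test)
print("best params:", search.best_params_)
print("MSE:", mean_squared_error(y_test, y_pred))
print("MAE:", mean_absolute_error(y_test, y_pred))
print("R2 :", r2_score(y_test, y_pred))
```

GridSearchCV exhaustively refits the model for every grid combination under 5-fold cross-validation and retains the estimator with the lowest cross-validated mean squared error.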