PSI - Issue 78


Ebrahim Aminifar et al. / Procedia Structural Integrity 78 (2026) 1466–1473


was either architecturally avoided or structurally impractical. These proportions are consistent with both liturgical function and seismic performance expectations, contributing to the typological regularity of the sample. Together, these parameters form a cohesive typological matrix capturing the chronological evolution, material transitions, and morphological logic of single-nave churches across Italy. This robust dataset enables meaningful clustering, supports fragility assessments, and guides the derivation of mechanical proxies for future machine learning models.

4. Clustering and Distribution Modeling

To refine the classification of single-nave churches by size, an area-based clustering approach was employed. Floor area serves as an effective geometric proxy for multiple correlated features. Specifically, area is a function of the plan dimensions (length × width), which, in historical churches, scale proportionally with height and wall thickness due to architectural proportionality rules. As shown in Dabiri et al. (2022), plan area was found to be one of the most influential predictors of the fragility parameters (β and μ), with Pearson correlation coefficients of 0.379 and 0.024, respectively. This confirms its value in capturing the building mass and the associated seismic demand. Furthermore, the analysis in Section 3.2 demonstrates that floor area also correlates strongly with volumetric indicators such as the H/W and L/W ratios. Therefore, clustering by area provides a robust and interpretable dimensional basis while avoiding the collinearity issues that would arise from including all size parameters simultaneously.
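The collinearity argument can be illustrated with a minimal sketch. The data below are synthetic: the dimension ranges and the L/W and H/W ratios are assumptions chosen only for illustration and are not taken from the analysed sample. The sketch shows that when length and height scale with width, plan area is strongly correlated with each individual size parameter, so that area alone carries most of the dimensional information.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic plan dimensions in metres (hypothetical ranges, not the paper's data).
width = rng.uniform(6.0, 15.0, size=n)
length = width * rng.uniform(1.5, 3.0, size=n)   # assumed L/W ratio range
height = width * rng.uniform(0.9, 1.5, size=n)   # assumed H/W ratio range
area = length * width

# Rows are variables, so np.corrcoef returns a 4x4 Pearson correlation matrix.
names = ["area", "length", "width", "height"]
corr = np.corrcoef(np.vstack([area, length, width, height]))

# First row: correlation of area with each other size parameter.
for name, r in zip(names[1:], corr[0, 1:]):
    print(f"Pearson r(area, {name}) = {r:.2f}")
```

With real survey data, the same correlation matrix would indicate whether the individual size parameters add information beyond area before they are included in a clustering model.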

Fig. 4. Width-to-Length ratio of analysed sample: (a) dispersion graph and (b) histogram distribution

Given the wide range and non-normal distribution of the floor area data, four unsupervised clustering algorithms were tested: K-Means (applied to log-transformed values), Gaussian Mixture Models (GMM), Agglomerative Clustering, and Quantile Binning. K-Means is a centroid-based method that partitions data into a predefined number of clusters by minimizing the within-cluster variance; it performs best when the clusters are spherical and well separated. GMM, by contrast, is a probabilistic model that assumes the data are generated from a mixture of Gaussian distributions. Each point is assigned to a cluster based on probability, which allows GMM to capture overlapping or soft cluster boundaries and makes it particularly suited to datasets with gradual morphological transitions, such as historical architectural forms. Agglomerative clustering is a hierarchical, bottom-up approach that begins with each data point as a separate cluster and successively merges the closest pairs based on linkage criteria (e.g., distance or variance) until the desired number of clusters is reached. It does not assume any specific data distribution and can capture nested structures, but it may be sensitive to outliers and imbalanced data density. Quantile binning is a non-parametric method that divides the data into intervals containing approximately equal numbers of observations. While simple and interpretable, it imposes fixed thresholds that do not adapt to the underlying data distribution and may result in artificial boundaries, particularly in skewed datasets (Dempster et al., n.d.; Murtagh & Contreras, 2012).

Each method was applied using a three-cluster configuration to distinguish between small, medium, and large churches. Clustering performance was evaluated using the silhouette score, computed with the Euclidean distance metric in Python's scikit-learn library. The score quantifies the cohesion within clusters and their separation from one another, ranging from -1 to 1.
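A comparison of this kind can be sketched as follows. This is an illustrative outline and not the authors' implementation: the floor areas are synthetic, the Ward linkage and random seeds are arbitrary choices, and applying every method (and the silhouette score) to log-transformed area is an assumption, since the text states the log transform only for K-Means.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)

# Synthetic, right-skewed floor areas in m^2 (hypothetical stand-in for the sample):
# three log-normal groups of unequal size.
area = np.concatenate([
    rng.lognormal(mean=4.5, sigma=0.25, size=150),
    rng.lognormal(mean=5.5, sigma=0.25, size=100),
    rng.lognormal(mean=6.5, sigma=0.25, size=50),
])
X_log = np.log(area).reshape(-1, 1)  # single feature: log floor area

# Three-cluster configuration for each of the four methods.
labels = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_log),
    "GMM": GaussianMixture(n_components=3, random_state=0).fit_predict(X_log),
    "Agglomerative": AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X_log),
    # Equal-frequency bins (tertiles); labels=False returns integer bin codes.
    "Quantile binning": pd.qcut(area, q=3, labels=False),
}

# Silhouette score with the Euclidean metric, evaluated on the same feature space.
scores = {
    name: silhouette_score(X_log, lab, metric="euclidean")
    for name, lab in labels.items()
}

for name, score in scores.items():
    print(f"{name}: silhouette = {score:.3f}")
```

Evaluating all four label sets on the same feature space keeps the scores comparable; scoring quantile bins on raw area and K-Means on log area would mix two different distance scales.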
Higher values indicate well-defined, compact clusters, while scores near zero or negative
