Hydrotehnic, Faculty of Civil Engineering and Architecture, University of Niš, Niš, Serbia
This study examines the effectiveness of three regression methods for filling in missing values in annual maximum daily precipitation series: multiple linear regression, random forest regression, and log-linear (gamma) regression. Gridded observational data of extreme daily precipitation for the Niš area were obtained from the Digital Climate Atlas of Serbia platform. The dataset, which is complete for the period 1950–2020, was intentionally modified to simulate missing data. Artificial gaps were introduced at the beginning, at the end, and at randomly selected positions within the series, at omission rates of 5%, 10%, 15%, and 20%. The gap-filling performance of each method was evaluated with standard metrics: the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). All three methods performed well, even with 20% of the data missing, and multiple linear regression was the most effective of those tested.
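The gap-introduction scheme and the three evaluation metrics described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `gap_indices` and `scores` are hypothetical, rounding the number of hidden years to the nearest integer is an assumption, and R² is computed here as 1 − SS_res/SS_tot, which may differ from the definition used in the study (for example, a squared correlation coefficient).

```python
import math
import random


def gap_indices(n, rate, where, seed=0):
    """Return sorted indices of the values to hide in a series of length n."""
    k = round(n * rate)  # assumed rounding rule for the number of hidden values
    if where == "start":
        return list(range(k))
    if where == "end":
        return list(range(n - k, n))
    if where == "random":
        # Seeded generator so the random gaps are reproducible.
        return sorted(random.Random(seed).sample(range(n), k))
    raise ValueError(f"unknown gap position: {where!r}")


def scores(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return r2, rmse, mae


# One annual maximum per year over 1950-2020 gives 71 values.
n_years = 2020 - 1950 + 1
for rate in (0.05, 0.10, 0.15, 0.20):
    for where in ("start", "end", "random"):
        hidden = gap_indices(n_years, rate, where)
        print(f"{rate:.0%} {where:>6}: {len(hidden)} years hidden")
```

In an experiment of this kind, a regression model would be fitted on the years that remain visible, used to predict the hidden years, and then scored by passing the true and predicted values at the hidden indices to `scores`.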