In the present study an attempt is made to provide a general Monte Carlo approach for deriving flood frequency curves in ungauged basins in the Sicily region (Italy). The proposed procedure consists of (i) a regional frequency analysis of extreme rainfall series, combined with synthetic hyetographs based on Huff curves, to derive design storms and (ii) a rainfall-runoff model, based on the Time-Area technique, to generate synthetic hydrographs. The procedure is validated on four gauged river basins in Sicily, where synthetic peak flow frequency curves, obtained by simulating 1000 flood events, are compared with observed values. The results show that the proposed Monte Carlo approach reproduces the hydrologic response of the investigated basins with reasonable accuracy. Given its relative simplicity, the procedure can be easily extended to poorly gauged or ungauged basins.

Derivation of frequency distributions of peak flows is extremely important for many practical applications in engineering, such as the design of hydraulic structures, water resources planning and management, flood risk assessment and floodplain management. Hydrologic modelling in ungauged basins relies on indirect methods based on a simplified description of the flood formation process. At the beginning of the 1990s the Italian National Research Group for the Prevention of Hydro-Geological Disaster (GNDCI) developed uniform procedures to estimate intense rainfall and peak flow at regional level, under the so-called VAPI project. The results of the VAPI project for the Sicily region have been published by Cannarozzo et al. (1993, 1995). In spite of some recent efforts to update the VAPI study for Sicily (Ferro and Porto, 2006; Noto and La Loggia, 2009), there is currently no consensus on a single standard procedure.

Streamflow gauges where historical annual maxima peak flow series are available.

To this end, the present study aims to provide a general Monte Carlo approach for deriving peak flow frequency curves in ungauged basins in Sicily. The proposed procedure is a combination of statistical methods for deriving the design storms and simulation techniques, through an event-based rainfall-runoff model, to generate synthetic hydrographs. In particular, a regional analysis of extreme rainfall, based on the index flood-type method (Dalrymple, 1960), is carried out to derive total storm depths of different return periods. Then, such storm depths are distributed in time according to synthetic hyetographs generated by the Huff curves method (Huff, 1967), which provides a probabilistic representation of accumulated storm depths for corresponding accumulated storm durations expressed in dimensionless form. Estimation of effective storm depths is made through the Curve Number (CN) method developed by the US Natural Resources Conservation Service (NRCS), formerly known as the Soil Conservation Service (SCS). Capitalizing on approaches proposed in previous studies (De Michele and Salvadori, 2002; Aronica and Candela, 2007), the Antecedent Moisture Conditions (AMC) in CN method are treated as a random variable with a discrete probability distribution, in order to relax the iso-frequency assumption between the design storms and the resulting hydrographs. Finally, a rainfall-runoff analysis built upon the Time-Area (TA) concept (Clark, 1945) is used as flood routing technique. The proposed methodology is then validated on four gauged river basins in Sicily region, by comparing synthetic versus observed peak flow frequency curves.

Figure 1 illustrates the structure of the proposed Monte Carlo approach for deriving frequency distributions of peak flows in ungauged basins. The procedure consists of a cascade of three modules: a rainfall generator module providing design gross rainfall hyetographs, a hydrologic losses module that transforms gross rainfall into excess (net) rainfall hyetographs, and a flood routing module whose outputs are synthetic surface runoff hydrographs. Each module is described in detail in the following subsections.

Sketch of the Monte Carlo procedure for peak flows simulation in ungauged basins.

It is worth underlining that all the models included in the methodological framework can perform either as lumped or distributed models, by considering respectively uniform or spatially variable excess rainfall and catchment characteristics. In this paper, lumped models are used for the application of the procedure to the illustrative case studies.

The procedure is validated on four gauged river basins in the Sicily region, Italy (see Fig. 2), namely: Belice at Ponte Belice, Imera at Drasi, Salso at Ponte Gagliano and Alcantara at Moio.

Location of the investigated river basins and homogeneous regions derived from the regional rainfall frequency analysis carried out by Bonaccorso and Aronica (2016).

Like most arid or semi-arid Mediterranean river basins, they are characterized by ephemeral stream regimes, with large floods in autumn and winter and low flows in summer. The only exception is the Alcantara river basin, which usually has perennial surface flows fed by spring water from the large aquifer of the Etna volcano.

The rainfall generator module is calibrated on historical annual maxima
rainfall (AMR) series of short duration, i.e. 1, 3, 6, 12 and 24 h,
retrieved by a dense network of mechanical recording rain gauges operated by
the Water Observatory of Sicily Region (

Implementation of the TA technique in the flood routing module is based on topographic data derived from 100 m grid resolution Digital Elevation Models (DEM) of the basins, through Geographic Information System (GIS) functions.

The performance of the methodology is assessed by comparing the outputs of Monte Carlo simulation with historical annual maxima peak flow series available at the streamflow gauges located at the catchments' outlet cross-sections (see Table 1).

The rainfall generator module encompasses two sub-modules. The first sub-module is based on a regional frequency analysis of AMR series for synthetic generation of long series of total storm rainfall depths. The analysis pools data samples from sites with similar frequency distributions. This allows the use of multiparameter probability distributions, which are usually more suitable for modelling extreme events than traditional two-parameter distributions.

This procedure first requires identification of homogeneous regions, namely
a set of sites whose frequency distributions can be considered to be
approximately the same, apart from a scale factor. In particular, given a
generic site

Several methods of forming regions are available in the literature, among which cluster analysis is one of the most widely applied to regional frequency analysis (Muñoz-Diaz and Rodrigo, 2004; Noto and La Loggia, 2009; Yang et al., 2010; Ngongondo et al., 2011). A further step is to test whether the identified regions can be accepted as being homogeneous. On this point, readers may refer to Viglione et al. (2007) and references therein.

Once homogeneous regions are identified, AMR series of the sites in each
region can be pooled together and standardized with respect to the
corresponding index values. Standardized data are then used to fit the
parameters of different candidate probability distributions by using the

Computation of the total rainfall depth of the

The second sub-module, calibrated on 10 min rainfall data, consists of a
simple method to define the temporal pattern of each simulated rainfall event
at site

Prior to characterizing and developing frequency distributions of storm duration and depth data, storms must be identified and separated within rainfall data to form the underlying database for developing a conceptual and mathematical model of within-storm intensities. Several methods exist to identify storms; most often, a minimum dry period is used. Although such a minimum dry period can depend on the time of year, a constant, arbitrary inter-event time is often used to separate storms. In the present study, storm events from the available dataset are selected by assuming an inter-event time equal to 6 h, as suggested by Huff (1967).

It should be emphasized that, from a statistical standpoint, it would be advisable to work only with very long sequences. However, in semiarid climates, short duration rains of convective origin are rather frequent. Consequently, a balance between data quality and loss of information has to be found. Since rainfall data used in this module are recorded at a fixed time interval of 10 min, we disregard all storm events lasting less than 100 min, which would provide overly biased information about the 10 min rainfall.
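The storm selection rules described above (6 h inter-event time, minimum duration of 100 min, 10 min time step) can be illustrated with a minimal sketch. The function names `split_storms` and `dimensionless_mass_curve`, the use of zero rainfall as the dry threshold, and the number of points on the dimensionless curve are assumptions made for illustration only; they are not taken from the original procedure.

```python
import numpy as np

def split_storms(rain_mm, step_min=10, gap_h=6.0, min_duration_min=100):
    """Return (first, last) index pairs of storms in a fixed-step rainfall series.

    Two wet steps belong to different storms when at least `gap_h` hours
    of dry steps lie between them. Storms shorter than `min_duration_min`
    are discarded. A step is "wet" when rainfall > 0 (assumed threshold).
    """
    rain = np.asarray(rain_mm, dtype=float)
    gap_steps = int(round(gap_h * 60 / step_min))
    wet = np.flatnonzero(rain > 0)
    if wet.size == 0:
        return []
    storms = []
    start = prev = int(wet[0])
    for i in wet[1:]:
        i = int(i)
        # Dry steps strictly between two consecutive wet steps.
        if i - prev - 1 >= gap_steps:
            storms.append((start, prev))
            start = i
        prev = i
    storms.append((start, prev))
    return [(s, e) for s, e in storms
            if (e - s + 1) * step_min >= min_duration_min]

def dimensionless_mass_curve(storm_mm, n_points=11):
    """Cumulative depth fraction versus duration fraction for one storm."""
    storm = np.asarray(storm_mm, dtype=float)
    cum = np.concatenate(([0.0], np.cumsum(storm)))
    t = np.linspace(0.0, 1.0, cum.size)
    grid = np.linspace(0.0, 1.0, n_points)
    return np.interp(grid, t, cum / cum[-1])
```

Each retained storm `(s, e)` can then be passed as `rain[s:e + 1]` to `dimensionless_mass_curve`, and the resulting curves form the empirical sample from which Huff-type probabilistic curves would be derived.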

The hydrologic losses module transforms gross rainfall into effective rainfall. In particular, the SCS-CN method, adopted by the US Natural Resources Conservation Service, is used for this purpose. This method makes it possible to incorporate information on land use changes, since CN is a function of soil type, land use, soil cover condition and degree of saturation of the soil before the start of the storm. CN ranges from 0 (for fully permeable soil) to 100 (for completely impervious soil).
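As an illustration of how the CN value controls the transformation of gross into effective rainfall, the following sketch applies the standard SCS-CN relation to the cumulative rainfall of a time-varying event. The metric form of the retention parameter and the initial abstraction ratio of 0.2 are the conventional textbook choices and are assumed here; the function name `scs_cn_excess` is hypothetical.

```python
import numpy as np

def scs_cn_excess(gross_mm, cn, ia_ratio=0.2):
    """Incremental excess rainfall (mm) from incremental gross rainfall (mm).

    Applies Pe = (P - Ia)^2 / (P - Ia + S) to the cumulative depth P,
    with S = 25400 / CN - 254 (mm) and Ia = ia_ratio * S (assumed 0.2 S).
    """
    if not 0 < cn <= 100:
        raise ValueError("CN must lie in (0, 100]")
    gross = np.asarray(gross_mm, dtype=float)
    s = 25400.0 / cn - 254.0            # potential maximum retention (mm)
    ia = ia_ratio * s                   # initial abstraction (mm)
    p_cum = np.cumsum(gross)
    above = np.maximum(p_cum - ia, 0.0)
    denom = above + s
    # Safe division: cumulative excess is zero until P exceeds Ia.
    pe_cum = np.divide(above ** 2, denom,
                       out=np.zeros_like(above), where=denom > 0)
    return np.diff(pe_cum, prepend=0.0)
```

For example, with CN = 80 the retention is S = 63.5 mm and the first 12.7 mm of rainfall produce no runoff, so a hyetograph of 10, 20 and 30 mm yields no excess in the first interval.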

It is worth recalling that initial soil moisture conditions are a major factor in surface runoff generation, since they control the infiltration capacity (Bronstert and Bárdossy, 1999), especially at the beginning of the storm.

In particular, in the SCS-CN method the soil conditions are described through three antecedent moisture condition (AMC) classes, dry (AMC I), normal (AMC II) and saturated (AMC III), depending on the season and on the total rainfall in the 5 days preceding the storm event. Hawkins et al. (1985) derived formulas to calculate the CN for AMC I and AMC III from the value of CN corresponding to AMC II.
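The conversion between AMC classes can be sketched as follows. The coefficients are those commonly quoted in the general literature for the Hawkins et al. (1985) relations; they are stated here from that literature rather than from the present text, and the function names are hypothetical.

```python
def cn_amc1(cn2):
    """CN for dry conditions (AMC I) from the AMC II value (commonly quoted form)."""
    return cn2 / (2.281 - 0.01281 * cn2)

def cn_amc3(cn2):
    """CN for saturated conditions (AMC III) from the AMC II value (commonly quoted form)."""
    return cn2 / (0.427 + 0.00573 * cn2)
```

Both relations leave CN = 100 unchanged, while for intermediate values they return a lower CN for AMC I and a higher CN for AMC III, consistent with drier soils producing less runoff.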

Since a rainfall event variable in time is considered, the effective
rainfall

In this study, the Time-Area (TA) rainfall-runoff model is applied as flood routing technique to simulate the hydrologic responses to design storms. According to this well-known conceptual method, widely applied in hydrology, the river basin is schematized as a network of linear channels linking each point to the outlet, so that storage effects are disregarded. In particular, the basin is divided into an arbitrary number of subareas separated by isochrones, i.e. isolines of equal travel time to the outlet (Saghafian et al., 2002). The histogram of consecutive contributing subareas, taken from the outlet in the upstream direction, is the unit hydrograph, namely the model transfer function which transforms the excess rainfall into runoff. To construct the Time-Area histogram, the catchment time to equilibrium must be determined. This time is a function of rainfall intensity and catchment characteristics and represents the time needed for the most hydraulically remote point in the catchment to contribute to the surface runoff at the outlet under infinitely long rainfall duration. In what follows, the time to equilibrium is approximated by the time of concentration. This time is divided into a number of equal time intervals, representing the time difference between adjacent isochrones.

Rainfall generator model parameters and main characteristics of the investigated river basins.

In this study, we use GIS software to derive the Time-Area histogram from gridded Digital Elevation Models (DEM) of the basins under investigation. In particular, following an approach proposed by Kull and Feldman (1998), the travel time for each grid cell within the basin is assumed proportional to the time of concentration scaled by the ratio of the travel length of the cell to the maximum travel length. In other words, we assume a uniform and constant average velocity of runoff travelling from any point in the basin to the outlet. This approach makes it possible to derive the travel time of each cell by dividing the cell's travel length by the average velocity. Finally, the Time-Area histogram is obtained by grouping those cells whose travel times fall between the values of each pair of adjacent isochrones.
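The construction just described can be sketched in a few lines once the travel length of each DEM cell to the outlet is available (for instance as a raster exported from the GIS). The function name `time_area_histogram`, its arguments and the treatment of no-data cells as non-finite values are assumptions made for illustration.

```python
import numpy as np

def time_area_histogram(travel_len_m, cell_area_km2, tc_h, dt_h):
    """Contributing area (km^2) between adjacent isochrones spaced dt_h apart.

    Travel time of a cell = tc_h * (cell travel length / maximum travel length),
    i.e. a uniform, constant average velocity over the basin.
    Cells outside the basin are expected as NaN (assumed convention).
    """
    length = np.asarray(travel_len_m, dtype=float).ravel()
    length = length[np.isfinite(length)]
    travel_time = tc_h * length / length.max()
    n_bins = int(np.ceil(tc_h / dt_h))
    edges = np.arange(n_bins + 1) * dt_h
    # Guard against round-off pushing the most remote cell past the last edge.
    travel_time = np.minimum(travel_time, edges[-1])
    counts, _ = np.histogram(travel_time, bins=edges)
    return counts * cell_area_km2
```

With a 100 m grid each cell covers 0.01 km^2, so the histogram ordinates sum to the basin area, which provides a quick consistency check.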

Once the Time-Area histogram is known, the runoff hydrograph may be
determined through convolution:

It should be stressed here that the effective rainfall intensities in Eq. (6)
correspond to the effective rainfalls resulting from the hydrologic
losses module for each of the three AMC classes. In other words, three
hydrographs are derived for each rainfall event. Meanwhile, the frequency
distribution of AMC classes is computed from the total rainfall recorded in
the 5 days preceding historical storm events. Finally, recalling Bayes'
rule, the runoff hydrograph conditioned by the AMC classes is given by:

Following this approach, 1000 synthetic hydrographs are generated in response to the generated hyetographs.
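The convolution step of the Time-Area method can be illustrated with a short discrete sketch, in which each effective rainfall intensity is multiplied by the contributing areas and the contributions are summed with the appropriate time lag. The function name `ta_hydrograph` and the choice of units (mm/h, km^2, m^3/s) are assumptions; the sketch handles one effective hyetograph at a time, so it would be called once per AMC class.

```python
import numpy as np

def ta_hydrograph(excess_mm_h, areas_km2):
    """Discharge (m^3/s) from excess rainfall intensities and a Time-Area histogram.

    Both inputs must share the same time step. Discrete convolution:
    Q[k] = sum_j i[j] * A[k - j], with 1 mm/h over 1 km^2 = 1/3.6 m^3/s.
    """
    intensity = np.asarray(excess_mm_h, dtype=float)
    areas = np.asarray(areas_km2, dtype=float)
    return np.convolve(intensity, areas) / 3.6
```

The peak flow of each synthetic event is then simply the maximum of the returned array, for example `ta_hydrograph(i_eff, areas).max()`.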

By using a hierarchical clustering approach based on Ward's minimum variance algorithm (Ward, 1963), with the heterogeneity measures proposed by Hosking and Wallis (1997) as homogeneity test, Bonaccorso and Aronica (2016) have identified 5 approximately homogeneous regions for Sicily. In Fig. 2, the four gauged river basins under investigation are superimposed on the identified homogeneous regions.

Testing the goodness of fit of several candidate probability distributions to
the regionally pooled standardized AMR series

In particular, the GEV cumulative distribution function can be expressed as:

Reminding that the return period

With reference to the index value

For a lumped application of the model, a regional index value

Monte Carlo simulations have been carried out to assess the accuracy of the
estimated regional growth curves (Eq. 9). In particular, the parameters of
GEV distributions have been fitted to the regional averaged

Simulated versus observed peak flow frequency curves.

Results suggest that the performance of the regional

Therefore, one thousand design storm depths for every basin are randomly
drawn by solving Eq. (10). Then, each storm depth is distributed in time
according to a synthetic hyetograph randomly selected among those
resulting from the Huff curves procedure, by simply multiplying the time of
concentration and the storm depth for the
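The random generation of design storms can be sketched as inverse-transform sampling from a GEV growth curve, scaled by the index value, followed by the random choice of a dimensionless hyetograph whose duration is set equal to the time of concentration. The GEV parameterization used below (location `xi`, scale `alpha`, shape `k`, in the Hosking and Wallis convention) is an assumption, since the form of Eqs. (8) to (10) is not reproduced here; all function names, arguments and parameter values are likewise hypothetical.

```python
import numpy as np

def gev_quantile(f, xi, alpha, k):
    """GEV quantile for non-exceedance probability f (assumed parameterization)."""
    f = np.asarray(f, dtype=float)
    if k == 0:
        return xi - alpha * np.log(-np.log(f))      # Gumbel limit
    return xi + alpha / k * (1.0 - (-np.log(f)) ** k)

def sample_design_storms(n, index_mm, xi, alpha, k, curves, tc_h, rng):
    """Draw n storm depths and distribute each along a random dimensionless curve.

    curves: array (m, p) of cumulative depth fractions, each row going 0 -> 1.
    Returns time axis (h), total depths (mm) and cumulative hyetographs (mm).
    """
    curves = np.asarray(curves, dtype=float)
    u = rng.uniform(1e-9, 1.0, size=n)              # avoid log(0)
    depth = index_mm * gev_quantile(u, xi, alpha, k)
    pick = rng.integers(0, curves.shape[0], size=n)
    cumulative = depth[:, None] * curves[pick]
    times = np.linspace(0.0, tc_h, curves.shape[1])
    return times, depth, cumulative
```

A call such as `sample_design_storms(1000, 50.0, 1.0, 0.3, -0.1, curves, 4.0, np.random.default_rng(0))` would return one thousand cumulative hyetographs; differencing each row along the time axis gives the incremental gross rainfall to be passed to the losses module.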

Mean areal CN values for AMC II are derived for each river basin from a CN
AMC II digital map at 100 m grid resolution available for the entire Sicily
region. Also, mean areal CN values for AMC I and AMC III are obtained by
using the formulas by Hawkins et al. (1985). One thousand effective rainfall
intensities for each AMC condition are thus derived through Eq. (5).
Finally, the synthetic hydrographs are obtained by using Eq. (7), where the
AMC class frequencies

In Table 2, rainfall generator model parameters are reported for each river
basin together with other information, i.e.: homogeneous region of membership,
area [km

Synthetic and observed peak flow values are plotted versus the corresponding
return periods

A fair agreement can be observed for all the considered river basins when all the AMC class frequencies are taken into consideration (black lines), with Salso at Ponte Gagliano performing slightly worse. This can be related to the limited sample size of the Salso at Ponte Gagliano record, which comprises only 18 values. As can be expected, the quality of the agreement generally decays for high return periods. In this respect, it should be pointed out that the uncertainty of the measurements increases for high peak discharges (high return periods), due to the lower accuracy of the river stage-discharge relationship. In addition, inundation of floodplains in some parts of the catchments (more common for higher return periods) could reduce the maximum peak discharge within the river, leading to biased measurements. Finally, it is worth recalling that the adopted flood routing technique does not take into account potential storage effects in the hydrologic response of the catchments, which could be a limitation for large areas. Nevertheless, the results confirm that, overall, the proposed procedure can reproduce observed flood frequency curves with reasonable accuracy.

In the present paper a simple Monte Carlo scheme is proposed to derive peak flow frequency distribution in ungauged or poorly gauged basins. The proposed approach is based on a simplified and parsimonious description of the flood formation process, characterized by a reduced number of parameters in order to ensure a reduced uncertainty in model predictions. Clearly, each module of the procedure shows some advantages and drawbacks.

The regional frequency analysis in the rainfall generator module helps to
reduce the uncertainty related to the limited sample size of AMR series
(more accurate assessment of long

Following previous remarks, future enhancement of this study could include: (i) the use of meteorological information through synthetic indexes combined with rainfall data analysis to improve the selection of homogeneous regions; (ii) a regional model of antecedent soil moisture conditions and (iii) the implementation of a dynamic flood routing technique for a more flexible and reliable simulation of the flood formation process.

For the annual maxima rainfall (AMR) series of 1, 3, 6, 12 and 24 h used in this study,
readers may refer to Annali Idrologici – Parte Prima, Sezione B, Tabella III
(

The authors declare that they have no conflict of interest.

This study is part of the research activities carried out within the research contract n. F68C13000060008 between Sicilia e-Ricerca SpA and the University of Messina (European Regional Development Fund – PO FESR Sicilia 2007–2013, Linea di intervento 2.3.1. C).

Edited by: A. Loukas
Reviewed by: A. S. Chen and one anonymous referee