Report on Estimating Missing Discharge Data


Missing discharge data are estimated in two modeling phases. The first uses Multiple Linear Regression (MLR) on the discharge values of the other stations; the second adds rainfall data, including lagged rainfall values. The output and detailed results, including error metrics and correlation tables, are provided in the corresponding Excel sheets.

First Modeling

The file named "Input Data for First Modeling.xlsx" contains the input data for the first modeling phase for each station. In this phase, the discharge at each station is estimated from the discharge values of the other stations using Multiple Linear Regression (MLR). The first model alone cannot fill every gap, however: on some days when the dependent station lacks data, the other stations lack data as well, so no estimate can be made for those days. The rainfall values of the stations are therefore used to estimate the remaining missing data, since all four stations have rainfall data for every day.
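The first phase can be sketched roughly as follows. This is a minimal illustration, not the code used to produce the Excel files: the function name `fill_with_mlr`, the array layout, and the use of NumPy least squares are all assumptions. It fits the regression on days where every station has data, fills days where only the dependent station is missing, and leaves a gap where the other stations are missing too.

```python
import numpy as np

def fill_with_mlr(target, predictors):
    """Fill gaps in `target` by regressing it on other stations' discharge.

    target: 1-D sequence of daily discharge, NaN on missing days.
    predictors: 2-D array (days x other stations), NaN on missing days.
    Returns the filled series and the fitted coefficients
    (intercept first, then one coefficient per predictor station).
    """
    target = np.asarray(target, dtype=float)
    predictors = np.asarray(predictors, dtype=float)

    have_x = ~np.isnan(predictors).any(axis=1)  # all predictor stations present
    have_y = ~np.isnan(target)                  # dependent station present
    train = have_x & have_y

    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(X[train], target[train], rcond=None)

    filled = target.copy()
    can_fill = have_x & ~have_y   # target missing, predictors available
    filled[can_fill] = X[can_fill] @ coef
    # Days where the predictors are also missing stay NaN; these are
    # the days left for the rainfall-based second phase.
    return filled, coef
```

Days that remain NaN after this step are exactly the ones that motivate the second phase.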

The output of the first model is stored in an Excel file named after the respective station. In this file, the sheet labeled 'Output Model1' contains the output of the first modeling phase, in which missing data are estimated from the discharge values of the other stations. The sheet labeled 'Model1 Result' contains the results of the MLR fit, including error metrics, R² values, and a correlation table.
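For reference, goodness-of-fit measures of this kind can be computed as in the sketch below. The report does not state which error metrics were used, so RMSE and MAE are assumptions chosen for illustration, and the function names are hypothetical. Both inputs are assumed to be equal-length lists with no missing values.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

Note that `r_squared` is undefined when the observed values are all identical, since the total sum of squares is then zero.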

Second Modeling

In the second modeling phase, the output of the first model is used as the dependent variable. We examined the rainfall values of the other stations, as well as those same values lagged by one and two days, and selected the variables showing strong correlation with the dependent variable as the independent variables of the model.
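The lagging and selection step might look like the sketch below. It is illustrative only: the helper names, the use of Pearson correlation, and the 0.6 cutoff for "strong" correlation are assumptions, since the report does not give the criterion that was actually applied.

```python
import math

def lagged(series, lag):
    """Shift a daily series by `lag` days; the first `lag` entries become None."""
    return [None] * lag + list(series[:len(series) - lag])

def pearson(x, y):
    """Pearson correlation over the days where both series have a value."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a, _ in pairs))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for _, b in pairs))
    return cov / (sd_x * sd_y)

def select_predictors(discharge, rainfall_by_station, lags=(0, 1, 2), threshold=0.6):
    """Keep rainfall variables whose |r| with discharge meets the threshold.

    `threshold` is a hypothetical cutoff, not a value taken from the report.
    Returns a dict mapping names such as "StationA_lag1" to their correlation.
    """
    selected = {}
    for name, rain in rainfall_by_station.items():
        for lag in lags:
            r = pearson(discharge, lagged(rain, lag))
            if abs(r) >= threshold:
                selected[f"{name}_lag{lag}"] = r
    return selected
```

The selected variables would then serve as the independent variables of a regression like the one in the first phase, with rainfall in place of discharge.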

Each station file contains a sheet labeled 'Input of Model2' with the input data for the second modeling phase. The 'Output Model2 Using Rainfall' sheet contains the output of the second model, which completes the discharge record. The efficiency metrics, correlation table, and modeling matrix are in the 'Model2 Result' and 'Model2 Matrix' sheets.