Paper

This report outlines my university project for the Predictive Analytics course at Copenhagen Business School. The study aimed to forecast daily electricity load in Denmark by comparing traditional time series approaches with models that integrate renewable energy generation data. Using a five-year dataset (2016–2020) from the Open Power System Data platform, the project followed a rigorous pipeline that included exploratory data analysis to identify strong weekly seasonality and structural breaks, followed by stationarity testing that necessitated first-order differencing. The core analysis involved training and evaluating three primary modeling techniques: a Seasonal Naive baseline, standard and seasonal ARIMA models, and a Dynamic Regression model incorporating wind and solar generation variables. These models were rigorously validated using Ljung-Box diagnostic tests and compared across different forecast horizons, ultimately revealing that while Auto-ARIMA performed best for short-term predictions, the Dynamic Regression model offered superior accuracy for longer 30-day forecasts by effectively capturing weather-driven demand fluctuations.
Denmark is a global leader in the transition to renewable energy systems, with aggressive targets for complete decarbonization by 2050 (Jensen 2024). This transition has transformed electricity demand forecasting, posing new challenges to grid operators and policymakers, who must now account for many additional variables. The Danish electricity system is a distinctive case study because of its exceptionally high share of variable renewable energy sources and its advanced grid infrastructure, and it offers valuable lessons for other markets undergoing similar energy transitions (Schütz Roungkvist et al. 2020). However, integrating renewable energy sources makes load forecasting considerably more complex: research shows that it presents fundamental challenges for traditional load prediction methods, with wind prediction errors of more than 60% and solar prediction errors of 50% occurring during high-utilization scenarios (Wang et al. 2023). This complexity calls for forecasting approaches that can capture the relationships between renewable generation patterns and electricity demand.
This study develops and evaluates forecasting models for Danish electricity load by comparing traditional time series approaches with dynamic regression models that incorporate renewable energy generation as predictive variables. Specifically, we examine the effectiveness of ARIMA models, seasonal ARIMA (SARIMA) models, and dynamic regression models with ARIMA errors in predicting daily electricity load over multiple forecasting horizons. The analysis excludes ETS models because preliminary testing revealed significant autocorrelation in the STL decomposition remainders, indicating that they are inadequate for this application. We also assess whether incorporating solar and wind generation data materially improves forecast accuracy relative to purely temporal models.
This analysis addresses several critical questions that are increasingly relevant for modern electricity systems: How do different forecasting methodologies perform in predicting Danish electricity load? Does the integration of renewable energy generation variables enhance forecast accuracy? How does forecast performance vary across different prediction horizons?
Forecasting electricity load in systems with substantial renewable capacity has become increasingly complex as wind and solar penetration rises. When renewables make up a large part of the energy supply, traditional models such as ARIMA and ETS struggle to capture demand trends (Wang et al. 2022), a challenge that is particularly evident in high-penetration markets like Denmark (Jensen 2024).
To address this, Lu and Chen (2024) proposed a transformer-based model using data slicing and channel independence, achieving a 47% reduction in forecasting error. Further insights come from Schütz Roungkvist et al. (2020), who analyzed the Danish market and identified wind production and thermal generation as key variables. Moreover, Karabiber and Xydis (2019) showed that combining forecasting methods outperforms individual models in the Denmark West region.
Finally, probabilistic deep learning methods are proving superior to deterministic approaches: Kaur et al. (2022) reviewed recent advancements and found that these models better account for intermittency and uncertainty in renewable energy data.
This study uses the Time Series dataset (October 6, 2020 version) provided on the Open Power System Data (OPSD) platform (Open Power System Data 2020), which includes hourly electricity load and renewable generation data for various European countries; only the columns related to Denmark were used in this project. After checking for missing values, dropping irrelevant columns, and converting data types, the cleaned dataset consists of daily averages from January 2016 to December 2020, a five-year window that covers seasonal cycles along with medium-term trends in the Danish power system. It contains 1,736 daily observations of three main variables: electricity load (DK_Load), solar generation (DK_Solar), and wind generation (DK_Wind), all measured in megawatt hours (MWh).
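
The aggregation from hourly records to daily averages can be sketched as follows. This is a minimal Python illustration rather than the project's own code (the analysis itself was carried out in R); the function name `daily_means`, the toy values, and the choice to skip missing readings are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def daily_means(hourly):
    """Average (timestamp, value) hourly records into one value per calendar day.

    Missing readings (None) are skipped, so a day with gaps is averaged over
    the hours that are present. This handling is an assumption of the sketch.
    """
    buckets = defaultdict(list)
    for stamp, value in hourly:
        if value is not None:
            buckets[stamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Hypothetical toy input: two days of hourly load with one missing reading.
start = datetime(2016, 1, 1)
hourly_load = [(start + timedelta(hours=h), 3000.0 + h) for h in range(48)]
hourly_load[5] = (hourly_load[5][0], None)

daily = daily_means(hourly_load)
for day, value in daily.items():
    print(day, round(value, 2))
```
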
The initial time plot (Figure 1) exhibits clear seasonal patterns, with higher consumption during winter months and pronounced weekly cycles that mirror the rhythm of industrial and commercial activity. The seasonal pattern shows consistent yearly oscillations, with winter highs of 4,000–4,500 MWh and summer lows of 2,800–3,200 MWh, indicating additive seasonality that is relatively stable over the years.
Turning to the Seasonal and Trend decomposition using Loess (STL), three broad components of the Danish electricity load data are evident, as shown in Figure 2: (1) The trend is U-shaped: falling from 2016 to 2017, leveling off in 2018, and rising slightly thereafter. (2) The seasonal component contains very strong annual cycles and clear weekly cycles with lower weekend consumption. (3) The remainder plot reveals a clear autocorrelation structure.
A formal Ljung–Box test on the STL remainders yielded a p-value close to zero, providing clear evidence that they are not white noise. This violates one of the core assumptions of ETS models, making them unsuitable for this analysis and instead supporting the use of ARIMA-based approaches, which are better at capturing temporal dependencies of this kind.
Correlation analysis supports the use of renewable generation variables in forecasting models: solar generation is moderately negatively correlated with electricity load, reflecting low demand during summer days. Wind generation, unexpectedly, is weakly positively correlated with load, as shown in Table 1 (presumably because cold, windy weather raises heating demand). Solar and wind generation are seasonally complementary, with solar peaking in summer and wind in winter.
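
To make the white-noise check concrete, the Ljung–Box statistic Q = n(n+2) Σ r_k² / (n − k) can be computed by hand. The sketch below is a standard-library Python illustration, not the R routine used in the project; the toy series and the name `ljung_box_q` are hypothetical. With no fitted ARMA parameters, Q at lag 10 is compared against the 5% critical value of a chi-square distribution with 10 degrees of freedom (18.307).

```python
def ljung_box_q(values, max_lag=10):
    """Ljung-Box Q statistic over lags 1..max_lag for a numeric sequence."""
    n = len(values)
    center = sum(values) / n
    centered = [v - center for v in values]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        num = sum(centered[t] * centered[t - k] for t in range(k, n))
        r_k = num / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# 5% critical value of a chi-square distribution with 10 degrees of freedom.
CHI2_95_DF10 = 18.307

# Hypothetical remainder series that still contains a weekly pattern.
weekly_remainder = [1.0, 0.5, 0.0, -0.5, -1.0, 0.3, -0.3] * 20

q_stat = ljung_box_q(weekly_remainder, max_lag=10)
rejects_white_noise = q_stat > CHI2_95_DF10
print(round(q_stat, 2), rejects_white_noise)
```

A remainder with leftover weekly structure produces a Q far above the critical value, which is the same qualitative outcome the report describes for the STL remainders.
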
| Variable | Correlation with Load | Interpretation |
|---|---|---|
| Solar | -0.463 | Negative – as expected |
| Wind | 0.190 | Positive – weather effect |
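
The coefficients in Table 1 are Pearson correlations. A minimal Python sketch of the calculation is given below; the six data points are invented for illustration and are not taken from the OPSD dataset, and the helper name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical toy values: load is high when solar output is low.
load = [4200.0, 4100.0, 3600.0, 3000.0, 2900.0, 3500.0]
solar = [20.0, 60.0, 150.0, 260.0, 240.0, 90.0]

r_solar = pearson(load, solar)
print(round(r_solar, 3))
```
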
Visual inspection of the load series showed stable variance and largely additive seasonality, making a logarithmic transformation unnecessary. The trend and seasonal components suggest that differencing may be required to achieve stationarity for ARIMA modeling.
Stationarity is tested formally using the ADF and KPSS tests. The ADF test with trend returns a test statistic of -3.831, below the 5% critical value of -3.41, which leads to rejection of the null hypothesis of a unit root and suggests that the series is stationary around a deterministic trend. However, the KPSS test (test statistic = 0.730, p-value = 0.0108) gives a contradictory result, rejecting the null hypothesis of stationarity.
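
Because the two tests have opposite null hypotheses, their outcomes have to be read together. The following Python sketch spells out that logic using the statistics reported above; the function name and the verdict labels are illustrative assumptions, not output of any statistical package.

```python
def joint_verdict(adf_stat, adf_critical, kpss_p_value, alpha=0.05):
    """Combine ADF (null: unit root) and KPSS (null: stationarity) outcomes."""
    adf_rejects_unit_root = adf_stat < adf_critical
    kpss_rejects_stationarity = kpss_p_value < alpha
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if kpss_rejects_stationarity and not adf_rejects_unit_root:
        return "non-stationary"
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        return "conflicting"
    return "inconclusive"


# Values reported in the text for the undifferenced load series.
verdict = joint_verdict(adf_stat=-3.831, adf_critical=-3.41, kpss_p_value=0.0108)
print(verdict)
```
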
Such inconsistency between tests is expected for series that are stationary around a deterministic trend but have changing variance or minor structural shifts. To resolve this ambiguity, the analysis follows the recommendation of the ndiffs() function from the forecast package in R, which suggests a single first-order difference (ndiffs() = 1); subsequent testing confirms that this is effective in inducing stationarity. The differenced series shows clear stationarity features, with stable variation around zero. The recommended number of differences informs the ARIMA model specification, and the successful achievement of stationarity confirms that differencing is appropriate for this dataset.
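
Differencing itself is a simple operation. The Python sketch below (an illustration with made-up numbers, not the project's R code) shows how a first difference removes a linear trend and how a lag-7 difference removes a repeating weekly pattern.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# A linear trend becomes a constant after first-order differencing.
trend = [100.0 + 2.5 * t for t in range(10)]
first_diff = difference(trend)

# A purely weekly pattern vanishes after seasonal (lag-7) differencing.
weekly = [float(t % 7) for t in range(21)]
seasonal_diff = difference(weekly, lag=7)

print(first_diff)
print(seasonal_diff)
```
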
The autocorrelation function (ACF) and partial autocorrelation function (PACF) help characterize the temporal dynamics of the electricity load series. The ACF of the raw series decays slowly, with significant autocorrelation persisting beyond 40 lags, confirming non-stationarity and justifying the differencing indicated by the formal tests. The PACF of the original series shows a clear cutoff after lag 1, with notable spikes at weekly intervals (lags 7, 14, 21), which points to the presence of both trend and seasonal components. This pattern suggests that an AR(1) term is relevant and that weekly seasonality must be addressed in model specification. After differencing, the ACF remains large at weekly lags (7, 14, 21, 28), confirming that weekly seasonality persists after detrending. The PACF of the differenced series shows a similar pattern and is used to inform both the non-seasonal and seasonal ARIMA components.
Structural break testing checks whether the data-generating process remained stable from 2016 to 2020. Both the OLS-CUSUM and OLS-MOSUM tests give highly significant results (test statistic = 4.4414, p-value < 2.2e-16), which is strong evidence against structural stability in the electricity load series. These tests indicate that the statistical properties of the load series changed significantly over the sample period.
The QLR (Chow-type) test, applied with the breakpoints() function, identifies a number of candidate breaks, with Bayesian Information Criterion (BIC) values declining as more breaks are added. The BIC is lowest with four breaks (BIC = 25816), located at observations 415, 676, 1143, and 1404.
These results indicate that the Danish electricity load series experienced multiple structural breaks, possibly as a result of changing policies. Despite these breaks, the analysis proceeds with homogeneous modeling approaches, as dynamic regression models may be able to account for these changes to some extent via time-varying relationships with the renewable generation variables.
The estimation stage comprises several forecasting techniques, each capturing different aspects of electricity load behavior. Following the STL remainder analysis, which showed that ETS models could not properly capture the temporal dependence in the data, the analysis employs three modeling methods: a seasonal naive baseline, ARIMA models, and dynamic regression models with renewable energy variables. For validation, the dataset was divided with a 90–10 train–test split, which provided a good balance between training depth and a sufficiently long test period.
The baseline model employs a seasonal naive approach using a one-year lag, where each daily forecast equals the load observed on the same day in the previous year. This approach captures annual seasonality while providing a simple benchmark against which to evaluate more sophisticated methods.
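
For time series, a train–test split has to preserve chronological order, with the test set taken from the end. A minimal Python sketch follows; the helper name and the rounding rule (truncating the cut index) are assumptions, since the report does not state the exact cut point.

```python
def chronological_split(series, train_fraction=0.9):
    """Split a time-ordered sequence into train and test parts without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Placeholder indices standing in for the 1,736 daily observations.
observations = list(range(1736))
train, test = chronological_split(observations)
print(len(train), len(test))
```

Under this truncation rule, 1,736 observations yield 1,562 training days and 174 test days.
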
The model is implemented as SNAIVE(DK_Load ~ lag(365)), exploiting the strong annual patterns identified in the exploratory analysis.
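
The seasonal naive rule can be written in a few lines. The sketch below is a Python illustration of the general technique, not the R implementation used in the project; it uses a period of 7 and invented values so the output is easy to follow, whereas the baseline in this study uses a period of 365.

```python
def seasonal_naive(train, horizon, period):
    """Forecast each future step with the value observed one season earlier.

    Forecasts beyond one season ahead reuse the last observed season.
    """
    if len(train) < period:
        raise ValueError("need at least one full season of history")
    last_season = train[-period:]
    return [last_season[h % period] for h in range(horizon)]


# Hypothetical history covering two weeks of daily values.
history = [10, 12, 14, 13, 11, 7, 6, 11, 13, 15, 14, 12, 8, 7]
forecast = seasonal_naive(history, horizon=10, period=7)
print(forecast)
```
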
Two ARIMA approaches were implemented: automatic model selection and manual specification of a seasonal ARIMA (SARIMA) model designed to capture the weekly seasonality identified in the differenced series.
The auto-ARIMA algorithm selects ARIMA(1,0,2)(2,1,0)[7] as optimal: a weekly seasonal specification with one non-seasonal autoregressive term, two moving-average terms, two seasonal autoregressive terms, and one seasonal difference.
The manual specification employs SARIMA(1,1,1)(0,1,1)[7], with explicit modeling of weekly seasonality and parameter selection based on identification-phase diagnostics. The non-seasonal components (1,1,1) reflect specific diagnostic signals: the autoregressive order p=1 follows the sharp PACF cutoff after lag 1; the differencing order d=1 follows the ndiffs() recommendation; and the moving-average order q=1 is a conservative choice to capture short-run dependencies. The seasonal terms (0,1,1)[7] capture the strong weekly cycles detected in the raw series and reinforced by the ACF/PACF analysis, where significant spikes appear at lags that are multiples of 7. Seasonal differencing with D=1 removes residual weekly seasonality, and the seasonal MA term Q=1 captures weekly dependencies without making the model too complex. For the same reason (avoiding over-parameterization), no seasonal AR term was included (P=0).
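
For reference, the manual specification can be written out in standard backshift notation. The symbols below are introduced here for illustration and do not appear in the original analysis: $y_t$ is the daily load, $B$ is the backshift operator ($B y_t = y_{t-1}$), $\phi_1$ is the AR coefficient, $\theta_1$ and $\Theta_1$ are the non-seasonal and seasonal MA coefficients, and $\varepsilon_t$ is white noise. Under these assumptions, SARIMA(1,1,1)(0,1,1)[7] reads:

$$
(1 - \phi_1 B)\,(1 - B)\,(1 - B^{7})\, y_t = (1 + \theta_1 B)\,(1 + \Theta_1 B^{7})\, \varepsilon_t
$$

The factors $(1 - B)$ and $(1 - B^{7})$ correspond to d=1 and D=1, and the absence of a seasonal AR polynomial on the left-hand side corresponds to P=0.
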
Dynamic regression models are the most sophisticated implementation in this analysis, combining ARIMA error structures with renewable energy variables. Our model takes the form:

$$y_t = \beta_0 + \beta_1\,\mathrm{Solar}_t + \beta_2\,\mathrm{Wind}_t + \eta_t$$

where $y_t$ is the electricity load, $\mathrm{Solar}_t$ and $\mathrm{Wind}_t$ are scaled solar and wind generation respectively, and $\eta_t$ follows an ARIMA process.
This approach recognizes that while renewable output may explain some load fluctuation, the residual patterns likely retain temporal structure that is best captured by an ARIMA error model. The renewable energy variables are standardized using training-set statistics to give stable estimation and interpretable coefficients, while the ARIMA error structure is selected automatically by minimizing information criteria.
The estimated coefficients for wind and solar, each expressed per standard deviation of the regressor, are negative, indicating that higher renewable output is associated with lower load once seasonality and persistence are handled by the error process. This differs slightly from the raw correlation results because the regression controls for seasonality and the simultaneous effects of both variables, revealing the underlying relationship more clearly.
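
Standardizing with training-set statistics means that the mean and standard deviation are estimated on the training data only and then reused unchanged on the test data, so no information from the test period leaks into the model. The Python sketch below illustrates the idea; the helper names, the toy wind values, and the use of the sample standard deviation are assumptions of the example.

```python
from statistics import mean, stdev


def fit_scaler(train_values):
    """Estimate centering and scaling constants from the training data only."""
    return mean(train_values), stdev(train_values)


def apply_scaler(values, center, scale):
    """Standardize values with previously fitted constants."""
    return [(v - center) / scale for v in values]


# Hypothetical wind generation values for the training and test periods.
wind_train = [800.0, 1200.0, 1000.0, 1400.0, 600.0]
wind_test = [1000.0, 1800.0]

center, scale = fit_scaler(wind_train)
wind_train_scaled = apply_scaler(wind_train, center, scale)
wind_test_scaled = apply_scaler(wind_test, center, scale)
print([round(v, 3) for v in wind_test_scaled])
```
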
The primary diagnostic tool is the Ljung-Box test at lag 10, which checks whether model residuals exhibit significant autocorrelation, a sign of model misspecification. The dynamic regression model and the automated ARIMA model both produce satisfactory diagnostic results, with Ljung-Box p-values of 0.1442 and 0.1635 respectively, both well above the 5% significance level, indicating that the selected specifications adequately capture the temporal dependencies in the data.
However, the SARIMA(1,1,1)(0,1,1)[7] specification fails diagnostic testing, with a Ljung-Box p-value of zero indicating residual autocorrelation and model misspecification. This diagnostic failure excludes the SARIMA model from serious consideration, despite its apparently satisfactory forecast performance.
Visual inspection of residual plots confirms the Ljung-Box results: both the dynamic regression and automatic ARIMA models show residuals fluctuating randomly around zero with constant variance and no evident autocorrelation, indicating adequate specification. In contrast, the SARIMA residuals show remaining weekly patterns and autocorrelation.
Table 2 compares the information criteria for the ARIMA models. The manual SARIMA achieves lower AIC, AICc and BIC values than the auto-ARIMA, which would ordinarily indicate a better fit. However, as seen above, it fails the Ljung-Box test, revealing residual autocorrelation and an error structure that is not properly modeled. Note also that the two models use different orders of non-seasonal differencing (d=1 versus d=0), so their information criteria are not strictly comparable.
In contrast, the automatic ARIMA has slightly higher information criteria but passes the diagnostic checks, making it the more reliable choice.
| Model | Specification | AIC | AICc | BIC |
|---|---|---|---|---|
| Manual SARIMA | ARIMA(1,1,1)(0,1,1)[7] | 19491.32 | 19491.35 | 19512.72 |
| Auto ARIMA | ARIMA(1,0,2)(2,1,0)[7] | 19824.45 | 19824.51 | 19856.55 |
The results indicate that the best model choice depends strongly on the forecast horizon, with different approaches performing best over different timescales.
At the 7-day horizon, the automatic ARIMA performs best with a MAPE of 4.48%, slightly better than the manual SARIMA (4.52%) and dynamic regression (4.54%) models. At the 14-day horizon, the automatic ARIMA (4.02% MAPE) keeps its lead over the dynamic regression model (4.10%). The manual SARIMA model falls clearly behind at this horizon (4.49%), in line with the diagnostic results that mark it as misspecified. At the 30-day horizon, however, the dynamic regression model performs best at 3.95% MAPE: a 0.39 percentage point improvement over the auto-ARIMA (4.34%) and a 0.67 percentage point improvement over the SARIMA (4.62%). The seasonal naive benchmark reaches 8.29% MAPE, showing that while annual seasonality is important, more advanced modeling adds considerable value.
| Model | 7-day MAPE | 14-day MAPE | 30-day MAPE | Ljung-Box p-value |
|---|---|---|---|---|
| Dynamic Regression | 4.54% | 4.10% | 3.95% | 0.1442 |
| Auto ARIMA | 4.48% | 4.02% | 4.34% | 0.1635 |
| Manual SARIMA | 4.52% | 4.49% | 4.62% | 0.0000 |
| Seasonal Naive | 7.65% | 8.11% | 8.29% | N/A |
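
The accuracy measure in the table is the mean absolute percentage error (MAPE). A minimal Python sketch is shown below with invented numbers. Evaluating each horizon on the first h days of the test period is an assumption of this illustration; the helper names are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def mape_at_horizon(actual, forecast, horizon):
    """MAPE over the first `horizon` steps of the test period (assumed convention)."""
    return mape(actual[:horizon], forecast[:horizon])


# Hypothetical actual and forecast load values.
actual = [3000.0, 3200.0, 2800.0, 2500.0]
predicted = [3150.0, 3040.0, 2800.0, 2600.0]

score_all = mape(actual, predicted)
score_first_two = mape_at_horizon(actual, predicted, horizon=2)
print(round(score_all, 2), round(score_first_two, 2))
```
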
These results show that including renewable variables reduces forecast error, particularly at longer horizons, with the dynamic regression model also offering insight into the relationship between renewable output and electricity demand.
The benefit of adding renewable variables is 0.39 percentage points at the 30-day horizon (4.34% - 3.95%), a relative reduction of roughly 9% in MAPE compared with the auto-ARIMA. While small in absolute terms, this error reduction could translate into meaningful economic gains for grid management when accumulated over a full year.
Figure 4 illustrates the forecasts for the first 30 days of the test period (April-May 2020) for the three models, providing a detailed visual comparison against actual values. The graph clearly shows the weekly cyclical pattern captured by all models, with the dynamic regression model (green line) tracking the actual values (black points) most closely, particularly at the weekly peaks and troughs.
This study aimed to forecast daily electricity load in Denmark using both traditional time series models and models enhanced with renewable energy data. ARIMA, SARIMA and dynamic regression approaches were compared across multiple forecast horizons to determine which model performs best. The methodology involved: (1) exploratory data analysis revealing strong weekly seasonality, trends, and structural breaks; (2) stationarity testing leading to first-order differencing; (3) model estimation with automatic and manual specifications; (4) model evaluation through diagnostic testing and forecast accuracy assessment over multiple horizons.
The results indicate that the best model varies with the forecast horizon, with auto-ARIMA(1,0,2)(2,1,0)[7] performing best in the short term (7-day MAPE: 4.48%; 14-day MAPE: 4.02%) and dynamic regression with renewable variables performing best at the longer horizon (30-day MAPE: 3.95%).
Notably, the manually specified SARIMA(1,1,1)(0,1,1)[7] delivered the lowest information criteria (AIC = 19491.32) but failed the diagnostic tests (Ljung-Box p-value = 0), which illustrates how a good in-sample fit can conceal underlying misspecification. Additionally, ETS models were excluded because of severe autocorrelation in the STL remainders, which suggests that these traditional approaches are poorly suited to this high-renewable system.
In conclusion, incorporating renewable energy variables improves medium-term load forecasting by 0.39 percentage points of MAPE, with dynamic regression effectively capturing weather-driven relationships. However, the analysis is limited to daily data, whereas operational planning often requires forecasts at an hourly or even sub-hourly resolution. Future research could extend this work by developing high-frequency forecasting models that directly integrate meteorological inputs such as solar radiation, wind speed, and temperature profiles. In addition, it would be valuable to test model performance during periods of market stress, such as supply shortages or sharp demand spikes, to assess how well forecasting systems adapt to extreme and rapidly changing conditions.
| Test | Test Statistic | Result/Interpretation |
|---|---|---|
| Stationarity Tests | | |
| ADF (with trend) | -3.831 | Stationary (5% critical value: -3.41) |
| KPSS | 0.730 | Non-stationary (p-value: 0.0108) |
| Recommended differencing (ndiffs) | 1 | First-order differencing required |
| Box-Cox (lambda) | 0.890 | Near 1, no transformation needed |
| Structural Break Tests | | |
| OLS-CUSUM | 4.4414 | p-value < 2.2e-16*** |
| OLS-MOSUM | 4.4414 | p-value < 2.2e-16*** |
| STL Decomposition Remainder | | |
| Ljung-Box (lag 10) | 12,143.31 | p-value ≈ 0*** |

*** Significant at the 0.1% level