
First test of the efficacy of monastery-style working environments

09/09/2024

How much does work environment matter? How deliberate should and can environment design be? There seems to be some consensus on the radically increased availability of distractions in modern life compared to a monk-style existence hundreds of years ago. Rather than mentally exhausting ourselves with conscious, disciplined decisions against distracting actions, like picking up the phone or spending time on Twitter, we can try to design our environment in such a way that these distractions do not exist in the first place. The interesting question is whether this changes our "baseline" in a meaningful way, i.e. if we do not have access to these distractions for long enough, maybe we stop missing them and hard work starts to feel more stimulating.

Monasteries are places built to exclude such distractions. However, when visiting a monastery today it can be difficult to stay fully sovereign over how one's time is spent (religious practices are part of the schedule). There is probably some optimal balance between a day having enforced structure (e.g. meals at specific times) and maintaining freedom in what is being worked on. A counterintuitive but possibly fitting place might be a container ship. Days are highly structured, chores do not exist (food is provided by one central kitchen etc.), high-bandwidth internet does not exist and so on. To test this I embarked on a mid-size cargo vessel for 10 days. I downloaded a number of maths textbooks, worked through them every day, and made far more progress than I could have without the isolation of the ship. Social life can be an issue, depending on how much one depends on it, but overall I would argue that a cargo ship journey provides many aspects that commonly available working environments lack. Viability is obviously heavily restricted by the constraints on the types of work possible, i.e. the work must not require internet access or communication in general. For tasks like self-study, however, this setup seems to be close to optimal.

Mental-state optimality for learning tasks

20/07/2024

Hypothetical scenario: in 30 minutes you will be told a 30-digit number. Immediately after seeing it, you will be put into an artificial coma lasting three months. If you wake up and can remember the number you will stay alive; otherwise you will be shot. How would you spend the 30 minutes to maximise the probability of successful recall after the coma?

But also in less hypothetical settings, memorization plays a crucial role in learning and in the integration of new material. The consolidation of perceived information not only makes a piece of information available as a fact to be recalled but also changes how we think. But what factors influence the successful consolidation of perceived information? Most people are aware of spaced repetition tools like Anki, which iteratively increase the half-life of a specific piece of information. However, the deliberate practice of recall is not the only way of controlling the forgetting curve of what has been perceived and learned.
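
To make the half-life framing concrete, here is a small illustrative sketch (my own toy model, not taken from any of the cited studies): an exponential forgetting curve whose half-life grows with each successful recall, roughly the mechanism behind tools like Anki. The doubling factor is an arbitrary assumption.

    def recall_probability(t_days: float, half_life_days: float) -> float:
        """Exponential forgetting curve: chance of recall t days after the last review.

        Modelled as 2^(-t / half_life), so recall probability is exactly 0.5
        once t equals the current half-life of the memory trace.
        """
        return 2.0 ** (-t_days / half_life_days)

    def strengthen(half_life_days: float, growth: float = 2.0) -> float:
        """Each successful recall extends the half-life (doubling is an assumption)."""
        return half_life_days * growth

    # Reviews spaced further and further apart keep recall probability high.
    half_life, last_review = 1.0, 0
    for day in (1, 3, 7, 14):
        p = recall_probability(day - last_review, half_life)
        print(f"day {day}: recall probability before review = {p:.2f}")
        half_life, last_review = strengthen(half_life), day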

memory consolidation: category of processes that stabilize a memory trace after acquisition (modelled through different theories)

systems consolidation: the gradual process of reorganizing memories, typically transferring dependency from the hippocampus to the neocortex over time; second phase of memory consolidation

Chan, Jason C. K. et al. “Retrieval Potentiates New Learning: A Theoretical and Meta-Analytic Review.” Psychological Bulletin 144 (2018): 1111–1146.: https://pubmed.ncbi.nlm.nih.gov/30265011/

Many different factors influence how important the brain perceives a piece of information to be. The brain models this importance and encodes the information with a strength proportional to that modelled importance. This heuristic seems to be a function of the current brain state in terms of different levels of neurotransmitter concentrations (or maybe not a function of it, but rather it is itself the current brain state). These neurotransmitter and neuromodulator concentrations correspond to a range of psychological and cognitive states (conscious or subconscious): intensity of attention, arousal, emotion, level of metacognition, perceived novelty, perceived (reward) relevance. Together, these dimensions span a space of possible combinations and thus of resulting mental states.

The naturally arising question is then: what position in this space, i.e. what mental state, maximises the later recall probability of learned information? We can view this space as the domain of an optimization problem, or, through the lens of control theory, ask how we can stay in a specific area of that space and, more importantly, how we can move our mental state towards more desirable regions. Before we look at the individual dimensions of this space that seem to cover as much as possible of the variance in recall probabilities at test time, it is important to mention that it is very difficult to disentangle the effect of mental-state aspects (e.g. the current estimate of novelty in our perception) on cognitive performance in general from their effect on just memory formation and consolidation. At the same time, this disentanglement seems rather unimportant: an increase in general cognitive performance will likely benefit memory as well. A second remark on the factors that seem to influence memory consolidation the most: for some dimensions, it is hard to determine whether they are partly contained within each other. E.g. the level of perceived novelty might directly influence memory consolidation; at the same time, it is possible that it does so only through increasing the level of attention and thus improves memory performance only indirectly. However, ultimately it is unimportant whether memory performance is improved directly or indirectly.

control theory: field in applied mathematics that deals with dynamic systems in which feedback is used to steer the state of that system into desirable subspaces of the state space

Dimensions:

  • attention: attention is the brain's main mechanism for differentiating between important and unimportant stimuli and has the (maybe most obvious) effect on memory encoding and later recall ability.
  • general arousal: arousal refers to the general state of psychological and physiological activation and is largely defined through the interplay of the sympathetic (fight or flight) and parasympathetic (rest and digest) parts of the autonomic nervous system. For cognitively non-demanding tasks, a higher level of arousal is associated with higher cognitive performance; for cognitively more demanding tasks there seems to be a sweet spot of moderate arousal. The psychologists Robert M. Yerkes and John Dillingham Dodson formalised this as the Yerkes–Dodson law as early as 1908. Arousal together with strong emotions also seems to be the main driver of so-called flashbulb memories, i.e. memories that remain vividly recallable decades later without deliberate practice and are mostly formed during strongly moving moments like childbirth or events of trauma.
  • emotion: different emotional states, especially states of strong emotion, seem to have a strong effect on later recall ability. This is likely closely connected to the impact of general arousal and is highly associated with flashbulb memories as well.
  • metacognition: according to Wikipedia, "an awareness of one's thought processes and an understanding of the patterns behind them". Reflection on one's own thought and reasoning processes is perhaps most prominent in meditation paradigms like mindfulness. Several studies show a significant impact of metacognition (or, more predominantly, meditation) on different measures of cognitive performance. This includes both the general practice of meditation and the practice of meditating immediately before a learning task. Again, it seems difficult to separate the effects on general cognitive performance from those on memory. Even more importantly, it seems rather challenging to separate the effects of meditation or metacognition on cognitive performance from effects that run through attention and thus influence cognitive performance only indirectly.
  • perceived novelty: different cognitive frameworks like predictive coding are based on principles of prediction error and model updating. To generalise well and not overfit on specific stimuli, it would be useful to model the novelty of stimuli and tie encoding and consolidation strength to that novelty. Several neurotransmitters, most importantly dopamine and acetylcholine, are associated with both this perception of novelty and memory encoding strength. Studies show that novel environments improve memory performance, the role of acetylcholine in memory and novelty perception is well established, and artificially increasing dopamine levels by administering a dopamine precursor (L-DOPA) increases memory performance.
  • perceived (reward) relevance: different neurotransmitters, most importantly dopamine, are connected to the reward-importance modelling of stimuli in the human brain. Research shows that this perceived reward relevance (also called incentive salience) is correlated with memory performance.
  • context similarity between train and test time: so far we have treated the impact of different dimensions like attention as separate between train and test time. However, there is not only the notion of conditions being optimal for train and test time independently, but also the orthogonal effect of conditions simply being similar between train and test time. Research shows that the similarity of specific aspects of the mental state (including the current perception of external stimuli, i.e. the environment) between train and test time has a strong effect on successful recall probability. It very much feels like a form of compression in which a piece of information is stored not at its full size but conditioned on the current context. If my environment is represented by 1 bit and I want to store a piece of information of 10 bits, I only need 9 bits of storage if I condition the information on the environment (obviously at the heavy cost of only being able to retrieve the information when the environment is fully reinstated); see the short derivation after this list. If the environment present during encoding has been memorized before anyway, it can also be mentally reinstated during a recall attempt, removing the condition of being in the same context physically. People have made use of this dynamic for thousands of years through the so-called method of loci, in which pieces of information are associated with physical locations in an imagined memory palace. The ability of mental reinstatement seems to be trainable and spans modalities from visuals and sounds to physical movements and emotions.
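
The compression intuition above can be phrased in information-theoretic terms (a sketch of the analogy only, not a claim about neural coding). With stored information $X$ and context $C$, the storage required given a known context is the conditional entropy:

    \[ H(X \mid C) \;=\; H(X) - I(X;C) \]

In the toy example, $H(X) = 10$ bits; if the 1-bit environment carries one bit of mutual information with the stored material, $I(X;C) = 1$ bit, leaving $H(X \mid C) = 9$ bits — which suffice for retrieval only when $C$ is available again, physically or through mental reinstatement.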

Moving in that space?

Coming back to the question from the beginning: how can recall probability then be truly maximised? The different dimensions outlined above already indicate an optimal subspace of the space of mental states. The more interesting question is how we can move in this space, that is, how we can consciously influence all of the different aspects affecting recall probability. This task is very hard, firstly because many of these aspects are tied to subconscious processes, making them hard or impossible to control, and secondly because it is difficult to measure a mental state along these dimensions; if such measurement were possible, self-training the ability to move in this space would be a lot easier. It also presents a difficult trade-off: is time better spent moving in this space (and learning how to move in this space) or doing the actual learning task? While this remains an open question, I will outline some thoughts on possible avenues of approaching mental-state optimality.

  • attention: probably best trained and improved through attention exercises and focused meditation (i.e. meditation practices that involve focusing on a single point or stimulus for prolonged periods of time)
  • general arousal: specific types of meditation allow one to increase and control levels of arousal (the legendary Tummo meditation would be an example). Potentially more practical interventions might be the use of different breathing techniques and (light) physical activity.
  • emotion: music?
  • metacognition: probably best trained through deliberately performing metacognition, which is often central to meditation practices like mindfulness
  • perceived novelty: could be achieved through performing a learning task in novel environments and settings and through deliberately trying to "defamiliarize" oneself with material and concepts learned
  • perceived reward relevance: potentially achieved through the implementation of short-term rewards (gamification, treats). Some research suggests the visualisation of "mastery" to be helpful. Additionally, it might help to frame material in ways that imply personal connection and evoke a perception of wonder, i.e. making oneself aware of how truly fascinating but also mysterious the nature of a specific problem is might increase the recall probability of associated information.
  • context similarity: if possible, true physical similarity between the train-time and test-time environments. Otherwise: the method of loci taken to an extreme, i.e. associating a memory not only with an imagined scene but also with a specific emotion, proprioceptive state, sounds etc., ideally associations that will be present at test time anyway, to limit the need for mental reinstatement.

attention: the concentration of awareness on some phenomenon to the exclusion of other stimuli

arousal: multidimensional state of physiological and psychological activation, characterized by varying degrees of autonomic nervous system engagement, hormonal changes, and alterations in cognitive and affective processes, which prepares an organism for action and modulates its responsiveness to internal and external stimuli

sympathetic nervous system: one of the three (physically distinct) divisions of the autonomic nervous system, stimulates a body's "fight or flight" response (or more general arousal)

parasympathetic nervous system: one of the three (physically distinct) divisions of the autonomic nervous system, stimulates "rest-and-digest"/"feed-and-breed" activities that occur when the body is at rest

  • Aliases
    • PSNS

Yerkes-Dodson law: empirical relationship between arousal and performance for both cognitively non-demanding and cognitively more demanding tasks

flashbulb memory: a vivid, long-lasting memory about a surprising or shocking event that has happened in the past

predictive coding: theory of brain function which postulates that the brain is constantly generating and updating a "mental model" of its environment

  • Aliases
    • predictive processing

L-DOPA: precursor to dopamine (and medication as such)

  • Aliases
    • levodopa
    • l-3,4-dihydroxyphenylalanine

incentive salience: a potentially attributed property of a stimulus when modelled to be motivationally relevant by the brain

method of loci: strategy for memory enhancement which uses visualizations of familiar spatial environments in order to enhance the recall of information

  • Aliases
    • memory palace

mental imagery: the cognitive state of visualisation (in different modalities), heavily used by athletes (and e.g. dancers)

defamiliarization: the technique of presenting or understanding familiar concepts from a novel (potentially strange) perspective

Chun, Marvin M. and Nicholas B. Turk-Browne. “Interactions between attention and memory.” Current Opinion in Neurobiology 17 (2007): 177-184.: https://pubmed.ncbi.nlm.nih.gov/17379501/

Aly, Mariam and Nicholas B. Turk-Browne. “Attention promotes episodic encoding by stabilizing hippocampal representations.” Proceedings of the National Academy of Sciences 113 (2016): E420 - E429.: https://pubmed.ncbi.nlm.nih.gov/26755611/

Mather, Mara and Matthew R. Sutherland. “Arousal-Biased Competition in Perception and Memory.” Perspectives on Psychological Science 6 (2011): 114 - 133.: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3110019/

Kensinger, Elizabeth A.. “Remembering the Details: Effects of Emotion.” Emotion Review 1 (2009): 99 - 113.: https://pubmed.ncbi.nlm.nih.gov/19421427/

Bower, Gordon H.. “Mood and memory.” The American psychologist 36 2 (1981): 129-48 .: https://psycnet.apa.org/record/1981-31724-001

Cahill, Larry et al. “Beta-adrenergic activation and memory for emotional events.” Nature 371 6499 (1994): 702-704.: https://www.nature.com/articles/371702a0

Hacker, Douglas J. et al. “Handbook of Metacognition in Education.” (2009).: https://www.taylorfrancis.com/books/edit/10.4324/9780203876428/handbook-metacognition-education-arthur-graesser-douglas-hacker-john-dunlosky

Shapiro, Shauna L. et al. “Toward the Integration of Meditation into Higher Education: A Review of Research Evidence.” Teachers College Record: The Voice of Scholarship in Education 113 (2011): 493 - 528.: https://journals.sagepub.com/doi/10.1177/016146811111300306

Hasselmo, Michael E.. “The role of acetylcholine in learning and memory.” Current Opinion in Neurobiology 16 (2006): 710-715.: https://www.sciencedirect.com/science/article/abs/pii/S095943880600122X?dgcid=api_sd_search-api-endpoint

Knecht, Stefan et al. “Levodopa: Faster and better word learning in normal humans.” Annals of Neurology 56 (2004): n. pag.: https://pubmed.ncbi.nlm.nih.gov/15236398/

Fenker, Daniela B. et al. “Novel Scenes Improve Recollection and Recall of Words.” Journal of Cognitive Neuroscience 20 (2008): 1250-1265.: https://direct.mit.edu/jocn/article-abstract/20/7/1250/4528/Novel-Scenes-Improve-Recollection-and-Recall-of?redirectedFrom=fulltext

Rangel, Antonio et al. “A framework for studying the neurobiology of value-based decision making.” Nature Reviews Neuroscience 9 (2008): 545-556.: https://www.nature.com/articles/nrn2357

Adcock, R. Alison et al. “Reward-Motivated Learning: Mesolimbic Activation Precedes Memory Formation.” Neuron 50 (2006): 507-517.: https://www.sciencedirect.com/science/article/pii/S0896627306002625?dgcid=api_sd_search-api-endpoint

Krokos, Eric et al. “Virtual memory palaces: immersion aids recall.” Virtual Reality 23 (2018): 1 - 15.: https://link.springer.com/article/10.1007/s10055-018-0346-3

Urcelay, Gonzalo P. and Ralph R. Miller. “The functions of contexts in associative learning.” Behavioural Processes 104 (2014): 2-12.: https://www.sciencedirect.com/science/article/abs/pii/S0376635714000448?dgcid=api_sd_search-api-endpoint

Cumming, Jennifer, and Sarah E. Williams, "The Role of Imagery in Performance", in Shane M. Murphy (ed.), The Oxford Handbook of Sport and Performance Psychology, Oxford Library of Psychology (2012; online edn, Oxford Academic, 21 Nov. 2012): https://academic.oup.com/edited-volume/28200/chapter-abstract/213164139?redirectedFrom=fulltext

Overton, Donald A.. “STATE-DEPENDENT OR "DISSOCIATED" LEARNING PRODUCED WITH PENTOBARBITAL.” Journal of comparative and physiological psychology 57 (1964): 3-12 .: https://pubmed.ncbi.nlm.nih.gov/14125086/

Dando, Coral J. et al. “The cognitive interview: the efficacy of a modified mental reinstatement of context procedure for frontline police investigators.” Applied Cognitive Psychology 23 (2009): 138-147.: https://onlinelibrary.wiley.com/doi/10.1002/acp.1451

Eich, Eric. “Mood as a mediator of place dependent memory.” Journal of experimental psychology. General 124 3 (1995): 293-308 .: https://pubmed.ncbi.nlm.nih.gov/7673863/

Kozhevnikov, Maria et al. “Beyond mindfulness: Arousal-driven modulation of attentional control during arousal-based practices.” Current Research in Neurobiology 3 (2022): n. pag.: https://pubmed.ncbi.nlm.nih.gov/36246552/

What is statistical learning theory?

12/02/2024

Statistical learning theory asks fundamental questions: What can be learned? What cannot be learned, no matter how hard we try? We seek answers to these questions by formulating learning as the computation that a learning algorithm performs. A learning algorithm is a function that maps data to a hypothesis in some hypothesis space, which is itself a function space.

At this point it makes sense to take a moment to think: can every output of a learning system be understood as a hypothesis, i.e. a function?

For many processes of learning that might be true, especially the engineered ones. Take for example a neural network. The design, i.e. the architecture, of the neural network defines the hypothesis space, i.e. the set of functions that can be learned. The learning algorithm then finds the best function, and this function is easily understood mathematically due to the clear dichotomy between inputs and outputs. But what about something like a brain? There seems to be a gap between the ease with which ML algorithms can be understood as producing hypotheses and the difficulty of understanding the brain as one mathematical function. Or rather: it is easy to view the brain as a function, but harder to understand the learning algorithm at hand. There is no empirical risk minimization trying to "fit" the function mapping perception to actions to some "true" version of how a brain should behave. It is rather a beautiful interplay between the search process of evolution, finding brain configurations that reproduce and persist, i.e. slow-scale adaptation, and learning processes, i.e. fast-scale adaptation.
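
To make the "learner maps data to a function" view concrete, here is a minimal sketch of my own (not part of the original argument): the hypothesis space is the set of affine functions of one variable, and the learner performs empirical risk minimization under squared loss.

    import numpy as np

    def learner(X, y):
        """A learning algorithm: maps a training set (X, y) to a hypothesis.

        The hypothesis space is the set of affine functions
        h_w(x) = w[0] + w[1] * x; the learner minimizes the empirical
        squared loss (ordinary least squares).
        """
        A = np.column_stack([np.ones_like(X), X])   # design matrix with bias column
        w, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimize ||A w - y||^2

        def hypothesis(x):
            # The learner's output is itself a function: input -> prediction.
            return w[0] + w[1] * x

        return hypothesis

    # Usage: the learner consumes data and returns a function we can evaluate anywhere.
    X = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([0.1, 0.9, 2.1, 2.9])
    h = learner(X, y)
    print(h(4.0))  # prediction of the learned hypothesis at a new input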

learning algorithm: in statistical learning theory the function that maps a training set to a hypothesis

  • Aliases
    • learner

hypothesis space: the space of functions that a learning algorithm can output

  • Aliases
    • hypothesis class
    • hypothesis set

Natural gas consumption in Germany

18/08/2023


Germany's dependence on natural gas is undeniable, with consumption of this fossil fuel rising steadily over the past decades. Even amidst the shift towards renewable energy sources, natural gas continues to play a crucial role, primarily as a transition fuel, enabling organisations and policymakers to meet short-term CO2 reduction targets. It is also critical for heat-intensive industrial processes and residential heating, areas that have traditionally been difficult to transition to clean energy sources. Given the unique position of natural gas within our economic system, understanding and forecasting its demand is of paramount importance, particularly in times of supply disruption, such as during the Russian invasion of Ukraine. Proper planning and procurement of natural gas supplies is essential to prevent severe shortages that could drastically affect a country's gross domestic product.

Attempts to forecast natural gas consumption have a long history, with early models focusing on statistical approaches that took into account aspects such as household income, GDP and ambient temperature. The advent of advanced computing resources led to the application of more sophisticated models, including Artificial Neural Networks (ANNs), Support Vector Regression (SVR) and deep learning models such as Long Short-Term Memory (LSTM) models. While these forecasting techniques have led to unprecedented advances, much of the previous work on natural gas consumption modelling lacks reproducibility due to limited access to the underlying data or exact model parameters. This is often due to licensing restrictions imposed on authors.

This project aims to bridge this gap by providing an open source and publicly available platform for natural gas consumption forecasting in Germany. The focus is on ensuring that critical data on the dependent variable and potential independent variables are publicly available, that forecasts are implemented in a time-efficient manner using state-of-the-art models and frameworks, and that these forecasts are continuously published and productized. The aim is to make energy-related scientific issues, which have a significant impact on everyday life, more accessible to the general public.

natural gas: (naturally occurring) mixture of gases consisting primarily of methane, heavily used for heating, electricity generation and industrial processes

Russian invasion of Ukraine 2022: major escalation of the Russo-Ukrainian war on the 24th of February 2022 with hundreds of thousands of military casualties

artificial neural network: a model in machine learning inspired by the structure and function of biological neural networks

  • Aliases
    • ANN
    • neural network

support vector machine: supervised learning hypothesis space defined through a set of hyperplanes, with a solution found when the examples adjacent to the decision boundary have maximum distance to it

  • Aliases
    • SVM

long short-term memory: successor of the (standard) RNN architecture that addresses the vanishing/exploding gradients problem

  • Aliases
    • LSTM

Balitskiy, Sergey et al. “Energy security and economic growth in the European Union.” Journal of Security and Sustainability Issues 4 (2014): 123-130.: https://www.researchgate.net/publication/284456238_Energy_security_and_economic_growth_in_the_European_union

Di Bella, Gabriel et al. “Natural Gas in Europe: The Potential Impact of Disruptions to Supply.” IMF Working Papers (2022): n. pag.: https://www.elibrary.imf.org/view/journals/001/2022/145/001.2022.issue-145-en.xml

Finding the right data for a model can be challenging. Candidate explanatory variables have been established in the literature: recent works offer an overview of the factors influencing natural gas consumption in today's energy market, and a study of literature surveys and discussions was conducted to identify consumption indicators.

This work faces a special challenge: all datasets used must be open source and programmatically accessible for the model.

The potential regressors considered for the model are:

  • Temperature: The relationship between natural gas consumption and temperature is well-established in literature.
  • Natural gas prices: The influence of pricing on consumption is well-studied, described through "price elasticity."
  • Crude oil prices: Prices of other energy commodities can impact natural gas consumption due to substitution effects.
  • Coal prices: Wholesale prices of coal can also affect natural gas consumption.
  • Electricity prices: Electricity is an energy commodity that can substitute natural gas to some extent.
  • Auction prices of European Emission Allowances (EUAs): The European Emissions Trading System caps total CO2 emissions and thereby affects natural gas consumption, but the relationship can be both positive and negative.
  • Natural gas storage levels: Storage levels of natural gas can serve as an indicator of demand and consumption.

To obtain the necessary data, various sources are used. Consumption data for natural gas in Germany is provided by Trading Hub Europe GmbH. Temperature data is sourced from the atmospheric reanalysis model ERA5 and from real-time weather information provided by Open-Meteo. Natural gas prices are obtained from Trading Hub Europe GmbH. Brent crude oil prices are available from the U.S. Energy Information Administration. Electricity price data is provided by Ember. Auction prices of European Emission Allowances (EUAs) are obtained from the European Energy Exchange AG (EEX). Natural gas storage level data is sourced from GIE (Gas Infrastructure Europe).
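
As an illustration of what "programmatically accessible" means in practice, here is a minimal sketch of pulling hourly temperatures from the Open-Meteo archive API with requests and pandas (endpoint and parameters as documented by Open-Meteo; the coordinates are simply example values for Germany, not the project's actual configuration):

    import pandas as pd
    import requests

    # Open-Meteo historical weather API (documented at open-meteo.com, no API key needed).
    URL = "https://archive-api.open-meteo.com/v1/archive"
    params = {
        "latitude": 51.16,          # rough geographic centre of Germany (example values)
        "longitude": 10.45,
        "start_date": "2022-01-01",
        "end_date": "2022-12-31",
        "hourly": "temperature_2m",
        "timezone": "Europe/Berlin",
    }

    response = requests.get(URL, params=params, timeout=30)
    response.raise_for_status()
    hourly = response.json()["hourly"]

    # Aggregate to a daily mean temperature series, a candidate regressor for consumption.
    df = pd.DataFrame({"time": pd.to_datetime(hourly["time"]),
                       "temp": hourly["temperature_2m"]})
    daily_temp = df.set_index("time")["temp"].resample("D").mean()
    print(daily_temp.head())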

The data analysis reveals seasonal patterns in natural gas consumption and the importance of temperature as a factor. The historic data of consumption, temperature, and other variables are examined in the figures provided.

energy market: a type of commodity market on which electricity, heat, and fuel products are traded (Wikipedia)

spot market: type of market on which financial instruments and commodities are traded for immediate delivery

European Energy Exchange: central European energy market located in Leipzig, Germany

  • Aliases
    • EEX

European Power Exchange: another European energy market located in Paris, France

  • Aliases
    • EPEX SPOT

emission allowance: a certificate granting the right to emit a specific equivalent of CO2 emissions, part of an emissions trading scheme

Berrisford, H. G. "The relation between gas demand and temperature: a study in statistical demand forecasting." Journal of the Operational Research Society 16.2 (1965): 229-246.: https://www.tandfonline.com/doi/abs/10.1057/jors.1965.32

Erias, A. F., and Emma M. Iglesias. "The daily price and income elasticity of natural gas demand in Europe." Energy Reports 8 (2022): 14595-14605.: https://www.sciencedirect.com/science/article/pii/S2352484722023411

Hartley, Peter R., Kenneth B. Medlock III, and Jennifer E. Rosthal. "The relationship of natural gas to oil prices." The Energy Journal 29.3 (2008): 47-66.: https://journals.sagepub.com/doi/abs/10.5547/ISSN0195-6574-EJ-Vol29-No3-3?casa_token=AHfqhR3wLskAAAAA:6Qcn9Y4mPLn7--TYTQH56KDbGge3hz8ZiYDXRqvS8qR0FGLnAxTbW9BYQqrFturdHI6PfX6XlNdsBW4

Nick, Sebastian, and Stefan Thoenes. "What drives natural gas prices?—A structural VAR approach." Energy Economics 45 (2014): 517-527.: https://www.sciencedirect.com/science/article/pii/S0140988314001911?casa_token=WxuCCfskNNkAAAAA:2E455k0zcqcnvQ5da1AbGtWUnR-2OQsMYz9AlPt-9Ayz9DTTPTdz-qZKa4ByU0jqN6X7pPDtoQin

Nyga-Łukaszewska, Honorata, and Kentaka Aruga. "Energy prices and COVID-immunity: The case of crude oil and natural gas prices in the US and Japan." Energies 13.23 (2020): 6300.: https://www.mdpi.com/1996-1073/13/23/6300

Gürsan, Cem, and Vincent de Gooyert. "The systemic impact of a transition fuel: Does natural gas help or hinder the energy transition?." Renewable and Sustainable Energy Reviews 138 (2021): 110552.: https://www.sciencedirect.com/science/article/pii/S1364032120308364


After understanding the dynamics of natural gas consumption, a model was designed, trained, and evaluated to forecast Germany's natural gas consumption. Previous models, such as a simple seasonal model and a piecewise-linear temperature model, showed promising results, but more sophisticated models are now established in the literature. Several models are discussed, including multiple linear regression, the Autoregressive Exogenous Model (ARX), the Seasonal Autoregressive Integrated Moving Average Model with exogenous regressors (SARIMAX), XGBoost, and NeuralProphet.

The performance of these models is compared using metrics like Mean Absolute Percentage Error (MAPE). The temperature-only versions of ARX and NeuralProphet are also evaluated to create a fully automated forecasting system. The NeuralProphet model performs well in both short-term and long-term forecasts. A hybrid approach is proposed for temperature forecasting, combining short-term forecasts from weather services with a mid-term model trained on historic data. The combined temperature forecasts, along with the temperature-only NeuralProphet model, enable the forecasting of natural gas consumption for the next 365 days.

This approach strikes a balance between predictive performance and deployability, making it suitable for practical applications.
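
For a flavour of how such a model can be set up with the cited tools, here is a minimal sketch using statsmodels' SARIMAX with temperature as an exogenous regressor. The synthetic data, model orders, and split are illustrative assumptions; the project's exact configuration is not reproduced here.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    # Synthetic stand-ins for daily consumption and temperature (illustration only).
    rng = np.random.default_rng(0)
    days = pd.date_range("2021-01-01", periods=730, freq="D")
    doy = days.dayofyear.to_numpy()
    temp = 10 - 10 * np.cos(2 * np.pi * doy / 365) + rng.normal(0, 2, 730)
    consumption = 3000 - 80 * temp + rng.normal(0, 100, 730)

    y_train, y_test = consumption[:700], consumption[700:]
    x_train, x_test = temp[:700], temp[700:]

    # Autoregressive dynamics plus weekly seasonality and temperature as exogenous regressor.
    model = SARIMAX(y_train, exog=x_train, order=(1, 0, 1), seasonal_order=(1, 0, 1, 7))
    fit = model.fit(disp=False)

    # Forecast the 30-day holdout (using observed temperatures as a stand-in
    # for a temperature forecast) and score it with MAPE.
    forecast = fit.forecast(steps=30, exog=x_test)
    mape = np.mean(np.abs((y_test - forecast) / y_test)) * 100
    print(f"MAPE over the 30-day holdout: {mape:.1f}%")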

linear regression: statistical approach to relate (multiple) predictors and a target linearly; often estimated using least squares estimation

  • Aliases
    • multiple linear regression

autoregressive model: in classical statistics a linear model in which previous values of the series are (additionally) used as predictors

  • Aliases
    • AR

autoregressive integrated moving average: in classical statistics a model that combines autoregressive, differencing, and moving average components

  • Aliases
    • ARIMA

XGBoost: software library for the efficient implementation of gradient boosting algorithms

NeuralProphet: open-source forecasting library built on top of PyTorch, combining ANNs with classical time series models

mean absolute percentage error: commonly used objective function for time series model optimization, defined as the mean of the absolute percentage errors

  • Aliases
    • MAPE
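
In symbols (a standard formalisation of the definition above), with actual values $y_t$, forecasts $\hat{y}_t$ and $n$ time steps:

    \[ \mathrm{MAPE} \;=\; \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \]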

Tamba, Jean Gaston, et al. "Forecasting natural gas: A literature survey." International Journal of Energy Economics and Policy 8.3 (2018): 216-249.: https://www.zbw.eu/econis-archiv/bitstream/11159/2119/1/1028134630.pdf

Liu, Jinyuan, et al. "Natural gas consumption forecasting: A discussion on forecasting history and future challenges." Journal of Natural Gas Science and Engineering 90 (2021): 103930.: https://www.sciencedirect.com/science/article/pii/S1875510021001372?casa_token=MGNFCQjLDloAAAAA:KxOLmFkMCm4pe7EqShfrmCsunrGzV4DvRtGG1VWh1qSnlq4Ftro_PU4NtTbUa77a-mQRMLApK56e

Triebe, Oskar et al. “NeuralProphet: Explainable Forecasting at Scale.” (2021).: https://neuralprophet.com/

Seabold, Skipper and Josef Perktold. “statsmodels: Econometric and statistical modeling with python.” 9th Python in Science Conference (2010).: https://www.statsmodels.org/

Chen, Tianqi and Carlos Guestrin. “XGBoost: A Scalable Tree Boosting System.” Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016).: https://doi.org/10.1145/2939672.2939785

The pandas development team. “pandas-dev/pandas: Pandas.” (2020).: https://pandas.pydata.org/


The primary aim of this project was to develop and deploy a public-facing natural gas forecasting model, which re-trains and updates predictions daily. The resultant data needs to be stored and served to the public in a straightforward manner. In terms of architecture, the design leans towards simplicity and efficiency. Both temperature and natural gas consumption prediction models are encapsulated in a Docker container as Python code. The results are saved using a storage service that channels forecasts via a backend to be accessed by a frontend. This entire backend operation is hosted on Google Cloud Platform.

A scheduling service triggers the daily model re-training and run at a fixed time. The new forecasts are stored in Google Cloud Storage, from where they can be accessed by users through the frontend. The frontend comprises a web application built with TypeScript and the Next.js framework, with Tailwind CSS aiding in styling. The app's primary function is to visualize data retrieved from the storage endpoint on Google Cloud Storage, avoiding the need to implement any additional logic.

While the model re-training and runs take place on the Docker container hosted on Google Cloud Platform, the frontend is also hosted on a fully managed platform, Vercel, to reduce costs and administrative tasks. Vercel, which offers free service for non-commercial use, links to a Git repository of a web application built with one of many approved frameworks, managing the deployment and serving of the web application. Though utilizing fully-managed infrastructure solutions may not be a necessity or suit everyone's unique requirements, in this instance, it helps to keep the solution simple and easily reproducible.
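
A minimal sketch of the daily job's final step, uploading fresh forecasts to Google Cloud Storage with the google-cloud-storage client. The bucket and object names here are hypothetical placeholders; the post does not specify them.

    import json
    from datetime import date

    from google.cloud import storage  # pip install google-cloud-storage

    def publish_forecast(forecast: dict, bucket_name: str = "gas-forecasts-example") -> None:
        """Upload the day's forecast as JSON so the frontend can fetch it directly."""
        client = storage.Client()  # authenticates via the job's service account
        bucket = client.bucket(bucket_name)
        blob = bucket.blob(f"forecasts/{date.today().isoformat()}.json")
        blob.upload_from_string(json.dumps(forecast), content_type="application/json")

    # Usage at the end of the scheduled daily run (values are placeholders):
    publish_forecast({"2024-01-01": 2512.3, "2024-01-02": 2488.7})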

docker: popular containerization software first released in 2013

google cloud platform: Google's cloud offering (comparable to AWS and Microsoft Azure)

  • Aliases
    • GCP

Next.js: open-source web development framework for server-side rendering

Tailwind CSS: open-source CSS framework

Vercel: technology company offering serverless web hosting solutions

Copernicus Climate Change Service (2020): Climate and energy indicators for Europe from 1979 to present derived from reanalysis. Copernicus Climate Change Service (C3S) Climate Data Store (CDS): https://cds.climate.copernicus.eu/cdsapp#!/dataset/sis-energy-derived-reanalysis

Open-Meteo: Historical Weather API: https://open-meteo.com/

Trading Hub Europe GmbH (2023): Imbalance prices. Format: CSV (Accessed on 03-Mar-2023): https://www.tradinghub.eu/en-gb/Publications/Prices/Imbalance-Prices

U.S. Energy Information Administration, Crude Oil Prices: Brent - Europe [DCOILBRENTEU], retrieved from FRED, Federal Reserve Bank of St. Louis; March 1, 2023.: https://fred.stlouisfed.org/series/DCOILBRENTEU

Ember (2023): European wholesale electricity price data. Wholesale day-ahead electricity price data for European countries, sources from ENTSO-e and cleaned. Format: CSV (Accessed 03-Mar-2023): https://ember-climate.org/data-catalogue/european-wholesale-electricity-price-data/

European Energy Exchange AG (2023): EUA Emission Spot Primary Market Auction Report - History. Format: XLS/XLSX (Accessed 04-Mar-2023): https://www.eex.com/en/market-data/environmental-markets/eua-primary-auction-spot-download

GIE (Gas Infrastructure Europe): GIE AISBL 2022. Aggregated Gas Storage Inventory (AGSI): Germany. Format: CSV (Accessed 03-Mar-2023): https://agsi.gie.eu/data-overview/DE

Trading Hub Europe GmbH (2023): Publication of the aggregate consumption data: Aggregated consumption data. Format: CSV (Accessed on 04-Mar-2023): https://www.tradinghub.eu/en-gb/Publications/Transparency/Aggregated-consumption-data