Machine Learning Forecast

Mon, Apr 11, 2022

Fun with Forecasts

One of the goals when setting up scratchwork.xyz (blog series starts here) was to turn it into a framework that could be used with some fun machine learning models. After about two years of letting it collect data, I decided it was time to put some of that latent knowledge to work for me. How? By predicting the upcoming weather.

I chose a neural network, written in Python with the Keras framework. The core of the model is a pair of Long Short-Term Memory (LSTM) layers with dropout. It’s multivariate, using several collected properties (temperature, humidity, wind speed, etc.) from the past 7 days to predict the temperature for the next 2 days. Although it’s a relatively simple architecture, and it’s only trained for about 40 epochs (which is not a lot), I wasn’t going for ultra-precision: I was going for something that worked well enough for the time being. I won’t bore you with the various optimization strategies I tried, except to note that the batch size (the number of samples processed between weight updates) had a much larger effect on the final weights than I expected.
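To make the "past 7 days in, next 2 days out" setup concrete, here is a minimal sketch of how a multivariate time series can be sliced into training windows. It is not the site's actual code: the function name `make_windows`, the hourly sampling rate, the three features, and the random stand-in data are all assumptions for illustration.

```python
import numpy as np


def make_windows(data, target_col, n_past, n_future):
    """Slice a (timesteps, features) array into supervised-learning pairs.

    X[i] holds every feature for n_past consecutive steps;
    y[i] holds only the target column for the n_future steps that follow.
    """
    data = np.asarray(data, dtype=float)
    n_samples = data.shape[0] - n_past - n_future + 1
    if n_samples <= 0:
        raise ValueError("not enough rows for even one window")
    X = np.stack([data[i:i + n_past] for i in range(n_samples)])
    y = np.stack([
        data[i + n_past:i + n_past + n_future, target_col]
        for i in range(n_samples)
    ])
    return X, y


# Assumption: one reading per hour, with temperature in column 0.
HOURS_PER_DAY = 24
rng = np.random.default_rng(0)
readings = rng.normal(size=(30 * HOURS_PER_DAY, 3))  # 30 days of stand-in data

X, y = make_windows(
    readings,
    target_col=0,
    n_past=7 * HOURS_PER_DAY,    # 7 days of history as input
    n_future=2 * HOURS_PER_DAY,  # 2 days of temperature as output
)
print(X.shape, y.shape)  # (505, 168, 3) (505, 48)
```

The `X` array has the (samples, timesteps, features) layout that Keras LSTM layers expect. It also shows why batch size matters: with these 505 windows, a batch size of 32 gives 16 weight updates per epoch, while a batch size of 128 gives only 4.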

Can I see the results?

Of course! If you go to scratchwork.xyz right now, you’ll see the model’s forecast temperature for the next two days, give-or-take (it only computes the forecast twice a day). You can compare it to the past weather that’s also plotted, to see if you think it looks reasonable.

Why not SARIMA?

SARIMA (seasonal ARIMA) and related forecasting tools have two major issues that I wanted to avoid. First, they’re typically univariate, so any information I have about the humidity, recent rainfall, etc., wouldn’t be taken into account, and I don’t want to waste that data. Second, they can’t handle the random “it rained for 2 days straight and was 50 degrees both day and night” cases, because the periodicity is baked right in. An LSTM learns periodicity too, but it still has the capability (in theory) to capture the dynamics of those strange little weather interludes. For this application, neural networks offer big advantages over something like SARIMA.

Pretty good now, but what about when climate change hits?

Well, first of all, I probably won’t have it running by the time that becomes an issue. However, I did implement an automatic system that keeps training the model once enough new data has been collected. Once a month, the weights will be adjusted based on newly available inputs. This should hopefully continue to refine the model and correct any climate-change-based drift. In case it goes haywire, the previous weights are also being saved.
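The "save the previous weights in case it goes haywire" idea can be sketched with nothing but the standard library. This is an illustration under assumptions, not the code running on the site: the file names, the `min_new_rows` threshold, and the helper names (`backup_weights`, `restore_latest`, `update_model`) are all hypothetical, and the fine-tuning step is a stand-in function rather than a real Keras call.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def backup_weights(weights_path, backup_dir, now=None):
    """Copy the current weights file to a timestamped backup; return its path."""
    weights_path = Path(weights_path)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%S")
    target = backup_dir / f"{weights_path.stem}-{stamp}{weights_path.suffix}"
    shutil.copy2(weights_path, target)
    return target


def restore_latest(weights_path, backup_dir):
    """Roll back: overwrite the live weights with the most recent backup."""
    # Timestamped names sort chronologically, so the last one is the newest.
    backups = sorted(Path(backup_dir).glob("*"))
    if not backups:
        raise FileNotFoundError("no backups to restore")
    shutil.copy2(backups[-1], weights_path)
    return backups[-1]


def update_model(weights_path, backup_dir, n_new_rows, min_new_rows, fine_tune):
    """Fine-tune only when enough new data has arrived, backing up first."""
    if n_new_rows < min_new_rows:
        return None  # not enough new data yet; leave the weights alone
    backup = backup_weights(weights_path, backup_dir)
    fine_tune(weights_path)  # in practice: load weights, fit on new data, save
    return backup


# Demo with a stand-in "training" step that just rewrites the file.
def fake_fine_tune(path):
    Path(path).write_text("weights v2")


with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "weights.h5"
    backups = Path(tmp) / "backups"
    live.write_text("weights v1")

    update_model(live, backups, n_new_rows=800, min_new_rows=500,
                 fine_tune=fake_fine_tune)
    after_update = live.read_text()

    restore_latest(live, backups)  # the update went haywire, so roll back
    after_rollback = live.read_text()

print(after_update, "->", after_rollback)  # weights v2 -> weights v1
```

Backing up before each update, rather than after, means a bad month of data can never overwrite the last known-good weights.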

Hmm, if I plugged in the history of stock prices…

Oh no. Please don’t do that if you’re someone who’s reading this post. You will lose your money.