Scratchwork.xyz, Part 5: Gathering external weather data using an API

Sign up at openweathermap.org

I want to compare the data gathered inside the house with the external weather… but I don’t own a weather station (yet!). We could laboriously scrape it from a weather website, but that’s harder, considered “rude” at minimum, and “not in agreement with the license” in more exacting terms, compared to simply signing up for and using a free-tier API from one of the various weather services. In my case, I signed up with openweathermap.org.

Their website documents all the details: how often you can call their API, how many data points they’ll provide, and what data is available. The most important part for our purposes is found here, where they explain how to call their API for the current weather at a given location, and what format the response will take. We’ll use Python to form a request to their API, then parse the response before inserting it into a separate SQL table.
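
For reference, the current-weather call is just an HTTP GET against a URL of this shape (lat, lon, and the API key are yours to fill in; the shape comes straight from their docs):

https://api.openweathermap.org/data/2.5/weather?lat={lat}&lon={lon}&appid={API key}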

Make a separate SQL table

Quickly, using what you learned previously, make a separate SQL table. As before, I prefer to use pgAdmin’s GUI to make a new table and set the initial schema. In my case, I decided to make columns for all of the following (a SQL sketch of the schema follows the list):

  • temperature
  • maximum temperature
  • minimum temperature
  • humidity
  • pressure
  • “feels like” temperature
  • percentage of cloud cover
  • latitude
  • longitude (really I don’t need this every time, but it’s possible the available weather stations will change over time and this could shift)
  • sunrise
  • sunset
  • visibility distance
  • weather description
  • weather_gen (straight up cannot remember what this is)
  • wind direction
  • wind speed
  • time
  • cumulative rainfall within the last hour
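
If you’d rather skip the GUI, the equivalent SQL looks roughly like this. Treat it as a sketch: the table and column names here are illustrative choices, not necessarily what ended up in my actual database.

CREATE TABLE exterior_weather (
    reading_time        timestamptz,   -- when the reading was taken
    temperature         real,
    temp_max            real,
    temp_min            real,
    humidity            real,
    pressure            real,
    feels_like          real,
    cloud_cover         real,          -- percent
    latitude            real,
    longitude           real,
    sunrise             timestamptz,
    sunset              timestamptz,
    visibility          real,          -- meters
    weather_description text,
    weather_gen         text,
    wind_direction      real,          -- degrees
    wind_speed          real,
    rain_1h             real           -- rainfall in the last hour, mm
);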

Write the Python script

Actually, I don’t start by writing a Python script; I make a virtual environment, just like we did previously. Since this is kinda-sorta a separate process, you could see it being run on a different system, so let’s keep good habits and define a virtual environment so we know exactly what packages and versions we depend on, and things don’t break. This script is simple enough that it’s not likely to matter, but I’ve heard this is a good habit. Name the environment and the script something logical.
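
Setting that up looks something like this. I’m assuming the requests library for HTTP and psycopg2 for talking to Postgres; swap in whatever packages you actually use.

python3 -m venv weatherGatherenv
source weatherGatherenv/bin/activate
pip install requests psycopg2-binary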

You can view the final product on GitHub under the “weatherGather” folder. It’s quite simple: like before, we import requests to make the HTTP call, and import the tools for connecting to our SQL database. Then we simply make one call to openweathermap’s API, parse the response, and insert it into our database. Easy! Like before, you should not hard-code an API key, nor the SQL database credentials, but… I did, and God hasn’t struck me down yet.
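
To make that concrete, here’s a stripped-down sketch of the approach, not the exact script from the repo. The endpoint and JSON field names come straight from openweathermap’s documentation; the table name, column names, and connection details are placeholders matching the schema sketch above.

import requests
import psycopg2
from datetime import datetime, timezone

API_KEY = "not-a-real-key"  # do as I say, not as I do: load this from a config file
LAT, LON = 40.0, -75.0      # your coordinates

# One call to the current-weather endpoint
resp = requests.get(
    "https://api.openweathermap.org/data/2.5/weather",
    params={"lat": LAT, "lon": LON, "appid": API_KEY, "units": "metric"},
    timeout=10,
)
resp.raise_for_status()
data = resp.json()

def ts(epoch):
    # openweathermap reports times as Unix timestamps (UTC)
    return datetime.fromtimestamp(epoch, tz=timezone.utc)

row = (
    ts(data["dt"]),                       # time of the reading
    data["main"]["temp"],
    data["main"]["temp_max"],
    data["main"]["temp_min"],
    data["main"]["humidity"],
    data["main"]["pressure"],
    data["main"]["feels_like"],
    data["clouds"]["all"],                # cloud cover, percent
    data["coord"]["lat"],
    data["coord"]["lon"],
    ts(data["sys"]["sunrise"]),
    ts(data["sys"]["sunset"]),
    data.get("visibility"),               # meters; not always present
    data["weather"][0]["description"],
    data["weather"][0]["main"],           # general category; my best guess for "weather_gen"
    data["wind"].get("deg"),
    data["wind"]["speed"],
    data.get("rain", {}).get("1h", 0.0),  # "rain" only appears if it rained
)

conn = psycopg2.connect(host="localhost", dbname="scratchwork",
                        user="me", password="hunter2")  # placeholders!
with conn, conn.cursor() as cur:
    cur.execute(
        "INSERT INTO exterior_weather (reading_time, temperature, temp_max,"
        " temp_min, humidity, pressure, feels_like, cloud_cover, latitude,"
        " longitude, sunrise, sunset, visibility, weather_description,"
        " weather_gen, wind_direction, wind_speed, rain_1h)"
        " VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)",
        row,
    )
conn.close()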

Set the cron job to execute

We do have one additional step here, which is to set this to run on a defined schedule. The way to do that on a continuously running server is with cron, a built-in scheduling utility that gets heavy use on Linux. Read more about cron here. First, while ssh’d into the server, edit the crontab:

crontab -e

This will open your user’s “crontab,” the list (or table) of cron jobs. Then we need to add a new line that defines two key points:

  • how frequently the operation will run
  • the command, and any arguments required

In our case, we want to run it every 15 minutes. Any more frequently and we might not actually get new data from openweathermap; we’d just put unnecessary load on both servers. A good way to check that you’ve got the right entry for a given schedule is to use this awesome website. We then want the command to be the absolute path to our virtual environment’s Python installation, and the argument to be the absolute path to the Python script. So we end up adding this line:

*/15 * * * * /home/$USER_NAME/weatherGather/weatherGatherenv/bin/python /home/$USER_NAME/weatherGather/weatherGather.py
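
If the five time fields ever look like line noise, here’s the breakdown (you can keep it as a comment in the crontab itself):

# minute  hour  day-of-month  month  day-of-week  command
#  */15    *         *          *         *       <what to run>
# "*/15" means every 15th minute; "*" means every.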

Now, every 15 minutes, we should be getting exterior weather data and storing it in an SQL table. Fantastic! Next, though, we need a way to visualize our collected data, both local and exterior…