Add Missing Data to WeeWX
Background
Having gotten a caravan, I wanted to make sure that it does not get cold and damp, especially through the winter months, so I sought out a way to keep track of the internal temperature and humidity of the caravan.
After much research I settled on an Ecowitt GW1000 WiFi Weather Station Gateway and an Ecowitt WH32 Outdoor Temperature and Humidity Sensor.
Initially I set them up to capture the temperature and record it on the Ecowitt website, but I always planned to merge this data into the WeeWX database. That integration is now finished, and a description of how to do it is available here.
So now that the integration is finished I need to populate WeeWX with the missing data. The first thing I tried was wee_import, but this only adds new records; it does not update existing ones. This guide therefore shows how I achieved the updates needed to add the missing data.
Warning
This process requires you to make changes to the WeeWX database outside of the control of the WeeWX program, and if you are not experienced and/or confident working with relational databases then I suggest you obtain additional help.
For background: I have worked with relational databases for over 30 years and currently work in a global role, advising organisations on the best way to modernise their databases through the Database Freedom Programme.
Add Missing Data to WeeWX
As I said, I have been sending my temperature readings to the Ecowitt website since I installed my temperature sensor, and luckily for me the website provides an easy way to download your data on a daily, weekly, monthly or yearly basis, each with a different level of granularity:
- Daily: returns the data with one record every five minutes
- Weekly: returns the data with one record every thirty minutes
- Monthly: returns the data with one record every four hours
- Yearly: returns the data with one record every day
As I was downloading about four months of data I decided to download the weekly reports. That still meant I had to consolidate around sixteen Microsoft Excel files, but it gave me sufficient granularity for the purposes of storing historical data.
Once I had consolidated all of that into a single Excel spreadsheet, I extracted the data in CSV format and ended up with a file containing data in this format:
date_and_time,extraTemp2,extraHumid2,extraTemp3,extraHumid1
2020-11-03 20:00,74.48,60,72.68,51
2020-11-03 20:30,74.3,60,72.68,51
2020-11-03 21:00,74.12,61,72.32,52
2020-11-03 21:30,73.94,62,72.68,51
2020-11-03 22:00,73.94,62,73.22,49
2020-11-03 22:30,73.76,63,72.5,50
2020-11-03 23:00,73.76,64,72.86,49
2020-11-03 23:30,73.76,64,72.32,51
2020-11-04 00:00,73.76,64,71.96,51
2020-11-04 00:30,73.22,61,71.78,51
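As an aside, I did the consolidation by hand in Excel, but if you save each weekly export as CSV first, the merge itself can be scripted. A minimal sketch, assuming the exports all share the same header row; the directory and file names here are illustrative, not from my actual setup:

import csv
import glob

# Hypothetical locations: point the glob at your weekly Ecowitt exports
WEEKLY_GLOB = '/tmp/ecowitt-weekly/*.csv'
MERGED_FILE = '/tmp/Cougar-Caravan.csv'

with open(MERGED_FILE, 'w', newline='') as merged:
    writer = csv.writer(merged)
    headerWritten = False
    for weekly in sorted(glob.glob(WEEKLY_GLOB)):
        with open(weekly, newline='') as source:
            reader = csv.reader(source)
            header = next(reader)          # every export starts with the same header
            if not headerWritten:
                writer.writerow(header)    # keep the header once
                headerWritten = True
            writer.writerows(reader)       # append the data rows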
Create Python Script to Parse the File
The next part of this process was actually a lot easier than I anticipated: I simply had to create a Python script to read and parse the file and then update the SQLite database.
As this is only ever going to be a one-off activity I used constants to store the directory names for the CSV files and the database location, and in the sample program below you can see that each constant has two values, one for testing and one for the actual update, with the unused one commented out.
Another point of note is that the SQL statement for the update is hard-coded into this program; again, as this will only ever be a one-off activity, that is probably the best approach. Other than the column names, which should be self-explanatory, the only calculation required is converting the date into epoch time to match the records in the database.
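One caveat worth flagging on that conversion: time.mktime() interprets the parsed timestamp in the machine's local timezone, while calendar.timegm() treats it as UTC. The script below uses mktime(), which is correct as long as the export and the WeeWX machine share a timezone; if your export really is in UTC you would want the second form. A two-line sketch of both:

import calendar
import time

parsed = time.strptime('2020-11-03 20:00', '%Y-%m-%d %H:%M')
localEpoch = int(time.mktime(parsed))   # treats the stamp as local time (what the script below does)
utcEpoch = calendar.timegm(parsed)      # treats the stamp as UTC instead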
The code I used is replicated below and downloadable from here.
pi@raspberrypi:~ $ vi weeUpdate.py

import csv
import os
import sqlite3
import time

# Two values for each location: one for testing, one for the real run
#CSV_FILE_PATH = '/tmp'
CSV_FILE_PATH = '/Users/revittmk/OneDrive/Mike/WebSites/Weewx/DataLoads/Ecowitt'
#SQLITE_DB = '/home/weewx/weewx.sdb'
SQLITE_DB = '/Users/revittmk/OneDrive/Mike/WebSites/Weewx/DataLoads/Ecowitt/weewx.sdb'

SOURCE_FILE = 'Cougar-Caravan.csv'
DEST_FILE = 'Epoc-Caravan.csv'

if __name__ == "__main__":
    firstRow = True
    sourceFile = os.path.join(CSV_FILE_PATH, SOURCE_FILE)
    destFile = os.path.join(CSV_FILE_PATH, DEST_FILE)

    # Audit copy of the data with the timestamp converted to epoch time
    destData = open(destFile, 'w')
    destWriter = csv.writer(destData, delimiter=',', quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)

    sqliteDB = sqlite3.connect(SQLITE_DB)
    sqliteCursor = sqliteDB.cursor()

    with open(sourceFile, 'r') as sourceData:
        csvReader = csv.reader(sourceData, delimiter=',')
        for sLineIn in csvReader:
            if firstRow:
                # Pass the header row straight through to the audit file
                destWriter.writerow(sLineIn)
                firstRow = False
            else:
                # Convert the timestamp to epoch time to match the archive table
                epochTime = int(time.mktime(time.strptime(sLineIn[0], '%Y-%m-%d %H:%M')))
                destWriter.writerow([epochTime, sLineIn[1], sLineIn[2],
                                     sLineIn[3], sLineIn[4]])

                # Hard-coded one-off update against the matching archive row
                sqlString = 'update archive set extraTemp2 = ' + sLineIn[1] + ','
                sqlString = sqlString + 'extraHumid2 = ' + sLineIn[2] + ','
                sqlString = sqlString + 'extraTemp3 = ' + sLineIn[3] + ','
                sqlString = sqlString + 'extraHumid1 = ' + sLineIn[4] + ' '
                sqlString = sqlString + 'where datetime = ' + str(epochTime) + ';'
                sqliteCursor.execute(sqlString)

    destData.close()
    sqliteDB.commit()
    sqliteCursor.close()
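A design note: because every value comes from a file I created myself, building the SQL by string concatenation was acceptable for a one-off, but sqlite3 also supports parameter binding, which sidesteps quoting and injection issues entirely. The execute() call in the loop above could be replaced with something like this:

sqliteCursor.execute(
    'update archive set extraTemp2 = ?, extraHumid2 = ?, '
    'extraTemp3 = ?, extraHumid1 = ? '
    'where datetime = ?',
    (sLineIn[1], sLineIn[2], sLineIn[3], sLineIn[4], epochTime))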
Test Test Test
I shouldn't need to say this, but you are updating data in your production database, and if you get it wrong you could end up corrupting everything; not the ideal way to find out whether your backup strategy works. You do have a backup strategy, don't you?
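If WeeWX might still be writing while you take the copy, a plain file copy can catch the database mid-transaction. A minimal sketch using the sqlite3 backup API (available from Python 3.7), with illustrative paths rather than anything from my setup:

import sqlite3

# Illustrative paths: point these at your own database and backup location
source = sqlite3.connect('/home/weewx/weewx.sdb')
target = sqlite3.connect('/home/weewx/weewx-backup.sdb')
with target:
    source.backup(target)   # takes a consistent snapshot even while WeeWX writes
target.close()
source.close()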
Copy your database file somewhere safe and then connect to the database to examine the records that we will be updating. In my case extraTemp1 contains the temperature of my Raspberry Pi CPU, but the other columns are empty.
pi@weewx: $ sqlite3 /Users/revittmk/OneDrive/Mike/WebSites/Weewx/DataLoads/Ecowitt/weewx.sdb
sqlite> select datetime, extraTemp1, extraTemp2, extraTemp3 from archive where datetime > 1610638200;
1610638500|128.4008||
1610638800|126.6476||
1610639100|124.0178||
1610639400|124.8944||
1610639700|124.8944||
1610640000|124.0178||
1610640300|124.8944||
1610640600|124.8944||
1610640900|126.6476||
1610641200|125.771||
Run the update program and then examine the data again. The program also creates a new CSV file, Epoc-Caravan.csv in my case, containing the updated information with the timestamp converted to epoch time; this file can be used to perform spot checks against the database.
pi@weewx: $ python3 weeUpdate.py
pi@weewx: $ sqlite3 /Users/revittmk/OneDrive/Mike/WebSites/Weewx/DataLoads/Ecowitt/weewx.sdb
sqlite> select datetime, extraTemp1, extraTemp2, extraTemp3 from archive where datetime > 1610638200;
1610638500|128.4008|34.7|64.6
1610638800|126.6476|34.5|64.9
1610639100|124.0178|34.5|65.3
1610639400|124.8944|34.5|65.5
1610639700|124.8944|34.5|65.5
1610640000|124.0178|34.7|65.66
1610640300|124.8944|34.3|65.3
1610640600|124.8944|34.5|65.5
1610640900|126.6476|34.7|65.7
1610641200|125.771|35.1|65.8
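Rather than eyeballing individual rows, the spot checks can be scripted too. A sketch that re-reads the epoch-time CSV and flags any row that does not match the archive table; the paths mirror the constants in weeUpdate.py, and the script itself is my suggestion, not part of the original process:

import csv
import os
import sqlite3

# Same locations as the constants in weeUpdate.py above
CSV_FILE_PATH = '/Users/revittmk/OneDrive/Mike/WebSites/Weewx/DataLoads/Ecowitt'
SQLITE_DB = os.path.join(CSV_FILE_PATH, 'weewx.sdb')
CHECK_FILE = os.path.join(CSV_FILE_PATH, 'Epoc-Caravan.csv')

db = sqlite3.connect(SQLITE_DB)
cursor = db.cursor()

with open(CHECK_FILE) as checkData:
    reader = csv.reader(checkData)
    next(reader)    # skip the header row
    for row in reader:
        cursor.execute('select extraTemp2, extraHumid2, extraTemp3, extraHumid1 '
                       'from archive where datetime = ?', (int(row[0]),))
        dbRow = cursor.fetchone()
        if dbRow is None or list(dbRow) != [float(v) for v in row[1:]]:
            print('mismatch at', row[0], ':', row[1:], 'vs', dbRow)

db.close()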
Populate Summary Tables
What we have done here is update the archive rows with the new data, but the summary reports run against a different set of tables: the daily summary tables, which hold one row per day recording that day's minimum, maximum, sum and count for each observation. For the data I have just updated these are:
- archive_day_extraTemp2
- archive_day_extraTemp3
- archive_day_extraHumid1
- archive_day_extraHumid2
First let's have a look at the data in these tables:
pi@weewx: $ sqlite3 /Users/revittmk/OneDrive/Mike/WebSites/Weewx/DataLoads/Ecowitt/weewx.sdb
sqlite> select * from archive_day_extraTemp2 where datetime > 1604361600;
1604448000||||||||
1604534400||||||||
1604620800||||||||
sqlite> select * from archive_day_extraTemp3 where datetime > 1604361600;
1604448000||||||||
1604534400||||||||
1604620800||||||||
sqlite> select * from archive_day_extraHumid1 where datetime > 1604361600;
1604448000||||||||
1604534400||||||||
1604620800||||||||
sqlite> select * from archive_day_extraHumid2 where datetime > 1604361600;
1604448000||||||||
1604534400||||||||
1604620800||||||||
Next we run the WeeWX utility wee_database; this will populate the daily summary tables correctly, so that when the reports run the graphs include the data we just updated.
As before, I wanted to test this before running the database utility against my production database, so I created a copy of my main config file and changed the location of the database in that file. This allowed me to point the database utility at the test database via the test copy of the config file.
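For reference, in my copy of weewx.conf (WeeWX 4.x) the SQLite database location is resolved from the [Databases] and [DatabaseTypes] stanzas, so the test copy of the config only needed the path changed. Roughly like this; the paths are illustrative, and older WeeWX releases lay this section out differently, so check your own config:

[Databases]
    [[archive_sqlite]]
        database_name = weewx.sdb
        database_type = SQLite

[DatabaseTypes]
    [[SQLite]]
        # Changed in weewx-db.conf to point at the directory holding the test copy
        SQLITE_ROOT = /home/weewx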
pi@weewx: $ wee_database --config=weewx-db.conf --rebuild-daily --from=2020-11-01

sqlite> select * from archive_day_extraTemp2 where datetime > 1604361600;
1604448000|42.98|1604534400|73.22|1604449800|2622.84|48|2622.84|48
1604534400|41.9|1604545200|51.8|1604563200|2304.6|48|2304.6|48
1604620800|44.06|1604707200|54.32|1604649600|2337.0|48|2337.0|48
sqlite> select * from archive_day_extraTemp3 where datetime > 1604361600;
1604448000|66.74|1604509200|76.1|1604453400|3435.36|48|3435.36|48
1604534400|65.48|1604590200|76.28|1604611800|3339.96|48|3339.96|48
1604620800|64.94|1604676600|76.28|1604701800|3362.28|48|3362.28|48
sqlite> select * from archive_day_extraHumid1 where datetime > 1604361600;
1604448000|47.0|1604451600|54.0|1604485800|2430.0|48|2430.0|48
1604534400|50.0|1604611800|57.0|1604559600|2580.0|48|2580.0|48
1604620800|48.0|1604674800|57.0|1604644200|2525.0|48|2525.0|48
sqlite> select * from archive_day_extraHumid2 where datetime > 1604361600;
1604448000|56.0|1604475000|74.0|1604521800|3281.0|48|3281.0|48
1604534400|72.0|1604565000|77.0|1604610000|3567.0|48|3567.0|48
1604620800|73.0|1604651400|78.0|1604696400|3631.0|48|3631.0|48
And that's all there is to it: WeeWX will now create all of your graphs and reports from the data you just loaded.