Home Monitoring Re-Write Number Four!

Since getting the Tesla Powerwall installed, our trusted Wattson has not been able to display correct figures as it can’t tell if we are importing or exporting until the Powerwall is full.  The Wattson displays a relatively static value of +150W indicating that we’re importing, yet the data from the various other devices in the house contradicts that figure.

So it’s time to say goodbye to Wattson and hand it on to a neighbour and hope they get some use out of it.

Wattson’s demise is a great excuse to upgrade to a tablet and display a lot more information than just whether we’re importing or exporting, so I’ve gone out and bought a Samsung Galaxy Tab A from JL to replace Wattson.

In order to display more information on the tablet, I needed to re-write the home monitoring application and start graphing the data at home rather than relying on PVOutput.  PVOutput is a great website, but it’s limited to a 5 minute picture of what’s going on and I’ve run out of fields to upload data, even though I donate to get extra fields! Wattson has gotten us used to being able to see what’s going on instantly rather than waiting for a snapshot 5 minutes later.

The second re-write I did of the home monitoring application in 2015 has been running well for a few years, but despite what I wrote back then about it being maintainable, it was a pain to add in a new datasource and it was written in my least favourite framework – Mule.

Since then I’ve tried re-writing it in Node.js, but that code was less than elegant and not tested at all… It also relied on a heavyweight MySQL database, which I wanted to avoid if possible. HSQLDB may be a bit basic, but it’s served me well for many years and allows me to make changes to the files in a text editor if required.

I did learn something valuable from the Node.js re-write: consolidate the five tables I had before into one large table. I’ve changed the following five tables (generation, upload info, hot water, meter and weather data)

to a single table for ease of storing the data and to save space.

The previous database file size was 640MB (note that’s more than 200MB per year, as I blogged about the database being 400MB only last year) vs. the new single table layout file size of 240MB. Every field in the table except the composite primary key columns is nullable. This allows the data to be stored into the table in any order; after all, I can’t guarantee which Arduino will send its data first.

The next step was to work out how to convert the database from the original layout to the new layout without having my PC running at 100% for over two hours (which is exactly what happened the first time I loaded the data from the old tables into the new table!). The trick was not to insert based on a select union, but to use the HSQLDB merge functionality. The two-hour ETL turned into a three-minute ETL. This much-improved ETL time allows me to take a copy of the old database (the in-use one) at any time, transform it and check that the new app is compatible with the schema and can write data into the new layout correctly.
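As a rough illustration of the merge idea, here’s a minimal sketch that builds one MERGE statement per old table, so each old row either updates the existing per-minute row or inserts a new one. The table and column names (READINGS, WEATHER, READ_DATE, READ_TIME, WIND_SPEED) are made up for the example, and the exact statement used in the real migration isn’t shown in this post, so treat the SQL shape as an assumption to check against the HSQLDB documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class MergeSqlSketch {

    // Builds a MERGE that copies the given value columns from an old table into the
    // consolidated table, matching rows on the (possibly composite) key columns.
    // All identifiers are passed in by the caller; none come from the real schema.
    static String buildMerge(String target, String source, List<String> keys, List<String> values) {
        List<String> all = new ArrayList<>(keys);
        all.addAll(values);

        String columns = String.join(", ", all);
        String on = keys.stream()
                .map(k -> "t." + k + " = s." + k)
                .collect(Collectors.joining(" AND "));
        String set = values.stream()
                .map(v -> v + " = s." + v)
                .collect(Collectors.joining(", "));
        String insertValues = all.stream()
                .map(c -> "s." + c)
                .collect(Collectors.joining(", "));

        return "MERGE INTO " + target + " AS t"
                + " USING (SELECT " + columns + " FROM " + source + ") AS s"
                + " ON " + on
                + " WHEN MATCHED THEN UPDATE SET " + set
                + " WHEN NOT MATCHED THEN INSERT (" + columns + ") VALUES (" + insertValues + ")";
    }
}
```

Calling `MergeSqlSketch.buildMerge("READINGS", "WEATHER", List.of("READ_DATE", "READ_TIME"), List.of("WIND_SPEED"))` gives a statement that could be run once per old table, each one filling in its own nullable columns.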

As I’ve mentioned above, the new application is no longer based on Mule and instead is a Spring Boot app.   The home monitoring application receives input using Spring MVC controllers and persists the data to the database against the date and time (rounded to the minute).
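To show what keying readings on the minute might look like, here’s a small sketch using `java.time`. It assumes “rounded to the minute” means dropping the seconds (truncating down); the class and method names are invented for the example.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

final class MinuteKey {

    // Drops seconds and nanoseconds so that every reading received within the
    // same minute maps to the same key. Assumption: truncation, not nearest-minute rounding.
    static LocalDateTime toMinute(LocalDateTime timestamp) {
        return timestamp.truncatedTo(ChronoUnit.MINUTES);
    }
}
```

With a key like this, readings arriving at 12:34:05 and 12:34:56 from different Arduinos land on the same 12:34 row, which is what makes the all-nullable single table workable.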

At the service layer, there are also three separate scheduled services: one for uploading PVOutput data once a minute, one for requesting the EE addons status page and scraping the data every hour, and one for calling the Tesla Powerwall API every five seconds.
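The three intervals can be sketched with nothing but the JDK. This is a plain `ScheduledExecutorService` stand-in rather than whatever scheduling mechanism the Spring Boot app really uses, and the task names are made up for the example.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class MonitoringSchedule {

    // Intervals as described in the post; the task names are hypothetical.
    static Map<String, Duration> periods() {
        Map<String, Duration> periods = new LinkedHashMap<>();
        periods.put("pvoutput-upload", Duration.ofMinutes(1));
        periods.put("ee-status-scrape", Duration.ofHours(1));
        periods.put("powerwall-poll", Duration.ofSeconds(5));
        return periods;
    }

    // Schedules each supplied task at its fixed rate, starting immediately.
    // Names with no matching task are skipped.
    static ScheduledExecutorService start(Map<String, Runnable> tasks) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(3);
        periods().forEach((name, period) -> {
            Runnable task = tasks.get(name);
            if (task != null) {
                scheduler.scheduleAtFixedRate(task, 0, period.toMillis(), TimeUnit.MILLISECONDS);
            }
        });
        return scheduler;
    }
}
```

One thing to watch with `scheduleAtFixedRate`: if a task throws, its later runs are suppressed, so each task should catch and log its own exceptions.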

EE addons status page scraping I hear you say… “what’s that for?”  We no longer have fixed line internet and rely on EE 4G internet, which is great until we run out of data two days before the end of the month!  The EE addons status page displays how much data you have used, how much is remaining and how long until the next period.  Since I’ve now got the option to display a lot of different data on the tablet, it seemed sensible to display the EE data allowance too!

For anyone interested in doing something similar, here’s a class I’ve written to read the HTML and trim it to extract the right bits of information. The fields aren’t accessible as I don’t store the information – I simply pass it straight to Splunk via toString.

package uk.co.vsf.home.monitoring.service.ee;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.commons.lang3.StringUtils;
import org.apache.commons.lang3.builder.ReflectionToStringBuilder;
import static org.apache.commons.lang3.StringUtils.*;

public class EeDataStatus {

	private static final String ALLOWANCE_LEFT = "allowance__left";
	private static final String ALLOWANCE_TIMESPAN = "allowance__timespan";
	private static final String BOLD_END = "</b>";
	private static final String BOLD_START = "<b>";
	private static final String SPAN_END = "</span>";
	private static final String SPAN_START = "<span>";
	private static final String DOUBLE_SPACE = "  ";

	private final String allowance;
	private final String remaining;
	private final String timeRemaining;

	public EeDataStatus(final String response) {
		String allowance = response.substring(response.indexOf(ALLOWANCE_LEFT) + ALLOWANCE_LEFT.length());
		allowance = allowance.substring(0, allowance.indexOf(SPAN_END));

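		// Match figures such as "12GB" or "12.5GB". The decimal point is escaped so that
		// a greedy ".*" cannot swallow both values when they appear on the same line.
		Pattern pattern = Pattern.compile("(\\d+(?:\\.\\d+)?\\s*GB)");
		Matcher matcher = pattern.matcher(allowance);

		// The first match is the data remaining, the second is the total allowance.
		if (!matcher.find()) {
			throw new IllegalStateException("Remaining data figure not found");
		}
		this.remaining = matcher.group();
		if (!matcher.find()) {
			throw new IllegalStateException("Allowance figure not found");
		}
		this.allowance = matcher.group();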

		String timespan = response.substring(response.indexOf(ALLOWANCE_TIMESPAN) + ALLOWANCE_TIMESPAN.length());
		timespan = timespan.substring(0, timespan.indexOf(SPAN_END));
		timespan = timespan.substring(timespan.indexOf(SPAN_START) + SPAN_START.length());
		timespan = timespan.replaceAll(BOLD_END, EMPTY).replaceAll(BOLD_START, EMPTY);
		timespan = timespan.replaceAll(CR, EMPTY);
		timespan = timespan.replaceAll(LF, EMPTY);
		timespan = timespan.replaceAll(DOUBLE_SPACE, SPACE);
		timespan = StringUtils.trim(timespan);
		this.timeRemaining = timespan;
	}

	@Override
	public String toString() {
		return new ReflectionToStringBuilder(this).toString();
	}
}
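To try the figure extraction on its own without Commons Lang on the classpath, here’s a cut-down sketch. The HTML fragment in the usage note is invented for illustration and is not copied from the EE page, and the pattern only accepts whole or decimal gigabyte figures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GbFigures {

    // Digits, an optional decimal part, optional whitespace, then "GB".
    private static final Pattern GB = Pattern.compile("\\d+(?:\\.\\d+)?\\s*GB");

    // Returns every gigabyte figure found in the fragment, in document order.
    static List<String> extract(String html) {
        List<String> figures = new ArrayList<>();
        Matcher matcher = GB.matcher(html);
        while (matcher.find()) {
            figures.add(matcher.group());
        }
        return figures;
    }
}
```

For a made-up fragment such as `<span class="allowance__left"><b>12.5GB</b> left of 20GB</span>`, `GbFigures.extract` returns `12.5GB` followed by `20GB`, i.e. remaining first and allowance second.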

When I tried writing the home monitoring application in Node.js I gave Prometheus a go to see whether that would be a good tool for graphing at home. It worked well when graphing small sets of data, but when I tried to graph over a year’s worth of data, it either errored because there was too much data coming back from the query, or took a vast amount of time to refresh the graph. It’s possible I wasn’t using the tool correctly, but I decided it wasn’t for me in this use case because of the inability to graph large amounts of data and because it’s not as intuitive as the graphing tool I’ve chosen to go with.

So what graphing tool have I chosen?  Splunk 🙂

I chose Splunk for a number of reasons:

  1. I’ll be sending less than 500MB to Splunk a day, so it’s free 😀
  2. It’s incredibly intuitive to search through data in Splunk, so I should be able to give my dad a basic lesson and he can create graphs for himself. I had considered the ELK stack, but the searching language isn’t quite as intuitive…
  3. Splunk doesn’t care about the schema of the data you throw at it.  This makes it easy to work with as I can add/remove fields when required and not have to change a schema.

Writing the data to Splunk uses the ToStringBuilder JSON format and a Log4j socket appender.  The ToStringBuilder format is configured at bootup via the following component.

package uk.co.vsf.home.monitoring;

import org.apache.commons.lang3.builder.ToStringBuilder;
import org.apache.commons.lang3.builder.ToStringStyle;
import org.springframework.stereotype.Component;

@Component
public class ToStringBuilderStyleComponent {

	public ToStringBuilderStyleComponent() {
		ToStringBuilder.setDefaultStyle(ToStringStyle.JSON_STYLE);
	}
}

And I chose the Log4j socket appender because it doesn’t require the use of tokens to talk to Splunk.

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn">
    <Appenders>
        <Socket name="socket" host="SERVER NAME" port="9500">
            <PatternLayout pattern="%m%n"/>
        </Socket>
        <Console name="STDOUT" target="SYSTEM_OUT">
        </Console>
    </Appenders>
    <Loggers>
        <Logger name="uk.co.vsf.home.monitoring" level="info" additivity="false">
            <AppenderRef ref="socket" />
            <AppenderRef ref="STDOUT" />
        </Logger>

	...

</Configuration>

Bringing it all together, we’ve gone from Wattson which displayed only one figure – house load – as shown in the (albeit not great) picture below:

To this 😀

And this complicated device/application diagram

Hopefully this incarnation of the home monitoring application will last a few years, but I suspect I’ll be re-writing it all again at some point 🙂

References
Tesla Powerwall 2 API https://github.com/vloschiavo/powerwall2/
Log4j2 Socket Appender https://logging.apache.org/log4j/2.x/manual/appenders.html#SocketAppender

Home Monitoring Upgrade (Part 2) – HSQLDB to MySQL

As mentioned in my previous post (see http://blog.v-s-f.co.uk/2017/04/home-monitoring-upgrade/), the first task I have is to migrate from HSQLDB to MySQL.

Because the system logs data every minute while I have power in the house and I want to minimise downtime when I actually have a fully working upgrade, I’ve experimented with a copy of the live HSQL database.

Once I’d copied it from my server to my laptop, I attempted to view the data in Notepad++. Yeah, not a particularly smart move! Notepad++ cannot handle files over about 100MB. It also turns out (having more’d the .data file) that the data is not in a readable state: I had a performance issue with a select query a long time back and changed the table to cached, and HSQLDB keeps the rows of cached tables in a binary .data file.

So, luckily at work a colleague had introduced us all to a great database tool called SQL Workbench. It can work with most databases and unlike SQuirreL, it doesn’t crash when looking at the work DB2 database.

Using SQL Workbench, I then created a script file which creates the new consolidated table, loads the data from the old tables into the new one and drops the old tables. The end result is a 122,398KB HSQLDB script file which is human readable.

Next step was to get MySQL running on my server. Instead of installing it directly though, there’s a Docker image available.

My first few attempts at inserting the data from the HSQLDB file into the MySQL database were less than impressive! One of the attempts had the server running flat out (100% CPU) for over an hour before I finally decided that it was probably not going to complete the import this year and nuked it!

So, having learnt a few lessons about not using single-row inserts(!) but batching them into multi-row inserts of 100,000, plus a few MySQL default parameter increases (although I’m not sure these are necessary, as the batch inserts seemed to make the most difference), I was finally able to import the data. It still took a few minutes from starting the MySQL container to it being available, but that’s significantly better than running for hours importing the data!!
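The batching idea can be sketched as a helper that groups already-formatted row tuples into multi-row INSERT statements. The table name, column names and values in the usage note are hypothetical, and the sketch assumes each tuple has already been safely quoted, which is fine for a one-off migration script but not for untrusted input.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiRowInsert {

    // Groups row tuples such as "(1, 10)" into statements of at most batchSize rows each.
    static List<String> build(String table, String columns, List<String> rowTuples, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<String> statements = new ArrayList<>();
        for (int start = 0; start < rowTuples.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rowTuples.size());
            statements.add("INSERT INTO " + table + " (" + columns + ") VALUES "
                    + String.join(", ", rowTuples.subList(start, end)));
        }
        return statements;
    }
}
```

For example, five tuples with a batch size of two produce three statements, the last containing a single row. With a batch size of 100,000 the server parses far fewer statements than with one insert per row, though very large statements may run into MySQL’s packet size limit.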

Now that I have the necessary scripts and knowledge to migrate the data, the next part is re-writing the application that receives the Arduino data, uploads to PVOutput and serves the hot water display Arduino.

Home Monitoring Upgrade

I’ve been monitoring stats from my meter, weather and hotwater tank for over two years now (see http://blog.v-s-f.co.uk/2015/04/home-monitoring-home-made-reborn/) and the application now needs an upgrade.

I now want to log more data from the weather station (temperature and humidity). This should be as simple as adding two new columns to the HSQLDB, changing the application to write into the two new fields and adding two new fields to the service definition, but it’s not quite that straightforward…

The old app uses an outdated version of Mule on Tomcat in Docker and it’s far too heavyweight for what it needs to be. Therefore it’s time to give it a revamp.

It’s also occurred to me recently that instead of storing the data in five separate tables (one each for generation, upload info, hot water, meter and weather data), why not store it in one table? This saves a significant amount of space as there are four fewer records per minute, and it makes adding new columns for additional data sources relatively quick. The HSQLDB that I’ve been using for a while now is over 400MB!

So the first task, which is possibly the biggest, is to migrate the data from the five tables in HSQLDB to a single table and then stop using HSQLDB and migrate to MySQL. Why MySQL? It’s actually quite a performant database, it’s free and it’s easy to get running.