Although BLUED provides event-level ground truth (GT) for all appliances, it is not clear how to derive power-level ground truth from it. Since the whole goal of NILM is to estimate the power drawn by individual appliances, an estimate of appliance-level consumption in BLUED would be useful. The closest we can get is to take each GT event label, go back to the aggregate power to find the corresponding change, and attribute that change to the labeled appliance.
I ran some basic analysis to create power traces for the appliances. The algorithm worked as follows.
1. Look for different kinds of GT labels.
Mac IDs for Plug level GT: [1,2,3,8,11,12,18,20,23,27,28,29,31,32,34,35,40]
Mac IDs for GT from Environmental sensors: [47,48,49,50,51,52,53,55,56,57,58,59]
Mac IDs for GT from Circuit level monitoring: [4,7,9,10,11]
2. Every time there is a GT event associated with a mac address, extract the power change it caused in the aggregate (mains) power by taking the difference between a point 30 samples (0.5 seconds) after the event and a point 5 samples before it.
3. From these power changes, reconstruct the appliance's power trace by assuming that the power stays constant at the step value after each positive change, and is zero after each negative change.
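Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the results here: the function names, the list-based inputs, and the synthetic aggregate signal are all hypothetical, and the window sizes (30 samples after, 5 before) are simply taken from step 2.

```python
def extract_power_deltas(aggregate_power, event_indices, after=30, before=5):
    """Step 2: change in aggregate power around each GT event index."""
    last = len(aggregate_power) - 1
    deltas = []
    for idx in event_indices:
        lo = max(idx - before, 0)    # clamp the window at the start of the data
        hi = min(idx + after, last)  # clamp the window at the end of the data
        deltas.append(aggregate_power[hi] - aggregate_power[lo])
    return deltas


def reconstruct_trace(n_samples, event_indices, deltas):
    """Step 3: hold each positive step until the next event, zero after a negative one."""
    trace = [0.0] * n_samples
    events = sorted(zip(event_indices, deltas))
    for k, (idx, delta) in enumerate(events):
        end = events[k + 1][0] if k + 1 < len(events) else n_samples
        level = delta if delta > 0 else 0.0
        for i in range(idx, end):
            trace[i] = level
    return trace


# Synthetic example (assumed data): 100 W baseline, a 50 W appliance on from sample 100 to 300.
aggregate = [100.0] * 100 + [150.0] * 200 + [100.0] * 100
events = [100, 300]
deltas = extract_power_deltas(aggregate, events)
trace = reconstruct_trace(len(aggregate), events, deltas)
```

Note that the sketch trusts every step it measures: if another appliance switches inside the window, its change is attributed to the labeled device as well.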
To test how well this works, I compared the reconstructed traces against the GT from the Plug meters that we have in BLUED. The resulting energy discrepancy is in part due to calibration errors in the Plug meters (they were not individually calibrated, because their goal was only to provide event-level GT anyway). Part of it is also due to the different sampling rates used in the Plugs and in the power calculations in BLUED: the Plugs give one sample every 0.66 seconds, while the power I calculated from the aggregate data was at 1 Hz. I tried to account for that by resampling the plug data at 2/3 of its frequency. That aside, this discrepancy also points to the limitations of estimating power with event-based methods. Even in cases like these, where complete event-level GT is available, the estimated energy for the week can be off by as much as 270%. Table 1 details the results of the energy comparison for all plug meters.
|Mac ID||Number of Events||Energy calculated from aggregate power (kWh)||Energy calculated from Plug (kWh)||Error (%)|
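For reference, the energy and error columns could be computed along the following lines. This is a hedged sketch: the helper names are hypothetical, and it assumes the error is the signed difference relative to the Plug energy, which is not stated explicitly here. Passing each trace its own sample period is one way to handle the differing sampling rates; it is an alternative to the resampling described above, not the same procedure.

```python
def energy_kwh(power_watts, sample_period_s):
    """Integrate a uniformly sampled power trace (W) into energy (kWh)."""
    joules = sum(power_watts) * sample_period_s
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


def percent_error(estimate, reference):
    """Signed error of the estimate relative to the reference, in percent (assumed definition)."""
    return 100.0 * (estimate - reference) / reference


# Synthetic example (assumed data): the same hour seen at two sampling rates.
reconstructed = [1000.0] * 3600   # 1 sample per second
plug = [500.0] * 5000             # 1 sample every 0.66 seconds
e_aggregate = energy_kwh(reconstructed, 1.0)
e_plug = energy_kwh(plug, 0.66)
error_pct = percent_error(e_aggregate, e_plug)
```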
To test whether some of the observed discrepancy is due to calibration error in the Plugs, I looked at the mean power consumption of the appliances (the mean of all the step changes observed) in both the aggregate case and the GT case. The results were as follows:
|Mac ID||Estimated mean consumption from aggregate (W)||Estimated mean consumption from Plug (W)||Approx. calibration error from Plug (%)|
So, here I assumed that the mean value calculated from the aggregate power is closer to the rated power consumption of the device (which is also what was done in the table presented in the original BLUED paper). I then re-calibrated the energy consumption obtained from the Plugs using the error values from Table 2 above. The following table lists the re-adjusted error in the energy calculated from aggregate power data (with perfect GT) when compared to Plug-level energy GT.
|Mac ID||Energy after calibration adjustment (kWh)||Error after calibration adjustment (%)|
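One plausible reading of the calibration adjustment is a simple multiplicative rescaling of the Plug energy by the ratio of the two mean step sizes. The sketch below works under that assumption. The function names are hypothetical, averaging the absolute values of the steps is my assumption (a plain mean of on and off steps would largely cancel), and the numbers are made up for illustration.

```python
def mean_step(deltas):
    """Mean magnitude of the step changes (absolute values are an assumption)."""
    return sum(abs(d) for d in deltas) / len(deltas)


def calibration_error_pct(mean_aggregate_w, mean_plug_w):
    """Deviation of the Plug mean from the aggregate-derived mean, in percent."""
    return 100.0 * (mean_plug_w - mean_aggregate_w) / mean_aggregate_w


def recalibrate_energy(plug_energy_kwh, mean_aggregate_w, mean_plug_w):
    """Rescale Plug energy so its mean step matches the aggregate-derived mean."""
    return plug_energy_kwh * mean_aggregate_w / mean_plug_w


# Made-up step changes for one appliance, in watts.
aggregate_steps = [50.0, -50.0, 52.0, -48.0]
plug_steps = [40.0, -40.0, 40.0, -40.0]
m_aggregate = mean_step(aggregate_steps)
m_plug = mean_step(plug_steps)
cal_err = calibration_error_pct(m_aggregate, m_plug)
adjusted_plug_kwh = recalibrate_energy(0.8, m_aggregate, m_plug)
```

The adjusted error is then the same percent-error comparison as before, with the rescaled Plug energy as the reference.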
Below are a few plots of the estimated power trace compared to the plug-level power trace. The red line denotes the power trace reconstructed from event-level GT, and the blue line is the GT based on plug-level data.
What is clear from these plots is that most of the error can be attributed to the inability of any event-based algorithm to go back to the aggregate power when an event happens and accurately estimate the power delta that caused the event. And since such information is only available at events, a state-based method is probably more reliable.
Some of the power traces created based on Env Sensor ground truths are shown below.
I am sharing the power traces for all the devices with their mac_ids so that they can be used as a reference by anyone who wants to do power estimation using BLUED. I am also sharing the code for extracting such info from a BLUED power-based dataset. [Actually, sharing is not possible through our website just yet, but should be in a few days]