Literature Review: Simulation of power signatures for appliances

Since data collection (including ground truth) is not a trivial process in NILM, very few standard datasets exist that can be used to validate algorithms. Oliver Parson maintains a good list of publicly available NILM datasets here. One way to get around the need for datasets is to simulate a realistic one. In this post, I review the literature on modelling appliance-level and whole-home power consumption patterns.

Authors from UMASS: This is probably the most comprehensive simulator of whole-home power consumption based on appliance-level modelling (and the only one that doesn’t assume appliances consume power in a stepwise fashion). They model three types of loads: resistive (as simple step functions), inductive (as exponential decay), and capacitive (as min-max models with exponentially distributed spikes). They also have complex loads that are combinations of one or more of these models. Although they haven’t analyzed in detail how individual power traces can be combined realistically into a whole-home power consumption signature, I assume probabilistic models of when appliances are likely to operate (based on a house’s characteristics like number of occupants, geographic location, etc.) will come into play. A dataset like Pecan Street‘s or ICF‘s* would be invaluable in building a realistic usage model on top of the appliance models in this paper.

*ICF was the implementer (and needs to be contacted for the dataset). The UK Government carried out the study.

Researchers at King’s College do use the patterns from the UK (ICF) dataset to simulate whole-home electricity consumption, but their model only forecasts at an hourly level (and most load signatures aren’t available at that resolution). Researchers at the University of Waterloo have done something very similar, using a comparable dataset and premise to create hourly-level data.

Researchers at Loughborough University add information from occupancy models (based on time-use surveys) and activity simulation, on top of appliance-level simulated data, to create whole-home consumption profiles at minute-level granularity. They use data from 22 houses at the same resolution to validate their simulation models. Their appliance modelling is fairly weak: most appliances are modelled as two-state and revolve around fixed parameters (average annual usage, assigned power factor, etc.). From a disaggregation standpoint this might not be of much use, although it might be useful for demand forecasting.

Authors from Korea University and Samsung have created a simulator based on the SystemC platform (typically used for simulation in design automation) that operates at a granularity of one second. They use a user module to model a user’s behavior in turning appliances on and off, and an appliance module to model an appliance’s operating behavior. They use hardcoded values to model an appliance’s power consumption and the different states it can exist in. Again, power is modelled as step changes, and it is not clear how user behavior is modelled. For their test bed, they appear to have used the exact times the appliance switched states, and the power consumption in each state, to model the behavior. Obviously, comparing the “model” using these parameters will yield perfect results. This model isn’t as powerful as one that incorporates user behavior to simulate the operation of different appliances.

iPower: iPower is a collaborative platform (of companies and universities). The report is an attempt to model common household loads using MATLAB’s Simulink platform to simulate whole-building consumption. They see this as beneficial for formalizing control problems. HVAC consumption is modelled as a Markov model with parameters like indoor temperature, outdoor temperature, thermal resistance of the walls, and thermal capacity. (Actually, they estimate the temperature inside the room and assume HVAC consumption will be a function of that.) Similarly, they model a fridge’s consumption based on temperature differentials and other parameters like specific heat, cold storage mass, etc. Generic appliances (like the dishwasher, microwave, etc.) are just modelled based on prior assumptions and hardcoded values.

Vienna University of Technology: This is a more interactive simulation. It models appliances as step changes. It asks for values like maximum allowed power, earliest time of use, and latest time of use, and creates power consumption signatures. The selection of appliances is small, and overall building consumption isn’t discussed.

Osaka University: In this model, the annual energy consumption of a house is simulated using a schedule of living activities, weather data, and the energy efficiencies of appliances and buildings. They base their simulation on a survey of activities and divide appliances into those linked to occupancy/activity and those that are not. The fridge model is built using outside air temperature; the water heater uses variables like city water temperature and outside air temperature; the HVAC consumption model uses some assumptions about the physical properties of buildings; and lighting is simulated using floor area. Operating and standby power values are hardcoded for ~16 appliances. (They take these micro models up to city-wide power consumption estimates at an hourly level.)

A sequential Monte Carlo formulation for contextually supervised load disaggregation

This is a particle filtering approach to estimating HVAC consumption, but it can be extended to any load depending on what context information is available. In the case of HVAC, the context information is outside temperature. For appliances like lighting, the context could be outside light intensity; for active appliances, it could be motion; for periodic appliances (e.g., dishwasher or stove), it could be time of day.

Data preprocessing:
I worked with Pecan Street data (10 houses: one week at 1-minute resolution and one week at 15-minute resolution, with appliance-level ground truth). The lag between the temperature time series and HVAC power is found by cross-correlating total power with temperature. The data is then smoothed (a sliding window with a step of 1 sample and a width of 10 samples was used). The furnace and air compressor-condenser channels are added together to form the HVAC ground truth.
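A minimal Python sketch of this preprocessing step (the column names `use`, `furnace1` and `air1` are placeholders for the Pecan Street channels, and `max_lag` is just a knob of the sketch; the actual column names and code used may differ):

```python
import numpy as np
import pandas as pd

def preprocess(df, temperature, max_lag=30, window=10):
    """Align temperature with total power, smooth, and build HVAC ground truth.

    df          : DataFrame with per-appliance power columns (column names assumed)
    temperature : Series of outdoor temperature on the same index
    max_lag     : largest lag (in samples) searched by cross-correlation
    window      : width of the moving-average smoothing window (samples)
    """
    total = df["use"]                      # whole-home power (assumed column name)

    # Lag between temperature and HVAC operation: pick the shift that maximizes
    # the cross-correlation between the mean-removed total power and temperature.
    t = (total - total.mean()).to_numpy()
    w = (temperature - temperature.mean()).to_numpy()
    xcorr = [np.sum(t[lag:] * w[:len(w) - lag]) for lag in range(max_lag + 1)]
    aligning_window = int(np.argmax(xcorr))

    # Smooth with a sliding window (step of 1 sample, width `window` samples).
    smooth = df.rolling(window, min_periods=1).mean()

    # HVAC ground truth = furnace + air compressor/condenser channels (assumed names).
    hvac_gt = smooth["furnace1"] + smooth["air1"]

    return smooth, hvac_gt, aligning_window
```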

Model: 
The problem is modelled as a particle filtering problem, where the hidden state at each time step (variable x) is the HVAC power consumption. The observed variable is the total power consumption (variable z). The action taken at each time point that leads to the next state is represented by θ; in this case, temperature acts as the action variable.

Assumptions:
As is clear from the figure above, we need to know the transfer function between z and x. I assumed the relationship was linear and used a primitive model based on the assumption that HVAC consumption is ~50% of total power in Texas during summer. So,
z = 2x + N(μ1, σ1). This assumption is allowed to be crude, since the model compares the estimates of z with the observed z to correct its estimates of x.

Now we need to model the relationship between x and θ. It is also important to model θ as the action parameter that displaces x(t-1) to x(t). For this, I used the assumption x(t) = z(t)/2 and then fit a plane such that x(t) = a·θ(t) + b·x(t-1) + c based on the observed data. The R-squared values (after robust regression) were >0.98 for all houses, which gives confidence in the linear model.
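A sketch of that fit, using the crude x(t) = z(t)/2 assumption and scikit-learn's HuberRegressor as a stand-in for the robust regression (the original fit may well have used a different routine):

```python
import numpy as np
from sklearn.linear_model import HuberRegressor

def fit_transition(total_power, temperature):
    """Fit x(t) ~ a*theta(t) + b*x(t-1) + c, with x approximated as z/2."""
    x = np.asarray(total_power, dtype=float) / 2.0   # crude HVAC estimate: half of total
    theta = np.asarray(temperature, dtype=float)

    X = np.column_stack([theta[1:], x[:-1]])         # predictors: theta(t), x(t-1)
    y = x[1:]                                        # target: x(t)

    model = HuberRegressor().fit(X, y)               # robust linear fit
    a, b = model.coef_
    c = model.intercept_
    r2 = model.score(X, y)                           # R-squared of the fit
    return a, b, c, r2
```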

Another parameter to consider was how many particles to sample; I chose 1000, but it didn’t seem to matter much. I chose the noise distributions for both transfer functions to be Gaussian. Their covariances mattered a lot, and could probably be learned better with cross-validation on training data.
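For concreteness, here is a minimal bootstrap particle filter under these assumptions (linear observation z = 2x + noise, linear transition x(t) = a·θ(t) + b·x(t-1) + c + noise). Treating x_N and x_R as the transition and observation noise scales is my reading of the parameter names in the tables below, not something stated explicitly:

```python
import numpy as np

def particle_filter(z, theta, a, b, c, x_N=0.55, x_R=0.05, n_particles=1000, seed=0):
    """Bootstrap particle filter for HVAC power x given total power z and temperature theta.

    Transition : x(t) = a*theta(t) + b*x(t-1) + c + noise(scale x_N)
    Observation: z(t) = 2*x(t)                   + noise(scale x_R)
    (Using x_N, x_R as noise scales is an assumption of this sketch.)
    """
    rng = np.random.default_rng(seed)
    T = len(z)
    particles = np.full(n_particles, z[0] / 2.0)     # initialize at the crude estimate
    estimates = np.empty(T)
    estimates[0] = particles.mean()

    for t in range(1, T):
        # Propagate particles through the transition model.
        particles = a * theta[t] + b * particles + c + rng.normal(0.0, x_N, n_particles)

        # Weight particles by the likelihood of the observed total power.
        w = np.exp(-0.5 * ((z[t] - 2.0 * particles) / x_R) ** 2)
        if w.sum() == 0:                             # all particles far from the observation
            w = np.ones(n_particles)
        w /= w.sum()

        # Resample (multinomial resampling keeps the sketch short).
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
        estimates[t] = particles.mean()

    return estimates
```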

The following are the results on house 1. In this case, I assumed HVAC consumption was 40% of total (just to see how far I could push it; in actuality it is ~70% for this data). The error was 2.52%. The RMS error at each time point was 13% (which I don’t think is relevant). When I compare this to the results from using the transfer function directly (dividing each total power sample by 2.5 and summing), the error is 37.6%. This indicates that the results aren’t simply determined by the choice of transfer function, and that learning is taking place based on the observations.

Next, I tried the model on all houses. The results are pretty good, as shown in the following two tables. I chose the parameter values (the variances for the noise models, x_N and x_R) loosely, based on how the model performed for one day of house 1, and kept them constant throughout. The aligning window was based on visual alignment between total power and temperature. The “HVAC factor” accuracy refers to the accuracy of total HVAC consumption for the week if a ballpark figure of 60% of total load is taken to be HVAC (based on Texas numbers). The “regression” accuracy refers to the accuracy if only the regression model (without particle filtering) is used. The fact that the particle filtering accuracies are better than both of these “prior” assumptions means that the system is learning from the observable and correcting itself.

Parameters (x_N, x_R), preprocessing (aligning and smoothing windows), and accuracy (particle filter, HVAC factor, regression) for the 15-minute data:

| Frequency | House | x_N | x_R | Aligning Window | Smoothing Window | Particle Filter | HVAC Factor | Regression |
|---|---|---|---|---|---|---|---|---|
| 15 min | 1 | 0.55 | 0.05 | 14 | 10 | 3.90% | 6.52% | 8.19% |
| | 2 | 0.55 | 0.05 | 22 | 10 | 2.58% | 17.95% | 46.65% |
| | 3 | 0.55 | 0.05 | 16 | 10 | 2.99% | 10.33% | 57.23% |
| | 4 | 0.55 | 0.05 | 5 | 10 | 0.29% | 0.69% | 63.40% |
| | 5 | 0.55 | 0.05 | 18 | 10 | 0.49% | 5.93% | 70.82% |
| | 6 | 0.55 | 0.05 | 7 | 10 | 2.75% | 5.36% | 10.07% |
| | 7 | 0.55 | 0.05 | 11 | 10 | 0.59% | 3.92% | 6.91% |
| | 8 | 0.55 | 0.05 | 13 | 10 | 0.27% | 12.30% | 50.10% |
| | mean | | | | | 1.73% | 7.88% | 39.17% |

The Pecan Street sample data also has another week of data (in September) for the same houses at a different frequency (1 minute). I resampled that to 15 minutes and redid the analysis (the resampling was done to reduce computation time; simulation methods can take a long time).
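The resampling itself is essentially a one-liner if the data sits in a timestamp-indexed pandas DataFrame (a sketch, not necessarily the exact code used):

```python
import pandas as pd

def downsample_power(df_1min: pd.DataFrame) -> pd.DataFrame:
    """Average a 1-minute power DataFrame (timestamp index) down to 15-minute resolution."""
    return df_1min.resample("15min").mean()
```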

Parameters, preprocessing, and accuracy for the 1-minute data resampled to 15 minutes:

| Frequency | House | x_N | x_R | Aligning Window | Smoothing Window | Particle Filter | HVAC Factor | Regression |
|---|---|---|---|---|---|---|---|---|
| 1 min (resampled to 15 min) | 1 | 0.55 | 0.05 | 17 | 10 | 9.11% | 13.33% | 31.16% |
| | 2 | 0.55 | 0.05 | 16 | 10 | 13.67% | 19.01% | 31.44% |
| | 3 | 0.55 | 0.05 | 15 | 10 | 3.58% | 10.67% | 36.96% |
| | 4 | 0.55 | 0.05 | 18 | 10 | 2.87% | 9.80% | 52.79% |
| | 5 | 0.55 | 0.05 | 4 | 10 | 7.40% | 8.80% | 17.20% |
| | 6 | 0.55 | 0.05 | 14 | 10 | 1.64% | 11.13% | 43.59% |
| | 7 | 0.55 | 0.05 | 11 | 10 | 14.74% | 16.85% | 4.49% |
| | mean | | | | | 7.57% | 12.80% | 31.09% |

The only thing that changed was the aligning window, as resampling changes it. The aligning window is a variable that encompasses the lag between a temperature change and HVAC operation. It is also a function of the thermal integrity of the house.

One of the things that bothers me is that the houses in Pecan Street have a very strong HVAC component. On average, 60-70% of total consumption is HVAC, more specifically cooling (which is probably understandable given that it is Texas). I want to try this on a more generic dataset as well. For that, I am working on the SMART dataset (house 1) right now. The data processing (aligning everything according to time stamps, getting temperature info) has been taking some time.

The temperature seems to fluctuate from 29 °F to 98 °F over the three months. But ground truth is only available for "Furnace HRV". The question I have is: does furnace HRV include all heating/cooling-related consumption?

Creating power traces for non-intrusive load monitoring from GT events in BLUED

Although BLUED gives event-level ground truth (GT) for all appliances, it is not clear how to get power-level ground truth from it. Since the whole goal of NILM is to estimate power for different appliances, it would be useful to have an estimate of appliance-level consumption in BLUED. The closest we can get is to take the GT event labels, go back to the aggregate power to see the corresponding change, and attribute that change to that label.

I ran some basic analysis to create power traces for appliances. The algorithm worked as follows:
1. Look at the different kinds of GT labels.

Mac IDs for Plug level GT: [1,2,3,8,11,12,18,20,23,27,28,29,31,32,34,35,40]
Mac IDs for GT from Environmental sensors : [47,48,49,50,51,52,53,55,56,57,58,59]
Mac IDs for GT from Circuit level monitoring: [4,7,9,10,11]

2. Every time there is a GT event associated with a MAC address, extract the power change in the mains power by taking the difference between a point 30 samples (0.5 seconds) after the event and a point 5 samples before it.

3. From these power changes, assume the appliance's power was constant (at the measured delta) after each positive change, and zero after each negative change. A sketch of this reconstruction is shown below.
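A Python sketch of steps 2 and 3 for a single MAC ID, assuming the event indices are expressed in samples of whatever aggregate power array is passed in (the text above measures the deltas on the 60 Hz power stream, so the offsets would need rescaling for a 1 Hz trace):

```python
import numpy as np

def build_power_trace(aggregate, events, after=30, before=5):
    """Reconstruct an appliance power trace from event-level ground truth.

    aggregate     : 1-D array of aggregate real power
    events        : sample indices (into `aggregate`) at which this appliance switched state
    after, before : offsets (in samples) used to measure the power delta around an event

    Returns an array of the same length as `aggregate`: the estimated appliance power,
    held constant after each positive change and set to zero after each negative change.
    """
    aggregate = np.asarray(aggregate, dtype=float)
    trace = np.zeros_like(aggregate)
    level = 0.0
    prev = 0

    for idx in sorted(events):
        trace[prev:idx] = level               # hold the previous level up to this event
        delta = (aggregate[min(idx + after, len(aggregate) - 1)]
                 - aggregate[max(idx - before, 0)])
        level = delta if delta > 0 else 0.0   # positive change -> new level, negative -> off
        prev = idx

    trace[prev:] = level                      # hold the last level to the end
    return trace
```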

To test how well this works, I compared it against the plug-meter GT we had from BLUED. The energy discrepancy is in part due to calibration errors in the plug meters (they were not individually calibrated, since their goal was to provide event-level GT anyway). Part of it is also due to the different sampling rates of the plugs and the power calculations in BLUED (the plugs have one sample every 0.66 seconds, while the power I calculated from the aggregate data was at 1 Hz). I tried to account for that by resampling the plug data at 2/3 of its frequency. But, that apart, this discrepancy also points to the limitations of estimating power with event-based methods. Even in cases like this, where complete event-level GT is available, the estimated energy for the week can be off by as much as 270%. Table 1 details the results of the comparison against energy-level GT for all plug meters.

Table 1: Energy comparison for all plug meters.

| Mac ID | Number of Events | Energy from aggregate power (kWh) | Energy from Plug (kWh) | Error (%) |
|---|---|---|---|---|
| 1 | 26 | 0.86 | 0.23 | 273.17 |
| 2 | 25 | 0.93 | 0.49 | 88.25 |
| 3 | 24 | 0.06 | 0.10 | -43.01 |
| 8 | 16 | 0.13 | 0.10 | 32.81 |
| 11 | 616 | 9.51 | 6.95 | 36.87 |
| 12 | 8 | 3.62 | 4.88 | -25.89 |
| 18 | 45 | 1.70 | 1.84 | -7.56 |
| 20 | 14 | 1.39 | 1.29 | 8.10 |
| 23 | 34 | 0.85 | 2.43 | -64.95 |
| 27 | 20 | 0.07 | 0.06 | 7.07 |
| 28 | 77 | 0.73 | 0.94 | -22.92 |
| 29 | 54 | 5.36 | 6.36 | -15.71 |
| 31 | 150 | 0.06 | 0.16 | -63.78 |
| 32 | 8 | 0.29 | 0.27 | 9.09 |
| 34 | 40 | 0.14 | 0.14 | -4.19 |
| 35 | 2 | 0.00 | 0.08 | -99.12 |
| 40 | 150 | 0.44 | 0.63 | -30.55 |

To test whether some of the observed discrepancy is due to calibration error in the plugs, I looked at the mean power consumption of each appliance (the mean of all observed step changes) in both the aggregate case and the GT case. The results were as follows:

Table 2: Mean step-change power from aggregate vs. plug data.

| Mac ID | Estimated mean consumption from aggregate (W) | Estimated mean consumption from plug (W) | Approx. calibration error from Plug (%) |
|---|---|---|---|
| 1 | 29.09 | 11.75 | 59.59 |
| 2 | 30.87 | 23.90 | 22.57 |
| 3 | 418.33 | 354.73 | 15.20 |
| 8 | 778.59 | 874.25 | -12.29 |
| 11 | 157.21 | 208.64 | -32.72 |
| 12 | 44.35 | 43.11 | 2.78 |
| 18 | 42.08 | 57.68 | -37.08 |
| 20 | 28.75 | 37.73 | -31.22 |
| 23 | 19.82 | 80.69 | -307.03 |
| 27 | 1240.11 | 1163.13 | 6.21 |
| 28 | 30.95 | 35.08 | -13.35 |
| 29 | 170.09 | 189.19 | -11.23 |
| 31 | 1003.14 | 799.96 | 20.25 |
| 32 | 1664.38 | 1271.66 | 23.60 |
| 34 | 1418.46 | 1429.12 | -0.75 |
| 35 | 64.99 | 50.55 | 22.21 |
| 40 | 30.48 | 36.49 | -19.72 |

So, here I made the assumption that the mean value calculated from the aggregate power was closer to the rated power consumption of the device (which is also what was done in the table presented in the original BLUED paper). I then re-calibrated the energy figures obtained from the plugs using the calibration error values in Table 2. The following table lists the re-adjusted error in the energy calculated from aggregate power data (with perfect GT) when compared to plug-level energy GT.
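The adjustment can be reproduced from Tables 2 and 3 roughly as follows; the exact convention (rescaling plug energy by 1 + calibration error, and reporting the error relative to the adjusted value) is my reading of the numbers, not something stated explicitly:

```python
def recalibrate(agg_energy, plug_energy, mean_agg, mean_plug):
    """Re-adjust plug-level energy using the mean step-change ratio as a calibration factor.

    agg_energy  : energy estimated from the aggregate power trace (kWh)
    plug_energy : energy reported by the plug meter (kWh)
    mean_agg    : mean step change seen in the aggregate data (W)
    mean_plug   : mean step change seen by the plug meter (W)

    Returns (adjusted_plug_energy_kwh, error_after_adjustment_percent).
    The conventions below are assumptions inferred from the tables.
    """
    calib_error = (mean_agg - mean_plug) / mean_agg          # Table 2 column, as a fraction
    adjusted = plug_energy * (1.0 + calib_error)             # rescaled plug energy
    error_pct = (adjusted - agg_energy) / adjusted * 100.0   # Table 3 error convention
    return adjusted, error_pct

# Example with the Mac ID 1 row: recalibrate(0.86, 0.23, 29.09, 11.75) gives roughly (0.37, -134%).
```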


Table 3: Energy error after calibration adjustment.

| Mac ID | Energy after calibration adjustment (kWh) | Error after calibration adjustment (%) |
|---|---|---|
| 1 | 0.37 | -133.82 |
| 2 | 0.60 | -53.58 |
| 3 | 0.12 | 50.53 |
| 8 | 0.09 | -51.42 |
| 11 | 4.68 | -103.42 |
| 12 | 5.02 | 27.89 |
| 18 | 1.16 | -46.92 |
| 20 | 0.89 | -57.16 |
| 23 | -5.03 | 116.93 |
| 27 | 0.07 | -0.81 |
| 28 | 0.82 | 11.05 |
| 29 | 5.65 | 5.04 |
| 31 | 0.19 | 69.88 |
| 32 | 0.33 | 11.73 |
| 34 | 0.14 | 3.46 |
| 35 | 0.10 | 99.28 |
| 40 | 0.51 | 13.49 |

Following are a few plots of the estimated power trace compared to the plug-level power trace. The red line denotes the reconstructed power trace based on event-level GT, and the blue line is the GT based on plug-level data.

What is clear from these plots is that most of the error can be attributed to the inability of any event-based algorithm to go back to the aggregate power when an event happens and appropriately estimate the power delta that caused the event. And since such information is only available during events, a state-based method is probably more reliable.

Some of the power traces created based on Env Sensor ground truths are shown below.

I am sharing the power traces for all the devices with their mac_ids so that they can be used as a reference by anyone who wants to do power estimation using BLUED. I am also sharing the code for extracting such info from a BLUED-style power dataset. [Actually, sharing is not possible through our website just yet, but should be in a few days.]

Building up the motivation -II

In my last post, I provided a summary of the work that has been done so far in assessing the potential savings from appliance-wise disaggregation. All the studies were done over a period of 2-4 months, which is not enough time to estimate savings arising from behavioral changes (that time frame isn't even enough to account for weather-related changes!). All the studies had 10-20 houses (except for the Karbo-Larsen study, which provides very little detail), which is a very limited sample size. But given the overhead cost and maintenance issues of monitoring all the appliances in a house, the small sample sizes are no surprise. In summary, the studies done so far claim savings of 10-15% based on limited sample sizes and trial durations.

Following is a list of the things that matter in a study like this but that haven't been tackled at all in the literature on the benefits of NILM. I believe they are of fundamental importance in motivating the solution to the NILM problem, and might even play a crucial role in policy making.

1. The kind of behavioral changes that disaggregated feedback can invoke. For instance, are people more likely to change/replace appliances, or are they more likely to change the way they use their appliances (habits)?

2. In terms of behavioral changes, which kinds of changes are more likely? This gets to the heart of which appliances really need to be identified.

3. What is the minimum number of appliances (and which ones) that need feedback to achieve savings, i.e., after how many appliances do the marginal savings begin to decrease? This information is fundamental for algorithm development.

4. How do demographics (income, age, household size, etc.) affect what kind of behavioral changes are observed?

5. What are the estimated savings based on what people can change (given their demographics)?

6. Are the changes/savings persistent?

7. Is disaggregated feedback required indefinitely, or do the savings saturate after some time, after which the feedback is no longer required?

8. Is appliance-level feedback the only kind of disaggregated feedback that can generate savings? What about feedback in terms of which room is consuming how much, or which activity (within some broadly defined categories) is consuming how much? [This opens up a whole new avenue of research.]

A study that addresses these issues will not only get to the heart of the actual implications of disaggregated feedback, but will also motivate other ways of tackling the same problem.

Building up the motivation for Non-Intrusive Load Monitoring

For the past three decades, extensive research has gone into disaggregating the total power consumption of a house to appliance-level detail. The paper that introduced this concept in 1992 (by George Hart) has been cited 341 times so far, which means that on average 16 papers come out per year that deal with this subject directly or indirectly [245 of these papers came out in the last three years alone!]. The reason behind so much research interest in this field is that it sits conveniently in an academic space where people from electrical engineering, civil and environmental engineering, computer science, and machine learning can simultaneously work on different aspects of it. It is an intriguing theoretical problem for the theory-minded (as it provides a new use case for the source separation paradigm), promises ample opportunities in hardware design and sensing for the more application/experiment-oriented, and comes with the benefit of energy savings for the implication-oriented, which is also an easy sell to the general public.

From the implication point of view, disaggregated energy is useful because of the following reasons:

1. Users can take necessary actions to save energy if they know which appliance is consuming how much.
2. Utilities can provide personalized recommendations to users if they know which user is consuming more than the average in their group.
3. Utilities can also perform demand response based on the information they have about what appliance is operating at any given time.
4. Improved load forecasting.
5. Potential for fault detection and diagnosis in larger systems.

Testing the validity of reasons 3 and 4 is difficult because, until appliance-level data is available at a large scale, the actual implications of demand response based on appliance-level data cannot be evaluated. Also, the current state of demand response algorithms is not advanced enough for us to efficiently decide which appliances to turn off to achieve the required peak shifting (or whatever the requirement is). Very little work has been done on using appliance signatures for fault diagnosis and pre-emptive action. So, right now, the only real motivation behind working towards a NILM solution that can be tested is its utility in helping consumers take control of their consumption habits and, in the process, save energy.

A quick literature review reveals that not many people have looked into the potential of appliance-level feedback for saving energy. Of the 38 studies (from 1979 to 2006) examined by Darby in her seminal 2006 report, not a single one dealt with the impact of disaggregated feedback on attainable energy savings. The studies mostly looked at savings from aggregate feedback and recommendations, and the realistic achievable savings ranged from 5-15%.

Ehrhardt-Martinez et al. performed a comprehensive literature review in 2010 and found five studies that have looked into the impact of disaggregated feedback. They are listed below:

1. Dobson and Griffin (1993, Ontario Hydro, Canada): An experiment on 25 households found that, over a period of two months, energy savings of 12.9% were achieved compared to a control group of 75 other houses. The households were shown real-time disaggregated data on a computer screen. No details on how the savings arose (new devices, behavioral changes, etc.). No details on persistence (although the behavior was reported to be persistent over the duration of the study).

2. Ueno, Inada et al. (2006, Japan): A two-month experiment on 10 houses where information was logged every 30 minutes. Every morning an email was sent to the users detailing their appliance-level energy usage. They found average savings of 18%. [Although there was a temperature difference between the data collection period and the baseline, which makes the real savings lower; some have estimated them at around 9%.]

3. Ueno et al. (2006, Japan): Ueno did another study in 2006 on 9 houses that showed savings of 9%, according to Ehrhardt-Martinez et al. The study could not be traced from the citation provided.

4. Wood and Newborough (2003, UK): A four-month study on 20 households, with 22 more as a control. Users showed savings of up to 15% (varying from 11%-39%). The feedback was only on cooking appliances, so the findings aren't generalizable.

5. Karbo and Larsen (2005, Denmark): This study covers 3000 households, with 50 households given appliance-level feedback. Expected savings of 10%. But the actual study was still being implemented at the time of the paper, and results weren't back yet. No information on which appliances (if not all) were monitored.

In summary, savings of 10-15% seem to be a safe bet based on these studies. In my next post, I'll outline the issues pertaining to NILM that these studies fail to address. Then I'll build a case for the ideal study that would be needed to make a definitive claim about savings from NILM methods.

2012 in review

2012 was an OK year in terms of what I achieved on the academic side of things. Besides almost getting done with my undergrad requirements for a Masters in Civil Engineering, I also took a few machine learning and signal processing classes. I also sat for my PhD qualifier exams.

In terms of publications, here’s what I did this year :

1. Submitted a paper to EG-ICE 2012 on automated training in NILM using EMF sensors. Won Best Paper (mostly because of Mario's presentation skills).
2. The paper was invited to the AEI journal, so we extended it and submitted it. The reviews are back and we are in the process of working out the kinks.
3. Submitted a paper to BuildSys 2012 on using plug meters to first automatically identify appliances and then do demand response based on that. Didn't get through. We are in the process of converting it into a journal paper for JCCE.
4. Submitted a paper on using EMF sensors as event detectors to ICCPS 2012.
5. Submitted an abstract and wrote a paper for ISARC 2013 based on the work I did on NILM (power estimation) while I was at Samsung.
6. Submitted an abstract on using blind source separation techniques to tackle crosstalk in EMF sensors for EG-ICE 2013.
7. I also wrote a proposal for the PG&E dataset, to apply collaborative filtering techniques to provide energy recommendations to users. It did not get accepted either.

Here's a short video I made last night that reviews the current state of NILM.

Hosting a local file on a remote server and password-protecting it

  • ssh into the server where you want your files to be hosted.

ssh username@linux.andrew.cmu.edu [Enter your password when prompted]

  • Go to the folder where you want to host the files.
  • Create a .htaccess file and a .htpasswd file.
  • I used vim. Go to this site and generate a hashed htpasswd entry: enter the username and password you want to protect the folder with, and it will give you a string. Copy that string into your .htpasswd file and save it.
  • Copy the following to your .htaccess file

AuthUserFile /home/content/10/9290610/html/appliance_data/.htpasswd
AuthGroupFile /dev/null
AuthName "EnterPassword"
AuthType Basic
require valid-user

  • Go to the source folder directory and secure copy it to the remote server
    scp -rp Source_Folder user@server.org:destination_folder_path

ruminations on Component Analysis

Most of what I have done in the past year has centered around projections. The crazy part about all of the stuff I have tried so far is that I have been doing it without exactly knowing what I was doing. I never had any formal training in PCA, LDA or kernel methods (unless YouTube/Wikipedia qualifies as formal). So, now that I am taking MLSP and Pattern Recognition, there are lots of eureka moments in each class. Still, some of the 'world experts' who teach our classes (Fernando De la Torre, for instance) are at such a high level of understanding that it is hard to keep up with their perspectives on the matter. Fernando, for instance, has come up with a master equation that explains PCA, LDA, kernel PCA, spectral clustering and even k-means as variations of the same theme. I didn't quite get what the equation means, but the general idea seemed to be minimization of reconstruction error, with certain tweaking parameters that define each projection.
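This is obviously not Fernando's master equation, just a toy Python illustration of the reconstruction-error view: the PCA subspace reconstructs the data with lower error than an arbitrary linear subspace of the same dimension (here a random one, for comparison):

```python
import numpy as np

def reconstruction_error(X, basis):
    """Mean squared error of reconstructing rows of X from an orthonormal basis (k x d)."""
    Xc = X - X.mean(axis=0)
    proj = Xc @ basis.T @ basis           # project onto the subspace and back
    return np.mean((Xc - proj) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))   # correlated toy data

# PCA basis: top-k right singular vectors of the centered data.
k = 3
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
pca_basis = Vt[:k]

# A random orthonormal basis of the same dimension, for comparison.
rand_basis = np.linalg.qr(rng.normal(size=(10, k)))[0].T

print("PCA reconstruction error   :", reconstruction_error(X, pca_basis))
print("Random reconstruction error:", reconstruction_error(X, rand_basis))
```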

The problem with this high-level treatment is that we just hear a couple of sentences about each of these topics in class, and that is it. The task of picking up the necessary tool and implementing it is left to the student. Spectral clustering and Independent Component Analysis have struck my fancy out of the things mentioned in Fernando's two classes. And obviously, kernel methods and SVMs. Robust PCA was a very interesting idea too. There is so much to learn, it is crazy!

For the next few days I will try to read up on a particular topic each day and blog about it. I read up a bit on ICA today and implemented a few things. I will write about it tomorrow when the hour is more reasonable, but I was surprised at how much better than PCA it was. Actually, the pre-processing step for ICA does something similar (if not identical) to PCA. With this implementation, I also realized the benefits of having a standard dataset. Since I have the dataset of 15 appliances that I have already run PCA and LDA on, I can immediately see how ICA performs on it, and the results are encouraging. I am thinking of adding it to the paper that we plan on submitting to the AEI journal.

quick thoughts on active NILM

Mario mentioned that he met someone from Sony, and here I quote, 'by pure chance' during a conference (Sust-KDD 2012 in Beijing?). Apparently he was working on active NILM. The idea, from what I gathered, was to send active pulses (or signals) into the power line and look at the response in order to perform NILM. This is in direct contrast to all the "passive" methods so far, where already existing signals are studied and analyzed unmodified.

Something of a similar sort is done to find faults in fiber optic cables (look up OTDR).

One thing to try would be to look at a simple circuit (with a bulb and a battery). Maybe send an impulse from one point and measure it at another point when the bulb is OFF, and compare that to how it changes when the bulb is ON.

Something similar has been done by Patel et al. (that guy seems to have done everything), where they send pulses from two (distant) sources down the powerline and use receivers throughout the house for indoor positioning. Although it is a completely different problem, this could be a good starting point for figuring out what frequency ranges the pulses should be sent at in order to cover the whole house.

My first question is whether this can somehow be used to gain distance information (as most devices in the house are stationary, and their distance from the main circuit is a characteristic feature in itself). Maybe you could have plug meters around the house and check for disturbances at nodes. Or even better, maybe you could look at how disturbances in the voltage line due to EMI affect the pulse. That would probably be an easier estimator of distance. The question is, is it worth pursuing?

Some quick thoughts on projections

Suppose S(t) is a musical signal, a piece played on an instrument, say a violin. Suppose we know that this signal is composed of 5 different violin notes, call them n1, n2, ..., n5. How do you find which notes were being played in the signal at any given time? [Assume n1 to n5 span S(t) completely.]

This, roughly, was the first question in my MLSP class. The way it is solved is by finding the spectrum of each note using the STFT. Then the STFT of the aggregate signal is calculated at each time point (aka a spectrogram). The STFT at each time stamp is then projected onto the STFTs of the notes. If the magnitude of a projection is above a certain (empirical) threshold, the note is considered played at that time stamp; otherwise it is not. The need for an empirical threshold arises from the fact that the notes aren't orthogonal to one another. If the notes were orthogonal, the notes that don't contribute to the signal at that time would have zero magnitude. I am not completely sure, but I think if they were orthonormal, the projections onto notes that do contribute would have a value of 1. [Mario? / Emre?]
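A sketch of that procedure, assuming the note spectra and the spectrogram have already been computed (e.g., with scipy.signal.stft); the threshold is empirical, as noted above:

```python
import numpy as np

def detect_notes(spectrogram, note_spectra, threshold):
    """Detect which notes are active in each STFT frame by projection.

    spectrogram  : (n_freq, n_frames) magnitude STFT of the aggregate signal
    note_spectra : (n_notes, n_freq) magnitude spectrum of each isolated note
    threshold    : empirical magnitude threshold (needed because notes aren't orthogonal)

    Returns a boolean (n_notes, n_frames) activity matrix.
    """
    # Normalize each note spectrum so projections are comparable across notes.
    notes = note_spectra / np.linalg.norm(note_spectra, axis=1, keepdims=True)
    projections = notes @ spectrogram          # (n_notes, n_frames) projection magnitudes
    return projections > threshold
```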

The point is, energy disaggregation is the same problem. In the ideal case, the aggregate signal would be projected onto all of the appliance signatures, and the ones that are OFF would have a magnitude of zero. The problem is, there is no reason the signatures would be orthonormal to each other. And once you try to create an orthonormal basis (say through PCA or LDA), you lose the simple contributes/doesn't-contribute classification.

Maybe, once we calculate the orthonormal basis for the appliance signatures, we could project the appliance signatures onto that basis as well. This way, you know which of the signatures contribute how much to each basis component [say, the first principal component]. After that, we can project the aggregate signal onto the orthonormal basis and know exactly which basis components contribute to the signal.

Then you go and look at the projections of the signatures onto these basis components and see which signatures contribute to them [there has to be some statistical way to do this]. Besides, it sounds reasonable that the new orthonormal basis (orthonormal "signatures") would still span the same space as the one spanned by the original signatures.
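A rough sketch of the idea: build an orthonormal basis from the signature matrix with an SVD (i.e., PCA), then express both the signatures and the aggregate signal in that basis:

```python
import numpy as np

def signature_basis_projection(signatures, aggregate, k=None):
    """Project appliance signatures and an aggregate signal onto a PCA basis of the signatures.

    signatures : (n_appliances, n_features) matrix, one signature per row
    aggregate  : (n_features,) aggregate signal in the same feature space
    k          : number of basis vectors to keep (defaults to all)

    Returns (basis, signature_coords, aggregate_coords).
    """
    mean = signatures.mean(axis=0)
    _, _, Vt = np.linalg.svd(signatures - mean, full_matrices=False)
    basis = Vt if k is None else Vt[:k]                   # orthonormal rows

    signature_coords = (signatures - mean) @ basis.T      # how each signature loads on each basis vector
    aggregate_coords = (aggregate - mean) @ basis.T       # how the aggregate loads on each basis vector
    return basis, signature_coords, aggregate_coords
```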

Either I am getting a better understanding of projections, or I am ruining my understanding by overthinking things in terms of NILM.