When my empirical mode decomposition (EMD) code in MATLAB worked (that is to say, it exhaustively broke a nonlinear signal into symmetric, smooth functions, each less complex than the last, and satisfied the first requirement of an IMF absolutely and the second somewhat closely), I shot off a message to a friend on Facebook titled "Nonlinearity Conquered." I knew I was getting ahead of myself. But it was something. So there.
The wealth of literature on EMD laments the lack of a rigorous mathematical basis for its definition. The process was originally proposed as a very loose algorithm. How loose? There have been as many variations of the EMD algorithm as there have been programmers who have experimented with it. EMD is primarily an iterative algorithm that uses a method called sifting to extract intrinsic mode functions (IMFs) from any real signal. By definition, IMFs have well-behaved Hilbert transforms. They also have instantaneous frequencies associated with them, i.e. around any instant in the signal one can isolate a time window within which the signal oscillates at a single frequency.
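To make "sifting" a little more concrete, here is a minimal sketch of a single sifting pass, written in Python rather than MATLAB and not taken from my own code. It assumes piecewise-linear envelopes through the local maxima and minima (most EMD implementations use cubic splines instead) and a crude boundary treatment that simply pins both envelopes to the first and last samples. The names `local_extrema`, `linear_envelope` and `sift_once` are made up for this illustration.

```python
import math


def local_extrema(x):
    """Return (maxima, minima): indices of strict interior local extrema."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def linear_envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: the first and last samples are added as knots so that the
    envelope covers the whole record. Cubic splines are the usual choice;
    linear interpolation is swapped in here only to keep the sketch short.
    """
    knots = sorted(set([0] + list(idx) + [len(x) - 1]))
    env = []
    k = 0
    for n in range(len(x)):
        # Advance to the knot interval [knots[k], knots[k + 1]] containing n.
        while k < len(knots) - 2 and n > knots[k + 1]:
            k += 1
        a, b = knots[k], knots[k + 1]
        t = (n - a) / (b - a)
        env.append((1 - t) * x[a] + t * x[b])
    return env


def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    maxima, minima = local_extrema(x)
    upper = linear_envelope(x, maxima)
    lower = linear_envelope(x, minima)
    return [xi - 0.5 * (u + l) for xi, u, l in zip(x, upper, lower)]


# A fast oscillation riding on a slow linear trend.
signal = [math.sin(2 * math.pi * n / 20) + 0.05 * n for n in range(200)]
candidate = sift_once(signal)
```

Away from the boundaries, the pass strips the trend and leaves the oscillation; a real implementation would repeat the pass until some stopping condition is met, subtract the resulting IMF from the signal, and start over on what remains.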
This looks very promising and tempting for time-frequency analysis - but because of the algorithmic nature of EMD, the conditions for a signal to be an IMF are rarely, if ever, met exactly. The definition of an IMF is far too stringent, and the process used to obtain IMFs, the EMD, is far too subjective. Some researchers have gone so far as to say that there might be an unseen paradox between the two.
Thus, while a rigorous treatment of the EMD is welcome, a survey of different versions of the EMD algorithm and their stoppage criteria would be more pragmatic. We need to investigate the possibility of developing a yardstick to compare different stoppage conditions, subject, of course, to the application at hand. Within the purview of a given application domain - time-series analysis, machine learning, regression, etc. - we should be able to pit one stoppage criterion against another and comment on which is better.
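As an example of the kind of criterion such a comparison would cover, here is a hedged Python sketch of a standard-deviation-style test in the spirit of the one usually attributed to the original EMD proposal: sifting stops once two successive sifting results barely differ. It is not the criterion my code uses. The normalization (total squared difference divided by total energy of the previous result, rather than a sample-by-sample ratio), the default threshold of 0.25 and the names `sd_criterion` and `should_stop` are all assumptions made for this sketch.

```python
import math


def sd_criterion(h_prev, h_curr):
    """Normalized squared difference between two successive sifting results.

    Assumption: the difference is normalized by the total energy of h_prev,
    which avoids dividing by individual samples that happen to be near zero.
    """
    num = sum((a - b) ** 2 for a, b in zip(h_prev, h_curr))
    den = sum(a * a for a in h_prev)
    if den == 0:
        return 0.0 if num == 0 else math.inf
    return num / den


def should_stop(h_prev, h_curr, threshold=0.25):
    """Stop sifting when successive results differ by less than threshold."""
    return sd_criterion(h_prev, h_curr) < threshold
```

A different criterion would slot into the same place in the sifting loop, which is what makes a side-by-side comparison on a fixed dataset feasible.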
Here's a little demonstration of a very naive EMD algorithm - naive in the sense that it takes a very relaxed view of the IMF definition. The input is part of an OAE dataset - the distortion product amplitude of about 1848 audiometric tests.
EMD extraction produces eight IMFs for the above signal:
As can be clearly seen, the decomposition removes the high-frequency components of the signal first, and then sifts down to the mean trend of the signal. In other words, it can detect waves that are riding on other waves - the kth IMF rides on the (k+1)th IMF.
This particular implementation of the EMD was perhaps the simplest one possible - it consisted only of checking whether the number of extrema and the number of zero-crossings differed by at most one, and then visually inspecting whether the signal was roughly symmetric about the horizontal axis. This naive approach worked for this particular signal because the signal was inherently simple - whatever nonlinearity it had was spatial, not parametric. For roughly bandlimited signals like speech, the algorithm produces even fewer and simpler IMFs, and the residue comes tantalizingly close to zero. This demonstrates the empirical nature of the algorithm - once it has finished sifting through a given signal, there is little left in the signal that it has not already seen, since it never encounters frequencies outside the band! If there is a lot of high-frequency noise in the signal (a non-bandlimited signal), the algorithm cannot empirically make sense of it (or it can, but only asymptotically), and leaves it in the residue - with the more meaningful data in the early IMFs. In such cases we would need more sophisticated stopping conditions and much stricter adherence to the IMF definition in order to unleash the full might of the Empirical Mode Decomposition.
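The counting half of that naive test is easy to write down. The following is a rough Python sketch, not my MATLAB code: it counts strict local extrema and strict sign changes and compares the two. The function names are invented for the illustration, and the symmetry check, which I did by eye, is deliberately left out.

```python
import math


def count_extrema(x):
    """Count strict interior local maxima and minima (flat runs are ignored)."""
    return sum(
        1
        for i in range(1, len(x) - 1)
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0
    )


def count_zero_crossings(x):
    """Count sign changes between consecutive samples (exact zeros are ignored)."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def passes_count_test(x):
    """First IMF requirement: the two counts differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


# A plain sinusoid passes; the same sinusoid lifted off the axis does not,
# because it still has extrema but no longer crosses zero.
wave = [math.sin(2 * math.pi * n / 20 + 0.3) for n in range(200)]
lifted = [w + 2.0 for w in wave]
print(passes_count_test(wave), passes_count_test(lifted))
```

Passing this count says nothing about the second requirement (a local envelope mean of zero), which is why the visual symmetry check was still needed.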
A treasure trove of such stopping conditions and IMF definitions is available in the literature. So varied are their motivations and implications that they may be the very scattershot that sends students like me on a wild goose chase.
In the upcoming posts I intend to tackle these variations and see if we can establish where one triumphs over another.