Example of EOF-Analysis with Matlab

Step 1) First generate a data set.

As an example we will generate a simple data set. I already have an m-file which will perform this.

lgolf.m -- this will generate a data set in space and time. It is, in fact, nothing but a superposition of several sin(kx - ut) waves. See below for a figure of the data.



Step 2) Perform EOF analysis.

EOF.m -- this file will perform an EOF analysis of the data You will also need zeroavg.m, which computes zero-average data over the columns. I forgot to put this on the website earlier, my apologies!.

Step 3) Analyse Results.

a) Determine the number of relevant EOFs (e.g. explain 90 % of the variance)
b) Plot the EOFs. Do they make sense
c) Plot the EC. Any relevant time scales that may be identified
d) Determine the contribution of each EOF to the reconstruction of the original signal

Please note there are several pitfalls in EOF analysis. Good reviews of the limits of EOF analysis can be found in:

Storch, H. von, A. Navarra, 1993
Analysis of Climate Variability, Applications of Statistical Techniques
proceedings of an Autumn School, on Elba from October 30 to November 6, 1993 Publisher: Berlin : Springer, 1995, ISBN: 3-540-58918-X, 334 pp

See paragraph 13.3.2 "What EOFs are Not Designed for..."

Also see

Richman, 1986
Rotation of Pricipal Components (Review Article)
J. of Climatology, Vol. 6, 293-335 (1986)


Below is a sample MATLAB session. The comments in between the tekst will NOT be generated with MATLAB, but I've added them for brief explanations. Data are generated in variable E

> lgolf;
> whos
  Name      Size         Bytes  Class

  Amp       1x200         1600  double array
  E       100x200       160000  double array
  Phi       1x1              8  double array
  i         1x1              8  double array
  k         1x1              8  double array
  t         1x100          800  double array
  w         1x1              8  double array
  x         1x200         1600  double array
  xc        1x1              8  double array

Grand total is 20505 elements using 164040 bytes
> [V, EOFs, EC, error] = EOF(E,10);
> help EOF

  function [V,EOFs,EC,error]=EOF(D,p)

  This function decomposes a data set D into its EOFs
  and eigenvalues V. The most efficient algorithm proposed by
  von Storch and Hannostock (1984) is used.

  D = each row is assumed to be a sample. Each column a variable.
  Thus a column represents a time series of one variable
  p = is an optional parameter indicating the number of
  EOFs the user wants to take into account (smaller than
  the number of time steps and the number of samples).

  EOFs= matrix with EOFs in the columns
  V = vector with eigenvalues associated with the EOFs
  EC = EOF Coefficients
  error = the reconstruction error (L2-norm)

  This function uses svds from Matlab version 5

  Written by Martijn Hooimeijer (1998)

> % plot a view of the data
> surfl(x,t,E,'light'), shading interp, view([-27, 42])
> xlabel('space'), ylabel('time')

> plot(V)
% it may be seen that 2 EOFs explain the data perfectly

> plot(EOFs)
% most of the EOFs produce a noise signal, only the first 2 EOFs make sense
> plot(EOFs(:,1:2))

% plot the time evolution of the first two EOFs
> plot(EC(:,1:2))

% plot the contributions of each EOF to the reconstruction to the data
> surfl(x,t,EC(:,1)*EOFs(:,1)','light'), shading interp, view([-27, 42])
> xlabel('space'), ylabel('time'),
> title('Contribution of first EOF To Reconstruction')
> surfl(x,t,EC(:,2)*EOFs(:,2)','light'), shading interp, view([-27, 42])
> xlabel('space'), ylabel('time'),
> title('Contribution of second EOF To Reconstruction')

% It is clear that the two EOFs each produce a set of standing waves.
% When added together the result is a propagating wave. This can also
% be derived from the approximately 90 degree phase difference
% between the EOFs 1 and 2 and between their corresponding EOF
% time coefficients.

% Plot the combined contributions of the first two EOFs to the
% reconstruction of the data
> surfl(x,t,EC(:,1:2)*EOFs(:,1:2)','light'), shading interp, view([-27, 42])
> xlabel('space'), ylabel('time'),
> title('Contribution of second EOF To Reconstruction')

% Plot the reconstruction error
> surfl(x,t,E-EC(:,1:2)*EOFs(:,1:2)','light'), shading interp, view([-27, 42])
> xlabel('space'), ylabel('time'),
> title('Contribution of second EOF To Reconstruction')

% Note that we get a resulting sine wave. This is because the
% EOF program first makes the data zero-averaged in time and then
% applies the EOF analysis. The difference in the plot is nothing
% but this average which was substracted. We can also determine EOF
% coefficients based on the original data and not on zero averaged
% data by directly transforming the data to EOF space.

> ECnew=E*EOFs(:,1:2);
% Plot the reconstruction error again
> surfl(x,t,E-ECnew(:,1:2)*EOFs(:,1:2)','light'), shading interp, view([-27, 42])
> xlabel('space'), ylabel('time'),
> title('Contribution of second EOF To Reconstruction')

% This yields very small errors in the order of 10-15

% Now compute the L2-norm of the recontruction error
> error2=norm(E-ECnew(:,1:2)*EOFs(:,1:2)')
error2 =
     1.12347020548844e-013

% We can conclude
%  1) the data are well described by only two EOFs,
%  2) The original data represent travelling waves
%  3) Reconstruction is done by adding two standing waves which, added up,
%     yield a travelling wave
%

> exit