Trend estimation

by Irina 25. May 2007 06:17

Trend estimation:

Trend in a time series is a slow, gradual change in some property of the series over the whole interval under investigation. Trend is sometimes loosely defined as a long-term change in the mean, but it can also refer to change in other statistical properties. For example, tree-ring series of measured ring width frequently have a trend in variance as well as in mean. Identification of trend in a time series is subjective because trend in a sample cannot be unequivocally distinguished from low-frequency fluctuations.
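The last point can be illustrated with a small sketch. It assumes a pure sine wave with a made-up period of 400 steps, so the series has no trend by construction, and it fits a least-squares line to a short and a long sample. The function names and the period are hypothetical and chosen only for this illustration.

```python
import math

def ols_slope(values):
    """Least-squares slope of values against t = 0, 1, ..., n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    x_mean = sum(values) / n
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    return sxy / sxx

PERIOD = 400  # hypothetical cycle length, in time steps

def cycle(n):
    """First n points of a pure sine wave with period PERIOD (no trend)."""
    return [math.sin(2 * math.pi * t / PERIOD) for t in range(n)]

short_slope = ols_slope(cycle(100))    # a quarter of one cycle
long_slope = ols_slope(cycle(4000))    # ten full cycles

print(f"slope over a quarter cycle: {short_slope:.5f}")
print(f"slope over ten cycles:      {long_slope:.5f}")
```

Over the quarter cycle the fitted line rises steeply, while over ten cycles the slope is close to zero. A sample that is short relative to the fluctuation cannot tell the two situations apart.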
Curve-fitting.
If a time series changes in level gradually over time, it makes sense to consider as trend some simple function of time itself.
The simplest and most widely used function of time in detrending is the least-squares straight line, which models a linear trend. Simple linear regression is used to fit the model:

x_t = a + b t + e_t

where x_t is the original time series at time t, a is the regression constant, b is the regression coefficient, and e_t are the regression residuals. The advantage of the straight-line method is simplicity. The straight line may be unrealistic, however, because it restricts the functional form of the trend.
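As a minimal sketch of the straight-line model above, the following fits a and b by ordinary least squares and returns the residuals, which form the detrended series. The function name and the sample series are hypothetical; the time index is assumed to be t = 1, 2, ..., n.

```python
def fit_linear_trend(series):
    """Return (a, b, residuals) for x_t = a + b*t + e_t with t = 1..n."""
    n = len(series)
    times = range(1, n + 1)
    t_mean = sum(times) / n
    x_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in zip(times, series))
    b = sxy / sxx                      # regression coefficient (slope)
    a = x_mean - b * t_mean            # regression constant (intercept)
    residuals = [x - (a + b * t) for t, x in zip(times, series)]
    return a, b, residuals

# Hypothetical series: level 10, slope 0.5, plus a small alternating disturbance.
series = [10 + 0.5 * t + (0.2 if t % 2 else -0.2) for t in range(1, 13)]
a, b, residuals = fit_linear_trend(series)

print(f"a = {a:.3f}, b = {b:.3f}")
print("detrended:", [round(e, 3) for e in residuals])
```

The residuals sum to zero by construction, so subtracting the fitted line removes both the level and the linear trend from the series.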

Trend estimation in Teradata:

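-- Slope of deposit on period with an approximate two-sided test at the 5% level.
-- Teradata allows later expressions to reuse aliases defined earlier in the same SELECT.
-- No FROM clause is shown; supply the table that holds deposit and period.
SELECT CAST(REGR_SLOPE(deposit, period) AS DECIMAL(8,4)) AS beta,  -- slope b
       SQRT(REGR_SXX(deposit, period)) AS sxx,  -- root of the sum of squares of period
       SQRT(REGR_SYY(deposit, period)) AS syy,  -- root of the sum of squares of deposit
       CAST(REGR_R2(deposit, period) AS DECIMAL(8,4)) AS r,  -- R-squared, despite the alias
       SQRT(1 - r) * syy AS s_e,                -- root of the residual sum of squares
       CAST(s_e / (SQRT(14) * sxx) AS DECIMAL(8,4)) AS s_ee,  -- standard error of beta; 14 appears to be n - 2 for 16 periods
       beta / s_ee AS t,                        -- t statistic for beta
       CASE WHEN ABS(t) > 1.96 THEN 1 ELSE 0 END AS significant,
       CASE WHEN beta > 0 AND significant = 1 THEN 1
            WHEN beta < 0 AND significant = 1 THEN -1
            ELSE 0 END AS trend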
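The arithmetic in the query can be checked outside the database. The sketch below follows the same chain (slope, R-squared, residual sum of squares, standard error of the slope, t statistic, trend flag) in plain Python. It assumes that the constant 14 in the query stands for n - 2 and therefore uses n - 2 directly; the function name and the 16-period sample data are made up for illustration. Like the query, it compares the t statistic with the normal critical value 1.96 rather than with a t quantile.

```python
import math

def trend_test(periods, deposits, critical=1.96):
    """Slope, standard error, t statistic and trend flag for deposits on periods."""
    n = len(periods)
    p_mean = sum(periods) / n
    d_mean = sum(deposits) / n
    sxx = sum((p - p_mean) ** 2 for p in periods)      # like REGR_SXX
    syy = sum((d - d_mean) ** 2 for d in deposits)     # like REGR_SYY
    sxy = sum((p - p_mean) * (d - d_mean) for p, d in zip(periods, deposits))
    beta = sxy / sxx                                   # like REGR_SLOPE
    r2 = sxy ** 2 / (sxx * syy)                        # like REGR_R2
    sse = (1 - r2) * syy                               # residual sum of squares
    se_beta = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    t_stat = beta / se_beta
    significant = abs(t_stat) > critical
    trend = (1 if beta > 0 else -1) if significant else 0
    return {"beta": beta, "se_beta": se_beta, "t": t_stat, "trend": trend}

# Hypothetical data: 16 periods and a made-up disturbance that sums to zero.
periods = list(range(1, 17))
noise = [4, -3, 1, -2, 5, -4, 0, 2, -1, 3, -5, 1, 2, -2, 0, -1]
rising = [100 + 3 * p + e for p, e in zip(periods, noise)]
flat = [100 + e for e in noise]

print(trend_test(periods, rising))
print(trend_test(periods, flat))
```

The sketch does not guard against a perfectly linear or perfectly constant series, where the standard error or the R-squared denominator becomes zero.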

Trend estimation in SAS:


data leadprd;
      input date:monyy5. leadprod customer @@;
      format date monyy5.;
      title 'Lead Production Data';
      title2 '(in tons)';
      datalines;
   jan90 38500 1 feb90 37900  1 mar90 36900  1  apr90 38600  1 
   may90 36400 1 jun90 33300  1 jul90 34000  1  aug90 38000  1 
   sep90 37400 1 oct90 42300  1 nov90 36900  1  dec90 34800  1 
   jan91 33900 1 feb91 34000  1 mar91 37200  1  apr91 33300  1 
   may91 29800 1 jun91 24700  1 jul91 30800  1  aug91 31100  1 
   sep91 32400 1 oct91 32900  1 nov91 29100  1  dec91 31800  1 
   jan92 32100 1 feb92 30500  1 mar92 36800  1  apr92 30300  1 
   may92 29500 1 jun92 24700  1 jul92 27600  1  aug92 23800  1 
   sep92 21400 1 feb90 37900  2 mar90 36900  2  apr90 38600  2
   may90 36400 2 jun90 33300  2 jul90 34000  2   aug90 38000 2 
   sep90 37400 2 oct90 42300  2 nov90 36900  2   dec90 34800 2 
   jan91 33900 2 feb91 34000  2 mar91 37200  2   apr91 68800 2 
   may91 75000 2 jun91 85000  2 jul91 10555  2   aug91 11520 2 
   sep91 32400 2 oct91 22500  2 nov91 29100  2   dec91 31800 2 
   jan92 32100 2 feb92 23556  2 mar92 33505  2   apr92 43005 2 
   may92 66500 2 jun92 77550  2 jul92 88800  2   aug92 99990 2 
   ;
   run;

Next, produce the forecasts and save the predicted values to SAS data sets. This example uses the forecasting capabilities of the FORECAST, ARIMA, and REG procedures. The OUT1STEP option of PROC FORECAST specifies that only the one-step-ahead forecasts are output to the data set LEADOUT1. The LEAD= option produces forecasts for 12 months beyond the sample period.
   proc forecast data=leadprd out=leadout1 out1step
         lead=12 interval=month;
      id date;
      var leadprod;
      by customer;
   run;

   proc arima data=leadprd;
      identify var=leadprod nlag=15;
      estimate p=1;
      forecast lead=12 interval=month id=date out=leadout2;
      by customer;
   run;
   quit;

To estimate a time trend for the lead production data, it is necessary to create a new variable T that spans both the sample and forecast periods.

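data ttrend;
   set leadout2;
   by customer;
   if first.customer then t = 0;   /* restart the time index for each customer */
   t + 1;                          /* sum statement: t is retained and incremented on every row */
run;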
proc reg data=ttrend;
   model leadprod = t;
   output out=leadout3 p=ptrend;
   ods output ParameterEstimates=estim;
   ods output FitStatistics=k;
   ods output anova=n;
   by customer;
run;
quit;
proc sql;
   create table estim_trend as
   select a.*,
          case when Probt < 0.05 then 1 else 0 end as significant,
          case when Estimate > 0 and calculated significant = 1 then 1
               when Estimate < 0 and calculated significant = 1 then -1
               else 0 end as trend
   from estim a
   where variable ne 'Intercept';
quit;
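
The reason T must span both the sample and forecast periods can be seen in a small sketch for a single customer. The trend line is fitted only on rows with an observed value, but because T is defined on every row, a fitted value (PTREND in the SAS code) is also available for the forecast rows. The function name is hypothetical, None stands in for a SAS missing value, and the six observations are the first six values of customer 1 from the data above.

```python
def trend_predictions(values):
    """values: one customer's series in time order; None marks forecast-period rows.

    Returns (intercept, slope, ptrend), where ptrend holds a fitted value for
    every row, including the rows whose observed value is None.
    """
    times = list(range(1, len(values) + 1))   # t = 1, 2, ... on every row
    observed = [(t, v) for t, v in zip(times, values) if v is not None]
    n = len(observed)
    t_mean = sum(t for t, _ in observed) / n
    v_mean = sum(v for _, v in observed) / n
    sxx = sum((t - t_mean) ** 2 for t, _ in observed)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in observed)
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    ptrend = [intercept + slope * t for t in times]   # covers forecast rows too
    return intercept, slope, ptrend

# Six observed months followed by three forecast months with no observed value.
history = [38500, 37900, 36900, 38600, 36400, 33300]
values = history + [None] * 3
intercept, slope, ptrend = trend_predictions(values)

print(f"slope = {slope:.1f} tons per month")
print("trend values for the forecast rows:", [round(p) for p in ptrend[6:]])
```

The significance test on the slope is left out here; in the SAS code it comes from the Probt column of the ParameterEstimates table.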

data final;
   merge leadout1(keep=date leadprod customer
                  rename=(leadprod=pfore))
         leadout2(keep=date leadprod forecast customer
                  rename=(leadprod=actual forecast=parima))
         leadout3(keep=date ptrend customer);
   by customer date;
run;


About the author

Irina Spivak, Team Leader at G-Stat.

