Lags with panel data

Using SAS, I have often wished there were a way to use lags easily in PROC SQL (using self-joins on a.year=b.year+1 doesn’t always apply). Turns out one can do it in PostgreSQL using “window” functions. Suppose I want the bonus paid to a manager this year and last:

SELECT executive_id, execid(Executive_ID) as exec_id,  FY_End, bonus, 
  	lag(FY_End) OVER w as FY_End_lag, lag(bonus) OVER w as bonus_lag 
FROM executive.executive
WINDOW w AS (PARTITION BY execid(Executive_ID) ORDER BY FY_End)

Pretty easy (I discuss the execid function below). This discovery should make one of my tasks today a little easier (though already I see a need to deal with “gaps” … the last line of the result set below). There are a lot of lag functions in various packages of R, but I haven’t seen examples of these being used for panel data rather than a single time series.

This example illustrates another neat feature of PostgreSQL: the ability to put R code (the middle two lines below) in the database using the PL/R module provided by Joe Conway. In this case, I had existing regular expression code in R to extract an executive identifier and I just dropped that into PostgreSQL (I can even do this part from R using RPostgreSQL).

CREATE OR REPLACE FUNCTION execid(original_id text)
RETURNS text AS $BODY$
  execid <- gsub(&"^(\\d+)\\..*N\\.(\\d+)$", "\\1_\\2", original_id, perl=TRUE)
  return(execid)
$BODY$ LANGUAGE plr IMMUTABLE;

Putting this into action:

crsp=# SELECT * FROM
crsp-# (SELECT executive_id, execid(Executive_ID) as exec_id,  FY_End, bonus, 
crsp(#   lag(FY_End) OVER w as FY_End_lag, lag(bonus) OVER w as bonus_lag 
crsp(# FROM executive.executive
crsp(# WINDOW w AS (PARTITION BY execid(Executive_ID) ORDER BY FY_End) ) AS a
crsp-# WHERE a.FY_End_lag IS NOT NULL
crsp-# LIMIT 10;
  executive_id   |   exec_id   |   fy_end   | bonus  | fy_end_lag | bonus_lag 
-----------------+-------------+------------+--------+------------+-----------
 10007.2.N.99816 | 10007_99816 | 2004-12-31 |      0 | 2003-12-31 |         0
 10007.1.N.99816 | 10007_99816 | 2005-12-31 |      0 | 2004-12-31 |         0
 10008.2.N.99785 | 10008_99785 | 2005-06-30 |      0 | 2004-06-30 |         0
 10008.1.N.99785 | 10008_99785 | 2006-06-30 |      0 | 2005-06-30 |         0
 10009.1.N.99743 | 10009_99743 | 2006-05-31 |      0 | 2004-12-31 |         0
 10009.2.N.99744 | 10009_99744 | 2004-12-31 |      0 | 2003-12-31 |          
 1000.3.N.0      | 1000_0      | 2001-12-31 |      0 | 2000-12-31 |    375000
 1000.2.N.0      | 1000_0      | 2002-12-31 |  65813 | 2001-12-31 |         0
 1000.1.N.0      | 1000_0      | 2003-12-31 | 140000 | 2002-12-31 |     65813
 1000.1.N.11     | 1000_11     | 2003-12-31 |  36960 | 2001-12-31 |         0
(10 rows)

Advertisements
This entry was posted in Uncategorized and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s