WRDS SAS versus PostgreSQL

For one reason or another, last night I was partially replicating a paper by Beaver, McNichols, and Nelson (2007). I pulled together some code in R that draws on my PostgreSQL database, produces some summary statistics like those in the paper, produces a histogram and does a little statistical inference. Total runtime: around 9 seconds. Pretty fast.

Anyway, I discovered this morning that there is already SAS code out there to do much of this using WRDS. Just for comparison, I pulled out just part of the data step and got it running on the WRDS server. Run time: 23 minutes. Big difference.

Then I noticed more straightforward code to replicate Richardson, Sloan, Soliman, and Tuna (2005). I ran this code on WRDS:

proc sql;
    create table comp1 as
        select gvkey, datadate, ACT, CHE, LCT, DLC, AT, IVAO, LT, DLTT, IVST, DLC, PSTK, OIADP
        from comp.funda
        where INDFMT='INDL'and DATAFMT='STD'and POPSRC='D' and CONSOL='C'
            and 1962 <= year(datadate) <= 2001 and missing(AT)=0 and missing(CHE)=0 and
            missing(ACT)=0 and missing(LCT)=0 and missing(SALE)=0 and missing(LT)=0 and 
            missing(OIADP)=0 and prcc_f > 1;
quit;

Runtime data:

NOTE: PROCEDURE SQL used (Total process time):
      real time           33.87 seconds
      cpu time            10.05 seconds
 

Now, a little tweaking of the SQL gives me code I can run in PostgreSQL:

CREATE TABLE comp1 AS 
    SELECT gvkey, datadate, act, che, lct, at, ivao, lt, dltt, ivst, dlc, pstk, oiadp
    FROM comp.funda
    WHERE indfmt='INDL' AND datafmt='STD' AND popsrc='D' and consol='C'
        AND EXTRACT(YEAR FROM datadate) BETWEEN 1962 AND 2001
        AND at IS NOT NULL and che IS NOT NULL AND act IS NOT NULL
        AND LCT IS NOT NULL AND sale IS NOT NULL and lt IS NOT NULL
        AND oiadp IS NOT NULL AND prcc_f > 1;

Runtime data: 2.27 seconds. I also did a version that used the RPostgreSQL package to pull the data into R:

> system.time(source("~/Dropbox/rsst.R"))
   user  system elapsed 
  0.940   0.120   3.242 

Not as big of a difference (a mere ten-or-so times), but still pretty impressive.

Advertisements
This entry was posted in Uncategorized and tagged , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s