Follow up to a MySQL to SAS comparison

The document here compares MySQL with SAS for some basic data tasks and gives the impression that MySQL is slower than SAS, but can be close enough in terms of performance to be worthy of consideration.

I did a fairly unscientific comparison with PostgreSQL. I don’t have the same hardware (the document above is from 2007), but I used my slow-as-a-wet-week-for-database-stuff two-year-old MacBook Pro (5400 RPM drive with whole-disk encryption).

Conclusion: PostgreSQL surely leaves SAS in its dust (the results posted in the source above are all in minutes). There was one query where PostgreSQL didn’t perform well, but one would actually use “window function” for that particular query (not available in MySQL or SAS, as far as I know) and then PostgreSQL is much, much faster.

CREATE TABLE test AS SELECT * FROM crsp.msf LIMIT 1000000; -- 10.6 seconds 

CREATE TABLE sub1 AS SELECT * FROM test WHERE vol=0; -- 5.5 seconds
CREATE TABLE sub2 AS SELECT * FROM test WHERE vol>0; -- 9.0 seconds

INSERT INTO sub1 SELECT * FROM sub2; -- 23.2 seconds

CREATE TABLE crop AS SELECT permno, date FROM test; -- 2.7 seconds

CREATE TABLE sort AS SELECT * FROM test ORDER BY permno, date; -- 17.6 seconds

CREATE TABLE stat AS SELECT date, avg(ret) FROM test GROUP BY date; -- 1.3 seconds

CREATE TABLE joined AS 
	SELECT a.permno, a.date, a.prc, b.prc AS lprc
	FROM test AS a 
	LEFT JOIN test AS b 
	ON a.permno=b.permno AND
		b.date BETWEEN a.date - interval '31 days' AND a.date - interval '1 day'; 
-- 259.2 seconds (higher than 182 seconds listed for SAS!)

-- Let's create an index to see what that does
CREATE INDEX test_idx ON test (permno, date); -- 3.5 seconds

CREATE TABLE joined2 AS 
	SELECT a.permno, a.date, a.prc, b.prc AS lprc
	FROM test AS a 
	LEFT JOIN test AS b 
	ON a.permno=b.permno AND
		b.date BETWEEN a.date - interval '31 days' AND a.date - interval '1 day'; 
-- 258.9 seconds (like SAS, PostgreSQL didn't use this index).

-- But you would never do that using PostgreSQL. Instead, use a window function:
CREATE TEMP TABLE joined3 AS 
	SELECT permno, date, prc, lag(prc) OVER w AS lprc 
	FROM test 
	WINDOW w AS (PARTITION BY permno ORDER BY date); -- 4.0 seconds!!!	
Advertisements
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s