Late in 2010, I embarked on a couple of “professional development” projects. One was to learn more about SQL and MySQL, with a view to coming up with a better way of storing data on various projects. The second was to learn more about R, in part because, as many have put it, R has become the lingua franca of statistical computing; so many books on statistics nowadays use R to illustrate what they’re talking about.
To my surprise, these two projects converged. It turns out that R is able to talk to several SQL database systems. In addition to packages for connecting to MySQL, there are packages that embed SQLite, allowing connections to SQLite databases. I have not done a benchmarking analysis, but my impression is that SQLite/R is fast, with some tasks taking a fraction of the time they took in my earlier (SAS) setup.
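To give a flavour of what talking to SQLite from R looks like, here is a minimal sketch. It assumes the DBI and RSQLite packages are installed, which is one common way to do this and not necessarily the only one; the table name, column names and values are made up purely for illustration.

```r
# Assumption: the DBI and RSQLite packages are installed.
library(DBI)

# An in-memory database keeps the sketch self-contained;
# pass a file path instead to open a database on disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical data, invented for this example.
measurements <- data.frame(
  subject = c("a", "a", "b", "b"),
  value   = c(1.5, 2.5, 4.0, 6.0)
)

# Copy the data frame into a SQLite table.
dbWriteTable(con, "measurements", measurements)

# Let SQLite do the aggregation; the result comes back as a data frame.
means <- dbGetQuery(
  con,
  "SELECT subject, AVG(value) AS mean_value
     FROM measurements
    GROUP BY subject
    ORDER BY subject"
)

dbDisconnect(con)
```

The query result is an ordinary data frame, so it can go straight into whatever R analysis comes next.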
Another thing I like about SQLite is that one can also interact with it from Perl. So it is quite straightforward to have Perl fetch the data and do some preliminary processing before handing it over to R for further analysis.