I work with large datasets, messy data handling, models, etc., and I often get asked about the computational tools that are useful for these kinds of problems. There are "menu-driven systems" where you click some buttons and get some work done, but these are useless for anything nontrivial. To do serious economics these days, you have to write computer programs. This is true of every field, e.g. empirical macroeconomics, and not just of "computational finance" (which is a hot buzzword these days).
You should pay attention to four elements: price, freedom, elegant and powerful computer science, and network effects.
| System | Applicability | Price | Freedom | Computer science | Network effects |
|---|---|---|---|---|---|
| SAS | SAS seeks to be a single monolithic system that will comprehensively do all your economics and statistics. | Huge! | Zero | Terrible | Some economists do write SAS code so I guess some useful SAS codes are floating around. |
| Stata | Stata is a "younger, cleaner statistics package" as compared with SAS. | High | Zero | More elegant than SAS, but really not a place you want to write programs. | Quite a few economists use Stata, and there is a wealth of code coming out through the Stata Technical Bulletin (STB). |
| Gauss | Gauss is the workhorse of econometricians worldwide. | High | Zero | Gauss is really poor CS: worse than Stata but better than SAS. | Strong |
| Ox | Ox seeks to be a low-price but unfree system to compete with Gauss. | Zero | Not free: it is not GPL, and it is effectively only for Microsoft Windows. | It looks like a clean language, better than Gauss and Stata. | Small (there are some good stochastic volatility codes out there). |
| Octave (Matlab) | Octave is a free Matlab work-alike, so it's good for writing formulas in matrix notation. I felt it isn't good at dealing with files, string manipulation, or regexes, and that it is not strong on probability and statistics. I also felt its graphics are weak. | Free | Free | Excellent | A few people in macroeconomics are writing Matlab codes, and I suppose these should work with Octave, but beyond that, no. (And if these codes use Matlab toolboxes like "optimisation", then you're out of luck.) |
| R | R is a complete statistics package, integrating great computer science, matrix notation, good graphics, probability and statistics. | Free | Free | Truly superb | Weak - as yet, only a few economists are using R. |
| Grocer | Grocer is an econometrics system written using Scilab (a matrix scientific computation system). | Free | Free | Fair | Small |
From 2003 onwards, I have been doing all my work in R. I am also looking at Grocer with interest: some of the hard work that has gone into it is yielding invaluable functions which would be hard work to reimplement.
In addition, there is the "background work" of massaging data, manipulating files, and so on. I do this using Unix tools: the shell, command-line utilities, awk and perl.
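To give a flavour of what this background work looks like, here is a minimal sketch. It is written in Python rather than awk or perl, purely as an assumption for the example. The sample data in `RAW`, the pipe-delimited layout, and the function names `clean_rows` and `mean_price_by_ticker` are all hypothetical. It strips stray whitespace, drops rows whose price is not numeric, and averages the rest by ticker, which is the sort of chore an awk one-liner typically handles.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw input: pipe-delimited, uneven whitespace, one bad price.
RAW = """\
date | ticker | price
2006-12-01 | ABC | 101.5
2006-12-01 | XYZ |  47.25
2006-12-04 | ABC | n/a
2006-12-04 | XYZ | 48.00
"""


def clean_rows(text):
    """Yield (date, ticker, price) tuples, skipping malformed rows."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    next(reader)  # skip the header line
    for fields in reader:
        fields = [f.strip() for f in fields]
        if len(fields) != 3:
            continue  # wrong number of columns (including blank lines)
        date, ticker, price = fields
        try:
            yield date, ticker, float(price)
        except ValueError:
            continue  # non-numeric price such as "n/a"


def mean_price_by_ticker(text):
    """Return a dict mapping each ticker to the mean of its valid prices."""
    totals = defaultdict(lambda: [0.0, 0])  # ticker -> [sum, count]
    for _, ticker, price in clean_rows(text):
        totals[ticker][0] += price
        totals[ticker][1] += 1
    return {t: s / n for t, (s, n) in totals.items()}


if __name__ == "__main__":
    for ticker, mean in sorted(mean_price_by_ticker(RAW).items()):
        print(ticker, mean)
```

The cleaned output of a step like this would then be read into the statistics system for the actual analysis.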
Ajay Shah, December 2006