Saturday, June 6, 2015

Statistical Calculations & Numerical Accuracy

This post is for those readers who're getting involved with economic statistics for the first time. Basically, it serves as a warning that sometimes the formulae that you learn about have to be treated with care when it comes to the actual numerical implementation.

Sometimes (often) there's more than one way to express the formula for some statistic. While these formulae may be mathematically identical, they can yield different numerical results when you go to apply them. Yes, this sounds counter-intuitive, but it's true. And it's all to do with the finite numerical precision that your calculator (computer) is capable of.
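To see how finite precision can separate two expressions that are algebraically the same, here is a minimal sketch in Python (my choice of language for illustration; nothing in this post depends on it). It adds the same three numbers in two different orders, assuming standard double-precision floating-point arithmetic. The variable names are just placeholders.

```python
# Addition is associative in algebra, but not in floating-point arithmetic.
# Both lines below compute 0.1 + 0.2 + 0.3; only the grouping differs.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(repr(left_to_right))
print(repr(right_to_left))
print("Identical?", left_to_right == right_to_left)  # prints: Identical? False
```

The two results differ only in the last digit or so, but that tiny gap is exactly the kind of thing that can grow when a calculation involves many steps or the subtraction of nearly equal numbers.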

The example I'll give is a really simple one. However, the lesson carries over to more interesting situations. For instance, the inversion of matrices that we encounter when applying the OLS estimator is a case in point. When you fit a regression model using several different statistics/econometrics computer packages, you sometimes get slightly different results. This is because the packages can use different numerical methods to implement the algebraic results that you're familiar with.
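As a rough illustration of how two implementations of the same algebra can disagree, the sketch below computes the OLS slope in a simple regression in two mathematically equivalent ways. This is not what any particular package does; the function names and the made-up data are my own assumptions, chosen so that the regressor has a large mean relative to its spread. The true slope is 3 by construction.

```python
def slope_shortcut(xs, ys):
    """OLS slope via raw sums: (sum(xy) - n*xbar*ybar) / (sum(x^2) - n*xbar^2)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    num = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar
    den = sum(x * x for x in xs) - n * xbar * xbar
    # Guard: cancellation can, in bad cases, drive the denominator to zero.
    return num / den if den != 0 else float("nan")


def slope_centered(xs, ys):
    """OLS slope via deviations from means: sum((x-xbar)(y-ybar)) / sum((x-xbar)^2)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den


# Hypothetical data: x has a huge mean and a tiny spread; y is exactly linear in x.
xs = [1e8 + i for i in range(10)]
ys = [2.0 + 3.0 * i for i in range(10)]

b_shortcut = slope_shortcut(xs, ys)
b_centered = slope_centered(xs, ys)

print("shortcut formula:", repr(b_shortcut))
print("centered formula:", repr(b_centered))
```

The centered version subtracts the means before squaring, so it works with small numbers throughout. The shortcut version squares values of the order of 1e8 first and then subtracts two enormous, nearly equal totals, which is where digits can be lost. Try changing the offset in `xs` and watch how the two answers move apart or come together.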

For me, the difference between a "good" package and a "not so good" package isn't so much the range of fancy techniques that each offers at the press of a few keys. It's more to do with what's going on "under the hood". Do the people writing the code know how to make that code (a) numerically accurate; and (b) numerically robust to different data scenarios?