This page describes R programs for computing the `adaptive trimmed mean' (ATM), an estimator used to obtain information, such as reference rates, from dealer markets.
This code is presently in use in two systems for capturing information from dealer markets:
As background on reference rates from dealer markets, you might find the following documents useful:
R is free software, so it is easy for you to install R and then run these programs. These pages might help you get started with R.
The main program is referencerate.R and it internally uses referencerate_lib.R. The `boot' library that comes with R is used for bootstrap estimation.
Bootstrap inference is a randomized algorithm, so its results are not exactly replicable from run to run: by default, each run of R seeds the RNG afresh. To reduce confusion, I seed the RNG with 1001 in referencerate.R. This ensures that every run produces the same numerical values.
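As a small illustration of why seeding matters, the sketch below draws the same random sample twice after seeding with 1001. The variable names are made up for this example and are not taken from referencerate.R.

```r
# Seed the RNG, draw a sample, then reseed and draw again.
set.seed(1001)
first <- sample(10, 5, replace = TRUE)

set.seed(1001)
second <- sample(10, 5, replace = TRUE)

# With the same seed, both draws are identical.
print(identical(first, second))
```

Without the two set.seed() calls, the two draws would generally differ.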
To run it for a file 3.data, I say:
$ cat referencerate.R | R --slave --args 3.data > 3.rout
To process both bid and offer, run the program twice: once on the bid data and once on the offer data.
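The command above hands the data file name to R after --args. As a rough sketch of how a script can pick up such a file (this is an assumption about the mechanics, not a copy of referencerate.R), the example below writes a small hypothetical data file with one value per line and reads it back with scan(). In a real run the file name would come from commandArgs(trailingOnly = TRUE)[1] instead of tempfile().

```r
# Hypothetical stand-in for a quotes file such as 3.data: one value per line.
quotes_file <- tempfile(fileext = ".data")
writeLines(c("428", "430", "431", "4431"), quotes_file)

# In a script run as `R --slave --args 3.data`, the name would instead be:
#   quotes_file <- commandArgs(trailingOnly = TRUE)[1]

# scan() reads whitespace-separated numbers, so one-per-line input works.
x <- scan(quotes_file, quiet = TRUE)

cat("Raw data contains", length(x), "points:\n")
print(x)
```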
The file 3.data contains one value per line, each obtained from one dealer. If you poll 10 dealers, this file will have 10 lines. The file 3.rout looks like this:
Raw data contains 21 points:
[1] 428 430 430 430 430 430 430 431 431 431 431 431 431 431 431
[16] 432 432 432 432 433 4431
Adaptive trimmed mean results --
Optimal trimming 4
Estimate 430.9231
Which has sigma of 19.93157
The 95% CI is 430.3846 - 431.3846
The data in 3.data consist of `normal' values plus one aberrant value (4431) as the last observation. The adaptive trimmed mean (ATM) chooses an optimal trimming of 4 (i.e. it drops the 4 lowest and the 4 highest observations and averages the remaining 13), and gives an estimate of 430.9231, which is essentially unaffected by that one aberrant observation of 4431.
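The arithmetic behind that estimate can be checked by hand. The sketch below uses the 21 values printed in the output and a hypothetical helper, trimmed_mean_k, that drops the k smallest and k largest values and averages the rest. It takes the trimming of 4 as given; it does not show how the ATM chooses that trimming.

```r
# The 21 raw values from the example output.
x <- c(428, rep(430, 6), rep(431, 8), rep(432, 4), 433, 4431)

# Hypothetical helper: drop the k smallest and k largest values, then average.
trimmed_mean_k <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 0, 2 * k < n)
  mean(sort(x)[(k + 1):(n - k)])
}

print(trimmed_mean_k(x, 4))  # 430.9231, matching the Estimate line
print(mean(x))               # about 621.3, dragged up by the 4431 outlier
```

Note that R's built-in mean(x, trim = ...) takes a fraction of observations rather than a count, which is why a count-based helper is used here.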
Bootstrap inference is used to report the standard deviation of the distribution of the ATM and the 95% confidence interval.
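A minimal sketch of this kind of bootstrap inference with the boot library follows. It fixes the trimming at 4 rather than choosing it adaptively, and the number of replications (1000), the percentile-type interval and the helper names are all assumptions of this example, so its numbers need not match the output of referencerate.R.

```r
library(boot)

x <- c(428, rep(430, 6), rep(431, 8), rep(432, 4), 433, 4431)
k <- 4  # trimming fixed by hand here; the real program chooses it adaptively

# Hypothetical helper: drop the k smallest and k largest values, then average.
trimmed_mean_k <- function(x, k) {
  n <- length(x)
  mean(sort(x)[(k + 1):(n - k)])
}

# boot() calls the statistic with the data and a vector of resampled indices.
trimmed_stat <- function(data, idx) trimmed_mean_k(data[idx], k)

set.seed(1001)
b <- boot(x, trimmed_stat, R = 1000)  # R = 1000 is an assumed replication count

sigma_hat <- sd(b$t[, 1])                                  # bootstrap std. dev.
ci <- boot.ci(b, conf = 0.95, type = "perc")$percent[4:5]  # percentile CI

cat("Estimate", b$t0, "\n")
cat("Which has sigma of", sigma_hat, "\n")
cat("The 95% CI is", ci[1], "-", ci[2], "\n")
```

On this dataset, a resample that happens to contain more than four copies of 4431 leaves an outlier inside the trimmed window. Such rare resamples can inflate the bootstrap standard deviation far beyond the width of a percentile interval, which may be why the output shows a sigma of about 19.9 next to a CI that is only about 1 wide.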
Here is another example, which uses raw data from MIBOR polling:
$ cat referencerate.R | R --slave --args 2.data
Raw data contains 19 points:
[1] 6.20 6.50 6.50 6.00 6.25 6.25 6.25 6.25 6.25 6.25 6.20 6.10 6.30 6.30 6.50
[16] 6.15 6.15 6.20 6.40
Adaptive trimmed mean results --
Optimal trimming 4
Estimate 6.245455
Which has sigma of 0.02935925
The 95% CI is 6.168182 - 6.286364
The file montecarlo.R is an example of setting up a Monte Carlo simulation to measure the MSE of these estimators. I simulate a dataset 1000 times from a mixture with 7 "normal" observations and 5 "noisy" observations, using N(0,1) errors for the "normal" observations and draws from t(2) for the "noisy" observations. The mean squared errors of the mean, the median and the ATM are then computed. To run it, say:
$ R --slave < montecarlo.R
In my experiment, it reports that the MSE of the mean is 0.3, that of the median is 0.136, and that of the ATM is the lowest at 0.127. You can modify montecarlo.R to run other experiments based on your curiosity. The file montecarlo.R also makes a pretty picture of the (kernel density estimator of the) sampling distribution of the three estimators.
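The design of such an experiment can be sketched in a few lines. This is not montecarlo.R itself: as an assumption of this example, a trimmed mean with a fixed trimming of 2 stands in for the ATM, and the function names are hypothetical, so the MSE figures it prints will differ from the ones quoted here.

```r
set.seed(1001)
n_rep <- 1000

# Fixed-trimming stand-in for the ATM (the real estimator picks k adaptively).
trimmed_mean_k <- function(x, k) {
  n <- length(x)
  mean(sort(x)[(k + 1):(n - k)])
}

# One simulated dataset: 7 "normal" N(0,1) draws plus 5 "noisy" t(2) draws.
one_draw <- function() {
  x <- c(rnorm(7), rt(5, df = 2))
  c(mean = mean(x), median = median(x), trimmed = trimmed_mean_k(x, 2))
}

est <- replicate(n_rep, one_draw())  # 3 x n_rep matrix of estimates

# The true location is 0, so the MSE is the average squared estimate.
mse <- rowMeans(est^2)
print(mse)
```

Passing a row of est to density() and then to plot() gives a kernel density estimate of that estimator's sampling distribution.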
Here is the tarfile. Three sample data files, and the results expected for these files, are included so that you can be sure it's running correctly.
This is free software; feel free to do whatever you want with it. The author obviously bears no legal liability for any mistakes or losses that you run into. If you use the software, do cite the paper mentioned above, and if you put it into production for some reference rates, do tell me about it so that I will augment the list of applications above.
Ajay Shah
ajayshah at mayin dot org