Monday, March 29, 2010
Meandian, part 2
There was a good thread three years ago on the weighting of data points.
To summarize: if you have 100 data points, then the mean treats each one with 1% of the weight. The median treats 98 of the data points at 0% of the weight, and the two middle data points get 50% of the weight each. As a middle ground, you can lop off the top and bottom 5% each, and then weight the remaining 90 data points at about 1.1% each.
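To make the three weighting schemes concrete, here is a minimal sketch in Python. The function names are made up for illustration, and the weights are assumed to apply to the data points after sorting them from lowest to highest.

```python
def mean_weights(n):
    """Mean: every one of the n points gets the same weight."""
    return [1.0 / n] * n


def median_weights(n):
    """Median: all the weight sits on the middle point(s) of the sorted data."""
    w = [0.0] * n
    if n % 2 == 1:
        w[n // 2] = 1.0
    else:
        w[n // 2 - 1] = 0.5
        w[n // 2] = 0.5
    return w


def trimmed_weights(n, k):
    """Trimmed mean: drop k points from each end, weight the rest equally."""
    kept = n - 2 * k
    if kept <= 0:
        raise ValueError("k is too large for this many points")
    return [0.0] * k + [1.0 / kept] * kept + [0.0] * k
```

With 100 points, `mean_weights(100)` puts 1% on everything, `median_weights(100)` puts 50% on each of the two middle points, and `trimmed_weights(100, 5)` puts 1/90 on each of the 90 survivors.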
So I was suggesting that there’d be a continuous function, where the closer you are to the median, the more the weight. So, rather than arbitrarily setting the 0% weight at the bottom/top 5% or the bottom/top 49%, as the two examples above do, I figured I would set it by the distance of the data point to the median. In a practical example, if someone forecasts Scot Shields for 200 IP while the median is 75, then the distance between the two points (125) would set the weight (say as 1/125, or 1/sqrt(125), etc). This is the opposite of RMSE, which gives more weight to the outliers.
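Here is a rough sketch of that distance-based weighting, again in Python. The function name `meandian` and the `power` and `floor` parameters are my own assumptions for illustration: `power=1.0` gives the 1/distance weight and `power=0.5` gives 1/sqrt(distance). The `floor` is not part of the original idea; it is only there because a forecast sitting exactly on the median has a distance of zero, and something has to stop the division from blowing up.

```python
import statistics


def meandian(values, power=1.0, floor=1.0):
    """Weighted average where weight shrinks with distance from the median.

    power: exponent on the distance (1.0 -> 1/d, 0.5 -> 1/sqrt(d)).
    floor: assumed minimum distance, so points at the median don't divide by zero.
    """
    med = statistics.median(values)
    weights = [1.0 / max(abs(v - med), floor) ** power for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical IP forecasts: four near 75 and one outlier at 200.
forecasts = [60, 70, 75, 80, 200]
print(meandian(forecasts, power=0.5))  # 1/sqrt(distance) weighting
print(meandian(forecasts, power=1.0))  # 1/distance weighting
```

One thing to watch with the plain 1/distance choice: each point's pull (weight times distance) comes out the same size no matter how far away it is, so when there are as many points above the median as below, the result lands right back on the median. The 1/sqrt(distance) version still lets the 200 IP forecast drag the answer up, just far less than the straight mean does.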
We had a good discussion on the matter in that thread, and I’m interested in what the new blood out there thinks after reading it.

