Ramblingsthe_s3ntinelFebruary 22 2010, 07:10:50 UTC
I don't think the term 'fit' necessarily connotes 'getting a line to go through the dots exactly'. In fact, 'regression' is basically a synonym for model-fitting. Also, while it is certainly possible to find a 9th-order polynomial that will 'join up the dots' this is something you'd rarely want to do, except perhaps as an exercise in linear algebra.
Typically in statistics, one wants to find a model with as few parameters as possible, but which explains most of the variability in the data.
The way to do a regression (to a cubic polynomial) is not to find polynomial that will go exactly through four points and then just 'see what happens' with the rest. I don't really know what your project wants you to do, so don't assume that the following is necessarily what they're after, but for your edification, here is a description of how to do a 'least squares' regression to a cubic:
Suppose the model you're trying to fit is as follows: y = α3x3 + α2x2 + α1x + α0 + (error term). Let α denote the column vector (α0 α1 α2 α3)T. First you calculate a 9 x 4 matrix X whose ij-th entry is xij-1, where xi is the x-coordinate of your i-th data point. Let y denote the 9-element column vector containing the y-coords of your data points.
Then to fit the model by 'least squares', you perform the following matrix multiplication:
α = (XT X)-1 XT y
Define zi = α3xi3 + α2xi2 + α1xi + α0 (which can be more compactly written as just z = X α). Then one can prove that the values of the α's above are such as to minimize Sum(i from 1 to 9) (yi - zi)2. (That's what 'least squares' means.)
Re: RamblingschreeskoFebruary 22 2010, 08:31:58 UTC
Thank you for the explanation. This is something that I last studied years ago (and even then I didn't have the best grasp of it) so I appreciate the clarification.
For my specific problem, I've been given a set of data points and told to create an equation that fits the graph. I decided to go the tedious linear algebra route because I wasn't sure what order polynomial my equation would have, and I thought it would be the most thorough method to ensure I didn't miss anything (and I also couldn't think of another way to do it). Based on your explanation, though, it seems that another way would be to go through and do a least squares regression to a cubic, quartic, etc. until I find the best fit.
Typically in statistics, one wants to find a model with as few parameters as possible, but which explains most of the variability in the data.
The way to do a regression (to a cubic polynomial) is not to find polynomial that will go exactly through four points and then just 'see what happens' with the rest. I don't really know what your project wants you to do, so don't assume that the following is necessarily what they're after, but for your edification, here is a description of how to do a 'least squares' regression to a cubic:
Suppose the model you're trying to fit is as follows: y = α3x3 + α2x2 + α1x + α0 + (error term). Let α denote the column vector (α0 α1 α2 α3)T. First you calculate a 9 x 4 matrix X whose ij-th entry is xij-1, where xi is the x-coordinate of your i-th data point. Let y denote the 9-element column vector containing the y-coords of your data points.
Then to fit the model by 'least squares', you perform the following matrix multiplication:
α = (XT X)-1 XT y
Define zi = α3xi3 + α2xi2 + α1xi + α0 (which can be more compactly written as just z = X α). Then one can prove that the values of the α's above are such as to minimize Sum(i from 1 to 9) (yi - zi)2. (That's what 'least squares' means.)
Reply
For my specific problem, I've been given a set of data points and told to create an equation that fits the graph. I decided to go the tedious linear algebra route because I wasn't sure what order polynomial my equation would have, and I thought it would be the most thorough method to ensure I didn't miss anything (and I also couldn't think of another way to do it). Based on your explanation, though, it seems that another way would be to go through and do a least squares regression to a cubic, quartic, etc. until I find the best fit.
Reply
Leave a comment