Method of least squares

Some of you will ask: where do the universal constants that physicists put into their formulas come from? Simple: they are extracted from experiments.

When one proposes a formula to explain a natural process and uses it to calculate, the results obtained in practice never exactly match the formula. This is due to small measurement errors introduced by the devices used or, simply, to human error.

Therefore, what is usually done is to average the values obtained to estimate the real value and, thus, to adjust the formula or the constant used in it. Remember that the constants in a formula are multipliers that adjust it to the real values obtained; “ñapas” (botched fixes), as an engineer would say, forced by the instrumentation.

So let’s imagine that we do an experiment to measure the acceleration of gravity (either with a pendulum or by dropping a known weight from a known height). If we do the experiment n times we will get n values, all of which fall within some range.

Suppose the constant to be measured is g and we repeat the experiment n times, obtaining the values g_{1}, g_{2},..., g_{n}. Mathematically speaking, what value should we take for g?

Yesterday, although you may not remember it, I talked about the optimization of values, so we can start from there. We can choose the value that minimizes the distance between g and each of the measured values. That is, we have the following function to minimize:

F=\sum_{i=1}^{n}|g-g_{i}|

If we use this formula, which in principle should give us the correct value of g by minimizing the distance, we see that it has a serious problem: we need an enormous number of points. That is, the more times we repeat the experiment and the more values we obtain, the better our estimate of g. For example, with only 2 values, any g in the interval between them minimizes F, and that interval can be very large. In short, that does not work.
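To see the problem with two values concretely, here is a minimal Python sketch. It assumes the distance function is the sum of absolute differences; the name `abs_cost` and the two measurements are made up for illustration and are not from any real experiment.

```python
def abs_cost(g, samples):
    """Sum of absolute distances between a candidate g and each measurement."""
    return sum(abs(g - gi) for gi in samples)

# Hypothetical measurements of g in m/s^2 (made up for illustration).
samples = [9.7, 9.9]

# Candidate values spread across the interval between the two measurements.
candidates = [9.70, 9.75, 9.80, 9.85, 9.90]
costs = [abs_cost(g, samples) for g in candidates]

for g, cost in zip(candidates, costs):
    print(f"g = {g:.2f}  cost = {cost:.4f}")

# A candidate outside the interval, for comparison.
print(f"g = 10.00  cost = {abs_cost(10.0, samples):.4f}")
```

Every candidate inside the interval gets the same cost (up to floating-point rounding), so this function cannot single out one value of g, while the candidate outside the interval is clearly worse.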

How can we improve it? By making the function F grow faster away from the minimum, so that the minimum is better defined. For example, using:

F^{2}=\sum_{i=1}^{n}(g-g_{i})^{2}

Because each distance is squared, minimizing this function pins down the value of g more precisely. In addition, the new function is smooth and, as we have seen, the maxima and minima of a smooth curve lie at its critical points; in our case the critical point will be a minimum.

And how do we know it will be a minimum? Just look at the square function, whose only critical point is its minimum. Each term of our function is a parabola opening upward, and a sum of such parabolas is again a parabola opening upward, so exactly the same thing happens. Very simple.
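As a quick check of the single-minimum claim, the following Python sketch evaluates the sum of squared distances for two made-up measurements. The name `squared_cost` and the sample values are hypothetical, chosen only for illustration.

```python
def squared_cost(g, samples):
    """Sum of squared distances between a candidate g and each measurement."""
    return sum((g - gi) ** 2 for gi in samples)

# Hypothetical measurements of g in m/s^2 (made up for illustration).
samples = [9.7, 9.9]

# Candidate values spread across the interval between the two measurements.
candidates = [9.70, 9.75, 9.80, 9.85, 9.90]
costs = [squared_cost(g, samples) for g in candidates]

# Pick the candidate with the smallest cost.
best = candidates[costs.index(min(costs))]

for g, cost in zip(candidates, costs):
    print(f"g = {g:.2f}  cost = {cost:.4f}")
print(f"best candidate: {best:.2f}")
```

Even with only two values, the squared version has one clear winner among the candidates: the midpoint of the interval.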

Returning to the topic, let’s differentiate the new function with respect to g:

0=\left.\frac{\partial F^{2}}{\partial g}\right|_{g=\bar{g}}=\sum_{i=1}^{n}2(\bar{g}-g_{i})=2n\bar{g}-2\sum_{i=1}^{n}g_{i}

There are a couple of new things here. First, we have differentiated with respect to g (obviously). Second, we have introduced \bar{g}, the value of g at which the derivative vanishes; physically, this is the value we take for g, that is, \bar{g} = g.

From the above equation we solve for the mean value of g, that is, \bar{g}, and we get:

\bar{g}=\frac{1}{n}\sum_{i=1}^{n}g_{i}

Our intuition tells us this is just the average value, but what we have actually done is apply the method of least squares.
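The result can be checked numerically. The sketch below assumes a handful of made-up measurements and uses hypothetical helper names (`squared_cost`, `squared_cost_derivative`); it verifies that the derivative of the sum of squares vanishes at the arithmetic mean and that moving away from the mean in either direction increases the cost.

```python
def squared_cost(g, samples):
    """Sum of squared distances between a candidate g and each measurement."""
    return sum((g - gi) ** 2 for gi in samples)

def squared_cost_derivative(g, samples):
    """Derivative of squared_cost with respect to g."""
    return sum(2 * (g - gi) for gi in samples)

# Hypothetical measurements of g in m/s^2 (made up for illustration).
samples = [9.78, 9.83, 9.79, 9.85, 9.80]

# The arithmetic mean of the measurements.
g_bar = sum(samples) / len(samples)

slope_at_mean = squared_cost_derivative(g_bar, samples)
cost_at_mean = squared_cost(g_bar, samples)
cost_left = squared_cost(g_bar - 0.01, samples)
cost_right = squared_cost(g_bar + 0.01, samples)

print(f"g_bar = {g_bar:.4f}")
print(f"slope at g_bar = {slope_at_mean:.2e}")  # zero up to rounding error
print(f"minimum at g_bar: {cost_at_mean < cost_left and cost_at_mean < cost_right}")
```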

In the end this is nothing more than algebra, and therefore a matter of vectors. Simply imagine the measured values of g as a vector, \vec{g}=(g_{1}, g_{2},..., g_{n}), in a vector space of n dimensions. In addition, take the vector \vec{v}=(1, 1,...,1), all of whose components in this n-dimensional space are 1. Note that \vec{v} is not a unit vector: its length is \sqrt{n}.

If we project the vector \vec{g} onto \vec{v} we have the following:

\vec{g_{v}}=\frac{\vec{g}\cdot \vec{v}}{|\vec{v}|^{2}}\vec{v}

This gives exactly the same result as before: since \vec{g}\cdot\vec{v}=\sum_{i=1}^{n}g_{i} and |\vec{v}|^{2}=n, every component of \vec{g_{v}} equals \bar{g}. Great!
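The projection can also be tried out in a few lines of plain Python. This is only a sketch under assumptions: the helper names `dot` and `project` and the measurements are hypothetical, and vectors are represented as ordinary lists.

```python
def dot(a, b):
    """Dot product of two vectors given as lists of equal length."""
    return sum(x * y for x, y in zip(a, b))

def project(g, v):
    """Projection of g onto v: (g . v / |v|^2) v."""
    scale = dot(g, v) / dot(v, v)
    return [scale * x for x in v]

# Hypothetical measurements of g in m/s^2 (made up for illustration).
samples = [9.78, 9.83, 9.79, 9.85, 9.80]

# The all-ones vector; its squared length equals the number of measurements.
ones = [1.0] * len(samples)

projection = project(samples, ones)
mean_value = sum(samples) / len(samples)

print(f"mean = {mean_value:.4f}")
print("projection =", [f"{x:.4f}" for x in projection])
```

Every component of the projection comes out equal to the arithmetic mean, because dividing the dot product by |v|^2 = n is exactly the division by n in the average.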