Claus Johansen, Sønderborg, Denmark, 20160330 > 20200528
Overview 
Contents:
2) Examples of inconvenience by the traditional method
3) Least Squares Fitting  Perpendicular Offsets
5) Comparison of the two formulas
6) Rearrangement of the new formula
7) Comparison of the three methods
8) Implementing in a spreadsheet
9) Are the inconveniences eliminated with this "Perpendicular Offsets" method?
10) The Coefficient of Determination, an additional argument for the "Perpendicular Offsets" method
11) Practical handling of the double solutions for the "Perpendicular Offsets" method
12) Practical formulation of the three solutions for the "Perpendicular Offsets" method
Least square method (linear) is incorporated as a standard tool in all common spreadsheets. The extracted line is often called "Trendline". The method acts by minimising the sum of the squared vertical distances, between the line that best represents the data, and the data points. This traditional method has several inconveniences and excels strictly speaking only by being easy to use via the integration into spreadsheets.
It's therefore suggested to use another variant of least square method, where the sum of the squared offsets (or distances) perpendicular to a line is minimised. It will be shown how to deduce the formula and how it's easily incorporated into a spreadsheet.
Fig. 1.1: Vertical offsets.  Fig. 1.2: Perpendicular offsets. 
Below there will be shown three examples of inconveniences by the traditional method.
A quite random example on some data:


In the equation y = ax + b in the upper left corner, a is the slope. In this example the slope is equal to 0.433333, but it's the same as tangent to the angle of the line with the xaxis (inclination), thereby the angle is:
θ = atan(0.433333) = 23.4287°
Let as an example all points rotate 25°:


In the equation y = ax + b in the upper left corner, a is the slope. In this case the slope is equal to 1.00286, but it's the same as tangent to the angle of the line with the xaxis (inclination), thereby the angle is:
θ = atan(1.00286) = 45.082°
But hello, before the rotation the angle was 23.4287°, thereafter the datapoints were rotated 25°, then the new trendline should lie in an angle of: 23.4287° + 25° = 48.4287°, but the angle is 45.082°  that isn't too convincing!
Below the rotated trendline (red) is rotated back and superimposed the original data and the initial trendline (blue). The trendline from the rotated data has a clearly smaller slope:
Fig. 2.1.2: Comparison with rotated data. 
Invert as an example the data by swapping x and y values:


In the equation y = ax + b in the upper left corner, a is the slope. In this case the slope is equal to 1.426829.
But ahoy, before the inverting the slope was 0.433333, thereafter the data were inverted, then the new trendline ought to have a slope of: 1/0.433333 = 2.307692, but the slope is 1.426829  it doesn't seem trustworthy!
Below the inverted trendline (red) is inverted back and superimposed the original data and the initial trendline (blue). The trendline from the inverted data has a clearly larger slope:
Fig. 2.2.2: Comparison with inverted data. 
In the example the angle of the trendline was found to be:
θ = atan(0.433333) = 23.4287°
Yes, but rotate then all datapoints in reverse sense to this angle, that is 23.4287°:


Now the least squares method ought to give a horizontal line, but that's not the case. in the equation y = ax + b in the upper left corner, a is the slope  if the line was horizontal would it be 0. The slope of the line is 0.035, and it's equal to tangent to the angle of the line with the xaxis (inclination), thereby the angle is:
θ = atan(0.0350657) = 2.008°
A small angle yes, but not zero.
Maybe not so important, but another indication for the line to have a slope, is that the coefficient of determination R^{2} isn't zero.
A condition for finding the line with the least sum of squared perpendicular distances to a pointcloud of data, as shown below, is that you know / assume / demand that the line passes through the centre of gravity of the pointcloud. This assumption will not be proved here but makes good sense if you think of the pointcloud as a physical body and the deduced lines as principal axis of inertia.
The centre of gravity of the pointcloud is found, in a spreadsheet it will typically be easiest to use the "average" function. There can be created a new coordinate system (s, t) and data can be transformed from the (x, y) coordinate system to this new (s, t) coordinate system.
As the mean values are symbolised by and the new coordinates are generated by:
(3.1)  
(3.2) 
Fig. 3.2.1: The centre of gravity with the new coord.sys.  Fig. 3.2.2: Data in (s, t) coord.sys. 
Now let a new coordinate system (u, v) be positioned in an arbitrary angle θ relatively to the (s, t) coordinate system (it is thereby rotated around the centre of gravity):
Fig. 3.3.1: A new rotated coordinate system.  Fig. 3.3.2: Data in (u, v) coord.sys. 
Given that the uaxis symbolizes the searched line (with a minimum sum of squared perpendicular distances) then the perpendicular distances will simply be the value of the vcoordinate.
The general formulas for rotating the coordinates, from (s, t) to (u, v) by the angle θ, reads:
(3.3)  
(3.4) 
Equation 3.3 will strictly not be used but is included for completeness sake. "v", i eq. 3.4, is as mentioned the perpendicular distance to the line or the uaxis. it's the square of this distance that will be used:
(3.5)  
⇔  
(3.6) 
Equation 3.6 is then the squared distance for an arbitrary point, but it's the sum of the squares of the distances, that shall be minimised. To avoid writing the somewhat awkward summations several times, they will be replaced by these compact expressions:
(3.7)  
(3.8)  
(3.9)  
(3.10) 
When eq. 3.7, 3.8, 3.9 and 3.10 are inserted in eq. 3.6, then the sum of squared perpendicular distances can be written as shown below  it this equation where a minimum must be found:
(3.11) 
When the minimum (and/or maximum) of a function is sought, then differential coefficient is set to zero. The function above is easy to differentiate  it consist of sine and cosine plus some constants:
(3.12)  
⇔  
(3.13) 
From here you can proceed with two formulas for trigonometric functions with double argument:
(3.14)  
(3.15) 
When eq. 3.14 and 3.15 are inserted in eq. 3.13, it can be written as:
(3.16) 
If this differential coefficient is set equal to zero, it leads to the following equation, with starting point in eq. 3.16:
(3.17)  
⇔  
(3.18)  
⇔  
(3.19) 
From here you can proceed in different directions, that will lead to various solutions. You can square both sides in the equation and thereafter by means of the Pythagorean identity replace cos^{2} by sin^{2} or vice versa. The drawback is that the solutions in reality will hold two ± signs and thereby give 4 solutions, that needs to be checked. You can also just divide by cos on both sides of the equation and thereby get the result directly with the tangent function. This is done here:
(3.20)  
⇔  
(3.21) 
Another source, that I will refer to in the next section, uses a property "B", to express the right side of eq. 3.21. In that way the equation will look like:
(3.22) 
Where:
(3.23) 
The angle of the line (relative to the Xaxis) becomes:
(3.24) 
As the slope is equal to tangent to the angle of the line to the Xxis, it becomes:
(3.25) 
As it's mentioned above between eq.3.19 and 3.20, and as it's below in § 4 "A second source", there will generally be at least one ± sign in the solution. It's a consequence when looking for a minimum involves a differentiation and this differential is set to zero. But thereby there can't be distinguished between a minimum and a maximum  when there are two solutions they will represent a minimum and a maximum, respectively.
Then what about the solution represented here  in eq. 3.22 there is no ± sign in the solution? No, but
tan(2θ) can have values between ∞ and +∞ and the argument, 2θ will therefore in
the principal interval, with the function arcus tangent / arctan / atan take values between 90° and +90°,
the desired angle, θ will thereby take values between 45° and +45°. But it can't be true, that the
desired line can't be steeper than ±45°? No, the other solution that should be checked is the complementary
angle, that is the angle, that together with the angle found sums up to 90°.
θ + θ_{c} = π/2 (or 90°).
The angle (or the other solution) that should be checked is then:
(3.26) 
Ergo, whether you use the formula deduce here in § 3 or the formula in § 4 "A second source", there'll be two solutions that must be checked. However, the check can relatively easy be done, with the expressions used here in § 3.
In § 11 it's shown, how to decide which of the double solutions to use  without checking both solutions!
In § 6 the formula is rearranged to an expression, that don't need an evaluation of two solutions with a following check.
Mathworld gives the formula below, quote from:
Weisstein, Eric W. "Least Squares FittingPerpendicular Offsets."
From MathWorldA Wolfram Web Resource.
http://mathworld.wolfram.com/LeastSquaresFittingPerpendicularOffsets.html
When:
(4.1) 
Then the slope of the line can be found as:
(4.2) 
By a closer look, it can be seen, that the complex expression for "B", i eq. 4.1, is due to there isn't transformed to a (s, t) coordinate system and that the summations aren't written with the compact expressions of the type "SS", that were used in § 3. The equations 3.23 and 4.1 are actually the same, so eq. 4.1 might as well be written as:
(4.3) 
The quantity "b" in eq. 4.2 can be a bit confusing, as b is often used for the intercept with the Yaxis in equation for a line:
(4.4) 
But the slope is equal to tangent to the inclination (the angle to the Xaxis), so eq. 4.2 could as well be written as:
(4.5) 
The inclination (the angle to the Xaxis) is then:
(4.6) 
Can it really be true, that the two formulas actually tell the same?
(3.22)  
(4.5) 
Here again you can use a formula for trigonometric functions with double argument  this time tangent:
(5.1) 
Combine eq. 3.22 with 5.1 to get:
(5.2)  
⇔  
(5.3)  
⇔  
(5.4) 
You will end with a quadric equation, with tan(θ) as unknown. The Standard solution can be written as:
(5.5)  
⇔  
(5.6) 
It is thereby easy to transform the first solution, with 2θ to same solution as the second source, med θ alone.
Whatever you have used the first or second formula, it will be natural to evaluate the coefficient of determination (= Coefficient of correlation, squared), that can be composed with some of the previous used properties:
(5.7) 
It might seem as a trivial variant to use the atan2 function in the new formula, but it will ease the practical use:
If you take the new formula, as it was formulated in eq. 3.22, you will from here come directly to the following expression:
(3.22)  
⇔  
(6.1) 
From here you can insert the expression for "B", but it's more straightforward to utilise the formula as it was formulated in the previous eq. 3.21 and continue from there:
(3.21)  
⇔  
(6.2)  
⇔  
(6.3) 
Where:
(6.4)  
(6.5) 
This last formulation seems to be the most streamlined for practical use, that is eq. 6.3, 6.4 and 6.5.
As the slope is equal to tangent to the angle with the xaxis, it can now be written as:
(6.6) 
The method is discussed in § 4 with source reference with an explanation of the property "B". For the line with the least sum of squared perpendicular offsets, the following is true:
The slope is:
(4.5) 
Or the inclination of the line (relative to the xaxis Xaxis) is:
(4.6) 
In addition to this the line goes through the centre of gravity of the points ( , ).
As the following identity applies:
(7.1) 
Then the following relation must be true:
(7.2) 
The plus in eq. 4.5 must thereby include all the positive solutions, that is positive slopes/inclinations:
(7.3) 
And the minus in eq. 4.5 must include all the negative solutions, that is negative slopes/inclinations:
(7.4) 
If the coordinate system is turned a full 360°, the de positive solutions from eq. 7.3 will show the following course:
Fig. 7.a.1: Positive solutions. 
If the coordinate system is turned a full 360°, the de negative solutions from eq. 7.4 will show the following course:
Fig. 7.a.2: Negative solutions. 
If the coordinate system is turned a full 360°, both the positive and negative solutions, as shown in eq. 4.6 and again individually in eq. 7.3 and eq. 7.4 will show the following courses:
Fig. 7.a.3: Positive and negative solutions. 
The line that has the least sum of squared perpendicular offsets, has an inclination of approx. 25.610° in the initial position, so when the coordinate system is turned by this angle, the solution must show an inclination of 0°  by further turning the solution must be negative, and the solution must in this wise inevitably move back and forth between the positive and negative solutions. By evaluating both solutions and thereafter selecting the one with the least sum of squared perpendicular offsets, the right solution is chosen. Below are the positive and negative solutions from the two above figures, Fig. 7.a.1 & 7.a.2, superimposed the correct solution, marked by a fat yellow line:
Fig. 7.a.4: All solutions. 
The method was deduced in § 3 with an explanation of the property "B". For the line with the least sum of squared perpendicular offsets, the following is true:
The slope is:
(3.25) 
Or the inclination is:
(3.24) 
In addition to this the line goes through the centre of gravity of the points ( , )
As mentioned before in § 3, the function arcus tangent will give values between 90° and +90°,
thereby will the half angle, as used in eq. 3.24 & 3.25 have values between 45° and +45°. The other solution
that must be checked is the complementary angle, that is the angle, that together with the evaluated, adds up to
90°.
θ + θ_{c} = π/2 (or 90°).
The angle (or the other solution) that must be checked is:
(3.26) 
If the coordinate system is turned a full 360°, the directly found solution, as shown in eq. 3.24 & 3.25, shows the following course:
Fig. 7.b.1: Direct solutions. 
If the coordinate system is turned a full 360°, the complementary solutions, as shown in eq. 3.26, shows the following course:
Fig. 7.b.2: Complementary solutions. 
If the coordinate system is turned a full 360°, both the directly found and the complementary solutions, as shown in eq. 3.24 and 3.26, will show the following course:
Fig. 7.b.3: Direct and Complementary solutions. 
By evaluating both solutions and thereafter selecting the one with the least sum of squared perpendicular offsets, the right solution is chosen. Below are the direct and complementary solutions from the two above figures, Fig. 7.b.1 & 7.b.2, superimposed the correct solution, marked by a fat yellow line:
Fig. 7.b.4: All solutions. 
All things considered it may not be a new method, as it's just a rewording of the initial formula for the "new" method, that was deduced in § 3, but as this solution behaves significantly differently from the two previous solution, I will nevertheless count it as an independent method.
The method was deduced in § 6 with an explanation of the properties "X" and "Y". For the line with the least sum of squared perpendicular offsets, the following is true:
The slope is:
(6.6) 
Or the inclination is:
(6.3) 
In addition to this the line goes through the centre of gravity of the points ( , )
If the coordinate system is turned a full 360°, the solutions, as shown in eq. 6.3 & 6.6 above, will show the following course:
Fig. 7.c.1: Positive solutions. 
There are in the solution no immediate other solution that must be checked, but in principle you can't know if the equations 6.3 & 6.6 gives a maximum or minimum. An alternative solution could be:
(7.5)  
(7.6) 
It is obvious, that this solution is 90° offset compared to the previous solution, with (X, Y). The direction to (X, Y) is diametrically opposite to (X, Y), that is 180° offset and as there is multiplied by ½ to find θ this solution is offset 90°.
If the coordinate system is turned a full 360°, both solutions, as shown eq. in 7.5 & 7.6 above, will show the following course:
Fig. 7.c.2: Negative solutions, offset 90°. 
By evaluating both solutions and thereafter selecting the one with the least sum of squared perpendicular offsets, the right solution is chosen. Below are the direct "positive" and 90° offset "negative" solutions from the two above figures, Fig. 7.c.1 & 7.c.2, superimposed the correct solution, marked by a fat yellow line:
Fig. 7.c.3: Alle solutions. 
Contrary to the two previous methods, there is no need to check two solutions. The solution called "positive", with (X, Y), gives directly the solution with the least sum of squared perpendicular offsets. The solution called "negative", med with (X, Y), gives directly the solution with the largest sum of squared perpendicular offsets. The fact that it isn't necessary to find two solutions and thereafter check the offsets to select the correct one, represents after all a considerable simplification.
Below is shown an example on how the formulas can be implemented in a spreadsheet:
There is used some energy on avoiding nonsense calculations, especially divisions by zero. This leads to some formulas becomes somewhat awkward, with one or several "if" expressions and "NaN" as result in these cases:
Nr.  Equation.  Eq. no.  Formula i spreadsheet.  Action. 

1        Data. 
2 

8.1 8.2 
I2=average(A:A) J2=average(B:B) 
Average values for x and y data. 
3 

3.1 3.2 
C2=A2I$2 D2=B2J$2 
(s, t) coordinates. 
4      E2=C2^2 F2=D2^2 G2=C2*D2 
The 3 (s, t) squares. 
5 

3.8 3.9 3.10 
M2=sum(E:E) N2=sum(F:F) O2=sum(G:G) 
Sum of the 3 (s, t) squares. 
6 

6.4 6.5 
I5=M2N2 J5=2*O2 
Denominator and numerator in arcus tangent function are computed. 
7      I7=if('Ark1'!I5=0,if('Ark1'!J5=0,1,0),0)  It's tested if data has a "direction". 
8  6.3  I9=if(I7=0,(1/2)*atan2(I5,J5),"NaN")  The inclination (Xaxis) "θ" is computed.  
9      I10=if(I7=0,degrees(I9),"NaN")  Optionally convertion to degrees "Θ". 
10  6.6  I11=if(I7=0,if(I9=pi()/2,"NaN",tan(I9)),"NaN")  Slope of the line.  
11  8.3  I12=if(I7=0,if(I9=pi()/2,"NaN",$J2$I2*I11),"NaN")  Intercept at Yaxis.  
12  8.4  I13=if(I7=0,if(I9=0,"NaN",if(I9=pi()/2,I2,$I2$J2/I11)),"NaN")  Intercept at Xaxis.  
13  5.7  I14=if(M2*N2=0,"NaN",O2^2/(M2*N2))  The Coefficient of Determination. 
Line 7: If both X and Y (evaluated in line 6) is equal to zero the data has no "direction" and all lines (passing through the CoG) will represent the correlation between x and y data just as well. In this case the value of the cell is set to 1, otherwise it's set to 0.
Line 8: If the check of a "direction" of the data is o.k., then the inclination "θ" is calculated.
Line 9: If the check of a "direction" of the data is o.k., then the inclination is converted to degrees (from radians, calculated in line 8).
Line 10: If the check of a "direction" of the data is o.k. and if the angle differ from π/2 (90°) then the slope is evaluated.
Line 11: If the check of a "direction" of the data is o.k. and if the angle differ from π/2 (90°) then the intercept with the Yaxis is calculated.
Line 12: If the check of a "direction" of the data is o.k. and if the angle differ from 0 then the intercept with the Xaxis is calculated. If the angle is equal to π/2 (90°) then the Xintercept is equal to the average value of x coordinates. If the angle is different from π/2 then the Xintercept is calculated with the formula.
Line 13: If the denominator, the result of SS_{ss}·SS_{tt}, isn't zero, the coefficient of determination is calculated.
The table above can be written somewhat more intuitive, if it's implied, that the relevant cells are linked to a parameter name. When in the table above, at the top says "I2=average(A:A)" and it thereby is understood that the parametername "xm" is linked to this cell, then it will in the corresponding cell below, at the top be written as "xm=average(A:A)":
Nr.  Equation.  Eq. no.  Formula i spreadsheet.  Action. 

1        Data. 
2 

8.1 8.2 
xm=average(A:A) ym=average(B:B) 
Average values for x and y data. 
3 

3.1 3.2 
C2=A2xm D2=B2ym 
(s, t) coordinates. 
4      E2=C2^2 F2=D2^2 G2=C2*D2 
The 3 (s, t) squares. 
5 

3.8 3.9 3.10 
SSss=sum(E:E) SStt=sum(F:F) SSst=sum(G:G) 
Sum of the 3 (s, t) squares. 
6 

6.4 6.5 
X=SSssSStt Y=2*SSst 
Denominator and numerator in arcus tangent function are computed. 
7      Err=IF(X=0,IF(Y=0,1,0),0)  It's tested if data has a "direction". 
8  6.3  θ=if(Err=0,(1/2)*atan2(I5,J5),"NaN")  The inclination (Xaxis) "θ" is computed.  
9      Θ=if(Err=0,degrees(θ),"NaN")  Optionally convertion to degrees "Θ". 
10  6.6  m=if(Err=0,if(I9=pi()/2,"NaN",tan(I9)),"NaN")  Slope of the line.  
11  8.3  b=if(Err=0,if(θ=pi()/2,"NaN",ymxm*m),"NaN")  Intercept at Yaxis.  
12  8.4  a=if(Err=0,if(θ=0,"NaN",if(θ=pi()/2,xm,xmym/m)),"NaN")  Intercept at Xaxis.  
13  5.7  R²=if(SSss*SStt=0,"NaN",SSst^2/(SSss*SStt))  The Coefficient of Determination. 
In § 2 some inconveniences by the "traditional" method were demonstrated. Here it will be shown that this "Perpendicular Offsets" method don't have these problems.
The quite random example on some data, that were used earlier:


In the equation y = ax + b , in the upper left corner, a is the slope. In this example the slope is equal to 0.479345, but it's the same as tangent to the angle of the line with the xaxis (inclination), thereby the angle is:
θ = atan(0.479345) = 25.6105°
It's the angle, that's found in the spreadsheet above in § 8.
Let as an example all points rotate 25°:


In the equation y = ax + b , in the upper left corner, a is the slope. In this case the slope is equal to 1.217874, but it's the same as tangent to the angle of the line with the xaxis (inclination), thereby the angle is:
θ = atan(1.217874) = 50.6105°
Before the rotation the angle was 25.6105°, thereafter the datapoints were rotated 25°, then the new trendline should lie in an angle of: 25.6105° + 25° = 50.6105°, which is the case  this inconvenience by the "traditional" method is eliminated by the "Perpendicular Offsets" method.
Invert as an example the data by swapping x and y values:


In the equation y = ax + b in the upper left corner, a is the slope. In this case the slope is equal to 2.086182.
before the inverting the slope was 0.479345, thereafter the data were inverted, then the new trendline ought to have a slope of: 1/0.479345 = 2.086180, which is the case  this inconvenience by the "traditional" method is eliminated by the "Perpendicular Offsets" method.
In the example the angle of the trendline was found to be:
θ = atan(0.479345) = 25.6105°
Yes, but rotate then all datapoints in reverse sense to this angle, that is 25.6105°:


Least squares method  Perpendicular Offsets predicts exactly the amount the pointcloud must be rotated, to give a horizontal line. in the equation y = ax + b in the upper left corner, a is the slope  when the line is horizontal it's just 0.
An additional consequence is, that the coefficient of determination R^{2} becomes zero.
The value of Coefficient of Determination, as a measure of the quality of the fit, is disputed, but it's commonly used. The formula for the Coefficient of Determination was first shown at the bottom of § 5, as eq. 5.7. It's of course also shown in § 8, where the use of the formulas for the Least Squares Fitting  Perpendicular Offsets is illustrated.
(5.7) 
In § 2 it was shown, that with the dataset, that's generally used for illustrating the method here, you will, with the traditional method, that minimises the sum of the squares of the vertical distances to the line that "best" describes data and the individual datapoints, obtain a line that has the angle:
θ = atan(0.433333) = 23.4287°
Thereafter it was in § 2.2 shown, that when the dataset was inverted, which actually correspond to minimising the sum of the squares of the horizontal distances to the line that "best" describes data and the individual datapoints, obtain a line that has the angle:
θ = atan(1/1.426829) = atan(0.700855) = 35.0249°
In § 8 and again in § 9 it was shown, that when you minimise the sum of the squares of the perpendicular distances to the line that "best" describes data and the individual datapoints, obtain a line that has the angle:
θ = atan(0.479345) = 25.6105°
If the coordinate system is turned a full 360°, the coefficient of determination will vary as shown below:
Fig. 10.1: The coefficient of determination, the coordinate system is rotated. 
The coefficient of determination is equal 4 times during this one rotation, that is when the "trendline" becomes horizontal or vertical. If we concentrate on the first zeropoint, is it then at 23.4287°, 25.6105° or 35.0249°? Well, it's at least not 35.0249°, it's easily seen in the figure above  but then the two other solutions?
If you zoom in on the range 24.6°  26.6°, then it can clearly be seen that the zeropoint is close to the 25.6° as the perpendicular method predicted:
Fig. 10.2: The coefficient of determination, the coordinate system is rotated, detail. 
Is the coordinate system then rotated the 25.6105° both methods (min. vertical and min. perpendicular offsets) agree on, that the coefficient of determination is equal to zero. Vertical distance below to the left and perpendicular offsets below to the right:
Fig. 10.3: Coord.sys. roteret 25.6105°, vertical method.  Fig. 10.4: Coord.sys. roteret 25.6105°, perpendicular method. 
Again another property that speaks for the "perpendicular method".
The new formula, as it was first deduced in § 3 as eq. 3.24, has a double solution as eq. 3.26. Both solutions must in principle be investigated and thereafter shall the one with the least sum of squared distances be selected.
(3.24) 
(3.26) 
How the "correct" solution switches from the one case to the other was illustrated in fig. 7.b.4 in § 7b:
Fig. 7.b.4: All solutions. 
It's a bit clearer to see, if the solution that's not used is set to zero, as shown below. At the same time is the parameter X shown, as it was defined as the numerator in B, in § 6 as eq. 6.4:
(6.4) 
Fig. 11.1: All solutions & the parameter X. 
The parameter X follows, as it can be seen, precisely the swap between the directly found angle from eq. 3.24, in the graph named θd, and the complementary angle, in the graph named θc. But that means, that you don't have to automatically compute both solutions, you will do with a simple "if, else" structure, depending of which system you work in. When X is positive the directly found angle from eq. 3.24 is used, when X is negative the complementary angle from eq. 3.26 is used.
The formula from Mathworld was quoted in § 4, as eq. 4.2 or if you prefer in eq. 4.5 and 4.6. In all cases we are dealing with double solutions. In principle both solutions must be investigated and the one with the least sum of squared distances should be selected.
(4.6) 
How the "correct" solution switches from the one case to the other was illustrated in fig. 7.a.4 i § 7a:
Fig. 7.a.4: All solutions. 
It's a bit clearer to see, if the solution that's not used is set to zero, as shown below. At the same time is the parameter Y, shown, as it was defined as the denominator in B, i § 6 as eq. 6.5:
(6.5) 
Fig. 11.2: All solutions & parameter Y. 
The parameter Y follows, as it can be seen, precisely the swap between the positive and negative solutions from eq. 4.6, in the graph named θ+, and θ. But that means, that you don't have to automatically compute both solutions, you will do with a simple "if, else" structure, depending of which system you work in. when Y is positive the plus "+" in eq. 4.6, when Y is negative the minus "" in eq. 4.6 is used.
The new formula was rearranged in § 6, to eq. 6.3 and there's no double solutions, that must be checked  it can't get easier.
(6.3) 
How this solution gives the "correct" solution all the time (and there's no need for complementary solutions) was illustrated in fig. 7.c.3 in § 7c:
Fig. 7.c.3: Alle solutions. 
As the numerator in B:
(6.4) 
Then for X > 0:
(3.24) 
And for X < 0:
(3.26) 
The case X = 0 can be placed anywhere in any of the two above shown expressions.
As the denominator in B:
(6.5) 
Then for Y > 0:
(12.1) 
And for Y < 0:
(12.2) 
In the case Y = 0 then:
θ = 90°
or
m = ∞
The new formula was rearranged in § 6, to eq. 6.3 and there's no double solutions, that must be checked  it can't get easier.
(6.3) 
When the solution quoted from Mathworld i § 4, is known, you could just run the calculations in § 5 through in opposite direction and thereby arrive at the new formula.
Well, I didn't run across in relation to "statistics", but in relation to moment of inertia within mechanics, there are a similar formula.
The line, that's found here, is the axis for the least principal moment of inertia for the point cloud. Perpendicular to this line, and through the centre of gravity, passes The line for the middle principal moment of inertia for the point cloud. Perpendicular to these two lines, through the centre of gravity, and thereby out of the plane passes The line for the largest principal moment of inertia for the point cloud.
Principal axis of moment of inertia: Around these axes the deviation moments are zero.
Overview 