Dictionary
 Description  
 Line of Best Fit:  The line that best represents the trend that the points in a scatter plot follow.

When we look at a scatter plot of two variables, if we can draw a straight line with a ruler that passes close to most of the data, we can use that line to predict the value of one variable from a value of the other. What does line of "best fit" mean? Most statistical packages use the method of least squares. This method uses the data to find the line of best fit by minimizing the sum of the squares of the vertical distances from the data points to the line. The squares are used because the signed difference between a point and the line is positive for points above the line and negative for points below it, so the raw differences could cancel each other out, whereas the squared differences are never negative. Examples of the distance from a point to the line are indicated by the red lines below. The equation of this line of best fit can be found from the data.

This procedure produces an estimate of the equation of the line. An estimate is not about being exactly correct. It provides what some people refer to as a "ballpark figure" - something that is near to and could be, but is not expected to be, exact.
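The least squares calculation can be sketched in a few lines of Python. This is only an illustration, not the method of any particular statistical package: the function name least_squares_line and the quality and price values are made up for the example, and the formulas used are the standard ones for the slope and intercept of a least squares line.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # undefined if every x value is the same
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for this sketch.
quality = [1, 2, 3, 4, 5]
price = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(quality, price)

# Signed vertical differences between each point and the line.
residuals = [y - (slope * x + intercept) for x, y in zip(quality, price)]
sum_of_squares = sum(r ** 2 for r in residuals)

print(f"price is roughly {slope:.2f} * quality + {intercept:.2f}")
print(f"sum of squared distances: {sum_of_squares:.2f}")
```

For this made-up data the signed differences add up to zero (apart from rounding), which is why the squares, rather than the raw differences, are summed.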

Compare the above graph to the one below. Although the line below goes through two of the data points, all the rest of the data points are above the line, whereas for the line above the numbers of points above and below the line are nearly equal.
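The same comparison can be made with numbers. The sketch below uses made-up data (not the data in the graphs) and two candidate lines: the least squares line for those points, and a line drawn through two of the points. The helper names sum_of_squares and count_sides are hypothetical and exist only for this example.

```python
# Hypothetical data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sum_of_squares(slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def count_sides(slope, intercept, tol=1e-9):
    """Count points strictly above and strictly below the line."""
    diffs = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    above = sum(1 for d in diffs if d > tol)
    below = sum(1 for d in diffs if d < -tol)
    return above, below


# Least squares line for these five points: y = 0.6x + 2.2.
best = (0.6, 2.2)
# Line through the two points (1, 2) and (4, 4): slope 2/3, intercept 4/3.
through_two = (2 / 3, 4 / 3)

sse_best = sum_of_squares(*best)
sse_two = sum_of_squares(*through_two)
sides_best = count_sides(*best)
sides_two = count_sides(*through_two)

print("least squares line:", round(sse_best, 2), sides_best)
print("line through two points:", round(sse_two, 2), sides_two)
```

With this data the line through two points leaves every other point above it and has the larger sum of squared distances, while the least squares line splits the points more evenly.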

If most of the data points are close to the line of best fit, the two variables are said to be highly correlated; otherwise, they are weakly correlated. If the slope of the line of best fit is positive (going up from left to right), the two variables plotted are said to be positively correlated. If the slope of the line of best fit is negative (going down from left to right), the two variables plotted are said to be negatively correlated. If the points show no linear trend, so that no line represents them well, the two variables are said to have no correlation. In the example above, quality and price are positively correlated.
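One common way to put a number on these ideas is the correlation coefficient, usually written r. It is not defined in this entry, so treat the following as a supplementary sketch: r lies between -1 and 1, has the same sign as the slope of the least squares line, and is closer to -1 or 1 the more tightly the points cluster around that line. The data and the function name correlation are made up for the example.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)  # undefined if either list is constant


# Hypothetical data, invented for this sketch.
quality = [1, 2, 3, 4, 5]
price_rising = [2, 4, 5, 4, 5]   # tends to go up as quality goes up
price_falling = [5, 4, 5, 4, 2]  # tends to go down as quality goes up

r_rising = correlation(quality, price_rising)
r_falling = correlation(quality, price_falling)

print(f"rising prices:  r = {r_rising:.2f}")   # positive correlation
print(f"falling prices: r = {r_falling:.2f}")  # negative correlation
```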