Source: Tables and graphs created by Jonathan Osters
This tutorial is going to cover interpretation of what the slope and y-intercept of a least-squares regression line actually mean. So let's take a look. A slope is a rate of change. Now, you've talked about rates of change a lot in everyday life. The examples of common rates that you might use in everyday life are miles per hour. That's how many miles you'll travel if you increase your time in the car by one hour.
Dollars per gallon at the gas station, that's how many dollars your pump bill will increase by if you pump one more gallon. Bushels per acre in farming, and cents per pound at the grocery store. And so you can use that idea to answer questions like this. If a car is driving at 50 miles per hour, how much farther can you go by driving one additional hour?
Well, 50 miles per hour means 50 miles every hour. So one additional hour means 50 additional miles. Let's take a look at this one. If a fisherman is paid $4.04 per pound for sea scallops, how much more money does the fisherman stand to make by catching one additional pound of scallops? Well, $4.04 per pound, so every additional pound means an additional $4.04.
In linear regression, slope is the average rate of change. We know that there are actual data points on the scatterplot, so we could calculate the rate of change between any two data points. But we're interested in the average over all the points. So the slope is the average increase or decrease in the response variable that corresponds to an increase of one in the explanatory variable.
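To make that idea concrete, here's a minimal sketch in Python. The four data points are made up for illustration and are not from the tutorial. It computes the rate of change between neighboring points, and then the least-squares slope using the standard formula. The neighboring rates jump around, while the fitted slope is a single summary of the overall trend. It weights the points, so it need not equal the simple mean of those rates.

```python
# Hypothetical data, invented for illustration only (not from the tutorial).
xs = [1, 2, 3, 4]
ys = [2, 3, 7, 8]

# Rate of change between each pair of neighboring points.
pair_rates = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# Least-squares slope: sum of (x - x_bar)(y - y_bar) over sum of (x - x_bar)^2.
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sxy / sxx

print(pair_rates)  # [1.0, 4.0, 1.0]
print(slope)       # 2.2
```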
So suppose we had something like distance hat equals 15 times time, where time is in hours and distance is in miles. That 15 would be 15 miles per hour. But because this is a regression equation, that 15 is an average speed. There's no guarantee that you go 15 miles every single hour. But what this indicates is that for each additional hour in time, the distance increases by 15 miles on average.
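Here's a small sketch of that reading of the slope. The function name predicted_distance is just an illustrative choice. It plugs two times one hour apart into distance hat equals 15 times time and looks at the difference.

```python
def predicted_distance(hours):
    """Predicted distance in miles from the line: distance-hat = 15 * time."""
    return 15 * hours

# Compare two times that are one hour apart.
increase = predicted_distance(4) - predicted_distance(3)
print(increase)  # 15
```

Any two times one hour apart give the same difference of 15 miles. That's a predicted, average increase, not a guarantee for any particular hour.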
Look at this example here, which shows the miles from Minneapolis Saint Paul airport and the airfare to these destinations from Minneapolis Saint Paul. The equation of the regression line was airfare hat-- predicted airfare-- equals 113.11 plus 0.137 times miles. OK. So what do those mean? Let's look at that. The 0.137 times miles, that's the slope.
So that's our rate of change. It's how quickly the airfare changes if you increase the miles by one. So for each additional mile, the airfare is predicted to increase by 0.137 dollars, or 13.7 cents. Now, a couple of important ideas as you interpret these values-- first, it's for every additional mile. You can't leave this word out. You can't say for every mile. That's because the line doesn't start at zero miles costing zero dollars. That's the reason we have to say every additional mile.
Secondly, we have to say it's a predicted increase. With airfare hat, we're not figuring out actual airfares. Remember, this is an average, so it's not a hard and fast rule. We can use it to predict the additional airfare for each additional mile. But we can't use it to actually assign airfares.
And lastly, we're using units-- miles and dollars. So we can't say airfare increases by 0.137. 0.137 whats? 0.137 dollars. And so for each additional mile, we increase it by this many dollars. So make sure you include units.
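As a quick check on that interpretation, here's a sketch that uses the tutorial's airfare line. The function name and the 500 and 501 mile distances are illustrative choices, not values from the data set. It shows that one additional mile changes the predicted airfare by the slope, in dollars.

```python
INTERCEPT = 113.11  # dollars
SLOPE = 0.137       # dollars per additional mile

def predicted_airfare(miles):
    """Predicted airfare in dollars: airfare-hat = 113.11 + 0.137 * miles."""
    return INTERCEPT + SLOPE * miles

# Two hypothetical flights that differ by exactly one mile.
diff = predicted_airfare(501) - predicted_airfare(500)
print(round(diff, 3))  # 0.137
```

Notice that the predicted fare for 500 miles is not 500 times 0.137, because the line includes the 113.11. That's why the wording is "each additional mile" and not "each mile."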
Now let's look at the y-intercept. The y-intercept here was 113.11. The y-intercept is the value of y, the response (in this case, airfare), when the value of x, the explanatory variable (in this case, miles), is zero. So let's look at that. When a flight is zero miles, that's when the explanatory variable is zero, the airfare is predicted to be $113.11.
Now, that doesn't make a whole lot of meaningful sense. But we'll talk about that in a minute. A couple things to remember-- you're talking about an ordered pair. So it's zero miles and $113.11. So you need to have both those numbers in there, because this number really corresponds to an ordered pair on the graph. Secondly, just like the slope was a prediction, this is also a prediction.
Now, it's not a meaningful prediction. And again, we'll get to that in a minute. But it's a prediction. It's because this line is a prediction line. It's a best fit line. It's not actually finding airfares for us. And just like last time, make sure you include units.
So here, the y-intercept didn't make a whole lot of contextual sense. You wouldn't buy a ticket for $113.11 just to go nowhere. The reason has to do with the range of miles values for which this line is an appropriate airfare guess. What we have here is miles values from 407 up to almost 1,300 miles.
So what that means is we can use this line within that range of 407 to 1,294 to make reasonable predictions on airfare. So if we wanted to go to, say, San Antonio, we could certainly use this line to do that, because the distance from Minneapolis to San Antonio is within this range. So it's reasonable to use this line to make predictions within this 400 to 1,300 range.
Anything outside of that range might not be reasonable. So we'll sometimes just acknowledge the fact that maybe the y-intercept isn't part of that reasonable prediction range. And so maybe it doesn't have a whole lot of contextual sense. But it's a good idea to know how to interpret it anyway.
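One way to keep track of that is to check whether a miles value is inside the range of the data before trusting the prediction. Here's a minimal sketch. The helper name in_data_range and the 1,000 mile test distance are assumptions for illustration. The 407 and 1,294 endpoints and the equation come from the example above.

```python
MIN_MILES, MAX_MILES = 407, 1294  # range of miles values in the data

def predicted_airfare(miles):
    """Predicted airfare in dollars: airfare-hat = 113.11 + 0.137 * miles."""
    return 113.11 + 0.137 * miles

def in_data_range(miles):
    """True when miles falls inside the range the line was fit on."""
    return MIN_MILES <= miles <= MAX_MILES

for miles in (0, 1000):
    label = "within range" if in_data_range(miles) else "outside range"
    print(miles, round(predicted_airfare(miles), 2), label)
# 0 113.11 outside range
# 1000 250.11 within range
```

The line still gives a number at zero miles, which is the y-intercept. The check just flags that zero miles is outside the range where the prediction is reasonable.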
So do this one on your own. Interpret the slope and y-intercept involving sodium content and calories for certain hot dogs. So pause the video and interpret what those values mean.
What you should have come up with is this: the 160 means that if a hot dog had zero calories, then the predicted sodium would be 160 milligrams. And the 2.5 means for each additional calorie a hot dog has, the sodium content is predicted to increase by 2.5 milligrams. And remember that 2.5 milligram per calorie increase is an average. This is not a hard and fast rule.
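If you want to check those two interpretations, here's a short sketch. It assumes the hot dog line has the form sodium hat equals 160 plus 2.5 times calories, which is reconstructed from the two values just described since the equation itself isn't written out here. The 150 and 151 calorie values are arbitrary.

```python
def predicted_sodium(calories):
    """Predicted sodium in milligrams, assuming sodium-hat = 160 + 2.5 * calories."""
    return 160 + 2.5 * calories

# y-intercept: predicted sodium when calories is zero.
print(predicted_sodium(0))  # 160.0

# Slope: predicted change in sodium for one additional calorie.
print(predicted_sodium(151) - predicted_sodium(150))  # 2.5
```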
So to recap, the slope is a rate of change, and it describes how the predicted response changes, on average, for an increase of one in the explanatory variable. The y-intercept shows you what's predicted for the response when the explanatory value is zero, and sometimes it doesn't have a meaningful interpretation. Doesn't mean we shouldn't know how to interpret it, just sometimes it doesn't make a whole lot of sense in context. Sometimes it falls outside that reasonable predictions window.
So we talked about interpreting the slope of a regression line and the y-intercept. Good luck, and we'll see you next time.