Hi. This tutorial covers correlation and causation. So let's start with the data set, a pretty interesting one. What we have here are a bunch of different countries. The life expectancy in each country is listed here, and the number of people per TV is listed here. So we have a bivariate data set. Notice that the table is just continued here. These aren't separate data sets. It's just split up that way.
So if we just look at a couple: in Angola, the life expectancy is 44 years, and there are 200 people for every TV. So that just means that there are not a lot of TVs in the country. If we look at Australia, the life expectancy is 76.5 years, and there are two people per TV. So for every one television, there are two people. So in Australia, there are a lot of televisions in the country. In Cambodia, the life expectancy is about 49 and 1/2 years, with 177 people per TV.
So if you scan the data set, you can see that generally, countries with a higher life expectancy have a small number of people per TV. Look at Japan: 79 for life expectancy, 1.8 people per TV. United States: 75.5, 1.3 people per TV. Go to Uganda, though: 51 is the life expectancy, 191 people per TV.
So let's take a look at this graphed. I also have a best-fit line drawn in here. We can see that there is a negative association: as the number of people per TV increases, the life expectancy decreases. The r value is negative 0.8, so that's a strong negative correlation between people per TV and life expectancy.
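As a rough check on what an r value measures, here is a minimal Python sketch that computes the correlation coefficient by hand. It uses only the six countries read aloud above, not the full table, so the result will not match the negative 0.8 quoted for the whole data set. The function name `pearson_r` is just an illustrative choice.

```python
import math

# (life expectancy in years, people per TV) for the six countries
# mentioned in the tutorial; the full table has more rows.
countries = {
    "Angola": (44.0, 200.0),
    "Australia": (76.5, 2.0),
    "Cambodia": (49.5, 177.0),
    "Japan": (79.0, 1.8),
    "United States": (75.5, 1.3),
    "Uganda": (51.0, 191.0),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


people_per_tv = [values[1] for values in countries.values()]
life_expectancy = [values[0] for values in countries.values()]

r = pearson_r(people_per_tv, life_expectancy)
print(f"r for these six countries = {r:.2f}")
```

The sign of r gives the direction of the association, and how close it is to negative 1 or positive 1 gives the strength.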
So now this is the important question that we're getting at in this tutorial: does that mean that increasing the people-per-TV ratio, which would mean decreasing the number of TVs in a country, will lower life expectancy? If the United States were to get rid of TVs, so that its people-per-TV ratio went up, would that make the life expectancy go down? If we remove TVs from a country, does that mean that people will not live as long?
And of course, the answer to that is no. So a strong correlation does not imply causation. So just because two variables are correlated, that does not mean that one causes a change in the other. So just to make sure we have a good working definition of correlation and causation, correlation is the strength and direction of a linear association between two variables. So in this example, we did have a strong correlation, strong negative correlation.
And then causation, sometimes known as cause and effect, is when variation in an explanatory variable causes variation in a response variable. So in this case, no, a change in people per TV does not cause a change in life expectancy.
So what was happening here is that we had a lurking variable. So a lurking variable will sometimes be responsible for variation in both an explanatory and a response variable. So in the example, the lurking variable is the wealth of a country. So the wealth of the country affects both life expectancy and people per TV.
So if a country is very wealthy, generally, its population will live longer because they'll have better access to health care, better nutrition, better access to clean water, and so forth. And because they're wealthier, they're also probably going to have more TVs, because TVs are a luxury item that people in wealthier countries will be able to afford. So the wealth of the country, the lurking variable, will affect both life expectancy and people per TV.
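To see how a lurking variable can produce a strong correlation on its own, here is a small Python simulation. Everything in it is made up for illustration: the `wealth` score between 0 and 1, the formulas, and the noise levels are assumptions, not values from the data set. In this toy model, wealth drives both life expectancy and people per TV, and TVs have no effect on life expectancy at all.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_country(wealth):
    """Toy model: both outcomes depend on wealth, not on each other."""
    life = 45 + 35 * wealth + random.gauss(0, 3)
    # Clamp so the made-up ratio never drops below one person per TV.
    people_per_tv = max(1.0, 200 - 190 * wealth + random.gauss(0, 20))
    return life, people_per_tv


random.seed(0)  # fixed seed so the run is repeatable

# Countries with very different wealth: the lurking variable varies.
mixed = [simulate_country(random.random()) for _ in range(2000)]
r_mixed = pearson_r([p for _, p in mixed], [l for l, _ in mixed])

# Countries that all have the same wealth: the lurking variable is held fixed.
same = [simulate_country(0.5) for _ in range(2000)]
r_same = pearson_r([p for _, p in same], [l for l, _ in same])

print(f"wealth varies:     r = {r_mixed:.2f}")
print(f"wealth held fixed: r = {r_same:.2f}")
```

When wealth varies, the two variables come out strongly negatively correlated. When wealth is held fixed, the correlation all but disappears, which is what you would expect if wealth, not TVs, is doing the work.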
So because you have this lurking variable, it's important that you don't say that there's a cause and effect here. The best evidence for causation comes from a randomized controlled experiment, rarely from an observational study. So the data that was collected here came from an observational study because you're just observing life expectancy and then the people per TV ratio. You're not imposing any sort of treatment to look for a change. It was just an observational study.
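Here is a minimal Python sketch of why random assignment helps. It is a toy simulation under made-up assumptions: each hypothetical country gets a `wealth` score, a coin flip decides whether its TVs are removed, and life expectancy is built to depend on wealth only. Because the coin flip has nothing to do with wealth, the two groups end up with similar wealth on average, so the lurking variable cannot create a difference between them.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def life_expectancy(wealth):
    """Toy model: life expectancy depends on wealth only, never on TVs."""
    return 45 + 35 * wealth + random.gauss(0, 3)


treated = []  # countries assigned to have their TVs removed
control = []  # countries left alone

for _ in range(20000):
    wealth = random.random()
    # The treatment is imposed by a coin flip, independent of wealth.
    tvs_removed = random.random() < 0.5
    outcome = life_expectancy(wealth)
    if tvs_removed:
        treated.append(outcome)
    else:
        control.append(outcome)

mean_treated = sum(treated) / len(treated)
mean_control = sum(control) / len(control)
diff = mean_treated - mean_control

print(f"mean life expectancy, TVs removed: {mean_treated:.1f}")
print(f"mean life expectancy, control:     {mean_control:.1f}")
print(f"difference:                        {diff:.2f}")
```

In this toy setup the difference between the groups is close to zero, matching the fact that removing TVs has no effect in the model. In an observational study, by contrast, the countries with few TVs would also tend to be the poorer ones.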
So if we wanted to make a causal claim, it'd be important that we do an experiment rather than an observational study. So that has been the tutorial on correlation and causation. Thanks for watching.