Use Sophia to knock out your gen-ed requirements quickly and affordably. Learn more
×

Expected Value and Variance

Author: Sophia

what's covered
In this lesson, you will learn how to compute and evaluate the expected value and variance of a discrete random variable. Specifically, this lesson will cover:

Table of Contents

1. Introduction to Expected Value and Variance

Expected value is essentially the average outcome you would expect from a set of probabilities. Imagine you have some dice, and you want to know the average number you would roll over many trials. The expected value helps you calculate this average.

In a business context, expected value can help you make informed decisions. For example, if you’re evaluating different investment opportunities, you can use the expected value to estimate the potential returns of each option. This way, you can choose the investment with the highest expected return. Imagine you are running an email marketing campaign and want to determine the expected revenue from this campaign. Expected value assists you in determining potential earnings from different marketing strategies.

Variance measures how much the outcomes of a discrete random variable differ from the expected value (the mean). In simpler terms, it tells you how spread out the numbers are. If the variance is high, the numbers are more spread out; if it's low, they are closer to the expected value. For example, you have collected data from several past email marketing campaigns and want to understand the variability in the revenue generated. A high variance would indicate that the revenue from the various email campaigns varies significantly from the mean. This suggests the outcomes are quite spread out, meaning the campaign performance is inconsistent. High variance in revenue might prompt you to investigate further. Are there specific factors causing some campaigns to perform much better or worse than others? Understanding these factors can help you reduce variability and achieve more consistent revenue generation from the email marketing campaigns.

term to know
Expected Value
Represents the average outcome of a discrete random variable if the process of generating the random variable was repeated many times.

1a. Discrete Probability Distributions

The foundation of expected value and variance is based on a discrete probability distribution. A discrete probability distribution lists all the possible values of a discrete random variable and its associated probabilities. Suppose you are analyzing the number of units sold per day for a new product. Based on historical data, you construct the following discrete probability distribution.

Units Sold (x  ) Probability (P(x  ))
0 0.10
1 0.20
2 0.30
3 0.25
4 0.15

In this scenario, the discrete random variable is the number of units sold. The Units Sold (x) column lists the possible outcomes for the discrete random variable. The Probability (P(x)) column shows the probability of each outcome occurring. For example, there is a 10% chance of selling 0 units, a 20% chance of selling 1 unit, and so on.

hint
The lowercase x value in a discrete probability distribution denotes the specific outcomes or values the discrete random variable can take on.

Notice that the sum of probabilities in the P(x) column sums to 1.0.

0.10 plus 0.20 plus 0.30 plus 0.25 plus 0.15 equals 1.0

This ensures that one of these outcomes will occur, covering all possible scenarios for the number of units sold. In order for the probability distribution to be valid, the sum of the probabilities of a discrete probability distribution should always sum to 1.0.

Now, in a real-world business analytics scenario, you must use historical data to find and develop the discrete probability distribution. Let’s look at an example of how you would use Excel to develop a discrete probability distribution from historical data.

EXAMPLE

Suppose you have the number of complaints received each day from a company that services student loans.

The discrete probability distribution computed in Excel is provided below.



The student loan servicing company can use the discrete probability distribution to observe that the most likely number of complaints is 1 with a probability of 33.3%. The least likely number of complaints is 2 or 3 with a probability of 20% each.

try it
Using the student loan servicing complaint data, construct the discrete probability distribution in Excel. The name of the Excel file is student_loan_servicer_complaints.xlsx.

Using the student_loan_servicer_complaints.xlsx file, perform the following.

1. Select the entire range of your data (cells A1 to B31). On the Insert menu, select Pivot Table --> From Table/Range and select + New Worksheet. Place Complaints in both the Rows and Σ Values box.

2. You need the values in the cells of the contingency table to represent the counts of each distinct number of complaints in the data (0, 1, 2, and 3). In the drop-down arrow for the Σ Values box, select Value Field Settings, select Count, and OK.





The PivotTable should update with the counts.



3. Add a new column to the PivotTable by entering the word Probability in cell C2.

4. In cell C3, enter the formula:

=B3/$B$7
5. Once you have the formula entered in cell C3, hover over the small black plus sign (Auto-Fill Handle) at the bottom-right corner of the cell. Click and hold the left mouse button, then drag the handle down to cell C6.

6. The PivotTable should update to look like the one displayed in the previous example.



7. An alternate way to obtain the Probability column is to create the Relative Frequency column using the PivotTable functionality. Delete the Probability column and put your cursor in the body of the PivotTable so you will have the PivotTable Field options appear on the right-hand side of Excel. Drag Complaints to the Σ Values box. There will now be two instances of Complaints in the Σ Values box.

8. Using the drop-down arrow for the Sum of Complaints in the Σ Values box, select Value Field Settings.



9. In the Value Field Settings dialog box, name the new column Relative Frequency and select Count as the SUMMARIZE VALUE BY.



10. Select SHOW VALUE AS, select % of Grand Total, and select OK.



You should now obtain a Relative Frequency column in the PivotTable, which is the same as the Probability column created previously.



11. Check your work by creating the discrete probability distribution in Excel.

watch
Check out this video about constructing the discrete probability distribution.

term to know
Discrete Probability Distribution
Displays the probability of occurrence of each possible outcome of a discrete random variable.

1b. Expected Value

The expected value, denoted by E open parentheses X close parentheses, of a discrete random variable is a measure of the central tendency, representing the average outcome if the business process or phenomenon were repeated many times. You calculate the expected value using:

E open parentheses X close parentheses equals sum x times P open parentheses x close parentheses

where x is the possible outcomes (numerical values) of the discrete random variable and P open parentheses x close parentheses is the probability of observing a specific value of the discrete random variable.

Let’s take a look at an example calculation of expected value before you apply the concept to a real-world scenario.

EXAMPLE

Suppose you have the following probability discrete distribution.



To find the expected value, use the formula:

sum x times P open parentheses x close parentheses equals open parentheses 1 times 0.2 close parentheses plus open parentheses 2 times 0.3 close parentheses plus open parentheses 3 times 0.1 close parentheses plus open parentheses 4 times 0.4 close parentheses
space space space space space space space space space space space space space space space space space space space equals 0.2 plus 0.6 plus 0.3 plus 1.6
space space space space space space space space space space space space space space space space space space space equals 2.7

This means that if the business process that generated the values of the discrete random variables and their associated probabilities were repeated many times, the average outcome would be 2.7.

Now, let’s apply the expected value to a real-world scenario.

EXAMPLE

You are assisting the director of marketing at UrbanHive, a company specializing in selling natural honey in urban areas. The director is keen to understand the effectiveness of their email marketing campaigns to optimize future strategies. Based on historical data, you have developed a discrete probability distribution that shows the revenue generated from the last five email marketing campaigns along with their associated probabilities.



To provide insights to the director, you calculate the expected value as follows:

E open parentheses X close parentheses equals sum x times P open parentheses x close parentheses equals open parentheses 6 comma 000 times 0.20 close parentheses plus open parentheses 5 comma 500 times 0.25 close parentheses plus open parentheses 7 comma 000 times 0.15 close parentheses plus open parentheses 4 comma 500 times 0.10 close parentheses plus open parentheses 6 comma 500 times 0.30 close parentheses
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals 1 comma 200 plus 1 comma 375 plus 1 comma 050 plus 450 plus 1 comma 950
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals $ 6 comma 025

The expected revenue from the email marketing campaign is $6,025.

By calculating the expected value, you provide the director with a clear understanding of the average revenue that can be anticipated from future email marketing campaigns.

The director of marketing at UrbanHive can use the expected value of $6,025 in several ways to enhance decision-making:

  • Budgeting and Forecasting: The expected value provides a benchmark for future revenue from email marketing campaigns. This helps in setting realistic budgets and financial forecasts.
  • Performance Evaluation: The expected value serves as a performance metric. The director can compare actual campaign results against this benchmark to evaluate the effectiveness of different strategies and identify areas for improvement.
  • Communication with Management: The expected value provides a concrete figure that can be communicated to stakeholders, such as senior management or investors, to justify marketing expenditures and demonstrate potential returns.

try it
Use Excel to compute the expected value of the revenue for the email marketing campaign for UrbanHive in the previous example.

Using the data in the email_marketing_campaign.xlsx file, perform the following:

1. Create a new column by entering Expected_Revenue in cell D1.

2. Enter the following formula in cell D2:

=SUMPRODUCT(B2:B6,C2:C6)
You should see the value 6025 populate cell D2.

The SUMPRODUCT() function in Excel simplifies the process of computing expected value by combining the multiplication and summation steps into a single function, making it both efficient and easy to use.

The function takes two arrays (ranges of cells) as inputs: B2:B6 (the values of the random variable, x) and C2:C6 (the corresponding probabilities P open parentheses x close parentheses). It multiplies each element in the first array by the corresponding element in the second array. After multiplying the corresponding elements, the function sums the products together. So, it is performing the expected value calculation.

The screenshot below shows the Excel formula.

1c. Variance

The variance of a discrete random variable, denoted as Var(x), is a measure of how much the values of the random variable deviate from the expected value (mean). It indicates the spread or dispersion of the values. You calculate the variance using:

V a r open parentheses X close parentheses equals sum open parentheses x minus E open parentheses x close parentheses close parentheses squared times P open parentheses x close parentheses

where x is the numerical outcome of the discrete random variable, E open parentheses x close parentheses is the expected value (mean) of the discrete random variable, and P open parentheses x close parentheses is the probability of observing a specific value of the discrete random variable.

The squared difference between each value of the discrete random variable and its mean is multiplied by each corresponding probability.

The relationship between variance and standard deviation also applies to discrete random variables. To find the standard deviation of a discrete random variable (X), you simply take the square root of its variance, square root of V a r open parentheses X close parentheses end root.

Let’s look at an example calculation of expected value before you apply the concept to a real-world scenario.

EXAMPLE

Suppose you have the following probability discrete distribution from the previous section. Remember that the expected value was 2.7.



To find the variance, use the formula:

sum open parentheses x minus E open parentheses x close parentheses close parentheses squared times P open parentheses x close parentheses equals open square brackets open parentheses 1 minus 2.7 close parentheses squared times 0.2 close square brackets plus open square brackets open parentheses 2 minus 2.7 close parentheses squared times 0.3 close square brackets plus open square brackets open parentheses 3 minus 2.7 close parentheses squared times 0.1 close square brackets plus open square brackets open parentheses 4 minus 2.7 close parentheses squared times 0.4 close square brackets
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals open parentheses 2.89 times 0.2 close parentheses plus open parentheses 0.49 times 0.3 close parentheses plus open parentheses 0.09 times 0.1 close parentheses plus open parentheses 1.69 times 0.4 close parentheses
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals 0.58 plus 0.15 plus 0.009 plus 0.68
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals 1.42

Let’s return to assisting the director of marketing and provide some insight into the email marketing campaign by applying variance.

EXAMPLE

Let's return to the scenario where you are assisting the director of marketing at UrbanHive, a company specializing in selling natural honey in urban areas. You have learned that variance is another measure that can be computed from a discrete probability distribution. You decide to compute the variance for the last five email marketing campaigns and provide the director with some insights using variance.



Recall the expected value of the revenue was $6,025 from the previous example.

To calculate variance, perform the following:

V a r open parentheses X close parentheses equals sum open parentheses x minus E open parentheses x close parentheses close parentheses squared times P open parentheses x close parentheses equals open square brackets open parentheses 6000 minus 6025 close parentheses squared times 0.20 close square brackets plus open square brackets open parentheses 5500 minus 6025 close parentheses squared times 0.25 close square brackets plus open square brackets open parentheses 7000 minus 6025 close parentheses squared times 0.15 close square brackets plus open square brackets open parentheses 4500 minus 6025 close parentheses squared times 0.10 close square brackets plus open square brackets open parentheses 6500 minus 6025 close parentheses squared times 0.30 close square brackets
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals open parentheses 625 times 0.20 close parentheses plus open parentheses 275 comma 625 times 0.25 close parentheses plus open parentheses 950 comma 625 times 0.15 close parentheses plus open parentheses 2 comma 325 comma 625 times 0.10 close parentheses plus open parentheses 225 comma 625 times 0.30 close parentheses
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals 125 plus 68 comma 906.25 plus 142 comma 593.75 plus 232 comma 562.5 plus 67 comma 687.5
space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space space equals 511 comma 875

The variance of $511,875 indicates the degree to which the revenues from the email marketing campaigns deviate from the mean revenue of $6,025. This information can help the director of marketing use the variance information in several strategic ways to make informed decisions, such as:

  • Assessing Risk and Stability:
    • High Variance: High variance indicates that the revenues from the campaigns are widely spread out from the mean. This means there is a significant difference between the best and worst-performing campaigns.

      High variance indicates that the revenues from the campaigns are highly variable. This suggests that some campaigns perform very well while others do not, which could be risky. The director might want to investigate the factors contributing to this variability and work on strategies to stabilize the performance.

    • Low Variance: Low variance indicates that the revenues from the campaigns are closely clustered around the mean. This suggests that the revenues are more consistent across campaigns. This stability can be reassuring and might indicate that the marketing strategies are generally effective.
In the context of evaluating the performance of marketing campaigns, what is considered “high” or “low” variance can depend on several factors, including industry standards, company goals, and historical performance data.

  • Understanding Customer Preferences: High variance might indicate that different campaigns resonate differently with various customer segments. Analyzing these differences can provide insights into customer preferences and help tailor future campaigns more effectively.

try it
Use Excel to compute the variance of the revenue for UrbanHive's email marketing campaign.

Using the data in the email_marketing_campaign.xlsx file, perform the following.

1. Create a new column by entering Deviation from Mean in cell E2.

2. Enter the following formula in cell E3:

=B3-$D$3
Drag this formula down to cell E7.

3. Create a new column named Squared Deviations in cell F2.

4. Enter the following formula in cell F3:

=E3^2
Drag this formula down to cell F7.

The symbol ^, pronounced "caret," is the exponent symbol in Excel.

5. Create a new column named Multiply by Probability in cell G2.

6. Enter the following formula in cell G3:

=F3*C3
Drag this formula down to cell G7.

7. Sum up all the values in column G to calculate the variance. In cell H2, create a new column named Variance and enter the following formula in cell H3:

=SUM(G2:G6)
8. You should see the value 511875 populate cell H3.

The formulas used in Excel are shown below, along with the actual values that the formulas produced.

watch
Follow along with this video to compute the variance for the revenue for the email marketing campaign.

summary
In this lesson, you focused on the ideas of expected value and variance for a discrete random variable. The expected value (or mean) of a discrete random variable is a measure of the central tendency of a discrete probability distribution. In a business data analytics context, expected value helps in predicting the average outcome of a random event based on historical data. For example, the expected value calculation of several past marketing campaigns for UrbanHive assisted the director of marketing in understanding the average revenue that can be anticipated from future email marketing campaigns. The revenue stability of the email marketing campaigns was assessed through variance. The variance calculation for the email marketing campaigns provides the director of marketing with insights into the consistency and risk associated with the revenue generated from these campaigns.

Source: THIS TUTORIAL WAS AUTHORED BY SOPHIA LEARNING. PLEASE SEE OUR TERMS OF USE.

Terms to Know
Discrete Probability Distribution

Displays the probability of occurrence of each possible outcome of a discrete random variable.

Expected Value

Represents the average outcome of a discrete random variable if the process of generating the random variable was repeated many times.