
# MINITAB Project

STATISTICS EXPLORATION # 6
THE NORMAL AND OTHER CONTINUOUS PROBABILITY DISTRIBUTIONS

PURPOSE - to use MINITAB to

• enhance the understanding of a continuous random variable and its corresponding density function
• produce probabilities associated with a normal random variable
• find values of normal random variables when the probabilities are given
• help assess, through normal probability plots, whether a sample can be assumed to come from a normal population
• investigate other distributions such as the exponential random variable

BACKGROUND INFORMATION

• A continuous random variable X is a rule which assigns a real value x to each outcome of a probability experiment such that the values of X range over a given interval.
• Note: In practice, X represents some form of measurement such as time, length or area.
• Note: The x-values are values of the random variable X.
• Note: The convention is to use upper case letters to represent the random variable and the corresponding lower case letters to represent the values of the random variable.
• Example: Suppose we are interested in the systolic blood pressure of a college student selected at random on a particular campus. The target population is the set of systolic blood pressures of all students on that campus. To find the probability that the selected student's systolic blood pressure is greater than 125, we need to know the theoretical probability distribution of the systolic blood pressures of the students on that campus. If the following hypothetical distribution represents the population, then the corresponding probability may be represented by the shaded area.
• Note: Such a distribution will be a continuous distribution.

• The probability density function for a continuous random variable is the function, say f(x), defined over the real numbers and having the following properties:
• f(x) ≥ 0 for all real values of x;
• the total area under the graph of f(x) and above the x-axis equals 1;
• the probability P(a ≤ X ≤ b) is the area of the region in the plane which lies between the values "a" and "b", below the graph, and above the x-axis.
• The mean or the expected value for a continuous random variable X, denoted by μ = E(X), is a measurement of central tendency for the x-values.
• In the blood pressure example above, the mean for the distribution is 120.
• The variance for a continuous random variable X, denoted by σ² = V(X), is a measurement of the dispersion of the distribution of x-values about the mean of the distribution.
• Note: The standard deviation, denoted by σ, for the random variable X is obtained by taking the square root of the variance.
• Note: The larger the variance or the standard deviation, the more dispersed the data values will be about the mean. This is depicted in the following diagram for two different distributions with the same mean.
• Note: Formal definition for the mean, variance, and the standard deviation for a continuous variable will not be presented since they involve the probability density function f(x) and integral calculus.
• Many real life situations can be modeled by a particular continuous probability distribution. This distribution is called the normal probability distribution.
• The normal random variable X has a bell shaped distribution, which is symmetric about its mean.
• The normal probability density function for the random variable X, say g(x), with mean μ and standard deviation σ, is defined by
$$g(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$
• Note: A normal random variable with a mean of 0 and a standard deviation of 1 is called a standard normal random variable.
• The standard normal random variable, denoted by Z, has a mean μ = 0 and a standard deviation σ = 1.
• The transformation from a normal random variable X to the standard normal random variable Z is obtained by computing the Z-score given by the following relationship.
$$Z = \frac{X - \mu}{\sigma}$$
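
NOTE: The normal density and the Z-score described above can be written out explicitly. The following is a minimal Python sketch, offered only as an illustration outside of MINITAB. The function names `normal_density` and `z_score` are hypothetical, and the standard deviation of 10 used for the blood pressure example is an assumption made for illustration (the text gives only the mean of 120).

```python
import math
from statistics import NormalDist


def normal_density(x, mu, sigma):
    """Evaluate the normal density g(x) for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def z_score(x, mu, sigma):
    """Convert a value x of a normal random variable X into its Z-score."""
    return (x - mu) / sigma


# Mean 120 comes from the blood pressure example; sigma = 10 is ASSUMED for illustration.
mu, sigma = 120.0, 10.0

print(z_score(125, mu, sigma))           # 0.5
print(normal_density(125, mu, sigma))    # hand-written formula
print(NormalDist(mu, sigma).pdf(125))    # standard library value, for comparison
```

The last two printed values should agree, since `statistics.NormalDist.pdf` evaluates the same density function.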

• A normal probability plot for a set of data is a scatter plot with the ranked data values along the vertical axis and the corresponding z-values from a standard normal distribution along the horizontal axis.
• When the data are normally distributed, a linear (straight line) trend will result in the normal probability plot. A nonlinear trend in the normal probability plot will suggest that the data are non-normal.

PROCEDURES

First, load the MINITAB (windows version) software as described in Exploration #0.

NOTE: Several continuous probability distributions can be investigated through the procedures included in MINITAB; however, this exploration will concentrate only on the normal probability distribution.

NOTE: The procedures presented in these explorations may not be the only way to achieve the end results. Also, whenever graphs are presented, only the MINITAB graphics features will be used.

1. RANDOM SAMPLING FROM A NORMAL PROBABILITY DISTRIBUTION

This section will illustrate how MINITAB can be used to generate random samples from a normal distribution and use the generated values to graph the corresponding normal curve.

Example 1: Generate a random sample of size 1000 from a standard normal distribution. Use MINITAB to compute the probability density values for the generated values. Plot the density values along the y-axis and the generated values from the normal distribution along the x-axis. Use the connect option to display the graph.

Recall: The standard normal distribution has a mean μ = 0 and a standard deviation σ = 1.

First, to generate the normal values, select Calc → Random Data. Figure 6.1 shows a partial display of the action.

Figure 6.1: Display of the result of selecting Calc → Random Data

To generate random data from the normal distribution, click on the Normal distribution in the Sample From Columns option. In the resulting dialog box, enter the values as shown in Figure 6.2. Click on the OK button and the data will be generated from the standard normal distribution and saved in column C1 (NORMAL).

Figure 6.2: Normal Distribution dialog box with Entries to simulate 1000 values from a Standard Normal Distribution

Next, we need to compute the Probability density for these simulated values.

To achieve this, select Calc → Probability Distributions → Normal. Fill in the entries as shown in Figure 6.3.

Note: Column C1 was renamed as NORMAL and column C2 was renamed as DENSITY.

Observe that we have selected the Probability density option. Also, observe that we retained the mean of 0 and standard deviation of 1 for the standard normal distribution. Since we needed the density values for the data in column C1, we use this as the Input column entry and use C2 as the Optional storage column for the density values.

Figure 6.3: Normal Distribution dialog box with Entries to compute the Probability Densities for the 1000 simulated values from a Standard Normal Distribution

Click on the OK button and the densities will be generated.

Next, we will plot the density values along the y-axis and the normal values in C1 along the x-axis. To achieve this, select Graph → Plot and in the Plot dialog box, fill in as shown in Figure 6.4. Note that we selected Connect as the Display option for the graph.

Figure 6.4: The Plot dialog box with entries to graph the DENSITY values along the y-axis and the NORMAL values along the x-axis

Click on the OK button and the graph will be generated. Figure 6.5 shows the resulting graph.

Figure 6.5: The Standard Normal Probability distribution graph
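
NOTE: The steps of Example 1 can be mimicked outside MINITAB. The following is a minimal Python sketch using only the standard library. It is not MINITAB output, the seed value is an arbitrary assumption, and the simulated values will differ from those in the NORMAL column of the worksheet.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Step 1: simulate 1000 values (plays the role of column NORMAL).
# The seed is arbitrary and only makes the run repeatable.
normal_values = std_normal.samples(1000, seed=6)

# Step 2: compute the probability density at each simulated value (column DENSITY).
density_values = [std_normal.pdf(x) for x in normal_values]

# Step 3: sort the (x, density) pairs by x so that connecting them traces the bell curve.
curve = sorted(zip(normal_values, density_values))

# The highest point of the curve should sit near x = 0.
peak_x, peak_density = max(curve, key=lambda pair: pair[1])
print(len(curve))
print(peak_x, peak_density)
print(round(std_normal.pdf(0.0), 4))  # 0.3989, the maximum height of the density
```

Plotting the pairs in `curve` with any graphing tool and connecting the points should give a curve of the same shape as Figure 6.5.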

Observations from the graph in Figure 6.5:

• The curve is bell shaped.
• The mean, median, and the mode are all equal (to zero in this case).
• The mean, median, and the mode are all located at the center of the distribution.
• The distribution has only one mode (unimodal).
• The distribution is symmetric about the mean (of 0 in this case).
• The curve is continuous and never cuts the x-axis.
• The total area between the curve and the x-axis is equal to one. (Establishing this requires calculus.)
• Because of the symmetry of the distribution about the mean (of 0 in this case), the area to the left of the mean (mode, median) is equal to 0.5, which is the same as the area to the right of the mean (mode, median).

Figure 6.6 shows some of these properties.

Figure 6.6: The Standard Normal Probability distribution graph Displaying some of its Properties

2. FINDING PROBABILITIES FOR A NORMAL RANDOM VARIABLE

In this section we will use MINITAB to help in computing probabilities for a normal random variable.

Example 2: Use MINITAB to determine P(Z > 1.0).

Note: Z is a standard normal random variable. That is, Z is normally distributed with a mean μ = 0 and a standard deviation σ = 1.

Figure 6.7 shows the area that represents P(Z > 1.0).

Figure 6.7: Shaded area represents P(Z > 1.0)

Now, for us to use MINITAB to help find this area, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Select Cumulative probability.

Also, recall that Z is a standard normal random variable so the mean will be 0 and the standard deviation will be 1. Next select the Input constant option and type 1 in the text box. All of these are shown in Figure 6.8.

Figure 6.8: Normal Distribution dialog box with the Cumulative Probability selections

Click on the OK button and the cumulative probability will be computed and displayed in the Session window. This value is shown in Figure 6.9.

Figure 6.9: Display of the results for the Cumulative Probability

Now, from Figure 6.9, P(X <= x) = P(X <= 1.0) = 0.8413 represents the value of P(Z ≤ 1.0). That is, P(Z ≤ 1) = 0.8413. This is shown in Figure 6.10.

NOTE: Since Z is a continuous random variable, P(Z ≤ 1.0) = P(Z < 1.0).

Figure 6.10: Area representing P(Z ≤ 1.0)

Recall, the original problem was to find P(Z > 1). Now, since the total area between the curve and the x-axis is 1, we can see that P(Z > 1) = 1 - P(Z ≤ 1) = 1 - 0.8413 = 0.1587.
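
NOTE: The complement calculation in Example 2 can be cross-checked outside MINITAB. The following is a minimal Python sketch using the standard library class `statistics.NormalDist`, whose `cdf` method plays the role of the Cumulative probability option. It is offered as an optional check, not as part of the MINITAB procedure.

```python
from statistics import NormalDist

# Standard normal random variable Z: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

# Cumulative probability P(Z <= 1.0), the value reported in the Session window.
p_le_1 = Z.cdf(1.0)

# Upper-tail probability by the complement rule.
p_gt_1 = 1 - p_le_1

print(round(p_le_1, 4))  # 0.8413
print(round(p_gt_1, 4))  # 0.1587
```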

Example 3: Use MINITAB to determine P(Z > -2.2).

Figure 6.11 shows the area that represents P(Z > -2.2).

Figure 6.11: Shaded area represents P(Z > -2.2)

Now, for us to use MINITAB to help find this area, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Again select Cumulative probability with the Input constant being -2.2. Refer to Figure 6.8.

Click on the OK button and the cumulative probability will be computed and displayed in the Session window. This value is shown in Figure 6.12 and it is equal to P(Z ≤ -2.2).

Figure 6.12: Display of the results for P(Z ≤ -2.2)

Now, from Figure 6.12, P(X <= x) = P(X <= -2.2) = 0.0139 represents the value of P(Z ≤ -2.2). That is, P(Z ≤ -2.2) = 0.0139.

Recall, the original problem was to find P(Z > -2.2). Now, since the total area between the curve and the x-axis is 1, we can see that P(Z > -2.2) = 1 - P(Z ≤ -2.2) = 1 - 0.0139 = 0.9861.

Example 4: Use MINITAB to determine P(Z ≤ 1.8).

Figure 6.13 shows the area that represents P(Z ≤ 1.8).

Figure 6.13: Shaded area represents P(Z ≤ 1.8)

Now, for us to use MINITAB to help find this area, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Again select Cumulative probability with the Input constant being 1.8. Refer to Figure 6.8.

Click on the OK button and the cumulative probability will be computed and displayed in the Session window. This value is shown in Figure 6.14 and it is equal to P(Z ≤ 1.8).

Figure 6.14: Display of the results for P(Z ≤ 1.8)

Now, from Figure 6.14, P(X <= x) = P(X <= 1.8) = 0.9641 represents the value of P(Z ≤ 1.8). That is, P(Z ≤ 1.8) = 0.9641.

Example 5: Use MINITAB to determine P(-2.7 ≤ Z ≤ 1.3).

Observe that we can write P(-2.7 ≤ Z ≤ 1.3) = P(Z ≤ 1.3) - P(Z ≤ -2.7), by using cumulative probabilities. This is depicted below.

Figure 6.15 shows the area that represents P(-2.7 ≤ Z ≤ 1.3).

Figure 6.15: Shaded area represents P(-2.7 ≤ Z ≤ 1.3)

Now, for us to use MINITAB to help find this area, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Again select Cumulative probability with the Input constant being 1.3. Repeat for an Input constant of -2.7. Refer to Figure 6.16.

Click on the OK button and the cumulative probabilities will be computed and displayed in the Session window. These values are shown in Figure 6.16.

Figure 6.16: Display of the results for P(Z ≤ 1.3) and P(Z ≤ -2.7)

So, from Figure 6.16, P(-2.7 ≤ Z ≤ 1.3) = P(Z ≤ 1.3) - P(Z ≤ -2.7) = 0.9032 - 0.0035 = 0.8997.
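
NOTE: Examples 3, 4, and 5 all reduce to combinations of cumulative probabilities. The following is a minimal Python sketch of the three patterns (upper tail, lower tail, and interval). The helper names `upper_tail` and `between` are hypothetical and are used only for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def upper_tail(dist, a):
    """P(X > a), computed as 1 - P(X <= a)."""
    return 1 - dist.cdf(a)


def between(dist, a, b):
    """P(a <= X <= b), computed as P(X <= b) - P(X <= a)."""
    return dist.cdf(b) - dist.cdf(a)


print(round(upper_tail(Z, -2.2), 4))    # Example 3: 0.9861
print(round(Z.cdf(1.8), 4))             # Example 4: 0.9641
print(round(between(Z, -2.7, 1.3), 4))  # Example 5: 0.8997
```

Because Z is continuous, the same functions serve for strict and non-strict inequalities.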

Example 6: A normal random variable X has a mean μ = 100 and a standard deviation σ = 2. Use MINITAB to determine the probability that X is greater than 95.

NOTE: MINITAB will compute probabilities for any normal distribution. It is not restricted to only the standard normal distribution.

Figure 6.17 shows the area that represents P(X > 95).

Figure 6.17: Shaded area represents P(X > 95)

Now, for us to use MINITAB to help find this area, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Select Cumulative probability with the Input constant being 95. Make sure that you enter a mean of 100 and a standard deviation of 2 in the Mean and Standard deviation text boxes. You will need to do this since X is normal with a mean of 100 and a standard deviation of 2. Refer to Figure 6.18.

Figure 6.18: Normal Distribution dialog box with the Cumulative Probability selections

Click on the OK button and the cumulative probabilities will be computed and displayed in the Session window. These values are shown in Figure 6.19.

Figure 6.19: Display of the results for P(X ≤ 95)

Now, P(X > 95) = 1 - P(X ≤ 95). Then, from Figure 6.19,

P(X > 95) = 1 - P(X ≤ 95) = 1 - 0.0062 = 0.9938.

Example 7: A normal random variable X has a mean μ = 100 and a standard deviation σ = 2. Use MINITAB to determine the probability that X is less than 102.

NOTE: MINITAB will compute probabilities for any normal distribution. It is not restricted to only the standard normal distribution.

Figure 6.20 shows the area that represents P(X < 102).

Figure 6.20: Shaded area represents P(X < 102)

Again, for us to use MINITAB to help find this area, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Select Cumulative probability with the Input constant being 102. Make sure that you enter a mean of 100 and a standard deviation of 2 in the Mean and Standard deviation text boxes. Again, you will need to do this since X is normal with a mean of 100 and a standard deviation of 2. Refer to Figure 6.18 except now the input constant will be 102.

Click on the OK button and the cumulative probabilities will be computed and displayed in the Session window. These values are shown in Figure 6.21.

Figure 6.21: Display of the results for P(X < 102)

From Figure 6.21, P(X < 102) = 0.8413.

Example 8: A normal random variable X has a mean μ = 100 and a standard deviation σ = 2. Use MINITAB to determine the probability that X lies between 94 and 104.

Observe that we can write P(94 ≤ X ≤ 104) = P(X ≤ 104) - P(X ≤ 94), by using cumulative probabilities. This is depicted below.

Figure 6.22 shows the area that represents P(94 ≤ X ≤ 104).

Figure 6.22: Shaded area represents P(94 ≤ X ≤ 104)

Now, for us to use MINITAB to help find this area, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Again select Cumulative probability with the Input constant being 104. Repeat for an Input constant of 94. Make sure that you enter a mean of 100 and a standard deviation of 2 in the Mean and Standard deviation text boxes. Again, you will need to do this since X is normal with a mean of 100 and a standard deviation of 2.

Click on the OK button and the cumulative probabilities will be computed and displayed in the Session window. These values are shown in Figure 6.23.

Figure 6.23: Display of the results for P(X ≤ 104) and P(X ≤ 94)

So, from Figure 6.23, P(94 ≤ X ≤ 104) = P(X ≤ 104) - P(X ≤ 94) = 0.9772 - 0.0013 = 0.9759.
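
NOTE: Examples 6, 7, and 8 use a normal distribution that is not standard. The following is a minimal Python sketch showing that entering the mean and standard deviation directly gives the same cumulative probability as first converting to a Z-score. It is an optional cross-check outside MINITAB.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=2)  # the distribution used in Examples 6 to 8
Z = NormalDist()                 # standard normal, for comparison

# Example 6: P(X > 95) = 1 - P(X <= 95)
p_gt_95 = 1 - X.cdf(95)

# Same cumulative probability via the Z-score z = (95 - 100) / 2 = -2.5
z = (95 - X.mean) / X.stdev
p_le_95_via_z = Z.cdf(z)

# Example 7: P(X < 102)
p_lt_102 = X.cdf(102)

# Example 8: P(94 <= X <= 104) = P(X <= 104) - P(X <= 94)
p_94_to_104 = X.cdf(104) - X.cdf(94)

print(round(p_gt_95, 4))         # 0.9938
print(round(p_le_95_via_z, 4))   # 0.0062
print(round(p_lt_102, 4))        # 0.8413
print(round(p_94_to_104, 4))     # 0.9759
```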

3. FINDING A VALUE OF THE NORMAL RANDOM VARIABLE

In the previous section we found probabilities associated with values of normal random variables. In this section, we will use MINITAB to help find values of any normal random variable when we are given the associated probabilities.

NOTE: The feature in MINITAB that allows us to find corresponding values for a random variable in a probability distribution is the Inverse cumulative probability option in the distribution.

Example 9: Assume that IQ scores have a normal distribution with a mean μ = 100 and a standard deviation σ = 15. Use MINITAB to determine the 75th percentile of the IQ distribution.

Let the random variable X represent the IQ score. Let x0 represent the IQ score we are required to determine.

Now, since x0 represents the 75th percentile for the IQ scores, it means that this value of x0 will be larger than 75% of all the IQ scores (when ordered from smallest to largest). Symbolically, we can write P(X ≤ x0) = 0.75.

Figure 6.24 shows this relationship.

Figure 6.24: Shaded area represents P(X ≤ x0) = 0.75

For us to use MINITAB to help find the value for x0, select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear.

Select Inverse cumulative probability with the Input constant being 0.75. Make sure that you enter a mean of 100 and a standard deviation of 15 in the Mean and Standard deviation text boxes respectively. You will need to do this since X, the IQ score, is normally distributed with a mean of 100 and a standard deviation of 15. Figure 6.25 shows the dialog box with the appropriate entries.

Figure 6.25: Dialog box with entries to help determine x0 in the relationship P(X ≤ x0) = 0.75

Click on the OK button and the inverse cumulative probability value will be computed and displayed in the Session window. This value is shown in Figure 6.26.

Figure 6.26: Session window output with value for x0 in the relationship P(X ≤ x0) = 0.75

From Figure 6.26, P(X ≤ x0) = 0.75 implies that x0 = 110.1173.
That is, the 75th percentile for the IQ scores is approximately equal to 110.
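
NOTE: The Inverse cumulative probability option has a direct counterpart in the Python standard library, the `inv_cdf` method of `statistics.NormalDist`. The following minimal sketch is an optional cross-check of Example 9 and is not part of the MINITAB procedure.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # IQ scores as described in Example 9

# 75th percentile: the value x0 with P(X <= x0) = 0.75
x0 = iq.inv_cdf(0.75)
print(round(x0, 4))          # 110.1173

# Feeding x0 back into the cumulative probability recovers 0.75.
print(round(iq.cdf(x0), 4))  # 0.75
```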

Example 10: Assume that IQ scores have a normal distribution with a mean μ = 100 and a standard deviation σ = 15. Use MINITAB to determine the range for the middle 80% of the IQ scores.

Let the random variable X represent the IQ score. Let x1 represent the lower end point for the interval of IQ scores and let x2 represent the upper end point for the interval of IQ scores.

Now, since we are dealing with the middle 80% of the IQ scores, symbolically, we can write P(x1 ≤ X ≤ x2) = 0.80.

Figure 6.27 shows this relationship.

Figure 6.27: Shaded area represents P(x1 ≤ X ≤ x2) = 0.80

Now, since the area between x1 and x2 is 0.8, the total area of the un-shaded portions in Figure 6.27 is 0.2. From the symmetry of the normal distribution, the area to the left of x1 will be equal to the area to the right of x2. These areas in the tails of the distribution will each be equal to 0.1. This is depicted in Figure 6.28.

Figure 6.28: Display with all areas marked

From Figure 6.28 observe that the total area to the left of x2 will be equal to 0.9. Thus we can use the Inverse cumulative probability feature to obtain x2. Select Calc → Probability Distributions → Normal. The Normal Distribution dialog box will appear. Select Inverse cumulative probability with the Input constant being 0.90. Make sure that you enter a mean of 100 and a standard deviation of 15 in the Mean and Standard deviation text boxes respectively. You will need to do this since X, the IQ score, is normally distributed with a mean of 100 and a standard deviation of 15. Refer to Figure 6.25 for an example of the dialog box.

Click on the OK button and the inverse cumulative probability value will be computed and displayed in the Session window. The value of 119.2233, which corresponds to the value of x2, is shown in Figure 6.29.

Figure 6.29: Session window output with value for x2 in the relationship P(X ≤ x2) = 0.9

To obtain the value for x1, repeat with the input constant being equal to 0.1. The value of 80.7767, which corresponds to the value of x1, is shown in Figure 6.30.

Figure 6.30: Session window output with value for x1 in the relationship P(X ≤ x1) = 0.1

Thus the range for the middle 80% of the IQ scores is obtained by subtracting the value of x1 from the value of x2. That is, the range = 119.2233 - 80.7767 = 38.4466, or approximately 38.
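
NOTE: The tail-area reasoning of Example 10 generalizes to any central proportion. The following is a minimal Python sketch, assuming a hypothetical helper named `middle_interval`. It is an optional cross-check outside MINITAB.

```python
from statistics import NormalDist


def middle_interval(dist, proportion):
    """Return (x1, x2) such that the central `proportion` of the distribution lies between them."""
    tail = (1 - proportion) / 2          # area in each tail, 0.1 for the middle 80%
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


iq = NormalDist(mu=100, sigma=15)
x1, x2 = middle_interval(iq, 0.80)

print(round(x1, 4))  # 80.7767
print(round(x2, 4))  # 119.2233
print(x2 - x1)       # width of the interval, about 38.45
```

Computing the width from unrounded end points can differ in the last decimal place from subtracting the two rounded Session window values.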

4. DETERMINING WHETHER A SET OF DATA WAS SAMPLED FROM AN APPROXIMATE NORMAL DISTRIBUTION

There are several techniques that can be used to establish whether a sample set of data values came from a normal distribution. These include the use of

• Histograms
• Stem-and-leaf plots
• Box plots
• Empirical Rule
• Normal probability plots
• Interquartile Range (IQR)/standard deviation ≈ 1.3

The first four techniques have been discussed in previous explorations. Here we will discuss normal probability plots.

Recall, a normal probability plot for a set of data is a scatter plot with the ranked data values along the vertical axis and the corresponding z-values from a standard normal distribution along the horizontal axis.

Also, when the data are normally distributed, a linear (straight line) trend will result in the normal probability plot. A nonlinear trend in the normal probability plot will suggest that the data are non-normal.
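
NOTE: The construction of a normal probability plot described above can be sketched by hand. The following minimal Python example ranks a small sample and pairs each ranked value with a z-value. Several assumptions apply: the plotting positions (i - 0.375)/(n + 0.25) are one common convention and MINITAB's own choice may differ; the sample is just the first ten forearm measurements from the homework data in this exploration; and the helper names `normal_scores` and `pearson_r` are hypothetical.

```python
from statistics import NormalDist, mean


def normal_scores(n):
    """z-values for ranks 1..n, using plotting positions (i - 0.375) / (n + 0.25) (an assumed convention)."""
    Z = NormalDist()
    return [Z.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def pearson_r(xs, ys):
    """Correlation coefficient, used here as a rough measure of how straight the plot is."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# First ten forearm measurements (inches) from the homework data set.
sample = [17.3, 18.4, 20.9, 16.8, 18.7, 20.5, 17.9, 20.4, 18.3, 20.5]

ranked = sorted(sample)                  # vertical axis: ranked data values
scores = normal_scores(len(ranked))      # horizontal axis: corresponding z-values
points = list(zip(scores, ranked))       # the points of the normal probability plot

r = pearson_r(scores, ranked)
print(points[0], points[-1])
print(round(r, 3))
```

A correlation close to 1 corresponds to a nearly linear trend in the plot; a visual check of the plot is still needed, since a single number can hide curvature.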

Example 11: Open the worksheet called Trees. This data file is one of the many data files that are packaged with the MINITAB software.

When the worksheet is opened, there will be data values listed for the three variables: Diameter, Height, and Volume.

Use MINITAB to construct a normal probability plot for the variable Height.

To produce a normal probability plot for the variable Height, select Graph → Probability Plot. In the resulting dialog box, select Height for the Variables text box and select Normal as the Distribution. These selections are shown in Figure 6.31.

NOTE: Under the Options button in the dialog box, deselect the following two features: Display table of percentile estimates and Include confidence intervals in plot.

Figure 6.31: Probability Plot dialog box with Entries to Construct a Normal Probability Plot for the variable Height

Click on the OK button and the normal probability plot will be generated as shown in Figure 6.32.

Figure 6.32: Normal Probability Plot for the variable Height

Observe that the plot exhibits a linear trend. This would suggest that the sample values of the tree heights came from a normal distribution.

Figure 6.33 shows the normal probability plot for the variable Volume. Observe that the plot does not follow a straight-line pattern. This would indicate that the sample values for the tree volumes did not come from a normal distribution.

Figure 6.33: Normal Probability Plot for the variable Volume

5. NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

In this section, we will investigate the normal approximation for the binomial distribution.

Recall, if X is a binomial random variable with n trials and probability of success p, then the

• mean for X is $\mu = np$,
• standard deviation for X is $\sigma = \sqrt{np(1-p)}$.

Suppose that Y is a normal random variable with mean $\mu = np$ and standard deviation $\sigma = \sqrt{np(1-p)}$.

• Under minimal conditions such as np > 5 and n(1 - p) > 5, the theory shows that the distributions of the random variables X and Y have very similar shapes.
• In addition, the probability P(X ≤ x0) for the binomial random variable X can be shown to be close to the probability P(Y ≤ x0 + 0.5) for the normal random variable Y.
• The add-on of the 0.5 is commonly called the continuity correction factor that is applied when converting from the discrete binomial random variable X to the continuous normal random variable Y.
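
NOTE: The effect of the continuity correction can be seen numerically. The following is a minimal Python sketch comparing an exact binomial cumulative probability with its normal approximation, with and without the 0.5 add-on. The values n = 20, p = 0.5, and x0 = 8 are hypothetical and chosen only for illustration, and the helper name `binomial_cdf` is an assumption.

```python
import math
from statistics import NormalDist


def binomial_cdf(x0, n, p):
    """Exact P(X <= x0) for a binomial random variable with n trials and success probability p."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x0 + 1))


# Hypothetical illustration values; np = 10 > 5 and n(1 - p) = 10 > 5.
n, p, x0 = 20, 0.5, 8

# Normal random variable Y with the same mean and standard deviation as X.
Y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

exact = binomial_cdf(x0, n, p)   # P(X <= 8) for the binomial
corrected = Y.cdf(x0 + 0.5)      # P(Y <= 8.5), with continuity correction
uncorrected = Y.cdf(x0)          # P(Y <= 8), without the correction

print(exact, corrected, uncorrected)
print(abs(corrected - exact) < abs(uncorrected - exact))  # True
```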

We will use MINITAB to demonstrate this approximation of the binomial random variable with the normal random variable through the following example.

Example 12: A 1995 report from a Harvard University study claims that 34.9% of people in the age group of 25 to 29 years of age were homeowners. Suppose that a random sample of 100 from this age group is polled.

• Let X represent the number in the sample who turn out to be homeowners.

Here X is a binomial random variable since:

• Each person in the sample will either be a homeowner or not. Thus we have two possible outcomes for each trial (person in sample).
• There is a fixed number of trials. In this case n = 100.
• The probability that a person in this age group is a homeowner is p = 0.349.
• We will assume that each person is a homeowner independently of the next person in the sample.
• The mean for the random variable X is $\mu = np = (100)(0.349) = 34.9$.
• The standard deviation for the random variable X is $\sigma = \sqrt{np(1-p)} = \sqrt{(100)(0.349)(0.651)} \approx 4.7665$.

Let the normal random variable Y have the same mean and standard deviation.

Note:

• np = 34.9 > 5
• n(1 - p) = 65.1 > 5

Since both conditions are satisfied, the approximation should hold well.

We will use MINITAB to simulate 500 values from the binomial distribution with n = 100 and p = 0.349, which has the mean and standard deviation computed above. Place headings of x-value, P(x-value), P(y-value) in columns C1, C2, and C3 respectively.

Refer to Exploration 5 to see how to simulate values and probabilities from the binomial distributions.

To compute the P(y-value) in column C3, use the 500 binomial values in C1 as the input column and use MINITAB to compute the densities for these values as normal values.

NOTE: In both simulations we are using the same mean and standard deviation.

Next, plot the P(x-value) and P(y-value) along the vertical axis and the x-value along the horizontal axis. You will need to superimpose both graphs on the set of axes.

For one of the graphs you may use a symbol to plot the points and for the other graph you may want to connect the plotted points. This will enable you to see the two different plots.

To superimpose both graphs on the same axes, in the Plot dialog box, select the option of Frame and choose Multiple Graphs. In the Multiple Graphs dialog box select Overlay graphs on the same page. An example of the Multiple Graphs dialog box is shown in Figure 6.34.

Figure 6.34: Multiple Graphs dialog box

A sample plot of the superimposed graphs is shown in Figure 6.35.

Figure 6.35: Superimposed Binomial and Normal approximation

Observe from Figure 6.35 that the curves almost coincide. Thus one can conclude that the normal distribution is approximating the binomial probabilities very well.
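
NOTE: The agreement seen in the superimposed graphs can also be checked numerically. The following is a minimal Python sketch for Example 12. Unlike the MINITAB procedure, it does not simulate 500 values; as a simplifying assumption it evaluates the binomial probabilities and the normal densities at every possible count from 0 to 100. The helper name `binomial_pmf` is hypothetical.

```python
import math
from statistics import NormalDist

n, p = 100, 0.349                    # sample size and homeowner probability from Example 12
mu = n * p                           # 34.9
sigma = math.sqrt(n * p * (1 - p))   # about 4.7665
Y = NormalDist(mu, sigma)            # normal random variable with the same mean and standard deviation


def binomial_pmf(k):
    """P(X = k) for the binomial random variable X of Example 12."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# One row per possible count: (x-value, P(x-value), P(y-value)).
rows = [(k, binomial_pmf(k), Y.pdf(k)) for k in range(n + 1)]

# Largest vertical gap between the two curves.
max_gap = max(abs(px - py) for _, px, py in rows)

for k, px, py in rows[25:46:5]:
    print(k, round(px, 4), round(py, 4))
print("largest gap:", max_gap)
```

A small largest gap, relative to the peak height of roughly 0.08, is consistent with the two curves nearly coinciding in Figure 6.35.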

NOTES

EXPLORATION #6: HOMEWORK ASSIGNMENT

Name: _____________________ Date: ______________________

Course #: ___________________ Instructor: _________________

1. A bag of a certain brand of popcorn is placed in a microwave oven and the time it takes a kernel to pop is observed. Suppose that the time to pop, X, has a normal distribution with mean μ = 115 seconds and a standard deviation σ = 23 seconds.
   (a) If a kernel of this brand of popcorn is selected at random, what is the probability that it will take more than 121 seconds to pop? Shade in an appropriate area under the given curve to indicate the required probability of P(X > 121). Use the normal probability distribution features of MINITAB to help find the probability.
   P(X > 121) = ______________________________

   (b) If a kernel of this brand of popcorn is selected at random, what is the probability that it will take less than 110 seconds to pop? Shade in an appropriate area under the given curve to indicate the required probability of P(X < 110). Use the normal probability distribution features of MINITAB to help find the probability.
   P(X < 110) = ______________________________

   (c) If a kernel of this brand of popcorn is selected at random, what is the probability that it will take between 100 seconds and 145 seconds to pop? Shade in an appropriate area under the given curve to indicate the required probability of P(100 < X < 145). Use the normal probability distribution features of MINITAB to help find the probability.
   P(100 < X < 145) = ______________________________

2. The following data involve the forearm measurements, in inches, of 140 adult males. Reference: Pearson, K. and Lee, A., 1903, On the Laws of Inheritance of Physical Characteristics, Biometrika, 2, pp. 357-462.

 17.3 18.4 20.9 16.8 18.7 20.5 17.9 20.4 18.3 20.5 19 17.5 18.1 17.1 18.8 20 19.1 19.1 17.9 18.3 18.2 18.9 19.4 18.9 19.4 20.8 17.3 18.5 18.3 19.4 19 19 20.5 19.7 18.5 17.7 19.4 18.3 19.6 21.4 19 20.5 20.4 19.7 18.6 19.9 18.3 19.8 19.6 19 20.4 17.3 16.1 19.2 19.6 18.8 19.3 19.1 21 18.6 18.3 18.3 18.7 20.6 18.5 16.4 17.2 17.5 18 19.5 19.9 18.4 18.8 20.1 20 18.5 17.5 18.5 17.9 17.4 18.7 18.6 17.3 18.8 17.8 19 19.6 19.3 18.1 18.5 20.9 19.3 18.1 17.1 19.8 20.6 17.6 19.1 19.5 18.4 17.7 20.2 19.9 18.6 16.6 19.2 20 17.4 17.1 18.3 19.1 18.5 19.6 18 19.4 17.1 19.9 16.3 18.9 20.7 19.7 18.5 18.4 18.7 19.3 16.3 16.9 18.2 18.5 19.3 18.1 18 19.5 20.3 20.1 17.2 19.5 18.8 19.2 17.7

   (a) Use MINITAB to compute descriptive statistics for the data.

   Mean: ___________________________.

   Median: ___________________________.

   Mode: ___________________________.

   Standard Deviation: ___________________________.

   (b) Compare the values of the mean and the median. Discuss.
   (c) Construct a normal probability plot for the data set. Provide a hard copy of the output.
   (d) From the plot, can you conclude whether the sample came from a normal distribution? Discuss.

3. Use MINITAB to help find the required probabilities and shade in an appropriate area on the given normal curve. Also, indicate where the z-value is located on the horizontal axis.
   (a) What is the value of P(Z > 1.72)? ___________________________
   (b) What is the value of P(Z ≤ 1.09)? ___________________________
   (c) What is the probability of a standard normal score being less than -1.33?
   P(Z < -1.33): _______________________

   (d) Find P(-2.03 < Z ≤ -0.5). ___________________________
   (e) Find P(-1 < Z < 1). ____________________________
   How would you interpret this probability? Discuss.

   (f) Find P(-2 < Z < 2). ____________________________
   How would you interpret this probability? Discuss.

   (g) Find P(-3 < Z < 3). ____________________________
   How would you interpret this probability? Discuss.

1. Based on your discussions in parts (e), (f), and (g) in problem number 3, postulate a general rule for normal or approximate normal distributions. Relate the rule to the proportion of values that will be between 1, 2, 3 standard deviations from the mean for any normal distribution.
2. Many natural phenomena that humans observe are approximately normally distributed. Based on observations, it can be assumed that human intelligence is approximately distributed. The graph below illustrates the comparisons of the standard normal scores (z-scores) and the IQ scores.
1. What is the value of the mean IQ score from the graph?

2. Mean: ___________________________.

3. What is the value of the standard deviation for the IQ scores?

4. Hint: A value of z = 1 corresponds to an IQ score, X, of 115 and . You need to solve for s.
Standard deviation s: ____________________________.

5. If a person is chosen at random and tested, what is the probability of that person having an IQ score greater than 117. Use MINITAB to help find the probability.
6. P(X > 117): _________________________(where X is the IQ score).

7. If a person is chosen at random and tested, what is the probability of that person having an IQ score less than 72. Use MINITAB to help find the probability.
8. P(X < 72): _________________________ (where X is the IQ score).

9. If a person is chosen at random and tested, what is the probability of that person having an IQ score between 85 and 115. Use MINITAB to help find the probability.
10. P(85 < X < 115): _________________________ (where X is the IQ score).

11. What IQ score will correspond to the 50th percentile. Use MINITAB to help find the percentile.
12. 50th Percentile IQ score: _________________________ (where X is the IQ score).

13. What IQ score corresponds to the 85th percentile? Use MINITAB to help find the percentile.
14. 85th Percentile IQ score: _________________________ (where X is the IQ score).
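
Note: As a cross-check on the MINITAB results, the IQ probabilities and percentiles can be sketched in Python's standard library. This sketch assumes a mean of 100 (the graph is not reproduced here, so that value is an assumption) and a standard deviation of 15, which follows from the hint that z = 1 corresponds to X = 115 if the mean is 100.

```python
from statistics import NormalDist

# Assumed parameters: mean 100 (not confirmed by the text, the graph is missing)
# and standard deviation 15 (from z = 1 <-> X = 115 when the mean is 100).
iq = NormalDist(mu=100, sigma=15)

p_above_117 = 1 - iq.cdf(117)            # P(X > 117)
p_below_72 = iq.cdf(72)                  # P(X < 72)
p_85_to_115 = iq.cdf(115) - iq.cdf(85)   # P(85 < X < 115)
pct_50 = iq.inv_cdf(0.50)                # 50th percentile
pct_85 = iq.inv_cdf(0.85)                # 85th percentile

print(f"P(X > 117)      = {p_above_117:.4f}")
print(f"P(X < 72)       = {p_below_72:.4f}")
print(f"P(85 < X < 115) = {p_85_to_115:.4f}")
print(f"50th percentile = {pct_50:.2f}")
print(f"85th percentile = {pct_85:.2f}")
```

If the mean read from the graph differs from 100, only the two arguments of `NormalDist` need to change.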

1. The following table (National Report) gives the 1998 Profile of College Bound Seniors. It shows the means and standard deviations for the SAT I Math scores for different ethnic groups. This data can be observed at http://www.collegeboard.org/sat/cbsenior/yr1998/nat/natbk298.html

SAT I Mean Scores and Standard Deviations for Males, Females, and Total by Ethnic Group

1. If a scholarship is available to students with SAT I math test scores above the 82nd percentile, what is the score needed for female students for the different groups to be eligible for the scholarship?
| Ethnic Group | 82nd Percentile Score |
|---|---|
| American Indian or Alaskan Native | |
| Asian, Asian American, or Pacific Islander | |
| African American or Black | |
| Mexican or Mexican American | |
| Puerto Rican | |
| Latin American, South American, Central American, or Other Hispanic or Latino | |
| White | |
| Other | |

1. Would using the 82nd percentile scores for the different ethnic groups be appropriate to award scholarships for the female students in these groups? Discuss.

1. The following data set summarizes the chest sizes of Scottish militiamen in the early 19th century. Chest sizes are measured in inches, and each observation reports the number of soldiers with that chest size.
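| Frequency, f | Chest Size (inches) |
|---|---|
| 3 | 33 |
| 18 | 34 |
| 81 | 35 |
| 185 | 36 |
| 420 | 37 |
| 749 | 38 |
| 1073 | 39 |
| 1079 | 40 |
| 934 | 41 |
| 658 | 42 |
| 370 | 43 |
| 92 | 44 |
| 50 | 45 |
| 21 | 46 |
| 4 | 47 |
| 1 | 48 |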

1. Use MINITAB to help plot the information given on chest size. Let the vertical (y) axis represent the frequencies and the horizontal (x) axis represent the chest size.
2. Provide a hard copy of the graph.

3. Describe the shape of the distribution for the chest sizes.
4. What is the mean chest size for the data?
5. Mean: _________________________.

Note: To enter the chest size values into MINITAB in a single column, we need to generate 3 values of 33, 18 values of 34, 81 values of 35, etc., in separate columns and then stack them in one column. Use the sequence Calc → Make Patterned Data → Arbitrary Set of Numbers to generate the different sets of chest sizes. For instance, in order to generate 3 values of 33 in column C1, the following dialog box shows the appropriate entries.

Repeat for all other chest sizes and save in columns C2 to C16.

Next, stack the generated values in column C17 by selecting Manip → Stack → Stack Columns and fill in the appropriate boxes in the dialog box as shown below.

This stacks all the observed chest size values in column C17. Now descriptive statistics can be computed for all the chest sizes in column C17.

6. Present a normal probability plot for the values in C17. What can you infer about the data set from the plot? Discuss.
7. If this data is considered as the population, what is the standard deviation for the chest size?
8. Note: To compute the population standard deviation from the sample standard deviation given by the descriptive statistics in MINITAB, you need to multiply it by $\sqrt{(n-1)/n}$. For this data set, n = 5738.

Standard deviation (σ_X): _____________________________.
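
Note: The stacking step and the conversion from the sample to the population standard deviation can be sketched outside MINITAB as follows. This is only an illustration in Python's standard library; the variable names are not part of the activity.

```python
from math import sqrt
from statistics import mean, stdev, pstdev

# Frequency table from the activity: chest size (inches) -> number of soldiers.
freq = {33: 3, 34: 18, 35: 81, 36: 185, 37: 420, 38: 749, 39: 1073, 40: 1079,
        41: 934, 42: 658, 43: 370, 44: 92, 45: 50, 46: 21, 47: 4, 48: 1}

# "Stack" the data: repeat each chest size as many times as it was observed.
chest = [size for size, count in freq.items() for _ in range(count)]

n = len(chest)
xbar = mean(chest)
sample_sd = stdev(chest)                 # sample formula, divides by n - 1
pop_sd = sample_sd * sqrt((n - 1) / n)   # rescaled so that it divides by n

print(f"n = {n}, mean = {xbar:.4f}")
print(f"sample sd = {sample_sd:.4f}, population sd = {pop_sd:.4f}")
```

The rescaled value agrees with `pstdev(chest)`, which computes the population standard deviation directly.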

9. If this data is considered as the population, what is the probability that a randomly selected militiaman will have a chest size greater than 38 inches?
10. Probability: _____________________________.

What assumption(s) are you making in computing the probability? Refer to parts (b) and (d). Discuss.

11. If this data is considered as the population, what is the probability that a randomly selected militiaman will have a chest size less than 44 inches?
12. Probability: _____________________________.

Display the mean chest size and the value of the given chest size and shade the appropriate probability (area).

13. If this data is considered as the population, what is the probability that a randomly selected militiaman will have a chest size between 36 inches and 43 inches?
14. Probability: _____________________________.

Display the mean chest size and the values of the given chest size and shade the appropriate probability (area).
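
Note: The three chest size probabilities can be cross-checked with a short Python sketch. It assumes, as the activity does, that chest size is normally distributed with the population mean and standard deviation computed from the frequency table; the name `chest_model` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Frequency table from the activity: chest size (inches) -> number of soldiers.
freq = {33: 3, 34: 18, 35: 81, 36: 185, 37: 420, 38: 749, 39: 1073, 40: 1079,
        41: 934, 42: 658, 43: 370, 44: 92, 45: 50, 46: 21, 47: 4, 48: 1}

n = sum(freq.values())
mu = sum(size * count for size, count in freq.items()) / n
# Population standard deviation (divides by n, not n - 1).
sigma = sqrt(sum(count * (size - mu) ** 2 for size, count in freq.items()) / n)

# Assumption: chest size is modeled as a normal random variable.
chest_model = NormalDist(mu, sigma)

p_gt_38 = 1 - chest_model.cdf(38)                      # P(X > 38)
p_lt_44 = chest_model.cdf(44)                          # P(X < 44)
p_36_43 = chest_model.cdf(43) - chest_model.cdf(36)    # P(36 < X < 43)

print(f"P(X > 38)      = {p_gt_38:.4f}")
print(f"P(X < 44)      = {p_lt_44:.4f}")
print(f"P(36 < X < 43) = {p_36_43:.4f}")
```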

1. Consider the following normal random variables (X) with the given means and standard deviations.
1. Compute the probabilities that each variable will be between one, two, and three standard deviations from the respective means.
| Mean, μ | Standard deviation, σ | P(μ − σ < X < μ + σ) | P(μ − 2σ < X < μ + 2σ) | P(μ − 3σ < X < μ + 3σ) |
|---|---|---|---|---|
| 100 | 16 | | | |
| 34 | 3.5 | | | |
| 8.3 | 0.125 | | | |
| 55 | 5 | | | |
3. What are your observations from these computed probabilities? Discuss.
4. Try to generalize your observations in part (b) for any normal random variable.
5. The Empirical Rule for any Normal Distribution with mean μ and standard deviation σ.
6. Approximately 68% of the observations will fall in the interval (μ − 1σ, μ + 1σ)

Approximately 95% of the observations will fall in the interval (μ − 2σ, μ + 2σ)

Approximately 99.7% of the observations will fall in the interval (μ − 3σ, μ + 3σ)
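
Note: The rule can be checked numerically for the four normal variables in the table of this activity. The following is a minimal Python sketch (standard library only, not MINITAB); the helper `within_k_sd` is an illustrative name.

```python
from statistics import NormalDist

# (mean, standard deviation) pairs from the table in this activity.
params = [(100, 16), (34, 3.5), (8.3, 0.125), (55, 5)]

def within_k_sd(mu, sigma, k):
    """Return P(mu - k*sigma < X < mu + k*sigma) for a normal X."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

table = {(mu, sigma): [within_k_sd(mu, sigma, k) for k in (1, 2, 3)]
         for mu, sigma in params}

for (mu, sigma), row in table.items():
    print(mu, sigma, [round(p, 4) for p in row])
```

Every row gives the same three probabilities because standardizing X removes the dependence on the mean and the standard deviation.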

7. Verify the Empirical Rule for the militiamen data in Activity 7. Let X represent the chest size variable. Note: Here we are assuming that the variable for the chest sizes is normally distributed. Use the mean and standard deviation for the population as computed in Activity 7.

P(μ − 1σ < X < μ + 1σ) = ____________________________

P(μ − 2σ < X < μ + 2σ) = ____________________________

P(μ − 3σ < X < μ + 3σ) = ____________________________

1. Suppose that the lifetime X of a randomly tested battery describes an exponential distribution having a mean life of 1200 hours.
2. Note: An exponential distribution is positively skewed and is defined for a random variable X ≥ 0. The density function is given by $f(x) = \frac{1}{\mu} e^{-x/\mu}$ for $x \ge 0$, where μ is the mean.

1. Use MINITAB to help display a graph of the density function for X over the interval [0, 5000]. Use Calc → Make Patterned Data → Simple Set of Numbers to store 0, 50, 100, 150, …, 5000 in column C1 with the label Lifetime. The dialog box is shown below.
2. Next we need to generate exponential density values for the generated values in column C1. To achieve this, select Calc → Probability Distributions → Exponential. Fill in the dialog box as shown below such that the densities are saved in column C2. Observe that we are using a mean of 1200.

Next plot the values in C2 versus the values in C1. Let the values in C2 be along the vertical (Y) axis and the values in C1 be along the horizontal (X) axis.

Provide a hard copy of the graph with this assignment.
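
Note: The two columns built above can be reproduced outside MINITAB. The sketch below uses Python's standard library and assumes the usual exponential density with mean 1200, that is, f(x) = (1/1200) e^(-x/1200) for x ≥ 0; the names `lifetime` and `density` simply mirror columns C1 and C2.

```python
from math import exp

MEAN_LIFE = 1200  # mean battery life in hours, from the activity

def exp_density(x, mean=MEAN_LIFE):
    """Exponential density with the given mean; zero for negative x."""
    return exp(-x / mean) / mean if x >= 0 else 0.0

lifetime = list(range(0, 5001, 50))            # 0, 50, 100, ..., 5000 (column C1)
density = [exp_density(x) for x in lifetime]   # densities (column C2)

# Show the first few rows of the two columns.
for x, fx in list(zip(lifetime, density))[:5]:
    print(f"{x:5d}  {fx:.6f}")
```

Plotting `density` against `lifetime` with any graphing tool gives the same decreasing, positively skewed curve as the MINITAB plot.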

3. Next, compute the cumulative probabilities for these exponential values and save them in column C3. This can be achieved by selecting Calc → Probability Distributions → Exponential and filling in the dialog box as follows. Note that column C3 has been renamed CumProb.
4. Use the cumulative probabilities in column C3 to determine the following percentages of batteries:

| Requested Percentages | Percentages |
|---|---|
| % that are expected to fail before 200 hours | |
| % that are expected to fail before 2000 hours | |
| % that are expected to fail after 500 hours | |
| % that are expected to fail after 1500 hours | |
| % that are expected to fail between 500 and 1500 hours | |
| % that are expected to fail between 1000 and 2000 hours | |
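
Note: The requested percentages can be cross-checked with the exponential cumulative distribution function. The sketch below is a Python illustration, assuming F(x) = 1 − e^(-x/1200) for a mean life of 1200 hours; the dictionary keys are illustrative labels.

```python
from math import exp

MEAN_LIFE = 1200  # mean battery life in hours, from the activity

def exp_cdf(x, mean=MEAN_LIFE):
    """P(X <= x) for an exponential variable with the given mean."""
    return 1 - exp(-x / mean) if x > 0 else 0.0

percentages = {
    "before 200": 100 * exp_cdf(200),
    "before 2000": 100 * exp_cdf(2000),
    "after 500": 100 * (1 - exp_cdf(500)),
    "after 1500": 100 * (1 - exp_cdf(1500)),
    "between 500 and 1500": 100 * (exp_cdf(1500) - exp_cdf(500)),
    "between 1000 and 2000": 100 * (exp_cdf(2000) - exp_cdf(1000)),
}

for label, pct in percentages.items():
    print(f"% expected to fail {label}: {pct:.2f}")
```

"After" percentages use the complement 1 − F(x), and "between" percentages use the difference of two cumulative probabilities, which is the same reading of column C3 that the activity asks for.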

1. This activity will investigate the normal approximation to the binomial distribution.

Recall that a binomial random variable X with parameters n and p has

• a mean $\mu = np$ and
• a standard deviation $\sigma = \sqrt{np(1 - p)}$

where n is the number of trials in the binomial experiment and p is the constant probability of success for each trial.

Suppose that a normal random variable X* has the same mean and standard deviation as the binomial random variable.

MINITAB can be used to demonstrate that under certain minimal conditions, such as np > 5 and n(1 − p) > 5, the normal and the binomial distributions have very similar shapes, which indicates that the corresponding probabilities are approximately equal. The approximation gets better as n increases.

Also, a probability of the form $P(X \le a)$ for the binomial distribution can be approximated by $P(X^* \le a + 0.5)$ from the corresponding normal distribution.

Note: The 0.5 value that was added is called the continuity correction factor that is applied when converting from a discrete distribution to a continuous distribution.

A recent UCLA Center for Communication Policy study found that 76% of all Internet users check their e-mails daily. Suppose that 200 Internet users are selected at random and let X represent the number of Internet users who check their e-mails daily.

Note: The mean is μ = 200 × 0.76 = 152 and the standard deviation is $\sigma = \sqrt{200 \times 0.76 \times 0.24} \approx 6.0399$.

Let X* be a normal random variable with the same mean and standard deviation as above.

Note: Recall that for the normal distribution, virtually all (99.7%) of the data values will be within 3 standard deviations of the mean. Thus, virtually all the values will be between 152 − 3 × 6.0399 = 133.8803 and 152 + 3 × 6.0399 = 170.1197. For all practical purposes, almost all the normal values will lie between 133 and 171.
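
Note: The arithmetic above, together with the continuity correction, can be sketched in Python's standard library. The cutoff a = 150 is a hypothetical value chosen only for illustration; it does not come from the activity.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.76                 # sample size and success probability
mu = n * p                       # binomial mean
sigma = sqrt(n * p * (1 - p))    # binomial standard deviation
low, high = mu - 3 * sigma, mu + 3 * sigma

normal = NormalDist(mu, sigma)   # X*: same mean and standard deviation

a = 150  # hypothetical cutoff, for illustration only
# Exact binomial P(X <= a), summed term by term.
binom_cdf = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a + 1))
# Normal approximation with the 0.5 continuity correction added.
normal_approx = normal.cdf(a + 0.5)

print(f"mean = {mu:.0f}, sd = {sigma:.4f}, range = ({low:.4f}, {high:.4f})")
print(f"exact P(X <= {a}) = {binom_cdf:.4f}, approximation = {normal_approx:.4f}")
```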

• Use Calc → Make Patterned Data → Simple Set of Numbers to generate the integer values from 133 to 171 and save them in column C1. Label C1 as x-values.
• Use Calc → Probability Distributions → Binomial to compute the binomial probability values for C1 and save them in column C2. Label as BinProb.
• Use Calc → Probability Distributions → Normal to compute the normal probability densities for C1 and save them in column C3. Label as NormProb.
• Graph C1 versus C2 and C3 and use the Frame option in the Plot dialog box to superimpose the graphs on the same set of axes. Note: Plot with C1 along the horizontal (x) axis and C2 and C3 along the vertical (y) axis.
• Provide a hard copy of the graph and discuss your observations.
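
Note: The three columns built in the steps above can also be generated without MINITAB, which gives a numerical check on how close the two curves are. The following Python sketch uses only the standard library; the list names mirror the column labels, and `max_gap` is an illustrative name.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.76
normal = NormalDist(n * p, sqrt(n * p * (1 - p)))  # same mean and sd as the binomial

x_values = list(range(133, 172))                                      # column C1
bin_prob = [comb(n, x) * p**x * (1 - p)**(n - x) for x in x_values]   # column C2
norm_prob = [normal.pdf(x) for x in x_values]                         # column C3

# Largest vertical distance between the binomial probabilities and the normal curve.
max_gap = max(abs(b - d) for b, d in zip(bin_prob, norm_prob))

for x, b, d in zip(x_values, bin_prob, norm_prob):
    print(f"{x:4d}  {b:.5f}  {d:.5f}")
print(f"largest difference: {max_gap:.5f}")
```

A small value of `max_gap` relative to the peak height is the numerical counterpart of the two graphs nearly coinciding.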