Home | Chapter16: |

Download all Excel spreadsheets for this chapter.

We can use a spreadsheet and the `RANDBETWEEN` function to simulate the tossing of a fair coin 10 times. We will use a '1' to denote a head and a '2' to denote a tail. Figure 16.1 illustrates possible output of tossing a coin 10 times. To generate this spreadsheet, we place the formula

`RANDBETWEEN(1,2)`

in cell `A2` and copy it down column `A` to cells `A3` through `A11`. Next, in cell `C2`, we use the formula

`=COUNTIF(A2:A11,1)`

which counts the number of times a '1' occurs in column `A`.
In cell `D2`, we use the similar formula `=COUNTIF(A2:A11,2)`. The beauty of the spreadsheet is that if we want to run the simulation again in *Excel*, we simple need to hit `CTRL+ALT+F9`.

Suppose now we would like to repeat this experiment 10 times and a bar graph is to be drawn showing how many heads are tossed on each trial. Figure 16.2 shows one possible design.

There are several key formulas in the makeup of this spreadsheet. In cell `B3`, the formula `RANDBETWEEN(1,2)`
is used to generate a heads or a tails (represented by the numbers '1' and '2' respectively). This formula is then copied to cells located in the block `B3` to `K12` in order to generate 100 coin toss outcomes.

Cell `B14` contains the formula

`=COUNTIF(B3:B12,1)`

which counts the number of heads that occured in the first trial (10 flips). Similarly, in cell `B15` we type `=COUNTIF(B3:B12,2)`
which counts how many tails occured in the first trial. Copying these formulas across to cells `B14` through `K15` finishes our counting of heads and tails in each trial.

Finally, a frequency table is made. After row `17` containing the integers between 0 and 10 is generated (how?), we place in cell `B18` the formula

`=COUNTIF(\$B14:\$K14,B17)`

which is promptly copied to cells `B18` through `L18`. The use of the $ in front of the column indices `B` and `K`, but not in front of the column index `B`, allows us to count the frequency with which each number of heads showed up in the 10 trials.

In order to visually see the results of our process, we use the spreadsheet to generate a bar graph using the information found in rows `17` and `18`. The exciting part of our design is that `CTRL+ALT+F9` will *automatically* regenerate both new data and a new bar graph. With only 10 trials, our bar graph is not very nearly **normal**,
but there is little more work involved in extending this process to 100 trials in which case the bar graph does appear to be more normal.

Suppose *N* students take a college entrance exam. Further, suppose the scores on the exam have an approximately normal distribution with mean
points and standard deviation
points. We wish to design a spreadsheet which allows the user to input *N*, , and , and will output to a fourth cell an *estimate* of the first-quartile score *S* for the exam. Also, suppose we wish to output to a fifth cell an estimate of the number of students scoring below the first quartile
on the exam.

We start our spreadsheet by placing the value of *N* in cell `A3`, the value of in `B3`, and the value of in `C3`. Since the first quartile
should be approximately points below the mean ,
we place the formula

`=B3-0.675*C3`

Given a normal distribution
with mean
and standard deviation ,
*Microsoft Excel* has a built in function which computes (approximately) the percent of data having a **standardized value**
(sometimes called a ** z-score**)
less than

`=NORMDIST(0.675,0,1,TRUE)`

As an example of the usefulness of the `NORMDIST`
function, consider the verbal part of the 1996 SAT exam. The scores on the exam fit an approximately normal distibution
with mean =505
and standard deviation =110.
Suppose we would like to estimate the percentage of scores which fell between 500 and 650. The process we would use in *Excel* would be to first compute the standardized values of 500 and 650. For this, we could use the `STANDARDIZE` command. To compute the standardized value
of 500 we could use the formula

`=STANDARDIZE(500,505,110)`

in cell `A1` which would output -0.045455. Similarly, in cell `A2` we could compute the standardized value of 650 to be 1.318182. To compute the percent of standardized values falling between -0.045455 and 1.318182, we could use the formula

`=NORMDIST(A2,0,1,TRUE)-NORMDIST(A1,0,1,TRUE)`

which would output 0.424406. This would mean about 42.4\% of the scores fell between 500 and 650.

**Walking**

**Exercise 16.1** An honest coin is tossed *N* times. Let the random variable *X* represent the number of times the coin comes up heads. Design a spreadsheet in which the user inputs *N* in one cell and the spreadsheet outputs to two other cells the mean
and standard deviation of the random variable *X*.

**Exercise 16.2** A dishonest coin with probability of heads being tossed *p* is tossed *N* times. Let the random variable *X* represent the number of times the coin comes up heads. Design a spreadsheet in which the user inputs the number *N* in one cell and the number *p* in another cell and outputs to two other cells the mean
and standard deviation of the random variable *X*.

**Exercise 16.3** Suppose *N* students take a college entrance exam. Further, suppose the scores on the exam have an approximately normal distribution with mean points and standard deviation points.

(a) Design a spreadsheet which allows the user to inputN, , and , and will output to a fourth cell anestimateof the third-quartile scoreSfor the exam. (b) Design a spreadsheet which allows the user to inputN, , and , and will output to a fourth cell anestimateof the inter-quartile range for the exam. (c) Design a spreadsheet which allows the user to inputN, , and , and will output to two other cellsestimatesof the maximum and minimum for the exam.

**Exercise 16.4** The scores on the verbal part of the 1996 SAT exam fit an approximately normal distribution
and standard deviation =110.
Use a spreadsheet to compute the percent of scores falling between 290 and 720.

**Jogging**

**Exercise 16.5 [Excursions, Chapter 16, Exercises 13, 14]**
Suppose that a normal distribution
has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input , , and a number *x* and will output to a fourth cell the standardized value
of *x*. *Note:* In *Excel*, the command

`STANDARDIZE(x,mean,standard deviation)`

can be used to simplify the process. However, this exercise is for you to design a spreadsheet that acts like the `STANDARDIZE` function.

**Exercise 16.6 [Excursions, Chapter 16, Exercise 17]**
Suppose that a normal distribution has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input , , and a standardized value *z* and will output to a fourth cell the data value having standardized value *z*.
Test your design on input of =183.5, =31.2, and *z*=-1.

**Exercise 16.7 [Excursions, Chapter 16, Exercise 19]**
Suppose that a normal distribution
has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input and a standardized value *z* for another input data value *x* and will output to a fourth cell the standard deviation .
Test your design on input of =50, *x*=84, and *z*=2.

**Exercise 16.8 [Excursions, Chapter 16, Exercise 21]**
Suppose that a normal distribution
has a mean of
and a standard deviation
of . Design a spreadsheet that allows the user to input and a standardized value *z* for another input data value *x* and will output to a fourth cell the mean .
Test your design on input of =15, *x*=50, and *z*=3.

**Exercise 16.9** Suppose that a normal distribution
has a standard deviation
of . Design a spreadsheet that allows the user to input and two numbers *x* and *y* (*y*>*x*) and will output to a fourth cell how far apart *x* and *y* are (as measured in standard deviations).

**Exercise 16.10** A dishonest coin with probability of heads being tossed *p* is tossed *N* times. Let the random variable *X* represent the number of times the coin comes up heads. Design a spreadsheet in which the user inputs the number *N* in one cell and the number *p* in another cell and outputs to two other cells the first
and third quartile
of the random variable *X*.

**Running**

**Exercise 16.11** An assembly line produces 100,000 light bulbs a day, 20% of which generally turn out to be defective. Suppose we draw a random sample of *n*=100 light bulbs. Let *X* represent the number of defective light bulbs in the sample.

(a) Use a spreadsheet to compute the percentage chance that the number of defective light bulbs in the sample will be less than 25.

(b) Use a spreadsheet to compute the percentage chance that the number of defective light bulbs in the sample will be greater than 17.

**Exercise 16.12 [Excursions, Chapter 16, Exercise 67]**
An urn contains 10,000 beads of which 20% are red and the rest white. Suppose that we draw a sample of size 400 (drawing a bead and each time replacing it in the urn before drawing again). What are the approximate chances that the number of white beads in the sample will be 310?