Problem 5.
The table below gives
the percentage of weekly visitors who visited one
of the summary web pages of a web-page-rich website. This table
was used for the material appearing on the earlier web page
Displaying
Quantitative Data.
Percentage of Weekly
Visitors Visiting one of the Summary Web Pages of a
Web-Page-Rich Website between the Weeks of 17-23 April 2005
and 19-25 February 2006
Week | Percentage
4/17/2005-4/23/2005 | 1.26
4/24/2005-4/30/2005 | 1.05
5/01/2005-5/07/2005 | 1.02
5/08/2005-5/14/2005 | 0.92
5/15/2005-5/21/2005 | 1.03
5/22/2005-5/28/2005 | 0.89
5/29/2005-6/04/2005 | 0.99
6/05/2005-6/11/2005 | 0.93
6/12/2005-6/18/2005 | 1.03
6/19/2005-6/25/2005 | 0.75
6/26/2005-7/02/2005 | 0.75
7/03/2005-7/09/2005 | 0.86
7/10/2005-7/16/2005 | *
7/17/2005-7/23/2005 | 0.82
7/24/2005-7/30/2005 | 0.81
7/31/2005-8/06/2005 | 0.82
8/07/2005-8/13/2005 | 0.84
8/14/2005-8/20/2005 | 0.62
8/21/2005-8/27/2005 | 0.83
8/28/2005-9/03/2005 | 0.82
9/04/2005-9/10/2005 | 0.65
9/11/2005-9/17/2005 | 0.52
9/18/2005-9/24/2005 | 0.58
9/25/2005-10/01/2005 | 0.81
10/02/2005-10/08/2005 | 0.72
10/09/2005-10/15/2005 | 0.64
10/16/2005-10/22/2005 | 0.70
10/23/2005-10/29/2005 | 0.75
10/30/2005-11/05/2005 | 0.77
11/06/2005-11/12/2005 | 0.71
11/13/2005-11/19/2005 | 0.76
11/20/2005-11/26/2005 | 1.09
11/27/2005-12/03/2005 | 0.94
12/04/2005-12/10/2005 | 0.66
12/11/2005-12/17/2005 | 0.85
12/18/2005-12/24/2005 | 1.01
12/25/2005-12/31/2005 | 0.98
1/01/2006-1/07/2006 | 1.05
1/08/2006-1/14/2006 | 0.86
1/15/2006-1/21/2006 | 0.80
1/22/2006-1/28/2006 | 0.76
1/29/2006-2/04/2006 | 0.61
2/05/2006-2/11/2006 | 0.99
2/12/2006-2/18/2006 | 0.77
2/19/2006-2/25/2006 | 0.91
For the above
population data, use Minitab to:
(a) Compute the mean.
(b) Determine the median.
(c) Compute the variance.
(d) Compute the standard deviation.
(e) Compute the corresponding z-score for each weekly percentage.
(f) Verify that for the z-score intervals (-1.5,1.5), (-2,2),
and (-3,3) the data set conforms to Chebyshev's Rule.
(g) Construct a Stem-and-Leaf Display for the percentage data.
(h) Construct a frequency histogram with:
(i) intervals .12 percentage points wide, over the domain .44 to 1.28,
(ii) the first value in Minitab's lower interval-definition box set to
the left endpoint of the first interval,
(iii) the mean and median marked by vertical lines, and
(iv) z-scores below each x-axis value and the mean.
Solution to Problem 5:
Centrality, Spreads, Normality, and Chebyshev's Rule
Note that finding
the mean, median, variance, and standard deviation for a data set
this large would be quite tedious without Minitab (or other
statistics software).
We first transfer the two columns
of data to Minitab by following the instructions found in
Solution to Problem 1:
Displaying Quantitative Data, with one additional
technique. After dragging over the table's first column, we
observe that the number of entries is unknown and not easily
counted. Consequently, in the first column of Minitab's
worksheet, we drag over enough cells to ensure that all of the
entries will be pasted. Because table entries are repeated in the
excess cells, we simply look for where the repetition begins and
delete all cells containing repeated data. Then, when we paste
the table's second column into Minitab's second worksheet column,
we drag only down to the bottom row of the first column.
(a) - (d) With the data pasted into
Minitab's worksheet, we can easily obtain the basic statistics.
Step 1. Click
Stat for a drop-down menu, place
the cursor over
Basic Statistics for a second drop-down menu, and move
the
cursor across to the Display Descriptive
Statistics line of the second menu. We have:
Step 2. Click
Display Descriptive
Statistics to obtain the Display Descriptive Statistics
dialog box. We have:
We now follow standard Minitab
procedure.
Step 3. Move the cursor to the left window of
the dialog box and click C2 Percentage. We have:
Note that C2 Percentage
has been highlighted and the Select button activated.
Step 4. Click the
Select button to place Percentage into
the Variables: window and then click the OK button
to obtain:
We frame the results.
From this, we take
our answers. One word of caution: Minitab uses the formula for a
sample variance (dividing by n - 1 rather than n) in its computations;
however, when the data set is
large, the difference between the sample standard deviation and
population standard deviation is small.
We have:
(a) | mean | .84
(b) | median | .82
(d) | standard deviation | .156
(c) | variance | .024
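For readers without Minitab, the same four statistics can be cross-checked with a short Python sketch. This is not part of the Minitab procedure; it assumes the missing week (shown as * in the table) is simply dropped, and it uses the standard library's statistics module. It also prints the population versions next to the sample versions so that the size of the difference mentioned in the caution above can be seen directly.

```python
import statistics

# Weekly percentages from the table; None stands in for the missing week ("*").
percentages = [
    1.26, 1.05, 1.02, 0.92, 1.03, 0.89, 0.99, 0.93, 1.03, 0.75,
    0.75, 0.86, None, 0.82, 0.81, 0.82, 0.84, 0.62, 0.83, 0.82,
    0.65, 0.52, 0.58, 0.81, 0.72, 0.64, 0.70, 0.75, 0.77, 0.71,
    0.76, 1.09, 0.94, 0.66, 0.85, 1.01, 0.98, 1.05, 0.86, 0.80,
    0.76, 0.61, 0.99, 0.77, 0.91,
]
# Assumption: the missing week is dropped, leaving 44 usable values.
data = [x for x in percentages if x is not None]

mean = statistics.mean(data)
median = statistics.median(data)
sample_sd = statistics.stdev(data)        # sample formula, divides by n - 1
sample_var = statistics.variance(data)
pop_sd = statistics.pstdev(data)          # population formula, divides by n
pop_var = statistics.pvariance(data)

print(f"n = {len(data)}")
print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"sample:     sd = {sample_sd:.3f}, variance = {sample_var:.3f}")
print(f"population: sd = {pop_sd:.3f}, variance = {pop_var:.3f}")
```

The sample values agree with the answers above to the digits shown; the population standard deviation differs only in the third decimal place.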
(e) Next, we
compute the z-scores for the percentage data.
Step 1. Move the
cursor to Calc, click it to obtain a drop down
menu, and move the cursor to Standardize.... We have:
Step 2. Click
Standardize... to
obtain the Standardize dialog box.
Step 3. Click
C2 Percentage in the left
window of the dialog box and then click the Select button to
place Percentage into the Input column(s): window.
Step 4. Click the
Store results in:
window and type in C3 (as it is an unused column).
We observe that the radio button
next to Subtract mean and divide by std. dev. is already
activated, so we are set to do the z-score computation.
Step 5. Click
the OK button, and the z-scores appear in column C3.
Step 6. To round the z-scores to two
places to the right of the decimal point, either (a) drag over the cells
in column C3 and right-click the shaded area or (b) click any
cell in column C3 and then right-click. A menu appears.
Step 7. Move the cursor to
Format
Column to obtain a second menu and then move the cursor across to
Numeric.... We have:
Step 8. Click
Numeric... to
obtain the Numeric Column Format dialog box. Select
the middle radio button to fix the number of decimal places and
then type 2 into the small window to the right.
Step 9. Click the
OK button to obtain:
Step 10. If the cells of
the column are not already selected, drag over the active cells
of the column. Then right-click the darkened cells
to obtain a menu and click Copy Cells in the menu. Paste the cells
into a table containing the percentage data and
do a little reformatting (such as font selection, line removal,
and a couple of insertions) to obtain the table of answers for part (e).
Percentage x_i | z-score z_i
1.26 | 2.70
1.05 | 1.35
1.02 | 1.16
0.92 | 0.52
1.03 | 1.23
0.89 | 0.33
0.99 | 0.97
0.93 | 0.58
1.03 | 1.23
0.75 | -0.57
0.75 | -0.57
0.86 | 0.13
* | *
0.82 | -0.12
0.81 | -0.19
0.82 | -0.12
0.84 | 0.00
0.62 | -1.41
0.83 | -0.06
0.82 | -0.12
0.65 | -1.22
0.52 | -2.05
0.58 | -1.67
0.81 | -0.19
0.72 | -0.77
0.64 | -1.28
0.70 | -0.90
0.75 | -0.57
0.77 | -0.45
0.71 | -0.83
0.76 | -0.51
1.09 | 1.61
0.94 | 0.65
0.66 | -1.15
0.85 | 0.07
1.01 | 1.10
0.98 | 0.90
1.05 | 1.35
0.86 | 0.13
0.80 | -0.25
0.76 | -0.51
0.61 | -1.47
0.99 | 0.97
0.77 | -0.45
0.91 | 0.45
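The standardization performed by Minitab's Standardize command (subtract the mean, divide by the standard deviation) can be sketched in Python as follows. This is an illustrative cross-check rather than the Minitab procedure. It assumes the missing week is omitted and that the sample standard deviation is the divisor, which is consistent with the tabulated z-scores (for example, 2.70 for the first week).

```python
import statistics

# The 44 non-missing weekly percentages, in table order (the "*" week is omitted).
data = [
    1.26, 1.05, 1.02, 0.92, 1.03, 0.89, 0.99, 0.93, 1.03, 0.75,
    0.75, 0.86, 0.82, 0.81, 0.82, 0.84, 0.62, 0.83, 0.82, 0.65,
    0.52, 0.58, 0.81, 0.72, 0.64, 0.70, 0.75, 0.77, 0.71, 0.76,
    1.09, 0.94, 0.66, 0.85, 1.01, 0.98, 1.05, 0.86, 0.80, 0.76,
    0.61, 0.99, 0.77, 0.91,
]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # assumption: sample standard deviation (n - 1)

# z_i = (x_i - mean) / sd, rounded to two decimal places as in Step 8.
z_scores = [round((x - mean) / sd, 2) for x in data]

for x, z in zip(data, z_scores):
    print(f"{x:.2f}  {z:+.2f}")
```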
(f) We next verify that the percentages of the population
with z-scores within the intervals (-1.5,1.5), (-2,2), and (-3,3) conform
to Chebyshev's Rule. Tallying the z-scores in the table for part (e), we have:
The percentage of the population with z-scores within
the interval (-1.5,1.5) is
90.9% ≥ 55.6% ≈ (1 - 1/1.5²)(100%); hence, the percentage
conforms to Chebyshev's Rule for the interval (-1.5,1.5).
The percentage of the population with z-scores within
the interval (-2,2) is
95.5% ≥ 75% = (1 - 1/2²)(100%); hence, the percentage
conforms to Chebyshev's Rule for the interval (-2,2).
The percentage of the population with z-scores within
the interval (-3,3) is
100% ≥ 88.9% ≈ (1 - 1/3²)(100%); hence, the percentage
conforms to Chebyshev's Rule for the interval (-3,3).
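The three comparisons above can be reproduced with a small Python sketch that counts the z-scores inside each open interval (-k, k) and compares the observed percentage with Chebyshev's lower bound (1 - 1/k²)(100%). As before, this is only a cross-check: it assumes the missing week is omitted and that z-scores are computed with the sample standard deviation.

```python
import statistics

# The 44 non-missing weekly percentages (the "*" week is omitted).
data = [
    1.26, 1.05, 1.02, 0.92, 1.03, 0.89, 0.99, 0.93, 1.03, 0.75,
    0.75, 0.86, 0.82, 0.81, 0.82, 0.84, 0.62, 0.83, 0.82, 0.65,
    0.52, 0.58, 0.81, 0.72, 0.64, 0.70, 0.75, 0.77, 0.71, 0.76,
    1.09, 0.94, 0.66, 0.85, 1.01, 0.98, 1.05, 0.86, 0.80, 0.76,
    0.61, 0.99, 0.77, 0.91,
]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # assumption: sample standard deviation
z = [(x - mean) / sd for x in data]

results = {}
for k in (1.5, 2, 3):
    within = sum(1 for v in z if -k < v < k)   # z-scores strictly inside (-k, k)
    observed = 100 * within / len(z)           # observed percentage
    bound = 100 * (1 - 1 / k**2)               # Chebyshev's lower bound
    results[k] = (within, observed, bound)
    print(f"(-{k},{k}): {within} of {len(z)} = {observed:.1f}% "
          f"vs. bound {bound:.1f}%")
```

In each case the observed percentage is at least as large as the bound, which is all that Chebyshev's Rule requires.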
(g) As the requested Stem-and-Leaf Display is what we
composed in
Stem-and-Leaf Construction Using Minitab Release 13,
we simply duplicate it here.
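A comparable plain-text display can be sketched in Python for readers who want to check the shape of the data without Minitab. This is an illustration only: it uses one line per tenths stem with a leaf unit of 0.01, and Minitab's own display may use a different stem increment and adds a column of depth counts. The missing week is omitted.

```python
from collections import defaultdict

# The 44 non-missing weekly percentages (the "*" week is omitted).
data = [
    1.26, 1.05, 1.02, 0.92, 1.03, 0.89, 0.99, 0.93, 1.03, 0.75,
    0.75, 0.86, 0.82, 0.81, 0.82, 0.84, 0.62, 0.83, 0.82, 0.65,
    0.52, 0.58, 0.81, 0.72, 0.64, 0.70, 0.75, 0.77, 0.71, 0.76,
    1.09, 0.94, 0.66, 0.85, 1.01, 0.98, 1.05, 0.86, 0.80, 0.76,
    0.61, 0.99, 0.77, 0.91,
]

# Work in whole hundredths to avoid floating-point edge cases:
# stem = number of tenths, leaf = hundredths digit (leaf unit = 0.01).
stems = defaultdict(list)
for x in sorted(data):
    hundredths = round(x * 100)
    stems[hundredths // 10].append(hundredths % 10)

lines = []
for stem in range(min(stems), max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    lines.append(f"{stem:>2} | {leaves}")

print("Leaf unit = 0.01")
print("\n".join(lines))
```

Here the stem 5 with leaves 28 stands for the values 0.52 and 0.58, and the empty stem 11 shows the gap between 1.09 and the largest value, 1.26.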
(h) Lastly, we construct, using Minitab, a frequency histogram with:
(i) intervals .12 percentage points wide, over the domain .44 to 1.28,
(ii) the first value in Minitab's lower interval-definition box set to
the left endpoint of the first interval,
(iii) the mean and median marked by vertical lines, and
(iv) z-scores below each x-axis value and the mean.
We follow the basic instructions of
Histogram Construction Using Minitab Release 13.
Requirements (i) and (ii) impose the following selections in
the Type of Intervals and Definition of Intervals sections
of the Histogram Options dialog box.
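The intervals implied by requirements (i) and (ii), together with the frequency of each and the z-score of each cutpoint, can be sketched in Python as a check on the finished histogram. This is not Minitab's algorithm. In particular, it assumes that a value lying exactly on a cutpoint (0.80 and 0.92 both do) is counted in the interval to its right; Minitab's convention may differ, in which case the counts of the neighboring intervals would shift. The missing week is omitted and the sample standard deviation is used for the z-scores.

```python
import statistics

# The 44 non-missing weekly percentages (the "*" week is omitted).
data = [
    1.26, 1.05, 1.02, 0.92, 1.03, 0.89, 0.99, 0.93, 1.03, 0.75,
    0.75, 0.86, 0.82, 0.81, 0.82, 0.84, 0.62, 0.83, 0.82, 0.65,
    0.52, 0.58, 0.81, 0.72, 0.64, 0.70, 0.75, 0.77, 0.71, 0.76,
    1.09, 0.94, 0.66, 0.85, 1.01, 0.98, 1.05, 0.86, 0.80, 0.76,
    0.61, 0.99, 0.77, 0.91,
]

# Cutpoints in whole hundredths (.44, .56, ..., 1.28) to avoid float drift.
START, END, WIDTH = 44, 128, 12
cutpoints = list(range(START, END + 1, WIDTH))

counts = [0] * (len(cutpoints) - 1)
for x in data:
    hundredths = round(x * 100)
    # Assumption: intervals are closed on the left, so a value on a cutpoint
    # goes to the interval on its right; 1.28 itself would go in the last one.
    index = min((hundredths - START) // WIDTH, len(counts) - 1)
    counts[index] += 1

mean = statistics.mean(data)
sd = statistics.stdev(data)  # assumption: sample standard deviation

for left, right, count in zip(cutpoints, cutpoints[1:], counts):
    print(f"{left / 100:.2f} to {right / 100:.2f}: {count}")

# z-scores to print beneath the x-axis values.
for c in cutpoints:
    print(f"x = {c / 100:.2f}  z = {(c / 100 - mean) / sd:+.2f}")
```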
After clicking the OK button in the
Histogram Options dialog box and again in the Histogram dialog box, we use the
Tool bar to add
the mean and median lines, the z-scores, and labels. We
have: