STA 3187W – Homework 1, Page 1

Homework 3: (Chapters 5 & 7, Statistical Ethics) – Due October 24th – 107 Points

Non-Writing-Intensive Component

Chapter 7

(1) (6 Points) Consider a case where we wish to conduct a systematic random sample of 35 elements

from a population of size 179. You obtain a random number of 1 as your starting point, and you are to

conduct a systematic random sample from there.

(a) Determine the value of k to use in the “1-in-k Systematic Random Sample”. Show all work.

(b) Provide the entire list of 35 observations in your sample.

(c) Note that element 176 would not be in your sample. Explain in 1-2 sentences why it would not be

practical to expect element 176 in your sample.
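
As a minimal sketch of the mechanics in Problem (1), the Python snippet below builds a 1-in-k systematic sample. It assumes the common convention k = floor(N/n), so that all n elements fit inside the population; the function name `systematic_sample` is hypothetical and used only for illustration.

```python
def systematic_sample(population_size, sample_size, start):
    """Return (k, sample) for a 1-in-k systematic sample.

    Assumption: k = floor(N / n), so n elements always fit in the population.
    """
    k = population_size // sample_size
    # Take the starting element, then every k-th element after it.
    sample = [start + j * k for j in range(sample_size)]
    return k, sample


k, sample = systematic_sample(population_size=179, sample_size=35, start=1)
print("k =", k)
print("sample =", sample)
print("last element =", sample[-1])
```

Printing the last element makes it easy to see where the list stops relative to the population size.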

Statistical Ethics

(2) (10 Points)

• Hover over the IRB Submissions Tab, and click on “Types of Review”.

(a) What are the three types of review categorized by IRB submissions?

• Click on the Resources tab.

(b) GWU uses the CITI training program. How often must investigators and research staff undergo

continuing training from CITI Program in order to complete human subjects research?

• While hovering over the Resources tab, click on “Education, Outreach, & Training”

(c) If you would like to discuss your project in more detail, or if you are new to research at GW, you may

request a 30-minute appointment with OHR via phone or in person. (1) What is the e-mail to schedule an

appointment, and (2) what office hours are available for appointments?

• Hover over the About Tab, and click “FAQ’s”:

(d) Find 2 questions that appear interesting to you, either with regard to your future profession or to you

personally (aka for the heck of it). Provide the questions, provide the answers for these questions, and

explain how they might help you in the future.

Chapter 5

(3) (10 Points) Suppose we want to select a stratified random sample from a population and we have

$5000 for data collection. The population has been divided into four strata, and we have decided to

allocate 20% of the total sample size into stratum 1, 15% of the sample size into stratum 2, 35% of the

sample size into stratum 3, and 30% of the sample size into stratum 4. The cost of sampling per unit is $6

for stratum 1, $4 for stratum 2, $5 for stratum 3, and $8 for stratum 4.

(a) Determine the sample sizes for the four strata as well as the total sample size.

(b) How much money is left in our budget?
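
The arithmetic in Problem (3) can be sketched as follows. This is one possible approach, assuming the total sample size is the largest whole number that keeps the total cost within budget and that the stratum sizes are the stated percentages of that total. The variable names are hypothetical; working with integer percentages avoids floating-point rounding surprises.

```python
budget = 5000                      # dollars available for data collection
percent_share = [20, 15, 35, 30]   # share of the sample in strata 1-4 (percent)
unit_cost = [6, 4, 5, 8]           # dollars per sampled unit in strata 1-4

# Cost, in dollars, of 100 sampled units split according to the shares.
cost_per_100_units = sum(p * c for p, c in zip(percent_share, unit_cost))

# Largest total sample size whose cost does not exceed the budget.
n_total = (budget * 100) // cost_per_100_units

# Assumption: each stratum size is the stated share of n_total, rounded to the
# nearest integer. In general, rounding can push the cost over budget, so the
# total cost should always be rechecked.
stratum_sizes = [round(n_total * p / 100) for p in percent_share]
total_cost = sum(n * c for n, c in zip(stratum_sizes, unit_cost))
remaining = budget - total_cost

print("total n =", n_total)
print("stratum sizes =", stratum_sizes)
print("money left =", remaining)
```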

(4) (27 Points) (Similar to problem 5.5)

A corporation wishes to obtain information on the effectiveness of a business machine. A number of

division heads will be interviewed by Skype and asked to rate the equipment on a numerical scale. The

divisions are located in North America, Europe, and East Asia. Hence, stratified sampling is desired. The

costs are slightly larger for interviewing division heads located outside North America (since interview

times in Europe are 5-8 hours later and in East Asia are 10-14 hours later, we have to pay

interviewers more to conduct interviews outside regular work hours). The accompanying table gives the

costs per interview, as well as approximate variances of the ratings (based on previous interviews).

Stratum            1            2         3

Description        N. America   Europe    E. Asia

𝑁𝑖                 115          65        50

𝜎𝑖² (approx.)      2.56         3.24      2.89

𝑐𝑖                 $8           $12       $16

(a) If the researchers have $500 to spend on sampling, choose the sample size 𝑛 and the allocation

that minimizes V(𝑦̅𝑠𝑡).

(b) If the researchers wish to stay within an error bound of 0.50, choose the sample size 𝑛 and the

allocation that minimizes the cost.

(c) Based on the results you obtain from parts (a) and (b), how many total division heads should you

sample (including how many from North America, Europe, and East Asia)?

Justify your answer in 1-2 sentences.
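
The computations in parts (a) and (b) of Problem (4) can be sketched in Python as below. The sketch assumes the standard cost-aware optimal allocation for a stratified mean, with allocation fractions proportional to N_i σ_i / √c_i, and a bound of the form B = 2√V, so that D = B²/4. The function names are hypothetical. The results are left unrounded, because rounding choices affect whether the budget or the bound is still met.

```python
from math import sqrt

N = [115, 65, 50]             # stratum sizes
variance = [2.56, 3.24, 2.89] # approximate stratum variances
cost = [8, 12, 16]            # cost per interview in dollars

sigma = [sqrt(v) for v in variance]

# Building blocks of the optimal-allocation formulas.
a = [Ni * si / sqrt(ci) for Ni, si, ci in zip(N, sigma, cost)]  # N_i*sigma_i/sqrt(c_i)
b = [Ni * si * sqrt(ci) for Ni, si, ci in zip(N, sigma, cost)]  # N_i*sigma_i*sqrt(c_i)

# Allocation fractions (they sum to 1).
fractions = [x / sum(a) for x in a]


def n_for_fixed_cost(total_cost):
    """Total sample size that minimizes the variance for a fixed budget."""
    return total_cost * sum(a) / sum(b)


def n_for_fixed_bound(bound):
    """Total sample size that minimizes cost for a fixed error bound on the mean."""
    D = bound ** 2 / 4
    denominator = sum(N) ** 2 * D + sum(Ni * vi for Ni, vi in zip(N, variance))
    return sum(a) * sum(b) / denominator


n_a = n_for_fixed_cost(500)
n_b = n_for_fixed_bound(0.50)
print("part (a): n =", n_a, "allocation =", [n_a * w for w in fractions])
print("part (b): n =", n_b, "allocation =", [n_b * w for w in fractions])
```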

(5) (24 Points) (Similar to problem 5.31) Wage earners in a large firm are stratified into management and

clerical classes, the first having 250 employees and the second having 400 employees. To assess attitudes

on sick-leave policy, independent simple random samples of 50 workers of each class were selected.

Individuals were provided three options: “favor”, “do not favor”, “no opinion”. The responses are

provided in the table below:

Stratum 1 2

Description Management Clerical

𝑵𝒊 250 400

𝒏𝒊 50 50

Favor 33 15

Do Not Favor 10 32

No Opinion 7 3

(a) Estimate the proportion of employees who favor the sick-leave policy and place a bound on the error

of estimation.

(b) Construct a 95% confidence interval for the proportion of employees who favor the sick-leave policy,

and interpret the confidence interval given the context of the problem.

(c) Is there evidence that a majority of employees favor the sick-leave policy? How about a minority? In

both cases, justify your reasoning.

(d) It can be argued that the opinions of management are more than likely different than the opinions of

the clerical workers and should not be representative of the clerical workers. Is there evidence that there is

a difference between the proportion of management workers who favor the sick-leave policy and that of

clerical workers? Justify your answer statistically at the 5% significance level.
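
For parts (a) and (b) of Problem (5), a minimal sketch of the stratified proportion estimate is given below. It assumes the usual estimator with the finite population correction and n_i − 1 in the denominator of the variance, and a bound of two standard errors. The variable names are hypothetical.

```python
from math import sqrt

N = [250, 400]    # stratum sizes (management, clerical)
n = [50, 50]      # sample sizes
favor = [33, 15]  # number in each sample who favor the policy

N_total = sum(N)
p = [f / ni for f, ni in zip(favor, n)]  # sample proportions by stratum

# Stratified estimate of the population proportion.
p_st = sum(Ni * pi for Ni, pi in zip(N, p)) / N_total

# Estimated variance, with the finite population correction (1 - n_i/N_i).
var_p_st = sum(
    (Ni / N_total) ** 2 * (1 - ni / Ni) * pi * (1 - pi) / (ni - 1)
    for Ni, ni, pi in zip(N, n, p)
)

bound = 2 * sqrt(var_p_st)  # bound on the error of estimation
interval = (p_st - bound, p_st + bound)

print("estimate =", p_st)
print("bound =", bound)
print("approximate 95% interval =", interval)
```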

Writing-Intensive Component

Chapter 5

(1) (30 points) (Similar to problem 5.10)

A forester wants to estimate the total number of farm acres planted with trees for a state. Because the

number of acres of trees varies considerably with the size of the farm, he decided to stratify on farm sizes.

The 260 farms in the state are placed in one of four categories, according to size. A stratified random

sample of 50 farms, selected roughly by using proportional allocation, yields the results shown in the table

below:

Stratum I II III IV

Description 0-200 Acres 200-400 Acres 400-600 Acres Over 600 Acres

Using the data provided, first construct box plots for each of the four strata. Discuss whether it appears

useful to stratify our data (as opposed to using an SRS). Then, estimate the total number of acres of trees

on farms in the state. Next, estimate the variance and place a bound on the error of estimation. Use that

information to provide a 95% confidence interval for the total number of acres of trees on farms in the

state with interpretation.

Now, suppose that the United States Department of Agriculture (USDA), using geographic information

systems (GIS) software, concluded that there are roughly 60,000 acres of trees within the state. Based

on your results, is this claim plausible (at the 5% significance level)? Provide a full explanation in

determining the plausibility of said claim.
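
As a rough sketch of the calculations this problem asks for, the Python snippet below estimates a stratified total, its estimated variance, and the bound on the error of estimation. The numbers in it are toy values invented purely for illustration and are NOT the farm data from the assignment. The function name `stratified_total` is hypothetical, and the bound assumes the usual two-standard-error form.

```python
from math import sqrt
from statistics import mean, variance


def stratified_total(stratum_sizes, samples):
    """Return (estimate, estimated variance, bound) for a population total.

    stratum_sizes: list of N_i; samples: list of lists of observations.
    """
    estimate = sum(Ni * mean(y) for Ni, y in zip(stratum_sizes, samples))
    est_var = sum(
        Ni ** 2 * (1 - len(y) / Ni) * variance(y) / len(y)
        for Ni, y in zip(stratum_sizes, samples)
    )
    return estimate, est_var, 2 * sqrt(est_var)


# Toy data (hypothetical, two strata only) to show how the function is used.
toy_sizes = [10, 20]
toy_samples = [[1, 2, 3], [4, 6]]

total, est_var, bound = stratified_total(toy_sizes, toy_samples)
interval = (total - bound, total + bound)
print("estimated total =", total)
print("estimated variance =", est_var)
print("approximate 95% interval =", interval)

# A claimed total is plausible at roughly the 5% level if it falls inside
# the interval; 100 is an arbitrary claimed value for the toy data.
claimed_total = 100
print("claim plausible:", interval[0] <= claimed_total <= interval[1])
```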

Your answer to this question should take the form of a mini-essay report, providing a full narrative of

your findings that results in 1-2 typed pages, double-spaced. Fully describe your process at each step, and

avoid using “I” (instead, use the academic “we” or “this report”/ “this paper”).

The report should have roughly 4 paragraphs:

• One-paragraph introduction

• One paragraph discussing the necessity of stratifying your population into strata, along with

providing summary statistics for each of your four stratum samples and confirming via boxplots

that it makes sense to stratify based on your data.

o Also provide the summary statistics for each of the four strata in a table (between paragraphs

1 and 2).

o Provide a set of box plots underneath paragraph 2.

• One paragraph on:

o The computation of the estimator (include either the formula or a verbal description of the

estimator).

o Estimated variance of the estimator.

o Computation of the error bound (also discussing where the 2 comes from)

o An interpreted 95% confidence interval.

• One paragraph on the evaluation of the claim from the USDA, discussing the full thought process

behind your conclusion.

You may include formulas in this essay instead of a completely verbal description of computations.

However, there should be full paragraphs when discussing the entire procedure – no bullets!

Consider this problem as a presentation to an audience who understands statistics, but would like to know

where you get the values (think like a job interview, where they are skeptics!).

Optional Component (Supplemental problems; Not graded/no points awarded)

Chapter 7 Homework: 9-12, 21

Chapter 5 Homework: 1, 2, 3, 4, 5, 11, 14, 15, 18, 19, 20, 22, 27, 31, 32, 35, 40, 41