Statistics assignment-76734

  • May 13, 2015 | Wednesday 01:09 pm | 2 months ago

Question 1 (9 marks in total)

Cans of Fizzy Cola are believed to have contents that are Normally distributed with a mean of

305 ml and a standard deviation of 10 ml.

(a) If a random sample of 25 cans is selected from a day’s production, what is the probability

that the 12th can contains more than 300 ml? (2 marks)

(b) If the average of the contents of the 25 cans is calculated, what is

i. the standard error of this average? (1 mark)

ii. the probability that this average is greater than 300 ml? (3 marks)

(c) What is the probability that at least 20 of the 25 cans have contents greater than 300 ml?

(3 marks)

Marking Criteria

1(a)
0 marks: Incorrect answer and no working
1 mark: Correct answer but working has errors
2 marks: Correct answer with correct working

1(b)i.
0 marks: Answer is incorrect
1 mark: Answer is correct

1(b)ii. and 1(c)
0 marks: Solution is missing or incorrect
1 mark: One part of process is correct, but rest is missing or wrong
2 marks: Process attempted but impaired by a minor error, or correct final answer with inadequate working
3 marks: Correct final answer with working (intermediate calculations or output from software)


STA201/401 201530 Assignment 2

Question 2 (15 marks in total)

Suppose that, in 10 years of use throughout Australia, one particular animal vaccine has been

successful in treating an animal disease in 66% of cases (i.e., on average, 66% of animals have

recovered), and this percentage has become accepted as the industry ‘standard’.

(a) Janey Werakso has 64 animals suffering from this disease. If she administers the vaccine

in the approved manner, what is the standard error of the proportion of animals that will

recover? (2 marks)

(b) After treatment, Janey finds that 39 of her animals recover from the disease. What

proportion of Janey’s animals recover from this disease? (1 mark)

(c) What is the probability of getting this proportion (or less) of recoveries from 64 animals

if the vaccine really does have a 66% recovery rate? (3 marks)

(d) If p represents the true probability that an animal recovers from the disease when the

vaccine is administered, use the data from Janey’s experiment to test H0: p = 0.66 vs

H1: p ≠ 0.66 at the 5% level of significance. (5 marks)

(e) Janey is pretty annoyed that less than 66% of her animals recovered, and wants to know

why we didn’t test H0: p = 0.66 vs H1: p < 0.66 because, after all, less than 66% recovered.

Is this a valid thing to do? Answer ‘yes’ or ‘no’, and justify your answer. (2 marks)

(f) In the light of Janey’s concerns, would it be valid to take another large sample of animals

and test H0: p = 0.66 vs H1: p < 0.66? Justify your answer. (2 marks)

Marking Criteria

2(a)
0 marks: Incorrect answer and no working
1 mark: Correct answer but working has errors
2 marks: Correct answer with correct working

2(b)
0 marks: Answer is incorrect
1 mark: Answer is correct

2(c)
0 marks: Incorrect answer and no working
1 mark: Incorrect answer, but some sensible working
2 marks: Correct answer, but working incorrect or incomplete
3 marks: Correct answer with correct working

2(d) (See below)

2(e) and 2(f)
0 marks: Answer (Y/N) incorrect, and no justification
1 mark: Answer correct, but justification inadequate
2 marks: Correct answer, and justification that is correct and clear

2(d) Each component is marked 0 or 1

Test statistic and sampling distribution
0 marks: Statistic and distribution either missing or incorrect
1 mark: Statistic and distribution both correct

Rejection criterion
0 marks: Criterion missing or incorrect
1 mark: Criterion correctly stated

Calculations (software or manually)
0 marks: Calculations wrong or only partially correct
1 mark: Calculations correct

Decision
0 marks: Decision missing or not justified
1 mark: Decision correctly stated and justified

Conclusion
0 marks: Missing, or not in context of problem
1 mark: Statement in English in context of problem


Question 3 (12 marks in total)

Suppose that a reputable survey agency is engaged to estimate the proportion of Australian

electors who believe that, to reduce carbon emissions, a nuclear power plant should be built

somewhere in Australia.

(a) Under the assumption that everyone who is contacted gives a reply of ‘In favour’ or

‘Opposed/Don’t Know’, how large a sample must be taken if the estimate obtained from the

survey is to have a margin of error of at most 2%? Use a 95% confidence interval in

calculating the margin of error, and assume that absolutely nothing is known about the

proportion of Australians in favour. (3 marks)

(b) Repeat (a) if it is believed that the proportion of Australian electors in favour is not more

than 30%. (2 marks)

(c) Repeat (a) if it is thought that 20% of people contacted will refuse to answer, thereby

providing no information. (Make the assumption that these people think like the other

80%, but just don’t want to answer.) (2 marks)

(d) Suppose that a random sample of 1000 Australian electors is asked whether they believe

that, to reduce carbon emissions, a nuclear power plant should be built somewhere in

Australia. After eliminating the 221 people who refuse to answer, 275 of the remainder say

that they support building a nuclear power plant. Calculate a 95% confidence interval for

the true proportion, p, of Australian electors who believe that, to reduce carbon emissions,

a nuclear power plant should be built somewhere in Australia. (3 marks)

(e) Use the confidence interval obtained in (d) to test H0: p = 0.30 vs H1: p ≠ 0.30 at the 5%

level of significance. You do not have to carry out the formal steps of hypothesis testing,

but your answer must say whether or not you reject H0, and give reasons for your answer.

(2 marks)

Marking Criteria

3(a)
0 marks: Incorrect answer and no working
1 mark: Incorrect answer but partly correct working
2 marks: Correct answer with partly correct working
3 marks: Correct answer, with working that is clear and correct

3(b) and (c)
0 marks: Incorrect answer and no working
1 mark: Incorrect answer but partly correct working
2 marks: Correct answer, with working that is clear and correct

3(d)
0 marks: Incorrect answer and no working
1 mark: Correct working, but no recognition of non-responses
2 marks: Correct answer without complete explanation
3 marks: Correct answer, with working that is clear and correct

3(e)
0 marks: Incorrect answer and no working
1 mark: Correct answer, but incomplete explanation
2 marks: Correct answer, with working that is clear and correct


Question 4 (12 marks in total)

A researcher is investigating the effectiveness of a new diet that is being considered for raising

chickens. He wants to measure the weight gained by chicks on this diet in the 14 days between

the 7th and 21st days of life. On a given date, a random sample of 25 7-days-old chicks is

selected, and their weights are measured then and again on the 21st day. The weight gains (in

kg) are recorded in the file chicken weights.csv in the folder in Interact2 where this Assignment

is located.

The researcher wants to estimate the mean weight gained (in kg), µ, by chicks on this diet.

He asks you to calculate a 95% confidence interval for the value of µ, and tells you to use the

t-distribution.

(a) What assumptions are required to be met for the use of the t-distribution? Do they seem

valid here? Justify your answer. (3 marks)

(b) Irrespective of your answer to (a), calculate a 95% confidence interval for µ. (3 marks)

(c) The mean weight gain over 14 days under a standard diet is 0.85 kg. (This figure has been

established over several years, and can be regarded as a known constant.) Test at the 1%

significance level whether the new diet has a mean weight gain that is greater than 0.85

kg. Give all the steps of the hypothesis testing process. However, you should use the R

Commander to perform the calculations. (6 marks)

Marking Criteria

4(a)
0 marks: All assumptions incorrect
1 mark: All assumptions correct, but their validity not assessed
2 marks: All assumptions correct, but the validity of only one assessed
3 marks: All assumptions correct and their validity assessed

4(b)
0 marks: Answer and working incorrect
1 mark: Answer wrong but working partly correct
2 marks: Answer correct but working incompletely described
3 marks: Answer correct and working correct and clear

4(c) Each component is marked 0 or 1

H0, H1 and α
0 marks: Hypotheses missing or incorrectly stated
1 mark: Hypotheses correctly stated and symbols explained, α specified

Test statistic and sampling distribution
0 marks: Statistic and distribution either missing or incorrect
1 mark: Statistic and distribution both correct

Rejection criterion
0 marks: Criterion missing or incorrect
1 mark: Criterion correctly stated

Calculations (software or manually)
0 marks: Calculations wrong or only partially correct
1 mark: Calculations correct

Decision
0 marks: Decision missing or not justified
1 mark: Decision correctly stated and justified

Conclusion
0 marks: Missing, or not in context of problem
1 mark: Statement in English in context of problem


Question 5 (12 marks in total)

(This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A

Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.)

Researchers investigated the effect of broadband ultraviolet B (UVB) therapy and a topical

cream used together on areas of psoriasis (a skin complaint). One of the outcome variables was

the Psoriasis Area and Severity Index (PASI). The Table that appears below contains the PASI

scores for 20 patients who were assessed at baseline and after eight treatments.

Subject 1 2 3 4 5 6 7 8 9 10

Baseline 5.9 7.6 12.8 16.5 6.1 14.4 6.6 5.4 9.6 11.6

After 8 Treatments 5.2 12.2 4.6 4.0 0.4 3.8 1.2 3.1 3.5 4.9

Subject 11 12 13 14 15 16 17 18 19 20

Baseline 11.1 15.6 6.9 15.2 21.0 5.9 10.0 12.2 20.2 6.2

After 8 Treatments 11.1 8.4 5.8 5.0 6.4 0.0 2.7 5.1 4.8 4.2

The original question asked ‘Do these data provide evidence, at the .01 level of significance, to

indicate that the combination therapy reduces PASI scores?’

(a) State the null and alternative hypotheses that you would test in order to answer this

question. If you use symbols other than H0 and H1 (and ‘=’, etc) in your statement, you

must explain what those symbols represent. (2 marks)

(b) If the data satisfy the appropriate statistical assumptions, then these hypotheses can be

tested by using a t-test. Would you use a ‘Welch Two Sample t-test’ or a ‘paired t-test’?

State which of these you would use, and give reasons for your answer. (2 marks)

(c) For the test that you chose in (b), state the statistical assumptions underlying the analysis.

Do these assumptions seem valid in this case? Answer ‘Yes’ or ‘No’, and give reasons for

your answer. You must show any computer output that you use. (3 marks)

(d) Irrespective of your answer to (c), carry out the test of H0 vs H1, showing all steps in the

decision process (except that you do not need to restate H0 and H1). (5 marks)

Marking Criteria

5(a)
0 marks: Neither hypothesis correct, or symbols used but not explained
1 mark: One hypothesis correct and symbols explained
2 marks: Both hypotheses correct and symbols explained

5(b)
0 marks: Test to be used incorrect, or not specified
1 mark: Correct test specified, but justification inadequate
2 marks: Correct test specified, and justification correct and clear

5(c)
0 marks: Assumptions not stated, or incorrect
1 mark: One assumption correct, but not tested for validity
2 marks: All assumptions correct, but not all tested for validity
3 marks: All assumptions correct, and all correctly tested for validity

5(d) Each component is marked 0 or 1

Test statistic and sampling distribution
0 marks: Statistic and distribution either missing or incorrect
1 mark: Statistic and distribution both correct

Rejection criterion
0 marks: Criterion missing or incorrect
1 mark: Criterion correctly stated

Calculations (software or manually)
0 marks: Calculations wrong or only partially correct
1 mark: Calculations correct

Decision
0 marks: Decision missing or not justified
1 mark: Decision correctly stated and justified

Conclusion
0 marks: Missing, or not in context of problem
1 mark: Statement in English in context of problem


Question 6 | STA401 only (18 marks in total)

This question is ONLY for STA401 students. STA201 students do NOT need to submit this

question.

(This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A

Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.)

Two independent samples of Year 7 students were selected. Sample A was drawn from children

who were caries-free, while Sample B was drawn from children who had lots of caries. The

saliva pH levels were measured for all the children. The data are stored in several forms in the

file saliva.csv in the folder in Interact2 where this Assignment is located. It is suggested that

you look at the file so that you know the format of the data before you import it to the R

Commander.

(a) You are to use the independent-samples pooled t-test to test whether the mean saliva pH

level of caries-free children differs from the corresponding mean for children with lots of

caries. Use a 5% level of significance in all your testing.

i. State the statistical assumptions about the data that underlie the independent-samples

pooled t-test. (2 marks)

ii. Are the assumptions valid for the supplied data? (You may use the R Commander

to help you decide this.) Justify your answer. (8 marks)

iii. Irrespective of your answer to ii., perform the independent-samples pooled t-test to

answer the research question at the beginning of section (a). You should follow

the standard steps of hypothesis testing (although you are encouraged to use the R

Commander to do the calculations). Your answer must include a sentence in English

that is useful to the researchers (who do not know any Statistics). (6 marks)

(b) Suppose that you need to find a 95% confidence interval for the variance of a population,

and the sample of 20 observations that you have available to you clearly indicates that

the data are not Normally distributed. Briefly suggest a technique that you might use to

calculate the 95% confidence interval. [Note that you need only to suggest the technique

and include a couple of lines of justification. You do not need to perform any calculations.]

(2 marks)

Marking Criteria

6(a)i.
0 marks: All assumptions incorrectly stated
1 mark: One assumption correctly stated
2 marks: All assumptions correctly stated

6(a)ii. (See below)

6(a)iii. (See below)

6(b)
0 marks: No appropriate technique suggested
1 mark: An appropriate technique suggested, but justification incorrect or missing
2 marks: An appropriate technique suggested, and correctly justified


6(a)ii. Each component is marked 0 or 1

First assumption
0 marks: No or incorrect justification for answer
1 mark: Answer and justification correct

Second assumption
0 marks: No or incorrect justification

Question 1

 Cans of Fizzy Cola are believed to have contents that are Normally distributed with a mean of 305 ml and a standard deviation of 10 ml.

(a)    If a random sample of 25 cans is selected from a day’s production, what is the probability that the 12th can contains more than 300 ml?

The 12th can is a single randomly selected can, so we have to calculate P(X > 300).

Using the z-score formula:

Z = (300 – 305) / 10 = -5/10 = -0.5

P(Z > -0.5) = 0.691462

Required probability = 0.691462

(b)   If the average of the contents of the 25 cans is calculated, what is

  i. The standard error of this average?

Standard error = stdev. / sqrt(n) = 10 / sqrt(25) = 2

  ii. The probability that this average is greater than 300 ml?

Solution:

Z = (X – mean) / [stdev. / sqrt(n)]

Z = (300 – 305) / [ 10 / sqrt(25)]

Z = -2.5

P(x>300) = P (z>-2.5) = 0.99379

 Required probability = 0.99379
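The calculations in (a) and (b) can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist

MEAN_ML = 305.0  # population mean from the question
SD_ML = 10.0     # population standard deviation from the question
N_CANS = 25      # sample size from the question

# (a) A single can: P(X > 300) for X ~ Normal(305, 10)
p_single = 1 - NormalDist(MEAN_ML, SD_ML).cdf(300)

# (b) i. Standard error of the mean of 25 cans
std_error = SD_ML / sqrt(N_CANS)

# (b) ii. Sample mean: P(Xbar > 300) for Xbar ~ Normal(305, std_error)
p_mean = 1 - NormalDist(MEAN_ML, std_error).cdf(300)

print(round(p_single, 6), std_error, round(p_mean, 5))
```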

(c) What is the probability that at least 20 of the 25 cans have contents greater than 300 ml?

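From part (a), each can independently contains more than 300 ml with probability p = 0.691462. The number of cans, Y, out of 25 that contain more than 300 ml therefore follows a Binomial distribution with n = 25 and p = 0.691462.

P(Y ≥ 20) = sum over k = 20, ..., 25 of C(25, k) × 0.691462^k × 0.308538^(25 – k)

P(Y ≥ 20) ≈ 0.17

Required probability ≈ 0.17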
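One way to evaluate part (c) is to treat the number of cans above 300 ml as a binomial count and sum the upper tail directly. The sketch below assumes the cans are independent and reuses the single-can probability from part (a); the names are illustrative.

```python
from math import comb
from statistics import NormalDist

N_CANS = 25

# Probability that one can holds more than 300 ml, as in part (a)
p_over_300 = 1 - NormalDist(305, 10).cdf(300)

# P(Y >= 20) for Y ~ Binomial(25, p_over_300): sum the upper tail term by term
p_at_least_20 = sum(
    comb(N_CANS, k) * p_over_300**k * (1 - p_over_300) ** (N_CANS - k)
    for k in range(20, N_CANS + 1)
)

print(round(p_at_least_20, 4))
```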

Question 2

Suppose that, in 10 years of use throughout Australia, one particular animal vaccine has been successful in treating an animal disease in 66% of cases (i.e., on average, 66% of animals have recovered), and this percentage has become accepted as the industry ‘standard’.

(a)    Janey Werakso has 64 animals suffering from this disease. If she administers the vaccine in the approved manner, what is the standard error of the proportion of animals that will recover?

Standard error of proportion = sqrt(pq/n) = sqrt(0.66*0.34/64) = sqrt(0.003506) = 0.0592

(b)   After treatment, Janey finds that 39 of her animals recover from the disease. What proportion of Janey’s animals recover from this disease?

Required proportion = 39/64 = 0.6094

(c)    What is the probability of getting this proportion (or less) of recoveries from 64 animals if the vaccine really does have a 66% recovery rate?

Solution:

Z = ( 64*0.6094 – 64*0.66) / sqrt(64*0.66*0.34) = -0.8545

P(z< -0.8545) = 0.1964

Required probability = 0.1964
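Parts (a) to (c) can be reproduced with the Normal approximation to the sampling distribution of a proportion. This is a minimal sketch with illustrative names; it uses the unrounded sample proportion 39/64, so the z value differs slightly in the fourth decimal place from a hand calculation that rounds to 0.6094.

```python
from math import sqrt
from statistics import NormalDist

P0 = 0.66        # accepted recovery rate from the question
N_ANIMALS = 64   # number of animals treated
RECOVERED = 39   # number that recovered

# (a) Standard error of the sample proportion under p = 0.66
std_error = sqrt(P0 * (1 - P0) / N_ANIMALS)

# (b) Observed sample proportion
p_hat = RECOVERED / N_ANIMALS

# (c) P(sample proportion <= observed) via the Normal approximation
z = (p_hat - P0) / std_error
p_lower = NormalDist().cdf(z)

print(round(std_error, 4), round(p_hat, 4), round(z, 4), round(p_lower, 4))
```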

(d)   If p represents the true probability that an animal recovers from the disease when the vaccine is administered, use the data from Janey’s experiment to test H0: p = 0.66 vs H1: p ≠ 0.66 at the 5% level of significance.

Number of items of interest = 64*0.6094 = 39

Z Test of Hypothesis for the Proportion

Data
Null Hypothesis p = 0.66
Level of Significance: 0.05
Number of Items of Interest: 39
Sample Size: 64

Intermediate Calculations
Sample Proportion: 0.609375
Standard Error: 0.0592
Z Test Statistic: -0.8550

Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.3926

Do not reject the null hypothesis

Since -1.96 < -0.8550 < 1.96 (p-value 0.3926 > 0.05), there is not enough evidence to conclude that the recovery rate for the vaccine differs from 66%.

(e)   Janey is pretty annoyed that less than 66% of her animals recovered, and wants to know why we didn’t test H0: p = 0.66 vs H1: p < 0.66 because, after all, less than 66% recovered. Is this a valid thing to do? Answer ‘yes’ or ‘no’, and justify your answer.

No. The hypotheses must be stated before the data are examined. Choosing a one-sided alternative because the observed proportion happened to fall below 0.66 uses the same data both to suggest and to test the hypothesis, so the stated 5% significance level would no longer hold.

(f)   In the light of Janey’s concerns, would it be valid to take another large sample of animals and test H0: p = 0.66 vs H1: p < 0.66? Justify your answer.

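Sample Size Determination

Data
Estimate of True Proportion: 0.66
Sampling Error: 0.0592
Confidence Level: 95%

Intermediate Calculations
Z Value: -1.9600
Calculated Sample Size: 245.9663

Result
Sample Size Needed: 246

Yes. It would be valid to take another large sample (246 animals by the calculation above) and test H0: p = 0.66 vs H1: p < 0.66, because the one-sided hypotheses would then be stated before the new data are collected.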

 Question 3

Suppose that a reputable survey agency is engaged to estimate the proportion of Australian electors who believe that, to reduce carbon emissions, a nuclear power plant should be built somewhere in Australia.

(a)    Under the assumption that everyone who is contacted gives a reply of ‘In favour’ or ‘Opposed/Don’t Know’, how large a sample must be taken if the estimate obtained from the survey is to have a margin of error of at most 2%? Use a 95% confidence interval in calculating the margin of error, and assume that absolutely nothing is known about the proportion of Australians in favour.

Since absolutely nothing is known about the proportion of Australians in favour, we take p = 0.5 (50%), which gives the largest required sample size.

Sample Size Determination

Data
Estimate of True Proportion: 0.5
Sampling Error: 0.02
Confidence Level: 95%

Intermediate Calculations
Z Value: -1.9600
Calculated Sample Size: 2400.9118

Result
Sample Size Needed: 2401

(b)   Repeat (a) if it is believed that the proportion of Australian electors in favour is not more than 30%.

Sample Size Determination

Data
Estimate of True Proportion: 0.3
Sampling Error: 0.02
Confidence Level: 95%

Intermediate Calculations
Z Value: -1.9600
Calculated Sample Size: 2016.7659

Result
Sample Size Needed: 2017
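The sample sizes in (a) and (b) follow from n = z² × p × (1 – p) / E², rounded up. The sketch below wraps that formula in a hypothetical helper, `sample_size`, which is an illustrative name and not part of any package used in the assignment.

```python
from math import ceil
from statistics import NormalDist


def sample_size(p: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n whose margin of error for a proportion is at most `margin`."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z * z * p * (1 - p) / margin**2)


n_unknown = sample_size(0.5, 0.02)     # (a) nothing known, so use p = 0.5
n_at_most_30 = sample_size(0.3, 0.02)  # (b) p believed to be at most 0.30

print(n_unknown, n_at_most_30)
```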

 (c)    Repeat (a) if it is thought that 20% of people contacted will refuse to answer, thereby providing no information. (Make the assumption that these people think like the other 80%, but just don’t want to answer.)

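The number of usable responses required is the same as in (a): a calculated sample size of 2400.9118, with p = 0.5 because the people who refuse are assumed to think like those who answer. If 20% of the people contacted refuse to answer, only 80% of contacts provide a response, so the number of people who must be contacted is

Calculated Sample Size: 2400.9118 / 0.8 = 3001.14

Result
Sample Size Needed (people contacted): 3002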
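One reading of part (c) is that the 2401 usable responses from part (a) are still required, and that the number of people contacted must be inflated because only 80% of them reply. A minimal sketch under that assumption, with illustrative names:

```python
from math import ceil

RESPONSES_NEEDED = 2401  # sample size from part (a)
RESPONSE_RATE = 0.80     # 20% of people contacted refuse to answer

# Number of people to contact so that about 80% of them yield 2401 responses
contacts_needed = ceil(RESPONSES_NEEDED / RESPONSE_RATE)

print(contacts_needed)
```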

(d)   Suppose that a random sample of 1000 Australian electors is asked whether they believe that, to reduce carbon emissions, a nuclear power plant should be built somewhere in Australia. After eliminating the 221 people who refuse to answer, 275 of the remainder say that they support building a nuclear power plant. Calculate a 95% confidence interval for the true proportion, p, of Australian electors who believe that, to reduce carbon emissions, a nuclear power plant should be built somewhere in Australia.

Confidence Interval Estimate for the Proportion

The 221 people who refused to answer provide no information, so the sample size is 1000 – 221 = 779.

Data
Sample Size: 779
Number of Successes: 275
Confidence Level: 95%

Intermediate Calculations
Sample Proportion: 0.3530
Z Value: -1.9600
Standard Error of the Proportion: 0.0171
Interval Half Width: 0.0336

Confidence Interval
Interval Lower Limit: 0.3195
Interval Upper Limit: 0.3866

 (e)   Use the confidence interval obtained in (d) to test H0: p = 0.30 vs H1: p ≠ 0.30 at the 5% level of significance. You do not have to carry out the formal steps of hypothesis testing, but your answer must say whether or not you reject H0, and give reasons for your answer.

Solution:

Here, we reject the null hypothesis at the 5% level of significance, because p = 0.30 lies outside the 95% confidence interval (0.3195, 0.3866).
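The interval in (d) depends on which sample size is used. The sketch below assumes the 221 refusals are excluded, so that the proportion is estimated from the 779 people who answered, and then checks whether 0.30 falls inside the resulting interval as in (e). The names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

CONTACTED = 1000
REFUSED = 221
IN_FAVOUR = 275

# Only people who answered carry information about p
respondents = CONTACTED - REFUSED
p_hat = IN_FAVOUR / respondents

# 95% confidence interval using the Normal approximation
z = NormalDist().inv_cdf(0.975)
half_width = z * sqrt(p_hat * (1 - p_hat) / respondents)
lower, upper = p_hat - half_width, p_hat + half_width

# (e) Reject H0: p = 0.30 at the 5% level if 0.30 is outside the interval
reject_h0 = not (lower <= 0.30 <= upper)

print(respondents, round(lower, 4), round(upper, 4), reject_h0)
```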

 Question 4

 A researcher is investigating the effectiveness of a new diet that is being considered for raising chickens. He wants to measure the weight gained by chicks on this diet in the 14 days between the 7th and 21st days of life. On a given date, a random sample of 25 7-days-old chicks is selected, and their weights are measured then and again on the 21st day. The weight gains (in kg) are recorded in the file chicken weights.csv in the folder in Interact2 where this Assignment is located. The researcher wants to estimate the mean weight gained (in kg), µ, by chicks on this diet. He asks you to calculate a 95% confidence interval for the value of µ, and tells you to use the t-distribution.

(a)    What assumptions are required to be met for the use of the t-distribution? Do they seem valid here? Justify your answer.

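The t-distribution is used when the population standard deviation is not given and must be estimated from the sample; when it is given, the z distribution is used. Here we are not given the population standard deviation, so we need to use the t distribution. Its use assumes that the 25 weight gains are a random sample of independent observations from a population that is approximately Normally distributed. The chicks were selected at random, which supports the first assumption; Normality should be checked with a plot of the weight gains (for example a histogram or Normal quantile plot).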

 (b)   Irrespective of your answer to (a), calculate a 95% confidence interval for µ.

Confidence Interval Estimate for the Mean

Data
Sample Standard Deviation: 0.202607996
Sample Mean: 0.766
Sample Size: 25
Confidence Level: 95%

Intermediate Calculations
Standard Error of the Mean: 0.040521599
Degrees of Freedom: 24
t Value: 2.0639
Interval Half Width: 0.0836

Confidence Interval
Interval Lower Limit: 0.6824
Interval Upper Limit: 0.8496
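The interval can be recomputed from the summary statistics in the table. The Python standard library has no t-distribution, so this sketch takes the critical value 2.0639 (24 degrees of freedom) from the table above rather than computing it; the variable names are illustrative.

```python
from math import sqrt

SAMPLE_MEAN = 0.766
SAMPLE_SD = 0.202607996
N_CHICKS = 25
T_CRITICAL = 2.0639  # t value for 95% confidence, 24 df, copied from the table

# Standard error of the mean and half width of the interval
std_error = SAMPLE_SD / sqrt(N_CHICKS)
half_width = T_CRITICAL * std_error

lower = SAMPLE_MEAN - half_width
upper = SAMPLE_MEAN + half_width

print(round(std_error, 6), round(lower, 4), round(upper, 4))
```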

 (c)    The mean weight gain over 14 days under a standard diet is 0.85 kg. (This figure has been established over several years, and can be regarded as a known constant.) Test at the 1% significance level whether the new diet has a mean weight gain that is greater than 0.85 kg. Give all the steps of the hypothesis testing process.

t Test for Hypothesis of the Mean

Data
Null Hypothesis µ = 0.85
Level of Significance: 0.01
Sample Size: 25
Sample Mean: 0.766
Sample Standard Deviation: 0.202607996

Intermediate Calculations
Standard Error of the Mean: 0.0405
Degrees of Freedom: 24
t Test Statistic: -2.0730

Upper-Tail Test
Upper Critical Value: 2.4922
p-Value: 0.9755

Do not reject the null hypothesis
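The test statistic and decision above can be checked from the summary statistics. As before, the upper critical value 2.4922 (1% level, 24 degrees of freedom) is copied from the table, not computed, and the names are illustrative.

```python
from math import sqrt

SAMPLE_MEAN = 0.766
SAMPLE_SD = 0.202607996
N_CHICKS = 25
MU_0 = 0.85             # mean weight gain under the standard diet
UPPER_CRITICAL = 2.4922  # from the table: upper-tail, 1% level, 24 df

# One-sample t statistic for H0: mu = 0.85 vs H1: mu > 0.85
t_stat = (SAMPLE_MEAN - MU_0) / (SAMPLE_SD / sqrt(N_CHICKS))

# Upper-tail test: reject only if t exceeds the critical value
reject_h0 = t_stat > UPPER_CRITICAL

print(round(t_stat, 4), reject_h0)
```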

 

 Question 5

(This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.) Researchers investigated the effect of broadband ultraviolet B (UVB) therapy and a topical cream used together on areas of psoriasis (a skin complaint). One of the outcome variables was the Psoriasis Area and Severity Index (PASI). The Table that appears below contains the PASI scores for 20 patients who were assessed at baseline and after eight treatments.

Subject 1 2 3 4 5 6 7 8 9 10
Baseline 5.9 7.6 12.8 16.5 6.1 14.4 6.6 5.4 9.6 11.6
After 8 Treatments 5.2 12.2 4.6 4.0 0.4 3.8 1.2 3.1 3.5 4.9

Subject 11 12 13 14 15 16 17 18 19 20
Baseline 11.1 15.6 6.9 15.2 21.0 5.9 10.0 12.2 20.2 6.2
After 8 Treatments 11.1 8.4 5.8 5.0 6.4 0.0 2.7 5.1 4.8 4.2

The original question asked ‘Do these data provide evidence, at the .01 level of significance, to indicate that the combination therapy reduces PASI scores?’

(a)    State the null and alternative hypotheses that you would test in order to answer this question. If you use symbols other than H0 and H1 (and ‘=’, etc) in your statement, you must explain what those symbols represent.

Let µd denote the mean of the differences (baseline PASI score minus PASI score after 8 treatments).

Null hypothesis: H0: µd = 0 (mean for baseline = mean for after 8 treatments)

Alternative hypothesis: H1: µd > 0 (mean for baseline > mean for after 8 treatments, i.e. the therapy reduces PASI scores)

(b)   If the data satisfy the appropriate statistical assumptions, then these hypotheses can be tested by using a t-test. Would you use a ‘Welch Two Sample t-test’ or a ‘paired t-test’? State which of these you would use, and give reasons for your answer.

We will use the paired t-test because each subject is measured twice, at baseline and after 8 treatments, so the two sets of scores are paired rather than independent.

(c)    For the test that you chose in (b), state the statistical assumptions underlying the analysis. Do these assumptions seem valid in this case? Answer ‘Yes’ or ‘No’, and give reasons for your answer. You must show any computer output that you use.

The paired t-test requires before-and-after data on the same subjects, which we have here. It assumes that the subjects are independent of one another and that the differences between the paired scores are approximately Normally distributed. The Normality assumption should be checked with a plot of the 20 differences (for example a histogram or Normal quantile plot).

 (d)   Irrespective of your answer to (c), carry out the test of H0 vs H1, showing all steps in the decision process (except that you do not need to restate H0 and H1).

Solution:

Paired t Test

Data
Hypothesized Mean Difference: 0
Level of Significance: 0.01

Intermediate Calculations
Sample Size: 20
DBar: 6.2200
Degrees of Freedom: 19
SD: 5.0403
Standard Error: 1.1271
t Test Statistic: 5.5188

Upper-Tail Test
Upper Critical Value: 2.5395
p-Value: 0.0000

Reject the null hypothesis

Since 5.5188 > 2.5395, there is evidence at the .01 level of significance that the combination therapy reduces PASI scores.
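The paired t statistic can be recomputed directly from the table of PASI scores. This sketch uses only the standard library; the upper-tail critical value 2.5395 for the .01 level with 19 degrees of freedom is taken from t tables and is an assumption here, since the standard library cannot compute it.

```python
from math import sqrt
from statistics import mean, stdev

baseline = [5.9, 7.6, 12.8, 16.5, 6.1, 14.4, 6.6, 5.4, 9.6, 11.6,
            11.1, 15.6, 6.9, 15.2, 21.0, 5.9, 10.0, 12.2, 20.2, 6.2]
after = [5.2, 12.2, 4.6, 4.0, 0.4, 3.8, 1.2, 3.1, 3.5, 4.9,
         11.1, 8.4, 5.8, 5.0, 6.4, 0.0, 2.7, 5.1, 4.8, 4.2]

# Differences within each subject: baseline minus after 8 treatments
diffs = [b - a for b, a in zip(baseline, after)]

d_bar = mean(diffs)               # mean difference (DBar)
sd_diff = stdev(diffs)            # sample standard deviation of the differences
std_error = sd_diff / sqrt(len(diffs))
t_stat = d_bar / std_error        # test statistic with n - 1 = 19 df

UPPER_CRITICAL = 2.5395  # assumed table value: upper tail, .01 level, 19 df
reject_h0 = t_stat > UPPER_CRITICAL

print(round(d_bar, 4), round(sd_diff, 4), round(t_stat, 4), reject_h0)
```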

 Question 6

This question is ONLY for STA401 students. STA201 students do NOT need to submit this question. (This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.) Two independent samples of Year 7 students were selected. Sample A was drawn from children who were caries-free, while Sample B was drawn from children who had lots of caries. The saliva pH levels were measured for all the children. The data are stored in several forms in the file saliva.csv in the folder in Interact2 where this Assignment is located.

(a)    You are to use the independent-samples pooled t-test to test whether the mean saliva pH level of caries-free children differs from the corresponding mean for children with lots of caries. Use a 5% level of significance in all your testing.

Solution:

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference: 0
Level of Significance: 0.05

Population 1 Sample
Sample Size: 15
Sample Mean: 7.519333333
Sample Standard Deviation: 0.29884221

Population 2 Sample
Sample Size: 12
Sample Mean: 7.271666667
Sample Standard Deviation: 0.20112562

Intermediate Calculations
Population 1 Sample Degrees of Freedom: 14
Population 2 Sample Degrees of Freedom: 11
Total Degrees of Freedom: 25
Pooled Variance: 0.0678
Standard Error: 0.1009
Difference in Sample Means: 0.2477
t Test Statistic: 2.4557

Two-Tail Test
Lower Critical Value: -2.0595
Upper Critical Value: 2.0595
p-Value: 0.0214

Reject the null hypothesis

Since 2.4557 > 2.0595 (p-value 0.0214 < 0.05), there is evidence at the 5% level of significance that the mean saliva pH level of caries-free children differs from that of children with lots of caries.
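The pooled-variance calculation can be reproduced from the summary statistics in the table. This is a minimal sketch with illustrative names; the critical value 2.0595 (two-tailed, 5% level, 25 degrees of freedom) is copied from the table, not computed.

```python
from math import sqrt

# Summary statistics copied from the table (Population 1 and Population 2)
n1, mean1, sd1 = 15, 7.519333333, 0.29884221
n2, mean2, sd2 = 12, 7.271666667, 0.20112562

# Pooled variance weights each sample variance by its degrees of freedom
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df

# Standard error of the difference in means and the t statistic
std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / std_error

CRITICAL = 2.0595  # from the table: two-tailed, 5% level, 25 df
reject_h0 = abs(t_stat) > CRITICAL

print(df, round(pooled_var, 4), round(std_error, 4), round(t_stat, 4), reject_h0)
```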