 Statistics assignment
May 13, 2015
Question 1 (9 marks in total)
Cans of Fizzy Cola are believed to have contents that are Normally distributed with a mean of
305 ml and a standard deviation of 10 ml.
(a) If a random sample of 25 cans is selected from a day’s production, what is the probability
that the 12th can contains more than 300 ml? (2 marks)
(b) If the average of the contents of the 25 cans is calculated, what is
i. the standard error of this average? (1 mark)
ii. the probability that this average is greater than 300 ml? (3 marks)
(c) What is the probability that at least 20 of the 25 cans have contents greater than 300 ml?
(3 marks)
Marking Criteria
1(a)
  0 marks: Incorrect answer and no working
  1 mark: Correct answer but working has errors
  2 marks: Correct answer with correct working
1(b)i.
  0 marks: Answer is incorrect
  1 mark: Answer is correct
1(b)ii. and 1(c)
  0 marks: Solution is missing or incorrect
  1 mark: One part of process is correct, but rest is missing or wrong
  2 marks: Process attempted but impaired by a minor error, or correct final answer with inadequate working
  3 marks: Correct final answer with working (intermediate calculations or output from software)
STA201/401 201530 Assignment 2
Question 2 (15 marks in total)
Suppose that, in 10 years of use throughout Australia, one particular animal vaccine has been
successful in treating an animal disease in 66% of cases (i.e., on average, 66% of animals have
recovered), and this percentage has become accepted as the industry ‘standard’.
(a) Janey Werakso has 64 animals suffering from this disease. If she administers the vaccine
in the approved manner, what is the standard error of the proportion of animals that will
recover? (2 marks)
(b) After treatment, Janey finds that 39 of her animals recover from the disease. What
proportion of Janey’s animals recover from this disease? (1 mark)
(c) What is the probability of getting this proportion (or less) of recoveries from 64 animals
if the vaccine really does have a 66% recovery rate? (3 marks)
(d) If p represents the true probability that an animal recovers from the disease when the
vaccine is administered, use the data from Janey’s experiment to test H0: p = 0.66 vs
H1: p ≠ 0.66 at the 5% level of significance. (5 marks)
(e) Janey is pretty annoyed that less than 66% of her animals recovered, and wants to know
why we didn’t test H0: p = 0.66 vs H1: p < 0.66 because, after all, less than 66% recovered.
Is this a valid thing to do? Answer ‘yes’ or ‘no’, and justify your answer. (2 marks)
(f) In the light of Janey’s concerns, would it be valid to take another large sample of animals
and test H0: p = 0.66 vs H1: p < 0.66? Justify your answer. (2 marks)
Marking Criteria
2(a)
  0 marks: Incorrect answer and no working
  1 mark: Correct answer but working has errors
  2 marks: Correct answer with correct working
2(b)
  0 marks: Answer is incorrect
  1 mark: Answer is correct
2(c)
  0 marks: Incorrect answer and no working
  1 mark: Incorrect answer, but some sensible working
  2 marks: Correct answer, but working incorrect or incomplete
  3 marks: Correct answer with correct working
2(d) (See below)
2(e) and 2(f)
  0 marks: Answer (Y/N) incorrect, and no justification
  1 mark: Answer correct, but justification inadequate
  2 marks: Correct answer, and justification that is correct and clear
2(d), marked 0 or 1 on each component
  Test statistic and sampling distribution
    0 marks: Statistic and distribution either missing or incorrect
    1 mark: Statistic and distribution both correct
  Rejection criterion
    0 marks: Criterion missing or incorrect
    1 mark: Criterion correctly stated
  Calculations (software or manually)
    0 marks: Calculations wrong or only partially correct
    1 mark: Calculations correct
  Decision
    0 marks: Decision missing or not justified
    1 mark: Decision correctly stated and justified
  Conclusion
    0 marks: Missing, or not in context of problem
    1 mark: Statement in English in context of problem
Question 3 (12 marks in total)
Suppose that a reputable survey agency is engaged to estimate the proportion of Australian
electors who believe that, to reduce carbon emissions, a nuclear power plant should be built
somewhere in Australia.
(a) Under the assumption that everyone who is contacted gives a reply of ‘In favour’ or
‘Opposed/Don’t Know’, how large a sample must be taken if the estimate obtained from the
survey is to have a margin of error of at most 2%? Use a 95% confidence interval in
calculating the margin of error, and assume that absolutely nothing is known about the
proportion of Australians in favour. (3 marks)
(b) Repeat (a) if it is believed that the proportion of Australian electors in favour is not more
than 30%. (2 marks)
(c) Repeat (a) if it is thought that 20% of people contacted will refuse to answer, thereby
providing no information. (Make the assumption that these people think like the other
80%, but just don’t want to answer.) (2 marks)
(d) Suppose that a random sample of 1000 Australian electors is asked whether they believe
that, to reduce carbon emissions, a nuclear power plant should be built somewhere in
Australia. After eliminating the 221 people who refuse to answer, 275 of the remainder say
that they support building a nuclear power plant. Calculate a 95% confidence interval for
the true proportion, p, of Australian electors who believe that, to reduce carbon emissions,
a nuclear power plant should be built somewhere in Australia. (3 marks)
(e) Use the confidence interval obtained in (d) to test H0: p = 0.30 vs H1: p ≠ 0.30 at the 5%
level of significance. You do not have to carry out the formal steps of hypothesis testing,
but your answer must say whether or not you reject H0, and give reasons for your answer.
(2 marks)
Marking Criteria
3(a)
  0 marks: Incorrect answer and no working
  1 mark: Incorrect answer but partly correct working
  2 marks: Correct answer with partly correct working
  3 marks: Correct answer, with working that is clear and correct
3(b) and (c)
  0 marks: Incorrect answer and no working
  1 mark: Incorrect answer but partly correct working
  2 marks: Correct answer, with working that is clear and correct
3(d)
  0 marks: Incorrect answer and no working
  1 mark: Correct working, but no recognition of non-responses
  2 marks: Correct answer without complete explanation
  3 marks: Correct answer, with working that is clear and correct
3(e)
  0 marks: Incorrect answer and no working
  1 mark: Correct answer, but incomplete explanation
  2 marks: Correct answer, with working that is clear and correct
Question 4 (12 marks in total)
A researcher is investigating the effectiveness of a new diet that is being considered for raising
chickens. He wants to measure the weight gained by chicks on this diet in the 14 days between
the 7th and 21st days of life. On a given date, a random sample of 25 7-day-old chicks is
selected, and their weights are measured then and again on the 21st day. The weight gains (in
kg) are recorded in the file chicken weights.csv in the folder in Interact2 where this Assignment
is located.
The researcher wants to estimate the mean weight gained (in kg), µ, by chicks on this diet.
He asks you to calculate a 95% confidence interval for the value of µ, and tells you to use the
t-distribution.
(a) What assumptions are required to be met for the use of the t-distribution? Do they seem
valid here? Justify your answer. (3 marks)
(b) Irrespective of your answer to (a), calculate a 95% confidence interval for µ. (3 marks)
(c) The mean weight gain over 14 days under a standard diet is 0.85 kg. (This figure has been
established over several years, and can be regarded as a known constant.) Test at the 1%
significance level whether the new diet has a mean weight gain that is greater than 0.85
kg. Give all the steps of the hypothesis testing process. However, you should use the R
Commander to perform the calculations. (6 marks)
Marking Criteria
4(a)
  0 marks: All assumptions incorrect
  1 mark: All assumptions correct, but their validity not assessed
  2 marks: All assumptions correct, but the validity of only one assessed
  3 marks: All assumptions correct and their validity assessed
4(b)
  0 marks: Answer and working incorrect
  1 mark: Answer wrong but working partly correct
  2 marks: Answer correct but working incompletely described
  3 marks: Answer correct and working correct and clear
4(c), marked 0 or 1 on each component
  H0, H1 and α
    0 marks: Hypotheses missing or incorrectly stated
    1 mark: Hypotheses correctly stated and symbols explained, α specified
  Test statistic and sampling distribution
    0 marks: Statistic and distribution either missing or incorrect
    1 mark: Statistic and distribution both correct
  Rejection criterion
    0 marks: Criterion missing or incorrect
    1 mark: Criterion correctly stated
  Calculations (software or manually)
    0 marks: Calculations wrong or only partially correct
    1 mark: Calculations correct
  Decision
    0 marks: Decision missing or not justified
    1 mark: Decision correctly stated and justified
  Conclusion
    0 marks: Missing, or not in context of problem
    1 mark: Statement in English in context of problem
Question 5 (12 marks in total)
(This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A
Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.)
Researchers investigated the effect of broadband ultraviolet B (UVB) therapy and a topical
cream used together on areas of psoriasis (a skin complaint). One of the outcome variables was
the Psoriasis Area and Severity Index (PASI). The Table that appears below contains the PASI
scores for 20 patients who were assessed at baseline and after eight treatments.
Subject                1     2     3     4     5     6     7     8     9    10
Baseline             5.9   7.6  12.8  16.5   6.1  14.4   6.6   5.4   9.6  11.6
After 8 Treatments   5.2  12.2   4.6   4.0   0.4   3.8   1.2   3.1   3.5   4.9
Subject               11    12    13    14    15    16    17    18    19    20
Baseline            11.1  15.6   6.9  15.2  21.0   5.9  10.0  12.2  20.2   6.2
After 8 Treatments  11.1   8.4   5.8   5.0   6.4   0.0   2.7   5.1   4.8   4.2
The original question asked ‘Do these data provide evidence, at the .01 level of significance, to
indicate that the combination therapy reduces PASI scores?’
(a) State the null and alternative hypotheses that you would test in order to answer this
question. If you use symbols other than H0 and H1 (and ‘=’, etc) in your statement, you
must explain what those symbols represent. (2 marks)
(b) If the data satisfy the appropriate statistical assumptions, then these hypotheses can be
tested by using a t-test. Would you use a ‘Welch Two Sample t-test’ or a ‘paired t-test’?
State which of these you would use, and give reasons for your answer. (2 marks)
(c) For the test that you chose in (b), state the statistical assumptions underlying the analysis.
Do these assumptions seem valid in this case? Answer ‘Yes’ or ‘No’, and give reasons for
your answer. You must show any computer output that you use. (3 marks)
(d) Irrespective of your answer to (c), carry out the test of H0 vs H1, showing all steps in the
decision process (except that you do not need to restate H0 and H1). (5 marks)
Marking Criteria
5(a)
  0 marks: Neither hypothesis correct, or symbols used but not explained
  1 mark: One hypothesis correct and symbols explained
  2 marks: Both hypotheses correct and symbols explained
5(b)
  0 marks: Test to be used incorrect, or not specified
  1 mark: Correct test specified, but justification inadequate
  2 marks: Correct test specified, and justification correct and clear
5(c)
  0 marks: Assumptions not stated, or incorrect
  1 mark: One assumption correct, but not tested for validity
  2 marks: All assumptions correct, but not all tested for validity
  3 marks: All assumptions correct, and all correctly tested for validity
5(d), marked 0 or 1 on each component
  Test statistic and sampling distribution
    0 marks: Statistic and distribution either missing or incorrect
    1 mark: Statistic and distribution both correct
  Rejection criterion
    0 marks: Criterion missing or incorrect
    1 mark: Criterion correctly stated
  Calculations (software or manually)
    0 marks: Calculations wrong or only partially correct
    1 mark: Calculations correct
  Decision
    0 marks: Decision missing or not justified
    1 mark: Decision correctly stated and justified
  Conclusion
    0 marks: Missing, or not in context of problem
    1 mark: Statement in English in context of problem
Question 6  STA401 only (18 marks in total)
This question is ONLY for STA401 students. STA201 students do NOT need to submit this
question.
(This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A
Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.)
Two independent samples of Year 7 students were selected. Sample A was drawn from children
who were caries-free, while Sample B was drawn from children who had lots of caries. The
saliva pH levels were measured for all the children. The data are stored in several forms in the
file saliva.csv in the folder in Interact2 where this Assignment is located. It is suggested that
you look at the file so that you know the format of the data before you import it to the R
Commander.
(a) You are to use the independent-samples pooled t-test to test whether the mean saliva pH
level of caries-free children differs from the corresponding mean for children with lots of
caries. Use a 5% level of significance in all your testing.
i. State the statistical assumptions about the data that underlie the independent-samples
pooled t-test. (2 marks)
ii. Are the assumptions valid for the supplied data? (You may use the R Commander
to help you decide this.) Justify your answer. (8 marks)
iii. Irrespective of your answer to ii., perform the independent-samples pooled t-test to
answer the research question at the beginning of section (a). You should follow
the standard steps of hypothesis testing (although you are encouraged to use the R
Commander to do the calculations). Your answer must include a sentence in English
that is useful to the researchers (who do not know any Statistics). (6 marks)
(b) Suppose that you need to find a 95% confidence interval for the variance of a population,
and the sample of 20 observations that you have available to you clearly indicates that
the data are not Normally distributed. Briefly suggest a technique that you might use to
calculate the 95% confidence interval. [Note that you need only to suggest the technique
and include a couple of lines of justification. You do not need to perform any calculations.]
(2 marks)
Marking Criteria
6(a)i.
  0 marks: All assumptions incorrectly stated
  1 mark: One assumption correctly stated
  2 marks: All assumptions correctly stated
6(a)ii. (See below)
6(a)iii. (See below)
6(b)
  0 marks: No appropriate technique suggested
  1 mark: An appropriate technique suggested, but justification incorrect or missing
  2 marks: An appropriate technique suggested, and correctly justified
6(a)ii., marked 0 or 1 on each component
  First assumption
    0 marks: No or incorrect justification for answer
    1 mark: Answer and justification correct
  Second assumption
    0 marks: No or incorrect justification
Question 1
Cans of Fizzy Cola are believed to have contents that are Normally distributed with a mean of 305 ml and a standard deviation of 10 ml.
(a) If a random sample of 25 cans is selected from a day’s production, what is the probability that the 12th can contains more than 300 ml?
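Here we need P(X > 300), where X is the contents of a single can. Being the 12th can in the sample makes no difference, because every can has the same distribution.
Using the z-score formula:
Z = (300 – 305) / 10 = -5/10 = -0.5
P(Z > -0.5) = 0.691462
Required probability = 0.691462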
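The same tail probability can be checked in a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the assignment.

```python
from statistics import NormalDist

# Contents of a single can: Normal(mean = 305 ml, sd = 10 ml), as stated in the question.
can = NormalDist(mu=305, sigma=10)

z = (300 - can.mean) / can.stdev        # standardised value of 300 ml
p_more_than_300 = 1 - can.cdf(300)      # P(X > 300) = P(Z > z)

print(f"z = {z:.2f}, P(X > 300) = {p_more_than_300:.6f}")
```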
(b) If the average of the contents of the 25 cans is calculated, what is
i. The standard error of this average?
Standard error = stdev / sqrt(n) = 10 / sqrt(25) = 2
ii. The probability that this average is greater than 300 ml?
Solution:
Z = (X̄ – mean) / [stdev / sqrt(n)]
Z = (300 – 305) / [10 / sqrt(25)]
Z = -2.5
P(X̄ > 300) = P(Z > -2.5) = 0.99379
Required probability = 0.99379
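As a check, the sampling distribution of the mean can be built directly: it is Normal with the same mean and with the standard error as its standard deviation. A minimal sketch with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 305, 10, 25

standard_error = sigma / sqrt(n)                   # sd of the sample mean
mean_dist = NormalDist(mu=mu, sigma=standard_error)
p_mean_above_300 = 1 - mean_dist.cdf(300)          # P(sample mean > 300)

print(f"SE = {standard_error}, P(mean > 300) = {p_mean_above_300:.5f}")
```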
(c) What is the probability that at least 20 of the 25 cans have contents greater than 300 ml?
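Each can independently contains more than 300 ml with probability p = 0.691462, from part (a). The number of such cans among the 25, Y, therefore follows a Binomial distribution with n = 25 and p = 0.691462. (The z-score of a sample mean does not apply here, because the question concerns a count of cans and not an average.)
P(Y ≥ 20) = P(Y = 20) + P(Y = 21) + ... + P(Y = 25), where P(Y = k) = C(25, k) × 0.691462^k × 0.308538^(25 – k)
Required probability ≈ 0.17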
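One way to read part (c) is as a question about a count: each of the 25 cans either exceeds 300 ml or does not, so the number that do is Binomial. The sketch below sums the upper tail under that assumption; the variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Probability that one can holds more than 300 ml (part (a)).
p_can = 1 - NormalDist(mu=305, sigma=10).cdf(300)
n = 25

# P(Y >= 20) for Y ~ Binomial(n, p_can): sum the terms for k = 20, ..., 25.
p_at_least_20 = sum(
    comb(n, k) * p_can**k * (1 - p_can) ** (n - k) for k in range(20, n + 1)
)

print(f"P(at least 20 of 25 cans exceed 300 ml) = {p_at_least_20:.4f}")
```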
Question 2
Suppose that, in 10 years of use throughout Australia, one particular animal vaccine has been successful in treating an animal disease in 66% of cases (i.e., on average, 66% of animals have recovered), and this percentage has become accepted as the industry ‘standard’.
(a) Janey Werakso has 64 animals suffering from this disease. If she administers the vaccine in the approved manner, what is the standard error of the proportion of animals that will recover?
Standard error of proportion = sqrt(pq/n) = sqrt(0.66*0.34/64) = sqrt(0.003506) = 0.0592
(b) After treatment, Janey finds that 39 of her animals recover from the disease. What proportion of Janey’s animals recover from this disease?
Required proportion = 39/64 = 0.6094
(c) What is the probability of getting this proportion (or less) of recoveries from 64 animals if the vaccine really does have a 66% recovery rate?
Solution:
Z = (64*0.6094 – 64*0.66) / sqrt(64*0.66*0.34) = -0.8545
P(Z < -0.8545) = 0.1964
Required probability = 0.1964
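The standard error and the lower-tail probability can be reproduced with the normal approximation to the sample proportion. A minimal sketch; the variable names are illustrative, and small differences in the fourth decimal place come from rounding the proportion to 0.6094 in the hand calculation.

```python
from math import sqrt
from statistics import NormalDist

p0, n, recovered = 0.66, 64, 39

se = sqrt(p0 * (1 - p0) / n)              # standard error of the proportion
p_hat = recovered / n                     # Janey's observed proportion
z = (p_hat - p0) / se                     # standardised sample proportion
prob_this_or_less = NormalDist().cdf(z)   # P(Z <= z)

print(f"SE = {se:.4f}, p_hat = {p_hat:.4f}, z = {z:.4f}, P = {prob_this_or_less:.4f}")
```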
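(d) If p represents the true probability that an animal recovers from the disease when the vaccine is administered, use the data from Janey’s experiment to test H0: p = 0.66 vs H1: p ≠ 0.66 at the 5% level of significance.
Number of items of interest = 64*0.6094 = 39
Z Test of Hypothesis for the Proportion
Data
  Null Hypothesis p = 0.66
  Level of Significance: 0.05
  Number of Items of Interest: 39
  Sample Size: 64
Intermediate Calculations
  Sample Proportion: 0.609375
  Standard Error: 0.0592
  Z Test Statistic: -0.8550
Two-Tail Test
  Lower Critical Value: -1.9600
  Upper Critical Value: 1.9600
  p-Value: 0.3926
Decision: Do not reject the null hypothesis, because -1.96 < -0.8550 < 1.96 (equivalently, the p-value of 0.3926 exceeds 0.05).
Conclusion: Janey’s results give no significant evidence that the recovery rate differs from the industry standard of 66%.
(e) Janey is pretty annoyed that less than 66% of her animals recovered, and wants to know why we didn’t test H0: p = 0.66 vs H1: p < 0.66 because, after all, less than 66% recovered. Is this a valid thing to do? Answer ‘yes’ or ‘no’, and justify your answer.
No. The alternative hypothesis must be chosen before the data are examined. Choosing a one-sided alternative because the sample happened to fall below 66% uses the same data both to form and to test the hypothesis, which makes the true significance level larger than the stated 5%.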
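The figures in this output can be reproduced with a short function. This is a sketch of the standard two-sided z test for a proportion; the function name `proportion_z_test` is an illustrative assumption, not something taken from the assignment or from R Commander.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided one-sample z test for a proportion (normal approximation)."""
    std_normal = NormalDist()
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)                   # standard error under H0
    z = (p_hat - p0) / se
    critical = std_normal.inv_cdf(1 - alpha / 2)   # upper critical value
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, critical, p_value, abs(z) > critical

z_stat, z_crit, p_value, reject_h0 = proportion_z_test(39, 64, 0.66)
print(f"z = {z_stat:.4f}, critical = ±{z_crit:.4f}, p = {p_value:.4f}, reject = {reject_h0}")
```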
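(f) In the light of Janey’s concerns, would it be valid to take another large sample of animals and test H0: p = 0.66 vs H1: p < 0.66? Justify your answer.
Yes. The one-sided alternative would then be stated before the new data are collected, and the new sample is independent of the data that prompted it, so the test keeps its stated significance level.
As a guide to how large the new sample might be:
Sample Size Determination
Data
  Estimate of True Proportion: 0.66
  Sampling Error: 0.0592
  Confidence Level: 95%
Intermediate Calculations
  Z Value: 1.9600
  Calculated Sample Size: 245.9663
Result
  Sample Size Needed: 246
This figure is the sample size that gives a margin of error of 0.0592 at 95% confidence. It does not itself determine whether the test is valid.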
Question 3
Suppose that a reputable survey agency is engaged to estimate the proportion of Australian electors who believe that, to reduce carbon emissions, a nuclear power plant should be built somewhere in Australia.
(a) Under the assumption that everyone who is contacted gives a reply of ‘In favour’ or ‘Opposed/Don’t Know’, how large a sample must be taken if the estimate obtained from the survey is to have a margin of error of at most 2%? Use a 95% confidence interval in calculating the margin of error, and assume that absolutely nothing is known about the proportion of Australians in favour.
Because absolutely nothing is known about the proportion of Australians in favour, we take p = 0.5 (50%). This value maximises p(1 – p) and so gives the most conservative sample size.
Sample Size Determination
Data
  Estimate of True Proportion: 0.5
  Sampling Error: 0.02
  Confidence Level: 95%
Intermediate Calculations
  Z Value: 1.9600
  Calculated Sample Size: 2400.9118
Result
  Sample Size Needed: 2401
(b) Repeat (a) if it is believed that the proportion of Australian electors in favour is not more than 30%.
Sample Size Determination
Data
  Estimate of True Proportion: 0.3
  Sampling Error: 0.02
  Confidence Level: 95%
Intermediate Calculations
  Z Value: 1.9600
  Calculated Sample Size: 2016.7659
Result
  Sample Size Needed: 2017
(c) Repeat (a) if it is thought that 20% of people contacted will refuse to answer, thereby providing no information. (Make the assumption that these people think like the other 80%, but just don’t want to answer.)
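If 20% of those contacted refuse to answer, only 80% of the people contacted provide usable replies. Nothing is known about p, so p = 0.5 as in (a), and 2401 usable replies are still needed (calculated value 2400.9118). The refusal rate changes the number of people who must be contacted, not the estimate of the true proportion.
Sample Size Determination
Data
  Estimate of True Proportion: 0.5
  Sampling Error: 0.02
  Confidence Level: 95%
  Expected Response Rate: 80%
Intermediate Calculations
  Z Value: 1.9600
  Calculated Number of Respondents: 2400.9118
  Calculated Number to Contact: 2400.9118 / 0.8 = 3001.14
Result
  Sample Size Needed: 3002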
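The sample-size formula n = z² p(1 – p) / E² used in parts (a) and (b) can be sketched in Python. The function name `sample_size` and its `response_rate` parameter are illustrative assumptions. Dividing by the response rate is one reading of part (c), in which 20% of the people contacted give no usable answer.

```python
from math import ceil
from statistics import NormalDist

def sample_size(margin, p=0.5, confidence=0.95, response_rate=1.0):
    """Number of people to contact for a given margin of error on a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    respondents = z**2 * p * (1 - p) / margin**2   # usable replies required
    return ceil(respondents / response_rate)       # always round up

n_a = sample_size(0.02)                      # nothing known about p
n_b = sample_size(0.02, p=0.3)               # p believed to be at most 30%
n_c = sample_size(0.02, response_rate=0.8)   # 20% of contacts refuse
print(n_a, n_b, n_c)
```

Using p = 0.3 in (b) is the conservative choice within the range 0 to 0.3, because p(1 – p) increases with p up to 0.5.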
(d) Suppose that a random sample of 1000 Australian electors is asked whether they believe that, to reduce carbon emissions, a nuclear power plant should be built somewhere in Australia. After eliminating the 221 people who refuse to answer, 275 of the remainder say that they support building a nuclear power plant. Calculate a 95% confidence interval for the true proportion, p, of Australian electors who believe that, to reduce carbon emissions, a nuclear power plant should be built somewhere in Australia.
The 221 people who refused to answer provide no information, so the sample size is 1000 – 221 = 779.
Confidence Interval Estimate for the Proportion
Data
  Sample Size: 779
  Number of Successes: 275
  Confidence Level: 95%
Intermediate Calculations
  Sample Proportion: 0.3530
  Z Value: 1.9600
  Standard Error of the Proportion: 0.0171
  Interval Half Width: 0.0336
Confidence Interval
  Interval Lower Limit: 0.3195
  Interval Upper Limit: 0.3866
(e) Use the confidence interval obtained in (d) to test H0: p = 0.30 vs H1: p ≠ 0.30 at the 5% level of significance. You do not have to carry out the formal steps of hypothesis testing, but your answer must say whether or not you reject H0, and give reasons for your answer.
Solution:
Here, we reject the null hypothesis at the 5% level of significance, because p = 0.30 lies outside the 95% confidence interval (0.3195, 0.3866).
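A minimal Python sketch of the interval calculation, assuming the 221 refusals are excluded so that n = 1000 – 221 = 779 (the marking criteria for 3(d) mention recognition of non-responses). The function name `proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

respondents = 1000 - 221                     # refusals carry no information
lower, upper = proportion_ci(275, respondents)
h0_value_inside = lower <= 0.30 <= upper     # part (e): is 0.30 in the interval?

print(f"95% CI = ({lower:.4f}, {upper:.4f}); 0.30 inside: {h0_value_inside}")
```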
Question 4
A researcher is investigating the effectiveness of a new diet that is being considered for raising chickens. He wants to measure the weight gained by chicks on this diet in the 14 days between the 7th and 21st days of life. On a given date, a random sample of 25 7-day-old chicks is selected, and their weights are measured then and again on the 21st day. The weight gains (in kg) are recorded in the file chicken weights.csv in the folder in Interact2 where this Assignment is located. The researcher wants to estimate the mean weight gained (in kg), µ, by chicks on this diet. He asks you to calculate a 95% confidence interval for the value of µ, and tells you to use the t-distribution.
(a) What assumptions are required to be met for the use of the t-distribution? Do they seem valid here? Justify your answer.
The t-distribution is used in place of the z-distribution because the population standard deviation is not given and has to be estimated from the sample. Its use requires two assumptions: (1) the 25 weight gains are independent observations from a random sample, and (2) weight gain is Normally distributed in the population of chicks. The first seems valid, because the question states that the chicks were randomly selected. The second matters here because n = 25 is small; it should be checked with a histogram or normal Q-Q plot of the values in chicken weights.csv (the data are not reproduced here).
(b) Irrespective of your answer to (a), calculate a 95% confidence interval for µ.
Confidence Interval Estimate for the Mean
Data
  Sample Standard Deviation: 0.202607996
  Sample Mean: 0.766
  Sample Size: 25
  Confidence Level: 95%
Intermediate Calculations
  Standard Error of the Mean: 0.040521599
  Degrees of Freedom: 24
  t Value: 2.0639
  Interval Half Width: 0.0836
Confidence Interval
  Interval Lower Limit: 0.6824
  Interval Upper Limit: 0.8496
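The interval is the sample mean plus or minus t × s / sqrt(n). The sketch below reproduces it from the summary statistics in the output. Python's standard library has no t quantile function, so the t value of 2.0639 is copied from the output as an assumption and is not computed.

```python
from math import sqrt

# Summary statistics copied from the output above.
x_bar, s, n = 0.766, 0.202607996, 25
t_value = 2.0639   # assumed: t quantile for 95% confidence, 24 degrees of freedom

standard_error = s / sqrt(n)
half_width = t_value * standard_error
ci_lower, ci_upper = x_bar - half_width, x_bar + half_width

print(f"SE = {standard_error:.6f}, 95% CI = ({ci_lower:.4f}, {ci_upper:.4f})")
```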
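(c) The mean weight gain over 14 days under a standard diet is 0.85 kg. (This figure has been established over several years, and can be regarded as a known constant.) Test at the 1% significance level whether the new diet has a mean weight gain that is greater than 0.85 kg. Give all the steps of the hypothesis testing process.
Hypotheses: H0: µ = 0.85 vs H1: µ > 0.85, where µ is the mean weight gain (in kg) over 14 days of chicks on the new diet. Level of significance: 0.01.
Test statistic and sampling distribution: t = (x̄ – 0.85) / (s / sqrt(n)), which follows a t-distribution with n – 1 = 24 degrees of freedom if H0 is true.
Rejection criterion: reject H0 if t > 2.4922.
Calculations:
t Test for Hypothesis of the Mean
Data
  Null Hypothesis µ = 0.85
  Level of Significance: 0.01
  Sample Size: 25
  Sample Mean: 0.766
  Sample Standard Deviation: 0.202607996
Intermediate Calculations
  Standard Error of the Mean: 0.0405
  Degrees of Freedom: 24
  t Test Statistic: -2.0730
Upper-Tail Test
  Upper Critical Value: 2.4922
  p-Value: 0.9755
Decision: Do not reject the null hypothesis, because t = -2.0730 is less than 2.4922 (equivalently, the p-value of 0.9755 exceeds 0.01).
Conclusion: There is no evidence at the 1% level that the new diet gives a mean weight gain greater than 0.85 kg. The sample mean of 0.766 kg is in fact below 0.85 kg.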
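The test statistic and the upper-tail decision can be checked from the summary statistics. In this sketch the critical value 2.4922 is copied from the output above as an assumption, since the standard library does not provide t quantiles.

```python
from math import sqrt

x_bar, s, n = 0.766, 0.202607996, 25
mu_0 = 0.85            # mean weight gain under the standard diet
t_critical = 2.4922    # assumed: upper 1% point of t with 24 degrees of freedom

t_statistic = (x_bar - mu_0) / (s / sqrt(n))
reject_h0 = t_statistic > t_critical    # upper-tail test of H1: mu > 0.85

print(f"t = {t_statistic:.4f}, reject H0: {reject_h0}")
```

The statistic is negative because the sample mean lies below 0.85, so an upper-tail test cannot reject H0 here.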
Question 5
(This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.) Researchers investigated the effect of broadband ultraviolet B (UVB) therapy and a topical cream used together on areas of psoriasis (a skin complaint). One of the outcome variables was the Psoriasis Area and Severity Index (PASI). The Table that appears below contains the PASI scores for 20 patients who were assessed at baseline and after eight treatments.
Subject                1     2     3     4     5     6     7     8     9    10
Baseline             5.9   7.6  12.8  16.5   6.1  14.4   6.6   5.4   9.6  11.6
After 8 Treatments   5.2  12.2   4.6   4.0   0.4   3.8   1.2   3.1   3.5   4.9
Subject               11    12    13    14    15    16    17    18    19    20
Baseline            11.1  15.6   6.9  15.2  21.0   5.9  10.0  12.2  20.2   6.2
After 8 Treatments  11.1   8.4   5.8   5.0   6.4   0.0   2.7   5.1   4.8   4.2
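The original question asked ‘Do these data provide evidence, at the .01 level of significance, to indicate that the combination therapy reduces PASI scores?’
(a) State the null and alternative hypotheses that you would test in order to answer this question. If you use symbols other than H0 and H1 (and ‘=’, etc) in your statement, you must explain what those symbols represent.
Let µd denote the mean of the differences d = (baseline PASI score) – (PASI score after 8 treatments) in the population of patients.
Null hypothesis: H_{0}: µd = 0 (the combination therapy does not change the mean PASI score)
Alternative hypothesis: H_{1}: µd > 0 (the combination therapy reduces the mean PASI score)
The alternative is one-sided because the question asks whether the therapy reduces PASI scores.
(b) If the data satisfy the appropriate statistical assumptions, then these hypotheses can be tested by using a t-test. Would you use a ‘Welch Two Sample t-test’ or a ‘paired t-test’? State which of these you would use, and give reasons for your answer.
We will use the paired t-test, because each patient supplies both a baseline score and an after-treatment score. The two sets of scores are therefore dependent, not two independent samples as the Welch test requires, and pairing removes the variation between patients.
(c) For the test that you chose in (b), state the statistical assumptions underlying the analysis. Do these assumptions seem valid in this case? Answer ‘Yes’ or ‘No’, and give reasons for your answer. You must show any computer output that you use.
The paired t-test assumes that (1) the 20 patients, and hence their differences, are independent of one another, and (2) the differences (baseline – after) are Normally distributed. Yes, the assumptions seem reasonable: the differences range from -4.6 to 15.4 about a mean of 6.22, with none more than about 2.2 standard deviations from the mean. A normal Q-Q plot or Shapiro-Wilk test of the differences should be shown as the supporting computer output (no such output is reproduced here).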
(d) Irrespective of your answer to (c), carry out the test of H0 vs H1, showing all steps in the decision process (except that you do not need to restate H0 and H1).
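Solution:
Test statistic and sampling distribution: t = DBar / (S_{D} / sqrt(n)), which follows a t-distribution with n – 1 = 19 degrees of freedom if H0 is true.
Rejection criterion: the question asks for the .01 level and for evidence of a reduction, so this is an upper-tail test on the differences (baseline – after). Reject H0 if t > 2.539.
Calculations:
Paired t Test
Data
  Hypothesized Mean Difference: 0
  Level of significance: 0.01
Intermediate Calculations
  Sample Size: 20
  DBar: 6.2200
  Degrees of Freedom: 19
  S_{D}: 5.0403
  Standard Error: 1.1271
  t Test Statistic: 5.5188
Upper-Tail Test
  Upper Critical Value: 2.539
  p-Value: < 0.0001
Decision: Reject the null hypothesis, because 5.5188 > 2.539.
Conclusion: The data provide evidence, at the .01 level of significance, that the combination therapy reduces PASI scores; the mean reduction in this sample was 6.22 points.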
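DBar, S_{D} and the t statistic can be recomputed from the table of PASI scores. The sketch below uses only the standard library. The critical value 2.539 (upper 1% point of t with 19 degrees of freedom, from standard t tables) is an assumption made for a one-sided test at the .01 level, as the original question words it.

```python
from math import sqrt
from statistics import mean, stdev

baseline = [5.9, 7.6, 12.8, 16.5, 6.1, 14.4, 6.6, 5.4, 9.6, 11.6,
            11.1, 15.6, 6.9, 15.2, 21.0, 5.9, 10.0, 12.2, 20.2, 6.2]
after = [5.2, 12.2, 4.6, 4.0, 0.4, 3.8, 1.2, 3.1, 3.5, 4.9,
         11.1, 8.4, 5.8, 5.0, 6.4, 0.0, 2.7, 5.1, 4.8, 4.2]

# Paired differences: a positive value means the PASI score fell.
differences = [b - a for b, a in zip(baseline, after)]

d_bar = mean(differences)
s_d = stdev(differences)                    # sample sd (n - 1 divisor)
standard_error = s_d / sqrt(len(differences))
t_statistic = d_bar / standard_error

t_critical = 2.539                          # assumed: from t tables, df = 19
reject_h0 = t_statistic > t_critical

print(f"DBar = {d_bar:.4f}, S_D = {s_d:.4f}, t = {t_statistic:.4f}, reject = {reject_h0}")
```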
Question 6
This question is ONLY for STA401 students. STA201 students do NOT need to submit this question. (This is based on a question from Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences. (10th edn) Hoboken, NJ: Wiley.) Two independent samples of Year 7 students were selected. Sample A was drawn from children who were caries-free, while Sample B was drawn from children who had lots of caries. The saliva pH levels were measured for all the children. The data are stored in several forms in the file saliva.csv in the folder in Interact2 where this Assignment is located.
(a) You are to use the independent-samples pooled t-test to test whether the mean saliva pH level of caries-free children differs from the corresponding mean for children with lots of caries. Use a 5% level of significance in all your testing.
Solution:
Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
  Hypothesized Difference: 0
  Level of Significance: 0.05
Population 1 Sample
  Sample Size: 15
  Sample Mean: 7.519333333
  Sample Standard Deviation: 0.29884221
Population 2 Sample
  Sample Size: 12
  Sample Mean: 7.271666667
  Sample Standard Deviation: 0.20112562
Intermediate Calculations
  Population 1 Sample Degrees of Freedom: 14
  Population 2 Sample Degrees of Freedom: 11
  Total Degrees of Freedom: 25
  Pooled Variance: 0.0678
  Standard Error: 0.1009
  Difference in Sample Means: 0.2477
  t Test Statistic: 2.4557
Two-Tail Test
  Lower Critical Value: -2.0595
  Upper Critical Value: 2.0595
  p-Value: 0.0214
Decision: Reject the null hypothesis, because 2.4557 > 2.0595 (equivalently, the p-value of 0.0214 is less than 0.05).
Conclusion: At the 5% level of significance there is evidence that the mean saliva pH differs between the two groups of children; the sample means were 7.52 and 7.27.
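The pooled variance, standard error and t statistic in this output follow from the two sets of summary statistics. The sketch below recomputes them; the function name `pooled_t` is illustrative, and the critical value 2.0595 is copied from the output as an assumption because the standard library has no t quantile function.

```python
from math import sqrt

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance t statistic for two independent samples."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, pooled_var, se

# Summary statistics copied from the output above.
t_statistic, df, pooled_var, se = pooled_t(
    15, 7.519333333, 0.29884221,
    12, 7.271666667, 0.20112562,
)

t_critical = 2.0595                         # assumed: two-tail 5% value, df = 25
reject_h0 = abs(t_statistic) > t_critical

print(f"df = {df}, pooled variance = {pooled_var:.4f}, SE = {se:.4f}, t = {t_statistic:.4f}")
```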