CALCULATION

MAS284: Applied Statistics and Process Management
External Assignment 4
Internal assignment 3
Due: Wednesday, May 23.
Semester 1, 2012
Where appropriate, you may use MINITAB/EXCEL or any suitable statistical package; not
all questions will need the use of a statistical package. Should you use a statistical package,
you should cut and paste the relevant part of the output next to your answer. If you hand
in an unedited output, it should be annotated and be placed immediately before or after your
comments/answers to the question; outputs placed as ‘appendices’ or not properly identified
may not be marked.
The data for questions 1, 2 and 4 can be found in the MINITAB/EXCEL file
a4 2012.MTW/a4 2012.xlxs in LMS.
1. (14 marks)
Monthly loan applications for a local bank over a three-year period were as follows (read
across from left to right):
20 22 26 30 36 35 40 41 36 32 34 22
19 19 22 30 31 30 36 39 37 36 36 32
29 30 28 32 38 37 39 44 48 47 33 28
(a) Graph the time series and identify any distinctive features of the plot.
(b) (i) Fit a linear trend line to the data.
(ii) Is the trend line significant? Justify.
(iii) Explain briefly the reasons for the significance or non-significance of the trend
line and the small R² value.
(c) An AR(3) model is fitted to the loan data with the results given below. Based on
these results, would it be reasonable to fit a smaller order autoregressive model?
Justify.
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 1.3677 0.1738 7.87 0.000
AR 2 -0.4707 0.2996 -1.57 0.126
AR 3 0.1007 0.2094 0.48 0.634
Number of observations: 36
Residuals: SS = 697.749 (backforecasts excluded)
MS = 21.144 DF = 33
2. (12 marks)
In a random sample of 33 junior executives, each is classified by potential for promotion
as good, poor, or uncertain. Each is given a test to measure his or her level of anxiety.
The coded results (the higher the value the higher the level of anxiety) are as follows:
Good 4 5 4 3 2 5 3 2 5 4 4 4 3
Poor 4 8 7 5 5 4 9 9 6 7
Uncertain 5 4 7 4 7 5 4 5 3 5
What are your conclusions about this study? Write a brief report which should include
the appropriate hypotheses and an evaluation of assumptions.
3. (12 marks)
Suppose that a golf association wants to compare the mean distances travelled by four
different brands (A, B, C and D) of golf balls when struck with a driver. A randomised
block design is employed utilising a random sample of 8 golfers, with each golfer using a
driver to hit four balls, one from each brand, in a random order. The distance travelled is
to be recorded for each hit.
(a) Why is a randomised block design employed and why is randomisation necessary?
(b) The experiment was carried out as designed but the results were erroneously analysed
as if the experiment had been completely randomised (i.e. no blocking) with the one-way
ANOVA output given below. Based on this output, is there evidence that the
mean distances associated with the brands differ?
(c) Suppose the sum of squares due to the golfers is 4264. Set up the ANOVA table that
will incorporate this information. Is the conclusion the same as that in (b)?
Analysis of Variance for dist
Source DF SS MS F P
brand 3 2652 884 2.51 0.079
Error 28 9832 351
Total 31 12484
4. (12 marks)
The temperature and pressure used in molding a certain plastic affect its tensile strength.
The following table shows the tensile strength (coded) of specimens of plastic molded at 3
different temperatures (coded 1,2,3) and under 3 different pressures (coded A,B,C).
              pressure
temperature   A    B    C
1             10   10   8
              11   10   8
              12   9    8
              12   8    9
2             8    12   9
              9    12   8
              9    11   9
              8    11   9
3             8    9    10
              9    9    10
              9    10   11
              9    9    11
(a) Calculate the means for the four observations in each temperature-pressure group.
Plot the means of the nine groups on a graph with tensile strength on the y axis and
temperature on the x axis. For each pressure, connect the three points corresponding
to the different temperatures. What does the plot tell you regarding interaction and
main effects? [ Note that you can alternatively plot the nine points on a strength vs
pressure graph and for each temperature, connect the three points corresponding to
the different pressures.]
(b) Run a two-way ANOVA on these data. Summarize the results of the significance
tests.

SOLUTION

1.

a) The time series for all three years together is as follows:

 

Fig. 1 – No. of monthly loan applications for three consecutive years

Shown below is the graph with three plots – one for each year.

 

Fig. 2 – No. of monthly loan applications for three consecutive years arranged year wise

As can be seen from the graphs, the number of monthly loan applications fluctuates, with a similar pattern repeating within each year.

Applications generally increase from January to August-September and then decline toward December in each year.

The overall level of monthly loan applications also rises over the three years, most clearly in the third year.

b)

 

 

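The equation of the linear trend line is y = 0.301x + 27.02, where x is the month number (1 to 36). The R-squared value is 0.185, so the line explains less than a fifth of the variation in the data and is a poor description of the series. Significance is a separate question from goodness of fit: the slope has a t-statistic of about 2.78 on 34 degrees of freedom (p < 0.01), so the upward trend is statistically significant at the 5% level even though the fit is poor.

The reason is that the relationship between applications and time is far from linear. Within each year, applications rise until August-September and then fall, and this seasonal swing accounts for most of the variation. The 36 observations are enough to detect the modest upward drift of about 0.3 applications per month, but a straight line cannot capture the seasonal pattern, which is why the R-squared value stays small.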
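The trend line, its R-squared value and the t-statistic of the slope can be checked with a few lines of plain Python. This is a minimal sketch: `linear_trend` is a hypothetical helper written for this illustration, and the month number 1 to 36 is assumed as the x variable. It should agree with the fitted line above up to rounding.

```python
import math

# Monthly loan applications from the question, read left to right (36 months).
loans = [
    20, 22, 26, 30, 36, 35, 40, 41, 36, 32, 34, 22,
    19, 19, 22, 30, 31, 30, 36, 39, 37, 36, 36, 32,
    29, 30, 28, 32, 38, 37, 39, 44, 48, 47, 33, 28,
]


def linear_trend(y):
    """Least-squares line of y on month number 1..n (hypothetical helper)."""
    n = len(y)
    x = list(range(1, n + 1))
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    # Residual mean square and the t statistic for H0: slope = 0 (n - 2 df).
    mse = (syy - slope * sxy) / (n - 2)
    t_stat = slope / math.sqrt(mse / sxx)
    return slope, intercept, r_squared, t_stat


slope, intercept, r_squared, t_stat = linear_trend(loans)
print(f"slope={slope:.3f} intercept={intercept:.2f} "
      f"R-squared={r_squared:.3f} t={t_stat:.2f}")
```

The two-sided 5% critical value of t on 34 degrees of freedom is about 2.03, so a t-statistic above that value indicates a significant slope regardless of how small R-squared is.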

 

 

 

 

 

2.

The graph below shows the anxiety level of each executive, grouped by potential for promotion.

The executives are plotted in the order given in the data. This order is arbitrary and does not affect the analysis, since only the anxiety levels (Y-axis) matter[1].

 

 

Fig. 3 – Level of Anxiety for employees with different potential for promotion

 

From the graph, it can be seen that employees with a Good promotion potential are the least anxious. The level of anxiety, on average, increases as the potential for promotion decreases.

A single factor ANOVA table is shown below (see “2.xlsx”):

Source of Variation SS df MS F P-value F crit
Between Groups 42.20047 2 21.10023 10.08467 0.000447 3.31583
Within Groups 62.76923 30 2.092308
Total 104.9697 32

       

Table 1 – A single factor ANOVA for the given data

 

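The null hypothesis is that the mean anxiety level is the same for the three promotion groups (Good, Poor and Uncertain); the alternative is that at least one group mean differs. The F statistic (10.08) exceeds the critical value (3.32) and the p-value (0.000447) is far below 0.05. Hence, the null hypothesis can be rejected. We can conclude that the mean level of anxiety differs with the promotion prospects of an employee; the differences between the groups are unlikely to be due to chance alone. This conclusion relies on the usual ANOVA assumptions of independent observations, approximately normal anxiety scores within each group, and similar variances across the groups.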
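The single factor ANOVA can also be worked through outside Excel. The following is a minimal sketch in plain Python, where `one_way_anova` is a hypothetical helper written for this illustration. It uses the anxiety scores exactly as printed in the question; on those values the sums of squares come out slightly below the ones in Table 1 (F is about 10.35 rather than 10.08), so treat it as an illustration of the method rather than a reproduction of the spreadsheet. The conclusion is unchanged, since F is still far above the critical value of 3.32.

```python
# Anxiety scores as printed in the question, grouped by promotion potential.
anxiety = {
    "Good": [4, 5, 4, 3, 2, 5, 3, 2, 5, 4, 4, 4, 3],
    "Poor": [4, 8, 7, 5, 5, 4, 9, 9, 6, 7],
    "Uncertain": [5, 4, 7, 4, 7, 5, 4, 5, 3, 5],
}


def one_way_anova(groups):
    """Between/within sums of squares, df and F for a one-way layout."""
    values = [v for g in groups.values() for v in g]
    n = len(values)
    grand_mean = sum(values) / n
    means = {name: sum(g) / len(g) for name, g in groups.items()}
    ss_between = sum(len(g) * (means[name] - grand_mean) ** 2
                     for name, g in groups.items())
    ss_within = sum((v - means[name]) ** 2
                    for name, g in groups.items() for v in g)
    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "means": means,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f": f_stat,
    }


result = one_way_anova(anxiety)
for name, m in result["means"].items():
    print(f"{name}: mean anxiety {m:.2f}")
print(f"F({result['df_between']}, {result['df_within']}) = {result['f']:.2f}")
```

The group means printed first also show the ordering described above: Good is lowest, Uncertain is in between and Poor is highest.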

 

 

 

 

 

 

 

 

 

 

 

3.

a) A randomized block design is used to remove the effect of a nuisance factor, here the golfer, from the comparison of brands. Golfers differ in how far they hit the ball, so having every golfer hit all four brands and treating each golfer as a block takes golfer-to-golfer variation out of the error term and makes brand differences easier to detect. The order in which each golfer hits the four balls is randomized so that order effects (for example warming up, fatigue or changing conditions) are not systematically associated with any one brand. The factor of interest is the brand, measured by the distance travelled by its balls.

b) The p-value is 0.079, which is higher (though only slightly) than the standard significance level of 0.05. Hence, the evidence is insufficient to conclude that the mean distance of at least one brand differs from the others. It must be noted that the p-value does not measure the probability that the hypothesis is true; it only indicates how strongly the data speak against the null hypothesis of equal brand means, and failing to reject that hypothesis does not show that the means are equal.

c) See the file “3.xlsx”, which contains the required formulas in the cells as well. The table is as shown below:

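Source DF SS MS F P
brand 3 2652 884 3.33 0.04[2]
golfer 7 4264 609.14 2.30
error 21 5568 265.14
Total 31 12484

Table 2 – The ANOVA table for the data

The golfer (block) sum of squares is taken out of the one-way error term, leaving an error sum of squares of 9832 - 4264 = 5568 on 28 - 7 = 21 degrees of freedom; the brand and total sums of squares are unchanged. The F-value for brand is now 884/265.14 = 3.33 on (3, 21) degrees of freedom, with a p-value of about 0.04. This is below the threshold of 0.05. Hence, there is sufficient evidence to conclude that the mean distance of at least one brand differs from the others. This result is different from the one obtained in b): blocking on golfers reduces the error mean square from 351 to about 265, which makes the brand differences detectable.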
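A randomized block analysis takes the block (golfer) sum of squares out of the one-way error term while leaving the brand and total sums of squares unchanged. The following is a minimal sketch in Python of that arithmetic, using only the figures given in the question; `randomized_block_table` is a hypothetical helper written for this illustration, and one observation per brand-golfer combination is assumed.

```python
def randomized_block_table(ss_treat, ss_block, ss_total, n_treat, n_block):
    """ANOVA rows for a randomized block design with one observation per cell."""
    df_treat = n_treat - 1
    df_block = n_block - 1
    df_error = df_treat * df_block
    # Whatever is not explained by treatments or blocks is error.
    ss_error = ss_total - ss_treat - ss_block
    ms_error = ss_error / df_error
    rows = {}
    for name, ss, df in [("brand", ss_treat, df_treat),
                         ("golfer", ss_block, df_block)]:
        ms = ss / df
        rows[name] = {"df": df, "ss": ss, "ms": ms, "f": ms / ms_error}
    rows["error"] = {"df": df_error, "ss": ss_error, "ms": ms_error}
    rows["total"] = {"df": n_treat * n_block - 1, "ss": ss_total}
    return rows


# Brand SS and total SS come from the one-way output; golfer SS from part (c).
table = randomized_block_table(2652, 4264, 12484, n_treat=4, n_block=8)
for source, row in table.items():
    print(source, {key: round(value, 2) for key, value in row.items()})
```

The 5% critical value of F on (3, 21) degrees of freedom is about 3.07, so a brand F ratio above that value is significant at the 5% level.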

 

 

4.

a) The Pivot table in “4.xlsx” gives the averages/means of the four observations for the different pressures for each temperature. The table is reproduced below:

Temperature Average of A Average of B Average of C
1 11.25 9.25 8.25
2 8.5 11.5 8.75
3 8.75 9.25 10.5
Grand Total 9.5 10 9.166666667
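The cell means in the table can be recomputed directly from the raw observations. The following is a minimal sketch in Python; the dictionary layout keyed by (temperature, pressure) is an assumption made for this illustration, not the layout of “4.xlsx”.

```python
# Tensile strength observations keyed by (temperature, pressure).
strength = {
    (1, "A"): [10, 11, 12, 12], (1, "B"): [10, 10, 9, 8], (1, "C"): [8, 8, 8, 9],
    (2, "A"): [8, 9, 9, 8], (2, "B"): [12, 12, 11, 11], (2, "C"): [9, 8, 9, 9],
    (3, "A"): [8, 9, 9, 9], (3, "B"): [9, 9, 10, 9], (3, "C"): [10, 10, 11, 11],
}

# Mean of the four observations in each temperature-pressure group.
cell_means = {key: sum(obs) / len(obs) for key, obs in strength.items()}

# Mean over all temperatures for each pressure (the "Grand Total" row).
pressure_means = {
    p: sum(cell_means[(t, p)] for t in (1, 2, 3)) / 3 for p in "ABC"
}

for t in (1, 2, 3):
    print(t, [cell_means[(t, p)] for p in "ABC"])
print("Grand Total", [round(pressure_means[p], 2) for p in "ABC"])
```

Plotting each column of this output against temperature gives the three lines of the interaction plot, one per pressure.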

 

The plot of tensile strength vs. temperature for different pressures is shown below:

 

Fig. 4 – Tensile strength vs. Temperature for different pressures

 

From the plot, it is clear that

  • For Pressure A, the tensile strength is highest for temperature 1, and then decreases for temperature 2 and 3. The tensile strength of the plastic for temperatures 2 and 3 is approximately the same.
  • For pressure B, the tensile strength for temperatures 1 and 3 is the same (at 9.25), but is higher (11.5) for temperature 2.
  • For pressure C, the tensile strength increases steadily as the temperature changes from 1 to 3.

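Because the three lines are far from parallel and cross one another, the plot indicates a strong interaction: the effect of temperature on tensile strength depends on the pressure used. The main effects appear small by comparison. The mean strength is almost the same at the three temperatures (about 9.58, 9.58 and 9.5), and the pressure means (9.5, 10 and 9.17) differ only modestly.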

b) The two-way ANOVA table (with replication) is shown below. In the Excel output, Sample refers to temperature and Columns refers to pressure:

 

 

 

ANOVA
Source of Variation SS df MS F P-value F crit
Sample 0.055556 2 0.027778 0.065217 0.937011 3.354131
Columns 4.222222 2 2.111111 4.956522 0.014672 3.354131
Interaction 43.11111 4 10.77778 25.30435 8.56E-09 2.727765
Within 11.5 27 0.425926
Total 58.88889 35        

 

The p-value of the interaction is very small (8.56E-09). Hence, the null hypothesis of no interaction is rejected: the effect of temperature on tensile strength depends on the pressure, and vice versa.

The main effect of pressure (Columns) is also significant, with a p-value of 0.0147, which is below 0.05. It can be concluded from the analysis that mean tensile strength differs between the pressures.

On the other hand, the p-value for temperature (Sample) is 0.937, which is greater than 0.05, hence the null hypothesis of no temperature main effect cannot be rejected. Because the interaction is so strong, the main effects should be interpreted with caution: temperature clearly matters, but its effect changes direction from one pressure to another and averages out.
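The sums of squares and F ratios of the two-way ANOVA can be checked by hand. The following is a minimal sketch in plain Python for a balanced design with replication; `two_way_anova` is a hypothetical helper written for this illustration, and it should reproduce the SS, df and F columns of the Excel output (p-values are not computed here).

```python
# Tensile strength observations keyed by (temperature, pressure).
strength = {
    (1, "A"): [10, 11, 12, 12], (1, "B"): [10, 10, 9, 8], (1, "C"): [8, 8, 8, 9],
    (2, "A"): [8, 9, 9, 8], (2, "B"): [12, 12, 11, 11], (2, "C"): [9, 8, 9, 9],
    (3, "A"): [8, 9, 9, 9], (3, "B"): [9, 9, 10, 9], (3, "C"): [10, 10, 11, 11],
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """Sums of squares, df and F ratios for a balanced two-factor design."""
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    reps = len(cells[(rows[0], cols[0])])
    grand = mean([v for obs in cells.values() for v in obs])
    row_means = [mean([v for c in cols for v in cells[(r, c)]]) for r in rows]
    col_means = [mean([v for r in rows for v in cells[(r, c)]]) for c in cols]
    ss = {
        "temperature": reps * len(cols) * sum((m - grand) ** 2 for m in row_means),
        "pressure": reps * len(rows) * sum((m - grand) ** 2 for m in col_means),
        "within": sum((v - mean(obs)) ** 2 for obs in cells.values() for v in obs),
    }
    # Variation between the nine cell means, minus the two main effects.
    ss_cells = reps * sum((mean(obs) - grand) ** 2 for obs in cells.values())
    ss["interaction"] = ss_cells - ss["temperature"] - ss["pressure"]
    df = {
        "temperature": len(rows) - 1,
        "pressure": len(cols) - 1,
        "interaction": (len(rows) - 1) * (len(cols) - 1),
        "within": len(rows) * len(cols) * (reps - 1),
    }
    ms_within = ss["within"] / df["within"]
    f = {k: (ss[k] / df[k]) / ms_within
         for k in ("temperature", "pressure", "interaction")}
    return ss, df, f


ss, df, f = two_way_anova(strength)
for source in ("temperature", "pressure", "interaction"):
    print(f"{source}: SS={ss[source]:.4f} df={df[source]} F={f[source]:.4f}")
print(f"within: SS={ss['within']:.4f} df={df['within']}")
```

Comparing each F ratio with the F crit column of the table gives the same decisions as the p-values: only the temperature main effect falls below its critical value.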

 

 


 

 



[1] An alternative way of doing this analysis is to build a Pivot Table in Excel. However, the method used here is shorter and simpler and leads to more or less the same conclusion.

[2] Calculated using the FDIST function in Excel, P = FDIST(F, df1, df2), with the brand and error degrees of freedom. See “3.xlsx”.
