Sugar Variable: 1355508

Question 1

The variable selected from the data was sugar. The mean sugar level for the whole data set was 8.45. The mean was obtained by highlighting the sugar column in Excel, which displays the average at the lower part of the sheet, as shown below.

The best visualization for describing the distribution of the sugar level is a histogram, as shown below.
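The mean and the histogram bins described above can also be reproduced outside Excel. The following is a minimal sketch only: the `sugar` list holds made-up values for illustration (not the assignment data), and the helper name `histogram_counts` and the bin width of 5 are assumptions.

```python
from statistics import fmean


def histogram_counts(values, bin_width):
    """Count how many values fall into each bin [start, start + bin_width)."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width  # lower edge of the bin holding v
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical sugar values, for illustration only (not the assignment data).
sugar = [0, 3, 3, 5, 6, 6, 7, 9, 10, 12, 12, 14, 15]

mean_sugar = fmean(sugar)
bins = histogram_counts(sugar, bin_width=5)

print(f"mean = {mean_sugar:.2f}")
for start, count in bins.items():
    # One '#' per observation gives a rough text histogram.
    print(f"{start:>2}-{start + 5:<2} {'#' * count}")
```

A text histogram like this is enough to see the shape of the distribution; a charting tool would be used for the final figure.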

Question 2

  1. The hypothesis

Mathematical:

·  H0: μ = 8.45 

·  H1: μ ≠ 8.45

In written terms:

H0: The mean sugar level is equal to 8.45.

H1: The mean sugar level is different from 8.45.

  2. The selected level of significance was 0.05, which corresponds to a 95 % confidence level. It was selected because it is the conventional default in hypothesis testing.
  3. The sample size selected was 75. The sample standard deviation was 5.85, and the standard error was 0.68 (5.85 / √75). A Z score of 1.96 corresponds to the 95 % confidence level. The standard deviation and the standard error were both calculated from the sample.
  4. The test is a one-sample, two-tailed test. It is one-sample because it tests the mean of a single column against a fixed value, and two-tailed because H1 (μ ≠ 8.45) allows a difference in either direction.
  5. The critical value for the 75 observations is 1.969.
  6. The sample was obtained by creating a new column of random numbers. The data were then sorted by that column, and the top 75 observations were selected as the random sample. The data are shown in the Excel file.
  7. The t-statistic was 0.66, and the p-value was 0.51. Since the p-value was higher than 0.05, the null hypothesis was not rejected, and we conclude that the mean sugar level was not significantly different from 8.45.
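The sampling and test steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the spreadsheet workflow itself: the function names `draw_random_sample` and `one_sample_t` are hypothetical, and the test statistic is computed from the rounded summary values reported in the text (mean 8.9, standard deviation 5.85, n = 75), so it may differ from the reported 0.66 in the second decimal place.

```python
import math
import random


def draw_random_sample(values, n, seed=None):
    """Mimic the spreadsheet method: give each row a random key,
    sort by that key, and keep the top n rows."""
    rng = random.Random(seed)
    keyed = [(rng.random(), v) for v in values]
    keyed.sort(key=lambda pair: pair[0])
    return [v for _, v in keyed[:n]]


def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return the standard error and the t-statistic for H0: mu = mu0."""
    se = sample_sd / math.sqrt(n)
    t = (sample_mean - mu0) / se
    return se, t


# Rounded summary values as reported in the text.
se, t_stat = one_sample_t(sample_mean=8.9, sample_sd=5.85, n=75, mu0=8.45)

critical = 1.969  # critical value as reported in the text
reject_h0 = abs(t_stat) > critical  # two-tailed decision rule

print(f"SE = {se:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Because |t| is well below the critical value, the decision rule agrees with the p-value comparison: the null hypothesis is not rejected.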

3. Construct a confidence interval

a) The sample mean was 8.9.

b) The sample mean (8.9) was slightly higher than the population mean (8.45).

c) The 95 % confidence interval for the mean was [7.55, 10.24]. This means that we can be 95 % confident that the mean sugar level lies between 7.55 and 10.24.
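The interval is the sample mean plus or minus a critical value times the standard error. The sketch below is a hedged illustration: the function name `mean_confidence_interval` is hypothetical, and the critical value 1.993 is an assumption (the approximate two-tailed t table value for 74 degrees of freedom), not a figure taken from the text. With the rounded inputs, the result only approximately reproduces the reported interval.

```python
import math


def mean_confidence_interval(sample_mean, sample_sd, n, critical_value):
    """Return (lower, upper) = mean -/+ critical_value * sd / sqrt(n)."""
    margin = critical_value * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Assumption: approximate t critical value for df = 74 at the 95 % level,
# taken from a standard t table rather than from the text.
T_CRIT_DF74 = 1.993

lower, upper = mean_confidence_interval(8.9, 5.85, 75, T_CRIT_DF74)
contains_population_mean = lower <= 8.45 <= upper

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
print(f"contains 8.45: {contains_population_mean}")
```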

4. Compute the population mean and standard deviation

a) The population mean and standard deviation of the sugar level were 8.45 and 6.58, respectively. The sample mean (8.9) was slightly higher than the population mean, and the sample standard deviation (5.85) was slightly lower than the population standard deviation. This means that the mean and standard deviation of the sample were close to those of the population.

b) The population mean was 8.45, and it was within the 95 % confidence interval of the mean of the sample sugar level, i.e. 7.67 – 9.22.

5. Use simple linear regression on the random variables of fat and calories

a) Fat is the independent variable, and calories is the dependent variable.

b) The relationship between fat and calories is expected to be positive and strong.

c) The scatter plot

d) The results show that a one-unit increase in fat is associated with an increase of 19.507 units in calories. The intercept of 113.29 is the predicted calorie level when fat is zero, that is, the part of the calorie level that does not depend on fat. The value of r-squared (0.579) shows that fat explains 57.9 % of the variation in calories. The relationship between fat and calories matches my assumption: it is positive and strong.
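To make the interpretation of the slope, intercept, and r-squared concrete, here is a minimal least-squares sketch. The names `fit_line` and `predict_calories` are hypothetical, and the `fat` and `calories` lists are made-up values for illustration, not the assignment data. Only the defaults in `predict_calories` (19.507 and 113.29) come from the results above.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # share of variation in y explained by x
    return slope, intercept, r_squared


def predict_calories(fat, slope=19.507, intercept=113.29):
    """Predicted calories from the fitted line reported in the text."""
    return intercept + slope * fat


# Hypothetical data, for illustration only (not the assignment data).
fat = [0, 1, 2, 3, 4]
calories = [110, 135, 150, 175, 190]

slope, intercept, r_squared = fit_line(fat, calories)
print(f"slope = {slope:.3f}, intercept = {intercept:.2f}, r^2 = {r_squared:.3f}")
print(f"predicted calories at fat = 2: {predict_calories(2):.2f}")
```

The fitted slope is the change in predicted calories per unit of fat, and the intercept is the prediction at zero fat, which is how the reported values 19.507 and 113.29 are read.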