Ann Arbor Rent: 1194188

If I have a suspicion of plagiarism, you will have to come to my office and retake the exam. Students participating in plagiarism activities could also receive zero for this exam and will be subject to disciplinary sanctions (including but not limited to a failing grade in this course).

Each question is worth 20 points. Please type your answers when possible. Please place all of your answers right below each question. DO NOT delete the questions. Please make sure to provide explanations for all the answers you give. If you are using Excel to perform testing, please make sure to copy and paste the formulas you have used. This will help me identify why you may have made a mistake, and it will assure me that you knew what you were doing when answering the questions and were not just guessing. You must provide all the explanations and show all calculations to receive full credit.

For problems 1-5, you will need to access the file “Ann Arbor Rent”, which contains data on rental prices and housing characteristics of apartments in the Ann Arbor area. The following data is presented in the file:

Rent – monthly rent in $ per apartment unit
Bed – number of bedrooms
Bath – number of bathrooms
Sqft – size of the apartment in square feet
Safety – neighborhood safety ratings (1 – least safe, 5 – most safe)
Garage – garage availability and cost : ‘paid’ – garage parking available for extra cost, ‘included’ – garage parking available for no extra cost, ‘no’ – garage parking is not available

Please answer ALL questions as if you were creating a report that you would present at your office meeting. This means your report must be clear, concise, visually attractive, and written in the manner that it is easily understood by your co-workers (who are not as proficient in econometrics and statistics as you are).

Question 1. Descriptive statistics.

  1. Create a frequency distribution table for “Bed” and the “Rent” variables. In a few sentences discuss what you can conclude based on the frequency distribution tables. Create a pie chart using the frequency distribution table for “Bed” variable and create a histogram for “Rent” variable.  

Answer:

Table 1 Frequency distribution table for Bed

Table 1 shows that 19 of the apartments have one bedroom and 21 have two bedrooms.

Table 2 Frequency distribution table for Rent

Table 2 shows that the 1000-1500 rent class contains the largest number of apartments and the 2000-2500 class contains the fewest.

Figure 1 Pie chart on Bed

Figure 2 Histogram for Rent
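The frequency tables above could also be built outside Excel. The following is a minimal Python sketch, assuming made-up values (they are not taken from the “Ann Arbor Rent” file) and an assumed class width of 500 for Rent.

```python
from collections import Counter

# Hypothetical sample values, NOT taken from the "Ann Arbor Rent" file.
bed = [1, 2, 2, 1, 2, 1, 2, 2]
rent = [950, 1200, 1450, 1100, 2100, 800, 1600, 1300]

# Frequency table for a discrete variable: count each distinct value.
bed_freq = Counter(bed)

# Frequency table for a continuous variable: count values per class.
# Each class is [lower, lower + width); the width of 500 is an assumption.
width = 500
rent_freq = Counter((r // width) * width for r in rent)

for value in sorted(bed_freq):
    print(f"Bed = {value}: {bed_freq[value]} apartments")
for lower in sorted(rent_freq):
    print(f"Rent {lower}-{lower + width}: {rent_freq[lower]} apartments")
```

The same counts feed the pie chart (Bed) and the histogram (Rent).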

  • Provide summary statistics for “Sqft” variable. Make sure to discuss these statistics (Mean, median, mode, st. dev., range).

Answer:

Table 2 Summary output Sqft

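The mean, median, and mode of Sqft are 1190.98, 1175, and 1000 square feet, respectively. The standard deviation is 350.48 square feet and the range is 1450 square feet. The mean is close to the median, which suggests that apartment sizes are distributed roughly symmetrically around about 1200 square feet.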
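As a cross-check on the Excel summary output, the same five statistics can be computed with Python's standard library. This is only a sketch; the sample values are hypothetical and do not come from the data file.

```python
import statistics

# Hypothetical apartment sizes in square feet, NOT the real data.
sqft = [800, 1000, 1000, 1150, 1200, 1400, 1850]

mean = statistics.mean(sqft)
median = statistics.median(sqft)
mode = statistics.mode(sqft)
st_dev = statistics.stdev(sqft)        # sample standard deviation (n - 1)
value_range = max(sqft) - min(sqft)

print(mean, median, mode, round(st_dev, 2), value_range)
```

`statistics.stdev` divides by n - 1, which matches the sample standard deviation reported in a typical spreadsheet summary.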

  • Calculate correlation coefficient between “Rent” and “Safety” variables. Provide interpretation. Create a scatterplot between these two variables. Discuss if your scatterplot supports the correlation coefficient.

Answer:

Table 3 Correlation table

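The correlation coefficient between Rent and Safety is 0.7, which indicates a strong positive linear relationship: apartments in safer neighborhoods tend to have higher rents.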

Figure 3 Scatter plot on safety versus Rent

Figure 3 shows the scatter plot of Safety (X-axis) versus Rent (Y-axis). The plot is consistent with the strong, positive correlation between the two variables.
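For readers who want to see what the correlation coefficient measures, here is a small hand-written Pearson correlation in Python. The data are hypothetical and the function name `pearson_r` is chosen only for this illustration.

```python
import math

# Hypothetical (Safety, Rent) pairs, NOT the real data.
safety = [1, 2, 2, 3, 4, 4]
rent = [800, 950, 1100, 1300, 1500, 1700]

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y, scaled to [-1, 1]."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(safety, rent)
print(round(r, 2))
```

A value near +1 means the points in the scatter plot lie close to an upward-sloping line.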

  • Complete the following table showing statistics for Rent by Safety. Which Safety rating has the highest average rent? Which safety rating has the lowest average rent? Is that expected? Explain.
                       Safety 1   Safety 2   Safety 3   Safety 4
Number of apartments
Average Rent
St. Deviation

 (Hint: You may have to rearrange the data to answer this question. You can create a table in Excel where you would group your data according to the safety rating, similar to the table listed above. Then calculate the averages and standard deviations for each of the groups.)

Answer:
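The grouping described in the hint could be sketched as follows. This is an illustration with hypothetical rows, not the completed table for the real data.

```python
import statistics
from collections import defaultdict

# Hypothetical (Safety, Rent) rows, NOT the real data.
rows = [(1, 800), (1, 900), (2, 1000), (2, 1200),
        (3, 1300), (3, 1500), (4, 1700), (4, 1900)]

# Collect the rents that belong to each safety rating.
rent_by_safety = defaultdict(list)
for safety, rent in rows:
    rent_by_safety[safety].append(rent)

# One summary row per rating: count, average, sample standard deviation.
summary = {
    safety: {
        "n": len(rents),
        "mean": statistics.mean(rents),
        "stdev": statistics.stdev(rents),
    }
    for safety, rents in sorted(rent_by_safety.items())
}

highest = max(summary, key=lambda s: summary[s]["mean"])
lowest = min(summary, key=lambda s: summary[s]["mean"])
print(summary, highest, lowest)
```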

Question 2. Hypothesis testing

  1. Your friend, who is moving to Ann Arbor claims that apartments with safety ratings of 3 or 4 are on average more expensive than apartments with safety ratings of 1 and 2.   Use hypothesis testing to prove your friend right or wrong. Please make sure to set up the null and alternative hypothesis and show how you calculated t-statistic and p-value for your hypothesis. State the conclusion based on your results.  

Answer:

Null hypothesis (H0): The average rent of apartments with safety ratings of 3 or 4 is not higher than the average rent of apartments with safety ratings of 1 or 2.

Alternative hypothesis (H1): The average rent of apartments with safety ratings of 3 or 4 is higher than the average rent of apartments with safety ratings of 1 or 2.

Table 4 T-test output

Test statistic (t) = -6.55

P-value = 0.00

Alpha= 0.05

The P-value is less than alpha (0.05), so the null hypothesis is rejected in favor of the alternative hypothesis. We conclude that apartments with safety ratings of 3 or 4 are, on average, more expensive than apartments with safety ratings of 1 or 2, so the friend's claim is supported.
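The two-sample t-statistic can be reproduced by hand. The sketch below uses the unequal-variances (Welch) form, which is an assumption because the answer does not say which Excel t-test was used; the rents are hypothetical.

```python
import math
import statistics

# Hypothetical rents, NOT the real data.
low_safety = [800, 900, 950, 1000, 1100]        # ratings 1 and 2
high_safety = [1300, 1400, 1500, 1650, 1700]    # ratings 3 and 4

def welch_t(a, b):
    """t-statistic and degrees of freedom for mean(a) - mean(b)."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, df = welch_t(high_safety, low_safety)
print(round(t_stat, 2), round(df, 1))
```

The sign of the statistic depends only on which group is listed first: swapping the two groups turns a positive t into a negative one of the same size. The one-tailed p-value is then read from the t distribution with `df` degrees of freedom.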

  • The same friend claims that apartments with two bedrooms, on average, are larger than 1200 square feet. Use hypothesis testing to prove him right or wrong. Please make sure to set up the null and the alternative hypothesis and show how you calculated t-statistic and p-value for your hypothesis. State the conclusion based on your results.

Answer:

Null hypothesis (H0): The average size of two-bedroom apartments is not larger than 1200 square feet.

Alternative hypothesis (H1): The average size of two-bedroom apartments is larger than 1200 square feet.

Table 5 T-test output

Test statistic (t) = 21.96

P-value = 0.00

Alpha= 0.05

The P-value is less than alpha (0.05), so the null hypothesis is rejected in favor of the alternative hypothesis. We conclude that two-bedroom apartments are, on average, larger than 1200 square feet.
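A one-sample t-test compares the sample mean with the hypothesized value (here 1200), not with zero. The sketch below uses hypothetical sizes and approximates the one-tailed p-value by numerically integrating the Student t density; the helper names `t_pdf` and `t_upper_tail` are introduced only for this example.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|]; steps must be even."""
    h = abs(t) / steps
    area = t_pdf(0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

# Hypothetical sizes of two-bedroom apartments, NOT the real data.
sqft_two_bed = [1150, 1250, 1300, 1350, 1400, 1450]
mu0 = 1200  # hypothesized mean from the friend's claim

n = len(sqft_two_bed)
std_error = statistics.stdev(sqft_two_bed) / math.sqrt(n)
t_stat = (statistics.mean(sqft_two_bed) - mu0) / std_error
p_value = t_upper_tail(t_stat, n - 1)  # one-tailed: H1 is "larger than"
print(round(t_stat, 2), round(p_value, 3))
```

If the hypothesized mean is left out of the numerator, the statistic is computed against zero and comes out much larger than it should.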

  • Your friend (whom you dislike by this point) claims that safety and garage availability are completely unrelated. Meaning that either ‘no-garage’, or ‘paid garage’, or ‘included garage’ apartments would have, on average, same average safety rating. Set up the hypothesis and use ANOVA to test your friend’s claim.  Interpret the results. (Hint: you will have to split your data according to the garage availability).

Answer:

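Null hypothesis (H0): The average safety rating is the same for ‘no’, ‘paid’, and ‘included’ garage apartments, so there is no relationship between safety and garage availability.

Alternative hypothesis (H1): At least one garage category has a different average safety rating, so there is a relationship between safety and garage availability.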

Table 6 ANOVA Output

Test statistic (F) = 7.96

P-value = 0.00

Alpha= 0.05

The P-value is less than alpha (0.05), so the null hypothesis is rejected in favor of the alternative hypothesis. We conclude that the average safety rating differs across garage categories, so safety and garage availability are related and the friend's claim is not supported.
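The F statistic in a one-way ANOVA compares the variation between group means with the variation inside the groups. A minimal sketch with hypothetical safety ratings (not the real data):

```python
import statistics

# Hypothetical safety ratings split by garage availability, NOT the real data.
groups = {
    "no": [1, 2, 2, 3],
    "paid": [2, 3, 3, 4],
    "included": [3, 4, 4, 5],
}

def anova_f(samples):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for s in samples for v in s]
    grand_mean = statistics.mean(all_values)
    k, n = len(samples), len(all_values)
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2
                     for s in samples)
    ss_within = sum((v - statistics.mean(s)) ** 2
                    for s in samples for v in s)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f_stat, df_between, df_within = anova_f(list(groups.values()))
print(f_stat, df_between, df_within)
```

A large F means the group averages differ by more than the spread inside the groups would explain.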

Question 3. Regression analysis

You want to test if the square footage determines the price of the apartment.  

  1. Create a scatterplot between the price of the apartments and the square footage variables.

Answer:

Figure 4 Scatter plot on Sqft versus Rent

  • Based on the scatterplot, are the assumptions of linear regression model (linearity, homoscedasticity, and conditional mean independence) satisfied? (make sure to explain your reasoning).

Answer: Yes. The scatter plot of Sqft versus Rent in Figure 4 appears to satisfy the assumptions of linearity, homoscedasticity, and conditional mean independence. The data points follow a roughly linear pattern and lie close to the trend line, with a similar spread around the line across apartment sizes and no visible systematic pattern in the deviations from it.

  • Write the equation that you plan on estimating using regression analysis. Discuss the expected sign (positive or negative) of an estimated slope coefficient

Answer:

Table 7 Regression output

The regression model is as below

Rent = intercept + slope * Sqft

The expected sign of the slope is positive, because in general when the square footage of an apartment increases, its rent also increases.

  • Use excel to estimate the regression coefficients (intercept and slope, your betas).

Answer:

Rent = -39.85+1.11* Sqft
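The intercept and slope of a simple regression follow directly from the least-squares formulas. The sketch below uses hypothetical data, so its coefficients differ from the ones reported above.

```python
# Hypothetical (Sqft, Rent) pairs, NOT the real data.
sqft = [800, 1000, 1200, 1400, 1600]
rent = [850, 1050, 1300, 1500, 1800]

def ols(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = ols(sqft, rent)
print(round(intercept, 2), round(slope, 3))
```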

  • Interpret both estimated coefficients and discuss p-value for each estimated coefficient.

Answer:

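The estimated intercept is -39.85 and the estimated slope is 1.11.

The slope means that each additional square foot is associated with an increase of about $1.11 in monthly rent. Its P-value is 0.00, which is below 0.05, so the slope is statistically significant. The intercept is the predicted rent of an apartment with 0 square feet, which has no practical meaning. Its P-value is 0.73, which is above 0.05, so the intercept is not statistically different from zero.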

  • What is the expected rent rate for the apartment that is 1500 square feet large?  

Answer:

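For X = 1500:

Rent = -39.85 + 1.11 * 1500 = 1620.92

(The rounded coefficients shown here give 1625.15; the value 1620.92 presumably reflects the unrounded coefficients from the regression output.)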
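The prediction step is plain arithmetic on the estimated equation. A short sketch using the rounded coefficients reported in this answer (the function name `predict_rent` is only illustrative):

```python
# Rounded coefficients as reported in the answer above.
INTERCEPT = -39.85
SLOPE = 1.11

def predict_rent(sqft):
    """Predicted monthly rent in $ for an apartment of the given size."""
    return INTERCEPT + SLOPE * sqft

predicted = predict_rent(1500)
print(round(predicted, 2))
```

Because the coefficients are rounded to two decimals, the result can differ by a few dollars from a prediction made with the full-precision spreadsheet output.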

Question 4. Regression analysis and hypothesis testing.

  1. Based on the results you obtained for Question 3, interpret 95% confidence intervals for estimated slope and intercept coefficients.

Answer:

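The 95% confidence interval for a regression coefficient = estimate ± t critical value * standard error of the estimate, where the t critical value has n - 2 degrees of freedom.

95% confidence interval for the slope = (0.92, 1.29). We are 95% confident that each additional square foot changes monthly rent by between $0.92 and $1.29. The interval does not contain 0, so the slope is statistically significant at the 5% level.

95% confidence interval for the intercept = (-271.235, 191.5306). The interval contains 0, so the intercept is not statistically different from zero at the 5% level.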
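The slope interval can be approximately rebuilt from the numbers reported in this answer. The sketch assumes that 12.01 is the t-statistic of the slope against zero and that the sample has 40 apartments (19 + 21 from Table 1), which gives 38 degrees of freedom; both are assumptions, and rounding means the result only approximates the spreadsheet interval.

```python
# Values reported in the answer (rounded).
slope = 1.11
t_vs_zero = 12.01      # assumed to be the slope's t-statistic for H0: slope = 0

# Two-tailed 5% critical value of the t distribution with 38 degrees of
# freedom. The sample size n = 40 behind this value is an assumption.
t_crit = 2.024

std_error = slope / t_vs_zero            # back out the standard error
lower = slope - t_crit * std_error
upper = slope + t_crit * std_error
print(round(lower, 2), round(upper, 2))
```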

  • Remember your friend from Question 2? Well, he is very certain that increase in the apartment size positively affects rent price. Use hypothesis testing to check his claim. (Make sure to specify null hypothesis, find t-statistics and p-value, interpret the results).

Answer:

Null hypothesis (H0): Apartment size has no positive effect on rent (slope ≤ 0).

Alternative hypothesis (H1): Apartment size has a positive effect on rent (slope > 0).

Test statistic = 12.01

P-value = 0.00

Alpha= 0.05

The P-value is less than alpha (0.05), so the null hypothesis is rejected in favor of the alternative hypothesis. We conclude that apartment size has a positive effect on rent.

  • That same friend, let’s call him Lenny, is now convinced that he has as much real estate knowledge as any other realtor, so he says that on average, apartment rent will increase by more than $1 for each square foot increase in the size of the apartment. Use hypothesis testing to check his claim.

Answer:

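Null hypothesis (H0): Rent increases by no more than $1 for each additional square foot (slope ≤ 1).

Alternative hypothesis (H1): Rent increases by more than $1 for each additional square foot (slope > 1).

Test statistic = (estimated slope - 1) / standard error of the slope. The statistic 12.01 from the regression output tests the slope against 0, not against 1, so it cannot be used here. Using the rounded values above, the standard error is about 1.11 / 12.01 = 0.09, so

Test statistic = (1.11 - 1) / 0.09 = 1.2 (approximately)

P-value > 0.05 (the statistic is below the one-tailed 5% critical value of about 1.7)

Alpha = 0.05

The P-value is greater than alpha (0.05), so the null hypothesis cannot be rejected. The data do not provide sufficient evidence that rent increases by more than $1 for each additional square foot, so Lenny's claim is not supported.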
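Testing a slope against a value other than zero only changes the numerator of the t-statistic. The sketch below works from the rounded values reported in this answer; the critical value assumes 38 degrees of freedom (a sample of 40 apartments), which is an assumption.

```python
# Rounded values reported in the answer.
slope = 1.11
t_vs_zero = 12.01        # assumed t-statistic of the slope for H0: slope = 0
hypothesized_slope = 1.0 # the claim: rent rises by more than $1 per sq ft

std_error = slope / t_vs_zero
t_stat = (slope - hypothesized_slope) / std_error

# One-tailed 5% critical value for 38 degrees of freedom (n = 40 assumed).
t_crit_one_tailed = 1.686
reject_null = t_stat > t_crit_one_tailed
print(round(t_stat, 2), reject_null)
```

With these inputs the statistic is well below the critical value, so the test does not reject a slope of $1 or less.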

  • Does the confidence interval for the slope coefficient support your answer in part c)? Explain.

Answer: The 95% confidence interval for the slope is (0.92, 1.29). It lies entirely above 0, which supports a positive effect of apartment size on rent. However, the interval contains 1, so on its own it does not rule out a rent increase of $1 or less for each additional square foot.

Question 5. Choosing functional form.  

  1. In the data file, generate a column = ln(Rent) and a column = ln(Sqft). You do not have to print this data! Just use it to answer the questions below.
  2. Run a regression for each of the specified functional forms and interpret the estimated slope coefficients. Don’t worry about interpreting intercept. Make sure to examine p-value for each coefficient in each model and use it when you interpret your coefficients.
  3. Linear
  4. Log-linear
  5. Linear – log
  6. Log – log.  

Answer:

The estimated slope coefficient is 0.89 and the estimated intercept is 0.85.

P-value for intercept = 0.11

P-value for slope= 0.00

The regression model is as below

For linear

Rent = -39.85 + 1.11 * Sqft. Estimated slope = 1.11 (P-value = 0.00); estimated intercept = -39.85 (P-value = 0.73). The slope is statistically significant: each additional square foot is associated with an increase of about $1.11 in monthly rent.

Table 8 Regression output (linear)

Figure 5 Scatter plot on Sqft versus Rent (linear)

For Log-linear

ln(Rent) = 6.13 + 0.0008 * Sqft. Estimated slope = 0.0008 (P-value = 0.000); estimated intercept = 6.13 (P-value = 0.000). The slope is statistically significant: each additional square foot is associated with an increase of about 0.08% in monthly rent.

Table 9 Regression output (log- linear)

Figure 6 Scatter plot on Sqft versus Rent (log- linear)

For Linear log

Rent = -6928.98 + 1166.30 * ln(Sqft). Estimated slope = 1166.30 (P-value = 0.000); estimated intercept = -6928.98 (P-value = 0.000). The slope is statistically significant: a 1% increase in square footage is associated with an increase of about $11.66 in monthly rent.

Table 10 Regression output (linear –log)

Figure 7 Scatter plot on Sqft versus Rent (linear –log)

For log-log

ln(Rent) = 0.85 + 0.89 * ln(Sqft). Estimated slope = 0.89 (P-value = 0.00); estimated intercept = 0.85 (P-value = 0.12). The slope is statistically significant: a 1% increase in square footage is associated with an increase of about 0.89% in monthly rent.

Table 11 Regression output (log-log model)

Figure 8 Scatter plot on Sqft versus Rent (log-log)
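The four functional forms differ only in which variables are log-transformed before the regression is run. A minimal sketch with hypothetical data (not the real file), reusing one small fitting function for all four models:

```python
import math

# Hypothetical (Sqft, Rent) pairs, NOT the real data.
sqft = [800, 1000, 1200, 1400, 1600]
rent = [850, 1050, 1300, 1500, 1800]

def fit(x, y):
    """Return (intercept, slope, r_squared) for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy ** 2 / (sxx * syy)

ln_sqft = [math.log(v) for v in sqft]
ln_rent = [math.log(v) for v in rent]

models = {
    "linear": fit(sqft, rent),           # Rent on Sqft
    "log-linear": fit(sqft, ln_rent),    # ln(Rent) on Sqft
    "linear-log": fit(ln_sqft, rent),    # Rent on ln(Sqft)
    "log-log": fit(ln_sqft, ln_rent),    # ln(Rent) on ln(Sqft)
}

for name, (intercept, slope, r2) in models.items():
    print(f"{name}: intercept={intercept:.4f} slope={slope:.4f} R2={r2:.4f}")
```

R-squared values are directly comparable only between models that share the same dependent variable, so the linear and linear-log models (both explaining Rent) can be compared with each other, as can the log-linear and log-log models (both explaining ln(Rent)).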

  • Based on your scatterplot, which of the functions best fits the data? Based on the R-squared, which of the functions best fits the data? Based on the interpretation of the coefficients, which model is the most understandable one?

Answer:

Based on the scatter plots, the linear function fits the data best. Based on R-squared, the log-linear model fits the data best. Based on the interpretation of the coefficients, the log-log model is the most understandable, because its slope can be read directly as the percentage change in rent for a 1% change in apartment size.