Factors Affecting Tourist Attraction to Certain Zones: 1205742

Introduction
Tourism plays a major role in human socio-economic activity. Apart from generating revenue, it creates employment and helps people learn about other people's ways of life. Studying tourism is therefore important for the betterment of society.
Statistics on the number of tourists attracted
Multiple regression model
The dependent variable, the number of tourist trips attracted to each zone, was chosen to analyze how strongly different zones attract tourists. The independent variables were chosen to identify some of the reasons why tourists are attracted to some zones more than others. They include the availability of jobs, the number of households and the number of adults, among others.
The confidence level used in this study is 95%, which gives a significance level (alpha) of 0.05.
The prediction is that tourists are attracted to zones with plenty of jobs, a large number of working adults who can offer services, and low crime rates.

Independent quantitative variables
Number of trips generated: the number of trips that originate from a zone as people move out of it. It is useful to compare the people moving out with those moving in.
Number of households: the number of households can be used to estimate the population of a zone. This matters because the population can offer the services required by tourists.
Workers in the zone: the people who provide the different services in the zone.
Adults in the zone: this is important to our study because it allows us to estimate the labour force available if there is an influx of tourists.
Jobs in the zone: apart from visiting a place for pleasure, some tourists visit to work, so the availability of jobs can attract them.
Shopping: the range of goods available for shopping, recorded in the data table as Shopping m2. People who like shopping can be attracted by a wide variety of things to buy.
Crime rate: some tourists consider the crime rate of a place before they visit it.
Binary categorical variables
Nature of people generated: whether the people leaving a zone are going to be tourists elsewhere, or have been attracted elsewhere and are leaving.
Season: whether the season is winter (coded 0) or summer (coded 1)

Categorical variable with at least 3 categories
Colour: the majority group in the zone: Latino, Black, White or Other

Original data
Number of trips attracted Number of trips generated Number of households Workers in zone Adults in zone Jobs in zone Shopping m2 Crime Rate Nature of people generated Season Colour
1063 1144 946 946 1892 3032 125716 20 0 0 1
678 421 310 310 620 3807 67182 21 0 0 2
553 823 337 674 1348 3498 188545 15 0 0 2
1768 1216 904 904 1808 4703 133366 32 1 1 3
1045 1168 838 838 2514 5926 187937 23 0 0 1
1406 1104 538 1076 2152 5935 198630 14 1 1 3
1528 1521 690 1380 2070 4514 127228 17 1 0 2
744 1946 921 1842 3684 5995 86105 15 0 0 4
577 938 875 875 1750 3064 16370 23 1 1 3
718 1023 805 805 1610 3558 179430 39 0 0 2
1084 268 230 230 460 4851 142533 38 0 0 1
892 565 474 474 1422 3142 191692 12 1 1 1
514 1066 710 710 1420 3017 48221 32 1 1 3
106 483 289 289 578 5228 39930 27 1 0 2
22 998 610 610 1220 3221 10322 26 0 0 4
449 257 157 157 471 3308 116098 12 0 1 4
233 800 565 1130 1695 3837 31795 18 1 0 1
35 1295 927 927 2781 5912 11855 21 0 0 2
1463 948 704 1408 2112 5175 95402 22 1 0 1
685 865 695 695 2085 4739 89118 33 1 1 3

Multicollinearity
Multicollinearity exists where two or more independent variables are highly correlated with each other. In our data above, the number of workers and the number of jobs in the zone are correlated, because people who are working are necessarily engaged in jobs. We therefore remove the variable “Workers in zone”, as it may interfere with our analysis.
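
One way to check which independent variables move together is to compute the Pearson correlation between them. The sketch below is only an illustration: it uses four columns copied from the original data table, and the helper name pearson is our own, not part of the Excel workflow used in this study.

```python
from math import sqrt

# Columns copied from the original data table (20 zones).
households = [946, 310, 337, 904, 838, 538, 690, 921, 875, 805,
              230, 474, 710, 289, 610, 157, 565, 927, 704, 695]
workers = [946, 310, 674, 904, 838, 1076, 1380, 1842, 875, 805,
           230, 474, 710, 289, 610, 157, 1130, 927, 1408, 695]
adults = [1892, 620, 1348, 1808, 2514, 2152, 2070, 3684, 1750, 1610,
          460, 1422, 1420, 578, 1220, 471, 1695, 2781, 2112, 2085]
jobs = [3032, 3807, 3498, 4703, 5926, 5935, 4514, 5995, 3064, 3558,
        4851, 3142, 3017, 5228, 3221, 3308, 3837, 5912, 5175, 4739]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Correlation of "Workers in zone" with each of the other three columns.
others = {
    "Number of households": households,
    "Adults in zone": adults,
    "Jobs in zone": jobs,
}
for name, column in others.items():
    print(f"Workers in zone vs {name}: r = {pearson(workers, column):.3f}")
```

A correlation close to 1 or -1 between two independent variables suggests that one of them carries little extra information and is a candidate for removal.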

Once the multicollinearity is removed, we remain with the following data
Number of trips attracted Number of trips generated Number of households Adults in zone Jobs in zone Shopping m2 Crime Rate Nature of people generated Season Colour
1063 1144 946 1892 3032 125716 20 0 0 1
678 421 310 620 3807 67182 21 0 0 2
553 823 337 1348 3498 188545 15 0 0 2
1768 1216 904 1808 4703 133366 32 1 1 3
1045 1168 838 2514 5926 187937 23 0 0 1
1406 1104 538 2152 5935 198630 14 1 1 3
1528 1521 690 2070 4514 127228 17 1 0 2
744 1946 921 3684 5995 86105 15 0 0 4
577 938 875 1750 3064 16370 23 1 1 3
718 1023 805 1610 3558 179430 39 0 0 2
1084 268 230 460 4851 142533 38 0 0 1
892 565 474 1422 3142 191692 12 1 1 1
514 1066 710 1420 3017 48221 32 1 1 3
106 483 289 578 5228 39930 27 1 0 2
22 998 610 1220 3221 10322 26 0 0 4
449 257 157 471 3308 116098 12 0 1 4
233 800 565 1695 3837 31795 18 1 0 1
35 1295 927 2781 5912 11855 21 0 0 2
1463 948 704 2112 5175 95402 22 1 0 1
685 865 695 2085 4739 89118 33 1 1 3

Outliers
An outlier is a data point that differs significantly from the rest. In the table above, two values of the number of trips attracted differ significantly from the rest: 22 and 35. The following table has these two observations removed, leaving 18 observations.

Number of trips attracted Number of trips generated Number of households Adults in zone Jobs in zone Shopping m2 Crime Rate Nature of people generated Season Colour
1063 1144 946 1892 3032 125716 20 0 0 1
678 421 310 620 3807 67182 21 0 0 2
553 823 337 1348 3498 188545 15 0 0 2
1768 1216 904 1808 4703 133366 32 1 1 3
1045 1168 838 2514 5926 187937 23 0 0 1
1406 1104 538 2152 5935 198630 14 1 1 3
1528 1521 690 2070 4514 127228 17 1 0 2
744 1946 921 3684 5995 86105 15 0 0 4
577 938 875 1750 3064 16370 23 1 1 3
718 1023 805 1610 3558 179430 39 0 0 2
1084 268 230 460 4851 142533 38 0 0 1
892 565 474 1422 3142 191692 12 1 1 1
514 1066 710 1420 3017 48221 32 1 1 3
106 483 289 578 5228 39930 27 1 0 2
449 257 157 471 3308 116098 12 0 1 4
233 800 565 1695 3837 31795 18 1 0 1
1463 948 704 2112 5175 95402 22 1 0 1
685 865 695 2085 4739 89118 33 1 1 3
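
The filtering step that produces the table above can be sketched as follows. This is a minimal illustration that simply drops the two values named in the text. It does not apply any formal outlier test, and the variable names are our own.

```python
# Number of trips attracted, copied from the table before outlier removal.
trips_attracted = [1063, 678, 553, 1768, 1045, 1406, 1528, 744, 577, 718,
                   1084, 892, 514, 106, 22, 449, 233, 35, 1463, 685]

# The two values identified in the text as outliers.
identified_outliers = {22, 35}

# Keep every observation that is not one of the identified outliers.
kept = [value for value in trips_attracted if value not in identified_outliers]

print(len(trips_attracted), "observations before,", len(kept), "after")
```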

Excel data regression output
R Square 0.641413
Adjusted R Square 0.238002
Standard Error 400.6401
Observations 18

ANOVA
  df SS MS F Significance F
Regression 9 2296898 255210.9 1.589975 0.262415
Residual 8 1284100 160512.5
Total 17 3580998      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept -530.156 728.9124 -0.72733 0.487763 -2211.03 1150.719 -2211.03 1150.719
X Variable 1 0.959304 1.017722 0.942599 0.043476 -1.38757 3.306174 -1.38757 3.306174
X Variable 2 1.079052 1.337963 0.806488 0.043274 -2.0063 4.1644 -2.0063 4.1644
X Variable 3 -0.69156 0.432391 -1.59938 0.148403 -1.68865 0.305537 -1.68865 0.305537
X Variable 4 0.264314 0.153816 1.718373 0.024051 -0.09039 0.619015 -0.09039 0.619015
X Variable 5 0.002896 0.003058 0.946962 0.37138 -0.00416 0.009948 -0.00416 0.009948
X Variable 6 -10.6024 16.75558 -0.63277 0.044545 -49.2408 28.03608 -49.2408 28.03608
X Variable 7 -18.7354 418.8831 -0.04473 0.965421 -984.682 947.2107 -984.682 947.2107
X Variable 8 407.7466 629.155 0.648086 0.035083 -1043.09 1858.581 -1043.09 1858.581
X Variable 9 -176.027 269.5488 -0.65304 0.532044 -797.607 445.5539 -797.607 445.5539
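
The summary figures in the output above are linked to the ANOVA rows by standard formulas. As a hedged sketch, the snippet below recomputes R Square, Adjusted R Square, F and the Standard Error from the sums of squares reported in the table. The variable names are our own, and small differences can arise because the sums of squares are rounded in the output.

```python
from math import sqrt

# Figures copied from the ANOVA table above.
ss_regression = 2296898
ss_residual = 1284100
ss_total = 3580998
n_observations = 18
n_predictors = 9

df_residual = n_observations - n_predictors - 1  # 8, as in the table

# R Square: share of the total sum of squares explained by the regression.
r_square = ss_regression / ss_total

# Adjusted R Square: penalises R Square for the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (n_observations - 1) / df_residual

# F statistic: ratio of the regression and residual mean squares.
ms_regression = ss_regression / n_predictors
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

# Standard Error of the regression: square root of the residual mean square.
standard_error = sqrt(ms_residual)

print(f"R Square          {r_square:.6f}")
print(f"Adjusted R Square {adjusted_r_square:.6f}")
print(f"F                 {f_statistic:.6f}")
print(f"Standard Error    {standard_error:.4f}")
```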

Significant variables
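In the output above, X Variable 1 to X Variable 9 correspond to the independent variables in the column order of the data table. Here we can see that
Number of trips generated, Number of households, Jobs in the zone, Crime rate and Season have p-values less than 0.05, which was our level of significance, hence they are statistically significant.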
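
The selection rule can be written as a short filter. The sketch below assumes that X Variable 1 to X Variable 9 follow the column order of the data table (the Excel output does not label them), and it uses the P-values exactly as reported above.

```python
ALPHA = 0.05  # significance level for a 95% confidence level

# P-values copied from the regression output; the names are assumed to
# follow the column order of the data table (X Variable 1 to 9).
p_values = {
    "Number of trips generated": 0.043476,
    "Number of households": 0.043274,
    "Adults in zone": 0.148403,
    "Jobs in zone": 0.024051,
    "Shopping m2": 0.37138,
    "Crime Rate": 0.044545,
    "Nature of people generated": 0.965421,
    "Season": 0.035083,
    "Colour": 0.532044,
}

# Keep only the variables whose p-value is below the significance level.
significant = [name for name, p in p_values.items() if p < ALPHA]

for name in significant:
    print(f"{name}: p = {p_values[name]}")
```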
The table below shows the data with the significant variables only
Number of trips attracted Number of trips generated Number of households Jobs in zone Crime Rate Season
1063 1144 946 3032 20 0
678 421 310 3807 21 0
553 823 337 3498 15 0
1768 1216 904 4703 32 1
1045 1168 838 5926 23 0
1406 1104 538 5935 14 1
1528 1521 690 4514 17 0
744 1946 921 5995 15 0
577 938 875 3064 23 1
718 1023 805 3558 39 0
1084 268 230 4851 38 0
892 565 474 3142 12 1
514 1066 710 3017 32 1
106 483 289 5228 27 0
449 257 157 3308 12 1
233 800 565 3837 18 0
1463 948 704 5175 22 0
685 865 695 4739 33 1

Excel regression output
SUMMARY OUTPUT

Regression Statistics
Multiple R 0.519108
R Square 0.69473
Adjusted R Square 0.67491
Standard Error 466.9061
Observations 18

ANOVA
  df SS MS F Significance F
Regression 5 964982.9 192996.6 0.8853 0.010134
Residual 12 2616016 218001.3
Total 17 3580998      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept -192.193 612.465 -0.3138 0.759059 -1526.64 1142.254 -1526.64 1142.254
X Variable 1 0.031349 0.601882 0.052085 0.059318 -1.28004 1.342737 -1.28004 1.342737
X Variable 2 0.564186 0.948173 0.595024 0.00288 -1.5017 2.630076 -1.5017 2.630076
X Variable 3 0.152291 0.126457 1.204296 0.025169 -0.12323 0.427817 -0.12323 0.427817
X Variable 4 -1.0636 15.38948 -0.06911 0.046039 -34.5944 32.46721 -34.5944 32.46721
X Variable 5 129.9874 238.2694 0.545548 0.595372 -389.157 649.1318 -389.157 649.1318

Interpretation of the result
The adjusted R square from the output is 0.67, meaning that about 67% of the variation in the number of trips attracted to a zone is explained by the model. Since the overall p-value of the model (Significance F = 0.010134) is also less than 0.05, we can say that the model is statistically significant.

We can also see that the coefficient of the crime rate is negative. This means that each one-unit increase in the crime rate is associated with a decrease of about 1.0636 in the number of trips attracted to a zone, holding the other variables constant. The rest of the coefficients are positive, meaning that the number of trips attracted increases as those variables increase.

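The model
y = -192.193 + 0.031349x1 + 0.564186x2 + 0.152291x3 - 1.0636x4 + 129.9874x5
Where;
y is the predicted number of trips attracted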
x1 is number of trips generated
x2 is number of households
x3 is jobs in zone
x4 is crime rate
x5 is season (0 for winter, 1 for summer)
For example, suppose that in a particular period the number of trips generated in a zone was 1123, the number of households was 986, the number of jobs in the zone was 820, the crime rate was 12 and the season was winter, denoted by 0. We then have;
y = -192.193 + 0.031349(1123) + 0.564186(986) + 0.152291(820) - 1.0636(12) + 129.9874(0) = 511.41
Therefore, the total number of tourists attracted to that zone will be about 511.
In summer, which is denoted by 1, the number of tourists attracted to that zone in that period increases by about 130, so the total becomes about 641 tourists.
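
The worked example can be reproduced with a few lines of code. This is a minimal sketch that plugs the coefficients from the last Excel output into the model. The function name predict_trips and its parameter names are our own.

```python
# Coefficients copied from the last Excel regression output.
INTERCEPT = -192.193
COEFFICIENTS = {
    "trips_generated": 0.031349,  # x1
    "households": 0.564186,       # x2
    "jobs": 0.152291,             # x3
    "crime_rate": -1.0636,        # x4
    "season": 129.9874,           # x5 (0 = winter, 1 = summer)
}


def predict_trips(trips_generated, households, jobs, crime_rate, season):
    """Predicted number of trips attracted to a zone."""
    return (INTERCEPT
            + COEFFICIENTS["trips_generated"] * trips_generated
            + COEFFICIENTS["households"] * households
            + COEFFICIENTS["jobs"] * jobs
            + COEFFICIENTS["crime_rate"] * crime_rate
            + COEFFICIENTS["season"] * season)


# Same zone in winter (season = 0) and in summer (season = 1).
winter = predict_trips(1123, 986, 820, 12, 0)
summer = predict_trips(1123, 986, 820, 12, 1)

print(f"Winter: {winter:.2f}")
print(f"Summer: {summer:.2f}")
```

Rounding the two results gives the 511 and 641 tourists quoted in the example, and the difference between them equals the season coefficient.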

Conclusion
Based on the last Excel output, I can say that the model is good. It explains 67% of the variation in the number of tourist trips attracted to a zone. Overall, the model is statistically significant because its overall p-value of 0.01 is less than 0.05, the significance level.
The model obtained is strong enough to make future predictions. Apart from its ability to explain changes in the number of tourists attracted to a zone, three of its five independent variables (number of households, jobs in the zone and crime rate) have p-values below 0.05, which makes it stronger.
The model can be used to make business decisions: given factors within a zone such as the crime rate or the season, one can estimate the number of tourists likely to be attracted to that zone in a particular period, which helps with prior business planning.
Some of the variables were not statistically significant because they have a minimal effect on the number of tourists visiting a zone. For example, the number of adults in the zone is not directly linked to the number of tourists visiting it.
Outliers can give a wrong impression of the central tendency of a data set. Since the outliers were eliminated in the course of the work, they have very little effect on the final model.
The topic of the number of tourists attracted to a region is important academically because it helps learners better understand what they have learned in class. Carrying out research and analyzing data gives the learner real-world experience.
Statistics is a very important discipline in our day-to-day lives. The ability to apply statistics to identify, among many factors, the significant ones influencing some movement is vital, as it makes work and learning easier.
What I have learned is applicable to my career: in future I will be a businessman, and in business the decisions made are very important. These decisions affect all aspects of a business, so when statistics is applied in the decision-making process, one can make bold decisions based on scientific evidence.
