{"id":37910,"date":"2025-05-15T15:09:51","date_gmt":"2025-05-15T05:09:51","guid":{"rendered":"https:\/\/myassignmenthelp.info\/assignments\/?p=37910"},"modified":"2025-05-15T15:09:52","modified_gmt":"2025-05-15T05:09:52","slug":"clusters-and-distributions-2375409","status":"publish","type":"post","link":"https:\/\/myassignmenthelp.info\/assignments\/clusters-and-distributions-2375409\/","title":{"rendered":"Clusters and Distributions-2375409"},"content":{"rendered":"<p><div class=\"ppw-restricted-content\"><\/p>\n\n\n\n<p>This assignment builds on our previous work through Milligan, Chapter 9, and takes a deeper dive into two key data visualization techniques: clustering and distribution analysis. <em>Clustering<\/em>&nbsp;can be a key technique for visualizing the relationships between two or more variables, moving from single dimensions into multidimensional analysis. <em>Distribution analysis<\/em>&nbsp;can highlight central tendencies in a dataset (such as mean or median) and allow visualization to show outliers clearly.<\/p>\n\n\n\n<p>In this assignment, you will apply clustering and distribution analysis to analyze the relationships between several factors in a healthcare dataset by following these steps:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go through this document and use Tableau to answer all the questions listed below. Where applicable, paste screenshots into the template below.<\/li>\n\n\n\n<li>When you are ready, complete the <strong>online quiz,<\/strong>\u00a0which verifies your homework. Use the answers you found in this document to answer the questions.<\/li>\n\n\n\n<li>When you have completed the online quiz, submit the Word document.<\/li>\n\n\n\n<li>Remember, you can always ask your instructor for help if needed.<\/li>\n\n\n\n<li>If you need to adjust the size of your visualizations to match the options in the questions, use the \u201cFormat\u201d-> \u201cCell size\u201d options. 
For example, \u201cCtrl+Shift+B\u201d on a Windows computer will make the visualization bigger, and \u201cCtrl+Up\u201d will make it taller.<\/li>\n<\/ul>\n\n\n\n<p>Attachments:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Diabetes2.csv (dataset downloaded from Kaggle):<\/li>\n\n\n\n<li><a href=\"https:\/\/www.kaggle.com\/datasets\/uciml\/pima-indians-diabetes-database\"><u>Pima Indians Diabetes Database<\/u><\/a><\/li>\n<\/ul>\n\n\n\n<p>For this assignment, follow these steps:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Download the Pima Indians Diabetes dataset<\/li>\n\n\n\n<li>Perform exploratory data analysis (EDA) of the data in Tableau<\/li>\n\n\n\n<li>Perform clustering analysis of the data in Tableau<\/li>\n\n\n\n<li>Perform distribution analysis of the data in Tableau<\/li>\n<\/ol>\n\n\n\n<p><strong>Download the diabetes dataset<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click to access the <a href=\"https:\/\/www.kaggle.com\/datasets\/uciml\/pima-indians-diabetes-database\"><u>Pima Indians Diabetes Database<\/u><\/a>. You may need to sign in or register if you don\u2019t already have an account on Kaggle. 
Click the \u201cDownload\u201d button on the linked page above to download the dataset.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"314\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image.png\" alt=\"\" class=\"wp-image-37911\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-300x151.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>You can scroll down on this web page to learn more about this dataset, including its original publication date and what the fields mean.<\/li>\n\n\n\n<li>When you download the data, it will save it as a .zip file, likely named something like &#8220;archive.zip\u201d (a standard download protocol for Kaggle).<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"239\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-1.png\" alt=\"\" class=\"wp-image-37912\" style=\"width:735px;height:auto\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-1.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-1-300x115.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: standard download<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Double-click on the \u201carchive\u201d file. 
It should open to a screen like the one below (this is from a Windows machine; if you are on a Mac, it may vary slightly). You will see that there is one file there (named diabetes.csv, highlighted with a green arrow below). If you click on the \u201cExtract all\u201d option, it will extract all the files (in this case, just the one diabetes.csv file). Save the file in a location of your choice, where it will be easy to find in order to connect with Tableau.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-2.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"248\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-2.png\" alt=\"\" class=\"wp-image-37913\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-2.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-2-300x119.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Open the dataset on your computer. It\u2019s a .csv file, so it should open in Microsoft Excel, and it should look like the table below. 
Verify that you have these exact numbers showing up in your downloaded file:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-3.png\"><img loading=\"lazy\" decoding=\"async\" width=\"609\" height=\"242\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-3.png\" alt=\"\" class=\"wp-image-37914\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-3.png 609w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-3-300x119.png 300w\" sizes=\"auto, (max-width: 609px) 100vw, 609px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Perform Exploratory Data Analysis (EDA) with Tableau<\/strong><strong><\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Open the dataset in Tableau. Since this is a .csv file, and not an Excel file, you need to connect to a \u201cmore\u201d kind of data (see image below):<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-4.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"348\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-4.png\" alt=\"\" class=\"wp-image-37915\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-4.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-4-300x167.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-5.png\"><img loading=\"lazy\" decoding=\"async\" width=\"601\" height=\"346\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-5.png\" alt=\"\" class=\"wp-image-37916\" 
srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-5.png 601w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-5-300x173.png 300w\" sizes=\"auto, (max-width: 601px) 100vw, 601px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Data file<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>When you open the file in Tableau, you should see something like this:<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-6.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"286\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-6.png\" alt=\"\" class=\"wp-image-37917\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-6.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-6-300x138.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Tableau file<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Before we do any multidimensional clustering, it\u2019s always a good idea to get an overview of each of the data fields:<ol><li>There are 768 data rows here, each corresponding to a different member of the Pima Indian Tribe. The table contains each member\u2019s number of pregnancies, blood glucose, blood pressure, and other health measurements. All the way to the right, the table also contains an outcome variable:<ol><li>0 if the tribe member was not diagnosed with diabetes<\/li><\/ol><ol><li>1 if the tribe member was diagnosed with diabetes<\/li><\/ol><\/li><\/ol>\n<ol class=\"wp-block-list\">\n<li>For example, let\u2019s look at the Outcome for the first member. This person has had six pregnancies with a glucose reading of 148. If we scroll all the way to the right, we see the age listed as 50 and the Outcome listed as 1. 
This means this tribe member was diagnosed with diabetes.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-7.png\"><img loading=\"lazy\" decoding=\"async\" width=\"560\" height=\"367\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-7.png\" alt=\"\" class=\"wp-image-37918\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-7.png 560w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-7-300x197.png 300w\" sizes=\"auto, (max-width: 560px) 100vw, 560px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 1: Understanding the Data<\/strong><\/p>\n\n\n\n<p>In the diabetes dataset, Row 8 contains a tribe member who reported ten pregnancies. Which other data fields correspond with this person?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Glucose of 115, BMI 35.3, Outcome: No Diabetes<\/li>\n\n\n\n<li>Glucose of 115, BMI 35.3, Outcome: Diabetes<\/li>\n\n\n\n<li>Glucose of 168, BMI 38.0, Outcome: Diabetes<\/li>\n\n\n\n<li>Glucose of 168, BMI 38.0, Outcome: No Diabetes<\/li>\n\n\n\n<li>Glucose of 139, BMI 27.1, Outcome: No Diabetes<\/li>\n\n\n\n<li>None of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 1 Answer:<\/strong>&nbsp;<\/td><td><strong>&nbsp;A. 
Glucose of 115, BMI 35.3, Outcome: No Diabetes<\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-8.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"198\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-8.png\" alt=\"\" class=\"wp-image-37919\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-8.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-8-300x95.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Let\u2019s explore the Pima Indians Diabetes Database a little bit more. Anytime you get a new dataset, it\u2019s a good idea to run some general descriptives to see what the data look like. One of the major tools we use is a <em>histogram<\/em>.<\/li>\n<\/ol>\n\n\n\n<p>Make a histogram of the age. 
On a new worksheet (below, Sheet 1), we first drag Age to Rows, and then, in the \u201cShow Me\u201d tab, we select the histogram.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-9.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"342\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-9.png\" alt=\"\" class=\"wp-image-37920\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-9.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-9-300x164.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-10.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"313\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-10.png\" alt=\"\" class=\"wp-image-37921\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-10.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-10-300x150.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-11.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"420\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-11.png\" alt=\"\" class=\"wp-image-37922\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-11.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-11-300x202.png 300w\" sizes=\"auto, (max-width: 624px) 
100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: histogram<\/p>\n\n\n\n<p>Note that when it makes the histogram, under the Data tab, you can see a new variable under Tables called \u201cAge (bin).\u201d This is the histogram bin for your Age variable.<\/p>\n\n\n\n<p>If we look at Sheet 1, we can see that most of our data points report an Age of between 20 and 30. We also have a few people in their 40s and 50s, and many fewer older people.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Tableau does the best it can to guess at \u201cgood\u201d bin sizes for your dataset, but it doesn\u2019t always guess exactly the way a human would like. Go to the right of the green oval under the Data -> Tables -> Age (bin), pull that menu down, and select Edit to see the exact bin sizes Tableau is suggesting here.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-12.png\"><img loading=\"lazy\" decoding=\"async\" width=\"473\" height=\"532\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-12.png\" alt=\"\" class=\"wp-image-37923\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-12.png 473w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-12-267x300.png 267w\" sizes=\"auto, (max-width: 473px) 100vw, 473px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Data tables<\/p>\n\n\n\n<p>We can see here that Tableau is suggesting we group people in age groups of 4.69 years, and that our first bin is suggested to start at age 18.76. So the first bin will contain ages 21, 22, and 23. The second bin, due to rounding issues, contains ages 24, 25, 26, 27, and 28! 
Let\u2019s set the bin size to 5 so that the ages fall into round five-year buckets.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-13.png\"><img loading=\"lazy\" decoding=\"async\" width=\"510\" height=\"247\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-13.png\" alt=\"\" class=\"wp-image-37924\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-13.png 510w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-13-300x145.png 300w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Edit bins<\/p>\n\n\n\n<p>With the bin size set to 5, the histogram adjusts slightly. The first bin contains ages 20, 21, 22, 23, and 24. The second bin contains ages 25, 26, 27, 28, and 29. The histogram is also \u201csmoother\u201d in the jump between the 35\u201340 and 40\u201345 age bins.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-14.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"302\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-14.png\" alt=\"\" class=\"wp-image-37925\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-14.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-14-300x145.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Sometimes, the axis labels aren\u2019t obvious. 
To be sure of which bin you are viewing, you can click on a bin, and it will display the information.<\/p>\n\n\n\n<p>Below, we have clicked on one of the bars and learned it is the bin for ages 35 (and contains ages 35, 36, 37, 38, and 39 but not age 40). If you are curious, you can right-click on the bin and choose \u201cView Data\u201d and then \u201cFull Data\u201d to see the exact data points which make up that bar.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-15.png\"><img loading=\"lazy\" decoding=\"async\" width=\"657\" height=\"337\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-15.png\" alt=\"\" class=\"wp-image-37926\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-15.png 657w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-15-300x154.png 300w\" sizes=\"auto, (max-width: 657px) 100vw, 657px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-16.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"384\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-16.png\" alt=\"\" class=\"wp-image-37927\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-16.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-16-300x185.png 300w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-16-348x215.png 348w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: histogram<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>On a new sheet in Tableau, make a histogram of the BMI variable. 
Adjust the bin sizes so that they are size 5.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-17.png\"><img loading=\"lazy\" decoding=\"async\" width=\"531\" height=\"242\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-17.png\" alt=\"\" class=\"wp-image-37928\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-17.png 531w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-17-300x137.png 300w\" sizes=\"auto, (max-width: 531px) 100vw, 531px\" \/><\/a><\/figure>\n\n\n\n<p>In your BMI histogram, what values are in the most frequent bin?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>30\u201334.9<\/li>\n\n\n\n<li>30\u201335<\/li>\n\n\n\n<li>30.40, 135<\/li>\n\n\n\n<li>30.40\u201333.30<\/li>\n\n\n\n<li>none of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 2 Answer:<\/strong>&nbsp;<\/td><td>&nbsp;<strong>B. 
30\u201335<\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-18.png\"><img loading=\"lazy\" decoding=\"async\" width=\"635\" height=\"707\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-18.png\" alt=\"\" class=\"wp-image-37929\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-18.png 635w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-18-269x300.png 269w\" sizes=\"auto, (max-width: 635px) 100vw, 635px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 3: Understanding Questionable Values<\/strong><\/p>\n\n\n\n<p>In your BMI histogram, are there any values that make you question the data and wonder whether they should be filtered out?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Yes; the data looks normally distributed, like a bell curve, and this is not expected for these sorts of measurements<\/li>\n\n\n\n<li>Yes; there are too many counts of a BMI of 40 and higher, which is much larger than expected<\/li>\n\n\n\n<li>Yes; there are 11 counts of a BMI of 0, which is probably not an accurate measure<\/li>\n\n\n\n<li>No; the data looks normally distributed, like a bell curve, and this is expected for these sorts of measurements<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 3 Answer:<\/strong>&nbsp;<\/td><td><strong>&nbsp;C. Yes; there are 11 counts of a BMI of 0, which is probably not an accurate measure<\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-19.png\"><img loading=\"lazy\" decoding=\"async\" width=\"629\" 
height=\"500\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-19.png\" alt=\"\" class=\"wp-image-37930\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-19.png 629w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-19-300x238.png 300w\" sizes=\"auto, (max-width: 629px) 100vw, 629px\" \/><\/a><\/figure>\n\n\n\n<p>On a new sheet in Tableau, make a histogram of the Glucose variable. Adjust the bin sizes so they are size 10. (Don\u2019t filter out any Glucose measurements at this point.)<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-20.png\"><img loading=\"lazy\" decoding=\"async\" width=\"500\" height=\"222\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-20.png\" alt=\"\" class=\"wp-image-37931\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-20.png 500w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-20-300x133.png 300w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 4: Understanding the most frequent bins<\/strong><\/p>\n\n\n\n<p>In your Glucose histogram, what values are in the most frequent bin?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>100 through 107, including 100 and 107<\/li>\n\n\n\n<li>100 through 109, including 100 and 109<\/li>\n\n\n\n<li>100 through 110, including 100 and 110<\/li>\n\n\n\n<li>110 through 120, including 110 and 120<\/li>\n\n\n\n<li>120 through 129, including 120 and 129<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 4 
Answer:<\/strong>&nbsp;<\/td><td><strong>&nbsp;<\/strong><\/td><td><strong>&nbsp;<\/strong><\/td><td><strong>B.<\/strong><strong><\/strong><\/td><td><strong>100 through 109, including 100 and 109<\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-21.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"615\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-21.png\" alt=\"\" class=\"wp-image-37932\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-21.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-21-300x296.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 5 \u2013 Understanding the bin values and counts<\/strong><\/p>\n\n\n\n<p>The most frequent Glucose measurement is about 100. What is the <strong>second<\/strong>&nbsp;most frequent Glucose measurement?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>bin 100, count 117<\/li>\n\n\n\n<li>bin 110, count 94<\/li>\n\n\n\n<li>bin 110, count 105<\/li>\n\n\n\n<li>bin 120, count 102<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 5 Answer:<\/strong>&nbsp;<\/td><td><strong>&nbsp;D.<\/strong><strong> <\/strong><strong>bin 120, count 102<\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>Question 6: Understanding the shape of the data<\/strong><\/p>\n\n\n\n<p>In your Glucose histogram, how would you describe the overall shape of this data?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>The average\/median is about 100, and symmetric. 
It\u2019s not skewed at all.<\/li>\n\n\n\n<li>The average\/median is about 100, and it\u2019s skewed to the right.<\/li>\n\n\n\n<li>The average\/median is about 100, and it\u2019s skewed to the left.<\/li>\n\n\n\n<li>It is bimodal: there are two distinct centers of glucose measurements, probably one for diabetes diagnoses and one for non-diabetes diagnoses.<\/li>\n\n\n\n<li>None of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 6 Answer:<\/strong>&nbsp;<\/td><td>&nbsp;B. The average\/median is about 100, and it\u2019s skewed to the right.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-22.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"396\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-22.png\" alt=\"\" class=\"wp-image-37933\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-22.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-22-300x190.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-23.png\"><img loading=\"lazy\" decoding=\"async\" width=\"488\" height=\"224\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-23.png\" alt=\"\" class=\"wp-image-37934\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-23.png 488w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-23-300x138.png 300w\" sizes=\"auto, (max-width: 488px) 100vw, 488px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 7: Understanding the Descriptives of the 
Data<\/strong><\/p>\n\n\n\n<p>In your Insulin histogram, how would you describe the overall descriptives of this data?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Zero insulin is meaningless, so there must be some mistakes.<\/li>\n\n\n\n<li>This data is normally distributed and follows a symmetric bell-curve shape.<\/li>\n\n\n\n<li>This data is bimodal. We can see two distinct populations: those with diabetes and those without.<\/li>\n\n\n\n<li>The most frequent amount of insulin is between 0 and 49, but a few people report large levels of insulin, at 400 units or above.<\/li>\n\n\n\n<li>None of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 7 Answer:<\/strong>&nbsp;<\/td><td>&nbsp;<strong>D. The most frequent amount of insulin is between 0 and 49, but a few people report large levels of insulin, at 400 units or above.<\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-24.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"399\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-24.png\" alt=\"\" class=\"wp-image-37935\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-24.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-24-300x192.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>In your Insulin histogram, drag Outcome to Color under Marks. Remember that Outcome=0 means No Diabetes, and Outcome=1 means Diabetes. Now go to the leftmost bin, right-click on it to View Data, and then look at the Full Data. How would you describe the data points in this bin? 
Check all that apply.<br>\u00a0<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-25.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"512\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-25.png\" alt=\"\" class=\"wp-image-37936\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-25.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-25-300x246.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 8: Two-Dimensional Descriptives of the Data<\/strong><\/p>\n\n\n\n<p>How would you describe the data points in the leftmost bin in terms of insulin and outcome?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>The most frequent amount of insulin is 0, but there are some values here between 0 and 49.<\/li>\n\n\n\n<li>All the Insulin values here are 0.<\/li>\n\n\n\n<li>All the Outcome values here are 0 (no Diabetes).<\/li>\n\n\n\n<li>The values here average to 25.<\/li>\n\n\n\n<li>This contains only values from people with No Diabetes as their Outcome. 
If you have diabetes, you need insulin.<\/li>\n\n\n\n<li>None of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 8 Answer:<\/strong>&nbsp;<\/td><td>The most frequent amount of insulin is 0, but there are some values here between 0 and 49.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-26.png\"><img loading=\"lazy\" decoding=\"async\" width=\"625\" height=\"406\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-26.png\" alt=\"\" class=\"wp-image-37937\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-26.png 625w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-26-300x195.png 300w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/figure>\n\n\n\n<p>Now, let\u2019s look at the Outcome variable. We could make a histogram of this binary variable, but that\u2019s not very satisfying. Let\u2019s recode it to be a text variable. 
Instead of having to remember \u201c0 means No Diabetes,\u201d wouldn\u2019t it be easier to just have the words \u201cNo Diabetes\u201d showing on the screen?<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-27.png\"><img loading=\"lazy\" decoding=\"async\" width=\"333\" height=\"420\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-27.png\" alt=\"\" class=\"wp-image-37938\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-27.png 333w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-27-238x300.png 238w\" sizes=\"auto, (max-width: 333px) 100vw, 333px\" \/><\/a><\/figure>\n\n\n\n<p>Fill it out as follows: Create a new field called \u201cOutcome_Text\u201d that is \u201cDiabetes\u201d if the Outcome variable is 1, \u201cNo Diabetes\u201d if it is 0, and \u201cNA\u201d if, for some weird reason, the Outcome variable is neither 1 nor 0.<\/p>\n\n\n\n<p>If you want to copy and paste the formula with all the glorious brackets and parentheses, here it is:<\/p>\n\n\n\n<p>IF ([Outcome]=1)<\/p>\n\n\n\n<p>THEN &quot;Diabetes&quot;<\/p>\n\n\n\n<p>ELSEIF ([Outcome]=0)<\/p>\n\n\n\n<p>THEN &quot;No Diabetes&quot;<\/p>\n\n\n\n<p>ELSE &quot;NA&quot;<\/p>\n\n\n\n<p>END<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-28.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"197\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-28.png\" alt=\"\" class=\"wp-image-37939\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-28.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-28-300x95.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 
624px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: copy and paste formula<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Let\u2019s check that it coded correctly:\n<ol class=\"wp-block-list\">\n<li>How many Outcome=0 do we have? Let\u2019s look at a histogram of the original Outcome variable. Looks like we have 500 in the Outcome=0 bin and 268 in the Outcome=1 bin. (Those of you who are following along at home on your calculators will notice this is about 65% No Diabetes, 35% Diabetes.)<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-29.png\"><img loading=\"lazy\" decoding=\"async\" width=\"339\" height=\"811\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-29.png\" alt=\"\" class=\"wp-image-37940\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-29.png 339w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-29-125x300.png 125w\" sizes=\"auto, (max-width: 339px) 100vw, 339px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: outcome<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>How does our Outcome_Text = \u201cNo Diabetes\u201d or \u201cDiabetes\u201d stack up against our original binary variable? 
Let\u2019s drag the Outcome_Text to the Color and also\u00a0to the Rows.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-30.png\"><img loading=\"lazy\" decoding=\"async\" width=\"473\" height=\"528\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-30.png\" alt=\"\" class=\"wp-image-37941\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-30.png 473w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-30-269x300.png 269w\" sizes=\"auto, (max-width: 473px) 100vw, 473px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: outcome<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>\u00a0We can see success! The orange bar is marked \u201cNo Diabetes\u201d in the legend, and it is showing on the left (per the binary Outcome variable = 0), while the blue bar is marked \u201cDiabetes\u201d in the legend and is showing on the right (per the binary Outcome variable = 1).<\/li>\n<\/ol>\n\n\n\n<p>We are studying this dataset to try to understand diabetes in the Pima Indian tribe. We have a dataset which contains about 35% diabetes diagnoses.<\/p>\n\n\n\n<p>Which research statement(s) does this data look like it might be able to answer? 
Check all that apply.<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Why do people with zero insulin recorded still have a diabetes diagnosis?<\/li>\n\n\n\n<li>What aspects of the modern diet cause diabetes?<\/li>\n\n\n\n<li>For the Pima Indian population listed in this dataset, are there any relationships between diabetes, age, and BMI which might be interesting?<\/li>\n\n\n\n<li>For the Pima Indian population listed in this dataset, are there any relationships between glucose, insulin, and diabetes diagnosis which might be interesting?<\/li>\n\n\n\n<li>For the Pima Indian population listed in this dataset, are there any relationships between household income, gender, and diabetes diagnosis which might be interesting?<\/li>\n\n\n\n<li>This is not enough data to ask anything; we really need to go to the CDC and get millions of rows of data.<\/li>\n\n\n\n<li>None of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 9 Answer:<\/strong>&nbsp;<\/td><td><strong>A. Why do people with zero insulin recorded still have a diabetes diagnosis?<\/strong><br><strong>C. For the Pima Indian population listed in this dataset, are there any relationships between diabetes, age, and BMI which might be interesting?<\/strong><br><strong>D. For the Pima Indian population listed in this dataset, are there any relationships between glucose, insulin, and diabetes diagnosis which might be interesting?<\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Let\u2019s go back to our Age histogram. Does diabetes diagnosis change with age? Pull the Outcome_Text (not the Outcome binary variable, but the Outcome_Text) as a color and also as an additional row variable. 
We will see something like this:<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-31.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"472\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-31.png\" alt=\"\" class=\"wp-image-37942\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-31.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-31-300x227.png 300w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-31-290x220.png 290w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>From this histogram, we can see that age starts off young on the left and goes to older on the right. The Diabetes population is on the top in blue, and the Non-Diabetes population is on the bottom in orange.<\/p>\n\n\n\n<p>We can see that while both groups cover most of the full age range, the Non-Diabetes population has a lot of young people in it while the Diabetes population has a larger percentage of its population in the older age brackets.<\/p>\n\n\n\n<p>This may give rise to a hypothesis: Does increasing age bring with it a likelihood of diabetes diagnosis among this population?<\/p>\n\n\n\n<p><strong>Question 10: Understanding Stacked Histograms<\/strong><\/p>\n\n\n\n<p>Go back to your BMI histogram. Be sure the bin sizes are still 5. (You can just ignore any BMI of 0; this is probably a data error.) &nbsp;Repeat steps similar to the Age histogram analysis we just did.&nbsp;Which statements do your stacked histograms support for the Pima Indian population from this dataset? 
Check all that apply.<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>The BMI for the No Diabetes group appears to be lower than the BMI for the Diabetes group if we go by the histogram midpoint.<\/li>\n\n\n\n<li>If you look at the BMI bin which contains BMI measures from 40.0 through 44.9, there are about the same number (within 5 people) in the Diabetes and No Diabetes categories.<\/li>\n\n\n\n<li>The BMI for the No Diabetes group is heavily skewed in favor of a BMI below 20.<\/li>\n\n\n\n<li>The BMI for the Diabetes group is heavily skewed in favor of a BMI of 45 or higher.<\/li>\n\n\n\n<li>If you look at the BMI bin which contains BMI measures from 20.0 through 24.9, there are about the same number (within 5 people) in the Diabetes and No Diabetes categories.<\/li>\n\n\n\n<li>None of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 10 Answer:<\/strong>&nbsp;<\/td><td><strong>A. 
The BMI for the No Diabetes group appears to be lower than the BMI for the Diabetes group if we go by the histogram midpoint.<\/strong><br><strong>B. If you look at the BMI bin which contains BMI measures from 40.0 through 44.9, there are about the same number (within 5 people) in the Diabetes and No Diabetes categories.<\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-32.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"516\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-32.png\" alt=\"\" class=\"wp-image-37943\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-32.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-32-300x248.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Perform Clustering of the Diabetes Data in Tableau<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>We have completed our EDA (exploratory data analysis) on the diabetes data. We are beginning to understand the shape of individual variables such as Age and BMI, and also their relationship to a diabetes diagnosis.<\/li>\n\n\n\n<li>Our next step is to run two-dimensional xy scatterplots and then cluster them to see if we can uncover additional relationships.<\/li>\n\n\n\n<li>Let\u2019s start investigating Age and BMI.\n<ol class=\"wp-block-list\">\n<li>Make a new sheet in Tableau. Drag Age (not the Age(bin) but the plain old Age from the Measures in the Data Values area) to the Columns, and drag BMI to the Rows. Tableau will give you SUM(Age) and SUM(BMI) and probably one single data point graphed. 
We circled in red for you below.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-33.png\"><img loading=\"lazy\" decoding=\"async\" width=\"441\" height=\"470\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-33.png\" alt=\"\" class=\"wp-image-37944\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-33.png 441w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-33-281x300.png 281w\" sizes=\"auto, (max-width: 441px) 100vw, 441px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: graph<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>We want to see all the data points, so let\u2019s Disaggregate Measures. You can do this under the Analysis Menu. (This is the fix anytime you expect lots of data points and you only have one.)<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-34.png\"><img loading=\"lazy\" decoding=\"async\" width=\"426\" height=\"378\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-34.png\" alt=\"\" class=\"wp-image-37945\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-34.png 426w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-34-300x266.png 300w\" sizes=\"auto, (max-width: 426px) 100vw, 426px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Analysis<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>We can now see an XY scatterplot of BMI vs. Age, with Age on the x-axis. 
Apply a filter for BMI so that BMI is allowed to be between 1 and the highest value (this removes BMI values of 0).<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-35.png\"><img loading=\"lazy\" decoding=\"async\" width=\"556\" height=\"542\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-35.png\" alt=\"\" class=\"wp-image-37946\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-35.png 556w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-35-300x292.png 300w\" sizes=\"auto, (max-width: 556px) 100vw, 556px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: scatterplot<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>We want to do two-dimensional clustering. Are there distinct groups, such as<ol><li>Younger people with lower BMI?<\/li><\/ol><ol><li>Younger people with higher BMI?<\/li><\/ol>\n<ol class=\"wp-block-list\">\n<li>To do this in Tableau, we need to switch from the Data tab on the left to the Analytics tab. 
From there, under the Model options, we want \u201cCluster.\u201d<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-36.png\"><img loading=\"lazy\" decoding=\"async\" width=\"590\" height=\"395\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-36.png\" alt=\"\" class=\"wp-image-37947\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-36.png 590w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-36-300x201.png 300w\" sizes=\"auto, (max-width: 590px) 100vw, 590px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: cluster<\/p>\n\n\n\n<p>Drag the \u201cCluster\u201d tool into the middle of your XY scatterplot.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-37.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"459\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-37.png\" alt=\"\" class=\"wp-image-37948\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-37.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-37-300x221.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: cluster<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>Tableau will automatically create some clusters for you. 
It groups similar data points together so that people of a similar age and BMI will be in the same cluster while people with different ages and different BMI will be in different clusters.<\/li>\n<\/ol>\n\n\n\n<p>Just looking visually at the four clusters Tableau automatically made, we can see the yellow one on the bottom left is younger people with lower BMI while the aqua one on the right-hand side is older people with lower-to-medium BMI. The red cluster is younger people with higher BMI.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-38.png\"><img loading=\"lazy\" decoding=\"async\" width=\"508\" height=\"342\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-38.png\" alt=\"\" class=\"wp-image-37949\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-38.png 508w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-38-300x202.png 300w\" sizes=\"auto, (max-width: 508px) 100vw, 508px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: cluster<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>If you like, you can experiment and drag the Clusters Mark onto the Shape Mark, and then the clusters will be distinguished by multiple X, O, and + icons and others which do not require color to differentiate them.\n<ol class=\"wp-block-list\">\n<li>We can get numeric descriptives on our clusters to help us understand them. Go to Clusters -> Describe Clusters. Here, we see that there are four clusters. 
Cluster 1 has 197 people in it, the median age is about 41 years old, and the BMI in this cluster is centered around about 34.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-39.png\"><img loading=\"lazy\" decoding=\"async\" width=\"410\" height=\"331\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-39.png\" alt=\"\" class=\"wp-image-37950\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-39.png 410w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-39-300x242.png 300w\" sizes=\"auto, (max-width: 410px) 100vw, 410px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-40.png\"><img loading=\"lazy\" decoding=\"async\" width=\"554\" height=\"402\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-40.png\" alt=\"\" class=\"wp-image-37951\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-40.png 554w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-40-300x218.png 300w\" sizes=\"auto, (max-width: 554px) 100vw, 554px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 11: Cluster Analysis for BMI vs. Age<\/strong><\/p>\n\n\n\n<p>Look at the BMI vs. Age Clusters created above, with a model of 4 clusters. 
If you had a data point with an age of 27 and a BMI of 42, which cluster would you expect to be the best match?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Cluster 1<\/li>\n\n\n\n<li>Cluster 2<\/li>\n\n\n\n<li>Cluster 3<\/li>\n\n\n\n<li>Cluster 4<\/li>\n\n\n\n<li>None of these<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 11 Answer:<\/strong>&nbsp;<\/td><td><strong>&nbsp;C. Cluster 3 <\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Let\u2019s stay with our BMI vs. Age Clusters. Let\u2019s say we want more clusters in order to more finely analyze our data. Under the Marks menu, go to Clusters -> Edit Clusters, and change the number of clusters to 10.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-41.png\"><img loading=\"lazy\" decoding=\"async\" width=\"308\" height=\"422\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-41.png\" alt=\"\" class=\"wp-image-37952\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-41.png 308w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-41-219x300.png 219w\" sizes=\"auto, (max-width: 308px) 100vw, 308px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-42.png\"><img loading=\"lazy\" decoding=\"async\" width=\"594\" height=\"610\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-42.png\" alt=\"\" class=\"wp-image-37953\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-42.png 594w, 
https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-42-292x300.png 292w\" sizes=\"auto, (max-width: 594px) 100vw, 594px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Clusters<\/p>\n\n\n\n<p><strong>Question 12: Changing the Numbers of Clusters<\/strong><\/p>\n\n\n\n<p>Look at your new clusters, with 10 of them now on your XY scatterplot. What is the average age of those in the cluster with the highest BMI?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>About 26<\/li>\n\n\n\n<li>About 29<\/li>\n\n\n\n<li>About 51<\/li>\n\n\n\n<li>About 52<\/li>\n\n\n\n<li>Cannot determine from available information<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 12 Answer:<\/strong><strong>&nbsp;<\/strong><strong><\/strong><\/td><td><strong><\/strong><strong>&nbsp;<\/strong><strong>About 29<\/strong><strong><\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-43.png\"><img loading=\"lazy\" decoding=\"async\" width=\"597\" height=\"382\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-43.png\" alt=\"\" class=\"wp-image-37954\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-43.png 597w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-43-300x192.png 300w\" sizes=\"auto, (max-width: 597px) 100vw, 597px\" \/><\/a><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Let\u2019s make a new sheet and investigate Glucose vs. 
Insulin.\n<ol class=\"wp-block-list\">\n<li>Make an XY scatterplot with Insulin on the x-axis (because we can administer insulin) and Glucose on the y-axis (because that\u2019s our outcome variable)<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-44.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"588\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-44.png\" alt=\"\" class=\"wp-image-37955\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-44.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-44-300x283.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>Add filters to remove 0 values for Glucose (because a zero blood-glucose reading does not make sense).<ol><li>Do <strong>not<\/strong>\u00a0add filters for insulin. 
It\u2019s OK if the amount of insulin administered is zero.<\/li><\/ol>\n<ol class=\"wp-block-list\">\n<li>Have Tableau make 3 clusters.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-45.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"641\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-45.png\" alt=\"\" class=\"wp-image-37956\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-45.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-45-292x300.png 292w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 13: Mapping Clusters to Measurements<\/strong><strong><\/strong><\/p>\n\n\n\n<p>Normal blood glucose levels are about 100 in a fasting non-diabetic adult. Which cluster best represents this?<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Cluster 1, with an average Glucose reading of about 150 and an average Insulin value of about 54<\/li>\n\n\n\n<li>Cluster 2, with an average Glucose reading of about 100 and an average Insulin value of about 52<\/li>\n\n\n\n<li>Cluster 2, with an average Glucose reading of about 100 and 220 data points in it<\/li>\n\n\n\n<li>Cluster 3, with an average Glucose reading of about 161 and 347 data points in it<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 13 Answer:<\/strong>&nbsp;<\/td><td><strong><\/strong><strong>Cluster 2, with an average Glucose reading of about 100 and an average Insulin value of about 52<\/strong><strong><\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a 
href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-46.png\"><img loading=\"lazy\" decoding=\"async\" width=\"596\" height=\"358\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-46.png\" alt=\"\" class=\"wp-image-37957\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-46.png 596w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-46-300x180.png 300w\" sizes=\"auto, (max-width: 596px) 100vw, 596px\" \/><\/a><\/figure>\n\n\n\n<p>Stay on your Glucose vs. Insulin sheet, but let\u2019s add another piece of information. Drag your Outcome_Text variable (the one which declares \u201cDiabetes\u201d or \u201cNo Diabetes\u201d) to the Columns area. This should now give you two panes of graphs, one with Diabetes and one with No Diabetes.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-47.png\"><img loading=\"lazy\" decoding=\"async\" width=\"462\" height=\"331\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-47.png\" alt=\"\" class=\"wp-image-37958\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-47.png 462w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-47-300x215.png 300w\" sizes=\"auto, (max-width: 462px) 100vw, 462px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-48.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"394\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-48.png\" alt=\"\" class=\"wp-image-37959\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-48.png 624w, 
https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-48-300x189.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>You should have two XY scatterplots of your Glucose vs. Insulin clusters, one with Diabetes and one with No Diabetes. Which statements would you support after inspecting these visualizations? Choose all that apply:<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Cluster 1 has an average Glucose level of about 150 in both the Diabetes and No Diabetes classifications.<\/li>\n\n\n\n<li>Cluster 1, with an average Glucose reading of about 150 and an average Insulin value of about 54.<\/li>\n\n\n\n<li>Cluster 2, which is generally lower insulin usage and lower Glucose levels, has some people with a Diabetes classification, but many more with a No Diabetes classification.<\/li>\n\n\n\n<li>Cluster 3 has very high insulin, very high Glucose levels, and only contains people with a Diabetes classification.<\/li>\n\n\n\n<li>Cluster 3, with an average Glucose reading of about 161 and 347 data points in it<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 14 Answer:<\/strong>&nbsp;<\/td><td><strong>C. Cluster 2, which is generally lower insulin usage and lower Glucose levels, has some people with a Diabetes classification, but many more with a No Diabetes classification.<\/strong><br><strong>D. Cluster 3 has very high insulin, very high Glucose levels, and only contains people with a Diabetes classification.<\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>Perform Distribution Analysis with Tableau<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>We have now done EDA (exploratory data analysis) on this dataset, and we\u2019ve also done some cluster analysis to look at relationships between two 
variables.<\/li>\n\n\n\n<li>Now we are going to look at distribution analysis.<\/li>\n\n\n\n<li>Sometimes it\u2019s helpful to look for outliers, average levels, or other general distribution characteristics of a dataset. A Distribution Band can visually display that information.<\/li>\n\n\n\n<li>Let\u2019s go back to our BMI vs. Age dataset and graph. Remove any clusters and keep a filter on so that Tableau only displays data where the BMI is > 0 (do not display data with a BMI = 0).<\/li>\n<\/ol>\n\n\n\n<p>Go to the Analytics tab. Under Custom, choose Distribution Band. You want to drag this option to Table (Page) for this demonstration.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-49.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"516\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-49.png\" alt=\"\" class=\"wp-image-37960\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-49.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-49-300x248.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: cluster table<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>You will be given some options. For this one, you want the scope to be the entire table, and you want +\/- 1 Standard Deviation. (You will recall from statistics that if your data are normally distributed, about two-thirds of the values will be within +\/-1 standard deviation. This means that about 1\/6 is above +1 STDEV, and about 1\/6 is below -1 STDEV. 
So if something is \u201coutside\u201d of those bounds, it\u2019s \u201ca little bit different from average.\u201d)<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-50.png\"><img loading=\"lazy\" decoding=\"async\" width=\"240\" height=\"328\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-50.png\" alt=\"\" class=\"wp-image-37961\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-50.png 240w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-50-220x300.png 220w\" sizes=\"auto, (max-width: 240px) 100vw, 240px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Standard deviation<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>That last step will put a reference band on the graph. It\u2019s now easy to see which Age data points are \u201cclose to the average\u201d (they are inside the grey band) and which data points are \u201coutside of the average\u201d (they are outside of the grey band).<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-51.png\"><img loading=\"lazy\" decoding=\"async\" width=\"469\" height=\"370\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-51.png\" alt=\"\" class=\"wp-image-37962\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-51.png 469w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-51-300x237.png 300w\" sizes=\"auto, (max-width: 469px) 100vw, 469px\" \/><\/a><\/figure>\n\n\n\n<p>Alt text: Cluster<\/p>\n\n\n\n<ol style=\"list-style-type:lower-alpha\" class=\"wp-block-list\">\n<li>Let\u2019s go back and add a reference band of +\/-1 STDEV for the BMI as well. 
We get something like the following:<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-52.png\"><img loading=\"lazy\" decoding=\"async\" width=\"525\" height=\"422\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-52.png\" alt=\"\" class=\"wp-image-37963\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-52.png 525w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-52-300x241.png 300w\" sizes=\"auto, (max-width: 525px) 100vw, 525px\" \/><\/a><\/figure>\n\n\n\n<p><strong>Question 15: Understanding the Distribution Bands<\/strong><\/p>\n\n\n\n<p>Look at the distribution band for the BMI vs. Age scatterplot. Match the area with its description.<\/p>\n\n\n\n<p>Four Quadrants: A, B, C, D<\/p>\n\n\n\n<p>Four Options:<\/p>\n\n\n\n<ol style=\"list-style-type:upper-alpha\" class=\"wp-block-list\">\n<li>Lower age, Lower BMI<\/li>\n\n\n\n<li>Lower age, Higher BMI<\/li>\n\n\n\n<li>Higher age, Lower BMI<\/li>\n\n\n\n<li>Higher age, Higher BMI<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Question 15 Answer:<\/strong><\/td><td><strong>A. Lower age, Higher BMI \u2192 Quadrant B<\/strong><br><strong>B. Higher age, Higher BMI \u2192 Quadrant D<\/strong><br><strong>C. Lower age, Lower BMI \u2192 Quadrant A<\/strong><br><strong>D. Higher age, Lower BMI \u2192 Quadrant C<\/strong><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Make a new sheet, and this time, make a graph of Glucose vs. Insulin. 
Review these reminders:<ol><li>Insulin should be on the x-axis<\/li><li>Glucose should be on the y-axis<\/li><li>No clusters<\/li><li>Filter so Glucose is 1 or higher<\/li><li>Do not filter on Insulin (OK if Insulin values are 0)<\/li><li>Add one +\/- 1 STDEV Distribution band for the Insulin<\/li><li>Add another +\/- 1 STDEV Distribution band for the Glucose (you will do well to add them one at a time, at the Table level)<\/li><\/ol>\n<ol class=\"wp-block-list\">\n<li>You should get something that looks similar to this:<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-53.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"511\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-53.png\" alt=\"\" class=\"wp-image-37964\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-53.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-53-300x246.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-54.png\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"474\" src=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-54.png\" alt=\"\" class=\"wp-image-37965\" srcset=\"https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-54.png 624w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-54-300x228.png 300w, https:\/\/myassignmenthelp.info\/assignments\/wp-content\/uploads\/2025\/05\/image-54-290x220.png 290w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" 
\/><\/a><\/figure>\n\n\n<p><\/div><\/p>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[359],"tags":[],"class_list":["post-37910","post","type-post","status-publish","format-standard","hentry","category-education"],"_links":{"self":[{"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/posts\/37910","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/comments?post=37910"}],"version-history":[{"count":1,"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/posts\/37910\/revisions"}],"predecessor-version":[{"id":37966,"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/posts\/37910\/revisions\/37966"}],"wp:attachment":[{"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/media?parent=37910"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/categories?post=37910"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/myassignmenthelp.info\/assignments\/wp-json\/wp\/v2\/tags?post=37910"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}