February is heart (disease) awareness month, and it is important to realize that there are TONS of data available for learning about heart disease and the consequences it has on our lives and the lives of others. The Centers for Disease Control and Prevention (CDC) (www.cdc.gov) has data on how many deaths result from heart-related illness (the total has not changed much from year to year: approximately 610,000 deaths per year according to https://www.cdc.gov/dhdsp/data_statistics/fact_sheets/fs_heart_disease.htm). The number of deaths from heart disease is greater than the number from suicides, accidents (unintentional injuries), influenza, diabetes, and chronic lower respiratory diseases (https://www.cdc.gov/nchs/data/nvsr/nvsr60/nvsr60_06.pdf). Heart disease clearly needs attention, and it is in some ways preventable: according to the CDC website, almost 50% of Americans have AT LEAST ONE of THREE risk factors that are associated with heart disease. These three are elevated blood pressure, elevated LDL cholesterol, and smoking (https://www.cdc.gov/dhdsp/data_statistics/fact_sheets/fs_heart_disease.htm). I found this troubling, and I felt it called for some further “data diving” to see the association between heart disease and areas where I personally have knowledge, like diabetes or high blood pressure.
The CDC has so much data on the subject that I started my search at its site and found a survey called the Behavioral Risk Factor Surveillance System (BRFSS) (https://www.cdc.gov/brfss/). The data are available to anyone, either for download or for analysis using the CDC’s web-based analysis tools. I went to the “Surveys and Documents” link and found “BRFSS Prevalence and Trends Data,” which lets the user enter risk factors and view the data by US state, gender, and a number of other characteristics. This is much easier than downloading the data and doing the analysis yourself, and it also gives you an idea of the areas of the country where people are at more risk of heart disease than others. It is a great resource for those who want to look at the numbers behind the heart disease issue. If nothing else, it presents an interesting look at how populations in different regions of the country are more at risk of some diseases and less at risk of others.
I also looked at the BRFSS Web Enabled Analysis Tool (WEAT), which lets you look at the data from a cross-tabulation point of view. Here you can arrange characteristics in a number of ways to compare several factors against the disease. The tool is very easy to use, though it contains so many factors that it is hard to decide which ones to choose. For the budding data analyst, however, this is a great way to learn about data analysis and the multi-factor approach to it. A screenshot of the WEAT page is below (https://nccd.cdc.gov/s_broker/WEATSQL.exe/weat/index.hsql).
You can see the “Cross Tabulation” link, which you can click to set the factors you want to compare against any of the other factors the survey contains. Please do not get overwhelmed! There is so much data here that I used it for a project required for one of my graduate classes in statistics at Penn State. The data were provided, already collected and catalogued; all I had to do was run the various tests on them. It amazes me that more people do not know about this data treasure trove. I realize that this is a phone-based survey, but from what I can tell it is one of the most extensive and intensive surveys for getting a read on the different maladies that affect the United States, and it puts those tools in the hands of data analysts.
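For readers who have not met a cross-tabulation before, here is a small sketch of the idea in Python. It is only an illustration of what a cross-tabulation counts, not how WEAT itself works. The responses are made up for the example and are NOT BRFSS data, and the field names (diabetes, heart disease) and function names are my own assumptions.

```python
from collections import Counter

# Made-up responses for illustration only; these are NOT BRFSS data.
# Each record is (has_diabetes, has_heart_disease), both hypothetical fields.
responses = [
    ("yes", "yes"),
    ("yes", "no"),
    ("no", "no"),
    ("no", "no"),
    ("yes", "yes"),
    ("no", "yes"),
    ("no", "no"),
    ("yes", "no"),
]


def cross_tabulate(records):
    """Count how many records fall into each (row, column) cell."""
    return Counter(records)


def row_percentages(counts):
    """Convert cell counts into percentages within each row."""
    row_totals = Counter()
    for (row, _col), n in counts.items():
        row_totals[row] += n
    return {
        (row, col): 100.0 * n / row_totals[row]
        for (row, col), n in counts.items()
    }


counts = cross_tabulate(responses)
percents = row_percentages(counts)

# Print one line per cell of the table.
print("diabetes  heart_disease  count  row %")
for (row, col) in sorted(counts):
    print(f"{row:<9} {col:<14} {counts[(row, col)]:<6} {percents[(row, col)]:.1f}")
```

With these invented numbers, half of the “yes” diabetes group also reports heart disease, compared with a quarter of the “no” group. That is the kind of side-by-side comparison a cross-tabulation gives you; the toy numbers themselves say nothing about the real world.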
Although this article is about gathering and understanding data pertaining to heart disease, the data take you far beyond that one malady. Still, understanding some of the factors involved will undoubtedly help you to see heart disease as the product of several factors, rather than just something that happens as a result of “genetics,” as proposed by some.
Enjoy the CDC site and the various ways of using data to clarify a disease that will be with us for a lifetime (hopefully a LONG lifetime). To control it, we MUST understand it.
Learn, Offer, Value, Educate (LOVE)