I have been reviewing the “cottage industry” of data analytics courses and have found that they all lack one element: analytics. Sure, you get all the “buttonology” that goes along with these courses, but the true measure of any academic offering is the critical thinking it teaches. For instance, here is the curriculum in data analytics at one unnamed major university:
- Orientation and Introduction
- Data Querying and Reporting
- Data Access and Management
- Data Cleaning
- Statistical Programming Tools
- Data Mining Overview
- Geospatial Data Analytics
- Relational Databases and Data Warehouses
- Statistical Analysis of Databases
- Linear Algebra Overview
- Data Visualization
- Presentation Skills
- Teamwork Skills
- Problem Solving Skill
Notice that the people skills are at the end, and the problem solving course is at the very end of the list (and why do you want a linear algebra overview?). These data analytics curricula are chock-full of “how” and not “why.” I have taught undergraduate statistics for almost two decades and can tell you that everyone coming out of my classes knows what data are and what they are not. Truthfully, anything can be data, depending on the requirements of the study, whether it be shoe size or just plain dates on a calendar. The whole reason for math and statistics is to look for patterns, and that means you HAVE to include data. But more important, crucial in fact, is the organ between your ears: your brain. It is the human element, the thinking, that is essential to any type of data analysis. It puts the analysis in data analysis.
An example is necessary at this juncture. People get sick; they just do. They get the flu or flu-like symptoms and try to battle them with various remedies, but what about the environment? Sure, you can wash your hands and stay away from people who are sick, but what about those who are incubating the disease or are “carriers” without being aware of it? We all react to sickness, but we rarely change our environment to prevent it. After reading about the flu, I found out that flu bugs, like many diseases, love dry climates, as well as a certain pH in the air. So, what if you humidify the air? According to one study that reevaluated an earlier study on humidity and flu (http://www.webmd.com/cold-and-flu/news/20090213/influenza-linked-to-absolute-humidity), the rate of disease goes down with absolute humidity (the actual amount of moisture in the air) rather than “relative humidity” (the ratio of water vapor in the air to the saturation amount, which varies with temperature). What this means is that increasing the amount of water in the air may reduce the amount of flu in that air, since the virus does poorly at high absolute humidity. But, you ask, why is there flu in the tropics, where humidity is high? Here the extrapolation gets harder: warm air holds more water, so hot, humid tropical air is high in absolute humidity as well as relative humidity, and humidity alone does not explain it. That is exactly the kind of question that the data raises and only more analysis can answer.
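To make the difference between the two humidity measures concrete, here is a minimal sketch that converts relative humidity to absolute humidity. It assumes the Magnus approximation for saturation vapor pressure and the ideal gas law for water vapor; the function names and the sample temperatures are illustrative and do not come from the study cited above.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Saturation vapor pressure over water in hPa (Magnus approximation)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def absolute_humidity_g_m3(temp_c, relative_humidity_pct):
    """Grams of water vapor per cubic meter of air at a given temperature and RH."""
    vapor_pressure = saturation_vapor_pressure_hpa(temp_c) * relative_humidity_pct / 100.0
    # 216.7 comes from the ideal gas law for water vapor (pressure in hPa, result in g/m^3).
    return 216.7 * vapor_pressure / (temp_c + 273.15)


# Same relative humidity, two different temperatures (hypothetical readings).
for temp_c in (0, 20):
    print(temp_c, "C at 50% RH:", round(absolute_humidity_g_m3(temp_c, 50), 1), "g/m^3")
```

With these assumptions, air at 50% relative humidity holds roughly 2.4 g/m^3 of water at 0 C and roughly 8.6 g/m^3 at 20 C, which is why a relative humidity reading by itself says little about how much water is actually in the air.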
Taken one step further, remember those hot springs and steam baths? Well, if the absolute humidity is high, then the bugs want no part of it, and disease is reduced. The people who lived by taking steam baths did not understand the data portion of the answer; they just knew that when they took them they got sick less. It works for that person, so maybe it will work for me.
So what did we learn from this little foray into the world of data analytics? The analysis is 90% of the process, folks. Whether you are talking about the flu or cybersecurity, it is all the same.
Did I just make a segue without using that tool? Oops. So let’s talk about data analysis and cybersecurity. There are tools out there right now that collect an immense amount of data, usually to spot an outlier that will reveal the culprit: someone trying to take corporate information and sell it to the highest bidder, or maybe an insider threat bringing down the network, or possibly someone taking 1 cent from every other employee and putting it in their own paycheck (see “Superman III”). But the data is just the beginning, and you could actually prevent this stuff with some good old-fashioned analysis before the data ever arrives. This means that people have to be vigilant; all the employees have to be observant of their workplace environment. Analysis is not necessarily sitting in front of a whiteboard filled with formulas, nor in front of a computer staring at charts and graphs. It is just being observant and using a hundred sets of eyes rather than one automated tool. Yes, the tool can help corral the numbers, but it is really the observation that adds the value.
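For readers who want to see what “spot an outlier” can look like in practice, here is a minimal sketch. It assumes a modified z-score built on the median and the median absolute deviation (MAD), which is one common choice and not necessarily what any particular security tool does; the employee names, download counts, and threshold are all made up for illustration.

```python
from statistics import median


def flag_outliers(counts, threshold=3.5):
    """Return the keys whose value is far from the median, using a modified z-score."""
    values = list(counts.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median, so this score cannot rank anyone.
        return []
    return [
        name
        for name, value in counts.items()
        if 0.6745 * abs(value - med) / mad > threshold
    ]


# Hypothetical daily file downloads per employee.
downloads = {"ana": 12, "ben": 15, "cho": 11, "dev": 14, "eli": 13, "fay": 240}
print(flag_outliers(downloads))  # ['fay']
```

The tool only raises the flag. Whether the flagged person is stealing data or just running a legitimate backup is still a question for a human being who knows the workplace.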
Once, when I was at home talking with my Dad, I asked him when the neighbor had gotten a trailer. He looked at me and asked how I knew about that, since they had not picked it up yet and certainly had not told many people. I told him that I had seen their truck, and it had an “extended” outside rear view mirror, the kind used when hauling trailers. He just looked at me and smiled. I knew then to keep my eyes open for future possibilities.
One last thing about analysis: there is some guessing to this. The more analysis you do, the more you look for information that will either confirm or deny that guess. As the information accumulates, you become more certain that the guess is either right or wrong.
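One way to put numbers on a guess that firms up with evidence is Bayes' rule. The sketch below applies it to the trailer story above; every probability in it is invented for illustration, and the function name is hypothetical.

```python
def update_belief(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: probability the guess is true after seeing one piece of evidence."""
    true_and_seen = prior * p_evidence_if_true
    false_and_seen = (1.0 - prior) * p_evidence_if_false
    return true_and_seen / (true_and_seen + false_and_seen)


# Made-up numbers: before looking, a 10% chance the neighbor bought a trailer.
prior = 0.10
# An extended mirror is likely on a truck that tows (80%) and rare otherwise (5%).
posterior = update_belief(prior, 0.80, 0.05)
print(round(posterior, 2))  # 0.64
```

A single observation moves the guess from 10% to about 64% under these assumptions. A second clue, such as a new hitch on the truck, could be fed in the same way by using the posterior as the next prior.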
I will write more on this, but suffice it to say that analysis is something we do every day as parents, as adults, and certainly as inhabitants of planet Earth. Some people, such as Claude Shannon (see “Fortune’s Formula” by William Poundstone), analyze very complicated elements of human endeavor; others do simpler analysis, like writing a paper for school. In all cases, the analysis is brought to a conclusion with your brain, not just a chart or graph, which are tools but not the ultimate answer. Look around you and listen; analysis starts there.