It pays to be skeptical
I’ve been thinking this week about alternative truths, distortions of reality, and the accuracy of human memory. We stare in disbelief when politicians utter complete untruths. Why? Because the available data does not corroborate what they are saying.
While these may be calculated attempts to mislead the public or misguided exaggerations of events, we increasingly find that the data and evidence gathered are critical points of reference for determining what is true and what is not.
Very often, we talk about how a narrative is altered from one listener to another until the final account bears little resemblance to the original event. The same can apply to data. In an age of heightened accountability, the question of how we preserve data integrity has become more important than ever.
There are well-established data engineering practices designed to ensure that data retains its veracity as it goes through the various storage, extraction and transformation processes. Often, however, an executive has no visibility into the dark arts of data engineering, or machine learning for that matter, and is presented with a finished report on which she must make a decision and perhaps stake her career.
It helps in this case to be skeptical and ask pointed questions of your data team. Here are a few questions¹ you might find useful:
Do you have any data to support that hypothesis?
Can you tell me something about the source of data you used in your analysis?
Are you sure that the sample data are representative of the population?
Are there any outliers in your data distribution? How did they affect the results?
What assumptions are behind your analysis? Are there conditions that would make your assumptions and your model invalid?
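If you want a feel for what a good answer to the outlier question looks like, here is a minimal Python sketch. It is only an illustration: the `outlier_report` helper, the 1.5 × IQR rule of thumb, and the weekly sales figures are all assumptions made up for this example, not something any particular data team uses.

```python
import statistics

def outlier_report(values, k=1.5):
    """Flag points outside the k * IQR fences and compare the mean with and without them.

    Hypothetical helper for illustration; k=1.5 is a common rule of thumb, not a law.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles of the data
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < low or v > high]
    kept = [v for v in values if low <= v <= high]
    return {
        "outliers": outliers,
        "mean_all": statistics.mean(values),
        "mean_without_outliers": statistics.mean(kept),
    }

# Made-up weekly sales figures; the last value is deliberately extreme.
weekly_sales = [102, 98, 105, 99, 101, 97, 103, 100, 480]
report = outlier_report(weekly_sales)
print(report)
```

On this made-up series, a single extreme week lifts the average from about 101 to about 143. A team that can show you both numbers, and explain which one the report relies on and why, has answered the question.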
Asking questions like these will help keep your data team on their toes. And if they can’t answer the questions to your satisfaction, send them away until they can.