Three ways to feel the truth when dealing with data
Understanding data isn’t just about looking at the numbers — it’s about understanding the story behind them. Data on its own can be misleading, incomplete, or even flat‑out wrong if we don’t know the context it comes from. That’s why, before we jump to conclusions, it’s worth pausing to ask a few simple but powerful questions.
In this blog, I’m going to break down three things you need to know if you want to make sense of any dataset: who the data comes from, when it was collected, and what was actually asked. These may sound obvious, but as you’ll see, overlooking even one of them can completely change the “truth” you think you’re seeing.
1. The target market/audience/respondent base (Who)
If I researched the chocolate-eating habits of people in Buckinghamshire, my target market would be people in Buckinghamshire, and any results I got would be the “truth” (within the margin of error) for that target market only.
2. The timing of when my research was done (When)
In my fictitious example, I asked people in Buckinghamshire about their chocolate-eating habits in August 2025, but my results could be different if I repeated the research with the same target market (and the exact same phrasing of questions) in April 2026.
3. The phrasing of the questions I asked (What)
The phrasing of questions is important, as I might get different results if I used an open-ended question on chocolate-eating habits rather than a closed-ended question with multiple-choice options.
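The “within the margin of error” caveat from point 1 can be made concrete with a small sketch. It assumes a simple random sample and uses the standard normal approximation for a proportion; the sample size and percentage are invented for the fictitious chocolate survey, not real findings.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate margin of error for a survey proportion.

    Assumes a simple random sample; z=1.96 corresponds to roughly
    95% confidence under the normal approximation.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical figures: 400 respondents, 60% say they eat chocolate weekly.
moe = margin_of_error(0.60, 400)
print(f"60% +/- {moe * 100:.1f} percentage points")
```

With these made-up inputs the result is roughly plus or minus five percentage points, and it only describes the people who were actually sampled, which is why the “who” matters.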
To illustrate how important the “who,” “when” and “what” are when it comes to the “truth” of data, I am going to use a real-life example.
Why data without context can't be trusted
Some years ago, I worked in TV research and the company I worked for generated TV ratings in South Africa.
At some point a journalist contacted me, asking me to verify some data another journalist had quoted. Basically, it was a sentence stating that the TV rating for a specific TV drama (we will call it X for our discussion) among “people” was Y.
Sadly, I had to write back that I could not verify it as it stood: the company I worked for had an interactive data cross-tabulation programme, where users could put their own filters on the data to get TV ratings.
To verify the data, I would need to know what filter had been placed on the audience, i.e. was it adults only, adults and kids, adults in a certain age range, or any of the many other audience filters that could have been selected for “people”?
I would also need to know the “when” filter. When it comes to TV ratings, you can select not just a specific day, month and year, but a specific second. As X was a drama that had been on TV for some time, I would need that part of the puzzle too.
Lastly, I would need the “what” part of the puzzle. Did the first journalist look at a specific hour that X was on, or did he select the actual drama by name for a specific date? If he selected an hour (assuming, first of all, that X did not start a minute after the hour that day because the TV schedule was running late), he could also have included the ad breaks in the run, and they would have influenced the ratings. If he selected the drama by name for that specific date, he would have got the ratings for that drama, excluding the ad breaks.
Thus, I could not verify the truth of the data the first journalist had quoted until I had the context. As it turned out, the first journalist could not give us any more context either.
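To show how much the filters matter, here is a minimal sketch in Python. It is not the cross-tabulation programme I described; the minute-by-minute log, the viewer counts and the helper names are all invented for illustration, and it reports a simple average audience rather than a real TV rating.

```python
from dataclasses import dataclass


@dataclass
class Minute:
    date: str
    clock: str     # time of day, e.g. "20:01"
    content: str   # what was on air during that minute
    adults: int    # viewers in thousands (made-up)
    kids: int      # viewers in thousands (made-up)


# Entirely fictitious viewing log for one evening.
LOG = [
    Minute("2025-08-04", "20:00", "ad break", 300, 40),
    Minute("2025-08-04", "20:01", "drama X", 500, 60),
    Minute("2025-08-04", "20:02", "drama X", 520, 80),
    Minute("2025-08-04", "20:03", "ad break", 380, 20),
]


def average_audience(log, who, keep):
    """Average audience over the minutes selected by `keep`,
    counting only the viewers selected by `who`."""
    rows = [m for m in log if keep(m)]
    return sum(who(m) for m in rows) / len(rows)


def adults_only(m):
    return m.adults


def everyone(m):
    return m.adults + m.kids


# Same evening, three different "truths" depending on the filters.
by_slot_adults = average_audience(LOG, adults_only, lambda m: m.clock.startswith("20:"))
by_name_adults = average_audience(LOG, adults_only, lambda m: m.content == "drama X")
by_name_everyone = average_audience(LOG, everyone, lambda m: m.content == "drama X")

print(by_slot_adults, by_name_adults, by_name_everyone)
```

Selecting the time slot pulls in the ad breaks and lowers the figure, while switching from adults to everyone raises it. Without knowing which of these filters sat behind “people”, none of the three numbers can be checked against a quoted Y.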
This illustrates why it is important to, in some way, give the who, what and when if you produce a data report or article or anything else where you give people some data.
Sadly, there is also a fourth truth that comes to mind (and no, I am not changing the heading of this blog): often we do not have the resources to, for example, redo our fictitious chocolate research in 2026. Thus, we may just assume that since the previous research is not that old (2025), the results will be similar.
We may also need to look at chocolate-eating habits in the South East of England, but due to resource constraints, we will need to use the previous research and assume that people in Buckinghamshire are not that different from people in the rest of the South East.
Finally, we may have wanted to phrase our questions slightly differently, but due to resource constraints, we’ll just have to deal with what we have as it at least gives us some insight into the topic we are interested in.
Understanding data isn’t about chasing a single, perfect “truth”— it’s about recognising the context that shapes every number you see. When you know who the data comes from, when it was collected, and what was actually asked, you’re already miles ahead in interpreting it meaningfully. And even when resources are limited, as they so often are, you can still make thoughtful, informed decisions by being clear about the assumptions you’re making.