That stomach-sinking feeling when data is wrong

That nagging feeling that something is wrong with the data? Listen to it.

When my colleague Marisol Bello asked whether we could figure out how often parents kill their children – she was reporting in the wake of a high-profile case in Georgia last summer – I knew we could probably find some help in the FBI’s supplemental homicide reports, which include victim/suspect relationship details. I ran the queries and came up with some preliminary figures, but I also knew the SHR was notoriously spotty, because many cities fail to provide details on murders.

So I started looking for other research on the topic, eventually digging up what can seem to be the gold standard of analysis: a piece co-authored by an Ivy League researcher, and published in a peer-reviewed journal. The only problem? The researchers had found six times as many filicides each year as I did in the FBI data.

I contacted the researchers. They hadn’t used the data directly from the FBI, but rather had used cleaned-up figures publicly available from James Alan Fox and Marc Swatt of Northeastern University. That must account for the differences, they said.

But something kept bothering me: According to the researchers’ findings, 3,000 children each year were killed by their parents. Keep in mind that there are roughly 16,000 homicides each year. That would mean that nearly 20% of all victims were children killed by a parent or stepparent. I covered the cops beat for the first four years of my career – I thought back to all the gang battles, lovers’ quarrels and drug deals gone wrong. I could count on one hand the number of child/parent murders I had seen. It certainly wasn’t anywhere near 20%.
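The back-of-the-envelope check described above takes only a few lines. This is a minimal sketch using the two rounded figures quoted in the post; the variable names are illustrative and nothing here comes from the FBI file itself.

```python
# Both figures are the rounded numbers quoted in the post.
claimed_filicides_per_year = 3_000
total_homicides_per_year = 16_000

# Share of all homicide victims that the published finding would imply.
share = claimed_filicides_per_year / total_homicides_per_year
print(f"Implied share of all homicide victims: {share:.1%}")
```

The result comes out just under 19%, which is the "roughly one in five" figure that clashed with years of experience on the cops beat.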

So I followed the researchers’ lead, downloaded Fox and Swatt’s data and opened it in SPSS. It didn’t take long to realize each case number, which was supposed to be a unique ID, was in the file six times. A phone call to Fox, who walked me through the data, revealed the researchers’ mistake. Unlike the raw FBI file, Fox and Swatt’s dataset is built for advanced statistical analysis – it has multiple imputations to allow academic researchers to fill in holes that we know exist in FBI data, either where cases are missing entirely or where certain details (the relationship between victim and killer, for instance) aren’t included. Each killing was broken into six lines: the original record, and five different imputations with different weights applied and missing values filled in.
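The same uniqueness check can be run outside SPSS. The sketch below uses a few made-up rows rather than the real dataset. The field names `case_id` and `imputation`, and the coding of 0 for the original record, are assumptions for illustration, not the actual layout of Fox and Swatt's file.

```python
from collections import Counter

# Toy rows, invented for illustration: each case appears six times,
# once as the original record and once per imputation (assumed coding).
rows = [
    {"case_id": case_id, "imputation": m}
    for case_id in ("A001", "A002", "A003")
    for m in range(6)  # 0 = original record, 1-5 = imputed copies
]


def rows_per_id(records, key="case_id"):
    """Count how many rows share each value of the supposedly unique ID."""
    return Counter(record[key] for record in records)


counts = rows_per_id(rows)
duplicated = {case_id: n for case_id, n in counts.items() if n > 1}

print(f"{len(rows)} rows, {len(counts)} distinct case IDs")
print(f"IDs appearing more than once: {len(duplicated)}")
```

If the row count is a clean multiple of the number of distinct IDs, as it is here, that is a strong hint the file is stacked on purpose and that a simple tally of rows will overcount.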

Fox walked me through a way to properly weight cases (his data set includes separate weights for national analysis and state-by-state analysis) and how to properly fill in gaps where relationship details weren’t provided in the raw data.
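The post does not spell out the exact procedure Fox described, so the following is only a sketch of one common way to handle stacked, multiply imputed data: compute the weighted count within each imputation, then average those counts. The rows, the weights and the field names (`imputation`, `weight`, `by_parent`) are all invented for illustration.

```python
from collections import defaultdict

# Toy rows, invented for illustration. Each case appears once per imputation;
# "weight" stands in for a national weight and "by_parent" for the
# (possibly imputed) victim/offender relationship.
rows = [
    {"case_id": "A001", "imputation": 1, "weight": 1.2, "by_parent": True},
    {"case_id": "A001", "imputation": 2, "weight": 1.2, "by_parent": True},
    {"case_id": "A002", "imputation": 1, "weight": 1.0, "by_parent": False},
    {"case_id": "A002", "imputation": 2, "weight": 1.0, "by_parent": True},
    {"case_id": "A003", "imputation": 1, "weight": 1.5, "by_parent": False},
    {"case_id": "A003", "imputation": 2, "weight": 1.5, "by_parent": False},
]


def pooled_weighted_count(records, flag="by_parent"):
    """Weighted count within each imputation, averaged across imputations."""
    totals = defaultdict(float)
    for record in records:
        # Add the weight only when the flag is set; adding 0.0 otherwise
        # still registers the imputation so it counts in the average.
        totals[record["imputation"]] += record["weight"] if record[flag] else 0.0
    return sum(totals.values()) / len(totals)


# Naive approach: treat every stacked row as a separate killing.
naive = sum(record["weight"] for record in rows if record["by_parent"])
pooled = pooled_weighted_count(rows)

print(f"Naive total over all stacked rows: {naive:.1f}")
print(f"Pooled estimate across imputations: {pooled:.1f}")
```

In this toy example the naive total is inflated by a factor equal to the number of imputations, which is the same kind of error that summing every row of a stacked file would produce.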

The upshot? I found that on average, about 450 children are killed by a parent or stepparent each year.

Brown University has since issued a correction to its press release on the researchers’ findings. Marisol and I used the data and our findings in a story published in USA TODAY, along with a follow-up story published later.

(a version of this post was published on the American Press Institute’s Fact-Checking Project blog)

–Meghan Hoyer


