Tag Archives: bad data

Retiring boomers are buying big houses?! Not really…

A misused statistic can spring from a simple mistake. But when the mistake confirms preconceptions or seems unlikely, it can take hold. If it does both, it can become an urban myth.

“Around one third of boomer retirees are upsizing into larger homes” fits the bill. Those boomers! Forever self-indulgent and defiant of convention! Is this really happening? No.

A new study by Merrill Lynch and AgeWave surveyed 3,600 adults, a nationally representative sample that included 2,900 who were 50 or older. It found, among other things, that “pre-retirees who expect to downsize when they retire may be surprised to learn that half (49%) of retirees didn’t downsize in their last move. In fact, three in ten upsized into a larger home.” (Another 19% moved to a home of the same size.)


See the difference? The report cites “retirees … in their last move.” Only retirees who actually moved are included. From there it’s a big jump to “a third of boomer retirees,” as one tweet declared. Or “Why many retirees are upsizing into larger homes,” as one headline put it.

Census Bureau data for 2009-13 shows that only about 7% of people over 50 move in a given year. It’s not surprising. About 80% are homeowners, many with paid-off mortgages, longtime community ties and family nearby. Among people over 60 who own their homes, fully 45% haven’t moved in at least a decade, according to Census data analyzed through the University of Minnesota’s IPUMS archive.
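Here’s a back-of-the-envelope sketch of why the jump matters. The population figure below is invented for illustration; the 7% move rate and the “three in ten” upsizing figure come from the sources above:

```python
# Back-of-the-envelope: how "3 in 10 movers upsized" shrinks once you
# remember that few retirees move at all. The population is hypothetical.
retirees = 100_000               # invented population of retirees
annual_move_rate = 0.07          # ~7% of people over 50 move in a year (Census, 2009-13)
upsize_rate_among_movers = 0.30  # "three in ten" movers upsized (study finding)

movers = retirees * annual_move_rate          # 7,000
upsizers = movers * upsize_rate_among_movers  # 2,100

print(f"Share of ALL retirees upsizing in a year: {upsizers / retirees:.1%}")  # 2.1%
```

About 2% a year is a long way from “a third of boomer retirees.”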

The study itself contains facts that belie the idea that a third of retirees are upsizing:

— A third of surveyed retirees have no plans to move at all in retirement.

— Households with people over 55 account for just under half (47%) of home renovation spending, and about a third of retiree renovators cited adding an office, upgrading a kitchen or bathroom or “improving curb appeal.”

–Paul Overberg

 


Torturing the data till it lies

“Top 10 states for left-handers!” “Worst states for tall people!” “Best country to travel to if you are 45!”

The Web is rife with news features like this. Recipe: Assemble a basket of social measures for states or nations. Blend, rank and present as a measure of some condition. They are usually built as galleries of images or pages. Even as a reward for multiple clicks, they rarely offer a reader-friendly at-a-glance list.

The biggest problem with rankings like this: They use grouped data to conclude something about experiences that are much more tightly linked to local and personal factors.

This is the ecological fallacy. Put simply, you often can’t infer something about individuals because you have data about a group of them. This is especially true if the link that’s being claimed is barely plausible.


A simple and famous example: In the 1930 Census, a strong correlation existed between states’ English literacy rates and their shares of foreign-born people. But were immigrants more likely to be literate in English than native-born Americans? No. Census data for individuals showed the opposite, of course: immigrants were less likely than natives to be literate in English. But immigrants had clustered in states with relatively high literacy rates, so grouped data made them seem more literate than natives.
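A toy simulation makes the reversal concrete. The numbers below are invented for illustration, not the actual 1930 Census figures:

```python
import pandas as pd

# Invented numbers that mimic the 1930 pattern: in EVERY state immigrants
# are less literate than natives, yet states with more immigrants show
# higher overall literacy, because immigrants clustered in high-literacy states.
states = [
    # (state, natives, native_literacy, immigrants, immigrant_literacy)
    ("HighLit", 800_000, 0.98, 200_000, 0.90),
    ("MidLit",  900_000, 0.92, 100_000, 0.80),
    ("LowLit",  980_000, 0.85,  20_000, 0.70),
]

rows = []
for name, nat, nat_lit, imm, imm_lit in states:
    rows.append({
        "state": name,
        "pct_foreign_born": imm / (nat + imm),
        "overall_literacy": (nat * nat_lit + imm * imm_lit) / (nat + imm),
    })

df = pd.DataFrame(rows)
print(df)
print(df["pct_foreign_born"].corr(df["overall_literacy"]))  # strongly positive
```

Grouped one way, the data seems to say immigrants boost literacy; examined at the individual level, every state shows the opposite.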

Another example: In the presidential election of 1968, segregationist George Wallace won the electoral votes of AL, AR, GA, LA and MS. These states had the highest rates of black voters. Should we conclude that blacks voted strongly for Wallace?

States – diverse collections of people acting through laws and policies – exert little or no effect on many conditions in daily life, such as crime. And most social conditions vary within a state far more than they do among states. Data journalists spend a lot of time and sweat trying to get this right by collecting *local* crime rates or student-teacher ratios before they start probing for patterns.

There are legitimate times to rank states, most obviously on something the state government itself can affect directly, like the climate for startup businesses or the strength of consumer protection laws.

And USA TODAY has run such lists from content partners. They can be fun, clickable lists. But they really don’t tell us anything about ourselves.

So if your state ranks low as a place to be a coin collector or a Chevy driver, don’t fret.

–Paul Overberg

That stomach-sinking feeling when data is wrong

That nagging feeling that something is wrong with the data? Listen to it.

When my colleague Marisol Bello asked whether we could figure out how often parents kill their children – she was reporting in the wake of a high-profile case in Georgia last summer – I knew we could probably find some help in the FBI’s supplemental homicide reports, which include victim/suspect relationship details. I ran the queries and came up with some preliminary figures, but I also knew the SHR was notoriously spotty, because many cities fail to provide details on murders.

So I started looking for other research on the topic, eventually digging up what seemed to be the gold standard of analysis: a piece co-authored by an Ivy League researcher and published in a peer-reviewed journal. The only problem? The researchers had found six times as many filicides each year as I did in the FBI data.

I contacted the researchers. They hadn’t used the data directly from the FBI, but rather had used cleaned-up figures publicly available from James Alan Fox and Marc Swatt of Northeastern University. That must account for the differences, they said.

But something kept bothering me: According to the researchers’ findings, 3,000 children each year were killed by their parents. Keep in mind that there are roughly 16,000 homicides each year. That would mean that 20% of all victims were children killed by a parent or stepparent. I covered the cops beat for the first four years of my career – I thought back to all the gang battles, lovers’ quarrels and drug deals gone wrong. I could count on one hand the number of child/parent murders I had seen. It certainly wasn’t anywhere near 20%.

So I followed the researchers’ lead, downloaded Fox and Swatt’s data and opened it in SPSS. It didn’t take long to realize each case number, which was supposed to be a unique ID, was in the file six times. A phone call to Fox, who walked me through the data, revealed the researchers’ mistake. Unlike the raw FBI file, Fox and Swatt’s dataset is built for advanced statistical analysis – it has multiple imputations to allow academic researchers to fill in holes that we know exist in FBI data, either where cases are missing entirely or where certain details (the relationship between victim and killer, for instance) aren’t included. Each killing was broken into six lines: the original record, and five different imputations with different weights applied and missing values filled in.
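A check for duplicated IDs surfaces that structure immediately. This is a sketch in Python/pandas rather than SPSS, and the file and column names ("fox_swatt_shr.csv", "case_id") are placeholders, not the dataset’s real variable names:

```python
import pandas as pd

# Load a (hypothetical) flat export of the Fox/Swatt dataset.
shr = pd.read_csv("fox_swatt_shr.csv")

# If "case_id" were truly unique, every count would be 1.
counts = shr["case_id"].value_counts()
print(counts.value_counts())
# Here, every case ID appears 6 times: the original record plus 5 imputations,
# so naively summing rows inflates any tally roughly sixfold.
```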

Fox walked me through a way to properly weight cases (his data set includes separate weights for national analysis and state-by-state analysis) and how to properly fill in gaps where relationship details weren’t provided in the raw data.
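The fix, roughly, is to treat each imputation as its own complete copy of the data: compute a weighted count within each imputation, then average the estimates across them. Another hedged sketch, continuing the one above; the "imputation", "relationship" and "weight_national" columns and the year span are placeholders, and the exact steps Fox described may differ:

```python
# Keep only the imputed copies (assume imputation 0 marks the raw record).
imputed = shr[shr["imputation"] > 0]

# Victims killed by a parent or stepparent, per imputation, nationally weighted.
filicides = imputed[imputed["relationship"].isin(["parent", "stepparent"])]
per_imputation = filicides.groupby("imputation")["weight_national"].sum()

# Average across imputations instead of summing all rows, then put the
# total on an annual basis (the span of years here is hypothetical).
N_YEARS = 10
print(f"Estimated filicides per year: {per_imputation.mean() / N_YEARS:,.0f}")
```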

The upshot? I found that on average, about 450 children are killed by a parent or stepparent each year.

Brown University has since issued a correction to its press release on the researchers’ findings. Marisol and I published our findings in USA TODAY, along with a follow-up story published later.

(A version of this post was published on the American Press Institute’s Fact-Checking Project blog.)

–Meghan Hoyer

It’s not so easy to count mass killings

Two years ago this month, a particularly horrific mass killing took place in a Newtown, Conn., school. We’re still struggling with how we cover these crimes:

– In September, an FBI report on so-called “active shooter” cases was widely misreported as showing that “mass shootings” were increasing. A federal definition shared by several agencies defines “active shooter” as “an individual actively engaged in killing or attempting to kill people in a confined and populated area.” For its report, the FBI made two tweaks: it included cases where more than one person was shooting, and it dropped “confined” so that outdoor events counted. It’s worth pointing out that many of these cases don’t meet the FBI’s definition of a mass killing: four or more dead, not including killer(s).

The FBI just wanted tactical insight, so it also excluded whole categories of potentially qualifying events: “Specifically, shootings that resulted from gang or drug violence—pervasive, long-tracked, criminal acts that could also affect the public — were not included in this study. In addition, other gun-related shootings were not included when those incidents appeared generally not to have put others in peril (e.g., the accidental discharge of a firearm in a school building or a person who chose to publicly commit suicide in a parking lot).”


Wrote the authors: “The study does not encompass all mass killings or shootings in public places and therefore is limited in its scope.”

Resulting headlines at major news organizations: “Mass Shootings on the Rise, FBI Says”;  “Mass Shootings on the Rise, New FBI Study Shows”; “FBI: Mass shooting incidents occurring more frequently”; “FBI study: Deaths in mass shootings increasing.” (Search users beware: A recent check finds many still uncorrected stories on the Web.)

– We continue to update USA TODAY’s interactive graphic of mass killings, published a year ago. We use the FBI’s definition. This year we’ve counted 23, a bit below the average of 30 from 2006-13. (More on that in a minute.) Most weren’t big news outside the towns where they occurred. Typically, they involved a man targeting family members and acquaintances. Most involved guns, but a handful did not. None involved semiautomatic rifles, although police haven’t revealed the weapon in a few cases. Almost half of the suspects were found dead or were killed by police. In all, 104 victims were killed and 12 more wounded.
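Definitions drive the counts. A minimal sketch of how a tally changes under the FBI’s mass-killing threshold versus a looser gun-only “mass shooting” cut; the incident records here are invented purely for illustration:

```python
import pandas as pd

# Invented incidents -- purely to show how a definition changes a count.
incidents = pd.DataFrame([
    {"victims_killed": 5, "weapon": "gun",   "killer_dead": True},
    {"victims_killed": 2, "weapon": "gun",   "killer_dead": False},
    {"victims_killed": 4, "weapon": "knife", "killer_dead": False},
    {"victims_killed": 1, "weapon": "gun",   "killer_dead": True},
])

# FBI mass-killing definition: four or more dead, not counting the killer(s).
mass_killings = incidents[incidents["victims_killed"] >= 4]
print(len(mass_killings))  # 2 -- note the knife case qualifies

# A "mass shooting" cut that also requires a gun yields a different set.
mass_shootings = incidents[(incidents["victims_killed"] >= 4) &
                           (incidents["weapon"] == "gun")]
print(len(mass_shootings))  # 1
```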