Category Archives: data science

Deadly Events Are Contagious: Mass killings can trigger others

Roughly a quarter of mass killings and school shootings occur because of contagion, with one event increasing the chances of another, even if the suspect doesn’t consciously realize it, according to a study released today.

Researchers at Arizona State University and Northeastern Illinois University used USA TODAY’s mass killings data, which tracks killings of four or more people since 2006. USA TODAY’s data, unlike other sets, include many events that do not receive extensive media coverage.

Monte Talmadge walks past a sidewalk memorial in front of the Emanuel AME Church after a mass shooting there killed nine people in June.

The researchers also used 15 years of data on school shootings, plus data from the Brady Campaign to Prevent Gun Violence on public shootings in which at least three people were shot but not necessarily killed. They fit all three data sets to a contagion model to see whether one event had a ripple effect.

They found that the biggest events – mass killings and school shootings, which tended to get more publicity – had an average window of 13 days of contagion.
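
For readers curious what a contagion model can look like, here is a minimal sketch of a self-exciting process, in which each past event temporarily raises the daily rate of new ones. It is not the researchers' code. The exponential decay shape, the baseline rate and the size of the boost are illustrative assumptions; only the 13-day average window comes from the study as described above.

```python
import math

def event_rate(t, past_event_days, baseline=1 / 14, boost=0.25, window_days=13.0):
    """Expected events per day at day t under a simple self-exciting model.

    baseline    -- assumed background rate (one event per two weeks)
    boost       -- assumed average number of extra events each event triggers
    window_days -- average length of the contagion window (13 days in the study)
    """
    excitation = sum(
        # Each earlier event adds a bump that fades exponentially with time.
        (boost / window_days) * math.exp(-(t - s) / window_days)
        for s in past_event_days
        if s < t
    )
    return baseline + excitation

# One day after an event the rate is elevated; a month later it is nearly back to baseline.
rate_next_day = event_rate(1.0, [0.0])
rate_month_later = event_rate(30.0, [0.0])
```

In this sketch the extra risk added by one event, summed over all later days, equals `boost`, which is why that parameter can be read as the average number of events each event triggers.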

“What we think might be happening is the very small cross section of the public that’s vulnerable, that message gets to them and causes unconscious ideation to perhaps do something similar,” says Sherry Towers, professor of statistics at Arizona State University’s Simon A. Levin Mathematical, Computational and Modeling Sciences Center.

In the case of the recent Charleston, S.C., church shooting, there haven’t been any mass killings in the two weeks since. But there has been a rash of church arsons, which investigators are probing.

“Even though there doesn’t seem to be a contagion in this certain case, there does seem to be related events that have occurred,” Towers says.

Mass shootings with fewer than four deaths created no such contagion effect, Towers said. The researchers looked only at similar events, so they did not measure whether a mass killing or school shooting led to an increase in single shootings, for instance.

Towers said she started wondering whether school shootings triggered similar events in January 2014, when she was visiting the Purdue University campus and a student entered a campus building and shot and killed another student.

“There had just been three other school shootings in the news in the past week,” she said. “I just thought, ‘Is there something that’s causing these things to bunch together?’”

Other researchers have linked increases in suicides to high-profile suicides in similar contagion studies.

Towers says she hopes their findings will serve as a starting point for other researchers to look into the contagion effect further.

“This is a huge problem in the United States,” she says. “The chances of a person being killed in a mass killing is very low, but much higher in the US than in any other industrialized country.”

The study was published in the peer-reviewed PLOS ONE. USA TODAY’s mass killings data showed they happen every two weeks on average in the U.S., and nearly a quarter do not involve guns. Explore USA TODAY’s map of mass killings in America

–Meghan Hoyer


Data scientists’ pay: details worth $10K

The median data analyst makes about $98,000, including bonuses, in the U.S., according to a new salary survey from O’Reilly Media. But data people being what they are, the report includes a regression that lets anyone compare their salary against 27 variables, from location to experience and from tools used to gender.

The survey of 816 people (about two-thirds from the U.S.) isn’t random, and the fact that it deals with data wranglers certainly caught our eye. But a survey that actually breaks down the differences among salaries really stands out.

Why aren’t more salary surveys done this way?


Salaries have been a tricky thing in the past few years, especially for journalists. Publishing salaries of state or university workers is common at news organizations. They get lots of viewers and a lot of pushback for invading privacy.

But others have argued that knowing everyone’s salary is the only way to ensure pay equity, and that salary is based on merit, not on one’s ability to negotiate. It can also avoid scandals such as an $800,000 city manager in a low-income suburb of Los Angeles.

Yet even if the human resources department decided everyone’s pay should be transparent, that still doesn’t provide context — is there a good reason someone earns more?

Which is why the O’Reilly survey is important. Even with the 27 variables that contribute to salary, the regression only explains about 58 percent of the variance. Still, even the attempt to explain variance reveals some interesting findings:

  • Geography matters. Not surprisingly, data scientists in California and the Northeast make more (between $17K and $26K). But working in Texas had the second-highest boost.
  • Startups don’t pay well; neither does government. Working in education lowered the expected salary by $30K; startups lowered it by about $17K.
  • Experience counts. Each year of age and each year working with data together add about $2,500 to the expected salary. Using tools such as Python, Natural Language Processing, NumPy and R can *each* add $1,900 in expected salary. SQL, Python, Excel and R are the most common tools used.
  • Being female hurts. The survey shows a $13K gender pay gap among data scientists — and says no differences in tools, experience or other factors account for it. See also Wage Debate at the Oscars.
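
The findings above describe an additive model: start from a base figure, then add or subtract a fixed amount for each factor. Here is a minimal sketch using only the round numbers quoted in the list. The base salary, the names and the way the factors are combined are assumptions for illustration, not the survey's actual coefficients, and the geography ranges are left out.

```python
# Dollar adjustments quoted in the list above (rounded figures).
ADJUSTMENTS = {
    "startup": -17_000,
    "education": -30_000,
    "female": -13_000,
}
PER_YEAR = 2_500  # one more year of age and of data experience, taken together
PER_TOOL = 1_900  # applies to each of the tools named below
BONUS_TOOLS = {"Python", "Natural Language Processing", "NumPy", "R"}

def expected_salary(base, years=0, tools=(), factors=()):
    """Add up a hypothetical base salary and the quoted adjustments."""
    total = base + PER_YEAR * years
    total += PER_TOOL * len(BONUS_TOOLS & set(tools))
    total += sum(ADJUSTMENTS[name] for name in factors)
    return total

# Hypothetical analyst: $60,000 base (a made-up figure), two years in,
# uses Python, R and Excel, works at a startup.
estimate = expected_salary(60_000, years=2, tools=["Python", "R", "Excel"], factors=["startup"])
print(estimate)  # 51800
```

Excel adds nothing here because the list names it only as a common tool, not as one tied to a salary boost.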

Data science, whether it’s in journalism, government contracting or elsewhere, is a rapidly expanding field, which makes predicting salaries difficult. The O’Reilly survey may not be perfect, but it gives people real tools to create transparency without invading privacy.

–Jodi Upton

Our new geek-in-chief

My data science friends were all a-buzz recently: America now has a Chief Data Scientist.

DJ Patil, a former LinkedIn chief scientist and a chaos theorist, was appointed by President Obama as top cheerleader and data policy officer for Big Data.

“One of the most awesome things for me personally,” said Patil in his post-announcement address to the STRATA conference, “is how much our government has embraced data science.”

As evidence, Patil points to the dashboards the President uses, the 135,000 data sets released on, and how it all contributes to government transparency and solving our social ills.

It is indeed an exciting time. But as the new national spokesman for government data, Dr. Patil has a lot of explaining to do.

The Veterans Administration – the agency awash in charges that dozens of hospital leaders falsified wait-time data to get bonuses while veterans died – recently responded to a USA TODAY open-records request by sending data as a jpeg, a photo format.

The government didn’t release the data. It released a picture of the data.

That doesn’t even touch the politics that prevent collecting the right data in the first place.

For 19 years, the National Rifle Association has blocked federally funded gun research. As a result, we have almost no national data with details on who is injured or killed by guns, under what circumstances, with what caliber, where the gun came from, whether it was illegal, and what works to prevent gun accidents and trafficking.

When an unarmed black teenager in Ferguson, Mo., was killed by a police officer, everyone wanted to know: ‘How many people have been shot by cops, black or any race?’

Good luck finding out. In spite of decades of debate, the data is not collected.

So welcome aboard, Dr. Patil. We need a chief geek. But your biggest challenge may be the bureaucracy below your new boss.

–Jodi Upton