Language Analysis Will Soon Make the Internet Much Less Anonymous

By Mark Shrayber | August 6, 2015 | 7:00pm


If you thought the internet was the last safe place on earth to make idle threats about killing people in between South Park reruns and games of Words With Friends, forensic linguists have some bad news for you: they’re making it harder and harder for trolls to hurt others anonymously.

It’s clear that social media isn’t going anywhere, and that means both criminal and civil litigation may soon feature comments, tweets, and Facebook status updates (not to mention Tumblr posts) as evidence more often than ever before. And, as anyone who’s been on the internet for more than a few days knows, all of these things are easy to fake or “hack.” That’s why forensic linguists are being brought in to look at evidence made up of as little as 140 characters and determine whether it’s legitimate and whether it was actually written by the party it’s attributed to.


CBS has posted a deep dive into the world of forensic linguistics, covering how linguists have used their considerable skills in past court cases and how they’re using them now. One investigator discovered that a threatening text attributed to a murder suspect may not have been sent by the suspect at all. Another, Roger Shuy, figured out that a misspelled ransom note was actually written by someone from a very specific area of Ohio (the note used the regional term “devil strip”) who was faking poor writing abilities to cover their tracks.

Now, linguists are creating profiles, putting together databases of text messages to determine patterns, and producing maps of regional terms in order to make identifying cyber criminals (as well as cyber bullies) much easier.

From CBS:

Tim Grant, the director of the Centre for Forensic Linguistics at Aston University in the U.K. and vice president of the International Association of Forensic Linguists (IAFL), is compiling a database of text messages.

His colleague at Aston, Jack Grieve, is building a collection of tweets to research regional language patterns in the U.S. He creates maps of words, such as swear words, to show where they are more or less common. Linguists can use that information to better predict where unknown authors — of tweets and other media — are from, like Shuy did with “devil strip.”
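To make the idea of regional word maps a bit more concrete, here is a minimal Python sketch. It is an illustration only, not part of the CBS report and not Grieve's or Shuy's actual method. The sample texts, the region labels, and the `guess_region` helper are all made up for the example; it simply counts how often each word shows up per region and picks the region whose word habits best match an unknown message.

```python
from collections import Counter
import math

# Toy, invented samples (NOT real data): region label -> example messages.
SAMPLES = {
    "akron": ["parked on the devil strip again", "mow the devil strip today"],
    "elsewhere": ["parked on the curb again", "mow the lawn by the sidewalk today"],
}

def term_counts(texts):
    """Count how often each lowercase word appears across the texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts

def guess_region(text, samples):
    """Hypothetical helper: score each region by smoothed log word frequency."""
    vocab = {w for texts in samples.values() for t in texts for w in t.lower().split()}
    scores = {}
    for region, texts in samples.items():
        counts = term_counts(texts)
        total = sum(counts.values())
        # Add-one smoothing so unseen words do not zero out a region.
        scores[region] = sum(
            math.log((counts[w] + 1) / (total + len(vocab)))
            for w in text.lower().split()
        )
    best = max(scores, key=scores.get)
    return best, scores

region, _ = guess_region("leave the money on the devil strip", SAMPLES)
print(region)
```

With real data the samples would be large collections of geotagged posts, and, as the critics quoted by CBS point out, a word count like this cannot capture the nuances of language by itself.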

Of course, forensic linguistics is not an exact science, and bringing in computers to run analyses may not always be a perfect way to determine whether someone has committed a crime. CBS reports that critics are concerned digital programs can’t pick up the nuances of language. Still, experts are hoping to get ahead of the wave of social media-based evidence by preparing now for the day the courts are overflowing with exhibits ranging from text convos to Grindr messages (hey, Grindr’s already been on Judge Judy).

The most concerning thing for the common citizen who isn’t out there trying to plan a murder is that writing-pattern analysis and the coding of regional words once again suggest that none of us is really that unique in our communication style. But hey, as long as you’re just tweeting about your love of broccoli or how much you’ve always enjoyed Huey Lewis (and The News too), you’ve probably got nothing to worry about.

