Keystroke cops: Prof outlines framework for fighting cybercrime
Assistant Professor of Information Systems Victor Benjamin has created the Darknet Identification, Collection, Evaluation with Ethics (DICE-E) framework, appropriately pronounced “dicey,” to help researchers understand and prevent cybercrime.
Chances are, cybercrime has touched your life in some way … or it will in the future. Gallup researchers found that nearly one in four Americans said they’d been victims of cybercrime in 2018, and that’s about the same number who said they’d been victimized in 2017.
The cost of this activity is staggering. In the U.S., the Council of Economic Advisers estimated losses from cybercrime to be as much as $109 billion in 2016. In 2015, IBM’s then-CEO Ginni Rometty named cybercrime “the greatest threat to every profession, every industry, every company in the world.” And, according to the U.S. National Security Agency, it keeps growing every year.
Not surprisingly, corporations are ramping up efforts to thwart online criminals, and analytics is a key tool in this effort. The problem is those analytics require data — lots of data — and that’s been tough to come by in the Darknet.
To address this challenge, Assistant Professor of Information Systems Victor Benjamin has created the Darknet Identification, Collection, Evaluation with Ethics (DICE-E) framework. Appropriately pronounced “dicey,” Benjamin’s framework could help researchers understand and prevent cybercrime.
Deep, dark, and secret
To understand the value of Benjamin’s research, one first needs to understand what’s happening on the Darknet, which is a network of secret websites. Those Darknet sites generally are not indexed by traditional search engines and often serve as a platform for illegal activity, hacker communities, and a black market where things like credit card numbers, identities, drugs, or sex can be purchased.
Knowledge, the kind that enables newbie hackers to learn tricks of the trade or grab valuable source code for viruses and malware, also gets shared freely in online cybercriminal communities.
“It’s counterintuitive, but hackers often give up their attack techniques freely,” Benjamin says. “One of the ways this has been explained is that hackers try to build a reputation or credibility amongst each other.” He calls it a form of honor among thieves and adds, “After hackers build up their reputations, they start selling more advanced versions of the free assets they give away. Maybe they’ll give away a stripped-down version of some malware, virus, or hacking tool, then offer a premium version people can pay for.” It’s a business model akin to the freemium-versus-premium approach used by app developers, and it’s common on the Darknet.
According to Benjamin, gone are the days of the 1990s and early 2000s, when people hacked out of pride and showmanship. They’re no longer trying to showcase their technical savvy. Today, hacking is generally politically or financially motivated. Having a network of people paying for hacking assets and services can help finance hacker campaigns. Better yet, hackers can recruit others to assist in the online larceny or chaos.
Back on track
With all that giving, taking, and trading going on between hackers and cybercriminals, there’s plenty of data that can reveal what these bad actors are working on, thinking about, planning, and more. Since hackers share their tools, there are also insights to be gained from Darknet assets, such as the code, tutorials, and tools hackers share.
How do you collect that data so you can track and monitor cybercriminals? Until recently, Benjamin says, Darknet research data was collected manually, which led to qualitative analyses performed on limited snapshots of Darknet activity.
“Often, researchers were only able to look at a singular point of time and read a couple of hundred postings of content on a forum, then extrapolate and generalize their observations to describe the nature of activity happening in these communities,” he explains. There wasn’t enough data for predictive analytics, and the research did little to help prevent cybercrime itself.
“When I look at an antivirus tool, it’s as though the shooter has shot the bullet, and now you’re looking at the crime scene,” says Benjamin.
“I wanted this research to go back to the shooter, not the crime scene. When we go to the cybercriminal communities, we can detect emerging threats, potential targets, and what’s being discussed. This begins to give you an idea of the probability with which companies face different risks.”
Over the firewall
Benjamin’s framework enables researchers and corporate cybersecurity practitioners to move away from one-off qualitative analyses to the construction of more scalable cyberintelligence programs. His guidelines demonstrate how to identify and collect data resources that can be used in computational data modeling for actionable insights.
“A lot of hacker communities want to remain secret, and that’s where you run into some difficulties in data gathering,” Benjamin says. Those Darknet sites that don’t try to stay hidden often attract “script kiddies,” novice hackers who know just enough to grab some code and execute it without understanding what they’re doing. “The more advanced, secretive communities try not to be as open because they want to reduce the number of novice hackers in their network. They’re looking to recruit expert users.”
How do you climb over the firewalls? One way that Benjamin does it is through the snowball collection approach. This involves first going to the surface-level communities where the script kiddies play, downloading all the posts and examining them for the inevitable boasts where some hacker says, “Hey, I found a more advanced community. Check out this link.”
“If you do that long enough, that’s where the snowball collection comes in,” Benjamin explains.
Over time, people will put up hyperlinks leading to more secretive and advanced communities. Like a snowball rolling downhill, the data collection you’re doing accumulates mass as it progresses.
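The snowball approach described above can be sketched in a few lines. This is only an illustration, not Benjamin’s implementation: the community names, the toy posts, and the `fetch_posts` helper are hypothetical stand-ins for real downloaded forum data, and the link pattern is a simplifying assumption.

```python
import re
from collections import deque

# Hypothetical toy data standing in for downloaded forum posts.
# Keys are community addresses; values are the posts collected there.
TOY_POSTS = {
    "forum-a.example": [
        "Anyone have a tutorial for beginners?",
        "Hey, I found a more advanced community. Check out http://forum-b.example/join",
    ],
    "forum-b.example": [
        "Serious members only: http://forum-c.example/",
        "Newcomers should start at http://forum-a.example/",
    ],
    "forum-c.example": ["Premium tools for sale."],
}

# Simplifying assumption: communities are identified by the host part of a link.
LINK_PATTERN = re.compile(r"https?://([A-Za-z0-9.-]+)")


def fetch_posts(community):
    """Stand-in for a real download step; returns the posts for a community."""
    return TOY_POSTS.get(community, [])


def snowball(seeds, max_communities=100):
    """Start at seed communities and follow links found in their posts."""
    discovered = list(seeds)  # communities in the order they were found
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        community = queue.popleft()
        for post in fetch_posts(community):
            for host in LINK_PATTERN.findall(post):
                host = host.rstrip(".").lower()
                if host not in seen and len(discovered) < max_communities:
                    seen.add(host)
                    discovered.append(host)
                    queue.append(host)
    return discovered


print(snowball(["forum-a.example"]))
```

Starting from the single open forum, the sketch reaches the two communities that are only mentioned inside posts, which is the accumulating effect the snowball image describes.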
Snowball data collection is an automated technique, and many of the Darknet sites that researchers target have technology in place to prevent this type of large-scale automated data collection. “Some have heuristics embedded on the server software to detect a client that connects to the webserver and evaluate if it’s behaving like a human being or an automated web crawler,” Benjamin says.
A person might click through a dozen pages in a minute. A web crawler can click through hundreds. So, when conducting data collection on Darknet communities, Benjamin and his research teams set up an infrastructure consisting of multiple web crawlers using different IP addresses but acting as a single web crawler. These crawlers communicate on the back end to eliminate duplicated efforts. “It looks like several different users to the Darknet server, but from our end, the web crawlers are acting as a single entity,” he says.
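One way to picture several crawlers acting as a single entity is a shared back-end queue with a shared record of pages already seen. The sketch below is an assumption-laden simulation rather than the research team’s infrastructure: `SharedFrontier`, `Crawler`, the toy site, and the pacing of 12 pages per minute (borrowed from the “dozen pages in a minute” figure) are all hypothetical, and no real network requests or IP addresses are involved.

```python
from collections import deque

# Hypothetical toy site: page -> links found on that page.
TOY_SITE = {
    "/index": ["/thread/1", "/thread/2", "/thread/3"],
    "/thread/1": ["/thread/2", "/thread/4"],
    "/thread/2": ["/index"],
    "/thread/3": [],
    "/thread/4": ["/thread/1"],
}


class SharedFrontier:
    """Back-end coordination: one queue and one seen-set shared by all crawlers."""

    def __init__(self, seeds):
        self._queue = deque()
        self._seen = set()
        for url in seeds:
            self.add(url)

    def add(self, url):
        """Queue a URL unless some crawler has already queued or fetched it."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self._queue.append(url)
        return True

    def claim(self):
        """Hand the next unclaimed URL to exactly one crawler, or None if empty."""
        return self._queue.popleft() if self._queue else None


class Crawler:
    def __init__(self, name, frontier, pages_per_minute=12):
        self.name = name  # hypothetical label standing in for a distinct IP address
        self.frontier = frontier
        self.min_interval = 60.0 / pages_per_minute  # seconds between requests
        self.next_allowed = 0.0
        self.fetched = []

    def step(self, now, site):
        """Fetch at most one page, and only if the pacing interval has elapsed."""
        if now < self.next_allowed:
            return False
        url = self.frontier.claim()
        if url is None:
            return False
        self.fetched.append(url)  # a real crawler would download the page here
        self.next_allowed = now + self.min_interval
        for link in site.get(url, []):
            self.frontier.add(link)
        return True


def run(site, seeds, n_crawlers=3, pages_per_minute=12, max_seconds=600):
    """Simulate the crawlers on a one-second clock and return them."""
    frontier = SharedFrontier(seeds)
    crawlers = [
        Crawler(f"crawler-{i}", frontier, pages_per_minute) for i in range(n_crawlers)
    ]
    for now in range(max_seconds):
        for crawler in crawlers:
            crawler.step(float(now), site)
    return crawlers


for crawler in run(TOY_SITE, ["/index"]):
    print(crawler.name, crawler.fetched)
```

Each simulated crawler stays at a human-like pace on its own, while the shared frontier ensures that no page is fetched twice across the group.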
Caught in the hack
Automating Darknet data collection vastly expands insight and understanding because it allows researchers to quickly see trends and activity in real time on a wide scale. Through automation, researchers can instantly translate foreign languages, an important factor because many communities exchange information in languages other than English, and build databases of Darknet jargon.
“Consider the word ‘Zeus,’” says Benjamin. “Typically, people associate the word Zeus with Greek mythology. In the hacker communities, it’s the name of a hacking tool.”
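A jargon database of this kind could be as simple as a lookup table that tags posts with community-specific meanings. The sketch below is a hypothetical illustration: only the “zeus” entry comes from the article, and the `JARGON` table and `annotate` function are invented names for the example.

```python
import re

# Hypothetical jargon database: term -> meaning inside hacker communities.
# Only the "zeus" entry comes from the article; the structure is an assumption.
JARGON = {
    "zeus": "hacking tool",
}

TOKEN = re.compile(r"[a-z0-9']+")


def annotate(post, jargon=JARGON):
    """Return (term, community meaning) pairs for jargon terms found in a post."""
    found = []
    for token in TOKEN.findall(post.lower()):
        if token in jargon and (token, jargon[token]) not in found:
            found.append((token, jargon[token]))
    return found


print(annotate("Selling Zeus builds, PM me. Zeus is the best."))
```

A plain lookup like this cannot tell a forum post about the tool from a post about mythology; it only works because the text being tagged is already known to come from hacker communities.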
The speed that comes from automated data collection allowed Benjamin and his colleagues to find a hacktivist campaign called Operation Green Rights a few years ago. “This collective of hackers had begun to accuse certain corporations of causing extreme environmental damage, so the hacktivists were putting out propaganda campaigns to recruit people to help conduct attacks,” Benjamin says. “For instance, someone with no hacking skills could still contribute to a denial of service attack and cause some pain or financial loss for the target company. We were able to pick up those signals in real-time.”
Along with speed, Benjamin’s framework addresses the ethics of Darknet research. “Conceptually, you could buy Darknet tools and use them for research, but you don’t want to directly support cybercrime” through payments for these assets, he explains. Worse, a lot of the hacking tools a researcher might download could be infected with viruses, a huge liability for everyone connected to the scholar’s university network.
“We’re researchers, but we’re directly interacting with data and communities that are highly illegal,” Benjamin says. “You need to check off all the boxes to make sure the company or university systems remain secure.”
Benjamin’s framework was published in MIS Quarterly, one of the premier outlets for the information systems field, and it was the first paper on Darknet research to be accepted by that prestigious publication. His landmark paper will help business practitioners and IT scholars tap into cybercriminal communities and mine out actionable intelligence for security defense.
“The data identification and collection tasks were two hurdles most researchers couldn’t overcome,” Benjamin says. Now, they can, and they can do so with less risk because Benjamin’s DICE-E framework makes Darknet research a lot less dicey.