
Finding value in online communities and knowledge sharing

Professor of Information Systems John Zhang automates the task of untangling right or useful answers from unhelpful ones in online communities.

By Betsy Loeff

For those who seek technical know-how and those who have it, we have online knowledge communities (OKCs), forums where some folks ask questions and others answer them. The good news about OKCs is that plenty of experts respond to the questions that other people post. The bad news is that most answers never get a thumbs up for being useful or correct, so the next person who has that same question may have a tough time determining which answers in the discussion thread are helpful or right.

John Zhang, a professor of information systems, aims to help OKC users untangle right or useful answers from unhelpful ones. Working with three other researchers, he’s designed a machine learning algorithm that can predict the usefulness of solutions among an OKC’s posted answers without human intervention.

Buried treasure

Anyone who questions whether Zhang’s algorithm could be a boon to knowledge seekers need look no further than the Apple iPhone Support Community, one of the two communities the researchers studied. In a recent discussion, a user posted that an iPhone 8 battery randomly started draining at “a super-fast rate.” The querent wrote, “I didn’t download or change any settings. My settings are the best for battery saving. I’m hoping it’s a weird software issue! Know why this could be happening out of the blue?”

To that Dec. 9, 2020, question, 179 others in the community had posted their thoughts on the matter by Jan. 19, 2021. Another 536 people had clicked the “I have this question, too” button. Many of the answers contained advice and step-by-step instructions, but only two were labeled “helpful,” and that’s rare. According to Zhang, only 27% of threads contain usefulness feedback in this particular online community. The share is even smaller in the Oracle OKC, the other community Zhang and his colleagues investigated, where just 10% of threads contain such feedback.

Often, when you do a Google search on some question, the top few search results take you to some kind of discussion forum. Once you get there, you have to read through the entire discussion thread to find correct answers or useful solutions to your questions. That can be a heavy lift for end-users.

To cut through that information overload, Zhang and his team developed a new approach to building a model that can pull correct answers and useful information to the top of the results. The team used a text-analysis framework based on the knowledge adoption model (KAM), which holds that a person’s perception of information usefulness rests on two primary features: argument quality and source credibility, or the perceived expertise of the person who posted the answer. KAM also maintains that the more helpful information is perceived to be, the more likely others are to make use of it.
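Zhang’s code isn’t reproduced here, so the Python sketch below is only a rough illustration of the KAM framing: an answer’s signals are grouped into the two KAM dimensions and then flattened into a single feature vector for a classifier. All of the field names are hypothetical, not the paper’s.

```python
from dataclasses import dataclass, astuple

@dataclass
class ArgumentQuality:
    """Signals about the answer text itself (illustrative names only)."""
    length_appropriateness: float
    relevance_to_question: float
    clarity: float

@dataclass
class SourceCredibility:
    """Signals about the person who posted the answer (also illustrative)."""
    prior_accepted_answers: int
    community_points: int
    has_expert_badge: bool

def to_feature_vector(aq: ArgumentQuality, sc: SourceCredibility) -> list[float]:
    # KAM's two dimensions become one row of numeric features for a classifier.
    return [float(x) for x in astuple(aq) + astuple(sc)]
```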

In sorting query responses, prior researchers often used computational models that Zhang’s team describes as ad hoc. For instance, some relied only on the word relevance of question-answer pairs. Or, as Zhang notes, “In the past, people have used the order of the answer to predict whether it will be useful to the question.” He also says research found this approach worked well for fact-based questions and answers, like “What’s the capital of Arizona?” It doesn’t work as well for more complicated questions or those that require specific knowledge.

Bragging rights

The knowledge adoption model allowed Zhang and his team to extract a comprehensive set of features from the posts and then feed those features into a classification algorithm that predicts the probability that an answer will be useful. The features included the appropriateness of the answer’s length, its relevancy to the question, its accuracy, and its clarity, i.e., how easy it is to understand. Beneath each of these features is a long list of metrics the research team evaluated.
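The article names these argument-quality features but not the exact metrics beneath them, so the sketch below uses common stand-ins: word count for length, TF-IDF cosine similarity between question and answer for relevancy, and average sentence length as a crude clarity proxy. None of these proxies should be read as the team’s actual metrics.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def argument_quality_features(question: str, answer: str) -> dict:
    """Rough, illustrative proxies for argument-quality features."""
    words = answer.split()

    # Relevancy: cosine similarity between TF-IDF vectors of question and answer.
    vectors = TfidfVectorizer().fit_transform([question, answer])
    relevance = float(cosine_similarity(vectors[0], vectors[1])[0, 0])

    # Clarity: average sentence length as a crude readability stand-in.
    sentences = [s for s in answer.split(".") if s.strip()]
    avg_sentence_length = len(words) / max(len(sentences), 1)

    return {
        "answer_length": len(words),
        "relevance": relevance,
        "avg_sentence_length": avg_sentence_length,
    }

print(argument_quality_features(
    "Why is my iPhone 8 battery suddenly draining so fast?",
    "Open Settings, tap Battery, and check which app is using the most power. Then update iOS.",
))
```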

As noted, the team also looked at source credibility, which knowledge community users can judge by the question respondent’s reputation.

How do we measure reputation? We can look at whether a respondent has answered other questions correctly, or view user profiles to see the person’s expertise.

Some sites also award badges or points that recognize participation and good feedback, tools that signal proficiency.

To evaluate these two potential predictors of information usefulness — message quality and source credibility — Zhang and his colleagues used a crawler to gather data from the Oracle sites and the iPhone Support Community. “These communities are some of the top communities in their industries,” Zhang says. Unlike corporate intranets, they’re also public, so the team had access to the data.
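The article doesn’t describe the crawler itself, so the snippet below is a generic sketch of fetching and parsing one discussion thread with requests and BeautifulSoup. The URL and CSS selectors are placeholders rather than either community’s real markup, and a production crawler would also need to honor each site’s robots.txt and rate limits.

```python
import time
import requests
from bs4 import BeautifulSoup

def fetch_thread(url: str) -> dict:
    """Download one thread and pull out the question, its answers, and any 'helpful' flags."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")

    # Placeholder selectors; the real markup differs by community.
    question = soup.select_one(".question-body")
    answers = soup.select(".answer")

    return {
        "question": question.get_text(strip=True) if question else "",
        "answers": [
            {
                "text": a.get_text(strip=True),
                "marked_helpful": a.select_one(".helpful-badge") is not None,
            }
            for a in answers
        ],
    }

# Pause between requests so the crawl stays polite.
for url in ["https://example.com/thread/1", "https://example.com/thread/2"]:
    thread = fetch_thread(url)
    time.sleep(2)
```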

It was a complicated data collection effort, though. The researchers couldn’t just set crawlers loose and assume what they gathered was correctly labeled, because too much of it wasn’t labeled at all. “We used helpers to label whether solutions were useful because only a small percentage of users who ask a question come back to label the answers helpful or correct,” Zhang explains. A significant share of answers carry no feedback, so even great answers often go without a thumbs up; the expert helpers the researchers hired filled in those missing labels.

Then, the researchers parsed out the features related to argument quality and source credibility for each question-answer pair, split the data into two datasets, and applied a model to classify answer usefulness.
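The article doesn’t say which classifier or which split the team used, so the sketch below shows that step generically with scikit-learn: hold out part of the labeled question-answer pairs, fit a classifier on the KAM features, and score it on the held-out portion. The random matrices simply stand in for the crawled, hand-labeled data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# X: one row of KAM features (argument quality + source credibility) per question-answer pair.
# y: 1 if the answer was labeled useful, 0 otherwise. Random values stand in for real data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))
y = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```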

As a baseline, the team first ran their data through a traditional text-categorization approach that relied only on lexical terms to construct feature sets. The models using the KAM-based approach proved more accurate, and the rigorous analysis the researchers performed provided what they called “strong support of argument quality and source credibility in determining information usefulness levels.” The research team also found that source credibility was the strongest predictor of correct or helpful answers.
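The article doesn’t specify the lexical representation behind that baseline; a common choice is a TF-IDF bag of words, so the comparison below trains the same classifier on TF-IDF features alone and on a KAM-style feature matrix, under that assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def compare(answer_texts, kam_features, labels):
    """Cross-validated F1 for a lexical-only baseline versus KAM-based features."""
    # Baseline: lexical terms only (TF-IDF bag of words over the answer text).
    lexical_model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    lexical_f1 = cross_val_score(lexical_model, answer_texts, labels, cv=5, scoring="f1").mean()

    # KAM-based: argument-quality and source-credibility features.
    kam_model = LogisticRegression(max_iter=1000)
    kam_f1 = cross_val_score(kam_model, kam_features, labels, cv=5, scoring="f1").mean()

    return {"lexical_f1": lexical_f1, "kam_f1": kam_f1}
```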

Widely useful tool

Zhang sees this research as being applicable in several ways. Companies with online knowledge communities could use it to make their sites more helpful and easier to use for community participants. Intranet designers could use it, too. Many large organizations have knowledge-sharing platforms that can be as unwieldy as large public sites.

This research could also be used to detect false narratives on financial news sites, such as those employed in “pump-and-dump” schemes, in which bad actors buy a stock when the price is low, publish news or spread rumors to get investors to buy in, and then dump the stock after the price rises. In 2017, the U.S. Securities and Exchange Commission uncovered “more than 450 problem articles, of which more than 250 falsely said the writers were not being paid,” according to coverage at the time by Reuters.

Chatbots, such as those you might encounter when calling your insurance company or cable provider, are another possible use for Zhang’s KAM-based algorithms. “When you talk to a chatbot in a phone call or type in a question online, the bot needs to search the entire knowledge base at the back end of its system and then provide an answer to your question. You want that bot to return the correct answer or a useful answer to that question.”
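As one illustration of that idea (not Zhang’s implementation), a bot could score every candidate answer in its knowledge base with a fitted usefulness model and return the top-ranked one; both the model and the featurize helper here are hypothetical.

```python
def best_answer(question: str, candidates: list[str], model, featurize) -> str:
    """Return the candidate answer the model considers most likely to be useful.

    `model` is any fitted classifier exposing predict_proba; `featurize` builds a
    KAM-style feature vector for a question-answer pair. Both are assumptions.
    """
    scored = [
        (model.predict_proba([featurize(question, answer)])[0, 1], answer)
        for answer in candidates
    ]
    return max(scored, key=lambda pair: pair[0])[1]
```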

Finally, this research has value for anyone who often winds up in knowledge communities seeking answers and help. If there’s a long string of responses to a query, look at source credibility. It’s the most powerful predictor of answer correctness or helpfulness. So, even if the OKC you’re wading through hasn’t taken advantage of Zhang’s study, the job title, experience, and points and badges a respondent can boast may be pretty good indicators that you’ve found the answer you seek.