How do you know if someone’s creditworthy?
An estimated 45 million Americans, most often African-American, Hispanic, and low-income, have either no credit history or a credit history that is too thin to generate a credit score. Without a sufficient credit history, consumers face major barriers to accessing credit or pay more for it.
Creditworthy: Give Credit Where Credit is Due
Traditionally, lenders have relied primarily on “hard information” such as income level, assets (like property), and debt to predict creditworthiness. But sometimes that kind of hard information isn’t available, or is low quality. Additionally, credit scoring systems built on hard information don’t always capture borrowers’ probability of default: in some cases, loans with collateral can actually have higher default rates. New technologies for data generation and management offer opportunities to improve access to credit through reliance on “soft information.”
In a new paper, Wang, Drabek, and Wang explore whether soft information can improve risk assessment and predict the probability of repayment. The researchers define soft information as “information transmitted by a selected social or psychological characteristic that captures the identity of the borrowers.” This includes demographic characteristics like age, education, gender, and race, but also even softer information, such as social networks, video interviews, profile pictures, and descriptions of prior borrowing stories.
They tested two hypotheses:
- Hypothesis 1: Credit appraisal based on appropriately selected soft information can have strong predictive power.
- Hypothesis 2: The credit prediction model can be strengthened by soft information, which captures useful information for credit analysis that hard information does not include.
To explore the role of soft information, the researchers used data from the Chinese P2P platform RenrenDai.com, which offers peer-based investing. Borrowers apply for microloans by filling in a loan application and publishing information online, and peer investors do the credit analysis by themselves. One loan can have several investors, and loans are used for both personal and business purposes. The authors write that because Chinese culture “favors information derived from human relationships,” this kind of rich soft information is already provided on the RenrenDai platform.
The researchers created and tested the predictive power of three different models to predict default rates: one that relied purely on hard information, one that relied on soft information, and one that relied on both. They incorporated the variables below into their models:
Model I results (Hard information only)
They found that hard financial factors didn’t predict the likelihood of default very well, and some even showed results opposite to those expected. Their results hint at the need to include information beyond these factors to better predict the likelihood of default.
- Defaults were higher in low-income groups and lower in high-income groups, but verified income (high or low) was a stronger predictor than unverified income across income brackets
- Car ownership was not a significant indicator of default behavior
- If the applicant had a mortgage, they were less likely to default on the P2P lending platform
Model II results (Soft information only)
An analysis of soft variables demonstrated that individuals were less likely to default on their loan if they:
- Wrote more on their loan purpose description
- Were a woman
- Had a spouse
- Had more years of education
- Had a verified mobile number
- Had a verified Weibo profile
- Did not video-verify (perhaps because this optional step was taken by borrowers who had a higher probability of default and wanted to make themselves seem more trustworthy)
Surprisingly, age was not a significant predictor in this model.
Model III results (Hard and soft information)
When hard and soft variables were analyzed together, the significance and direction of all variables remained consistent with models I and II, except for car ownership, which became significant.
This model also teased out differences among high-income individuals: those with verified high income were less likely to default on their loans, but those with unverified high income were more likely to default. This suggests that borrowers may well be lying about the information they provide online, a pattern echoed by the video verification results.
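The three-model setup described above can be sketched in code. This is a minimal illustration, not the authors' specification: it assumes a logistic regression (this summary does not state which estimator the paper uses), fits it with plain gradient descent, and runs on a tiny hand-made dataset. The column names (`verified_income`, `has_mortgage`, `married`, `verified_mobile`) and every data value are hypothetical.

```python
import math

# Hypothetical feature groups; the names are illustrative, not the paper's variables.
HARD = ["verified_income", "has_mortgage"]
SOFT = ["married", "verified_mobile"]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, row, columns):
    """Predicted probability of default for one borrower."""
    z = weights[0] + sum(w * row[c] for w, c in zip(weights[1:], columns))
    return sigmoid(z)


def fit_logistic(rows, defaults, columns, epochs=500, lr=0.5):
    """Fit a logistic regression by batch gradient descent.

    Returns [bias, w_1, ..., w_k], one weight per entry in `columns`.
    """
    weights = [0.0] * (len(columns) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for row, y in zip(rows, defaults):
            err = predict(weights, row, columns) - y  # log-loss gradient w.r.t. the logit
            grad[0] += err
            for j, c in enumerate(columns, start=1):
                grad[j] += err * row[c]
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Made-up borrowers (1 = yes, 0 = no); the last field is 1 if the borrower defaulted.
raw = [
    # verified_income, has_mortgage, married, verified_mobile, defaulted
    (1, 1, 1, 1, 0),
    (1, 0, 1, 1, 0),
    (1, 1, 0, 1, 0),
    (0, 1, 1, 0, 0),
    (0, 0, 0, 1, 1),
    (0, 0, 1, 0, 1),
    (1, 0, 0, 0, 1),
    (0, 0, 0, 0, 1),
    (0, 1, 1, 0, 1),
    (1, 0, 1, 1, 0),
]
rows = [dict(zip(HARD + SOFT, r[:4])) for r in raw]
defaults = [r[4] for r in raw]

model_1 = fit_logistic(rows, defaults, HARD)         # hard information only
model_2 = fit_logistic(rows, defaults, SOFT)         # soft information only
model_3 = fit_logistic(rows, defaults, HARD + SOFT)  # hard and soft information

# Score one hypothetical applicant under each model.
applicant = {"verified_income": 1, "has_mortgage": 0, "married": 1, "verified_mobile": 1}
for name, weights, cols in [
    ("Model I", model_1, HARD),
    ("Model II", model_2, SOFT),
    ("Model III", model_3, HARD + SOFT),
]:
    print(name, round(predict(weights, applicant, cols), 3))
```

A negative fitted weight means the feature is associated with a lower predicted probability of default in this toy data. With so few invented rows, the numbers carry no empirical meaning; the point is only that the three models differ in which columns they are allowed to see.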
The authors used a receiver operating characteristic (ROC) curve to evaluate each model, and found that model III (hard and soft information) had the highest accuracy as a default screening classifier. Between models I (hard information only) and II (soft information only), soft information variables had a stronger effect on classifying defaulting borrowers than hard information variables. In summary:
- Social and psychological soft information substantially increased the predictive power of the credit analysis model.
- The combined hard and soft information model performed best, but the soft-only model outperformed the hard-only model.
- Soft information plays a more important role when the business environment lacks adequate finance-related hard information.
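To make the ROC comparison concrete, the sketch below computes the area under the ROC curve (AUC), a common single-number summary of such a curve. This summary does not say which ROC statistic the authors report, so treating AUC as the comparison metric is an assumption. The labels and scores are invented for illustration, chosen only to mirror the ordering described above. They are not results from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen defaulter (label 1)
    gets a higher score than a randomly chosen non-defaulter (label 0),
    with ties counted as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented outcomes: 1 = defaulted, 0 = repaid.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical predicted default probabilities from three models.
scores_hard = [0.6, 0.4, 0.3, 0.5, 0.35, 0.2]
scores_soft = [0.7, 0.5, 0.3, 0.4, 0.35, 0.2]
scores_combined = [0.8, 0.6, 0.35, 0.4, 0.3, 0.2]

for name, scores in [
    ("hard only", scores_hard),
    ("soft only", scores_soft),
    ("hard + soft", scores_combined),
]:
    print(name, round(roc_auc(labels, scores), 3))
```

An AUC of 0.5 corresponds to ranking borrowers at random and 1.0 to ranking every defaulter above every non-defaulter, so a higher value indicates a better default screening classifier. The pairwise loop is quadratic in the number of borrowers, which is fine for a small illustration; a sort-based rank formula would be the usual choice for a large loan book.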
This research provides evidence to encourage the greater use of soft information in determining creditworthiness, especially in circumstances where hard information is missing or low-quality. Relying on soft information has the potential to improve access to credit for the 45 million Americans without a credit score, who are more often African-American, Hispanic, and low-income. However, when incorporating soft information, designers should account for possible lying behavior, ensure their models are checked for biases, and pay attention to how these factors vary across cultures.
Yao Wang, Zdenek Drabek, Zhengwei Wang, The role of social and psychological related soft information in credit analysis: Evidence from a Fintech Company, Journal of Behavioral and Experimental Economics (2021), doi: https://doi.org/10.1016/j.socec.2021.101806