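Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences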

Chinese Journal of Management Science, 2021, Vol. 29, Issue 5: 34-44. doi: 10.16381/j.cnki.issn1003-207x.2019.1554


A Loan Credit Risk Model Incorporating Text Prior Information

WANG Xiao-yan1, ZHANG Zhong-yan1, MA Shuang-ge2   

  1. College of Finance and Statistics, Hunan University, Changsha 410079, China;
    2. School of Public Health, Yale University, CT 06511, USA
  • Received: 2019-10-09; Revised: 2020-01-22; Online: 2021-05-20; Published: 2021-05-26

Abstract: Loans are both an important way to relieve a shortage of funds and a core business of financial institutions, so managing loan credit risk is essential to those institutions' survival and development. A key step in controlling credit risk is identifying the factors that significantly affect default. Existing studies contain much valuable information that can benefit this task. To incorporate that information, this study constructs a loan credit risk evaluation model named PIPL. The model first extracts text prior information from the existing literature via text mining, obtaining for each credit risk index a prior frequency that indicates its importance. A penalized variable selection method then converts the prior frequencies into a prior response, turning qualitative information into a quantitative counterpart. Finally, the loss function of the proposed model is constructed by weighting the prior response and the original observations. Risk indexes are selected with an Elastic Net penalty, and the parameters are estimated with an iteratively reweighted least squares method and a coordinate descent algorithm.
A simulation study verifies the validity of the proposed PIPL model. In particular, the simulation includes several types of prior information of differing quality, to examine how well the model uses good information and how robust it is to bad information. The results show that PIPL adapts to the quality of the prior information. When the information is of high quality, PIPL increases the weight of the prior information and thereby improves index selection and classification performance. When the information is unreliable, PIPL reduces the weight of the prior response, showing some degree of robustness in classification.
In the empirical analysis, 123 papers on credit risk are mined from CNKI. Taking P2P lending data from Lending Club as an example, the analysis shows that PIPL improves classification accuracy and exhibits satisfactory robustness. Both the simulation and the empirical study support the soundness of the new model, which may be of practical use in financial risk management.
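
The abstract does not give the exact form of the weighted loss, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the loss is a convex combination of two logistic losses, one on the observed defaults `y` and one on a prior response `y_prior`, with a hypothetical mixing weight `tau`. Because the logistic loss is linear in the response, this is equivalent to fitting a single logistic model to the blended target `(1 - tau) * y + tau * y_prior`. The names `fit_weighted_enet_logistic`, `tau`, `lam`, and `alpha` are illustrative, the intercept is omitted for brevity, and the prior response in the demo is a synthetic stand-in rather than one derived from text mining.

```python
import numpy as np

def soft_threshold(v, t):
    """Soft-thresholding operator used by the L1 part of the Elastic Net."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def fit_weighted_enet_logistic(X, y, y_prior, tau=0.3, lam=0.05, alpha=0.5,
                               n_irls=25, n_cd=50, tol=1e-6):
    """Elastic Net logistic regression on a blend of observed and prior responses.

    Assumed objective (no intercept):
      (1 - tau) * logloss(y, X b) + tau * logloss(y_prior, X b)
      + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2)
    """
    n, p = X.shape
    y_mix = (1.0 - tau) * y + tau * y_prior   # blended target in [0, 1]
    beta = np.zeros(p)
    for _ in range(n_irls):
        # IRLS step: quadratic approximation of the logistic loss at beta.
        eta = X @ beta
        prob = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(prob * (1.0 - prob), 1e-5, None)   # working weights
        z = eta + (y_mix - prob) / w                    # working response
        beta_old = beta.copy()
        resid = z - X @ beta
        # Coordinate descent on the penalized weighted least squares problem.
        for _ in range(n_cd):
            for j in range(p):
                xj = X[:, j]
                rho = np.mean(w * xj * (resid + xj * beta[j]))
                denom = np.mean(w * xj ** 2) + lam * (1.0 - alpha)
                new_bj = soft_threshold(rho, lam * alpha) / denom
                resid -= xj * (new_bj - beta[j])
                beta[j] = new_bj
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta

# Synthetic demo: two informative risk indexes and four noise indexes.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 6))
beta_true = np.array([2.0, -2.0, 0.0, 0.0, 0.0, 0.0])
p_default = 1.0 / (1.0 + np.exp(-(X @ beta_true)))
y = (rng.random(300) < p_default).astype(float)
# Stand-in for a high-quality prior response (an assumption for this demo).
y_prior = (X @ beta_true > 0).astype(float)

beta_hat = fit_weighted_enet_logistic(X, y, y_prior, tau=0.3)
print(np.round(beta_hat, 3))
```

In this sketch `tau` is fixed by hand. The adaptive behaviour reported in the abstract would require choosing the weight from the data, for example by cross-validation, which is likewise an assumption here and not a detail stated in the source.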

Key words: text information, Logistic regression, Elastic Net, loan credit risk
