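Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences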

Chinese Journal of Management Science ›› 2018, Vol. 26 ›› Issue (1): 170-178. doi: 10.16381/j.cnki.issn1003-207x.2018.01.017

• Articles •

Court Judgment Decision Support System Based on Text-mining and Machine Learning

ZHU Qing1,2,4, WEI Ke-zhen1,2,4, DING Lan-lin5, LI Jian-qiang1,2,3   

  1. International Business School of Shaanxi Normal University, Xi'an 710119, China;
    2. Institute of Cross-Process Perception and Control of Shaanxi Normal University, Xi'an 710119, China;
    3. Department of Management Sciences, City University of Hong Kong, Hong Kong, China;
    4. Management School of Xi'an Jiaotong University, Xi'an 710049, China;
    5. School of Economics and Finance of Xi'an Jiaotong University, Xi'an 710049, China
  • Received:2016-07-18 Revised:2017-01-02 Online:2018-01-20 Published:2018-03-19

Abstract: In many countries with a continental legal system, new legal relationships arise constantly, and the weakness of statute law, which cannot be formulated and amended in a timely manner, has gradually become obvious. As the number of lawsuits grows rapidly, many countries face the problem of how to improve the efficiency of the judicial system while guaranteeing the quality of trials. Therefore, in addition to institutional reform, a decision support system can effectively improve judicial decisions.
In this paper, medical damage judgment documents from China are taken as an example, and a court judgment decision support system (CJ-DSS) is proposed based on text mining and automatic classification technology. The system predicts the trial result of a new lawsuit text, rejected or not rejected, according to the verdicts of previous cases. Different feature extraction methods (document frequency (DF), chi-square, and the combined DF-CHI method) are paired with different classifiers (SVM, ANN, and KNN), and the combinations that meet the expected performance are selected as base learners. Following the idea of the Delphi method, integrated learning is then used to predict new cases. Integrated learning here means constructing a new model that, after proper training, takes the predictions of the qualifying base learners as input and outputs the prediction with the maximum probability through linear or non-linear calculations.
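As a minimal illustration of DF and chi-square feature selection, the sketch below first drops rare terms by document frequency and then ranks the survivors by their chi-square statistic against the "rejected" class. Chaining the two in this way is only one plausible reading of a combined DF-CHI method, not the paper's actual procedure. The function names, the thresholds `min_df` and `top_k`, and the toy tokenized documents are all hypothetical.

```python
from collections import Counter

def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df

def chi_square(docs, labels, term, positive):
    """Chi-square statistic of a term against one class (2x2 contingency table)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == positive
        if has_term and in_class:
            a += 1          # term present, class matches
        elif has_term:
            b += 1          # term present, other class
        elif in_class:
            c += 1          # term absent, class matches
        else:
            d += 1          # term absent, other class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def df_chi_select(docs, labels, positive, min_df=2, top_k=3):
    """Assumed combination: filter by DF, then keep the top_k terms by chi-square."""
    df = document_frequency(docs)
    candidates = [t for t, count in df.items() if count >= min_df]
    # Sort by descending chi-square; break ties alphabetically for determinism.
    ranked = sorted(candidates,
                    key=lambda t: (-chi_square(docs, labels, t, positive), t))
    return ranked[:top_k]

# Toy, already-tokenized documents (invented for illustration only).
docs = [
    ["claim", "dismissed", "insufficient", "evidence"],
    ["claim", "dismissed", "no", "causation"],
    ["hospital", "liable", "compensation", "claim"],
    ["hospital", "liable", "compensation", "negligence"],
]
labels = ["rejected", "rejected", "not_rejected", "not_rejected"]

selected = df_chi_select(docs, labels, "rejected", min_df=2, top_k=4)
print(selected)
```

In this toy data the term "claim" passes the DF filter but ranks last by chi-square, because it appears in both classes and therefore carries little information about the verdict.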
In experiments on real cases, the combined feature extraction method is found to improve classifier performance, notably for the SVM, ANN, and KNN classifiers. In addition, the classification performance of the system became more consistent after integrated learning. The best performance reached 93.3%, a significant improvement in system accuracy.
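The integrated learning step, in which a second-level model combines the predictions of the base learners, might be sketched as a linear weighted vote. This is a simplified stand-in under stated assumptions, not the model used in the paper: each base learner receives a log-odds weight from its accuracy on held-out cases, and the class with the largest weighted score is returned. The function names and the held-out predictions labelled `svm`, `ann`, and `knn` are invented for illustration.

```python
import math

def fit_vote_weights(base_preds, truth):
    """Give each base learner a log-odds weight from its held-out accuracy."""
    weights = []
    for preds in base_preds:
        correct = sum(p == t for p, t in zip(preds, truth))
        # Laplace smoothing keeps the accuracy strictly inside (0, 1).
        acc = (correct + 1) / (len(truth) + 2)
        weights.append(math.log(acc / (1 - acc)))
    return weights

def combine(votes, weights):
    """Return the class with the largest weighted vote (a linear combination)."""
    scores = {}
    for vote, w in zip(votes, weights):
        scores[vote] = scores.get(vote, 0.0) + w
    return max(scores, key=scores.get)

# Hypothetical held-out verdicts and base-learner predictions.
truth = ["rejected", "rejected", "not_rejected",
         "not_rejected", "rejected", "not_rejected"]
svm = list(truth)  # 6 of 6 correct
ann = ["rejected", "not_rejected", "not_rejected",
       "rejected", "rejected", "not_rejected"]  # 4 of 6 correct
knn = ["not_rejected", "rejected", "not_rejected",
       "not_rejected", "not_rejected", "not_rejected"]  # 4 of 6 correct

weights = fit_vote_weights([svm, ann, knn], truth)

# New case: the most reliable learner disagrees with the other two.
prediction = combine(["rejected", "not_rejected", "not_rejected"], weights)
print(weights, prediction)
```

Here the weighted vote follows the single most reliable learner, whereas a plain majority vote would side with the two weaker ones. A non-linear combiner would replace `combine` with a trained classifier over the base predictions.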
The data source of this paper is the "BeiDaFaBao" legal database. Using "medical malpractice" as the keyword, more than 300 court verdicts and mediation documents from 2013 were retrieved. Because mediation documents are short and explain the cases only briefly, they were excluded from the study. The remaining documents were preprocessed and then used for training and testing.
In previous studies, the accuracy of text classification systems has been strongly influenced by training set size: the larger the training set, the better the performance. This paper offers a reference for building structured, high-performance systems from small training samples. Meanwhile, since labelling documents is costly, the study and modelling of unlabeled text should be a focus of future research for data scientists.

Key words: text-mining, automatic text classification, decision support system, CJ-DSS
