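Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences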

Chinese Journal of Management Science ›› 2021, Vol. 29 ›› Issue (3): 90-99. doi: 10.16381/j.cnki.issn1003-207x.2019.1841

• Articles •

Identification and Classification for Risk Paths in the Context of Cross-Border Important Data Flow

LI Jin1,2, SHEN Su-hao1, SUN Xiao-lei2, XING Xiao3   

  1. School of Economics and Management, Xidian University, Xi'an 710071, China;
    2. Institute of Science and Development, Chinese Academy of Sciences, Beijing 100190, China;
    3. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
  • Received: 2019-11-14 Revised: 2020-03-04 Online: 2021-03-20 Published: 2021-04-02

Abstract: With the development of information technologies such as artificial intelligence, big data, and cloud computing, massive volumes of data are being produced and collected at an explosive rate. Global economic activity and collaboration have also given rise to large-scale cross-border data flows. This data transmission raises potential risks and challenges for data security and national security. Identifying and classifying risk paths is an important component of early warning management for the cross-border flow of important data. However, previous research focuses mainly on regulation and policy suggestions. Few studies address risk management from a quantitative perspective, and relatively few address early warning management for the illegal transmission of important data.
Based on complex network theory, this paper studies the cross-border flow of important data. First, a binary (bipartite) network model, with two types of nodes representing data senders and data receivers, is employed to model the cross-border data flow network. Second, an associated network is established from the common-neighbor structure of the binary network. The associated network also reflects the data flow mechanism through its transmission tendency. Third, the risk paths for important data flowing across borders are identified from the constructed binary network and its associated network. A destination risk path (DRP) algorithm, which incorporates network structure, node attributes, and data transmission frequency, is designed to calculate the risk value of each risk path.
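The first two steps above can be illustrated with a minimal sketch. It assumes that the associated network links two senders whenever they share at least one receiver and weights the link by the number of shared receivers; the abstract does not state which side of the binary network is projected or how links are weighted, so both choices are assumptions. The function names and the toy flow records are hypothetical, and the DRP risk value is not reproduced here because the abstract does not give its formula.

```python
from collections import defaultdict
from itertools import combinations


def build_bipartite(flows):
    """Map each sender to the set of receivers it has sent data to."""
    receivers_of = defaultdict(set)
    for sender, receiver in flows:
        receivers_of[sender].add(receiver)
    return dict(receivers_of)


def associated_network(receivers_of):
    """Link two senders that share at least one receiver (assumed projection).

    The edge weight is the number of common receivers.
    """
    weights = {}
    for a, b in combinations(sorted(receivers_of), 2):
        common = receivers_of[a] & receivers_of[b]
        if common:
            weights[(a, b)] = len(common)
    return weights


# Hypothetical (sender, receiver) flow records, for illustration only.
flows = [("s1", "r1"), ("s1", "r2"), ("s2", "r2"), ("s2", "r3"), ("s3", "r3")]
bip = build_bipartite(flows)
assoc = associated_network(bip)
print(assoc)  # {('s1', 's2'): 1, ('s2', 's3'): 1}
```

In this toy network, s1 and s3 share no receiver and are therefore not linked in the associated network.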
Using cross-border data flow records collected from an important industry in China, empirical analyses are conducted to evaluate the performance of the proposed methods. Risk paths are identified and their risk values are obtained with the methods above. With AUC as the criterion, the comparison results indicate that the proposed DRP algorithm performs better in link prediction than algorithms from the previous literature, such as common neighbors, Jaccard, Sørensen, and potential link prediction. Furthermore, a risk classification is provided to support efficient data flow monitoring and management. A series of robustness checks that consider the effects of network size and node attributes is also conducted to support the main findings.
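As a rough illustration of the evaluation described above, the sketch below computes three of the baseline indices named in the abstract (common neighbors, Jaccard, Sørensen) and the AUC commonly used in link prediction: the probability that a held-out link scores higher than a nonexistent link, with ties counted as one half. The toy graph, the held-out link, and all function names are hypothetical. For simplicity it uses an ordinary undirected graph, not the paper's binary network, and it does not implement DRP.

```python
def common_neighbors(nbrs, x, y):
    """Number of neighbors shared by x and y."""
    return len(nbrs[x] & nbrs[y])


def jaccard(nbrs, x, y):
    """Shared neighbors divided by the size of the neighbor union."""
    union = nbrs[x] | nbrs[y]
    return len(nbrs[x] & nbrs[y]) / len(union) if union else 0.0


def sorensen(nbrs, x, y):
    """Twice the shared neighbors divided by the sum of the two degrees."""
    total = len(nbrs[x]) + len(nbrs[y])
    return 2 * len(nbrs[x] & nbrs[y]) / total if total else 0.0


def auc(score, nbrs, held_out, non_links):
    """Fraction of (held-out, nonexistent) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in held_out:
        for q in non_links:
            sp, sq = score(nbrs, *p), score(nbrs, *q)
            if sp > sq:
                wins += 1.0
            elif sp == sq:
                wins += 0.5
    return wins / (len(held_out) * len(non_links))


# Hypothetical training graph: edges a-b, a-c, b-c, b-d; node e is isolated.
nbrs = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}, "e": set()}
held_out = [("a", "d")]  # true link hidden from the training graph
non_links = [("a", "e"), ("b", "e"), ("c", "e"), ("d", "e"), ("c", "d")]

for name, fn in [("CN", common_neighbors), ("Jaccard", jaccard), ("Sorensen", sorensen)]:
    print(name, auc(fn, nbrs, held_out, non_links))
```

An AUC of 0.5 corresponds to random ranking, so a higher value indicates that an index separates held-out links from nonexistent ones more reliably.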
This paper focuses on the risk management issues emerging in the cross-border flow of important data. The proposed methodological framework can be applied across different important industries and helps regulatory authorities accurately identify and classify the potential risks in cross-border data flows. It provides a quantitative method for early warning management, which can reduce the related risks and further improve data governance capacity.

Key words: important data, cross-border flow, risk identification, risk management, early warning management
