Study on loan risk control by Markov chain and variational method

Wang, Kun (2023) Study on loan risk control by Markov chain and variational method. Research Master thesis, Victoria University.


It is widely acknowledged that loan companies face various risks, and several models are available to help them analyze customer behavior and control these risks. My research focuses on building models to predict customer behavior and control risk. In particular, we answer the following questions: Which factors are needed to set up a model? What is the Markov chain model, and how is it used? How can the variational inference method be used to estimate the transition matrix? In Chapter 1, we introduce the research problem, highlighting its background, the significance of my research, and its objectives and outcomes. The aim of our model is to overcome some common challenges in machine learning, such as the concept drift problem and the data imbalance problem. One of the key contributions of our model is the provision of interval estimates for the coefficients within the transition matrices. This feature offers a distinct advantage over conventional models, as it allows greater flexibility in coefficient selection. In Chapter 2, we conduct a literature review of the research problem from various perspectives. We show that many factors can affect customer behavior, such as income, job situation, health shocks, divorce, accidents, house market value, current loan-to-value, and so on. Previous researchers used different models to analyze the risks for the loan company. In the work that follows, we use the Markov chain model, the logistic regression model, and the random forest model in particular. We apply the variational inference method to the Markov chain transition matrix, which gives the model more flexibility. Chapter 3 focuses on developing an intelligent, machine learning-based Markov chain model to investigate loan risk and strategies for credit risk control. We review the Markov chain model and the variational inference method in this chapter.
We combine these two methods to set up a new model. Our model uses a Markov transition matrix to model state transitions of loan accounts. We optimize collection actions for each state and age of every consumer type to maximize the lender’s expected value. Additionally, we tackle the challenge of imbalanced data by employing the variational inference method and the logistic regression model. This approach bridges the gap commonly found in traditional machine learning processes when dealing with imbalanced datasets, thus improving prediction accuracy and reliability. The results of this chapter have been submitted to a reputable journal. Our model offers a novel approach to credit risk management. We anticipate that our study will significantly impact credit risk management practices and lay the foundation for future advancements. Chapter 4 explores an effective approach to scheduling collection actions on consumer term-loan accounts. To achieve this, we employ a Markov decision model that facilitates efficient decision-making over time. The use of a Markov chain model in managing consumer loans rests on the understanding that loan accounts naturally transition through various delinquency states over time. For example, an account in good standing remains so with timely payments but transitions to a delinquent state if payment is not received by the due date. Transition probabilities between states can be estimated from historical data, for example by maximum likelihood estimation. Our collection model estimates the loan company's profit and so can help minimize its risk by adopting different actions at different levels. Our optimization approach takes into account the time value of money, balancing interest revenue and borrowing costs. It also ensures time consistency throughout the optimization process.
We address competing risks that may arise between different account states and consider penalties that may be incurred due to late payments. Chapter 5 focuses on studying the random forest model and applying it to a specific data set. We review the random forest model and the logistic regression model, and compare the two. We create a scoring model aimed at predicting loan default for new applicants. We are provided with a dataset containing information about title loan customers, including their performance status (default) and other factors. Initially, we analyze the relationships between the factors and select appropriate ones. We then apply the random forest model to the data, achieving an AUC of over 80% by carefully choosing the hyper-parameter combination. Finally, in Chapter 6, we conclude our research work and mention some future research directions in this area.
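
The maximum likelihood estimate of transition probabilities mentioned in the summary of Chapter 4 can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the thesis: the four delinquency state names, the made-up account histories, and the function name `estimate_transition_matrix`. The sketch shows only the plain count-based estimate for a first-order Markov chain; it does not implement the variational inference method the thesis uses to obtain interval estimates.

```python
from collections import Counter

# Hypothetical delinquency states, chosen for illustration only.
STATES = ["current", "30_days_late", "60_days_late", "default"]


def estimate_transition_matrix(histories, states=STATES):
    """Return the maximum likelihood estimate of P[i][j] = P(next = j | now = i).

    For a first-order Markov chain, the MLE is the row-normalized count
    of observed one-step transitions.
    """
    counts = Counter()
    for history in histories:
        # Pair each period's state with the state in the following period.
        for now, nxt in zip(history, history[1:]):
            counts[(now, nxt)] += 1

    matrix = {}
    for i in states:
        row_total = sum(counts[(i, j)] for j in states)
        # Rows with no observed departures are left as None rather than guessed.
        matrix[i] = (
            {j: counts[(i, j)] / row_total for j in states} if row_total else None
        )
    return matrix


# Made-up monthly histories for three accounts (not data from the thesis).
histories = [
    ["current", "current", "30_days_late", "current"],
    ["current", "30_days_late", "60_days_late", "default"],
    ["current", "current", "current", "current"],
]

P = estimate_transition_matrix(histories)
for state, row in P.items():
    print(state, row)
```

With so few observations the point estimates are very uncertain, and states that are never left have no estimate at all. That kind of sparsity is one reason interval estimates for the transition coefficients can be useful.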

Additional Information

Master of Research

Item type Thesis (Research Master thesis)
Subjects Current > FOR (2020) Classification > 4611 Machine learning
Current > Division/Research > Institute for Sustainable Industries and Liveable Cities
Keywords loan companies; Markov chain model; logistic regression model; random forest model; machine learning