Bakoben M, Bellotti A, Adams N, Identification of credit risk based on cluster analysis of account behaviours, Journal of the Operational Research Society, ISSN: 0160-5682
Assessment of risk levels for existing credit accounts is important to the implementation of bank policies and offering financial products. This paper uses cluster analysis of behaviour of credit card accounts to help assess credit risk level. Account behaviour is modelled parametrically and we then implement the behavioural cluster analysis using a recently proposed dissimilarity measure of statistical model parameters. The advantage of this new measure is the explicit exploitation of uncertainty associated with parameters estimated from statistical models. Interesting clusters of real credit card behaviours data are obtained, in addition to superior prediction and forecasting of account default based on the clustering outcomes.
Ye H, Bellotti A, 2019, Modelling Recovery Rates for Non-Performing Loans, RISKS, Vol: 7, ISSN: 2227-9091
This paper formulates a protocol for prediction of packs, which is a special case of on-line prediction under delayed feedback. Under the prediction of packs protocol, the learner must make a few predictions without seeing the respective outcomes and then the outcomes are revealed in one go. The paper develops the theory of prediction with expert advice for packs by generalising the concept of mixability. We propose a number of merging algorithms for prediction of packs with tight worst case loss upper bounds similar to those for Vovk’s Aggregating Algorithm. Unlike existing algorithms for delayed feedback settings, our algorithms do not depend on the order of outcomes in a pack. Empirical experiments on sports and house price datasets are carried out to study the performance of the new algorithms and compare them against an existing method.
Tobback E, Bellotti T, Moeyersoms J, et al., 2017, Bankruptcy prediction for SMEs using relational data, Decision Support Systems, Vol: 102, Pages: 69-81, ISSN: 0167-9236
Bankruptcy prediction has been a popular and challenging research area for decades. Most prediction models are built using financial figures, stock market data and firm specific variables. We complement such traditional low-dimensional data with high-dimensional data on the company's directors and managers in the prediction models. This information is used to build a network between small and medium-sized enterprises (SMEs), where two companies are related if they share a director or high-level manager. A smoothed version of the weighted-vote relational neighbour classifier is applied on the network and transforms the relationships between companies into bankruptcy prediction scores, thereby assuming that a company is more likely to file for bankruptcy if one of the related companies in its network has already failed. An ensemble model is built that combines the relational model's output scores with structured data and is applied on two data sets of Belgian and UK SMEs. We find that the relational model gives improved predictions over a simple financial model when detecting the riskiest firms. The largest performance increase is found when the relational and financial data are combined, confirming the complementary nature of both data types.
Bellotti A, 2017, Reliable region predictions for automated valuation models, Annals of Mathematics and Artificial Intelligence, Vol: 81, Pages: 71-84, ISSN: 1012-2443
Accurate property valuation is important for property purchasers, investors and for mortgage-providers to assess credit risk in the mortgage market. Automated valuation models (AVM) are being developed to provide cheap, objective valuations that allow dynamic updating of property values over the term of a mortgage. A useful feature of automated valuations is to provide a region of plausible price estimates for each individual property, rather than just a single point estimate. This would allow buyers and sellers to understand uncertainty on pricing individual properties and mortgage providers to include conservatism in their credit risk assessment. In this study, Conformal Predictors (CP) are used to provide such region predictions, whilst strictly controlling for predictive accuracy. We show how an AVM can be constructed using a CP, based on an underlying k-nearest neighbours approach. Time trend in property prices is dealt with by assuming a systematic effect over time and adjusting prices in the training data accordingly. The AVM is tested on a large data set of London property prices. Region predictions are shown to be reliable and the efficiency, i.e. region width, of property price predictions is investigated. In particular, a regression model is constructed to model the uncertainty in price prediction linked to property characteristics.
Dirick L, Bellotti T, Claeskens G, et al., 2016, Macro-economic factors in credit risk calculations: including time-varying covariates in mixture cure models, Journal of Business & Economic Statistics, ISSN: 1537-2707
The prediction of the time of default in a credit risk setting via survival analysis needs to take a high censoring rate into account. This rate is due to the fact that default does not occur for the majority of debtors. Mixture cure models allow the part of the loan population that is unsusceptible to default to be modelled, distinct from time of default for the susceptible population. In this paper, we extend the mixture cure model to include time-varying covariates. We illustrate the method via simulations and by incorporating macro-economic factors as predictors for an actual bank data set.
Bakoben M, Bellotti AG, Adams NM, 2016, Improving clustering performance by incorporating uncertainty, Pattern Recognition Letters, Vol: 77, Pages: 28-34, ISSN: 1872-7344
In more challenging problems the input to a clustering problem is not raw data objects, but rather parametric statistical summaries of the data objects. For example, time series of different lengths may be clustered on the basis of estimated parameters from autoregression models. Such summary procedures usually provide estimates of uncertainty for parameters, and ignoring this source of uncertainty affects the recovery of the true clusters. This paper is concerned with the incorporation of this source of uncertainty in the clustering procedure. A new dissimilarity measure is developed based on geometric overlap of confidence ellipsoids implied by the uncertainty estimates. In extensive simulation studies and a synthetic time series benchmark dataset, this new measure is shown to yield improved performance over standard approaches.
Bakoben M, Adams N, Bellotti A, 2016, Uncertainty aware clustering for behaviour in enterprise networks, 16th IEEE International Conference on Data Mining (ICDM), Publisher: IEEE, Pages: 269-272, ISSN: 2375-9232
Crook J, Bellotti T, Mues C, 2015, Feature Cluster: New Developments in Credit Risk Modelling, European Journal of Operational Research, Vol: 249, Pages: 395-396, ISSN: 1872-6860
Hon PS, Bellotti T, 2014, Models and forecasts of credit card balance, European Journal of Operational Research, Vol: 249, Pages: 498-505, ISSN: 1872-6860
Credit card balance is an important factor in retail finance. In this article we consider multivariate models of credit card balance and use a real dataset of credit card data to test the forecasting performance of the models. Several models are considered in a cross-sectional regression context: ordinary least squares, two-stage and mixture regression. After that, we take advantage of the time series structure of the data and model credit card balance using a random effects panel model. The most important predictor variable is previous lagged balance, but other application and behavioural variables are also found to be important. Finally, we present an investigation of forecast accuracy on credit card balance 12 months ahead using each of the proposed models. The panel model is found to be the best model for forecasting credit card balance in terms of mean absolute error (MAE) and the two-stage regression model performs best in terms of root mean squared error (RMSE).
Damian MS, Howard RS, Ben-Shlomo Y, et al., 2014, ICNARC STUDY OF MORTALITY IN NEUROLOGICAL PATIENTS ON ICU, Meeting of the Association-of-British-Neurologists, Publisher: BMJ PUBLISHING GROUP, ISSN: 0022-3050
Bellotti T, Crook J, 2013, Forecasting and stress testing credit card default using dynamic models, International Journal of Forecasting, Vol: 29, Pages: 563-574, ISSN: 1872-8200
We present discrete time survival models of borrower default for credit cards that include behavioural data about credit card holders and macroeconomic conditions across the credit card lifetime. We find that dynamic models which include these behavioural and macroeconomic variables provide statistically significant improvements in model fit, which translate into better forecasts of default at both account and portfolio levels when applied to an out-of-sample data set. By simulating extreme economic conditions, we show how these models can be used to stress test credit card portfolios.
Bellotti T, Crook J, 2013, Retail credit stress testing using a discrete hazard model with macroeconomic factors, Journal of the Operational Research Society, Vol: 65, Pages: 340-350, ISSN: 1476-9360
Retail credit models are implemented using discrete survival analysis, enabling macroeconomic conditions to be included as time-varying covariates. In consequence, these models can be used to estimate changes in probability of default given downturn economic scenarios. Compared with traditional models, we offer improved methodologies for scenario generation and for the use of them to predict default rates. Monte Carlo simulation is used to generate a distribution of estimated default rates from which Value at Risk and Expected Shortfall are computed as a means of stress testing. Several macroeconomic variables are considered and in particular factor analysis is employed to model the structure between these variables. Two large UK data sets are used to test this approach, resulting in plausible dynamic models and stress test outcomes.
Damian MS, Ben-Shlomo Y, Howard R, et al., 2013, The effect of secular trends and specialist neurocritical care on mortality for patients with intracerebral haemorrhage, myasthenia gravis and Guillain-Barré syndrome admitted to critical care, INTENSIVE CARE MEDICINE, Vol: 39, Pages: 1405-1412, ISSN: 0342-4642
Bellotti T, Crook J, 2012, Loss given default models incorporating macroeconomic variables for credit cards, International Journal of Forecasting, Vol: 28, Pages: 171-182
Crook J, Bellotti AG, 2012, Asset correlations for credit card defaults, Applied Financial Economics, Pages: 87-95
The capital requirements formula within the Basel II Accord is based on a Merton one-factor model and in the case of credit cards an asset correlation of 4% is assumed. In this article we estimate the asset correlation for two datasets assuming the one-factor model. We find that the asset correlations assumed by Basel II are much higher than those observed in the datasets we analyse. We show the reduction in capital requirements that a typical lender would have if the values we estimated were implemented in the Basel Accord in place of the current values.
Bellotti T, Matousek R, Stewart C, 2011, A note comparing support vector machines and ordered choice models' predictions of international banks' ratings, DECISION SUPPORT SYSTEMS, Vol: 51, Pages: 682-687, ISSN: 0167-9236
Crook J, Bellotti T, 2010, Time varying and dynamic models for default risk in consumer loans, Journal of the Royal Statistical Society Series A: Statistics in Society, Vol: 173, Pages: 283-305, ISSN: 0964-1998
Bellotti T, 2010, A simulation study of Basel II expected loss distributions for a portfolio of credit cards, Journal of Financial Services Marketing, Vol: 14, Pages: 268-277, ISSN: 1363-0539
Bellotti T, Crook J, 2009, Support vector machines for credit scoring and discovery of significant features., Expert Syst. Appl., Vol: 36, Pages: 3302-3308
Bellotti T, Crook J, 2009, Credit scoring with macroeconomic variables using survival analysis, Journal of the Operational Research Society, Vol: 60, Pages: 1699-1707
Chervonenkis A, Long PM, Liu X, et al., 2007, Discussion on Hedging predictions in machine learning by A. Gammerman and V. Vovk, COMPUTER JOURNAL, Vol: 50, Pages: 164-172, ISSN: 0010-4620
Strefford JC, van Delft FW, Robinson HM, et al., 2006, Complex genomic alterations and gene expression in acute lymphoblastic leukemia with intrachromosomal amplification of chromosome 21., Proc Natl Acad Sci U S A, Vol: 103, Pages: 8167-8172, ISSN: 0027-8424
We have previously identified a unique subtype of acute lymphoblastic leukemia (ALL) associated with a poor outcome and characterized by intrachromosomal amplification of chromosome 21 including the RUNX1 gene (iAMP21). In this study, array-based comparative genomic hybridization (aCGH) (n = 10) detected a common region of amplification (CRA) between 33.192 and 39.796 Mb and a common region of deletion (CRD) between 43.7 and 47 Mb in 100% and 70% of iAMP21 patients, respectively. High-resolution genotypic analysis (n = 3) identified allelic imbalances in the CRA. Supervised gene expression analysis showed a distinct signature for eight patients with iAMP21, with 10% of overexpressed genes located within the CRA. The mean expression of these genes was significantly higher in iAMP21 when compared to other ALL samples (n = 45). Although genomic copy number correlated with overall gene expression levels within areas of loss or gain, there was considerable individual variation. A unique subset of differentially expressed genes, outside the CRA and CRD, were identified when gene expression signatures of iAMP21 were compared to ALL samples with ETV6-RUNX1 fusion (n = 21) or high hyperdiploidy with additional chromosomes 21 (n = 23). From this analysis, LGMN was shown to be overexpressed in patients with iAMP21 (P = 0.0012). Genomic and expression data has further characterized this ALL subtype, demonstrating high levels of 21q instability in these patients leading to proposals for mechanisms underlying this clinical phenotype and plausible alternative treatments.
Bellotti T, Luo Z, Gammerman A, 2006, Reliable classification of childhood acute leukaemia from gene expression data using confidence machines., Publisher: IEEE, Pages: 148-153
Bellotti T, Luo Z, Gammerman A, 2006, Strangeness Minimisation Feature Selection with Confidence Machines., Publisher: Springer, Pages: 978-985
Bellotti T, Luo Z, Gammerman A, et al., 2005, Qualified predictions for microarray and proteomics pattern diagnostics with confidence machines., Int J Neural Syst, Vol: 15, Pages: 247-258, ISSN: 0129-0657
We focus on the problem of prediction with confidence and describe a recently developed learning algorithm called transductive confidence machine for making qualified region predictions. Its main advantage, in comparison with other classifiers, is that it is well-calibrated, with number of prediction errors strictly controlled by a given predefined confidence level. We apply the transductive confidence machine to the problems of acute leukaemia and ovarian cancer prediction using microarray and proteomics pattern diagnostics, respectively. We demonstrate that the algorithm performs well, yielding well-calibrated and informative predictions whilst maintaining a high level of accuracy.
van Delft FW, Bellotti T, Luo Z, et al., 2005, Prospective gene expression analysis accurately subtypes acute leukaemia in children and establishes a commonality between hyperdiploidy and t(12;21) in acute lymphoblastic leukaemia., Br J Haematol, Vol: 130, Pages: 26-35, ISSN: 0007-1048
We have prospectively analysed and correlated the gene expression profiles of children presenting with acute leukaemia to the Royal London and Great Ormond Street Hospitals with morphological diagnosis, immunophenotype and karyotype. Total RNA extracted from freshly sorted blast cells was obtained from 84 lymphoblastic [acute lymphoblastic leukaemia (ALL)], 20 myeloid [acute myeloid leukaemia (AML)] and three unclassified acute leukaemias and hybridised to the high density Affymetrix U133A oligonucleotide array. Analysis of variance and significance analysis of microarrays was used to identify discriminatory genes. A novel 50-gene set accurately identified all patients with ALL and AML and predicted for a diagnosis of AML in three patients with unclassified acute leukaemia. A unique gene set was derived for each of eight subtypes of acute leukaemia within our data set. A common profile for children with ALL with an ETV6-RUNX1 fusion, amplification or deletion of ETV6, amplification of RUNX1 or hyperdiploidy with an additional chromosome 21 was identified. This suggests that these rearrangements share a commonality in biological pathways that maintains the leukaemic state. The gene TERF2 was most highly expressed in this group of patients. Our analyses demonstrate that not only is microarray analysis the single most effective tool for the diagnosis of acute leukaemias of childhood but it has the ability to identify unique biological pathways. To further evaluate its prognostic value it needs to be incorporated into the routine diagnostic analysis for large-scale clinical trials in childhood acute leukaemias.
Luo Z, Bellotti T, Gammerman A, 2004, Qualified Predictions for Proteomics Pattern Diagnostics with Confidence Machines., Publisher: Springer, Pages: 46-51
Gammerman A, Bellotti T, 1992, Experiments Using Minimal-Length Encoding to Solve Machine Learning Problems., Publisher: IEEE Computer Society, Pages: 359-367
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.