In predictive modeling for the financial segment, particularly risk management and credit granting (Application Scores), several types of attributes are used: Cadastral, Behavioral, Business/Proposal, and Credit Bureau, in addition to Public, Private, or Subsidiary Sources. Among cadastral attributes, Age, Income, Education, Profession, and Home or Work Address are often eligible as covariates with great discriminatory power. The Postal Address Code in Brazil (Código de Endereçamento Postal, CEP, in Portuguese), in particular, makes a unique contribution (it is generally uncorrelated with most other attributes) and has reasonably good predictive power as measured by Information Value (IV). CEP is frequently used by truncating its numeric representation, for example by keeping only the first d digits. When five digits are used, however, the location is narrowed further, and fewer records share each value (low representativeness). The question is how best to use it and what controls should be applied. CEP takes values between 01000-000 and 99999-999, yet it is neither a discrete nor a continuous quantitative variable but a categorical one. Moreover, given how it is distributed geographically, it can be considered ordinal. Its ordering, however, is not directly increasing but snail-shaped (spiral), which makes a clear grouping strategy difficult. For this reason, when CEP is treated as a nominal category, its stability over time falls considerably, de-calibrating the model and shortening its useful life. In this report, a preliminary methodology is proposed for building clusters of CEPs based on clients' default information over a period of time. Additionally, the number of clusters obtained is tested using the Information Value criterion. Promising solutions are obtained using statistical and optimization approaches.
Other methodologies that could complement the principal methodology are also suggested. Final remarks and suggestions are included.
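To make the truncation and Information Value ideas concrete, the following is a minimal sketch and not the authors' method. It assumes each record is a (CEP, default flag) pair, groups records by the first d digits of the CEP, and computes the usual IV sum of (share of goods minus share of bads) times the log of their ratio. The function names and the additive `smoothing` constant, which avoids a logarithm of zero in sparsely populated groups, are illustrative assumptions.

```python
import math
from collections import defaultdict


def cep_prefix(cep: str, d: int) -> str:
    """Return the first d digits of a CEP, ignoring the hyphen."""
    digits = "".join(ch for ch in cep if ch.isdigit())
    return digits[:d]


def information_value(records, d, smoothing=0.5):
    """Information Value of the CEP prefix of length d.

    records: iterable of (cep, defaulted) pairs, with defaulted equal to 0 or 1.
    smoothing: hypothetical additive constant (an assumption of this sketch)
        that keeps every group share strictly positive.
    """
    goods = defaultdict(int)  # non-defaulting clients per prefix
    bads = defaultdict(int)   # defaulting clients per prefix
    for cep, defaulted in records:
        key = cep_prefix(cep, d)
        if defaulted:
            bads[key] += 1
        else:
            goods[key] += 1

    groups = set(goods) | set(bads)
    total_good = sum(goods.values()) + smoothing * len(groups)
    total_bad = sum(bads.values()) + smoothing * len(groups)

    iv = 0.0
    for g in groups:
        share_good = (goods[g] + smoothing) / total_good
        share_bad = (bads[g] + smoothing) / total_bad
        iv += (share_good - share_bad) * math.log(share_good / share_bad)
    return iv
```

Evaluating `information_value(records, d)` for several values of d gives a rough view of the trade-off described above: longer prefixes can separate defaults more sharply, but each prefix is then backed by fewer records, so the estimate becomes less stable.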