Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2010
DOI: 10.1145/1835804.1835817
Optimizing debt collections using constrained reinforcement learning

Cited by 54 publications (54 citation statements)
References 12 publications
“…For example, note that the segmentation in the initial iteration is optimized with respect to the estimation of immediate rewards, which is clearly suboptimal for estimating the long-term reward. This is also confirmed in empirical evaluation using multiple data sets, as Abe et al (2010) discuss.…”
Section: Solution Description (supporting)
confidence: 67%
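For context on the statement above, the sketch below is a hypothetical illustration, not the cited authors' code: it contrasts a segmentation fit only to immediate rewards with one refit against bootstrapped long-term targets, which is why the initial-iteration segmentation is suboptimal for estimating long-term reward. All variable names and the synthetic data are assumptions.

```python
# Minimal sketch (not the authors' implementation) of refitting a segmentation
# against long-term targets instead of immediate rewards.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n, gamma = 5000, 0.9
states = rng.normal(size=(n, 4))           # hypothetical case features
rewards = rng.normal(size=n)               # observed immediate rewards
next_states = states + rng.normal(scale=0.1, size=(n, 4))

# Iteration 0: segments (tree leaves) chosen to explain immediate reward only.
seg0 = DecisionTreeRegressor(max_leaf_nodes=8).fit(states, rewards)

# Later iterations: refit the segmentation against bootstrapped targets
# r + gamma * V(s'), where V comes from the previous iteration's estimate.
targets = rewards + gamma * seg0.predict(next_states)
seg1 = DecisionTreeRegressor(max_leaf_nodes=8).fit(states, targets)
```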
“…We refer the reader to Abe et al (2010) for the technical details. As we mention above, we use the MDP framework (Puterman 1994, Tsitsiklis and Van Roy 1997) to formulate the process of tax collections.…”
Section: Solution Description (mentioning)
confidence: 99%
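The MDP framing referred to above can be illustrated with textbook value iteration (Puterman 1994). The toy transition and reward arrays below are synthetic assumptions, not the collections model.

```python
# Generic value iteration for a small finite MDP; a toy illustration only.
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.95
P = np.random.default_rng(1).dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = np.random.default_rng(2).normal(size=(n_states, n_actions))   # expected rewards

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * P @ V          # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] * V[s']
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
policy = Q.argmax(axis=1)          # greedy policy under the converged values
```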
“…Along the same lines, Abe et al (2010) present a safe RL method that enforces high-level business and legal constraints during each value iteration step of the RL process, and apply their method to a tax collection optimization problem. ARL is complementary to the approaches proposed in (Abe et al, 2011; Castro et al, 2012; Moldovan and Abbeel, 2012) as it supports different types of constraints. In particular, the PCTL-encoded constraints used by our approach are not specific to any domain or application like those from (Castro et al, 2012) and (Abe et al, 2011).…”
Section: Related Work (mentioning)
confidence: 99%
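A minimal sketch of the general idea of enforcing constraints inside each value-iteration step: infeasible (state, action) pairs are masked out before the greedy maximization, so the resulting policy can never select them. The `allowed` matrix is a hypothetical stand-in for the high-level business and legal constraints; this is not the cited method's actual implementation.

```python
# Constraint enforcement applied at every value-iteration step (illustrative only).
import numpy as np

rng = np.random.default_rng(3)
n_states, n_actions, gamma = 4, 3, 0.95
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

allowed = rng.random((n_states, n_actions)) > 0.2     # True where an action is permitted
allowed[np.arange(n_states), rng.integers(n_actions, size=n_states)] = True  # keep >= 1 feasible action per state

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * P @ V
    Q_feasible = np.where(allowed, Q, -np.inf)        # mask infeasible actions each iteration
    V = Q_feasible.max(axis=1)
policy = Q_feasible.argmax(axis=1)                    # constrained greedy policy
```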
“…The existing results from this area focus on specifying bounds for the reward obtained by the RL agent or for simple measures associated with this reward (Abe et al, 2011; Castro et al, 2012; Delage and Mannor, 2010; Geibel, 2006; Moldovan and Abbeel, 2012; Ponda et al, 2013). In contrast to these approaches, ARL uses probabilistic model checking to formally establish safe AMDP policies associated with the broad range of safety, reliability and performance constraints that can be formally specified in PCTL (Hansson and Jonsson, 1994) extended with rewards (Andova et al, 2004).…”
Section: Introduction (mentioning)
confidence: 99%
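To illustrate the kind of PCTL-style property referred to above, the sketch below checks a reachability bound of the form P<=0.1 [ F "unsafe" ] on the Markov chain induced by a fixed policy. A real verification pipeline would use a probabilistic model checker such as PRISM; the chain, the unsafe state, and the 0.1 threshold here are synthetic assumptions.

```python
# Hand-rolled check of a simple PCTL-style reachability bound on a policy-induced Markov chain.
import numpy as np

rng = np.random.default_rng(4)
n_states = 5
unsafe = np.array([False, False, False, False, True])   # hypothetical unsafe state
P = rng.dirichlet(np.ones(n_states), size=n_states)     # policy-induced transition matrix
P[3] = np.eye(n_states)[3]                               # absorbing "case closed" state (safe)
P[4] = np.eye(n_states)[4]                               # absorbing unsafe state

# Iteratively compute p[s] = Pr(eventually reach an unsafe state | start in s).
p = unsafe.astype(float)
for _ in range(10_000):
    p_new = np.where(unsafe, 1.0, P @ p)
    if np.max(np.abs(p_new - p)) < 1e-12:
        break
    p = p_new

satisfied = p[0] <= 0.1   # does the initial state satisfy P<=0.1 [ F "unsafe" ]?
```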