Initialize the Q-function hypothesis Q̂_0 to 0
e ← 0
repeat
    Examples ← ∅
    generate a starting schedule state s_0
    i ← 0
    repeat
        choose a repair operator a_i at s_i using a policy (e.g., ε-greedy)
            based on the current hypothesis Q̂_e
        implement operator a_i, observe r_i and the resulting schedule s_{i+1}
        i ← i + 1
    until schedule state s_i is a goal state
    for j = i − 1 to 0 do
        generate example …

Several incremental relational regression techniques have been developed that meet the above requirements for RRL implementation: an incremental relational tree learner, TG, an instance-based learner, a kernel-based method (Gärtner et al., 2003; Driessens et al., 2006), and a combination of a decision tree learner with an instance-based learner (Driessens and Džeroski, 2004). Of these algorithms, TG is the most popular, mainly because it is relatively easy to specify background knowledge in the form of a language bias.
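In the canonical RRL algorithm, the truncated step generates, for each visited pair, an example (s_j, a_j, q̂_j) with q̂_j ← r_j + γ max_{a'} Q̂_e(s_{j+1}, a'), and the set Examples is then handed to a relational regression learner to produce the next hypothesis Q̂_{e+1}. The following is a minimal Python sketch of the episode loop above under toy assumptions: the "schedule" is just a permutation of job ids, the repair operators are adjacent swaps, and a crude table-averaging regressor stands in for a relational learner such as TG. GAMMA, EPSILON, and all identifiers are illustrative choices, not taken from the paper.

import random

GAMMA = 0.9      # discount factor (assumed; not given in the excerpt)
EPSILON = 0.2    # exploration rate for the ε-greedy policy

# Toy stand-ins: a "schedule" is a tuple of job ids, the goal is one fixed
# order, and a repair operator swaps two adjacent jobs.  The paper's states
# and operators are relational schedule descriptions, not permutations.
GOAL = (0, 1, 2, 3)

def repair_operators(state):
    """Enumerate applicable repair operators as adjacent-swap positions."""
    return list(range(len(state) - 1))

def apply_operator(state, i):
    s = list(state)
    s[i], s[i + 1] = s[i + 1], s[i]
    return tuple(s)

def is_goal(state):
    return state == GOAL

def reward(state):
    return 1.0 if is_goal(state) else 0.0

class TableQ:
    """Crude incremental regressor over (state, action) pairs: it averages
    observed targets.  In RRL proper this is the component replaced by a
    relational learner such as TG."""
    def __init__(self):
        self.sums, self.counts = {}, {}
    def predict(self, state, action):
        key = (state, action)
        return self.sums.get(key, 0.0) / self.counts.get(key, 1)
    def update(self, examples):
        for state, action, q in examples:
            key = (state, action)
            self.sums[key] = self.sums.get(key, 0.0) + q
            self.counts[key] = self.counts.get(key, 0) + 1

def epsilon_greedy(q_hat, state):
    ops = repair_operators(state)
    if random.random() < EPSILON:
        return random.choice(ops)
    return max(ops, key=lambda a: q_hat.predict(state, a))

def run_episode(q_hat):
    """One RRL episode: apply repair operators until the schedule is a goal
    state, then generate regression examples backwards from the goal."""
    state = tuple(random.sample(range(4), 4))   # random starting schedule
    trace = []
    while not is_goal(state):
        a = epsilon_greedy(q_hat, state)
        nxt = apply_operator(state, a)
        trace.append((state, a, reward(nxt), nxt))
        state = nxt
    examples = []
    for s, a, r, nxt in reversed(trace):        # for j = i-1 to 0
        best_next = 0.0 if is_goal(nxt) else max(
            q_hat.predict(nxt, b) for b in repair_operators(nxt))
        examples.append((s, a, r + GAMMA * best_next))
    return examples

q_hat = TableQ()
for episode in range(200):
    q_hat.update(run_episode(q_hat))

# One swap away from the goal, so the learned value should approach 1.0.
print(q_hat.predict((1, 0, 2, 3), 0))

The point of the abstraction is that TableQ.update is the only piece the episode loop depends on: substituting TG, an instance-based learner, or a kernel-based regressor changes how Q̂ generalizes across structurally similar schedules without altering the loop itself.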