“…Equation (15) takes a form similar to Equation (11). Since we have already learned $Q(s_t, q_{1,t}, a_t)$ and $Q(s_t, q_{2,t}, a_t)$, and $Q_{q_{1,t} \wedge q_{2,t}}(s_t, q_t, a_t)$ is nonzero only when there are states $s_t$ where $D^{q_{1,t}}_{\phi_1} \wedge D^{q_{2,t}}_{\phi_2}$ is true, we should obtain a good initialization of $Q(s_t, q_t, a_t)$ by adding $Q(s_t, q_{1,t}, a_t)$ and $Q(s_t, q_{2,t}, a_t)$ (a similar technique is adopted by Haarnoja et al. [2018]). This addition of local Q functions is in fact an optimistic estimate of the global Q function; the properties of such Q-decomposition methods are studied by Russell and Zimdars [2003].…”
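
As a rough illustration of the additive initialization described in the passage, the following minimal sketch assumes tabular Q functions stored as Python dictionaries keyed by (state, automaton state, action). The tabular representation, the function name `init_composed_q`, and the encoding of the composed automaton state as a pair are assumptions made for illustration only; they are not taken from the source, which may well use function approximators instead.

```python
def init_composed_q(q1, q2):
    """Initialize a composed Q-table by summing two learned Q-tables.

    q1 maps (s, q1_state, a) -> value and q2 maps (s, q2_state, a) -> value.
    The result maps (s, (q1_state, q2_state), a) -> q1 value + q2 value.
    (Hypothetical helper; tabular storage is an assumption.)
    """
    composed = {}
    for (s1, qa, a1), v1 in q1.items():
        for (s2, qb, a2), v2 in q2.items():
            # Only combine entries that share the same state and action.
            if s1 == s2 and a1 == a2:
                composed[(s1, (qa, qb), a1)] = v1 + v2
    return composed


# Toy Q-tables for two sub-tasks over one state and two actions.
q1 = {("s0", 0, "left"): 1.0, ("s0", 0, "right"): 0.5}
q2 = {("s0", 0, "left"): 0.25, ("s0", 0, "right"): 2.0}

q_init = init_composed_q(q1, q2)
best_action = max(("left", "right"), key=lambda a: q_init[("s0", (0, 0), a)])
```

The summed table serves only as a starting point: per the passage it is an optimistic estimate of the global Q function, so it would still be refined by further learning on the composed task.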