The problem of finding a sublogarithmic-time optimal parallel algorithm for 3-colouring rooted forests has long been open. We settle this problem by obtaining an O((log log n) log*(log* n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM). Furthermore, we show that if f(n) is the running time of the best known algorithm for 3-colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, then a fractional independent set of the rooted forest can be found in O(f(n)) time with the same number of processors, on the same model. Using these results, we show that decomposable top-down algebraic computation and, hence, depth computation (ranking), 2-colouring and prefix summation on rooted forests can be done optimally in O(log n) time on a TOLERANT CRCW PRAM. These algorithms are obtained by proving a result of independent interest concerning the self-simulation property of TOLERANT: an N-processor TOLERANT CRCW PRAM that uses an address space of size only O(N) can be simulated on an n-processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n = O(N/log log N).
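To illustrate why a 3-colouring is a useful stepping stone towards an independent set, the following is a minimal sequential sketch in Python, not the parallel algorithm of the paper. It assumes the forest is given as a hypothetical `parent` array (with `None` marking a root) and that a proper 3-colouring is already available; it then takes the largest colour class, which is independent and contains at least one third of the nodes. The function names are illustrative, and the paper's precise definition of a fractional independent set may impose further conditions that this sketch does not model.

```python
from collections import Counter


def is_proper_colouring(parent, colour):
    """Return True if no node shares a colour with its parent."""
    return all(
        p is None or colour[v] != colour[p]
        for v, p in enumerate(parent)
    )


def largest_colour_class(parent, colour):
    """Return the nodes of the most frequent colour in a proper colouring."""
    if not is_proper_colouring(parent, colour):
        raise ValueError("colouring is not proper")
    counts = Counter(colour)
    best = max(counts, key=counts.get)
    return [v for v, c in enumerate(colour) if c == best]


def is_independent(parent, nodes):
    """Return True if no chosen node has its parent in the chosen set."""
    chosen = set(nodes)
    return all(parent[v] is None or parent[v] not in chosen for v in chosen)


# Hypothetical example: two rooted trees with roots 0 and 5.
parent = [None, 0, 0, 1, 1, None, 5, 6]
colour = [0, 1, 1, 2, 0, 0, 1, 2]  # assumed given, uses colours 0..2

selected = largest_colour_class(parent, colour)
```

With three colours, the pigeonhole principle guarantees that the selected class holds at least n/3 of the n nodes, which is the "constant fraction" that makes such a set useful for contraction-style computations.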