In this paper, we show that averaging Vector Recovery Index (VRI) scores over a test involving many images is inaccurate and introduces bias. We demonstrate that the larger the difference in primitive count between the data files in an experiment, the larger the bias in the calculated VRI. We propose normalizing VRI scores to remove this bias and to obtain scores that precisely reflect performance on the images under scrutiny. Empirical performance evaluation on three datasets from the arc segmentation contests attached to the International Workshops on Graphics Recognition 2005, 2009, and 2011 shows that the proposed normalized score provides more accurate and realistic performance results than the unweighted average of VRI scores. The results based on the modified VRI score show that the vectorisation methods perform worse than previously thought.
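The bias can be illustrated with a minimal sketch. The abstract does not state the normalization formula, so the sketch assumes, purely for illustration, that normalization amounts to weighting each image's VRI by its ground-truth primitive count. The function names and the two-image data are hypothetical and are not taken from the contest datasets.

```python
def unweighted_mean_vri(scores):
    """Plain average of per-image VRI scores (the approach shown to be biased)."""
    return sum(scores) / len(scores)


def count_weighted_mean_vri(scores, primitive_counts):
    """Average of per-image VRI scores weighted by primitive count.

    Assumption: this weighting stands in for the proposed normalization,
    whose exact definition is not given in the abstract.
    """
    total = sum(primitive_counts)
    return sum(s * c for s, c in zip(scores, primitive_counts)) / total


# Hypothetical experiment: a small image recovered well, a large one poorly.
vri_scores = [0.9, 0.5]
primitive_counts = [10, 190]

plain = unweighted_mean_vri(vri_scores)
weighted = count_weighted_mean_vri(vri_scores, primitive_counts)
print(f"unweighted: {plain:.2f}, count-weighted: {weighted:.2f}")
```

In this hypothetical case the unweighted average treats the 10-primitive image as equal in importance to the 190-primitive image, so it reports a noticeably higher score than the count-weighted figure. The gap shrinks to zero when all images contain the same number of primitives.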