In retinoblastoma, accurate segmentation of ocular structures and tumor tissue is important for personalized treatment. This retrospective study evaluates the performance of multi-view convolutional neural networks (MV-CNNs) for automated eye and tumor segmentation on MRI in retinoblastoma patients. Forty retinoblastoma eyes and 20 healthy eyes from 30 patients were included and divided into a train/test set (N = 29 retinoblastoma eyes, 17 healthy eyes) and an independent validation set (N = 11 retinoblastoma eyes, 3 healthy eyes). Imaging was performed at 3.0 T using Fast Imaging Employing Steady-state Acquisition (FIESTA), T2-weighted (T2) and contrast-enhanced T1-weighted (T1c) sequences. Sclera, vitreous humour, lens, retinal detachment and tumor were manually delineated on FIESTA images to serve as the reference standard. Volumetric and spatial performance were assessed using the intra-class correlation coefficient (ICC) and the Dice similarity coefficient (DSC), respectively. Additionally, the effects of multi-scale processing, choice of input sequences and data augmentation were explored. Optimal performance was obtained with a three-level pyramid MV-CNN using FIESTA, T2 and T1c sequences and data augmentation. Eye and tumor volumetric ICC were 0.997 and 0.996, respectively. Median [interquartile range] DSC for eye, sclera, vitreous humour, lens, retinal detachment and tumor were 0.965 [0.950–0.975], 0.847 [0.782–0.893], 0.975 [0.930–0.986], 0.909 [0.847–0.951], 0.828 [0.458–0.962] and 0.914 [0.852–0.958], respectively. MV-CNNs can be used to obtain accurate ocular structure and tumor segmentations in retinoblastoma.
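
For readers unfamiliar with the spatial overlap metric reported above, the DSC between a reference mask A and a predicted mask B is conventionally defined as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that definition, not the study's evaluation code. It assumes each segmentation is represented as a set of (z, y, x) voxel coordinates; the function name `dice_coefficient`, the toy masks, and the choice to score two empty masks as 1.0 are illustrative assumptions.

```python
from itertools import product


def dice_coefficient(reference_voxels, predicted_voxels):
    """Dice similarity coefficient between two binary segmentations.

    Each segmentation is given as an iterable of voxel coordinates
    (an assumed representation chosen for this sketch).
    """
    reference = set(reference_voxels)
    predicted = set(predicted_voxels)
    total = len(reference) + len(predicted)
    if total == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    overlap = len(reference & predicted)
    return 2.0 * overlap / total


# Toy example: a 2x2x2 reference cube and a 2x2x3 prediction that contains it.
reference = set(product(range(1, 3), range(1, 3), range(1, 3)))   # 8 voxels
prediction = set(product(range(1, 3), range(1, 3), range(1, 4)))  # 12 voxels

dsc = dice_coefficient(reference, prediction)
print(dsc)  # 2 * 8 / (8 + 12) = 0.8
```

Because the denominator sums both mask sizes, the same absolute boundary error lowers the DSC more for a small structure than for a large one, which is worth keeping in mind when comparing values across structures of very different volume.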