This work presents a three-step segmentation process based on Convolutional Neural Networks (CNNs). The task is to identify the different parts of shoes in Computed Tomography (CT) scans of boxed pairs of shoes. The first step uses a scaled-down volume image to separate the shoe material from its surroundings. The second step segments the shoe's inside volume, i.e. the space enclosed by shoe material. The third and last step splits the segmented shoe material into individual components: upper material, outer sole, and insole. The complete process employs CNNs derived from three-dimensional UNets. Residual SE UNet, Dense UNet, and Bottleneck Residual UNet are evaluated for the three steps, with the architectures modified for large receptive fields. The networks are trained and tested for each step separately and jointly on CT scans comprising various shoe types. The test results indicate that the process is promising for automated segmentation and extraction of meshes from large batches of CT scans. In particular, the first step using a Residual SE UNet achieves an F1-score of 88.2 % for shoes and 58.9 % for the packing material. The second step segments the inside volume with an F1-score of 81.0 %. The third step achieves F1-scores of 79.5 % for the insole, 88.7 % for the outer sole, and 81.3 % for the upper material.
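For readers unfamiliar with the reported metric, the following is a minimal sketch of how a per-class F1-score could be computed on segmented volumes. It assumes a voxel-wise evaluation on integer label volumes, which the abstract does not state explicitly; the function name `per_class_f1` and the label encoding are hypothetical and chosen only for illustration.

```python
import numpy as np


def per_class_f1(pred, target, num_classes):
    """Voxel-wise F1-score per class for two integer label volumes.

    Assumption: labels are integers in [0, num_classes), and both
    volumes have the same shape (e.g. depth x height x width).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    scores = []
    for c in range(num_classes):
        p = pred == c        # voxels predicted as class c
        t = target == c      # voxels labelled as class c
        tp = np.count_nonzero(p & t)
        fp = np.count_nonzero(p & ~t)
        fn = np.count_nonzero(~p & t)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); undefined if the class is absent
        # from both volumes, reported here as NaN.
        scores.append(2 * tp / denom if denom else float("nan"))
    return scores


# Tiny 1 x 2 x 2 example with three hypothetical classes.
pred = np.array([[[0, 1], [1, 2]]])
target = np.array([[[0, 1], [2, 2]]])
scores = per_class_f1(pred, target, num_classes=3)
```

For binary masks this voxel-wise F1 coincides with the Dice coefficient, which is why the two terms are often used interchangeably in segmentation work.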