Segmentation, or the classification of pixels (grid cells) in imagery, is ubiquitously applied in the natural sciences. Manual methods are often prohibitively time-consuming, especially for images containing small objects and/or significant spatial heterogeneity of colors or textures. Labeling complicated transition regions, which in Earth surface imagery are represented by collections of mixed pixels, textures, and spectral signatures, can be especially error-prone because such regions are difficult to unmix, identify, and delineate reliably and consistently. However, the success of supervised machine learning (ML) approaches depends entirely on good label data. We describe a fast, semi-automated method for interactive segmentation of N-dimensional (x,y,N) images into two-dimensional (x,y) label images. It uses human-in-the-loop ML to achieve consensus between the labeler and a model in an iterative workflow. The technique is reproducible: the sequence of decisions made by the human labeler and the ML algorithms can be encoded to file, so the entire process can be played back and new outputs generated with alternative decisions and/or algorithms. We illustrate the scientific potential of segmenting imagery from diverse settings and of diverse types using six case studies from river, estuarine, and open-coast environments. The imagery, both photographic and non-photographic, consists of 1- and 3-band images on regular and irregular grids ranging in scale from centimeters to tens of meters. We demonstrate high levels of agreement among label images generated by several labelers for the same imagery, and we suggest ways to achieve consensus and measure uncertainty, making the approach well suited to widespread use in training supervised ML models for image segmentation.