Background/aims Compared with commercial drugs, investigational drugs are subject to few labeling regulations. This leads to variability in label content and layout, which increases the risk of errors during storage, validation, compounding, dispensing and administration. The aim of this study was to evaluate the conformity and variability of investigational drug labels. Additional exploratory aims were to evaluate the use of an automated script to describe the labels and to identify the factors associated with the ease of finding a kit number.
Methods An 87-criterion list was developed to evaluate content, format and readability. It included eight criteria to evaluate conformity to the Canadian Food and Drugs Regulation. A systematic cross-sectional evaluation of all investigational drug labels in our 500-bed mother–child center was performed. All protocols active during the period of 14–22 February 2018 were included. Labels from drugs that were sourced locally were excluded. Labels affixed to the outside (external) and inside (internal) containers, as well as labels from American and European sponsors, were compared with the chi-square and Student’s t tests. A script was developed in Python to automatically determine key information (number of words, main colors and their proportion). A short survey was conducted with a convenience sample of pharmacists to rate the ease of finding the kit number on labels. The correlation between this rating and different label factors was evaluated.
Results A total of 27 protocols were included (24 internal and 34 external labels). The majority (33/34) of external labels were compliant with the Regulation. Some internal labels did not state the expiry date (9/13), the sponsor address (2/13) or storage conditions (1/13). Ten criteria differed between internal and external labels; for instance, the number of languages was higher on external labels (median 3 (2–14) vs 10 (2–50); p = 0.013). Five criteria differed depending on the sponsors’ location; for instance, European sponsors were more likely to use bold characters (25% vs 61%, p = 0.034). Labels contained a mean of 146 ± 111 words and 78.3% ± 7.3% empty space. These two measures were positively correlated (p < 0.001). The proportion of free space on a label was also correlated with the ease of finding the kit number (p = 0.002).
Conclusion We measured a high variability in the labeling of investigational drugs. Key information was missing from labels affixed to internal containers, despite the use of a high number of pages. The automated script worked well; further work is needed to identify criteria that may improve readability and reduce error risk. Detailed and harmonized international guidelines are needed.
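The abstract does not describe how the Python script computed its measures, so the following is only a minimal sketch of what such a label description could look like, not the authors' implementation. It assumes the label text has already been transcribed and the label image is available as a list of RGB pixels. The function names, the quantization into 4 bins per channel, and the toy data are all hypothetical.

```python
from collections import Counter


def count_words(label_text):
    """Count whitespace-separated tokens in the transcribed label text."""
    return len(label_text.split())


def main_colors(pixels, levels=4, top=3):
    """Return the most frequent quantized colors with their proportions.

    pixels: iterable of (r, g, b) tuples with channel values in 0-255.
    levels: number of bins per channel (assumed value, not from the study).
    top:    how many colors to report.
    """
    step = 256 // levels

    def quantize(color):
        # Map each channel to the midpoint of its bin so that
        # near-identical shades are counted as one color.
        return tuple((value // step) * step + step // 2 for value in color)

    counts = Counter(quantize(p) for p in pixels)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(color, n / total) for color, n in counts.most_common(top)]


# Toy example (hypothetical data): a mostly white label with some black text.
toy_text = "Kit No. 12345 Store at 2-8 C"
toy_pixels = [(255, 255, 255)] * 8 + [(0, 0, 0)] * 2

word_count = count_words(toy_text)
colors = main_colors(toy_pixels)
# Under the assumption that the dominant color is the background,
# its proportion approximates the share of empty space on the label.
empty_space = colors[0][1] if colors else 0.0
```

Treating the dominant color as background is a simplification. Multicolored labels would need a more careful definition of empty space.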