An increasingly large body of converging evidence supports the idea that the semantic system is distributed across brain areas and that the information encoded therein is multimodal. Within this framework, feature norms are typically used to operationalize the various parts of meaning that contribute to define the distributed nature of conceptual representations. However, such features are typically collected as verbal strings, elicited from participants in experimental settings. If the semantic system is not only distributed (across features) but also multimodal, a cognitively sound theory of semantic representations should take into account different modalities in which feature-based representations are generated, because not all the relevant semantic information may be easily verbalized into classic feature norms, and different types of concepts (e.g., abstract vs. concrete concepts) may consist of different configurations of non-verbal features.In this paper we acknowledge the multimodal nature of conceptual representations and we propose a novel way of collecting non-verbal semantic features. In a crowdsourcing task we asked participants to use emoji to provide semantic representations for a sample of 300 English nouns referring to abstract and concrete concepts, which account for (machine readable) visual features. In a formal content analysis with multiple annotators we then classified the cognitive strategies used by the participants to represent conceptual content through emoji.The main results of our analyses show that abstract (vs. concrete) concepts are characterized by representations that: 1. consist of a larger number of emoji; 2. include more face emoji (expressing emotions); 3. are less stable and less shared among users; 4. use representation strategies based on figurative operations (e.g., metaphors) and strategies that exploit linguistic information (e.g. rebus); 5. correlate less well with the semantic representations emerging from classic features listed through verbal strings.