Interactions with embodied conversational agents can be enhanced using human-like co-speech gestures. Traditionally, rule-based co-speech gesture mapping has been used for this purpose. However, creating such a mapping is laborious and typically requires human experts. Moreover, manually created mappings tend to be limited in coverage and are therefore prone to generating repetitive gestures. In this article, we present an approach that automates the generation of a rule-based co-speech gesture mapping from a publicly available large-scale video dataset without the intervention of human experts. At run time, word embeddings are used to search the rules so that the retrieved rule is semantically aware, meaningful, and accurate. The evaluation indicated that our method achieved performance comparable to the mapping created manually by human experts, while activating a greater variety of gestures. Moreover, synergy effects were observed in users' perception of the generated co-speech gestures when our mapping was combined with the manual one.
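To illustrate the run-time rule search described above, the following is a minimal sketch of embedding-based rule retrieval, assuming a pre-trained word-embedding lookup and a rule table keyed by trigger words. All names, vectors, and the similarity threshold are hypothetical and stand in for the paper's actual mined rules and embeddings.

```python
import numpy as np

# Hypothetical rule table: trigger word -> gesture clip identifier.
# In the described pipeline these rules would be mined automatically from
# video; here they are hard-coded toy examples.
RULES = {
    "big": "gesture_wide_arms",
    "small": "gesture_pinch",
    "hello": "gesture_wave",
}

# Toy word-embedding lookup; a real system would use pre-trained vectors
# (e.g., word2vec or GloVe). The values below are made up for illustration.
EMBEDDINGS = {
    "big": np.array([0.9, 0.1, 0.0]),
    "huge": np.array([0.85, 0.15, 0.05]),
    "small": np.array([0.1, 0.9, 0.0]),
    "tiny": np.array([0.12, 0.88, 0.02]),
    "hello": np.array([0.0, 0.1, 0.95]),
    "hi": np.array([0.05, 0.12, 0.9]),
}

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_rule(word, threshold=0.8):
    """Return the gesture whose trigger word is semantically closest to `word`.

    Exact matches are used directly; otherwise the nearest trigger in
    embedding space is chosen, provided its cosine similarity exceeds
    `threshold` (a hypothetical cutoff).
    """
    if word in RULES:
        return RULES[word]
    if word not in EMBEDDINGS:
        return None
    query = EMBEDDINGS[word]
    best_trigger, best_sim = None, -1.0
    for trigger in RULES:
        sim = cosine(query, EMBEDDINGS[trigger])
        if sim > best_sim:
            best_trigger, best_sim = trigger, sim
    return RULES[best_trigger] if best_sim >= threshold else None

print(find_rule("huge"))  # -> "gesture_wide_arms", matched via embedding similarity
print(find_rule("hi"))    # -> "gesture_wave"
```

The point of the embedding step is that words absent from the mined rule table (e.g., "huge") can still activate a semantically appropriate gesture, which is what allows the automatic mapping to trigger a wider variety of gestures than an exact-match lookup would.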