The study of natural language using a network approach has made it possible to characterize novel properties ranging from the level of individual words to phrases or sentences. A natural way to quantitatively evaluate similarities and differences between spoken and written language is by means of a multiplex network defined in terms of a similarity distance between words. Here, we use a multiplex representation of words based on orthographic or phonological similarity to evaluate their structure. We report that from the analysis of topological properties of networks, there are different levels of local and global similarity when comparing written vs. spoken structure across 12 natural languages from 4 language families. In particular, it is found that differences between the phonetic and written layers is markedly higher for French and English, while for the other languages analyzed, this separation is relatively smaller. We conclude that the multiplex approach allows us to explore additional properties of the interaction between spoken and written language.