The vertebrate inner ear is composed of multiple sensory receptor epithelia, each of which is specialized for detection of sound, gravity or angular acceleration. Each receptor epithelium contains mechanosensitive hair cells, which are connected to the brainstem by bipolar sensory neurons. Hair cells and their associated neurons are derived from the embryonic rudiment of the inner ear epithelium, but the precise spatial and temporal patterns of their generation, as well as the signals that coordinate these events, have only recently begun to be understood. Gene expression, lineage tracing, and mutant analyses suggest that both neurons and hair cells are generated from a common domain of neural and sensory competence in the embryonic inner ear rudiment. Members of the Shh, Wnt and FGF families, together with retinoic acid signals, regulate transcription factor genes within the inner ear rudiment to establish the axial identity of the ear and regionalize neurogenic activity. Close-range signaling, such as that of the Notch pathway, specifies the fate of sensory regions and individual cell types. We also describe positive and negative interactions between basic helix-loop-helix and SoxB family transcription factors that specify either neuronal or sensory fates in a context-dependent manner. Finally, we review recent work on inner ear development in zebrafish, which demonstrates that the relative timing of neurogenesis and sensory epithelial formation is not phylogenetically constrained.