We present TransitLabel, a crowd-sensing system for automatic enrichment of transit stations indoor floorplans with different semantics like ticket vending machines, entrance gates, drink vending machines, platforms, cars' waiting lines, restrooms, lockers, waiting (sitting) areas, among others. Our key observations show that certain passengers' activities (e.g., purchasing tickets, crossing entrance gates, etc) present identifiable signatures on one or more cell-phone sensors. TransitLabel leverages this fact to automatically and unobtrusively recognize different passengers' activities, which in turn are mined to infer their uniquely associated stations semantics. Furthermore, the locations of the discovered semantics are automatically estimated from the inaccurate passengers' positions when these semantics are identified. We evaluate TransitLabel through a field experiment in eight different train stations in Japan. Our results show that TransitLabel can detect the fine-grained stations semantics accurately with 7.7% false positive rate and 7.5% false negative rate on average. In addition, it can consistently detect the location of discovered semantics accurately, achieving an error within 2.5m on average for all semantics. Finally, we show that TransitLabel has a small energy footprint on cell-phones, could be generalized to other stations, and is robust to different phone placements; highlighting its promise as a ubiquitous indoor maps enriching service.