Human perceptions of speaker characteristics, which are required for automatically predicting such characteristics from speech features, have generally been collected through demanding in-lab listening tests under controlled conditions. Meanwhile, crowdsourcing has emerged as a valuable approach for running user studies based on surveys or quantitative ratings. Micro-task crowdsourcing markets enable the completion of small tasks (typically lasting seconds to minutes), rewarding workers with micro-payments. This paradigm allows user input to be collected from a large and diverse pool of participants at low cost and with little effort. This paper presents several auditory tests for collecting perceptual voice likability ratings using a common set of 30 male and female voices. The tests are based on direct scaling and on paired comparisons, and were conducted both in the laboratory and via micro-task crowdsourcing. Design considerations are proposed for adapting the laboratory listening tests to a mobile crowdsourcing platform so that trustworthy listener responses are obtained. The likability scores obtained with the different test approaches are highly correlated. This outcome motivates the use of crowdsourcing for future listening tests addressing, e.g., speaker characterization, reducing the effort of recruiting participants and administering tests on-site.