A new, efficient implementation of the multithreaded quicksort algorithm, called CPP11sort, is presented. It is built exclusively upon the threading primitives of the C++ programming language itself. The performance of CPP11sort is evaluated and compared with that of its mainstream competitors provided by GNU, Intel, and Microsoft. Among the considered implementations, CPP11sort yields the shortest sorting times in most cases and is the only one that is portable to any conforming C++ implementation without requiring external libraries or nonstandard compiler extensions. In the experimental evaluation with various input data distributions, the parallel speedup ranged from 16.1 to 44.2 on a 56-core server and from 6.8 to 14.5 on a 10-core workstation with hyperthreading enabled.
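
To illustrate the general idea of a multithreaded quicksort that relies only on standard C++11 facilities, a minimal sketch follows. It is not the CPP11sort implementation, whose internals are not described here. The function name `parallel_quicksort_sketch`, the cutoff `kSequentialCutoff`, the default depth budget of 4, the median-of-three pivot, and the use of `std::async` are all assumptions chosen for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <iterator>
#include <vector>

// Hypothetical cutoff: ranges shorter than this are sorted sequentially.
constexpr std::ptrdiff_t kSequentialCutoff = 1 << 14;

// Illustrative sketch only (not CPP11sort). depth_budget limits how many
// levels of recursion may spawn an additional thread.
template <typename RandomIt>
void parallel_quicksort_sketch(RandomIt first, RandomIt last, int depth_budget) {
    assert(depth_budget >= 0);
    using T = typename std::iterator_traits<RandomIt>::value_type;

    const auto n = std::distance(first, last);
    if (n < kSequentialCutoff || depth_budget == 0) {
        std::sort(first, last);  // sequential fallback
        return;
    }

    // Median-of-three pivot, copied by value (assumes T is copyable).
    const T a = *first;
    const T b = *(first + n / 2);
    const T c = *(last - 1);
    const T pivot = std::max(std::min(a, b), std::min(std::max(a, b), c));

    // Three-way split: [first, mid1) < pivot, [mid1, mid2) == pivot,
    // [mid2, last) > pivot. The middle part is never empty, so both
    // recursive calls operate on strictly smaller ranges.
    RandomIt mid1 = std::partition(first, last,
                                   [&pivot](const T& x) { return x < pivot; });
    RandomIt mid2 = std::partition(mid1, last,
                                   [&pivot](const T& x) { return !(pivot < x); });

    // Sort the left part in a new thread and the right part in this one.
    auto left = std::async(std::launch::async, [first, mid1, depth_budget] {
        parallel_quicksort_sketch(first, mid1, depth_budget - 1);
    });
    parallel_quicksort_sketch(mid2, last, depth_budget - 1);
    left.get();  // wait for the left part and propagate any exception
}

// Convenience overload with an assumed default depth budget of 4,
// which allows at most 16 leaf ranges to be sorted concurrently.
template <typename RandomIt>
void parallel_quicksort_sketch(RandomIt first, RandomIt last) {
    parallel_quicksort_sketch(first, last, 4);
}
```

A call such as `parallel_quicksort_sketch(v.begin(), v.end())` on a `std::vector<int> v` sorts it in ascending order. In this sketch only the recursive sorting runs in parallel, while each partitioning step is sequential, so the speedup it can reach is limited; a tuned implementation would typically address this.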