Field-programmable gate arrays (FPGAs) have garnered significant interest in high-performance computing research because their flexibility makes it possible to build application-specific computation pipelines and data supply systems. In addition, FPGA vendors now offer OpenCL-based development toolchains that reduce the programming effort required. However, the high level of abstraction in OpenCL-based development makes fine-grained performance tuning difficult. In this paper, we present a methodology that both reduces FPGA programming cost and delivers high performance. We focus on data sorting, a basic computational operation, and introduce a sorting library that can be used with the OpenCL programming model for FPGAs. Our sorting library has so far supported only integer data; in this paper, we propose a new method that also supports floating-point data. The proposed method consumes at least twice as many hardware resources as a baseline merge sort restructured for the OpenCL programming model for FPGAs. However, its operating frequency is 1.08x higher, and its sorting throughput is three orders of magnitude greater than that of the baseline. The source code of our sorting library is open source and is available to application developers around the world.