Privacy preservation is a sensitive issue in our modern society. It is becoming increasingly important in many applications in this ever-growing and highly connected digital era. Functional encryption is a computation on encrypted data paradigm that allows users to retrieve the evaluation of a function on encrypted data without revealing the data, thus effectively protecting users' privacy. However, existing functional encryption implementations are still very time-consuming for practical deployment, especially when applied to machine learning applications that involve huge amount of data. In this paper, we present a high performance implementation of inner-product functional encryption (IPFE) based on ring-learning with errors on graphics processing units. We propose novel techniques to parallelize the Gaussian sampling, which is one of the most time-consuming operations in the IPFE scheme. We further execute a systematic investigation to select the best strategy for implementing number theoretic transform and inverse number theoretic transform for different security levels. Compared to the existing AVX2 implementation of IPFE, our implementation on a RTX 2060 GPU device can achieve 34.24×, 40.02×, 156.30× and 18.76× speed-up for Setup, Encrypt, KeyGen and Decrypt respectively. Finally, we propose a fast privacy-preserving Support Vector Machine (SVM) application to classify data securely using our GPU-accelerated IPFE scheme. Experimental results show that our implementation can classify 100 inputs with 591 support vectors in 688 ms (less than a second), which is 33.12× faster than the AVX2 version that takes 23 seconds.