Monte Carlo (MC) simulation is considered the most accurate method for calculating absorbed dose and the fundamental physics quantities related to biological effects in carbon ion therapy. To improve its computational efficiency, we have developed a GPU-oriented fast MC package for carbon ion therapy, named goCMC. goCMC simulates particle transport in voxelized geometry with kinetic energy up to 450 MeV/u. A class II condensed history simulation scheme with a continuous slowing down approximation was employed. Energy straggling and multiple scattering were modeled. δ-electrons were terminated with their energy deposited locally. Four types of nuclear interactions were implemented in goCMC: carbon-hydrogen, carbon-carbon, carbon-oxygen and carbon-calcium inelastic collisions. Total cross section data from Geant4 were used. Secondary particles produced in these interactions were sampled according to particle yield, with energy and directional distribution data derived from Geant4 simulation results. Secondary charged particles were transported following the condensed history scheme, whereas secondary neutral particles were ignored. goCMC was developed under the OpenCL framework and is executable on different platforms, e.g. GPU and multi-core CPU. We have validated goCMC against Geant4 in cases with different beam energies and phantoms, including four homogeneous phantoms, one heterogeneous half-slab phantom, and one patient case. For each case, 3 × 10⁷ carbon ions were simulated, such that in the region with dose greater than 10% of the maximum dose, the mean relative statistical uncertainty was less than 1%. Good agreement between goCMC and Geant4 was observed for dose distributions and range estimations. 3D gamma passing rates with the 1%/1 mm criterion were over 90% within the 10% isodose line except in two extreme cases, and those with the 2%/1 mm criterion were all over 96%. Efficiency and code portability were tested with different GPUs and CPUs.
Depending on the beam energy and voxel size, the computation time to simulate 10⁷ carbon ions was 9.9–125 s, 2.5–50 s and 60–612 s on an AMD Radeon GPU card, an NVIDIA GeForce GTX 1080 GPU card and an Intel Xeon E5-2640 CPU, respectively. The combined accuracy, efficiency and portability make goCMC attractive for research and clinical applications in carbon ion therapy.
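To illustrate the gamma passing rate used as the agreement metric above, the following is a minimal one-dimensional sketch, not the evaluation code used in this work. It assumes global dose normalization (the dose tolerance is a fraction of the maximum reference dose), a search restricted to grid points with no interpolation, and a low-dose cutoff at 10% of the maximum reference dose. The function name `gamma_pass_rate` and all of its parameters are hypothetical.

```python
import math


def gamma_pass_rate(ref, test, spacing_mm, dose_tol=0.01, dist_tol_mm=1.0,
                    threshold=0.10):
    """Fraction of evaluated reference points with gamma <= 1 (1D sketch).

    Assumptions (illustrative only): global normalization to max(ref),
    no sub-voxel interpolation, points below threshold * max(ref) skipped.
    """
    d_max = max(ref)
    dose_crit = dose_tol * d_max  # absolute dose tolerance (global criterion)
    # Limit the search window to a few distance tolerances around each point.
    search = int(math.ceil(3 * dist_tol_mm / spacing_mm))
    passed = evaluated = 0
    for i, d_ref in enumerate(ref):
        if d_ref < threshold * d_max:
            continue  # outside the low-dose cutoff region
        evaluated += 1
        best = math.inf
        lo = max(0, i - search)
        hi = min(len(test), i + search + 1)
        for j in range(lo, hi):
            dist = (j - i) * spacing_mm
            diff = test[j] - d_ref
            # Squared gamma: normalized distance and dose difference in quadrature.
            g2 = (dist / dist_tol_mm) ** 2 + (diff / dose_crit) ** 2
            best = min(best, g2)
        if best <= 1.0:
            passed += 1
    return passed / evaluated if evaluated else float("nan")


# Toy depth-dose profiles on a 1 mm grid; the test profile differs by 10% at the peak.
reference = [0.0, 10.0, 50.0, 100.0, 20.0]
evaluated_dose = [0.0, 10.0, 50.0, 110.0, 20.0]
rate = gamma_pass_rate(reference, evaluated_dose, spacing_mm=1.0)
```

A 3D evaluation follows the same idea, with the search extended over neighboring voxels in all three directions and the distance term computed from the 3D voxel offset.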