Calotropis procera R. Br., a traditional medicinal plant in India, is a promising source of commercial proteases because its cysteine proteases exhibit high thermostability, broad pH optima, and plasma-clotting activity. Although several of these proteases, including Procerain, Procerain B, CpCp-1, CpCp-2, and CpCp-3, have been isolated and characterized, information on their transcripts is limited to cDNAs encoding the mature peptides. To determine the cDNA sequences encoding the full open reading frames of these cysteine proteases, transcripts were sequenced on an Illumina HiSeq 2000 sequencer. A total of 171,253,393 clean reads were assembled into 106,093 contigs (average length 1,614 bp; N50 2,703 bp) with Trinity and into 70,797 contigs (average length 1,565 bp; N50 2,082 bp) with Velvet-Oases. Among these contigs, BLASTX analysis against the NCBI non-redundant protein database identified 20 unigenes related to papain-like cysteine proteases. Our expression analysis revealed that the cysteine protease contains an N-terminal pro-peptide domain (inhibitor region), which is necessary for correct folding and proteolytic activity. Expression yields in Escherichia coli using an inducible T7 expression system were considerably higher with the pro-peptide domain than without it, a finding that could facilitate molecular cloning of the Calotropis procera protease in an active, correctly folded form.
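For readers unfamiliar with the assembly statistics quoted above, the following minimal sketch shows how an average contig length and an N50 value are conventionally computed from a list of contig lengths. It illustrates the standard definition only and is not the procedure used in this study; the function names `n50` and `mean_length` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean contig length."""
    if not lengths:
        raise ValueError("need at least one contig length")
    return sum(lengths) / len(lengths)


# Toy contig lengths in bp (hypothetical, not data from this study).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_contigs)            # half of the 32 total bases is reached at length 8
toy_mean = mean_length(toy_contigs)   # 32 / 8 = 4.0
```

Because N50 weights long contigs more heavily than the mean does, an assembly can have an N50 well above its average contig length, as in both assemblies reported here.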