In traditional NLP, tokenization is performed as a preprocessing step, and is therefore independent of the downstream task. To address this issue, we propose a novel method that explores an appropriate tokenization for the downstream task. Our proposed method, Optimizing Tokenization (OpTok), is trained to assign high probability to such appropriate tokenizations based on the downstream task loss. OpTok can be applied to any downstream task that uses a sentence vector representation, such as text classification. Experimental results demonstrate that OpTok improves the performance of sentiment analysis, genre prediction, rating prediction, and textual entailment. The results also show that the proposed method is applicable to Chinese, Japanese, and English. In addition, we incorporate OpTok into BERT, a state-of-the-art contextualized embedding model, and report a positive effect on performance.
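To illustrate the general idea of weighting candidate tokenizations by the downstream loss, the following is a minimal sketch, not the authors' exact formulation: it scores the N-best tokenizations of a sentence, mixes their sentence vectors with the softmaxed scores, and lets the classification loss update the scorer end-to-end. All names here (WeightedTokenizationClassifier, encode_candidate, the mean-pooling encoder, the linear scorer) are hypothetical stand-ins, not components of OpTok.

    # Minimal sketch: weight N-best tokenizations by a learned score and train
    # the scorer jointly with the downstream classifier (hypothetical code).
    import torch
    import torch.nn as nn

    class WeightedTokenizationClassifier(nn.Module):
        def __init__(self, vocab_size, emb_dim, num_classes):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)     # subword embeddings
            self.score = nn.Linear(emb_dim, 1)                 # scores one candidate tokenization
            self.classifier = nn.Linear(emb_dim, num_classes)  # downstream classifier

        def encode_candidate(self, token_ids):
            # Mean-pooled bag-of-subwords vector for one candidate tokenization.
            return self.embed(token_ids).mean(dim=0)

        def forward(self, candidates):
            # candidates: list of LongTensors, the N-best tokenizations of one sentence.
            vecs = torch.stack([self.encode_candidate(c) for c in candidates])   # (N, emb_dim)
            weights = torch.softmax(self.score(vecs).squeeze(-1), dim=0)         # (N,)
            sentence_vec = (weights.unsqueeze(-1) * vecs).sum(dim=0)             # weighted mixture
            return self.classifier(sentence_vec), weights

    # Usage: the task loss flows back into the scorer, so tokenizations that help
    # the downstream task receive higher weight as training proceeds.
    model = WeightedTokenizationClassifier(vocab_size=8000, emb_dim=64, num_classes=2)
    candidates = [torch.tensor([3, 41, 7]), torch.tensor([3, 512])]  # two candidate segmentations
    logits, weights = model(candidates)
    loss = nn.functional.cross_entropy(logits.unsqueeze(0), torch.tensor([1]))
    loss.backward()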