With the wide application of facial manipulation technology, fake images and videos have become a major public concern. Although existing face forgery detection methods achieve fairly good results on public datasets, most of them perform poorly when the fake images or videos are compressed, as usually happens on social networks. To tackle this issue, this paper proposes a self-supervised decoupling network (SSDN) that incorporates compression irrelevance. The proposed model learns two separate feature representations for a suspect video, i.e., authenticity and compression. A joint self-supervised strategy is then used for feature decoupling: similarity decoupling is carried out by similarity learning on the authenticity features, whereas for adversarial decoupling the SSDN model is trained in an adversarial manner for robust feature learning. Experimental results show that SSDN outperforms state-of-the-art deepfake detection methods under compression attacks on public datasets, e.g., FaceForensics++.
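The two decoupling objectives can be illustrated with a minimal sketch. The abstract does not give the loss functions, so everything below is an assumption made for illustration: the function names, the cosine-based similarity term, and the uniform-target adversarial term are hypothetical and are not the authors' formulation. The sketch assumes that the authenticity features of a video and of its compressed copy should agree, and that a compression classifier reading the authenticity features should be left unable to tell compression levels apart.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_loss(auth_raw, auth_compressed):
    """Assumed similarity-decoupling term (hypothetical).

    Pulls the authenticity features of a raw video and of its
    compressed copy together: 0 when they point the same way,
    1 when they are orthogonal.
    """
    return 1.0 - cosine_similarity(auth_raw, auth_compressed)


def adversarial_loss(compression_probs):
    """Assumed adversarial-decoupling term (hypothetical).

    Cross-entropy between a uniform target and the compression-level
    probabilities that a classifier predicts from the authenticity
    features. It is smallest when the prediction is uniform, i.e. when
    the authenticity features carry no compression cue.
    """
    k = len(compression_probs)
    return -sum(math.log(p) for p in compression_probs) / k


if __name__ == "__main__":
    # Toy feature vectors standing in for encoder outputs.
    print(similarity_loss([1.0, 0.0], [2.0, 0.0]))
    print(similarity_loss([1.0, 0.0], [0.0, 1.0]))
    # A confident compression prediction is penalised more than a uniform one.
    print(adversarial_loss([0.9, 0.1]), adversarial_loss([0.5, 0.5]))
```

In an actual network these terms would be computed on encoder outputs and combined with a real/fake classification loss; how they are weighted is not stated in the abstract.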