Background: Schizophrenia is a chronic and severe mental disorder that substantially impairs patients' daily life and work. In clinical practice, schizophrenia with negative symptoms is often misdiagnosed and inadequately treated, and its diagnosis depends heavily on the clinician's experience. An objective and effective method for diagnosing schizophrenia with negative symptoms is therefore urgently needed. Recent studies have shown that impaired speech can serve as an indicator for diagnosing schizophrenia. Previous work on schizophrenia speech detection has relied mainly on feature engineering, in which effective feature extraction is difficult because of the variability of speech signals. Methods: In this work, we design a novel deep learning architecture based on a convolutional neural network, termed Sch-net, for end-to-end schizophrenia speech detection. It avoids manual feature extraction and combines the advantages of skip connections and an attention mechanism to distinguish patients with schizophrenia from healthy controls. Results: We validate Sch-net through ablation experiments on a schizophrenia speech dataset containing 28 patients with schizophrenia and 28 healthy controls. We also compare it with models based on feature engineering and with classic deep neural networks. The experimental results show that Sch-net performs well on the schizophrenia speech detection task, achieving 97.76% accuracy on the schizophrenia speech dataset. To further assess the generalization of the model, Sch-net is also tested on the open-access LANNA children's speech database for specific language impairment detection. Our code is available at https://github.com/Scu-sen/Sch-net. Conclusions: Extensive experiments show that the proposed Sch-net can provide auxiliary information for diagnosing schizophrenia and specific language impairment from speech.
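
For readers unfamiliar with the two building blocks named in the Methods, the following is a minimal sketch of how a skip connection and an attention mechanism can be combined in a single 1D convolutional block. It is not the Sch-net implementation (see the repository linked above for that): the abstract does not specify the attention type, so a squeeze-and-excitation-style channel gate is assumed here, and all function names, shapes, and weights are hypothetical and chosen only for illustration.

```python
import math
import random


def conv1d_same(x, kernels, bias):
    """1D convolution with zero padding so the output keeps length T.

    x:       [C_in][T] input channels
    kernels: [C_out][C_in][K] weights, K assumed odd
    bias:    [C_out]
    """
    t = len(x[0])
    k = len(kernels[0][0])
    pad = k // 2
    out = []
    for o, kern in enumerate(kernels):
        row = []
        for i in range(t):
            acc = bias[o]
            for c, channel in enumerate(x):
                for j in range(k):
                    idx = i + j - pad
                    if 0 <= idx < t:  # samples outside the signal count as zero
                        acc += kern[c][j] * channel[idx]
            row.append(acc)
        out.append(row)
    return out


def relu(x):
    return [[max(0.0, v) for v in channel] for channel in x]


def sigmoid(z):
    # Numerically stable form for large negative inputs.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def channel_attention(x, w1, w2):
    """Assumed squeeze-and-excitation-style gate: one weight in (0, 1) per channel.

    w1: [H][C] bottleneck weights, w2: [C][H] expansion weights.
    """
    squeezed = [sum(channel) / len(channel) for channel in x]  # global average pool
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]


def residual_attention_block(x, kernels, bias, w1, w2):
    """Convolution -> ReLU -> channel gate, then add the input back (skip connection).

    Requires C_out == C_in so the element-wise addition is defined.
    """
    features = relu(conv1d_same(x, kernels, bias))
    gates = channel_attention(features, w1, w2)
    return [
        [xi + g * fi for xi, fi in zip(x_channel, f_channel)]
        for x_channel, f_channel, g in zip(x, features, gates)
    ]


# Toy usage with random weights (hypothetical sizes, not those of Sch-net).
random.seed(0)
C, T, K, H = 4, 16, 3, 2
x = [[random.gauss(0, 1) for _ in range(T)] for _ in range(C)]
kernels = [[[random.gauss(0, 0.1) for _ in range(K)] for _ in range(C)] for _ in range(C)]
bias = [0.0] * C
w1 = [[random.gauss(0, 0.1) for _ in range(C)] for _ in range(H)]
w2 = [[random.gauss(0, 0.1) for _ in range(H)] for _ in range(C)]

y = residual_attention_block(x, kernels, bias, w1, w2)
print(len(y), len(y[0]))  # output keeps the input shape: 4 16
```

In this sketch the skip connection means that when the convolutional branch contributes nothing (for example, all-zero kernels), the block reduces to the identity, while the gate lets the block rescale each channel's contribution separately.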