Deep learning models have shown great promise in predicting genome-wide regulatory effects from DNA sequence, but their informativeness for human complex diseases and traits is not fully understood. Here, we evaluate the disease informativeness of two types of deep learning annotations: (1) variant-level annotations (based on the reference allele), assessing whether they are more informative for complex disease than the underlying experimental data used to train the predictive models; and (2) allelic-effect annotations (absolute value of the predicted difference between reference and variant alleles), which have been a major focus of recent work. In each case, we primarily consider annotations constructed using two previously trained deep learning models, DeepSEA and Basenji. We apply stratified LD score regression (S-LDSC) to 41 independent diseases and complex traits (average N = 320K) to evaluate each annotation's informativeness for disease heritability conditional on a broad set of coding, conserved, regulatory and LD-related annotations from the baseline-LD model and other sources; as a secondary metric, we also evaluate the accuracy of models that incorporate deep learning annotations in predicting disease-associated or fine-mapped SNPs. We aggregated annotations across all tissues (resp. blood cell types or brain tissues) in meta-analyses across all 41 traits (resp. 11 blood-related traits or 8 brain-related traits). Variant-level annotations, despite being highly enriched for disease heritability, produced no conditionally significant results in meta-analyses across all 41 traits or 11 blood-related traits, but brain-specific DeepSEA-H3K4me3 and Basenji-H3K27ac annotations were conditionally significant in meta-analyses across 8 brain-related traits; a sequence motif analysis suggests that these annotations could be capturing unique information about nucleosome occupancy.
Allelic-effect annotations were also highly enriched for disease heritability, and produced conditionally significant results for Basenji-H3K4me3 in meta-analyses across all 41 traits and brain-specific Basenji-H3K4me3 in meta-analyses across 8 brain-related traits. We conclude that deep learning models are informative for disease, but their informativeness cannot be inferred from metrics based on their accuracy in predicting regulatory annotations.
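
To make the two annotation types defined above concrete, the following is a minimal sketch, not the pipeline used in this work. It assumes that a trained model (such as DeepSEA or Basenji) has already produced one predicted value per SNP for the reference allele and one for the variant allele, for a single chromatin feature. The function names and the example values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: function names and input values are hypothetical.
# Each list holds one predicted value per SNP for a single chromatin feature,
# assumed to come from a previously trained sequence-based model.

def variant_level_annotation(ref_pred):
    """Variant-level annotation: the prediction for the reference allele."""
    return list(ref_pred)


def allelic_effect_annotation(ref_pred, alt_pred):
    """Allelic-effect annotation: |prediction(ref) - prediction(variant)|."""
    if len(ref_pred) != len(alt_pred):
        raise ValueError("ref_pred and alt_pred must have one value per SNP")
    return [abs(r - a) for r, a in zip(ref_pred, alt_pred)]


# Hypothetical predictions for three SNPs.
ref_pred = [0.75, 0.125, 0.5]
alt_pred = [0.25, 0.125, 0.75]

variant_level = variant_level_annotation(ref_pred)
allelic_effect = allelic_effect_annotation(ref_pred, alt_pred)
print(variant_level)   # [0.75, 0.125, 0.5]
print(allelic_effect)  # [0.5, 0.0, 0.25]
```

In this sketch the second SNP has a nonzero variant-level value but an allelic-effect value of zero, which illustrates that the two annotation types can carry different information about the same SNP.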