Disease-symptom relationships are of primary importance for biomedical informatics, but the databases that catalog them are incomplete compared with the knowledge available in the scientific literature. In this paper we propose a novel method for automatically extracting disease-symptom relationships from text, called SPARE (Syntactic PAttern for Relationship Extraction). The method consists of three successive steps: first, we learn patterns from dependency graphs; second, we select the best patterns based on their quality and their specificity (their ability to identify only disease-symptom relationships); finally, the selected patterns are applied to new texts to extract disease-symptom relationships. We evaluated SPARE on a corpus of 121,796 PubMed abstracts related to 457 rare diseases. The quality of the extraction was evaluated as a function of the pattern quality and specificity thresholds. The best F-measure obtained is 55.65% (for specificity ≥ 0.5 and quality ≥ 0.5). To gain insight into the novelty of the extracted disease-symptom relationships, we compare our results with the content of phenotype databases (OrphaData and OMIM). Our results show the feasibility of automatically extracting disease-symptom relationships, including true relationships that were not already referenced in phenotype databases and that may involve complex symptom descriptions.
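The selection step and the reported metric can be illustrated with a minimal sketch. It is not the SPARE implementation: the `Pattern` class, the pattern names, and their scores are hypothetical, and the abstract does not say how quality and specificity are computed, so they are taken here as precomputed numbers in [0, 1]. The F-measure is assumed to be the standard balanced F1 (harmonic mean of precision and recall).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """A learned syntactic pattern with precomputed scores (hypothetical)."""
    name: str
    quality: float
    specificity: float


def select_patterns(patterns, min_quality=0.5, min_specificity=0.5):
    """Keep only the patterns that reach both thresholds."""
    return [
        p for p in patterns
        if p.quality >= min_quality and p.specificity >= min_specificity
    ]


def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1) from extraction counts."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative candidates only; these names and scores are made up.
candidates = [
    Pattern("disease -nsubj-> cause -dobj-> symptom", 0.8, 0.7),
    Pattern("symptom -prep_in-> disease", 0.6, 0.3),
    Pattern("disease -conj-> symptom", 0.4, 0.9),
]
selected = select_patterns(candidates)
```

With the thresholds from the abstract (quality ≥ 0.5 and specificity ≥ 0.5), only the first hypothetical pattern is kept. Raising either threshold would typically trade recall for precision, which is why the extraction quality is reported as a function of both.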
This article deals with second-order elliptic operators $P$ whose coefficients are regarded as random variables. Its aim is to obtain estimates of their solutions that are polynomial in the coefficients. These estimates are useful for uncertainty quantification. In this article, we treat the case in which the boundary and the interface are parallel to the hyperplanes $\{x_m = 0\} \subset \mathbb{R}^m$.
Let $P : \mathcal{C}^\infty(M; E) \to \mathcal{C}^\infty(M; F)$ be an order $\mu$ differential operator with coefficients $a$ and $P_k := P : H^{s_0 + k +\mu}(M; E) \to H^{s_0 + k}(M; F)$. We prove polynomial norm estimates for the solution $P_0^{-1}f$ of the form $$\|P_0^{-1}f\|_{H^{s_0 + k + \mu}(M; E)} \le C \sum_{q=0}^{k} \, \| P_0^{-1} \|^{q+1} \,\|a \|_{W^{|s_0|+k}}^{q} \, \| f \|_{H^{s_0 + k - q}},$$ that is, estimates in higher order Sobolev spaces, which also amounts to a parametric regularity result. The assumptions are that $E, F \to M$ are Hermitian vector bundles and that $M$ is a complete manifold satisfying the Fréchet Finiteness Condition (FFC), which was introduced in (Kohr and Nistor, Annals of Global Analysis and Geometry, 2022). These estimates are useful for uncertainty quantification, since the coefficient $a$ can be regarded as a vector valued random variable. We use these results to prove integrability of the norm $\|P_k^{-1}f\|$ of the solution of $P_k u = f$ with respect to suitable Gaussian measures.