The Process Safety Management System (PSMS) of an industrial asset relies on multiple independent barriers to prevent major accidents and/or mitigate their consequences for people, the environment, the asset and company reputation. It is therefore fundamental to assess the performance of the barriers with respect to the occurrence of Process Safety Events (PSEs), i.e. unplanned or uncontrolled events during which a Loss Of Primary Containment (LOPC) of any material, including non-toxic and non-flammable material, occurs. An essential aspect of a PSMS is learning from incidents and taking corrective actions to prevent their recurrence. To this end, a procedure for the timely and consistent reporting and investigation of PSEs is generally implemented. After the occurrence of a PSE, a report containing free-text and multiple-choice fields is filed to describe the PSE, its causes and consequences, and to quantify its level of severity with reference to predefined Tier levels, as per API RP 754 guidelines. This work investigates the possibility of text-mining an electronic repository of PSE reports in order to extract and structure knowledge on the performance of the PSMS. The methodology developed falls within the framework of Natural Language Processing (NLP) and combines Term Frequency-Inverse Document Frequency (TF-IDF) and Normalized Pointwise Mutual Information (NPMI) for the automatic extraction of keywords from the PSE reports. A taxonomy is then built to organize the vocabulary in a top-down structure of homogeneous categories, such that semantic and functional relations between and within them can be defined. Based on these relations, a Bayesian Network (BN) is developed for modeling the consequences of PSEs. The proposed methodology is applied to a repository of real reports concerning the PSEs of hydrocarbon facilities of an Oil and Gas (O&G) company.
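
To make the two keyword-extraction measures concrete, the following is a minimal sketch of how TF-IDF and NPMI scores could be computed on free-text reports. It is an illustration under stated assumptions, not the implementation used in this work: the four toy reports are hypothetical, tokenization is plain whitespace splitting, TF is taken as relative frequency with IDF = ln(N/df), and NPMI is estimated from document-level co-occurrence of adjacent word pairs. How the two scores are combined into a final keyword list is not specified here and is left out.

```python
import math
from collections import Counter

# Hypothetical toy reports, NOT taken from the repository studied in this work.
reports = [
    "gas leak from flange gasket on pump discharge",
    "pump seal failure caused oil leak",
    "flange gasket failure caused gas leak",
    "relief valve opened and released gas to flare",
]
docs = [r.split() for r in reports]  # assumption: whitespace tokenization


def tfidf(docs):
    """One {term: score} dict per document; tf = relative frequency, idf = ln(N/df)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency of each term
    scores = []
    for d in docs:
        tf = Counter(d)
        scores.append({t: (c / len(d)) * math.log(n / df[t]) for t, c in tf.items()})
    return scores


def npmi(docs):
    """NPMI of adjacent word pairs, with probabilities estimated per document."""
    n = len(docs)
    word_df = Counter(t for d in docs for t in set(d))
    pair_df = Counter(p for d in docs for p in set(zip(d, d[1:])))
    out = {}
    for (x, y), c in pair_df.items():
        p_xy = c / n
        if p_xy == 1.0:
            # Pair present in every document: -ln(p_xy) is 0, use the limit value 1.
            out[(x, y)] = 1.0
            continue
        pmi = math.log(p_xy / ((word_df[x] / n) * (word_df[y] / n)))
        out[(x, y)] = pmi / -math.log(p_xy)  # normalized to the range [-1, 1]
    return out


tfidf_scores = tfidf(docs)
npmi_scores = npmi(docs)

# Rank candidate single-word keywords of the first report and candidate word pairs.
top_terms = sorted(tfidf_scores[0], key=tfidf_scores[0].get, reverse=True)[:3]
top_pairs = sorted(npmi_scores, key=npmi_scores.get, reverse=True)[:3]
```

In this toy corpus, "flange" and "gasket" only ever appear together, so the pair reaches the maximum NPMI, whereas "gas" and "leak" also occur separately and score lower. TF-IDF plays the complementary role of down-weighting terms that are spread across many reports.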