Background
Demographic and socio-behavioural factors are strong drivers of HIV infection rates in sub-Saharan Africa. These factors are often studied in qualitative research but ignored in quantitative analyses. Yet they provide in-depth insight into local behaviour and may help to improve HIV prevention.
Methods
To obtain a comprehensive overview of the socio-behavioural factors influencing HIV prevalence and incidence in Malawi, we systematically reviewed the literature. Because we chose broad search terms (“HIV AND Malawi”), our preliminary search returned many thousands of articles. We therefore developed a Python tool to automatically extract, process, and categorise open-access articles published from 1 January 1987 to 1 October 2019 in the PubMed, PubMed Central, JSTOR, Paperity, and arXiv databases. We then used a topic modelling algorithm to classify the articles and identify publications of interest.
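The topic modelling step can be illustrated with a minimal sketch. It assumes latent Dirichlet allocation fitted by collapsed Gibbs sampling, which is one common choice; the abstract does not name the algorithm actually used. The function `lda_gibbs`, its parameters, and the toy corpus are hypothetical and serve only as illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (illustrative).

    Returns per-document topic proportions, per-topic word counts,
    and the vocabulary list.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    n_dk = [[0] * n_topics for _ in docs]            # document-topic counts
    n_kw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                             # tokens per topic
    z = []                                           # topic of each token

    # Assign every token to a random topic to start.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][wi] -= 1
                n_k[k] -= 1
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][wi] + beta)
                    / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wi] += 1
                n_k[k] += 1

    doc_topic = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, n_dk)
    ]
    return doc_topic, n_kw, vocab


# Toy corpus (hypothetical, not data from the review).
docs = [
    "hiv prevention religion beliefs community".split(),
    "gender relationship dynamics hiv risk".split(),
    "antiretroviral dosing pharmacokinetics trial".split(),
    "viral load assay laboratory trial".split(),
]
doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2, n_iter=50)

# Label each document with its most probable topic.
dominant = [max(range(2), key=row.__getitem__) for row in doc_topic]
```

In such a setup, a reviewer would inspect the most frequent words of each topic, mark some topics as relevant, and keep only the documents whose dominant topic is among them for hand-screening.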
Results
Our tool extracted 22,709 unique articles, of which 16,942 could be further processed. After topic modelling, 519 of these were clustered into relevant topics, and 20 of those were kept after hand-screening. We retrieved 7 more publications by examining references, so that 27 publications were finally included in the review. Reducing the 16,942 articles to 519 potentially relevant ones with the software took 5 days. Several factors were identified as contributing to the risk of HIV infection, including religion, gender and relationship dynamics, beliefs, and socio-behavioural attitudes.
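The screening funnel reported above can be checked with a few lines of arithmetic. The counts are taken from the Results; the helper `screening_funnel` and the stage labels are hypothetical names introduced for illustration.

```python
def screening_funnel(stages):
    """Return (label, count, fraction retained from the previous stage)."""
    rows = []
    previous = None
    for label, count in stages:
        retained = None if previous is None else count / previous
        rows.append((label, count, retained))
        previous = count
    return rows


# Counts as reported in the Results section.
stages = [
    ("unique articles extracted", 22709),
    ("articles that could be processed", 16942),
    ("clustered into relevant topics", 519),
    ("kept after hand-screening", 20),
]
rows = screening_funnel(stages)

for label, count, retained in rows:
    share = "" if retained is None else f" ({retained:.1%} of previous stage)"
    print(f"{label}: {count}{share}")

# Publications found through reference lists are added after screening.
from_references = 7
included = stages[-1][1] + from_references  # 20 + 7 = 27
print(f"included in the review: {included}")
```

The per-stage fractions make explicit how much of the reduction comes from automated topic modelling and how much from hand-screening.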
Conclusions
Our software does not replace traditional systematic reviews, but it returns useful results for broad queries of the open-access literature in under a week, without a priori knowledge. The output is a relevant “seed data-set” that could be further developed. The tool identified both known factors and rare factors that may be specific to Malawi. In the future, we aim to expand the tool by adding more social science databases and applying it to other sub-Saharan African countries.