This research explores the feasibility of automatically clustering and categorising Indian research output according to the seventeen Sustainable Development Goals (SDGs) proposed by the United Nations (sdgs.un.org/goals). Using the OpenAlex database (openalex.org), an extensive open-access bibliographic repository containing over 250 million research objects, this study gathered research publications from India (with at least one Indian author) published between 2016 (the year of inception of the SDG categories) and 2023. OpenAlex's metadata includes SDG data elements for numerous publications, alongside other bibliographic information. The SDG classifications and corresponding accuracy scores (ranging from 0 to 1) in OpenAlex are derived from the Aurora SDG classifier (aurora-universities.eu/sdg-research/). Initially, this study compiled a dataset of 500,000 research publications originating from India within the specified period, of which 278,845 records contained SDG data elements. This primary dataset was organised into two datasets: the All Dataset (ADS), encompassing all 278,845 records, and the High Accuracy Dataset (HAD), consisting of the 204,793 records with an SDG accuracy score of ≥0.5. Each dataset (ADS and HAD) was further segmented into three groups: a training dataset (97% of the data), a validation dataset (1.5%), and a test dataset (1.5%). The training dataset was used to train Annif, a locally implemented open-source automated indexing tool, with various backends, including lexical, associative, and ensemble (simple) methods. The validation dataset was used to derive an optimal weighting for combining the results of the different backends in a more advanced neural network backend. One significant advantage of the neural network backend is that it can be trained successively with additional datasets.
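The dataset preparation described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the record layout (a dictionary with a hypothetical `sdg_score` key), the function names, the random shuffling, and the seed are all assumptions made for illustration; only the ≥0.5 threshold and the 97% / 1.5% / 1.5% proportions come from the text.

```python
import random

def high_accuracy_subset(records, threshold=0.5):
    """Keep records whose SDG score is at or above the threshold (HAD-style filter).

    'sdg_score' is a hypothetical field name used only for this sketch.
    """
    return [r for r in records if r["sdg_score"] >= threshold]

def split_dataset(records, train=0.97, validation=0.015, seed=42):
    """Shuffle records and split them into training, validation and test parts.

    The test part receives whatever remains after the first two slices,
    so the three parts always add up to the full dataset.
    """
    shuffled = list(records)              # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    train_part = shuffled[:n_train]
    val_part = shuffled[n_train:n_train + n_val]
    test_part = shuffled[n_train + n_val:]
    return train_part, val_part, test_part

if __name__ == "__main__":
    # Synthetic records standing in for publications with an SDG score.
    demo = [{"id": i, "sdg_score": (i % 10) / 10} for i in range(1000)]
    had = high_accuracy_subset(demo)
    tr, va, te = split_dataset(demo)
    print(len(had), len(tr), len(va), len(te))
```

Applying the same split function to both the full set and the filtered set mirrors the way ADS and HAD are each divided into training, validation, and test portions.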
The study found that: (i) automatic categorisation of research publications by SDGs is feasible; (ii) the neural network-based backend outperformed other backends, such as SVC, Omikuji, and FastText, on retrieval metrics such as F1@5 and NDCG; (iii) lexical models are unsuitable for this purpose, performing poorly on both F1@5 and NDCG; and (iv) the neural network-based backend also outperformed the other backends on the High Accuracy Dataset.
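For readers unfamiliar with the two metrics named in the findings, the sketch below shows one common per-document formulation of F1@k and NDCG with binary relevance. It is an illustrative assumption, not Annif's evaluation code, which may differ in details such as how scores are averaged across documents; the function names and the SDG labels in the example are hypothetical.

```python
import math

def f1_at_k(ranked, relevant, k=5):
    """F1 score computed over the top-k suggested labels for one document."""
    top = ranked[:k]
    hits = sum(1 for label in top if label in relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(top)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)

def ndcg(ranked, relevant, k=None):
    """Normalised discounted cumulative gain with binary relevance."""
    cutoff = len(ranked) if k is None else k
    top = ranked[:cutoff]
    # Each correct label contributes 1 / log2(position + 1), positions from 1.
    dcg = sum(1.0 / math.log2(i + 2)
              for i, label in enumerate(top) if label in relevant)
    # Ideal ranking: all relevant labels placed at the top of the list.
    ideal_hits = min(len(relevant), cutoff)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

if __name__ == "__main__":
    suggested = ["SDG3", "SDG7", "SDG13", "SDG2", "SDG1"]  # ranked model output
    gold = {"SDG3", "SDG13"}                                # reference labels
    print(f1_at_k(suggested, gold), ndcg(suggested, gold))
```

F1@5 rewards a backend for placing the correct SDGs anywhere in its top five suggestions, whereas NDCG additionally rewards placing them near the top of the ranking.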