On social media platforms, information, ideas, and other forms of expression are created and shared among people in an interactive manner. During this exchange, users may encounter humorous, funny, offensive, trolling, or malicious content targeting individuals, groups, or communities. One common way of trolling on social media is to create a meme by combining an image with textual content - usually a catchy phrase couched in humor or sarcasm - and share it widely. Memes shared with the intention of trolling need to be filtered out from social media, as they may hurt people's sentiments and create an unhealthy atmosphere in society. The growing number of social media users and the corresponding rise in trolling make manual identification of trolling memes impractical. Hence, there is a demand for tools that automatically identify trolling memes. However, this task is challenging due to the unavailability of annotated data. The complexity intensifies further when the embedded text is code-mixed and written in under-resourced regional languages such as Kannada or Tulu, both spoken in south India. To address the lack of annotated data and tools for identifying trolling memes in these under-resourced languages, we created two datasets: i)~\textit{KAmemes} - a meme dataset with embedded code-mixed Kannada text, and ii)~\textit{TUmemes} - a meme dataset with embedded code-mixed Tulu text, each consisting of memes labeled as \lq Troll' or \lq Not\_Troll'. To benchmark these datasets, we propose uni-modal and multi-modal models that classify a given meme as \lq Troll' or \lq Not\_Troll'. While the uni-modal approaches consider only the text or only the image of a meme, the multi-modal approaches exploit both modalities. Several machine learning (ML) and deep learning (DL) baselines are implemented for both the uni-modal and multi-modal settings.
The proposed baselines are also evaluated on the existing \textit{TamilMemes} dataset to illustrate their efficacy. Among the proposed baselines, a multi-modal dual-encoder model based on joint representations achieved the best macro F1 scores of 0.90, 0.78, and 0.58 on the \textit{TUmemes}, \textit{KAmemes}, and \textit{TamilMemes} datasets, respectively.
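The dual-encoder joint-representation idea mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature dimensions, the ReLU projections, and the concatenation-plus-logistic head are assumptions chosen for clarity, and the randomly initialised weights stand in for trained text and image encoders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions (illustrative assumptions, not from the paper).
TEXT_DIM, IMAGE_DIM, JOINT_DIM = 300, 512, 128

# Random weights stand in for trained encoder parameters.
W_text = rng.standard_normal((TEXT_DIM, JOINT_DIM)) * 0.01
W_image = rng.standard_normal((IMAGE_DIM, JOINT_DIM)) * 0.01
W_clf = rng.standard_normal(2 * JOINT_DIM) * 0.01

def encode(features, weights):
    """Project one modality's features into the shared joint space."""
    return np.maximum(features @ weights, 0.0)  # linear map + ReLU

def classify_meme(text_feats, image_feats):
    """Fuse the two encoder outputs by concatenation and score the meme."""
    joint = np.concatenate([encode(text_feats, W_text),
                            encode(image_feats, W_image)])
    logit = joint @ W_clf
    prob_troll = 1.0 / (1.0 + np.exp(-logit))  # sigmoid over a single logit
    return "Troll" if prob_troll >= 0.5 else "Not_Troll"

# Dummy inputs in place of real extracted text/image features.
label = classify_meme(rng.standard_normal(TEXT_DIM),
                      rng.standard_normal(IMAGE_DIM))
```

In this sketch each modality is encoded separately and the joint representation is simply the concatenation of the two projections; other fusion strategies (e.g. element-wise combination or cross-attention) would slot in at the same point.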