Machine Learning is currently one of the most dynamic fields, attracting strong research interest from both industry and academia. It is not surprising that a large amount of funding from government agencies, universities, tech giants and well-funded startups is being allocated to this field. Reinforcement Learning, one of the three major subfields of Machine Learning, has recently gained tremendous traction because its algorithms can now run more efficiently. This is mainly due to two reasons. Firstly, affordable and portable hardware, such as mobile phones, wearables and Internet of Things devices, now has the capacity to run these algorithms. Secondly, new methods and models are being proposed that address efficiency from an algorithmic point of view. This proposal addresses open challenges in memory efficiency by devising and applying Reinforcement Learning algorithms for embedded systems in the domains of Games, Natural Language Processing and Robotics, using Deep Learning models and Bayesian inference.

Natural Language Processing is a domain of Machine Learning in which the input is text in a natural language that human agents use for everyday communication. It still faces a series of ongoing challenges, in contrast to more saturated domains such as Computer Vision. One of them is that ground truth is difficult to establish, due to the nature of text in general. Another is that conversations held by human agents vary in type and tone from person to person, for example formal, informal, aggressive or polite. This proposal therefore deals with these challenges, mainly in the subdomain of Question Answering systems, often deployed as chatbots, in a multi-agent setting.