Providing security in Mobile Ad Hoc Network is crucial problem due to its open shared wireless medium, multi-hop and dynamic nature, constrained resources, lack of administration and cooperation. Traditionally routing protocols are designed to cope with routing operation but in practice they may be affected by misbehaving nodes so that they try to disturb the normal routing operations by launching different attacks with the intention to minimize or collapse the overall network performance. Therefore detecting a trusted node means ensuring authentication and securing routing can be expected. In this article we have proposed a Trust and Q-learning based Security (TQS) model to detect the misbehaving nodes over Ad Hoc On Demand Distance-Vector (AODV) routing protocol. Here we avoid the misbehaving nodes by calculating an aggregated reward, based on the Q-learning mechanism by using their historical forwarding and responding behaviour by the way misbehaving nodes can be isolated.