Insurance companies around the world operate on a simple formula and with a specific agenda. They persuade people to deposit money with the company in their own name; in return, those people are promised a large sum when they face an expensive hospital bill or meet with an accident. The amount to be paid in is generally collected from customers on a monthly basis. Customers are persuaded to join such a scheme because it is tempting: the prospect of having money troubles taken care of, seemingly for nothing, in a time of crisis seems wonderful. Insurance companies, on the other hand, hope that nothing happens to their customers or their families, so that they do not come looking for compensation. The money collected from new policyholders is what the companies use to pay off losses. Data analysis is the process of understanding the behaviour of a dataset when measured against certain static quantities. In this paper we propose to use data science, and in particular regression analysis, to analyse a dataset of patients and devise a method to predict their insurance amount. There are various types of learning; broadly speaking, linear regression falls under supervised learning. Our dataset consists of over 1300 patients, each described by 7 characteristics, including whether they smoke, whether they have children, their age, sex, and BMI. We also propose to devise methods to overcome the shortcomings of linear regression, such as multicollinearity and the homoscedasticity assumption, and to perform the required data cleaning.
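As a rough illustration of the kind of analysis proposed here, the following minimal sketch fits an ordinary least-squares regression and computes a variance inflation factor (VIF), one common diagnostic for multicollinearity. It is not the paper's method or data: the records, the field names (`age`, `bmi`, `smoker`), the helper functions, and the linear formula used to generate the charges are all hypothetical and chosen only so the example is self-contained.

```python
# Hypothetical toy records, NOT taken from the paper's dataset.
records = [
    {"age": 19, "bmi": 27.9, "smoker": "yes"},
    {"age": 18, "bmi": 33.8, "smoker": "no"},
    {"age": 28, "bmi": 33.0, "smoker": "no"},
    {"age": 33, "bmi": 22.7, "smoker": "no"},
    {"age": 46, "bmi": 33.4, "smoker": "no"},
    {"age": 62, "bmi": 26.3, "smoker": "yes"},
]


def encode(record):
    """Numeric feature row: intercept, age, BMI, smoker flag (1 = smoker)."""
    smoker = 1.0 if record["smoker"] == "yes" else 0.0
    return [1.0, float(record["age"]), float(record["bmi"]), smoker]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients from the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, row):
    """Predicted value for one encoded feature row."""
    return sum(b * v for b, v in zip(beta, row))


def vif(rows, j):
    """VIF of column j: 1 / (1 - R^2) when column j is regressed on the other columns."""
    others = [[v for i, v in enumerate(r) if i != j] for r in rows]
    target = [r[j] for r in rows]
    beta = fit_ols(others, target)
    mean = sum(target) / len(target)
    ss_res = sum((t - predict(beta, o)) ** 2 for o, t in zip(others, target))
    ss_tot = sum((t - mean) ** 2 for t in target)
    return ss_tot / ss_res  # algebraically equal to 1 / (1 - R^2)


rows = [encode(r) for r in records]
# Made-up, noise-free charges from an assumed linear rule, so the fit is checkable.
charges = [1000.0 + 200.0 * r[1] + 100.0 * r[2] + 15000.0 * r[3] for r in rows]

beta = fit_ols(rows, charges)
print("coefficients (intercept, age, bmi, smoker):", beta)
print("VIF for age:", vif(rows, 1))
print("VIF for bmi:", vif(rows, 2))
```

Because the toy charges are generated without noise, the fitted coefficients should match the assumed rule up to floating-point error; real insurance data would not behave this way. A VIF close to 1 suggests a feature is nearly uncorrelated with the others, while large values are commonly read as a sign of multicollinearity.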