datascientistpurushotam/Health-Insurance-Cross-Sell-Prediction

Classification-Project

We obtained the health insurance dataset from AlmaBetter for our capstone project on classification ML algorithms. The dataset contains 381,109 rows and 12 columns.

Our client is an insurance company that has provided health insurance to its customers. They now need our help building a model to predict whether the policyholders (customers) from the past year will also be interested in the vehicle insurance provided by the company.

An insurance policy is an arrangement by which a company undertakes to provide a guarantee of compensation for specified loss, damage, illness, or death in return for the payment of a specified premium. A premium is a sum of money that the customer needs to pay regularly to an insurance company for this guarantee.

For example, you may pay a premium of Rs. 5,000 each year for a health insurance cover of Rs. 200,000 so that if, God forbid, you fall ill and need to be hospitalised in that year, the insurance provider will bear the cost of hospitalisation etc. up to Rs. 200,000. If you are wondering how the company can bear such a high hospitalisation cost when it charges a premium of only Rs. 5,000, that is where the concept of probabilities comes into the picture. Like you, there may be 100 customers paying a premium of Rs. 5,000 every year, but only a few of them (say 2-3) would be hospitalised that year. This way everyone shares the risk of everyone else.
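The arithmetic behind this example can be sketched in a few lines. The figures come from the paragraph above; treating every claim as using the full cover is an assumed worst case, not something the dataset or the insurer states.

```python
# Figures from the example; "full cover per claim" is an assumed worst case.
customers = 100
premium_per_customer = 5_000   # Rs. per year
cover_per_customer = 200_000   # Rs., maximum payout per claim

# Total premium collected from all customers in one year.
premium_pool = customers * premium_per_customer


def worst_case_payout(claims: int) -> int:
    """Total payout if every claim uses the full cover."""
    return claims * cover_per_customer


for claims in (2, 3):
    payout = worst_case_payout(claims)
    print(f"{claims} claims: pool Rs. {premium_pool:,}, payout up to Rs. {payout:,}")
```

The pool of Rs. 500,000 covers two full claims (Rs. 400,000) but not three (Rs. 600,000), which is why pricing depends on how likely a claim is and how large claims typically are.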

Vehicle insurance works the same way as medical insurance.

Link to dataset: https://drive.google.com/file/d/1AW5Gz6IqktDOoIjaBeWvy-HMaF5Y84sX/view?usp=share_link

In this project, we were provided with a CSV dataset for building machine learning models.

After understanding the dataset, we applied data wrangling and feature engineering.
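As a rough illustration of what wrangling and feature engineering can look like with pandas, here is a minimal sketch. The column names, the values, and the specific steps (dropping duplicates, median imputation, binary mapping, one-hot encoding) are assumptions for illustration, not the project's actual pipeline.

```python
import pandas as pd

# Tiny stand-in frame; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Vehicle_Damage": ["Yes", "No", "Yes", "Yes"],
    "Annual_Premium": [30000.0, None, 28000.0, 30000.0],
    "Response": [1, 0, 0, 1],
})

# Wrangling: drop exact duplicate rows, then fill missing numeric values
# with the column median.
df = df.drop_duplicates().reset_index(drop=True)
df["Annual_Premium"] = df["Annual_Premium"].fillna(df["Annual_Premium"].median())

# Feature engineering: map a yes/no column to 0/1 and one-hot encode a
# categorical column (drop_first avoids a redundant indicator column).
df["Vehicle_Damage"] = df["Vehicle_Damage"].map({"No": 0, "Yes": 1})
df = pd.get_dummies(df, columns=["Gender"], drop_first=True, dtype=int)

print(df)
```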

After cleaning the dataset, we performed univariate, bivariate, and multivariate analysis to understand it.
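The three levels of analysis could be sketched with pandas as follows. The frame is a made-up stand-in and the column names are assumptions; the project's actual analysis may have used different summaries or plots.

```python
import pandas as pd

# Made-up stand-in data; column names are illustrative assumptions.
df = pd.DataFrame({
    "Age": [25, 40, 33, 58, 47, 29],
    "Annual_Premium": [26000, 41000, 30000, 52000, 45000, 27000],
    "Previously_Insured": [1, 0, 0, 0, 1, 1],
    "Response": [0, 1, 0, 1, 0, 0],
})

# Univariate: distribution of a single variable (here, the class balance).
class_share = df["Response"].value_counts(normalize=True)

# Bivariate: how one feature relates to the target.
response_rate = df.groupby("Previously_Insured")["Response"].mean()

# Multivariate: pairwise correlations across all numeric columns.
corr = df.corr()

print(class_share, response_rate, corr, sep="\n\n")
```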

To build our ML models, we split the dataset 70:30, with 30% held out as the test set. After splitting, we transformed the features and normalized them.
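A minimal sketch of a 70:30 split followed by normalization with scikit-learn. The random data, the stratified split, and the choice of `MinMaxScaler` are assumptions; the README does not say which scaler or split options were used. The scaler is fitted on the training set only so that no information from the test set leaks into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Random stand-in data (assumption): 1000 rows, 5 features, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)

# 70:30 split; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Fit the scaler on the training data only, then apply it to both parts.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```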

We trained Logistic Regression, Decision Tree, and Random Forest models on the dataset; Random Forest achieved the highest accuracy.
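A comparison of the three models could be set up roughly like this. The synthetic data from `make_classification` and all hyperparameters are assumptions, so the scores printed here say nothing about the ranking on the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption), split 70:30 as in the project.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Hyperparameters are illustrative defaults, not the project's settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from highest to lowest test accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```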
