Predict whether a person has diabetes given their characteristics. The goal is to develop a machine learning model for this task. You are expected to carry out the data analysis and feature engineering steps required before developing the model.

Information about the dataset

The dataset is part of a larger dataset held at the National Institute of Diabetes and Digestive and Kidney Diseases in the USA. The data were collected for diabetes research on Pima Indian women aged 21 and over living in Phoenix, the 5th largest city in the USA, located in the State of Arizona. The target variable is specified as "Outcome"; 1 indicates a positive diabetes test result, 0 indicates a negative one.


Pregnancies: Number of pregnancies
Glucose: 2-hour plasma glucose concentration in an oral glucose tolerance test
BloodPressure: Diastolic blood pressure (mm Hg)
SkinThickness: Triceps skin fold thickness (mm)
Insulin: 2-hour serum insulin (mu U/ml)
DiabetesPedigreeFunction: Diabetes pedigree function (a score of diabetes likelihood based on family history)
BMI: Body mass index
Age: Age (years)
Outcome: Whether the person has the disease (1) or not (0)
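
As a starting point for the data analysis step, the variables above can be separated into a feature matrix and the target. The following is a minimal sketch using only the Python standard library. It assumes the data arrives as a CSV file whose header uses the spellings `Glucose` and `BloodPressure` alongside the other names listed above. The helper names `load_rows` and `class_balance` are hypothetical, and the sample rows are made up for illustration; they are not taken from the real dataset.

```python
import csv
import io

# Column names as listed above; the CSV layout itself is an assumption.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]
TARGET = "Outcome"


def load_rows(handle):
    """Read CSV rows into (features, target) lists, checking the header first."""
    reader = csv.DictReader(handle)
    missing = set(FEATURES + [TARGET]) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in FEATURES])
        y.append(int(row[TARGET]))
    return X, y


def class_balance(y):
    """Return the fraction of positive (1) outcomes, or 0.0 for empty input."""
    return sum(y) / len(y) if y else 0.0


# Made-up rows for illustration only; NOT values from the real dataset.
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
2,120,70,30,100,28.5,0.45,35,1
1,90,65,25,80,24.0,0.30,25,0
4,150,80,35,150,33.1,0.60,45,1
0,100,60,20,70,22.3,0.20,22,0
"""

X, y = load_rows(io.StringIO(SAMPLE_CSV))
print(len(X), "rows,", len(X[0]), "features, positive rate:", class_balance(y))
```

To use this on the real file, pass an open file handle to `load_rows` instead of the in-memory sample. Checking the positive rate early is useful because an imbalanced target affects how the model should later be evaluated.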

