---
library_name: sklearn
tags:
- sklearn
- skops
- tabular-classification
model_format: pickle
model_file: skops-89qohtne.pkl
widget:
- structuredData:
    AGE:
    - 32
    - 45
    - 25
    GENDER:
    - m
    - f
    - f
    HOWPAID:
    - 'weekly '
    - 'weekly '
    - 'weekly '
    INCOME:
    - 21772
    - 27553
    - 23477
    LOANS:
    - 1
    - 2
    - 1
    MARITAL:
    - 'married '
    - divsepwid
    - 'single '
    MORTGAGE:
    - y
    - y
    - n
    NUMCARDS:
    - 2
    - 6
    - 1
    NUMKIDS:
    - 1
    - 4
    - 1
    STORECAR:
    - 3
    - 5
    - 2
---

# Model description

This is a logistic regression model trained with the sklearn library on a bank's customer credit card risk data. The model predicts whether a customer should be issued a credit card. The full dataset can be viewed at the following link: https://huggingface.co/datasets/saifhmb/CreditCardRisk

## Training Procedure

The following data preprocessing steps were applied:

- Dropping high-cardinality features, specifically ID
- Transforming and encoding the categorical features GENDER, MARITAL, HOWPAID and MORTGAGE, and the target variable, RISK
- Splitting the dataset into training and test sets using an 85/15 split ratio
- Applying feature scaling to the numeric features (AGE, INCOME, NUMKIDS, NUMCARDS, STORECAR and LOANS), as shown in the pipeline below

### Hyperparameters

| Hyperparameter | Value |
|----------------|-------|
| memory | |
| steps | [('preprocessor', ColumnTransformer(remainder='passthrough', transformers=[('cat', Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]), ['GENDER', 'MARITAL', 'HOWPAID', 'MORTGAGE']), ('num', Pipeline(steps=[('scale', StandardScaler())]), Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))])), ('classifier', LogisticRegression())] |
| verbose | False |
| preprocessor | ColumnTransformer(remainder='passthrough', transformers=[('cat', Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]), ['GENDER', 'MARITAL', 'HOWPAID', 'MORTGAGE']), ('num', Pipeline(steps=[('scale', StandardScaler())]), Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))]) |
| classifier | LogisticRegression() |
| preprocessor__n_jobs | |
| preprocessor__remainder | passthrough |
| preprocessor__sparse_threshold | 0.3 |
| preprocessor__transformer_weights | |
| preprocessor__transformers | [('cat', Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]), ['GENDER', 'MARITAL', 'HOWPAID', 'MORTGAGE']), ('num', Pipeline(steps=[('scale', StandardScaler())]), Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))] |
| preprocessor__verbose | False |
| preprocessor__verbose_feature_names_out | True |
| preprocessor__cat | Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]) |
| preprocessor__num | Pipeline(steps=[('scale', StandardScaler())]) |
| preprocessor__cat__memory | |
| preprocessor__cat__steps | [('onehot', OneHotEncoder(handle_unknown='ignore'))] |
| preprocessor__cat__verbose | False |
| preprocessor__cat__onehot | OneHotEncoder(handle_unknown='ignore') |
| preprocessor__cat__onehot__categories | auto |
| preprocessor__cat__onehot__drop | |
| preprocessor__cat__onehot__dtype | |
| preprocessor__cat__onehot__handle_unknown | ignore |
| preprocessor__cat__onehot__max_categories | |
| preprocessor__cat__onehot__min_frequency | |
| preprocessor__cat__onehot__sparse | deprecated |
| preprocessor__cat__onehot__sparse_output | True |
| preprocessor__num__memory | |
| preprocessor__num__steps | [('scale', StandardScaler())] |
| preprocessor__num__verbose | False |
| preprocessor__num__scale | StandardScaler() |
| preprocessor__num__scale__copy | True |
| preprocessor__num__scale__with_mean | True |
| preprocessor__num__scale__with_std | True |
| classifier__C | 1.0 |
| classifier__class_weight | |
| classifier__dual | False |
| classifier__fit_intercept | True |
| classifier__intercept_scaling | 1 |
| classifier__l1_ratio | |
| classifier__max_iter | 100 |
| classifier__multi_class | auto |
| classifier__n_jobs | |
| classifier__penalty | l2 |
| classifier__random_state | |
| classifier__solver | lbfgs |
| classifier__tol | 0.0001 |
| classifier__verbose | 0 |
| classifier__warm_start | False |

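
The preprocessing steps and hyperparameters listed above can be sketched roughly as follows. This is a minimal illustration, not the author's training script: the data frame is synthetic, the RISK labels (`risk_a`, `risk_b`, `risk_c`), the `monthly` pay frequency, and the `random_state` values are placeholders assumed for the example, and only the column names, the pipeline layout, and the 85/15 split come from this card.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Synthetic stand-in for the real dataset (assumption: all values are made up).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "ID": np.arange(n),
    "AGE": rng.integers(18, 70, n),
    "INCOME": rng.integers(15000, 60000, n),
    "GENDER": rng.choice(["m", "f"], n),
    "MARITAL": rng.choice(["married", "single", "divsepwid"], n),
    "NUMKIDS": rng.integers(0, 5, n),
    "NUMCARDS": rng.integers(0, 7, n),
    "HOWPAID": rng.choice(["weekly", "monthly"], n),
    "MORTGAGE": rng.choice(["y", "n"], n),
    "STORECAR": rng.integers(0, 6, n),
    "LOANS": rng.integers(0, 4, n),
    "RISK": rng.choice(["risk_a", "risk_b", "risk_c"], n),  # placeholder labels
})

# Drop the high-cardinality ID column and encode the target.
X = df.drop(columns=["ID", "RISK"])
y = LabelEncoder().fit_transform(df["RISK"])

categorical = ["GENDER", "MARITAL", "HOWPAID", "MORTGAGE"]
numeric = ["AGE", "INCOME", "NUMKIDS", "NUMCARDS", "STORECAR", "LOANS"]

# One-hot encode categorical columns, standardize numeric columns.
preprocessor = ColumnTransformer(
    transformers=[
        ("cat", Pipeline([("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical),
        ("num", Pipeline([("scale", StandardScaler())]), numeric),
    ],
    remainder="passthrough",
)
model = Pipeline([("preprocessor", preprocessor), ("classifier", LogisticRegression())])

# 85/15 train/test split (random_state is an assumption, not stated in the card).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Micro-averaged metrics, as used for the multiclass RISK target.
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average="micro")
recall = recall_score(y_test, y_pred, average="micro")
print(accuracy, precision, recall)
```

Because the labels here are random, the scores from this sketch say nothing about the real model. It only shows how the pipeline fits together and how the micro-averaged metrics are computed.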
### Model Plot
Pipeline(steps=[('preprocessor',ColumnTransformer(remainder='passthrough',transformers=[('cat',Pipeline(steps=[('onehot',OneHotEncoder(handle_unknown='ignore'))]),['GENDER', 'MARITAL','HOWPAID', 'MORTGAGE']),('num',Pipeline(steps=[('scale',StandardScaler())]),Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))])),('classifier', LogisticRegression())])
## Evaluation Results

The target variable, RISK, is multiclass. In sklearn, the precision and recall functions take an `average` parameter, which must be set for a multiclass or multilabel target. `average='micro'` was used, so precision and recall are computed globally by counting the total true positives, false negatives and false positives. For a single-label multiclass target, micro-averaged precision and recall both equal accuracy, which is why the three values below are identical.

| Metric    | Value    |
|-----------|----------|
| accuracy  | 0.699187 |
| precision | 0.699187 |
| recall    | 0.699187 |

### Feature Importance

SHAP was used to determine which features matter most to the model's decisions.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6662300a0ad8c45a1ce59190/OhZPMDUQi1N4R0XlMQm9A.png)

### Confusion Matrix

![Confusion Matrix](confusion_matrix.png)

# Model Card Authors

This model card was written by the following author: Seifullah Bello

# Model Card Contact

You can contact the model card author through the following channel: saifhmb@gmail.com