metadata
library_name: sklearn
tags:
  - sklearn
  - skops
  - tabular-classification
model_format: pickle
model_file: skops-89qohtne.pkl
widget:
  - structuredData:
      AGE:
        - 32
        - 45
        - 25
      GENDER:
        - m
        - f
        - f
      HOWPAID:
        - 'weekly '
        - 'weekly '
        - 'weekly '
      INCOME:
        - 21772
        - 27553
        - 23477
      LOANS:
        - 1
        - 2
        - 1
      MARITAL:
        - 'married  '
        - divsepwid
        - 'single   '
      MORTGAGE:
        - 'y'
        - 'y'
        - 'n'
      NUMCARDS:
        - 2
        - 6
        - 1
      NUMKIDS:
        - 1
        - 4
        - 1
      STORECAR:
        - 3
        - 5
        - 2

Model description

This is a logistic regression model, built with the sklearn library and trained on a bank's customer credit card risk data. The model predicts whether a customer should be issued a credit card. The full dataset can be viewed at: https://huggingface.co/datasets/saifhmb/CreditCardRisk

Training Procedure

The data preprocessing steps applied include the following:

  • Dropping the high-cardinality feature ID
  • Encoding the categorical features GENDER, MARITAL, HOWPAID and MORTGAGE, as well as the target variable, RISK
  • Splitting the dataset into training and test sets using an 85/15 split ratio
  • Applying feature scaling (StandardScaler) to the numerical features AGE, INCOME, NUMKIDS, NUMCARDS, STORECAR and LOANS
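The steps above can be sketched roughly as follows. This is a minimal illustration rather than the author's actual training script: the rows are synthetic stand-ins for the real dataset, the RISK label values and `random_state=0` are assumptions, and `LabelEncoder` is one plausible way to encode the target. The column names and pipeline layout follow the hyperparameter listing in this card.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Tiny synthetic stand-in for the real dataset (all values are made up).
n = 40
df = pd.DataFrame({
    "ID": range(n),
    "AGE": [20 + (i * 7) % 40 for i in range(n)],
    "INCOME": [15000 + (i * 913) % 30000 for i in range(n)],
    "GENDER": ["m" if i % 2 else "f" for i in range(n)],
    "MARITAL": [["married", "single", "divsepwid"][i % 3] for i in range(n)],
    "NUMKIDS": [i % 4 for i in range(n)],
    "NUMCARDS": [1 + i % 6 for i in range(n)],
    "HOWPAID": ["weekly" if i % 5 else "monthly" for i in range(n)],
    "MORTGAGE": ["y" if i % 3 else "n" for i in range(n)],
    "STORECAR": [i % 5 for i in range(n)],
    "LOANS": [i % 3 for i in range(n)],
    # Hypothetical label names; the real RISK classes may differ.
    "RISK": [["class_a", "class_b", "class_c"][i % 3] for i in range(n)],
})

# Drop the high-cardinality ID column and separate the target.
X = df.drop(columns=["ID", "RISK"])
y = LabelEncoder().fit_transform(df["RISK"])

# 85/15 train/test split (random_state is an assumption for repeatability).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

categorical = ["GENDER", "MARITAL", "HOWPAID", "MORTGAGE"]
numerical = ["AGE", "INCOME", "NUMKIDS", "NUMCARDS", "STORECAR", "LOANS"]

# One-hot encode categorical columns and standardize numerical ones.
preprocessor = ColumnTransformer(
    transformers=[
        ("cat", Pipeline([("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical),
        ("num", Pipeline([("scale", StandardScaler())]), numerical),
    ],
    remainder="passthrough",
)

pipeline = Pipeline([
    ("preprocessor", preprocessor),
    ("classifier", LogisticRegression()),
])
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
```

Keeping the encoder and scaler inside the pipeline means they are fitted on the training split only, so no statistics from the test split leak into preprocessing.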

Hyperparameters

| Hyperparameter | Value |
|---|---|
| memory | |
| steps | [('preprocessor', ColumnTransformer(remainder='passthrough', transformers=[('cat', Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]), ['GENDER', 'MARITAL', 'HOWPAID', 'MORTGAGE']), ('num', Pipeline(steps=[('scale', StandardScaler())]), Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))])), ('classifier', LogisticRegression())] |
| verbose | False |
| preprocessor | ColumnTransformer(remainder='passthrough', transformers=[('cat', Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]), ['GENDER', 'MARITAL', 'HOWPAID', 'MORTGAGE']), ('num', Pipeline(steps=[('scale', StandardScaler())]), Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))]) |
| classifier | LogisticRegression() |
| preprocessor__n_jobs | |
| preprocessor__remainder | passthrough |
| preprocessor__sparse_threshold | 0.3 |
| preprocessor__transformer_weights | |
| preprocessor__transformers | [('cat', Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]), ['GENDER', 'MARITAL', 'HOWPAID', 'MORTGAGE']), ('num', Pipeline(steps=[('scale', StandardScaler())]), Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))] |
| preprocessor__verbose | False |
| preprocessor__verbose_feature_names_out | True |
| preprocessor__cat | Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]) |
| preprocessor__num | Pipeline(steps=[('scale', StandardScaler())]) |
| preprocessor__cat__memory | |
| preprocessor__cat__steps | [('onehot', OneHotEncoder(handle_unknown='ignore'))] |
| preprocessor__cat__verbose | False |
| preprocessor__cat__onehot | OneHotEncoder(handle_unknown='ignore') |
| preprocessor__cat__onehot__categories | auto |
| preprocessor__cat__onehot__drop | |
| preprocessor__cat__onehot__dtype | <class 'numpy.float64'> |
| preprocessor__cat__onehot__handle_unknown | ignore |
| preprocessor__cat__onehot__max_categories | |
| preprocessor__cat__onehot__min_frequency | |
| preprocessor__cat__onehot__sparse | deprecated |
| preprocessor__cat__onehot__sparse_output | True |
| preprocessor__num__memory | |
| preprocessor__num__steps | [('scale', StandardScaler())] |
| preprocessor__num__verbose | False |
| preprocessor__num__scale | StandardScaler() |
| preprocessor__num__scale__copy | True |
| preprocessor__num__scale__with_mean | True |
| preprocessor__num__scale__with_std | True |
| classifier__C | 1.0 |
| classifier__class_weight | |
| classifier__dual | False |
| classifier__fit_intercept | True |
| classifier__intercept_scaling | 1 |
| classifier__l1_ratio | |
| classifier__max_iter | 100 |
| classifier__multi_class | auto |
| classifier__n_jobs | |
| classifier__penalty | l2 |
| classifier__random_state | |
| classifier__solver | lbfgs |
| classifier__tol | 0.0001 |
| classifier__verbose | 0 |
| classifier__warm_start | False |
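The double-underscore names in the listing above follow sklearn's nested-parameter convention: `step__parameter` addresses a parameter of a named pipeline step. As a small sketch, assuming a simplified two-step pipeline (not the full preprocessor used by this model), the same names can be read with `get_params` and changed with `set_params`:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simplified stand-in pipeline; the step names here are illustrative.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("classifier", LogisticRegression()),
])

# deep=True also returns the parameters of each nested step.
params = pipeline.get_params(deep=True)
default_c = params["classifier__C"]
default_solver = params["classifier__solver"]

# The same naming scheme is used to override a nested parameter.
pipeline.set_params(classifier__C=0.5)
updated_c = pipeline.get_params()["classifier__C"]
```

The same keys can be used in a parameter grid for `GridSearchCV` if the defaults listed above were to be tuned.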

Model Plot

Pipeline(steps=[('preprocessor', ColumnTransformer(remainder='passthrough', transformers=[('cat', Pipeline(steps=[('onehot', OneHotEncoder(handle_unknown='ignore'))]), ['GENDER', 'MARITAL', 'HOWPAID', 'MORTGAGE']), ('num', Pipeline(steps=[('scale', StandardScaler())]), Index(['AGE', 'INCOME', 'NUMKIDS', 'NUMCARDS', 'STORECAR', 'LOANS'], dtype='object'))])), ('classifier', LogisticRegression())])

Evaluation Results

  • The target variable, RISK, is multiclass. In sklearn, the precision and recall functions take a parameter called average, which is required for a multiclass or multilabel target. Here average='micro' was used, so precision and recall are calculated globally by counting the total true positives, false negatives and false positives. For a single-label multiclass target, micro-averaged precision and recall both equal accuracy, which is why the three metrics coincide.

| Metric | Value |
|---|---|
| accuracy | 0.699187 |
| precision | 0.699187 |
| recall | 0.699187 |
Model Explainability

SHAP was used to determine which features are most important to the model's decisions.

[Image: SHAP feature importance plot]

Confusion Matrix

[Image: confusion matrix]

Model Card Authors

This model card was written by the following author: Seifullah Bello

Model Card Contact

You can contact the model card author through the following channel: [email protected]