Building Ethical and Explainable AI
Introduction:
The assignment for my CS5100: Foundations in Artificial Intelligence class served as a walkthrough of fairness metrics in model evaluation and explainability of model outputs. Given the UCI Adult dataset, we did the following:
- Preprocessed the data with an exploratory data analysis (EDA) and basic cleaning (a loading/cleaning sketch follows this list)
- Developed a Multi-Layer Perceptron (MLP) classifier to predict income (see the training sketch below)
- Used the SHAP library to analyze feature importance and explain model predictions (see the SHAP sketch below)
- Assessed the model using fairness metrics, including Statistical Parity Difference, Disparate Impact, and Equal Opportunity Difference, and evaluated model performance across demographic groups to identify any disparities or biases (definitions and a sketch below)
- Implemented an in-processing bias mitigation technique (adversarial debiasing) to improve model fairness, then re-evaluated the model post-mitigation to assess both the fairness improvements and the retention of predictive performance (see the final sketch below)
- Reflected on the ethical implications of the model's predictions, especially in relation to sensitive demographic groups
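A minimal sketch of the loading and cleaning step, assuming the standard UCI Adult CSV (adult.data) and its usual column names; the course-provided file and the exact cleaning choices may have differed:

```python
import pandas as pd

# Standard UCI Adult column names (assumed; the course file may differ).
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country", "income",
]

# '?' marks missing values in the original UCI distribution.
df = pd.read_csv("adult.data", names=COLUMNS, na_values="?", skipinitialspace=True)

# Basic cleaning: drop rows with missing values and exact duplicates.
df = df.dropna().drop_duplicates()

# Binarize the target: 1 if income > $50K, else 0.
df["income"] = (df["income"] == ">50K").astype(int)

# Quick EDA: overall class balance and income rate per sex.
print(df["income"].value_counts(normalize=True))
print(df.groupby("sex")["income"].mean())
```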
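For the classifier, a sketch along these lines, assuming scikit-learn's MLPClassifier; the layer sizes and other hyperparameters here are illustrative, not the assignment's actual settings:

```python
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Feature/target split (df comes from the cleaning sketch above).
X = df.drop(columns=["income"])
y = df["income"]

# One-hot encode categorical columns, standardize numeric ones.
categorical = X.select_dtypes(include="object").columns.tolist()
numeric = X.select_dtypes(exclude="object").columns.tolist()
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numeric),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# A small MLP wrapped in a pipeline so preprocessing travels with the model.
clf = Pipeline([
    ("prep", preprocess),
    ("mlp", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=200,
                          random_state=42)),
])
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```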
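For explainability, SHAP's model-agnostic KernelExplainer can wrap the whole pipeline. A sketch; the background and sample sizes are arbitrary choices to keep the (expensive) kernel computation tractable:

```python
import shap

# Small background and explanation samples: KernelExplainer cost grows
# quickly with both.
background = X_train.sample(100, random_state=0)
explain_set = X_test.sample(100, random_state=0)

# Explain the predicted probability of the positive class (> $50K).
# The wrapper rebuilds a DataFrame so the pipeline's column-based
# preprocessing still works on SHAP's perturbed inputs.
f = lambda data: clf.predict_proba(pd.DataFrame(data, columns=X.columns))[:, 1]
explainer = shap.KernelExplainer(f, background)
shap_values = explainer.shap_values(explain_set)

# Global feature-importance view.
shap.summary_plot(shap_values, explain_set)
```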
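The three fairness metrics have standard definitions (following the usual AIF360-style conventions): Statistical Parity Difference is P(ŷ=1 | unprivileged) − P(ŷ=1 | privileged), Disparate Impact is the corresponding ratio, and Equal Opportunity Difference is the difference in true positive rates. A self-contained sketch, using sex as the sensitive attribute for illustration:

```python
import numpy as np

def fairness_metrics(y_true, y_pred, sensitive, privileged):
    """Group fairness metrics for binary predictions in {0, 1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    priv = np.asarray(sensitive) == privileged
    unpriv = ~priv

    # Selection rates: P(y_hat = 1 | group).
    rate_priv = y_pred[priv].mean()
    rate_unpriv = y_pred[unpriv].mean()

    # True positive rates: P(y_hat = 1 | y = 1, group).
    tpr_priv = y_pred[priv & (y_true == 1)].mean()
    tpr_unpriv = y_pred[unpriv & (y_true == 1)].mean()

    return {
        "statistical_parity_difference": rate_unpriv - rate_priv,
        "disparate_impact": rate_unpriv / rate_priv,
        "equal_opportunity_difference": tpr_unpriv - tpr_priv,
    }

# Audit the MLP's test predictions by sex.
print(fairness_metrics(y_test, clf.predict(X_test),
                       X_test["sex"], privileged="Male"))
```

A value of 0 for either difference, or 1 for Disparate Impact, indicates parity between the groups.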
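For the in-processing mitigation, adversarial debiasing trains the classifier jointly with an adversary that tries to recover the sensitive attribute from the classifier's predictions. The AIF360 toolkit ships an implementation (using TensorFlow 1.x-style graph mode); whether or not the assignment used AIF360, a sketch with it might look like:

```python
import tensorflow.compat.v1 as tf
from aif360.algorithms.inprocessing import AdversarialDebiasing
from aif360.datasets import AdultDataset

tf.disable_eager_execution()

# AIF360 bundles an Adult dataset loader (it expects the raw UCI files in
# its data directory); 'sex' with 'Male' privileged maps Male -> 1.0.
adult = AdultDataset(protected_attribute_names=["sex"],
                     privileged_classes=[["Male"]])
train, test = adult.split([0.8], shuffle=True, seed=42)

sess = tf.Session()
debiased = AdversarialDebiasing(
    unprivileged_groups=[{"sex": 0}],
    privileged_groups=[{"sex": 1}],
    scope_name="adv_debias",
    sess=sess,
    num_epochs=50,
    debias=True,  # False trains the same classifier without the adversary
)
debiased.fit(train)
test_pred = debiased.predict(test)
sess.close()
```

Running the same model with debias=False gives the baseline for the before/after comparison shown in the figures below.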
Results:
The plots below show the MLP's feature weights before and after adversarial debiasing, with respect to the model's income predictions (above or below $50K).
Figure 1: Initially, the model focuses primarily on sensitive demographic features, which may lead to ethical complications.
Figure 2: After adversarial debiasing, the model focuses primarily on capital gain and significantly reduces its reliance on sensitive demographic features.
Code
See Disclaimer on Projects Page
Figure 1: Initial MLP Weights

Figure 2: Post-Debiasing MLP Weights
