Addressing Gender Imbalance in Cirrhosis Prediction with CTGAN and Transformer-Based Generative Models
Conference
Saxena, D., Rishe, N. (2025). Addressing Gender Imbalance in Cirrhosis Prediction with CTGAN and Transformer-Based Generative Models. 10.1109/AIBThings66987.2025.11296145
Cirrhosis is a severe liver condition characterized by irreversible scarring due to chronic liver damage. Early detection is critical for effective treatment and improved patient outcomes. This study investigates the application of several machine learning models to cirrhosis prediction: Random Forest (RF), Logistic Regression (LR), Convolutional Neural Networks (CNN), Multilayer Perceptron (MLP), and a hybrid CNN-Long Short-Term Memory (CNN-LSTM) model. To address the gender imbalance in the dataset, where male samples outnumber female samples, synthetic data generation techniques are applied, with a focus on the Conditional Tabular Generative Adversarial Network (CTGAN). Two hybrid balancing approaches are also explored: one combining CTGAN with Edited Nearest Neighbors (ENN) and the other combining it with the Synthetic Minority Over-sampling Technique (SMOTE). This paper also introduces a novel generative framework, the Transformer-Conditional Generative Adversarial Network (TransCTGAN), which incorporates Transformer-based self-attention layers into both the generator and the discriminator of the CTGAN architecture. The effectiveness of the synthetic data is assessed by comparing model performance before and after dataset balancing. Our results demonstrate that CTGAN-based augmentation significantly boosts predictive accuracy, showing the potential of synthetic data to enhance machine learning models. While not all models show marked improvement, the approach effectively mitigates the gender imbalance without introducing bias toward the majority (male) group.
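The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the column name `sex`, the toy records, and the helper names `synthetic_rows_needed`, `balance`, and `resample_with_replacement` are all assumptions made for illustration. In particular, the sampler below is a plain resampling placeholder standing in for a trained conditional generator such as CTGAN; it only shows where generated rows would enter the pipeline and how many would be needed.

```python
import random
from collections import Counter


def synthetic_rows_needed(rows, column="sex"):
    """Return {value: count} of extra rows each smaller group needs
    to match the size of the largest group in `column`."""
    counts = Counter(row[column] for row in rows)
    target = max(counts.values())
    return {value: target - n for value, n in counts.items() if n < target}


def balance(rows, sampler, column="sex"):
    """Append sampler-generated rows until all groups in `column` are equal in size.

    `sampler(rows, column, value, n)` must return `n` new rows for group `value`.
    In the paper's setting this role would be played by a trained generator.
    """
    balanced = list(rows)
    for value, deficit in synthetic_rows_needed(rows, column).items():
        balanced.extend(sampler(rows, column, value, deficit))
    return balanced


def resample_with_replacement(rows, column, value, n, seed=0):
    """Placeholder sampler (assumption): NOT a generative model.
    It only copies real rows from the under-represented group."""
    rng = random.Random(seed)
    pool = [row for row in rows if row[column] == value]
    return [dict(rng.choice(pool)) for _ in range(n)]


# Toy records invented for illustration; they are not from the study's dataset.
toy = [{"sex": "M", "bilirubin": 1.1} for _ in range(6)] + [
    {"sex": "F", "bilirubin": 0.9} for _ in range(2)
]

balanced = balance(toy, resample_with_replacement)
print(Counter(row["sex"] for row in balanced))
```

Passing the sampler as an argument keeps the counting logic separate from the generation method, so the placeholder could be swapped for a conditional generator, or followed by a cleaning step such as ENN, without changing `balance`.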