Training a robust automatic facial expression recognition system requires diverse samples. In-the-lab datasets are collected under a class-balanced distribution. By contrast, real-world samples are more complex and imbalanced. Traditional data augmentation (DA) methods apply random operation amplitudes and unidirectional expansions in the feature space, which biases the model toward classes with more samples. We propose a DA model and combine it with a feature enhancement mechanism for imbalanced facial emotion recognition. We introduce a learning-based dynamic sample adjustment mechanism that selects the most valuable samples to guide the training of each sample's augmentation parameters. The model explores each sample's maximum augmentation potential and increases accuracy in a class-balanced way. On the facial emotion recognition task, it achieves the best average accuracy, 86.08%, on the Real-world Affective Faces Database (RAF-DB), outperforming state-of-the-art imbalanced learning methods. In particular, it shows significant accuracy improvements for extreme minority classes, which improve by more than 15% in the best case compared with previous work.
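To illustrate why class-balanced evaluation matters on imbalanced data, the following minimal sketch contrasts overall accuracy with the mean of per-class accuracies. It is not the method of the paper, and the abstract does not state exactly how its "average accuracy" is computed; the function names, the class labels, and the toy predictions are all hypothetical.

```python
from collections import Counter

def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly (dominated by large classes)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for each ground-truth class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

def mean_class_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracies, so every class counts equally."""
    acc = per_class_accuracy(y_true, y_pred)
    return sum(acc.values()) / len(acc)

# Hypothetical toy data: 90 majority-class samples, 10 minority-class samples.
y_true = ["happy"] * 90 + ["fear"] * 10
# A classifier biased toward the majority class: only 2 of 10 "fear" samples are correct.
y_pred = ["happy"] * 90 + ["happy"] * 8 + ["fear"] * 2

print(overall_accuracy(y_true, y_pred))
print(per_class_accuracy(y_true, y_pred))
print(mean_class_accuracy(y_true, y_pred))
```

In this toy case the overall accuracy looks high even though most minority-class samples are misclassified, whereas the mean per-class accuracy exposes the gap. This is the kind of bias that a class-balanced augmentation strategy aims to reduce.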
Facial recognition systems
Data modeling
Statistical modeling
Databases
Feature extraction
Performance modeling
Space operations