I am currently pursuing a graduate degree at Purdue University, specializing in Machine Learning.
Previously, I worked as a Software Engineer at Elevance Health, where I collaborated with the Responsible AI and Explainable AI (XAI)
team to enhance fairness and interpretability and to mitigate data drift in machine learning models.
I hold a Bachelor of Technology in Computer Science and Engineering from the Indian Institute of Information Technology, Guwahati,
which I completed in 2022.
My research experience includes an internship at NVIDIA,
where I focused on speech recognition, meta-learning, and few-shot learning. I also interned at the AI-ML-NLP Lab at IIT Patna, where I worked
on multimodal learning techniques.
In addition, I have contributed to the open-source community, particularly to the HuggingFace Transformers library.
My contributions include integrating two vision transformer models: Microsoft's CvT (Introducing Convolutions to Vision
Transformers) and Meta's LeViT (a Vision Transformer in ConvNet's Clothing for Faster Inference).
Contact: anugunjjha at gmail dot com
2024

- Random Propagations in GNNs
Thu Bui, Anugunj Naman, Carola-Bibiane Schönlieb, Bruno Ribeiro, Beatrice Bevilacqua,
Moshe Eliasof
2nd NeurIPS Workshop on Unifying Representations in Neural Models (UniReps), 2024
Graph learning benefits many fields. However, Graph Neural Networks (GNNs) often struggle
with scalability, especially on large graphs. At the same time, many tasks appear simple to
learn; for example, simple diffusion already yields favorable performance. In this paper, we
present Random Propagation GNN (RAP-GNN), a framework that addresses two main research
questions: (i) can random propagations in GNNs be as effective as end-to-end optimized GNNs?
and (ii) can they reduce the computational burden required by traditional GNNs? Our empirical
findings indicate that RAP-GNN reduces training time by a factor of two while maintaining
strong accuracy for node and graph classification tasks.
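The core idea, message passing whose weights are drawn at random and never trained, can be sketched in a few lines. This is an illustrative NumPy toy under my own assumptions (layer sizes, ReLU activation, symmetric normalization), not the paper's implementation; only a downstream classifier would receive gradients.

```python
import numpy as np

def random_propagation(adj, x, num_layers=2, hidden=16, seed=0):
    """Forward pass with frozen, randomly drawn propagation weights.

    adj: (n, n) adjacency matrix; x: (n, d) node features.
    The random weights are sampled once and never updated, so the
    propagation stack costs no backward pass at all.
    """
    rng = np.random.default_rng(seed)
    # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    h = x
    for _ in range(num_layers):
        # Frozen random weights, scaled to keep activations well-behaved
        w = rng.standard_normal((h.shape[1], hidden)) / np.sqrt(h.shape[1])
        h = np.maximum(a_norm @ h @ w, 0.0)  # propagate, then ReLU
    return h  # embeddings for the only trained module, e.g. a linear head
```

In a full pipeline, these frozen embeddings would be fed to a small trainable classifier, which is where the reported training-time savings come from.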
2021

- Indic Languages Automatic Speech Recognition using Meta-Learning Approach
Anugunj Naman, Kumari Deepshikha
4th International Conference on Natural Language and Speech Processing (ICNLSP), 2021
Recently, Conformer-based models have shown promising results in Automatic Speech Recognition
(ASR), outperforming transformer-based networks, while meta-learning has proven extremely
useful for training deep networks when data are scarce. In this work, we use Conformers to
model both the global and local dependencies of an audio sequence in a very
parameter-efficient way, and we meta-learn the initialization parameters from several
languages during training to attain fast adaptation to unseen target languages using the
model-agnostic meta-learning (MAML) algorithm. We analyse and evaluate the proposed approach
on seven different Indic languages. Preliminary results show that the proposed method,
MAML-ASR, comes significantly close to state-of-the-art monolingual Automatic Speech
Recognition for all seven Indic languages in terms of character error rate.
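The meta-learning loop above can be sketched on a deliberately tiny problem. This is a first-order MAML step for 1-D linear regression, my own illustrative stand-in for the Conformer ASR models in the paper (full MAML also backpropagates through the inner gradient step; I omit that here for brevity):

```python
import numpy as np

def maml_step(theta, tasks, inner_lr=0.01, outer_lr=0.01):
    """One first-order MAML meta-update for 1-D linear regression tasks.

    Each task is a pair (x, y) with loss mean((x * theta - y)^2).
    Inner loop: adapt theta with one gradient step per task.
    Outer loop: update the shared initialization with the gradient
    of the post-adaptation loss, averaged over tasks.
    """
    meta_grad = 0.0
    for x, y in tasks:
        grad = np.mean(2 * (x * theta - y) * x)      # inner-loop gradient
        theta_adapted = theta - inner_lr * grad      # fast adaptation
        # First-order outer gradient at the adapted parameters
        meta_grad += np.mean(2 * (x * theta_adapted - y) * x)
    return theta - outer_lr * meta_grad / len(tasks)
```

In the ASR setting, each "task" would instead be a language, so the learned initialization adapts quickly to an unseen target language from few examples.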
- A Multimodal Author Profiling System for Tweets
Chanchal Suman, Anugunj Naman, Sriparna Saha, Pushpak Bhattacharyya
IEEE Transactions on Computational Social Systems, 2021
The rising usage of social media has motivated the invention of various methods of anonymous
writing, which has led to an increase in malicious and suspicious activities. This anonymity
makes it difficult to identify suspects. Author profiling deals with characterizing an author
through key attributes such as gender, age, language, dialect region variety, personality,
and so on. Identifying the gender of the author of a suspect document is a salient
author-profiling task, and the linguistic profile of a user can help determine his or her
demographics. Social media platforms such as Twitter, Facebook, and Instagram are used
regularly by users to share their daily activities. Moreover, users often post images along
with text, so the usage of multimodal information is very common nowadays. In this article,
the task of automatic gender prediction from multimodal Twitter data is posed as a
classification problem, and an efficient multimodal neural framework is proposed to solve it.
The popularly used BERT_base is utilized to learn an encoded representation of the text part
of a tweet, and the recently introduced EfficientNet is used to extract features from images.
Finally, a direct-product-based fusion strategy is applied to fuse the text and image
representations, followed by a fully connected layer that predicts the gender of a Twitter
user. PAN (Plagiarism analysis, Authorship identification, and Near-duplicate detection) 2018
author profiling data are used to evaluate the performance of our proposed approach. Our
model achieved accuracies of 82.05%, 86.22%, and 89.53% in the pure-image, pure-text, and
multimodal settings, respectively, outperforming the previous state-of-the-art works in all
cases. Moreover, a deep analysis is carried out to interpret the produced results: different
words that serve as clues for gender classification are identified, characterizing the
different gender classes.
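The direct-product fusion step can be sketched as follows. Reading "direct product" as an outer product of the two modality embeddings (my assumption; the embedding sizes are illustrative, not the actual BERT/EfficientNet dimensions from the paper):

```python
import numpy as np

def direct_product_fusion(text_emb, img_emb):
    """Fuse text and image embeddings via their outer (direct) product.

    The outer product captures every pairwise interaction between text
    and image features; flattening yields a single vector suitable for
    a fully connected classification layer.
    """
    fused = np.outer(text_emb, img_emb)  # shape (d_text, d_img)
    return fused.reshape(-1)             # flat vector for the FC layer
```

A downstream fully connected layer would then map this flattened interaction vector to the gender classes.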