Advancing healthcare through multimodal AI systems
University of Southern California
Multimodal AI • Explainable AI • Agentic LLMs • RAG
Los Angeles, CA
Integrating vision-language models with clinical knowledge for enhanced medical image interpretation.
Leveraging convolutional neural networks and computer vision for accurate skin lesion classification.
University of Southern California | Los Angeles, USA | Dec 2024 – Present
Abstract: Diabetic retinopathy (DR) remains a leading cause of blindness, necessitating early detection and continuous monitoring to prevent irreversible vision loss. However, existing AI models for DR lack interpretability and fail to incorporate dynamic patient data beyond static retinal images. We propose Agentic Multimodal Retrieval-Augmented Generation (AM-RAG), an AI-driven framework that integrates retinal imaging, electronic health records (EHRs), and lab trends with clinician-in-the-loop workflows. AM-RAG aligns imaging biomarkers (e.g., microaneurysms, exudates) with clinical risk factors (e.g., HbA1c, hypertension) using multimodal fusion for dynamic risk stratification. A retrieval engine enhances explainability by grounding predictions in similar historical cases, while agentic AI workflows prompt clinicians for missing data, improving decision support. Evaluated on the diverse RFMiD retinal image dataset, AM-RAG provides a holistic view of DR progression, offering interpretable insights (e.g., “Progression aligns with patients with HbA1c greater than 9%”). This approach advances multimodal, explainable AI for chronic disease management, providing a scalable, clinician-integrated solution for digital healthcare systems and personalized DR monitoring.
Keywords: Diabetic retinopathy, Multimodal AI, Explainable AI, RFMiD
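A minimal sketch of the fusion-and-retrieval idea behind AM-RAG, assuming precomputed image biomarker features and tabular EHR features; the synthetic data, function names, and simple nearest-neighbor retrieval below are illustrative stand-ins, not the paper's actual pipeline.

```python
# Illustrative sketch of AM-RAG-style late fusion + case retrieval (names are hypothetical).
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Assumed inputs for a historical cohort: retinal image biomarker features
# (e.g., CNN embeddings of microaneurysms/exudates) and EHR risk factors.
image_feats = rng.normal(size=(500, 16))           # stand-in imaging features
ehr_feats = np.column_stack([
    rng.normal(8.0, 1.5, 500),                     # HbA1c (%)
    rng.normal(135, 15, 500),                      # systolic BP (mmHg)
])
progressed = (ehr_feats[:, 0] > 9).astype(int)     # toy progression label

# Late fusion: scale each modality, then concatenate into one case vector.
fused = np.hstack([StandardScaler().fit_transform(image_feats),
                   StandardScaler().fit_transform(ehr_feats)])

# Retrieval engine: ground a new patient's risk in similar historical cases.
index = NearestNeighbors(n_neighbors=5).fit(fused)

def stratify(query_vec):
    """Return a risk estimate plus the retrieved cases that justify it."""
    _, idx = index.kneighbors(query_vec[None, :])
    neighbors = idx[0]
    risk = progressed[neighbors].mean()
    mean_hba1c = ehr_feats[neighbors, 0].mean()
    return risk, (f"{risk:.0%} of the 5 most similar cases progressed "
                  f"(mean HbA1c {mean_hba1c:.1f}%)")

risk, explanation = stratify(fused[0])
print(explanation)
```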
University of Southern California | Los Angeles, USA | Dec 2024 – Present
Abstract: Artificial Intelligence-driven medical image analysis has transformed healthcare by facilitating automated disease diagnosis and detection through advanced medical imaging technologies. This paper presents an agentic multimodal Retrieval-Augmented Generation (RAG) framework for interactive medical image analysis, combining deep learning architectures with clinical knowledge integration. The system employs a deep learning model for visual feature extraction, the Differential Analyzer Approach fused with deep learning (DAA-Deep) for selecting clinically significant features, and Contrastive Language-Image Pre-training (CLIP) embeddings for aligning visual and textual data. The RAG framework retrieves relevant medical knowledge from a structured database and generates detailed diagnostic reports, while supporting interactive follow-up dialogue for enhanced clinical decision-making. Validated on the HAM10000 dataset for skin lesion analysis, the system demonstrates state-of-the-art diagnostic accuracy alongside improved explainability. Its modular design, integrating DAA-Deep for feature selection, CLIP for multimodal alignment, and RAG for knowledge retrieval, ensures adaptability to diverse medical imaging domains. This work showcases the potential of agentic, multimodal systems to advance medical image analysis and improve healthcare outcomes.
Keywords: RAG, DAA-Deep, CLIP, HAM10000, Multimodal Learning
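A minimal sketch of the CLIP alignment and retrieval step, using the public Hugging Face CLIP checkpoint; the toy knowledge snippets and placeholder image are illustrative assumptions, and the system's structured medical database and report-generation stages are not shown.

```python
# Sketch: align a lesion image with textual knowledge via CLIP, then retrieve
# the best-matching entries as RAG context (knowledge base here is a toy list).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical knowledge entries (real system: a structured medical database).
knowledge = [
    "Melanocytic nevus: benign, symmetric, uniform pigmentation.",
    "Melanoma: asymmetric lesion with irregular borders and color variation.",
    "Basal cell carcinoma: pearly nodule, often with telangiectasia.",
]

# Placeholder image; in practice, load a HAM10000 lesion image instead.
image = Image.new("RGB", (224, 224), "white")
inputs = processor(text=knowledge, images=image,
                   return_tensors="pt", padding=True)

with torch.no_grad():
    img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                      attention_mask=inputs["attention_mask"])

# Cosine similarity in the shared embedding space ranks knowledge entries.
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
scores = (img_emb @ txt_emb.T).squeeze(0)

top = scores.argsort(descending=True)[:2]
context = [knowledge[i] for i in top]
print("Retrieved context for report generation:", context)
```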
JSS Science and Technology University | Mysuru, India | Jan 2023 – May 2023
Abstract: Skin cancer is a globally prevalent and potentially life-threatening disease, making early detection and treatment critical. Traditional diagnosis relies heavily on visual inspection by dermatologists, which is prone to human error and constrained by limited resources. Recent advances in machine learning and deep learning offer promising routes to automating skin cancer detection, but the field still lacks effective feature selection methods that improve model performance, interpretability, and computational efficiency. This work proposes a novel fusion of the Differential Analyzer Approach (DAA) with deep learning models to improve the precision and efficiency of skin cancer detection. Our pipeline combines the ISIC 2018 skin cancer image dataset, convolutional neural network (CNN) architectures, and an integrated DAA feature-selection mechanism. The proposed DAA-Deep model outperforms conventional deep learning baselines, achieving 96% accuracy and an AUC (Area Under the ROC Curve) of 0.99.
Index Terms: DAA-Deep, CNN, DAA, ResNet50, Skin cancer, Deep learning, Feature extraction
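A minimal sketch of the two-stage idea, pairing a ResNet50 feature extractor with a feature-selection step; the separability score below is a simple stand-in (the actual Differential Analyzer scoring is not detailed here), and the batch data is synthetic.

```python
# Sketch: CNN feature extraction + a simple feature-selection stage, standing in
# for the DAA step (the real DAA scoring rule is not shown in this abstract).
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

# ResNet50 backbone as a fixed 2048-d feature extractor (classifier head removed).
backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = nn.Identity()
backbone.eval()

# Stand-in batch of preprocessed ISIC 2018 images and binary labels.
images = torch.randn(32, 3, 224, 224)
labels = torch.randint(0, 2, (32,))

with torch.no_grad():
    feats = backbone(images)                      # shape: (32, 2048)

# Toy separability score per feature: |mean difference| between classes,
# used here only to illustrate selecting a discriminative subset.
mu0, mu1 = feats[labels == 0].mean(0), feats[labels == 1].mean(0)
scores = (mu0 - mu1).abs()
top_k = scores.topk(256).indices                  # keep 256 most separable dims
selected = feats[:, top_k]
print(selected.shape)                             # torch.Size([32, 256])
```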
JSS Science and Technology University | Mysuru, India | April 2022 – November 2022
Abstract: Integrated health apps are accessible to users anytime and anywhere, and have become part of the broader movement toward mobile health programs in healthcare. This work develops an Integrated Health Application that offers users a convenient, easy-to-use experience and can replace the current system while adding several extra features. Its scope covers three core health features: heart disease prediction using machine learning, skin cancer classification using deep learning, and real-time tracking of and notifications about COVID-19 vaccine availability.
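A minimal sketch of the heart-disease-prediction feature, assuming a UCI-style tabular dataset; the synthetic columns, labels, and model choice are illustrative assumptions, not the application's actual implementation.

```python
# Sketch: tabular heart-disease classifier (toy data; real app would load
# a clinical dataset such as the UCI heart disease data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Stand-in features: age, resting BP, cholesterol, max heart rate.
X = np.column_stack([
    rng.integers(29, 77, 300),
    rng.normal(130, 17, 300),
    rng.normal(246, 50, 300),
    rng.normal(150, 23, 300),
])
y = (X[:, 0] + 0.2 * X[:, 1] > 78).astype(int)    # toy label for demonstration

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```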
Competed at the all-India level in a competition hosted by MetaMorph and Microsoft for Startups | Year: 2023
Advanced through the rounds by demonstrating strong problem-solving skills | Year: 2022
Unacademy, Bengaluru, India | May 2024 – July 2024
eSamudaay, Bengaluru, India | November 2022 – May 2024
Accolite Digital, Bengaluru, India | April 2022 – November 2022