About Me
Ankit Shah is an AI researcher and the LLM Architecture Associate Director at Accenture’s Center for Advanced AI, and holds a Ph.D. from the Language Technologies Institute at Carnegie Mellon University. His research focuses on computational audition, multimodal and audio-visual machine learning, weakly- and semi-supervised sound event detection, learning from imprecise labels, and large language model (LLM) systems and agents.
Research areas: computational audition · multimodal & audio-visual machine learning · weakly-/semi-supervised sound event detection · learning from imprecise/noisy labels · LLM systems, agents & generative AI · multimedia analysis.
You live only once; if you do it right, once is enough. Be the change you want to see in the world.
Greetings!
Ankit currently works at Accenture’s Center for Advanced AI as LLM Architecture Associate Director. There, he led the development of Fortune Analytics, a generative-AI business-intelligence platform built with Fortune Media that answers natural-language questions across decades of Fortune 500 data. The platform won the iF Design Award 2025 (User Interface) and led to European patent EP4660825A1. In November 2025 he presented Accenture’s reasoning-model work on Microsoft Foundry at Microsoft Ignite 2025.
He obtained his Ph.D. from the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University. His doctoral thesis, “Computational Audition with Imprecise Labels”, is available on KiltHub, and his defense presentation is on YouTube. He was advised by Prof. Bhiksha Raj and Prof. Rita Singh as part of CMU’s Machine Learning for Signal Processing (MLSP) Group. In April 2023, Ankit’s team won the top prize at the NYC AI GPT Hackathon for their project “FactGPT,” an automated fact-checker for language models. Prior to his Ph.D., Ankit worked as a Deep Learning Scientist at ReviveMed, which performs AI-driven drug discovery to find novel therapeutics for metabolic diseases. The company leverages tens of thousands of metabolomic data points to discover novel biology and impactful therapeutics.
Ankit is fascinated by the applications of multimedia analysis and its growing importance in today’s world, where more than 90 percent of data consumed is multimedia. To pursue the field, he completed a Master’s in Language Technologies at the School of Computer Science at Carnegie Mellon University. During the master’s program, Ankit was advised by Prof. Alexander Hauptmann in the Language Technologies Institute. His research interests lie in machine learning and signal processing, with a focus on audio and multimedia analysis. His work on deciphering gun-type information through acoustic analysis of gunshot recordings, and on Deep Intermodal Video Analytics to recognize activities in large-scale surveillance videos, was well received.
Previously, he worked as a verification engineer at ARM, a leading semiconductor IP provider. He is familiar with ARM protocols such as AMBA and AXI, CPU specifications, the ARM architecture, and power architecture. At ARM, he designed in-house verification tools for simultaneous verification of sub-systems. These enhanced the scalability of other in-house tools and reduced rework across projects, thereby improving the throughput of his team’s deliverables. He also designed a power controller used across various projects at ARM. On the strength of this work, his supervisors gave him responsibility for delivering the first chip-to-chip subsystem verification on an ARM platform, which Ankit executed successfully as Project Lead. Ankit was rated among the top 5% of engineers globally at ARM for two consecutive years.
Prior to that, he graduated with a Bachelor of Technology (B.Tech) in Electronics and Communication (EC) from the National Institute of Technology Karnataka (NITK), Surathkal.
Selected Publications
A selection of representative work (see the full list of 63 publications):
- Imprecise label learning: A unified framework for learning with various imprecise label configurations — Advances in Neural Information Processing Systems 37 (NeurIPS 2024)
- LLM Unlearning via Loss Adjustment with Only Forget Data — International Conference on Learning Representations (ICLR 2025)
- Understanding and mitigating the label noise in pre-training on downstream tasks — International Conference on Learning Representations (ICLR 2024)
- Improving LLM Retrieval with GraphRAG-SF: Semantic Filtering for Graph-Based RAG Systems — IEEE International Conference on Big Data (2025)
- Importance of negative sampling in weak label learning — IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
- Overview of the Tenth Dialog System Technology Challenge: DSTC10 — IEEE/ACM Transactions on Audio, Speech, and Language Processing (2024)
- Multimodal Behavioral Markers Exploring Suicidal Intent in Social Media Videos — 21st ACM International Conference on Multimodal Interaction (ICMI 2019)
- Learning Sound Events From Webly Labeled Data — 28th International Joint Conference on Artificial Intelligence (IJCAI 2019)
- Computational Audition with Imprecise Labels — Carnegie Mellon University PhD Thesis (2024)
Ankit likes to read around 20 books a year and listen to music. His playlists have over 60 thousand followers on Spotify: Hindi Songs Collection and English Songs Collection.
The same playlists are also on YouTube Music: English Songs Collection and Hindi Songs Collection.