Varun Gangal


I am an AI researcher working on LLM post-training and evaluation at Amazon AGI. Before Amazon AGI, I did research at ASAPP Inc (2022-2024) on aligning and evaluating LLMs for end-to-end, generative-AI-based customer support agents. Prior to that, I did my PhD (2016-2022) at the Language Technologies Institute, School of Computer Science, CMU, where I was advised by Prof. Eduard Hovy. My research is broadly on language generation, with specific interests in style transfer, data-to-text generation, narrative generation, and low-resource and creative generation.

At NeurIPS 2023, New Orleans, at the WANT workshop, presenting my work on DYAD - a block-sparse, GPU-aware approximation to the MLP layer.

In the past few years, I have also been involved in co-organizing many collaborative NLP research efforts, such as:
  1. The Controllable Generative Modelling in Language and Vision Workshop (CtrlGen) at NeurIPS'21, which aimed to explore controllability, disentanglement and manipulation for language and vision tasks. We solicited submissions for papers as well as demos. Check out the proceedings page for talk videos, slides, accepted paper info and more.
  2. The GEM benchmark, its associated workshop at ACL'21, and the accompanying paper, for better and standardized evaluation and comparison of NLG models and systems - a parallel to GLUE for generation.
  3. The challenge sets submodule of GEM, where we built domain-shifted sets under a unified theme for the NLG tasks in our benchmark, using various perturbation [backtranslation], sub-selection [length] and other domain shift [diachronic] strategies. Our work was accepted at the NeurIPS'21 Datasets & Benchmarks Track!
  4. The NL-Augmenter participative repository and benchmark, which provides a structure for NLPers to contribute and evaluate task-specific data augmentations, a.k.a. transformations, as well as subset selection strategies, a.k.a. filters. We aim to create a large, usable suite (~140 and counting!) of transformations and filters leveraging wisdom-of-the-crowd - opening the door to more systematic analysis and deployment of data augmentation and robustness evaluation.

Before CMU, I graduated with a Dual Degree (B.Tech+M.Tech) in Computer Science and Engineering from IIT Madras in 2016. For my thesis, I was advised by Prof. Ravindran and Ramasuri Narayanam from IBM Research, working on social network analysis problems for unconventional social networks, such as centrality measures for signed networks, influence maximization for hypergraphs, and multiplex graphs for modeling citation networks.

For an overview of my published research and preprints, check out my Google Scholar profile.

Email  /  Research Statement (Nov 2021)  /  Thesis Proposal (Mar 2022)  /  Google Scholar  /  LinkedIn  /  Twitter  /  GitHub