Sankaran Vaidyanathan

shun-ka-run • /ʃʌŋkəˈɹʌn/ • சங்கரன்

I am a PhD student at the College of Information and Computer Sciences, UMass Amherst, where I am advised by David Jensen. My research spans causal inference, probabilistic machine learning, and reinforcement learning. More recently, I have also worked on approaches to mechanistic interpretability in LLMs.

I aim to create tools for analyzing and evaluating the behaviour of complex AI systems, with a focus on problems in blame and responsibility attribution, explainability, and alignment with human norms. Unlike most applications of causal inference, which involve objective experimentation and interaction with the external world, these problems are traditionally grounded in subjective human judgments. Such judgments involve norms that can be very counterintuitive and that pose a significant challenge to purely statistical approaches in causal inference. By developing formal approaches for modeling norms and inference algorithms that align with them, I hope to support open and scientific evaluation and auditing of AI systems, and the growth of AI systems that better align with human norms.

selected publications

  1. arXiv
    Automated Discovery of Functional Actual Causes in Complex Environments
    Caleb Chuck*, Sankaran Vaidyanathan*, Stephen Giguere, and 3 more authors
    arXiv preprint arXiv:2404.10883, 2024
  2. arXiv
    Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
    Aman Singh Thakur*, Kartik Choudhary*, Venkat Srinik Ramayapally*, and 2 more authors
    arXiv preprint arXiv:2406.12624, 2024
  3. arXiv
    Adaptive Circuit Behavior and Generalization in Mechanistic Interpretability
    Jatin Nainani*, Sankaran Vaidyanathan*, AJ Yeung, and 2 more authors
    arXiv preprint arXiv:2411.16105, 2024