
 Elita Lobo

  PhD Student
  University of Massachusetts Amherst
  elobo@umass.edu
  Google Scholar
  GitHub
  LinkedIn

Bio

I am currently a 3rd-year Ph.D. student in the College of Information and Computer Sciences at UMass Amherst, working with Prof. Yair Zick at FED.
My research focuses on trustworthy reinforcement learning (RL) and machine learning, with a particular emphasis on developing practical, fair, and robust algorithms. Before starting my Ph.D., I completed a Master's degree in Computer Science at UMass Amherst in 2020, during which I had the privilege of working with external collaborators Dr. Marek Petrik and Dr. Hima Lakkaraju. I also spent two years in industry as a Software Engineer at Flipkart and Endurance International Group. I graduated from NIT Durgapur with a B.Tech in Electronics and Communication Engineering.

Research Areas

Robust and Fair Decision-Making Systems: My PhD research centers on reinforcement learning and resource allocation under uncertainty, adversarial conditions, and fairness constraints. In Soft-Robust Algorithms for Batch Reinforcement Learning, I propose the soft-robust criterion as a principled alternative to the standard percentile criterion, which often results in overly conservative policies, and develop two approximate algorithms that, both theoretically and empirically, yield more balanced and effective decision-making. Building on this, Percentile Criterion Optimization in Offline Reinforcement Learning introduces a Value-at-Risk-based dynamic programming approach that optimizes robust policies without constructing explicit uncertainty sets, allowing less conservative, uncertainty-aware policies to be learned. In Data Poisoning Attacks on Off-Policy Policy Evaluation Methods, we present the first known data poisoning framework targeting off-policy evaluation; using influence functions, I show how small, targeted data perturbations can significantly skew policy value estimates, underscoring the need for robust evaluation techniques. In Fair and Welfare-Efficient Constrained Multi-Matchings under Uncertainty, I address resource allocation when agent utilities are unknown, using both stochastic and robust optimization to balance fairness and efficiency. These methods are validated on a real-world reviewer assignment dataset.
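For readers unfamiliar with the two robustness criteria contrasted above, here is a compact sketch in notation of my own choosing, where ρ(π, P) denotes the return of policy π under transition model P and f is a posterior distribution over models. This follows the standard formulations in the robust RL literature rather than the exact statements in the papers:

```latex
% Percentile criterion: maximize the return guarantee y that holds
% with probability at least 1 - \delta over models P drawn from the
% posterior f (a Value-at-Risk over model uncertainty)
\[
\max_{\pi}\ \max_{y \in \mathbb{R}}\ y
\quad \text{s.t.} \quad
\Pr_{P \sim f}\!\big[\rho(\pi, P) \ge y\big] \;\ge\; 1 - \delta
\]

% Soft-robust criterion: a convex combination of average-case and
% worst-case return over an ambiguity set U, where \alpha \in [0, 1]
% controls the degree of conservatism
\[
\max_{\pi}\ \alpha\, \mathbb{E}_{P \sim f}\big[\rho(\pi, P)\big]
\;+\; (1 - \alpha)\, \min_{P \in U} \rho(\pi, P)
\]
```

Intuitively, the percentile criterion guards against all but the worst δ-fraction of plausible models, which is what makes it prone to over-conservatism, while the soft-robust criterion trades off expected and worst-case performance through α.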

Large Language Models (LLMs): As an additional line of research, I have explored the reasoning capabilities, fine-tuning dynamics, and unlearning behavior of LLMs. In On the Impact of Fine-Tuning on Chain-of-Thought Reasoning, I investigate how fine-tuning influences LLM reasoning. The study shows that while fine-tuning improves task-specific performance, it can reduce the consistency and faithfulness of chain-of-thought reasoning across datasets, highlighting trade-offs between optimization and reasoning integrity. I am also developing counterfactual verifiers for mathematical and logical reasoning tasks, using counterfactual data augmentation and contrastive loss to enhance robustness. In Matching Table Metadata with Business Glossaries Using Large Language Models, I apply LLMs to align enterprise metadata with business glossaries. This work demonstrates that LLMs can infer complex relationships between table column names and glossary descriptions without manual tuning, enabling scalable metadata alignment in restricted-access environments.

Fairness-Centric and Interpretable Machine Learning: I also contributed to On Welfare-Centric Fair Reinforcement Learning, where we introduce a framework in which an agent receives vector-valued rewards from multiple beneficiaries and optimizes a specified welfare function. We show that welfare-optimal policies are inherently stochastic and start-state dependent, and present the E4 learner, which operates within an adversarial-fair learning framework to manage exploration and maintain welfare guarantees. Additionally, I contributed to Axiomatic Aggregations of Abductive Explanations, which addresses the challenge of multiple valid abductive explanations per data point by proposing aggregation techniques that generate feature importance scores. Grounded in cooperative game theory (via power indices) and causal strength measures, these techniques are axiomatically characterized to meet desirable interpretability properties. Unlike popular methods such as SHAP and LIME, the proposed explanations exhibit improved robustness against adversarial perturbations.
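As a rough illustration of the welfare-centric setup described above (notation mine; the egalitarian and Nash examples are standard choices from the fairness literature, not necessarily the exact objectives used in the paper): each beneficiary i has its own value function V_i^π, and the agent maximizes a scalar welfare function w of the vector of values.

```latex
% Welfare-centric objective: maximize a welfare function w applied
% to the per-beneficiary values of the policy
\[
\max_{\pi}\ w\big(V_1^{\pi}, \dots, V_n^{\pi}\big)
\]

% Two common welfare functions: egalitarian (max-min) welfare and
% Nash (product) welfare
\[
w_{\min}(v) = \min_{i} v_i,
\qquad
w_{\mathrm{Nash}}(v) = \prod_{i=1}^{n} v_i
\]
```

Because w couples the beneficiaries' values nonlinearly, the optimal policy generally cannot be recovered from any single scalar reward, which is consistent with the paper's observation that welfare-optimal policies are stochastic and start-state dependent.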

Education

Ph.D. in Computer Science
University of Massachusetts Amherst.
Started in Fall 2022.
Research: Robust Decision-Making Systems. Advised by Prof. Yair Zick.
Master's in Computer Science
University of Massachusetts Amherst. 2018.
Thesis: Soft Robust Algorithms for Batch RL.
Bachelor of Technology (B.Tech) in Electronics and Communications Engineering
National Institute of Technology, Durgapur. 2016.

Experience

  1. Amazon (Central ML Team), Seattle. Spring 2024
  2. Harvard Business School, MA. Summer 2024
  3. Microsoft Research, India. Summer 2023
  4. IBM Research, Yorktown Heights, NY. Summer 2023
  5. IBM Watson, Yorktown Heights, NY. Summer 2022
  6. IBM Watson, Yorktown Heights, NY. Summer 2021
  7. Harvard Business School, MA. Winter 2020-2021
  8. Flipkart, Bangalore, India. Aug 2017 - Jul 2018
  9. Endurance International Group, Bangalore, India. Jul 2016 - Aug 2017

Publications

    Please visit my Google Scholar page for an updated list of publications.

    1. Elita Lobo, Nhan Pham, Dharmashankar Subramanian, Tejaswini Pedapati. A Metahyperparameter Tuning Framework for Reinforcement Learning.
      In-Submission: Patents 2023 - Reinforcement Learning
      [Under Review]
    2. Elita Lobo, Nhan Pham, Oktie Hassanzadeh, Dharmashankar Subramanian, Nandana Sampath Mihindukulasooriya, Long Vu. A novel system for metadata to glossary matching in data lakes using human feedback and generative models.
      In-Submission: Patents 2024 - Data Systems
    3. Elita Lobo, Chirag Agarwal, Hima Lakkaraju. On the Impact of Fine-Tuning on Chain-of-Thought Reasoning in LLMs.
      NAACL 2025 - Large Language Models
      [Paper]
    4. Anmol Mekala, Vineeth Dorna, Shreya Dubey, Abhishek Lalwani, David Koleczek, Mukund Rungta, Sadid Hasan, Elita Lobo*. Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models.
      COLING 2024 - Knowledge Unlearning
      [Paper]
    5. Elita Lobo*, Justin Payan*, Cyrus Cousins, Yair Zick. Fair and Welfare-Efficient Resource Allocation under Uncertainty.
      NeurIPS 2024 - Fairness & Optimization
      [Paper]
    6. Cyrus Cousins, Elita Lobo, Kavosh Asadi, Michael L. Littman. On Welfare-Centric Fair Reinforcement Learning.
      RLC 2024 - Reinforcement Learning
      (Outstanding Paper)
      [Paper]
    7. Vignesh Viswanathan, Elita Lobo, Yacine Izza, Gagan Biradar, Yair Zick. Axiomatic Aggregations of Abductive Explanations.
      AAAI 2023 - Explainable AI
      [Paper]
    8. Elita Lobo, Cyrus Cousins, Marek Petrik, Yair Zick. Percentile Criterion Optimization in Offline Reinforcement Learning.
      NeurIPS 2023 - Offline RL
      [Paper]
    9. Elita Lobo, Harvineet Singh, Cynthia Rudin, Himabindu Lakkaraju. Data Poisoning Attacks on Off-Policy Policy Evaluation Methods.
      UAI 2022 - Robustness in RL
      (Top 5%)
      [Paper]
    10. Elita Lobo, Oktie Hassanzadeh, Nhan Pham, Nandana Mihindukulasooriya, Dharmashankar Subramanian, Horst Samulowitz. Matching Table Metadata with Business Glossaries Using Large Language Models.
      Ontology Matching Workshop 2023 - Data Integration
      [Paper]
    11. Elita Lobo, Mohammad Ghavamzadeh, Marek Petrik. Soft-robust Algorithms for Batch Reinforcement Learning.
      IJCAI R2AW Workshop 2021 - Robust RL
      [Paper]
    12. Elita Lobo, Yash Chandak, Dharmashankar Subramanian, Josiah Hanna, Marek Petrik. Behavior Policy Search for Risk Estimators in RL.
      NeurIPS Safe-RL Workshop 2021 - Safe RL
      [Paper]
    13. Elita Lobo, Harvineet Singh, Cynthia Rudin, Himabindu Lakkaraju. Data Poisoning Attacks on Off-Policy Policy Evaluation Methods (Workshop version).
      ICLR PAIR2Struct Workshop 2022 - Security in ML
      [Paper]

Skills

Published Software

Mentorship Experience