IISc Bangalore

Exploring watermarking LLMs with Dr. Danish Pruthi (Summer 2023)

I worked with Professor Danish Pruthi on evaluating the effects of watermarking large language models on a variety of downstream tasks. We show that performance on classification and generation tasks (such as summarization and translation) can be disproportionately affected by a popular watermarking strategy, more so than the accompanying increase in perplexity might suggest. We also performed detailed analyses and explored potential augmentations to the watermarking scheme for reclaiming lost performance on these tasks. We have released our findings as a preprint.