Institute for Interdisciplinary Information Sciences, Tsinghua University

Algorithms and Protocols for a Trustworthy Cyberspace in the Era of Large Language Models

Speaker: Tianxing He, University of Washington
Time: 2023-10-18, 15:00-16:00
Venue: FIT 1-222
Abstract:

The widening adoption of large language models (LLMs) in the cloud brings urgent problems related to privacy and social engineering. Focusing on the user-server interactions involved in prompted generation, we propose protocols and algorithms to address two key issues: (1) What if the user wants to keep the generated text private and hidden from a peeking server? We propose LatticeGen, a cooperative framework in which the server still handles most of the computation while the user controls the sampling operation. The key idea is that the user mixes the true generated sequence with noise tokens and hides it in a noised lattice. (2) How can we control the proliferation of LLM-generated text? We propose SemStamp, a semantic watermark algorithm that robustly imposes hidden patterns on generated text. These patterns are imperceptible to humans but make the outputs of LLMs algorithmically identifiable as synthetic. We design the watermark to be robust to sentence-level paraphrase attacks. The slides will be in English, but the talk and Q&A will be given in Chinese.

Bio:

Tianxing He is a postdoc at UW, where he works with Yulia Tsvetkov on natural language generation. Before UW, he completed his PhD at MIT, supervised by James Glass. He did his master's at the SJTU SpeechLab, supervised by Kai Yu. He is also from ACM 10 (SJTU). He currently works on developing algorithms and protocols for a trustworthy cyberspace in the era of large language models. During his PhD, he worked towards a better understanding of how current large language models work. Relatedly, he is interested in monitoring and detecting the behaviors of language models under different scenarios, and in approaches to fixing undesirable behaviors.