Institute for Interdisciplinary Information Sciences
News: in Spring 2016 I’m starting as an Assistant Professor in Xiaoming Sun’s group at the Chinese Academy of Sciences. My new website is here; I will not be updating this website after March 2016.
I finished my Ph.D. in August 2015 and defended in December. In graduate school, I studied computer science and mathematics under the supervision of Periklis A. Papakonstantinou at the Institute for Theoretical Computer Science (Turing Award winner Andrew Yao’s institute) at Tsinghua University. I was also a member of, and contributor to, the Laboratory for the Study of Randomness in Computation and Cryptographic Complexity (RC^3/IIIS) from its establishment.
My undergraduate study was in the elite Tsinghua CS class, a.k.a. Yao Class. I was offered direct admission to Yao Class on the strength of a Gold Medal in the CMO, and later direct admission to the Tsinghua Ph.D. program as well.
I have a broad interest in cryptography and complexity theory, especially in the foundations of cryptography and streaming algorithms. My CV is available here.
My Research on Crypto & Randomness over Big Data
Streaming Randomness Extraction:
A randomness extractor extracts high-quality (i.e. almost uniform) randomness from a weakly random source, with the help of a short random seed. We aim to implement extractors in a streaming fashion, where both the local memory and the ability to access the input (i.e. the weak source) are severely limited. In particular, within small space (e.g. poly-log(n) for input length n), an algorithm computing a randomness extractor is allowed only a very small (e.g. log log n) number of passes (scanning from left to right) over the input and an auxiliary read/write external stream.
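As a toy illustration of the kind of object involved (not the construction from this work), the sketch below evaluates a 2-universal affine hash h_{a,b}(x) = ((a*x + b) mod p) mod 2^m in a single left-to-right pass over the source, keeping only O(log p) bits of state; by the leftover hash lemma, such a family is a seeded extractor. All names and parameters here are hypothetical.

```python
import random

# Toy seeded extractor sketch (illustrative only, not the paper's construction).
# The 2-universal family h_{a,b}(x) = ((a*x + b) mod p) mod 2^m extracts
# nearly uniform bits from any source with enough min-entropy, and the hash
# can be folded into the input one bit at a time -- a single streaming pass.

P = (1 << 61) - 1  # a Mersenne prime, so all state fits in a few machine words

def streaming_extract(bit_stream, seed, m=16):
    """One left-to-right pass over bit_stream; state is acc plus the seed (a, b)."""
    a, b = seed
    acc = 0
    for bit in bit_stream:          # constant work and O(log P) memory per step
        acc = (acc * 2 + bit) % P   # fold the next input bit into the state
    return ((a * acc + b) % P) % (1 << m)  # output m extracted bits

# Usage: extract 16 bits from a biased (weakly random) 1000-bit source.
rng = random.Random(0)
seed = (rng.randrange(1, P), rng.randrange(P))                        # short uniform seed
weak_source = [1 if rng.random() < 0.9 else 0 for _ in range(1000)]   # biased bits
out = streaming_extract(iter(weak_source), seed)
```

Note that the seed is chosen independently of the source; only the accumulator, not the source itself, is ever stored.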
I started working on this topic back in 2011 in a joint work with Prof. Papakonstantinou.
When this work started, only a lower bound for the single-stream case was known. Later, we obtained the first lower bound for the setting where more than one stream is available (i.e. for stronger streaming algorithms). This happened in 2013 -- we acknowledge the contribution of, and thank, Professor Paul Beame for the collaboration back in the summer of 2012.
Perhaps surprisingly, instead of improving our (seemingly weak) lower bound, we obtained a working construction that matches it -- i.e. our lower bound is optimal. Since then, much of the effort has gone into experimentally realizing high-throughput streaming extractors and testing the quality of the produced outputs. The numbers measured in these experiments are spectacular.
This research is now complete, and we are putting it together in one integrated writeup containing both the extensive experimental part and the theoretical part, to be submitted soon.
Streaming Cryptography:
The goal here is to implement cryptographic primitives, e.g. one-way functions, pseudorandom generators, and public-key systems, in a streaming fashion. In particular, we consider small-space (e.g. poly-logarithmic) algorithms making only a small (e.g. constant) number of passes over the input (e.g. the key/message). This is not to be confused with stream ciphers, since here the key is too long to be stored in the internal memory.
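To make the computational model concrete, here is a minimal, self-contained sketch (my simplified rendering, not any construction from the papers) of a machine with O(1) local memory that makes one-way passes over streams; the names `Stream` and `two_pass_parity_mask` are purely illustrative.

```python
# A minimal sketch (simplified, illustrative) of the multi-pass, multi-stream
# model: tiny local memory, one-way left-to-right passes, and an auxiliary
# read/write work stream alongside the read-only input.

class Stream:
    """A tape scanned left to right; rewinding starts a new pass."""
    def __init__(self, cells=None):
        self.cells = list(cells or [])
        self.pos = 0
        self.passes = 0

    def rewind(self):
        self.pos = 0
        self.passes += 1

    def read(self):
        if self.pos >= len(self.cells):
            return None
        c = self.cells[self.pos]
        self.pos += 1
        return c

    def write(self, c):
        # overwrite-or-append at the head, then advance (read/write stream)
        if self.pos < len(self.cells):
            self.cells[self.pos] = c
        else:
            self.cells.append(c)
        self.pos += 1

def two_pass_parity_mask(key):
    """Demo: pass 1 copies the key to the work stream; pass 2 XORs each bit
    with the running parity of the preceding bits, using O(1) local memory."""
    inp, work = Stream(key), Stream()
    inp.rewind(); work.rewind()
    while (b := inp.read()) is not None:   # pass 1: one-way scan of the input
        work.write(b)
    work.rewind()
    parity, out = 0, []
    for _ in range(len(key)):              # pass 2: one-way scan of work stream
        b = work.read()
        out.append(b ^ parity)
        parity ^= b
    return out, inp.passes + work.passes
```

The point of the toy function is only the resource accounting: at no time does the machine hold more than a constant number of bits locally, and the total number of passes is fixed in advance.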
In joint work with Papakonstantinou, I found a way to construct one-way functions computable by a log-space streaming machine using only 5 passes over 2 read/write streams, assuming there is a one-way function computable in NC^1. We have also fleshed out a generic construction of streaming pseudorandom generators from any streaming one-way function, at an overhead of 2 additional passes. Moreover, assuming the cryptographic hardness of decoding a random linear code, we obtain a simpler and more efficient construction of streaming one-way functions.
For public-key systems, we can achieve semantic security with both encryption and decryption computable in a streaming fashion.
On the impossibility side, we prove that streaming pseudorandom generators with super-linear stretch are impossible. Similar tools yield a lower bound on the private-key length of any semantically secure public-key encryption scheme with a streaming decryption algorithm.
The most exciting point is that we base cryptographic primitives on hardness assumptions without being able to compute their instances. For example, we can make use of the hardness of factoring without ever multiplying two large integers! This is quite counter-intuitive, and it is impossible via black-box constructions. To that end, highly non-standard non-black-box techniques (namely, randomized encodings) must be employed. Although the idea is similar in spirit to Cryptography in NC^0 by Applebaum-Ishai-Kushilevitz, streaming cryptography is technically orthogonal and very different.
Previously, I obtained some negative results in this area for specific types of constructions. For example, in joint work with Papakonstantinou, I showed that applying linear operators (in particular, a universal family of linear hash functions) to shrink the output of a pseudorandom generator may fail even to be one-way, which rules out specific methods for constructing length-efficient streaming cryptographic primitives.
I also work on function incompressibility, zero-knowledge/witness-indistinguishable PCPs, online story scheduling, and learning algorithms.