Network switches sometimes suffer from "microbursts", a phenomenon where a sudden surge of traffic causes long queuing delays and even packet drops due to a full queue, all within milliseconds. Data center and carrier network operators are aware of microbursts but struggle to pinpoint their root cause, since existing monitoring tools operate on a coarser time scale and report only queue length, not the contents of the queue. We present ConQuest, a queue analytics data structure that runs on programmable switches to identify the flows contributing significantly to queue buildup directly in the data plane, and to take targeted actions to mark, drop, or reroute these flows in real time. Our evaluation shows that ConQuest can accurately identify the bursty flows occupying significant queuing space, and can improve network flow completion time when targeted action is taken on those flows. In addition, we propose a novel setup that uses ConQuest to measure queues and microbursts in legacy, non-programmable network devices through link tapping.
Xiaoqi Chen is a third-year PhD student in the Department of Computer Science at Princeton University, advised by Prof. Jennifer Rexford. He received his Bachelor's degree from the Institute for Interdisciplinary Information Sciences (Yao class), Tsinghua University, in 2017. His research focuses on running approximate network measurements on programmable switches using P4, and his research interests also include data center networking, sketches, and network science.