Dan Stubbs, a UMass CMPSCI undergraduate
Computer Science Building, Room 140
Faculty Host: David Mix Barrington
The problem of finding characteristic elements of a data set naturally emerges in the context of massive data streams. Algorithms for exact or approximate order statistics using space sublinear in the input length are of great interest. We present two algorithms for finding order statistics in random order streams. The first extends a median-finding algorithm from 1980 due to Munro and Paterson and exactly finds elements of rank k in O(sqrt(klogn)) space with high probability. The second, from 2012, due to McGregor and Valiant, finds an additive n^(1/3+o(1)) approximation to the median storing only a constant number of counters.