I will illustrate how to view fragments produced in a high-throughput sequencing experiment as points in the plane. This collection of points forms a two-dimensional spatial Poisson process, which yields a model for random fragment coverage. I will then show how the successive jumps of the depth coverage function can be encoded as a random tree that is approximately a Galton-Watson tree with generation-dependent offspring distributions. Two applications of this theory will be presented: an algorithm for finding protein binding sites using ChIP-Seq data and a statistical test to address fragment bias in RNA-Seq data.
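To make the point-process picture concrete, here is a minimal sketch, not the speaker's construction. It assumes each fragment is recorded as a point (start, length), that the points form a homogeneous Poisson process on a rectangle, and that fragment lengths are therefore uniform on [0, max_length]. The function names and parameters (`simulate_fragments`, `coverage_jumps`, `depth_at`, `rate`, `max_length`) are hypothetical. The sketch only produces the jumps of the coverage depth function and does not attempt the tree encoding.

```python
import random


def simulate_fragments(genome_length, rate, max_length, seed=0):
    """Sample points (start, length) of a homogeneous Poisson process on
    [0, genome_length] x [0, max_length]; `rate` is points per unit area.

    Assumption for illustration: uniform fragment lengths.
    """
    rng = random.Random(seed)
    points = []
    start = 0.0
    # Projected onto the start axis, the points form a 1-D Poisson process
    # with intensity rate * max_length, so successive gaps are exponential.
    while True:
        start += rng.expovariate(rate * max_length)
        if start > genome_length:
            break
        points.append((start, rng.uniform(0.0, max_length)))
    return points


def coverage_jumps(points):
    """Return the sorted (position, +1 or -1) jumps of the coverage depth."""
    jumps = []
    for start, length in points:
        jumps.append((start, +1))           # fragment begins: depth goes up
        jumps.append((start + length, -1))  # fragment ends: depth goes down
    jumps.sort()
    return jumps


def depth_at(points, t):
    """Coverage depth at position t: number of fragments covering t."""
    return sum(1 for start, length in points if start <= t < start + length)
```

For example, `coverage_jumps(simulate_fragments(1000.0, 0.01, 50.0))` gives the sequence of up and down jumps whose running sum is the coverage depth along the genome.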
Colloquium tea precedes the talk in the Mathematics Department Lounge, Kidder 368, at 3:30. All are welcome.