Genomic networks represent a complex map of molecular interactions which are descriptive of the biological processes occurring in living cells. Identifying the small over-represented circuitry patterns in these networks helps generate hypotheses about the functional basis of such complex processes. Network motif discovery is a systematic way of achieving this goal. However, a reliable network motif discovery outcome requires generating random background networks which are the result of a uniform and independent graph sampling method. To date, there has been no sound practical method to numerically evaluate whether any network motif discovery algorithm performs as intended - thus it was not possible to assess the validity of resulting network motifs. In this work, we present IndeCut, the first and only method that allows characterization of network motif finding algorithm performance on any network of interest.
This method leverages a relatively new weak law of large numbers for the so-called "cut-norm," and then addresses the NP-hardness of computing the cut-norm by bounding it between a semi-definite relaxation of a quadratic integer program, and a random hyperplane projection algorithm.