Ulrike von Luxburg (Project Leader),
Debarghya Ghoshdastidar
(University of Tübingen)

Network analysis is an integral part of many scientific disciplines ranging from bioinformatics to social sciences. A key question in this field is whether two different networks are likely to have been generated from the same underlying process or not.

This question is often formulated in terms of statistical hypothesis testing. There has been a tremendous amount of research on this topic, and empirical observations are often used to validate the hypotheses that real networks fit certain mathematical models or that two networks are dissimilar since their degree distributions or diameters are different. It is natural to wonder if there is a sound theoretical basis for these claims. Surprisingly, there has been little research on the theoretical aspects of network testing, and it is often not clear whether one can meaningfully test large networks by observing only a few replicates, which is a typical setting in applications of network analysis.

In this project, we focus on the problem of two-sample hypothesis testing of networks and provide a clear understanding of the statistical complexity of testing small populations of networks, each defined on a large number of nodes. We study the following formal problem. Given two populations of random graphs, $G_1,\ldots,G_m$ and $H_1,\ldots,H_m$, the problem is to test whether both populations are generated from the same model or not. We first consider the setting where the graphs are generated from an inhomogeneous Erdős-Rényi (IER) model defined on a common set of $n$ vertices, that is, $G_1,\ldots,G_m \sim_{\text{iid}} \text{IER}(P)$ and $H_1,\ldots,H_m \sim_{\text{iid}} \text{IER}(Q)$, where $P,Q \in \mathbb{R}^{n\times n}$ denote the model parameters. Under this setup, we test whether \begin{align*} \mathcal{H}_{\text{null}}: P = Q \text{ or } \mathcal{H}_{\text{alternative}}: d(P,Q) > \rho, \end{align*} where $d$ is a metric, and $\rho\geq0$ is a scalar. In the setting of fixed $m$ and $n\to\infty$, we show that popular $\chi^2$-type tests are not reliable since the associated formal testing problem is not solvable for $m\ll n$ under a minimax testing framework. We also identify two testing problems, based on the Frobenius norm and the spectral norm of $(P-Q)$, that are solvable in the small sample, large graph regime, and derive minimax-optimal theoretical hypothesis tests.

We put theory into practice and develop practical two-sample tests based on asymptotic distributions [ ]. We show that these tests are both theoretically consistent in the small sample regime and exhibit good empirical performance for moderate sized networks.

We also study the more general problem of comparing two large graphs defined on different vertex sets. Standard tests in this setting are based on some network statistics. Using such testing principles, we provide a formal framework and demonstrate that the testing problem can be ill-posed, and network statistics may not distinguish highly dissimilar networks [ ].

3 results

**Practical Methods for Graph Two-Sample Testing**
In *Proceedings Neural Information Processing Systems*, Neural Information Processing Systems (NIPS 2018) , 2018 (inproceedings)

**Two-sample Hypothesis Testing for Inhomogeneous Random Graphs**
2017, arXiv preprint (arXiv:1707.00833) (article)

**Two-sample tests for large random graphs using network statistics**
In *Conference on Computational Learning Theory (COLT)*, Conference on Computational Learning Theory (COLT), 2017 (inproceedings)