Despite the ubiquity of Gaussian process regression in applied work, almost no theoretical results account for the fact that the parameters of the covariance kernel must be estimated jointly from the dataset. This gap in theoretical understanding calls into question whether Gaussian process regression should be used at all in important, e.g. safety-critical, applications. To gain some insight, we study the scenario where the scale parameter of the kernel is estimated using maximum likelihood. Our main result is a bound on the rate at which the Gaussian process can become overconfident as the size of the dataset increases. The analysis combines techniques from nonparametric regression and scattered data interpolation, and is joint work with Toni Karvonen, George Wynne, Filip Tronarp and Simo Sarkka.
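
To make the estimation step concrete, here is a minimal sketch of maximum likelihood for the scale parameter. It assumes, purely for illustration, noise-free data, a zero-mean prior and a kernel of the form sigma^2 * k0 with k0 a Gaussian kernel of fixed lengthscale; none of these choices is taken from the talk. Under those assumptions the estimate has the closed form sigma^2 = y^T K0^{-1} y / n. The function names, the test function and the small `jitter` term are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, z, lengthscale=0.2):
    # Assumed base kernel k0 with a fixed lengthscale (illustrative choice).
    d = x[:, None] - z[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def _whitened(x, y, lengthscale, jitter):
    # Cholesky factor L of K0 (plus jitter for numerical stability) and L^{-1} y.
    K0 = gaussian_kernel(x, x, lengthscale) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K0)
    return L, np.linalg.solve(L, y)

def scale_mle(x, y, lengthscale=0.2, jitter=1e-8):
    # Closed-form maximiser of the likelihood in sigma^2: y^T K0^{-1} y / n.
    _, alpha = _whitened(x, y, lengthscale, jitter)
    return float(alpha @ alpha) / len(x)

def neg_log_lik(sigma2, x, y, lengthscale=0.2, jitter=1e-8):
    # Negative log marginal likelihood of y under N(0, sigma2 * K0).
    n = len(x)
    L, alpha = _whitened(x, y, lengthscale, jitter)
    logdet_K0 = 2.0 * np.sum(np.log(np.diag(L)))
    quad = float(alpha @ alpha) / sigma2
    return 0.5 * (quad + logdet_K0 + n * np.log(sigma2) + n * np.log(2 * np.pi))

# Hypothetical dataset: noise-free evaluations of a deterministic function.
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * x)
sigma2_hat = scale_mle(x, y)
```

The posterior variance under this model is sigma2_hat times the variance obtained with the base kernel, so the size of sigma2_hat directly controls how wide or narrow the credible intervals are. That is why the behaviour of this estimate as the dataset grows matters for overconfidence.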