We consider the problem of approximating a function in a general nonlinear subset of L2 when only a weighted Monte Carlo estimate of the L2-norm can be computed. Standard concentration-of-measure arguments provide a worst-case bound on the probability of achieving a prescribed error with a given number of sample points. For model classes of tensor networks, however, this bound depends exponentially on the order of the network and is independent of the regularity of the sought function. Although this exponential behavior is not observed in many practical applications, it can indeed be demonstrated in numerical experiments. By restricting the model class to a neighborhood of the best approximation, we derive a new bound that exploits the regularity of the sought function and thereby reduces the number of samples required to reach a prescribed accuracy.
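For concreteness, the weighted Monte Carlo estimate of the L2-norm referred to above can be written as the following empirical (semi-)norm; the specific notation (sample points $x_i$, target measure $\rho$, sampling density $\mu$, and weight function $w$) is our assumption for illustration and is not fixed by the abstract:

```latex
% Weighted empirical estimate of the L^2(\rho)-norm, assuming samples
% x_1, \dots, x_n drawn i.i.d. from a sampling density \mu and
% importance weights w(x) = d\rho/d\mu(x) (notation assumed):
\|f\|_n^2 \;:=\; \frac{1}{n} \sum_{i=1}^{n} w(x_i)\, |f(x_i)|^2
\;\approx\; \int |f(x)|^2 \,\mathrm{d}\rho(x) \;=\; \|f\|_{L^2(\rho)}^2 .
```

Under this reading, the bounds discussed in the abstract control the probability that $\|f\|_n^2$ deviates from $\|f\|_{L^2(\rho)}^2$ uniformly over the model class.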