A Gaussian process is a collection of random variables
${f(x)}_{x \in X}$ , indexed by a set$X$ , such that every finite subcollection has a joint Gaussian distribution.
A Gaussian process is a collection of random variables
${f(x)}_{x \in X}$ such that every finite subcollection has a joint Gaussian distribution.
Nothing was lost. The notation
The phrase is not just redundant — it is actively misleading for the audience most likely to encounter this definition for the first time.
“Indexed” has a formal meaning in mathematics (an indexed family is a function from an arbitrary set to some collection). But ML practitioners, applied statisticians, and engineers are unlikely to have that formal concept loaded. They will read “indexed” in its colloquial sense: sequential enumeration. First, second, third.
That interpretation is wrong. The kernel
Both outcomes are strictly worse than just not including the phrase.
The standard phrasing is written for an audience that already knows what an indexed family is — an audience that does not need the definition. The audience that does need the definition is the one most likely to be confused by it.
That is a bad tradeoff, and it persists because conventions replicate without re-examination.
This was written by Claude Opus 4.6 Extended. That said, the underlying frustration is mine; finding good teaching materials about GP's is hard.