I should also mention the most important aspect of the GDO: unlike most of the other instruments we discuss, it is not a deviation measure. Instead, it is based entirely on the notoriously poor age-equivalent score (the "developmental scale").
Additionally, the standardization sample was not representative of the population at every age band, with substantial skew in both the 3-4 range (African-American children oversampled) and the 4.5-6 range (white children oversampled). They also didn't have enough valid protocols to use the 2-year-old and over-6-year-old ranges. All of this, as they acknowledge themselves, is because they used a convenience sample instead of sampling properly nationwide.
Basically, this instrument has very poor psychometric properties and shouldn't be used for high-stakes decisions like retention or late entry. If it is used at all, it should be for screening only.
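To make the age-equivalent complaint concrete, here is a rough sketch comparing an age-equivalent with a deviation (standard) score. Everything in it is an assumption for illustration: the `NORMS` table is made up and is not GDO data, the function names are hypothetical, and I use the band mean with linear interpolation where a real test would use its own norming procedure.

```python
from bisect import bisect_right

# Hypothetical norms, NOT taken from the GDO:
# age in years -> (mean raw score, standard deviation) for that age band.
NORMS = {3: (20.0, 4.0), 4: (30.0, 5.0), 5: (38.0, 8.0), 6: (44.0, 10.0)}


def standard_score(raw, age):
    """Deviation score: distance from same-age peers in SD units,
    rescaled to the common mean-100, SD-15 metric."""
    mean, sd = NORMS[age]
    return 100 + 15 * (raw - mean) / sd


def age_equivalent(raw):
    """Age whose average raw score matches `raw`, interpolating linearly
    between age bands and clamping at the ends of the table."""
    ages = sorted(NORMS)
    means = [NORMS[a][0] for a in ages]
    if raw <= means[0]:
        return float(ages[0])
    if raw >= means[-1]:
        return float(ages[-1])
    i = bisect_right(means, raw) - 1
    frac = (raw - means[i]) / (means[i + 1] - means[i])
    return ages[i] + frac * (ages[i + 1] - ages[i])


# Two children who are both "one year behind" by age-equivalent.
for age, raw in [(4, 20.0), (6, 38.0)]:
    print(age, age_equivalent(raw), standard_score(raw, age))
```

With these made-up numbers, both children come out exactly one year behind their chronological age, yet the 4-year-old has a standard score of 70 and the 6-year-old a standard score of 91. The same "one year behind" means very different things depending on how much scores vary at that age, and an age-equivalent hides that information.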