I think this is just about the steps each test's developers took to make sure that the sample used to develop the norms was representative of the population the test was meant for. That seems to be what a "norming plan" is. Here's a paragraph from a PhD thesis I found online that makes the terminology clearer. (PhD theses are often good background reading, because candidates have to include a lot of information that authors of published papers assume every reader already knows.)
Test construction and standardization of the WJTCA-R was extensive and thorough (Kamphaus, 1993; McGrew, Werder, & Woodcock, 1991). The concepts of latent-trait theory and the analysis of data by the Rasch model were employed (McGrew, 1994). The normative data were gathered from 6,359 subjects in over 100 communities selected during a three-stage stratified sample based on the 1980 U.S. Census. Representativeness of the standardization sample was achieved by controlling for 5 person variables (gender, race, Hispanic origin, and occupation and education of adults) and fifteen community variables (location, size, and 13 community socioeconomic variables) in the norming plan.
http://digital.library.unt.edu/ark:/67531/metadc2700/m1/1/
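
If it helps to see the idea concretely, here is a rough Python sketch of the simplest form of stratified sampling: decide how many people you need from each stratum in proportion to its share of the population, then draw that many at random from each group. This is only an illustration, not the actual WJTCA-R procedure, which was a three-stage design controlling many more variables. The strata names, the shares, and the function names `proportional_quotas` and `stratified_sample` are all made up for the example.

```python
import random
from collections import defaultdict


def proportional_quotas(population_shares, total_n):
    """Split total_n across strata in proportion to their population shares.

    Uses largest-remainder rounding so the quotas always add up to total_n.
    """
    if abs(sum(population_shares.values()) - 1.0) > 1e-6:
        raise ValueError("population shares must sum to 1")
    raw = {s: share * total_n for s, share in population_shares.items()}
    quotas = {s: int(v) for s, v in raw.items()}
    leftover = total_n - sum(quotas.values())
    # Hand the remaining slots to the strata with the largest fractional parts.
    by_fraction = sorted(raw, key=lambda s: raw[s] - quotas[s], reverse=True)
    for s in by_fraction[:leftover]:
        quotas[s] += 1
    return quotas


def stratified_sample(candidates, stratum_of, quotas, seed=0):
    """Randomly draw quotas[s] candidates from each stratum s."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for c in candidates:
        by_stratum[stratum_of(c)].append(c)
    sample = []
    for s, n in quotas.items():
        pool = by_stratum[s]
        if len(pool) < n:
            raise ValueError(f"not enough candidates in stratum {s!r}")
        sample.extend(rng.sample(pool, n))
    return sample


# Hypothetical census shares for one community variable (illustrative only).
shares = {"urban": 0.5, "suburban": 0.3, "rural": 0.2}
quotas = proportional_quotas(shares, total_n=7)

# Hypothetical pool of available test-takers: (id, location) pairs.
pool = [(i, loc) for i, loc in enumerate(["urban", "suburban", "rural"] * 10)]
sample = stratified_sample(pool, stratum_of=lambda c: c[1], quotas=quotas)
```

A real norming plan would cross several variables at once (for example gender by race by community size), but the principle is the same: the sample's makeup is forced to match the census figures rather than left to chance.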