U.S. Census Bureau 1990 Population Sampling!
http://www.census.gov/genealogy/names/dist.all.last
The Census Bureau's methodology for the following numbers
is described on this web site:
http://www.census.gov/genealogy/www/freqnames.html
A lot can be said for and against statistics. The file
provided lists 88,799 surnames but appears to be
incomplete, since the Cumulative Frequency ends at
90.483 percent rather than 100. The rankings also seem
to include surnames not even found in the sample:
69,959 of the surnames show no Frequency in percent
but are ranked anyway. Hmm?
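The counts above can be rechecked with a short script. This is only a sketch: it assumes each line of the file holds four whitespace-separated items (name, frequency, cumulative frequency, rank), and the function name summarize is my own. The second sample row is a made-up placeholder, not a line from the Census file.

```python
def summarize(lines):
    """Return (entry count, zero-frequency count, last cumulative frequency)."""
    total = 0
    zero_freq = 0
    last_cum = None
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        _name, freq, cum_freq, _rank = parts
        total += 1
        if float(freq) == 0.0:
            zero_freq += 1
        last_cum = float(cum_freq)  # the final row's value is kept
    return total, zero_freq, last_cum


# Illustrative rows only: MOORE is the entry quoted in this text; the
# second row is a hypothetical placeholder, not real Census data.
sample = [
    "MOORE 0.312 5.312 9",
    "PLACEHOLDER 0.000 90.483 88799",
]
print(summarize(sample))
```

With a local copy of the file, the same function could be given the open file object instead of the sample list, since it only needs something that yields lines.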
Each of the three files (dist.all.last, dist.male.first, and
dist.female.first) contains four items of data. The four items
are:
(1). A "Name"
(2). Frequency in percent
(3). Cumulative Frequency in percent
(4). Rank
Example:
In the file (dist.all.last) one entry appears as:
MOORE 0.312 5.312 9
In our Search Area sample, MOORE ranks 9th in terms of frequency.
5.312 percent of the sample population is covered by MOORE and
the 8 names occurring more frequently than MOORE. The surname,
MOORE, is possessed by 0.312 percent of our population sample.
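The MOORE entry can be read the same way in code. The sketch below assumes the four items are separated by whitespace; the names NameEntry and parse_entry are mine and not part of the Census files.

```python
from typing import NamedTuple


class NameEntry(NamedTuple):
    name: str
    freq: float       # frequency in percent
    cum_freq: float   # cumulative frequency in percent
    rank: int


def parse_entry(line: str) -> NameEntry:
    """Split one whitespace-separated record into its four items."""
    name, freq, cum_freq, rank = line.split()
    return NameEntry(name, float(freq), float(cum_freq), int(rank))


entry = parse_entry("MOORE 0.312 5.312 9")
# Share of the sample covered by the 8 names that rank ahead of MOORE:
# the cumulative figure minus MOORE's own frequency.
ahead = round(entry.cum_freq - entry.freq, 3)
print(entry.name, entry.rank, ahead)
```

Subtracting a name's own frequency from its cumulative frequency gives the share held by all the names ranked above it, which for MOORE is 5 percent.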
Detailed Methodology
---------------------------------------
Variables in Names Files:
name
freq = Frequency in percent
cum.freq = Cumulative Frequency in percent
rank
Carman Surname and Variations:
Name      % Freq.   Cum. Freq.   Rank
CARMAN    0.003     59.791       3756
CARMON    0.001     71.230       10118
CARMEN    0.001     71.231       10119
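Rows like these can be pulled from the surname file by name. The following is a rough sketch, assuming whitespace-separated columns and upper-case names as in the entries shown; find_variants is a name of my own. The combined figure is only approximate, because each listed frequency is already rounded to three decimals.

```python
from decimal import Decimal


def find_variants(lines, variants):
    """Return {name: (freq, cum_freq, rank)} for the requested surnames."""
    wanted = {v.upper() for v in variants}
    found = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 4 and parts[0] in wanted:
            name, freq, cum_freq, rank = parts
            # Decimal keeps the three-decimal figures exact when summed.
            found[name] = (Decimal(freq), Decimal(cum_freq), int(rank))
    return found


# The three rows listed above, as they would appear in the file.
rows = [
    "CARMAN 0.003 59.791 3756",
    "CARMON 0.001 71.230 10118",
    "CARMEN 0.001 71.231 10119",
]
hits = find_variants(rows, ["Carman", "Carmon", "Carmen"])
combined = sum(freq for freq, _cum, _rank in hits.values())
print(combined)
```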