Journal of Andrology

Published-Ahead-of-Print May 14, 2009, DOI:10.2164/jandrol.108.006825
Journal of Andrology, Vol. 30, No. 6, November/December 2009
Copyright © American Society of Andrology
DOI: 10.2164/jandrol.108.006825


Total Sperm per Ejaculate of Men: Obtaining a Meaningful Value or a Mean Value With Appropriate Precision

RUPERT P. AMANN* AND PHILLIP L. CHAPMAN†

From the * Animal Reproduction and Biotechnology Laboratory and the † Department of Statistics, Colorado State University, Fort Collins, Colorado.

Correspondence to: Dr Rupert P. Amann, 909 Centre Ave, #123, Ft Collins, CO 80526-2091 (e-mail: rpalra62@comcast.net).
Received for publication September 10, 2008; accepted for publication April 1, 2009.

Abstract

We retrospectively mined and modeled data to answer 3 questions. 1) Relative to an estimate based on ~20 semen samples, how imprecise is an estimate of an individual's total sperm per ejaculate (TSperm) based on 1 sample? 2) What is the impact of abstinence interval on TSperm and TSperm/h? 3) How many samples are needed to provide a meaningful estimate of an individual's mean TSperm or TSperm/h? Data were for 18–20 consecutive masturbation samples from each of 48 semen donors. Modeling exploited the gamma distribution of values for TSperm and a unique approach to project to future samples. Answers: 1) Within-individual coefficients of variation were similar for TSperm or TSperm/h abstinence and ranged from 17% to 51%; average ~34%. TSperm or TSperm/h in any individual sample from a given donor was between –20% and +20% of the mean value in 48% of 18–20 samples per individual. 2) For a majority of individuals, TSperm increased in a nearly linear manner through ~72 hours of abstinence. TSperm and TSperm/h after 18–36 hours' abstinence are high. To obtain meaningful values for diagnostic purposes and maximize distinction of individuals with relatively low or high sperm production, the requested abstinence should be 42–54 hours with an upper limit of 64 hours. For individuals producing few sperm, 7 days or more of abstinence might be appropriate to obtain sperm for insemination. 3) At least 3 samples from a hypothetical future subject are recommended for most applications. Assuming 60 hours' abstinence, 80% confidence limits for TSperm/h for 1, 3, or 6 samples would be 70%–163%, 80%–130%, or 85%–120% of the mean for observed values. In only ~50% of cases would TSperm/h for a single sample be within –16% and +30% of the true mean value for that subject. Conclusions: Pooling values for TSperm in samples obtained after 18–36 or 72–168 hours' abstinence with values for TSperm obtained after 42–64 hours is inappropriate. 
Reliance on TSperm for a single sample per subject is unwise.

     Key words: Imprecision of total sperm per ejaculate, abstinence interval, meaningful semen data, noninvasive evaluation of spermatogenesis



There are 2 groups of individuals who draw conclusions from characteristics of an ejaculate and sperm therein. They are clinicians serving patient couples and epidemiologist-andrologist teams serving society. The latter seek associations between some factor(s) and illness of the testes or some other reproductive organ. Clinicians long have recognized that evaluation of several samples of semen, ideally each after 2–3 days' abstinence, is important for a meaningful conclusion. Members of epidemiologic-andrologic teams generally report study of 1 sample per subject, with samples after 2–7 days' abstinence deemed acceptable. Obviously, clinicians and epidemiologist-andrologist teams take different approaches to ascertain whether a subject's testes are diseased in respect to spermatogenesis.

It generally is accepted that there is substantial within-individual variation in seminal volume or total number of sperm per ejaculate (TSperm, as 10^6/ejaculate). Obviously, results for a single sample will be imprecise and this could hamper correct detection of testicular illness or nonillness by an epidemiologist-andrologist team. We found no report considering the impact of within-individual variation in TSperm on detection of decreased sperm production in a subject and noted that most reports of coefficients of variation for TSperm were limited by number of samples per subject, number of subjects, or other factors.

Available data (reviewed in Amann, 2009a) show that TSperm increases for only 2–3 days of abstinence in many men and increases for 6–7 days in other men. In epidemiologic studies it is common to adjust TSperm or sperm concentration after abstinence intervals of 2–9 days to standardized values (eg, to 96 hours; Jørgensen et al, 2001). Does this ignore the dynamics of sperm accumulation in the excurrent ducts? We had the perception that studying normalcy of spermatogenesis in an individual on the basis of 1 or 2 samples of semen obtained after an abstinence interval ranging from 2 to more than 7 days might result in flawed conclusions. New information was needed for evidence-based recommendations on appropriate number of samples and especially abstinence interval.

By retrospective data mining and modeling we sought answers to 3 questions. 1) Relative to an estimate based on ~20 semen samples, how imprecise is an estimate of an individual's TSperm based on 1 sample? 2) What is the impact of abstinence interval on TSperm and TSperm/h? 3) How many samples are needed to provide a meaningful estimate of an individual's mean TSperm or TSperm/h?


Materials and Methods

Semen Data

To access data appropriate for study of intraindividual variation in TSperm, we approached commercial semen banks. Personnel associated with Fairfax Cryobank, Fairfax, Virginia, and Cryogenic Laboratories Inc, Roseville, Minnesota, reviewed records for 2000–2005 to identify 50 donors who had provided 20 or more semen samples. A given donor provided samples to only 1 center. All donors had signed documents allowing research use of information derived from their samples; no provided data could be traced to a specific individual. Information for each sample included code number, collection date, abstinence interval (hours, as stated by donor), seminal volume (mL), and sperm concentration (10^6 sperm/mL).

All semen specimens were evaluated using standardized procedures, with quality control meeting World Health Organization guidelines (World Health Organization, 1999). Each semen specimen was evaluated by a technician who had passed an annual proficiency test after formal internal training or retraining. A given semen sample was evaluated by 1 technician, but different technicians likely evaluated samples from a given donor.

All semen specimens were collected on site into a prelabeled 150-mL sterile cup, in a private room with literature to help sexual arousal if a donor examined it. Both sperm banks requested correct information on abstinence interval, but had no quality assurance procedure for this attribute. In the laboratory, after allowing 30 minutes for liquefaction, semen was transferred into a graduated, conical, 15-mL centrifuge tube (Falcon 352097) using a 5- or 10-mL serological pipette. Seminal volume was recorded to the nearest 0.1 mL, as read from the tube.

When measuring sperm concentration, the same site-specific counting device was used for semen from a given donor. Neat semen was mixed and 10 µL was diluted with 190 or 40 µL of diluent (tap water) for placement into a hemocytometer or 20-µm MicroCell counting device (Bright-Line; Hausser Scientific, Horsham, Pennsylvania; Conception Technologies, San Diego, California) then in use, respectively, at Fairfax Cryobank and Cryogenic Laboratories Inc. Sperm within the demarked volume were counted using phase-contrast optics at 200× total magnification. Calculations were based on the mean for counts from 2 chambers of the counting device. If counts between 2 chambers did not agree within 10%, a recount was performed and only the last 2 counts were used to calculate the mean.

Data for 48 donors were transmitted electronically; there were 20 samples for 40 donors and 18 or 19 samples for 8 donors. After the first sample in a series, the interval between successive samples (not abstinence interval) was less than or equal to 14 days for ~90% of samples and less than or equal to 8 days for 75% of samples. Days rather than weeks between samples was desirable for study of intraindividual variation, but precluded meaningful study of seasonal variation. For each sample, we calculated TSperm as (volume)(sperm concentration) and then TSperm per hour of abstinence (TSperm/h) was calculated. TSperm/h sometimes is termed sperm accumulation rate.
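The two derived quantities can be sketched in a few lines of Python. This is an illustration only, not the authors' software; the function names and the example sample are hypothetical.

```python
def total_sperm(volume_ml, conc_millions_per_ml):
    """TSperm (10^6 sperm/ejaculate) = seminal volume x sperm concentration."""
    return volume_ml * conc_millions_per_ml


def total_sperm_per_hour(volume_ml, conc_millions_per_ml, abstinence_h):
    """TSperm/h (10^6 sperm per hour of abstinence), the sperm accumulation rate."""
    if abstinence_h <= 0:
        raise ValueError("abstinence interval must be positive")
    return total_sperm(volume_ml, conc_millions_per_ml) / abstinence_h


# Hypothetical sample: 3.0 mL at 100 x 10^6 sperm/mL after 60 hours' abstinence.
tsperm = total_sperm(3.0, 100.0)
rate = total_sperm_per_hour(3.0, 100.0, 60.0)
print(tsperm, rate)
```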

Data Mining and Modeling

Statistical analyses focused on the relationship between TSperm and abstinence interval, as well as within-donor and among-donor variation in TSperm and TSperm/h. We did not model seminal volume or sperm concentration because neither attribute informs about potential illness of the testes (Amann, 2009b). TSperm was plotted as a linear function of abstinence interval without restriction on the intercept. Also, individual straight lines were calculated for each donor with the intercept set equal to zero. Finally, data for TSperm and TSperm/h at abstinence intervals common to many donors (ie, 24, 36, 48, 60, 72, 84, 96, and 120 hours) were log10-transformed and compared by abstinence interval using a mixed model with donor as the random effect.
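As a minimal sketch of a straight line with the intercept set equal to zero, the least-squares slope is the sum of (interval × TSperm) divided by the sum of squared intervals. The version below is unweighted, which is an assumption; the function name and data are hypothetical and do not come from the study.

```python
def slope_through_origin(hours, tsperm):
    """Unweighted least-squares slope b for the model tsperm = b * hours."""
    sxy = sum(x * y for x, y in zip(hours, tsperm))
    sxx = sum(x * x for x in hours)
    return sxy / sxx


# Hypothetical donor: abstinence intervals (h) and TSperm (10^6 sperm/ejaculate).
hours = [24.0, 48.0, 60.0, 72.0]
tsperm = [130.0, 240.0, 310.0, 350.0]
b = slope_through_origin(hours, tsperm)  # estimated 10^6 sperm per hour of abstinence
print(round(b, 2))
```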

Preliminary least squares analysis revealed that the standard deviation of TSperm was approximately proportional to its mean and that the distribution of deviations from the mean was right-skewed rather than bell-shaped. To match those properties TSperm was assumed to have a gamma distribution, which is a common model for measurements that have large coefficients of variation (CVs) and long right tails and are constrained to nonnegative values. The family of gamma distributions includes χ² distributions and is flexible in that with a high CV the gamma distribution is highly skewed, but with lower CVs it becomes nearly symmetric and normal (see Supplemental Figure 1, available online at www.andrologyjournal.org). This choice allowed modeling of the nontransformed response of TSperm with increasing abstinence interval using generalized linear model methods described in the Supplemental Material. In this model an outcome value β_i can be interpreted as mean sperm accumulation rate (10^6 sperm/h) for the ith donor with a dispersion parameter φ, which is the square of the CV.

Maximum likelihood estimates of the model parameters were obtained using SAS PROC GENMOD (SAS Institute, 2003). The data were restricted to samples for which abstinence interval was between 13 and 84 hours, the range over which the response was most linear; this excluded 125 samples for which abstinence interval was 85–100 hours and 20 samples with an abstinence interval of 102–240 hours. As described in the Supplemental Material, 2-sided confidence intervals for log_e(β_i) were produced by the program using the likelihood-ratio method, and endpoints of the intervals were exponentiated to obtain confidence limits (CLs) for β_i for each of the 48 donors in our data set. For hypothetical future subjects, CLs for accumulation rates (β) were calculated using method 1 described in the Supplemental Material.


Figure 1. Relationship of TSperm and abstinence interval for 16 individual donors. Each panel includes: solid line, linear regression calculated using 1/(abstinence interval)^2 as a weighting factor; a, estimated sperm/h abstinence; associated r value, calculated without weighting; and dotted line, linear regression forced to zero (calculations and plots in SigmaPlot 8.02). (Panels A–F) include 1–4 samples provided after 13–40 hours' abstinence, and it is evident that extension of the solid line often would intercept the Y-axis near zero (compare solid and dotted lines), although the slope a ranges between 1.6 and 6.8 × 10^6 sperm/h abstinence. (Panel G) is for a donor with a wide range in abstinence intervals, and shows separate linear plots for 24–100 and 100–168 hours' abstinence, and also a simple exponential rise to a maximum [y = y0 + a(1 – b^x)]. (Panels H–L) only have samples between 35 and 96 hours' abstinence, and range for the intercept of the depicted slope (–260 to +200 TSperm) is greater than for panels with some samples after a shorter abstinence interval (ie, Panels A–G). For some donors (eg, panels M–P) there is little evidence for a consistent change in TSperm as a function of abstinence interval (solid line), although the true response might be closer to the dotted line. Inspection of the plot for each of the 48 donors suggested that TSperm/h leveled off before 80 hours for most donors. Clearly, the best estimate of TSperm/h during linear accumulation differs among individuals. See plot of pooled data for TSperm in Figure 2. It is evident that TSperm ranges widely for samples provided by an individual after a given abstinence interval (eg, panels A, C, F, H, K, M, and N).

 

Figure 2. Changes in TSperm and TSperm/h with increasing length of abstinence interval (P < .01). Means with the same letter are not significantly different. The dotted line connects the origin with values for TSperm after 48, 60, or 72 hours of abstinence. The "extra sperm" evident after 24 or 36 hours' abstinence result from distal movement 0–18 hours after the previous ejaculation rather than newly produced sperm (see text). Values are back-transformed means from least-squares analyses of log10-transformed data for samples provided after an abstinence interval of 24, 36, 48, 60, 72, 84, 96, or 120 hours. Numbers of samples are 21, 24, 197, 76, 301, 32, 104, and 10.

 


Results

Base Data and Imprecision of 1 Sample

Table 1 presents summary data, with emphasis on variation within and among the 48 donors. Median values were slightly lower than the respective mean, reflecting moderate skewness to larger values. The range for volume or TSperm in individual samples was wide (eg, 46–1290 × 10^6 sperm/ejaculate). Reported abstinence interval ranged widely, but for 97% of samples was between 13 and 100 hours. Across donors, the means and standard deviations were correlated (r = 0.84). Although within-donor CVs for TSperm ranged up to 50%, the 95% CLs around the mean were 33%–37%. Among-donor CVs for TSperm, or other seminal attributes, were not substantially greater than mean within-donor CVs (Table 1). Log10 transformation of raw values reduced CVs (see Supplemental Material). As anticipated for seminal donors, the among-donor CLs were narrow, with 90% of donor means between 2.9 and 3.4 mL or 309 and 368 × 10^6 sperm/ejaculate. Calculation of TSperm/h, to correct for abstinence interval, did not substantially reduce CVs. They averaged 33% within donors and 39% among donors.


Table 1. Descriptive statistics for semen from 48 donors
 

Considering all 946 samples, most correlations among seminal volume, sperm concentration, TSperm, and abstinence interval were low (Supplemental Table 1). TSperm had a greater association with sperm concentration than seminal volume (r^2 of 0.41 vs 0.28). For all samples, the correlation between TSperm and abstinence interval was 0.27, and for samples provided after 13 to 84 hours' abstinence it was 0.24 (n = 791).

We had asked, how imprecise is a single sample as an estimate of an individual's TSperm? For this data set and relative to a mean value based on 18–20 samples, the value for TSperm in any individual sample from a given donor was between –20% and +20% of the mean value in 46% of the 18–20 possible cases. TSperm for an individual sample was between –30% and +30% of the mean value in 63% of cases. The situation was similar for TSperm/h. Individual values were between –20% and +20% of the mean value in 49% of possible cases.
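The tally described above (the share of a donor's samples that fall within a given percentage of that donor's own mean) could be computed along the following lines. The function name and the 10 values are hypothetical and only illustrate the calculation; they are not data from the study.

```python
def fraction_within(values, tolerance):
    """Fraction of values lying within +/- tolerance (eg, 0.20) of their own mean."""
    mean = sum(values) / len(values)
    inside = [v for v in values if abs(v - mean) <= tolerance * mean]
    return len(inside) / len(values)


# Hypothetical TSperm values (10^6 sperm/ejaculate) for one donor; the mean is 360.
values = [200.0, 260.0, 310.0, 340.0, 360.0, 390.0, 450.0, 520.0, 610.0, 160.0]
within_20 = fraction_within(values, 0.20)  # samples between 288 and 432
within_30 = fraction_within(values, 0.30)  # samples between 252 and 468
print(within_20, within_30)
```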

Impact of Abstinence Interval on TSperm

To understand the relationship between TSperm and abstinence interval, we examined plots of data for individual donors (Figure 1 shows representative data). Three features were obvious. 1) For a majority of donors TSperm increased in a more or less linear manner from the shortest abstinence interval through near 72 hours, after which the change in TSperm/h sometimes leveled off. Panels M–P in Figure 1 present obvious exceptions to this generalization. 2) TSperm ranged widely for samples provided by a donor after the same reported abstinence interval (see array of data points on the Y-axis for a given abstinence interval (X-axis) in panels A, C, F, H, K, M, or N in Figure 1). 3) The slope of the linear plot (value for a in upper left corner of each panel; describes the dotted line), TSperm/h abstinence, ranged from 0 to 11 × 10^6 sperm/h (including plots not in Figure 1). Correlation coefficients (value for r in top center of each panel) between TSperm and abstinence interval for individual donors generally were greater than 0.55 and occasionally were greater than 0.80. This means that, based on r^2, abstinence interval usually was associated with less than 45% of the variation in TSperm. For most donors a linear plot provided a better fit than a second- or third-order polynomial (panel G in Figure 1 illustrates an exception, when the donor's full range of abstinence intervals was considered).

Analysis of all available data for abstinence intervals with many samples (Figure 2) showed that TSperm increased through 72 hours. Although the numerical value for 84 hours was larger than that for 72 hours, the difference was not significant (P = .34) and means for 72, 84, and 120 hours were not significantly different from each other. TSperm after 24 or 36 hours of abstinence was greater than would be expected based on linear extrapolation (dotted line) from the origin to TSperm after 48, 60, or 72 hours of abstinence. TSperm/h was similar at 24 and 36 hours of abstinence, and obviously different from values after 48–84 hours of abstinence. There seemed to be "extra sperm." The back-transformed value for mean TSperm/h after 24 hours of abstinence was 6.88 × 10^6 sperm/h, compared with a value of 4.66 × 10^6 sperm/h for abstinence intervals of 48–84 hours (back-transformed from the weighted average of the least squares means of the log values). For both TSperm and TSperm/h, the interaction of abstinence interval and donor was significant (P < .01). In other words, for certain donors the temporal change in TSperm differed from that for other donors; not a surprise.

The finding of "extra sperm" after a short abstinence interval led to a within-individual comparison. We identified donors who had provided 4 or more samples after 18–40 hours of abstinence and also after 41–72 hours (7 donors; 4–16 samples per interval). On average, TSperm/h was 22% greater after 18–40 hours of abstinence (7.07 vs 5.81 × 10^6; P = .01). Also see Supplemental Material, page 3. We concluded that TSperm/h was atypically high in samples provided after less than or equal to 40 hours of abstinence compared to samples from the same individual provided after 41–72 hours of abstinence.

Precision of Estimated TSperm/H

Based on the generalized linear model, mean sperm accumulation rates (β_i) were estimated for the 48 donors using samples with an abstinence interval of 13–84 hours. Estimated values ranged from 2.36 to 10.0 × 10^6 sperm/hour and averaged 5.40 × 10^6 sperm/hour. These model-based estimates usually were identical with mean TSperm/h abstinence calculated from each sample from a given donor, but the latter is not equal to the sometimes used (ΣTSperm)/(Σall abstinence intervals). The model estimate for ν was 9.4212 (standard error = 0.4662), which implied a CV of 32%, nearly matching the CV of 33% calculated for TSperm/h using all values of abstinence interval. Lower and upper 90% CLs for the individual donor sperm accumulation rates averaged –2.6% and +15.4% of the estimated value. For individuals with 20 samples, the lower and upper 90% CLs were –11% and +13%, respectively. On the other hand, for an individual having only 2 samples with an abstinence interval between 13 and 84 hours, lower and upper 90% CLs were –30% and +50%, respectively.
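The distinction between the mean of per-sample TSperm/h values and the ratio of summed TSperm to summed abstinence intervals can be seen in a small example. The three samples below are hypothetical and chosen only so that the two estimators differ visibly.

```python
# Hypothetical samples for one donor: abstinence (h) and TSperm (10^6 sperm/ejaculate).
hours = [24.0, 48.0, 72.0]
tsperm = [180.0, 240.0, 300.0]

# Mean of the per-sample rates: (7.5 + 5.0 + 4.1667) / 3.
mean_of_ratios = sum(y / x for x, y in zip(hours, tsperm)) / len(hours)

# Ratio of sums: 720 / 144, which gives long-interval samples more influence.
ratio_of_sums = sum(tsperm) / sum(hours)

print(round(mean_of_ratios, 2), ratio_of_sums)
```

The two values coincide only when every sample has the same TSperm/h or the same abstinence interval.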

To calculate CLs for 1 or more samples from a hypothetical future individual we used the gamma model and the estimated φ from the donor data set. The results (Table 2) demonstrated the desirable asymmetry of the gamma model and the high degree of uncertainty when mean TSperm/h was based on only a few samples. Both features were especially evident if the mean was based on fewer than 6 samples (see Supplemental Figure 2). An estimated upper CL based on a single sample from a future individual is almost twice as far from the recorded value as an estimated lower CL. We calculated a probability of 0.76 that a single sample will not provide a value for TSperm/h within ±10% of a future individual's true value.
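A rough check of the single-sample probability is possible under two assumptions that are ours, not statements from the paper: that the reported model estimate of 9.4212 is the gamma shape parameter (so the dispersion is its reciprocal and the CV is its inverse square root), and that a single TSperm/h value divided by the individual's true mean follows a gamma distribution with mean 1. The authors' own calculation (method 1 in the Supplemental Material) may differ. The helper names are hypothetical.

```python
import math

SHAPE = 9.4212  # reported model estimate; assumed here to be the gamma shape


def gamma_pdf(x, shape, mean=1.0):
    """Gamma density parameterised by shape and mean (scale = mean / shape)."""
    if x <= 0:
        return 0.0
    scale = mean / shape
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def prob_between(lo, hi, shape, steps=2000):
    """Probability that a mean-1 gamma variate lies in [lo, hi] (Simpson's rule)."""
    h = (hi - lo) / steps  # steps must be even
    total = gamma_pdf(lo, shape) + gamma_pdf(hi, shape)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gamma_pdf(lo + i * h, shape)
    return total * h / 3.0


cv = SHAPE ** -0.5                                # CV implied by the shape
p_outside = 1.0 - prob_between(0.9, 1.1, SHAPE)   # not within +/-10% of the mean
print(round(cv, 3), round(p_outside, 2))
```

Under these assumptions the probability of falling outside ±10% comes out close to the 0.76 quoted above.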


Table 2. Factors to calculate estimated CLs for TSperm (10^6) per hour in samples provided by a future hypothetical subject
 


Discussion

Caveats to Our Study

Readers should consider caveats detailed in the Supplemental Material. Data therein support correctness of the model for low values of TSperm, such as 15–60 × 10^6 after 48 hours of abstinence. Hence, factors in Table 2 and CLs in Supplemental Table 3 should be applicable for most samples encountered by a clinician or epidemiologist-andrologist team.

Comparison of Statistical Approaches

It is common to use a logarithmic, square-root, or cube-root transformation to normalize values for TSperm. Handelsman (2002) evaluated several normalizing transformations, using data for semen from nonoligozoospermic men, and suggested that the cube root was easiest to use. The cube-root transformation also is used to normalize χ² data (Wilson and Hilferty, 1931), which are a special case of the gamma distribution. The gamma model is advantageous compared to normalization via cube roots for 2 reasons: 1) it avoids transformation of data, so the linear relationship between TSperm and abstinence interval is preserved; and 2) it eliminates negative statistical bias that occurs when estimated TSperm is back-transformed from averages of cube roots or logarithms (Lindgren, 1993).
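The negative bias from back-transformation can be illustrated numerically: for any set of unequal positive values, the back-transformed mean of logarithms (the geometric mean) and the cubed mean of cube roots both fall below the arithmetic mean. The values below are hypothetical and serve only as a sketch.

```python
import math

# Hypothetical right-skewed TSperm values (10^6 sperm/ejaculate).
values = [120.0, 250.0, 310.0, 480.0, 900.0]
n = len(values)

arithmetic_mean = sum(values) / n

# Average on the log scale, then back-transform (geometric mean).
log_back = math.exp(sum(math.log(v) for v in values) / n)

# Average on the cube-root scale, then cube the result.
cube_back = (sum(v ** (1.0 / 3.0) for v in values) / n) ** 3

print(round(arithmetic_mean, 1), round(cube_back, 1), round(log_back, 1))
```

The ordering log_back < cube_back < arithmetic_mean follows from the power-mean inequality, so the shortfall is a property of the transformations rather than of these particular numbers.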

TSperm per Hour Abstinence

Our model fitted a single linear function to each donor based on his TSperm. For our donors, mean sperm accumulation rate (average of individual β values) was 5.40 × 10^6 sperm/h. This value is comparable with values calculated from the literature when abstinence interval was 1–3 days (see Amann, 2009a).

Abstinence Interval

For this data set, the rate of increase in TSperm slowed after ~60 hours of abstinence (Figure 2), and the rate of sperm accumulation in the excurrent ducts must have approached zero near 84 hours of abstinence because TSperm in ejaculated semen remained stable. These data, together with information in the literature (Amann, 2009a), led to emphasizing the need for an abstinence interval shorter than often accepted.

Also obvious in Figure 2 is that TSperm/h was higher for abstinence intervals of 24 or 36 hours than after abstinence intervals of 48–84 hours. This difference has a biological basis that was evident as "extra sperm" when the plot for TSperm and the dotted line in Figure 2 were compared. It was estimated that 34–78 × 10^6 sperm (95% CLs) were moved distally in the excurrent ducts during 0–18 hours after the preceding ejaculation, and this estimate was in reasonable agreement with number of sperm in the first ejaculate after vasectomy (see Amann, 2009a). The impact of these extra sperm on TSperm was diminished when abstinence interval was 36 rather than 24 hours, and negligible by 48 hours. This postejaculation movement of sperm typically is not considered when thinking about TSperm as a function of abstinence interval and especially distorts calculations of TSperm/h if abstinence interval is less than 42 hours.

For clinicians, if the goal is to obtain a representative sample of semen for diagnostic use, then an abstinence interval of ~48 hours (uniform as 42–54 hours or more leeway with 42–60 hours) will provide the most meaningful value for TSperm, especially if the patient had ejaculated during the previous week once every 42–60 hours. Samples produced after 42–60 hours' abstinence also should allow meaningful evaluation of sperm motion and morphology. This recommendation is shorter than the 2–3 days of abstinence often recommended (eg, Sharlip et al, 2002; McLachlan et al, 2003) for an initial clinical evaluation.

For epidemiologists, the recommended abstinence interval also is 42–60 hours (exclude all samples with abstinence intervals >64 hours). A stringent range of abstinence intervals provides the best separation of individuals with a high, normal, or low sperm production rate and, hence, the best chance to detect if agent X affected testes function.

Why is a short abstinence interval important? If abstinence intervals greater than 64 hours are accepted for routine clinical evaluations or epidemiologic studies, resulting calculations will underestimate true TSperm/h abstinence for men with a reasonable rate of sperm production and overestimate true TSperm/h for individuals with moderately reduced sperm production. In other words, acceptance of an abstinence interval of 3 days or more might hamper detection of individuals with moderately reduced sperm production or borderline oligozoospermia, because the excurrent ducts could accommodate all sperm produced over far longer than 42–60 hours before any loss. See "Stabilize Number of Sperm in the Excurrent Ducts" in Amann (2009a). However, when it is known that an individual produces relatively few sperm, capability of the epididymides to accumulate sperm for 7 days or more can be exploited by use of a long abstinence interval before obtaining a sample for artificial insemination.

Adjustment of TSperm for abstinence interval (ie, expression as TSperm/h) usually would have little impact on any clinical conclusion based on number of sperm ejaculated, if abstinence interval was 42–60 hours. This is because for each hour of deviation from 48 hours of abstinence, TSperm would decrease or increase by ~2%. Our conclusion that correction for abstinence interval has little clinical utility when abstinence interval is relatively short confirms Baker et al (1981). For an epidemiologist-andrologist team, however, an unadjusted 12-hour difference in abstinence would result in a ~25% error in TSperm for a given subject, which might hamper correct placement of that subject with others having diseased or nondiseased testes.
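The ~2% and ~25% figures follow from simple arithmetic if one assumes, as a simplification, that TSperm accumulates linearly from zero so that it is proportional to hours of abstinence. The function name below is hypothetical.

```python
def percent_change_per_hour(reference_h=48.0):
    """Percent change in TSperm per hour of deviation, assuming TSperm is
    proportional to abstinence hours (linear accumulation through the origin)."""
    return 100.0 / reference_h


per_hour = percent_change_per_hour()   # 100 / 48, about 2.1% per hour
twelve_hours = 12 * per_hour           # 12 / 48 = 25% for a 12-hour difference
print(round(per_hour, 1), round(twelve_hours, 1))
```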

For donations to a sperm bank, an abstinence interval greater than 64 hours is appropriate because one wishes to maximize the number of good-quality sperm available to process as 1 batch rather than obtain a reasonably precise value for TSperm/h. For nonoligozoospermic donors an abstinence interval of 72–96 hours might be appropriate, although samples after an abstinence interval of 42–72 hours often would have a similar TSperm.

How Many Samples Are Needed?

Imprecision of quantitative values based on 1 semen sample has long been recognized, and standard texts emphasize that 2–3 samples should be evaluated. What is the impact of imprecision on conclusions based on semen data? Our approach and model for the first time allow facile calculation of CLs showing the uncertainty around TSperm/h for a single future sample, followed by refinement based on 2 or 3–6 samples from a future individual. The 80% CLs would be –30% to +64% for a single value, but if based on 3 samples the CLs would be –20% to +30% of mean TSperm/h (Table 2). For a hypothetical future individual, the CLs for TSperm narrow (Supplemental Figure 2) by approximately 35%, 20%, and 15% as his data base is expanded from 1 to 2, 2 to 3, and 3 to 4 samples. Precision of a mean based on 3 samples is almost twice as good as that of a mean based on 1 sample. A mean for TSperm/h based on 3 samples will be within –20% and +30% of the true value in ~80% of cases, whereas a single value would be within –16% and +30% of the true value in only ~50% of cases (Table 2).
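One way to see where multipliers of this size can come from is sketched below. It rests on assumptions that are ours rather than the documented method 1: the reported model estimate of 9.4212 is treated as the gamma shape, the mean of n samples divided by the true rate is treated as gamma with shape n × 9.4212 and mean 1, and the CL multipliers are taken as reciprocals of that distribution's tail quantiles. Quantiles use the Wilson-Hilferty cube-root approximation instead of an exact routine. Function names are hypothetical.

```python
from statistics import NormalDist

SHAPE = 9.4212  # reported model estimate; assumed here to be the gamma shape


def gamma_quantile_mean1(p, shape):
    """Wilson-Hilferty approximation to the p-quantile of a mean-1 gamma variate."""
    z = NormalDist().inv_cdf(p)
    c = 1.0 / (9.0 * shape)
    return (1.0 - c + z * c ** 0.5) ** 3


def cl_factors(n_samples, level, shape=SHAPE):
    """Lower and upper multipliers applied to the observed mean TSperm/h."""
    tail = (1.0 - level) / 2.0
    k = n_samples * shape  # shape for the mean of n independent samples
    lower = 1.0 / gamma_quantile_mean1(1.0 - tail, k)
    upper = 1.0 / gamma_quantile_mean1(tail, k)
    return lower, upper


for n in (1, 3, 6):
    lo, hi = cl_factors(n, 0.80)
    print(n, round(lo, 2), round(hi, 2))
```

Under these assumptions the 80% multipliers come out near 0.70 and 1.63 for 1 sample, 0.80 and 1.30 for 3 samples, and 0.85 and 1.20 for 6 samples, which is in line with the limits quoted in the text; Table 2 remains the authoritative source.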

A clinician seeks values for seminal attributes of an individual with sufficient precision to make a recommendation to a patient couple. This might not require precision represented by 90% CLs assuming a continuous variable (eg, TSperm), but simply sufficient precision to allow correct conclusions in a go/no-go (binomial) manner. For this reason, both 50% and 80% CLs are included in Table 2 and Supplemental Figure 2. Obviously, TSperm is only 1 of many factors to consider. For future samples, calculations based on TSperm, recorded abstinence interval, and Table 2 can be made almost instantaneously by any technician. For a single sample, the CLs are obtained by multiplying the product of the observed TSperm/h and recorded abstinence interval by the appropriate table entries. When multiple samples are available, the average of the sample values for TSperm/h and table entries are used. Appropriate calculations could be incorporated into computer programs for seminal records.
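The multiplication described above could be built into a records program roughly as follows. The function name is hypothetical, and the multipliers 0.70 and 1.63 are the 80% limits for 1 sample as quoted in the Abstract (70%–163%); actual entries should be taken from Table 2.

```python
def tsperm_confidence_limits(mean_rate_per_h, abstinence_h, lower_factor, upper_factor):
    """CLs for TSperm (10^6 sperm/ejaculate): (TSperm/h x interval) x table factor."""
    expected = mean_rate_per_h * abstinence_h
    return expected * lower_factor, expected * upper_factor


# Hypothetical single sample: 5.0 x 10^6 sperm/h observed after 60 hours' abstinence.
low, high = tsperm_confidence_limits(5.0, 60.0, 0.70, 1.63)
print(round(low), round(high))
```

For several samples, the first argument would be the average of the per-sample TSperm/h values and the factors would be those for the corresponding number of samples.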

Width of the intervals in Table 2 draws attention to the difficulty of evaluating the effect of a change (eg, old or placebo vs new treatment) in the same or different individuals using small sample sizes. If the 2 confidence intervals do not overlap (examples in Supplemental Table 3), the 2 treatments would be significantly different (although the converse is not true). With increasing sample numbers CLs are narrower and detection of differences is easier. Not surprisingly, for an extreme test result (relative to an individual's true value) a second value is likely to be closer to the true value (Baker and Kovacs, 1985) and additional samples will bring the mean closer to the true value.

Large epidemiologic studies usually involve detailed evaluation of 1 or 2 ejaculates to characterize the number and quality of sperm then being produced by each subject's testes, with abstinence intervals ranging up to 7 days. Implicit in the intent of most studies is detection of illness vs nonillness of each individual's testes in respect to spermatogenesis. The weakness of the conventional approach is evident from information in this paper and Amann (2009a). There are 2 problems. First, pooling data for TSperm for samples with abstinence intervals greater than 64 hours with values for samples with abstinence intervals less than or equal to 60 hours inappropriately benefits individuals with impaired sperm production. Second, and assuming a short abstinence interval, evaluation of 1 semen sample per subject would be appropriate only if the hypothesis in a future study can be evaluated with a protocol whereby test results for 25% of the subjects are more than 16% below their true value for TSperm and test results for another 25% of subjects are more than 31% above their true value for TSperm. Note that a test result for TSperm based on a single sample is more likely to be far above an individual's true value (Supplemental Figure 2) than far below.


Conclusions

To obtain meaningful data on TSperm and maximize detection of individuals with low sperm production, the requested abstinence interval should be 42–54 hours with an upper limit of 64 hours, and TSperm should be determined accurately. For individuals known to produce few sperm, 7 days or more of abstinence will maximize the number of sperm available for insemination or deposition during intercourse. TSperm/h based on a single sample will be within –16% and +30% of the true value in only ~50% of cases, whereas a mean TSperm/h based on 3 samples will be within –20% and +30% of the true value in ~80% of cases.


Acknowledgments

Genetics & IVF Institute, Fairfax, Virginia, kindly allowed transfer of data for seminal donors at 2 of their facilities, making this study possible. David S. Karabinus and Stephen H. Pool, of the Genetics & IVF Institute, reviewed historic data, selected the sets of 20 samples used, corrected the semen data portion of the methods section of the manuscript, and read the manuscript. We greatly appreciate their enthusiasm about this study and willingness to provide semen data.


References

Amann RP. Considerations in evaluating human spermatogenesis on the basis of total sperm per ejaculate. J Androl. 2009a;30:626–641.

Amann RP. Evaluating spermatogenesis using semen: the biology of emission tells why reporting total sperm per sample is important, and why reporting only number of sperm per milliliter is irrational. J Androl. 2009b;30:623–625.

Baker HWG, Burger HG, de Kretser DM, Lording DW, McGowan P, Rennie GC. Factors affecting the variability of semen analysis results in infertile men. Int J Androl. 1981;4:609–622.

Baker HWG, Kovacs GT. Spontaneous improvement in semen quality: regression towards the mean. Int J Androl. 1985;8:421–426.

Handelsman DJ. Optimal power transformations for analysis of sperm concentration and other semen variables. J Androl. 2002;23:629–634.

Jørgensen N, Andersen A-G, Eustache F, Irvine DS, Suominen J, Petersen JH, Andersen AN, Auger J, Cawood EHH, Horet A, Jensen TK, Jouannet P, Keiding N, Vierula M, Toppari J, Skakkebæk NE. Regional differences in semen quality in Europe. Hum Reprod. 2001;16:1012–1019.

Lindgren BW. Statistical Theory. 4th ed. New York, NY: Chapman & Hall; 1993.

McLachlan RI, Baker HWG, Clarke GN, Harrison KL, Matson PL, Holden CA, de Kretser DM. Semen analysis: its place in modern reproductive medical practice. Pathology. 2003;35:25–33.

SAS Institute. SAS Online Documentation. Cary, NC: SAS Institute; 2003.

Sharlip ID, Jarow JP, Belker AM, Lipshultz LI, Sigman M, Thomas AJ, Schlegel PN, Howards SS, Nehra A, Damewood MD, Overstreet JW, Sadovsky R. Best practice policies for male infertility. Fertil Steril. 2002;77:873–882.

Wilson EB, Hilferty MM. The distribution of Chi-square. Proc Natl Acad Sci U S A. 1931;17:684–688.

World Health Organization. WHO Laboratory Manual for the Examination of Human Semen and Sperm–Cervical Mucus Interaction. 4th ed. Cambridge, United Kingdom: Cambridge University Press; 1999:81–87.





