Nomogram to predict the number of oocytes retrieved in controlled ovarian stimulation
Abstract
Objective
Ovarian reserve tests are commonly used to predict ovarian response in infertile patients undergoing ovarian stimulation. Although serum markers such as basal follicle-stimulating hormone (FSH) or random anti-Müllerian hormone (AMH) level and ultrasonographic markers (antral follicle count, AFC) are good predictors, no single test has proven to be the best predictor. In this study, we developed appropriate equations and novel nomograms to predict the number of oocytes that will be retrieved using patients' age, serum levels of basal FSH and AMH, and AFC.
Methods
We analyzed a database containing clinical and laboratory information of 141 stimulated in vitro fertilization (IVF) cycles performed at a university-based hospital between September 2009 and December 2013. We used generalized linear models for prediction of the number of oocytes.
Results
Age, basal serum FSH level, serum AMH level, and AFC were significantly related to the number of oocytes retrieved according to the univariate and multivariate analyses. The equations that predicted the number of oocytes retrieved (log scale) were as follows: model (1) 3.21–0.036×(age)+0.089×(AMH), model (2) 3.422–0.03×(age)–0.049×(FSH)+0.08×(AMH), model (3) 2.32–0.017×(age)+0.039×(AMH)+0.03×(AFC), model (4) 2.584–0.015×(age)–0.035×(FSH)+0.038×(AMH)+0.026×(AFC). Model 4 showed the best performance. On the basis of these variables, we developed nomograms to predict the number of oocytes that can be retrieved.
Conclusion
Our nomograms helped predict the number of oocytes retrieved in stimulated IVF cycles.
Introduction
Prediction of individual ovarian response to exogenous gonadotropin is one of the most important strategies for successful and safe in vitro fertilization (IVF) treatment. Basal serum follicle-stimulating hormone (FSH) concentration is traditionally the most commonly used ovarian reserve test (ORT). A menstrual cycle day 3 serum FSH concentration >10–20 IU/L indicates diminished ovarian function, which shows high specificity (80%–100%) for the prediction of a poor response to ovarian stimulation but low sensitivity (10%–30%) [1]; however, it is difficult to predict a high response.
Anti-Müllerian hormone (AMH) is another serum marker that can be used as an ORT. The AMH level is associated with low oocyte retrieval and poor embryo quality, but this factor has low sensitivity and specificity for predicting successful achievement of pregnancy [2,3,4,5]. Serum AMH level reflects the overall amount of granulosa cells in the follicular pool; thus, the AMH level is generally known to be proportional to the number of follicles. However, the serum AMH level depends on follicle size, granulosa cell volume or maturity, and intrafollicular environment, which are inconsistent among follicles and affected by individual genetic characteristics [6]. A previous study noted the presence of inter-test discrepancies that occurred as frequently as one in every five subjects when serum values of AMH <0.8 ng/mL and FSH >10 IU/L were defined as risk factors for poor ovarian response, and this was more common with an increase in age [7].
Antral follicle count (AFC) is an ultrasonographic marker used as an ORT that is proportional to the number of primordial follicles remaining in the ovary [8]. AFC is known to have a high specificity (73%–100%) for predicting a poor response, but exhibits low sensitivity (9%–73%) for predicting a poor response and low sensitivity (8%–33%) for predicting a failure to achieve pregnancy [9,10,11,12]. Large interobserver variation is an inherent shortcoming because of the subjective nature of ultrasonography for measuring follicle counts.
Thus far, no single ORT has shown superior accuracy. Therefore, the present study aimed to investigate the combined relationship of age and various ORTs with the final number of oocytes retrieved in order to develop appropriate equations and novel nomograms to predict the final number of oocytes retrieved. This could help ensure more objective and quantitative prediction of ovarian response and support counseling for patients requiring ovarian stimulation.
Methods
1. Study subjects
We selected 141 infertile patients who were undergoing the first IVF cycle at a fertility clinic of the Seoul National University Hospital (October 2009 to December 2013). This study was approved by the Institutional Review Board of Seoul National University Hospital. Clinical data of treatment cycles were collected from the medical database. Basal serum FSH level and random serum level of AMH were recorded if they were measured within 6 months prior to ovarian stimulation. AFC through transvaginal ultrasonography was recorded if it was measured on menstrual day 2 to 4 within 6 months prior to ovarian stimulation.
Patients with polycystic ovary syndrome (PCOS), in accordance with the diagnosis defined by the 2003 ASRM/ESHRE Rotterdam consensus, were excluded from this study. Other exclusion criteria were a history of surgical correction of ovarian endometriosis or current endometriosis, irregular menstruation, ovarian cysts >2 cm at the time of ovarian stimulation, a history of oral contraceptive use for 3 months before the start of ovarian stimulation, abnormal findings on thyroid function test or elevated prolactin level, and current medication for chronic diseases.
2. Blood test
Basal serum FSH level was measured on menstrual cycle day 2 to 3 using a Coat-A-Count FSH immunoradiometric assay procedure (Siemens Healthcare Diagnostics, Los Angeles, CA, USA). The sensitivity of the FSH measurement was 0.06 IU/L, and the inter-assay and intra-assay coefficients of variation were 6.8% and 4.9%, respectively. AMH level was measured by enzyme-linked immunosorbent assay using the Immunotech version kit (ref A11893, Beckman Coulter Inc., Marseilles, France) until September 2012 and the Gen II kit (ref A79765, Beckman Coulter, Brea, CA, USA) starting in October 2012. Their sensitivities were 0.14 ng/mL and 0.08 ng/mL, inter-assay coefficients of variation were 5.6% and 4.5%, and intra-assay coefficients of variation were 5.4% and 3.6%, respectively.
3. Controlled ovarian stimulation and oocyte retrieval
Ovarian stimulation was performed using recombinant FSH (rFSH; Gonal-F, Merck-Serono, Geneva, Switzerland). The starting dose of rFSH varied from 75 IU to 450 IU according to the judgment of five clinicians. Usually, the starting dose for women expected to be normal responders was 225 IU. For expected high responders, it was 150 IU and for expected poor responders, it was 300 IU. For expected extremely high and extremely poor responders, the starting dose was 75–112.5 IU and 375–450 IU, respectively. The pituitary was suppressed using a flexible multiple-dose protocol of gonadotropin-releasing hormone (GnRH) antagonist (Cetrotide 0.25 mg/d, Merck-Serono) (n=75) or a mid-luteal long protocol of GnRH agonist (Decapeptyl 0.1 mg/d, Ferring, Malmo, Sweden) (n=66). Follicular growth was monitored from menstrual cycle day 7 via transvaginal ultrasonography. Dose adjustment was performed once at that time if necessary, and follicular growth was then observed every 1 to 2 days. When the mean diameter of a dominant follicle reached 18 mm or two follicles reached 17 mm in diameter, recombinant human chorionic gonadotropin (rhCG; Ovidrel 250 µg, Merck-Serono) was administered to trigger final follicular maturation. Oocyte retrieval was performed 35 to 36 hours after rhCG administration by experienced clinicians using a 17-gauge single lumen ovum aspiration needle (Cook Medical, Queensland, Australia), and all follicles with a diameter >10 mm were aspirated. MI and MII oocytes were counted as the total number of oocytes considered to have the potential for fertilization.
4. Statistical analysis
To predict the number of oocytes retrieved, a generalized linear model (GLM) with a Poisson distribution and log link function was applied, with the number of oocytes as the response variable and age, basal FSH level, AMH level, and AFC as the predictor variables. Cycles with the GnRH antagonist protocol and the GnRH agonist long protocol were not separated because each ORT and their combinations showed similar predictive performance in both protocols. Since the AMH values obtained from the Gen I and Gen II versions were almost the same, they were analyzed together [13]. SAS ver. 9.2 (SAS Institute Inc., Cary, NC, USA) was used for all statistical analyses, and the prediction equations and nomograms were produced with R (R-3.0.3). The Akaike information criterion (AIC) and Bayesian information criterion (BIC) were employed to compare goodness of fit among the prediction models. The predictive performance of each prediction model was evaluated using calibration plots of the differences between the observed values and those predicted by the models. For discrimination, the linear relationship between the observed and the predicted values was assessed. Results were considered statistically significant at p-values of <0.05.
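The comparison of goodness of fit can be illustrated with a short sketch. The Python code below is not the SAS procedure used in this study; it only shows, under the usual textbook definitions, how a Poisson log-likelihood leads to AIC (2k–2lnL) and BIC (k·ln(n)–2lnL). The function names and the four oocyte counts and fitted means are hypothetical and were chosen purely for illustration.

```python
import math

def poisson_log_likelihood(observed, predicted_mean):
    """Sum of log P(Y = y | mu) = y*log(mu) - mu - log(y!) over all cycles."""
    return sum(
        y * math.log(mu) - mu - math.lgamma(y + 1)
        for y, mu in zip(observed, predicted_mean)
    )

def aic_bic(log_lik, n_params, n_obs):
    """Textbook definitions; statistical software may add constants."""
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * math.log(n_obs) - 2 * log_lik
    return aic, bic

# Hypothetical data: observed oocyte counts and fitted means from two models.
observed = [3, 8, 12, 5]
fitted_a = [3.5, 7.5, 11.0, 5.5]   # hypothetical model with 2 parameters
fitted_b = [7.0, 7.0, 7.0, 7.0]    # intercept-only model, 1 parameter

aic_a, bic_a = aic_bic(poisson_log_likelihood(observed, fitted_a), 2, len(observed))
aic_b, bic_b = aic_bic(poisson_log_likelihood(observed, fitted_b), 1, len(observed))

# The model with the lower AIC/BIC is preferred.
print("model a:", round(aic_a, 2), round(bic_a, 2))
print("model b:", round(aic_b, 2), round(bic_b, 2))
```

In this toy example the two-parameter model has the lower AIC and BIC despite its larger penalty term, which is the same logic used to rank the prediction models.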
Results
1. Clinical characteristics and univariate analysis
The statistical characteristics of women's age, ORTs, and stimulation outcomes from 141 IVF cycles are summarized in Table 1.

Table 1. Statistical characteristics of several ovarian reserve markers and stimulation outcomes from 141 IVF cycles
Univariate analysis revealed that the number of oocytes retrieved had a significant negative relationship with age (Spearman correlation coefficient ρ=–0.433, p<0.001) and basal serum FSH level (ρ=–0.404, p<0.001), but showed a significant positive relationship with serum AMH level (ρ=0.718, p<0.001) and AFC (ρ=0.757, p<0.001) (Figure 1). The number of oocytes retrieved was significantly different between cycles with high basal FSH level (>10 IU/L) and normal FSH level (<10 IU/L) (relative risk, 0.46; 95% confidence interval, 0.40–0.54, p<0.001).
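For readers unfamiliar with the Spearman coefficient, the following Python sketch shows one common way to compute it: each variable is converted to ranks (tied values receive their average rank), and the Pearson correlation of the ranks is taken. It is an illustration only, not the software used in this study, and the AMH values and oocyte counts in the example are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed variables."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical example: serum AMH (ng/mL) and number of oocytes retrieved.
amh = [0.5, 1.2, 2.0, 3.1, 4.4]
oocytes = [2, 5, 4, 9, 12]
print(spearman_rho(amh, oocytes))
```

A value close to +1 indicates that higher marker values tend to accompany more oocytes, and a value close to –1 indicates the opposite.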
2. Multivariate analysis and prediction equations
Multivariate analysis was performed using significant predictive factors for the number of oocytes retrieved (age, basal serum FSH level, serum AMH level, and AFC) (Table 2). Four prediction models were constructed using several combinations of ORTs, and all the models showed statistical significance:
Model 1: Ln (number of oocytes retrieved)=3.21–0.036×(age)+0.089×(AMH)
Model 2: Ln (number of oocytes retrieved)=3.422–0.03×(age)–0.049×(FSH)+0.08×(AMH)
Model 3: Ln (number of oocytes retrieved)=2.32–0.017×(age)+0.039×(AMH)+0.03×(AFC)
Model 4: Ln (number of oocytes retrieved)=2.584–0.015×(age)–0.035×(FSH)+0.038×(AMH)+0.026×(AFC)
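Because the equations are on the natural log scale, the expected number of oocytes is obtained by exponentiating the right-hand side. The following Python sketch evaluates the four equations; it is an illustration and was not part of the original analysis. The function name predict_oocytes and the dictionary layout are hypothetical, and the units are assumed to be those used in the text (age in years, FSH in IU/L, AMH in ng/mL, and AFC as a count).

```python
import math

# Coefficients copied from models 1 to 4 (natural log scale).
# Assumed units: age in years, FSH in IU/L, AMH in ng/mL, AFC as a count.
MODELS = {
    1: {"intercept": 3.21, "age": -0.036, "amh": 0.089},
    2: {"intercept": 3.422, "age": -0.03, "fsh": -0.049, "amh": 0.08},
    3: {"intercept": 2.32, "age": -0.017, "amh": 0.039, "afc": 0.03},
    4: {"intercept": 2.584, "age": -0.015, "fsh": -0.035, "amh": 0.038, "afc": 0.026},
}

def predict_oocytes(model, **markers):
    """Return exp(linear predictor) for the chosen model (hypothetical helper)."""
    coefs = MODELS[model]
    log_count = coefs["intercept"]
    for name, beta in coefs.items():
        if name == "intercept":
            continue
        # Raises KeyError if a marker required by the model is not supplied.
        log_count += beta * markers[name]
    return math.exp(log_count)

# Model 1 for a 34-year-old woman with a serum AMH of 2.0 ng/mL.
print(round(predict_oocytes(1, age=34, amh=2.0), 1))
```

For a 34-year-old woman with a serum AMH value of 2.0 ng/mL, model 1 gives a linear predictor of 3.21–0.036×34+0.089×2.0=2.164, that is, approximately 8.7 oocytes, which agrees with the worked example given for the nomograms in Figure 3.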
On AIC and BIC analysis, model 4 showed the lowest values, indicating that this model had the best goodness of fit (Table 3). In the evaluation of discrimination, the Spearman correlation coefficient was the highest for model 3, followed by models 4, 2, and 1; however, the difference between models 3 and 4 was not significant. Leave-one-out cross-validation, performed as an internal validation, showed the same ranking of the models.
On a calibration plot of the predicted number of oocytes and the actual number of oocytes observed (Figure 2), models 3 and 4 showed a similar trend; that is, these models were the closest to the straight line through the origin (0, 0) that indicates a perfect match between the observed and predicted number of oocytes.

Figure 2. Calibration plots showing the difference between the predicted and observed number of oocytes according to model 1 (A), model 2 (B), model 3 (C), and model 4 (D).
According to the four prediction models, four nomograms were generated (Figure 3).

Figure 3. Nomograms to predict the number of oocytes retrieved using each prediction model: model 1 (A, B), model 2 (C), model 3 (D), and model 4 (E). For example, in a 34-year-old woman who has a serum AMH value of 2.0 ng/mL, approximately 8.7 oocytes could be expected by nomogram (A). The same result could be obtained by nomogram (B): an age of 34 years gains a score of 36.5 and an AMH value of 2.0 ng/mL gains a score of 11.5; thus, their sum score of 48 corresponds to approximately 8.7 oocytes. AMH, anti-Müllerian hormone; FSH, follicle-stimulating hormone; AFC, antral follicle count.
Discussion
This is the first study to establish nomograms that predict the final number of oocytes in stimulated IVF cycles by using several combinations including age, basal serum FSH level, serum AMH level, and AFC. Our developed nomograms may be clinically useful because the number of oocytes can be predicted in women with basal serum FSH or serum AMH level alone.
Among prediction models in the present study, model 4, which used four variables, showed the best ability to predict the number of oocytes. However, model 3 (leaving out basal serum FSH level) also showed superior results. This suggests a relatively weak predictive performance of basal serum FSH level to determine ovarian response to exogenous stimulation. This finding was in accordance with the results of Broer et al. [14]. They reported that predictive power for poor responders (number of oocytes <4) was superior by using a combination of age, serum AMH level, and AFC, when compared to the use of one or two variables, but no marked difference was seen when four variables including FSH level were combined.
High basal serum FSH level usually predicts a poor response, but in this case, the serum AMH levels were helpful to determine outcome of stimulation. In one study, among women with elevated basal serum FSH levels, the subgroup with serum AMH levels ≥0.6 ng/mL yielded twice the number of oocytes compared with the subgroup with AMH levels <0.6 ng/mL [15].
Previous studies have usually reported the predictive performance of each ORT or their combination for the prediction of poor responders; however, our study focused on the performance of several combinations of ORTs to predict the number of oocytes to quantitatively measure ovarian reserve.
Because serum AMH measurement has been available in clinics only since 2009, previous studies have usually used age and basal serum FSH level to determine the starting dose of rFSH [16,17,18]. However, a recent study included serum AMH level, and nomograms were developed for determining the rFSH starting dose by using a combination of age, basal serum FSH level, and serum AMH level [19]. The nomograms were generated from the prediction model of the rFSH starting dose to retrieve nine oocytes; therefore, many high responders (women with polycystic ovary on ultrasonography) and poor responders (basal serum FSH level >15 IU/L) were excluded.
We also excluded women with PCOS, but included high, normal, and poor responders in the data set. Therefore, our developed nomograms can be applied to both substantially high and poor responders.
A higher starting dose of gonadotropin would usually result in a greater number of oocytes in stimulated IVF cycles. However, in the present study, the number of oocytes retrieved was inversely proportional to the starting and total dose of rFSH. This appears to be mainly due to the inclusion of poor responders in the study; in poor responders, few oocytes were retrieved even after the administration of a high dose of gonadotropin. In the GLM model, the starting dose and the total dose of rFSH were excluded because of their low predictive performance. The number of oocytes retrieved might depend on the ORT or combination of ORTs used, rather than the starting or total dose of gonadotropin.
We included all cycles with the GnRH antagonist or agonist long protocol. Although the GnRH antagonist protocol was more frequently used in poor responders in the present study, a similar predictive performance of ORTs and their combinations was produced with both protocols.
In conclusion, the number of oocytes was accurately predicted by a combination of age, basal serum FSH level, serum AMH level, and AFC. From several prediction models, we developed nomograms to predict the number of oocytes retrieved. These nomograms may be a useful aid in assisted conception, although they need further validation to improve individualized prediction.
Acknowledgments
We appreciate the statistical support provided by the Seoul National University Hospital Medical Research Collaboration Center.
Notes
This work was supported by the Seoul National University Hospital Research Fund (No. 0420130290), Korea.
Conflict of interest: No potential conflict of interest relevant to this article was reported.