Introduction

The OGTT is used to classify subjects as having normal glucose tolerance (NGT), impaired glucose tolerance (IGT) or diabetes. In addition to the determination of glucose tolerance, measurements made during the OGTT are frequently used to derive measures of beta cell function. This applies in particular to epidemiological or intervention studies, where use of the OGTT is more practical than time-consuming procedures such as the IVGTT or the hyperglycaemic clamp.

Many different indices of beta cell function have been suggested. For example, the early insulin response to changes in glucose during the OGTT, termed the insulinogenic index, is often used as a measure of beta cell function. Integrated measures of the insulin response relative to the change in glucose over the time course of the OGTT have also been used. In addition, a new mathematical model has been applied to the OGTT to provide three indices of beta cell function: beta cell sensitivity to glucose, beta cell sensitivity to the rate of change of glucose and potentiation to factors such as incretin hormones and hyperglycaemia [1].

While the within-subject variability of fasting and 2 h measurements of glucose and insulin has been extensively documented [14], the reproducibility of measures of beta cell function from an OGTT has not been well characterised. Determining the within-subject variability of these measures is crucial in the design of longitudinal and intervention studies that aim to investigate changes in beta cell function and insulin sensitivity in the pathogenesis of type 2 diabetes or the prevention or treatment thereof. Longitudinal studies that rely on paired analysis and use measures of beta cell function with high within-subject variability will require much larger sample sizes. Thus, the primary aim of this study was to determine the within-subject variability of a number of different measures of beta cell function calculated using 2 h OGTT measurements in subjects with NGT, IGT and type 2 diabetes. In addition, we compared the variability of different measures of insulin sensitivity derived from either fasting or OGTT measures. This information provides a valuable resource for investigators, allowing them to more accurately calculate sample sizes.

Methods

Participants

A total of 39 volunteers were enrolled in the study and completed two study days. Participants were categorised by glucose tolerance status, based on their fasting and 2 h glucose values from their first OGTT, according to the American Diabetes Association guidelines (NGT, 2 h glucose <7.8 mmol/l; IGT, 2 h glucose 7.8–11.1 mmol/l; diabetes, fasting plasma glucose >7.0 mmol/l, 2 h glucose >11.1 mmol/l or use of diabetes medication). Six of the subjects with IGT also had impaired fasting glucose (IFG; fasting plasma glucose, 5.6–7.0 mmol/l). Because we were primarily interested in the variability of measures derived from values obtained during the OGTT rather than from fasting values, subjects with IFG but a normal 2 h glucose (n = 3) were included in the NGT group. CVs were also calculated for the NGT group excluding these three subjects with IFG. Four subjects with known diabetes were being treated with a stable dose of metformin that was continued on both study days. These subjects were classified as having diabetes. The additional ten subjects with diabetes were on diet treatment alone. Two subjects were excluded for the following reasons: one subject lost 4 kg between study days 1 and 2 (IGT on the first OGTT, NGT on the second OGTT), and one subject had a high fasting insulin level on one study day, which suggested that the subject was not fasting (NGT subject). Thus, data from 37 subjects were used for this analysis. Subjects were otherwise in good health and were not taking medications such as steroids or niacin that would affect beta cell function or insulin sensitivity. All subjects gave written informed consent to participate in the study, which was reviewed and approved by the Human Subjects Review Committee at the University of Washington.

Study design

Each subject was studied twice within 2 weeks. Subjects were instructed to fast for at least 10 h overnight and to refrain from exercise on the morning of the study. They were told to maintain their usual daily routine between study days. After placement of an i.v. catheter, subjects were allowed to rest for at least 15 min before the basal samples were drawn. A standard 75 g OGTT was performed and samples were drawn at −10, −5, −1, 15, 30, 60, 90 and 120 min and placed immediately on ice. Blood samples were separated and the plasma stored at −80°C prior to being assayed.

Assays

Samples for study days 1 and 2 were run in the same assay to avoid inter-assay variability. Plasma glucose concentrations were measured using the glucose oxidase method. Plasma immunoreactive insulin levels were measured using a modification of a double-antibody RIA [5]. Plasma C-peptide levels were measured by a two-site immunoenzymometric assay (Tosoh AIA 600II autoanalyzer; Tosoh Bioscience, South San Francisco, CA, USA). The intra-assay CVs were 1.4% for glucose, 8% for insulin and 3.2% for C-peptide.

Computation and statistical analysis

Fasting values for glucose, insulin and C-peptide were calculated as the average of the three basal values. The insulinogenic index was calculated as the change in insulin divided by the change in glucose over the first 30 min of the OGTT (ΔI0–30/ΔG0–30). As a comparator to the insulinogenic index, the change in C-peptide divided by the change in glucose over the first 30 min of the OGTT (ΔCP0–30/ΔG0–30) was also computed. The incremental AUCs for insulin (incAUCins) and C-peptide (incAUCCP) were computed by applying the trapezoidal rule on values from 0 to 120 min after subtracting the basal (fasting) value, and the corresponding value for glucose (incAUCglu) was computed in a similar manner. The ratios of these parameters (incAUCins/incAUCglu and incAUCCP/incAUCglu) were evaluated as measures of insulin release during the entire OGTT. Basal insulin secretion (IS-0), and the total integrated insulin secretion divided by the mean change in glucose from 0 to 120 min (IS/Glu0–120) were determined using C-peptide deconvolution as previously described [6]. Based on these measures of insulin secretion, estimates of basal insulin clearance (IC) were calculated as IS-0 divided by average fasting insulin, and estimates of total insulin clearance (IC0–120) were calculated as integrated insulin secretion from 0 to 120 min divided by the average insulin concentration from 0 to 120 min.
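The two simplest computations described above, the insulinogenic index and the incremental AUC by the trapezoidal rule, can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names and the example concentrations are hypothetical, and the basal value is assumed to be supplied as the 0 min point.

```python
from typing import Sequence


def insulinogenic_index(ins0: float, ins30: float, glu0: float, glu30: float) -> float:
    """Change in insulin divided by change in glucose over the first 30 min."""
    delta_glu = glu30 - glu0
    if delta_glu == 0:
        raise ValueError("glucose did not change between 0 and 30 min")
    return (ins30 - ins0) / delta_glu


def incremental_auc(times: Sequence[float], values: Sequence[float], basal: float) -> float:
    """Trapezoidal area under (value - basal) across the sampling times."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need matching time and value series with at least 2 points")
    above = [v - basal for v in values]
    area = 0.0
    for i in range(1, len(times)):
        # Area of one trapezoid between consecutive sampling times.
        area += 0.5 * (above[i - 1] + above[i]) * (times[i] - times[i - 1])
    return area


# Hypothetical example values (not study data): minutes, mmol/l, pmol/l.
times = [0, 15, 30, 60, 90, 120]
glucose = [5.0, 7.0, 8.0, 8.0, 7.0, 6.0]
insulin = [60.0, 300.0, 420.0, 480.0, 360.0, 240.0]

inc_auc_glu = incremental_auc(times, glucose, basal=glucose[0])
inc_auc_ins = incremental_auc(times, insulin, basal=insulin[0])
auc_ratio = inc_auc_ins / inc_auc_glu
early_response = insulinogenic_index(insulin[0], insulin[2], glucose[0], glucose[2])
```

A negative or zero glucose increment makes either ratio uninterpretable, which is why such cases have to be handled separately before ln-transformation.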

Parameters of beta cell function were derived from mathematical analysis of plasma glucose and C-peptide concentrations during the OGTT, according to a previously developed model [7]. Beta cell glucose sensitivity denotes the mean slope of the relationship relating insulin secretion to the glucose concentration, and reflects the response of the beta cell to prevailing plasma glucose concentrations in the steady state. Rate sensitivity denotes the dynamic insulin secretion component that is a function of the rate of change of the plasma glucose concentration during the OGTT. The dose response is modulated by a time-dependent potentiation factor that is normalised such that the 0–120 min integrated value is 1.0 for each experiment. The ratio of the potentiation factor at the end of the OGTT (100–120 min) to the potentiation factor at the beginning of the OGTT (0–20 min) is defined as the potentiation parameter. Although the precise physiological implications of the potentiation parameter are unknown, it may reflect the effects of incretin hormones such as glucose-dependent insulinotropic peptide and glucagon-like peptide-1.

Several indices of insulin sensitivity or insulin resistance were determined, including: (1) fasting insulin; (2) homeostasis model assessment of insulin resistance (HOMA-IR), using non-specific insulin (computed with the calculator available from http://www.dtu.ox.ac.uk, last accessed in August 2007) [8]; (3) the model-derived oral glucose insulin sensitivity (OGIS-120), which is calculated using glucose and insulin concentrations at baseline and 90 and 120 min [9]; (4) Matsuda’s insulin sensitivity index (ISIM), which is calculated as $10{,}000/\sqrt{\text{fasting glucose} \times \text{fasting insulin} \times \text{average glucose OGTT} \times \text{average insulin OGTT}}$ [10]; and (5) Stumvoll’s insulin sensitivity index (ISIS), which is calculated as 0.226 − 0.0032 × BMI − 0.0000645 × Ins120 − 0.00375 × Gluc90 [11]. All indices were calculated using SI units.
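The two closed-form indices in this list can be written out directly. The sketch below simply applies the expressions as given in the text; the function names and the example inputs are hypothetical, and unit handling is left to the caller.

```python
import math


def matsuda_isi(fasting_glu: float, fasting_ins: float,
                mean_glu: float, mean_ins: float) -> float:
    """ISIM: 10,000 over the square root of the four-way product."""
    return 10_000 / math.sqrt(fasting_glu * fasting_ins * mean_glu * mean_ins)


def stumvoll_isi(bmi: float, ins120: float, glu90: float) -> float:
    """ISIS: linear expression in BMI, 120 min insulin and 90 min glucose."""
    return 0.226 - 0.0032 * bmi - 0.0000645 * ins120 - 0.00375 * glu90


# Hypothetical inputs, chosen only to show the arithmetic.
isi_m = matsuda_isi(fasting_glu=5.0, fasting_ins=20.0, mean_glu=8.0, mean_ins=50.0)
isi_s = stumvoll_isi(bmi=25.0, ins120=300.0, glu90=6.0)

# Because ISIS is a difference of terms, a high BMI, high 120 min insulin
# and high 90 min glucose can push it below zero, which then prevents
# ln-transformation of that value.
isi_s_negative = stumvoll_isi(bmi=35.0, ins120=1500.0, glu90=12.0)
```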

The data were ln-transformed to correct for heteroscedasticity with variance increasing with increasing mean, except for two variables, 2 h glucose and rate sensitivity, which were already homoscedastic. For these two variables an estimated CV based on the mean for the group was calculated as the SD divided by the mean. Because the mean 2 h glucose differed between glucose tolerance categories, estimates for the CV for 2 h glucose are reported for each category only. Since ln-transformation cannot be performed on negative values, the following numbers of subjects were excluded from analysis: ISIS (n = 7, four IGT and three diabetes), ΔI0–30/ΔG0–30 (n = 1), ΔCP0–30/ΔG0–30 (n = 2), incAUCins/incAUCglu (n = 1) and incAUCCP/incAUCglu (n = 1).

The distribution of the difference between study days 1 and 2 (using ln-transformed variables as indicated above) was examined visually, and outliers were tested using the extreme Studentised deviate (ESD) statistic (|extreme − mean|/SD). Values outside the 95th percentile from the mean (ESD, >2.98) were excluded as outliers. One outlier was excluded for each of the following variables: ln[fasting C-peptide], ln[ISIS], ln[OGIS-120], ln[ΔI0–30/ΔG0–30], ln[ΔCP0–30/ΔG0–30], ln[incAUCins/incAUCglu], ln[IS-0], ln[IS/Glu0–120], ln[glucose sensitivity], rate sensitivity and ln[basal IC].
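The outlier check can be illustrated with a short sketch. It assumes a single pass over the day 1 minus day 2 differences and takes the critical value of 2.98 from the text; the function names and the example differences are hypothetical.

```python
import statistics
from typing import Optional, Sequence, Tuple


def esd_statistic(values: Sequence[float]) -> Tuple[float, int]:
    """Return (|extreme - mean| / SD, index of the most extreme value)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / sd, index


def find_outlier(values: Sequence[float], critical: float = 2.98) -> Optional[int]:
    """Index of the most extreme value if its ESD exceeds the critical value."""
    esd, index = esd_statistic(values)
    return index if esd > critical else None


# Hypothetical between-day differences on the ln scale.
typical = [0.1, -0.1] * 10
with_outlier = [0.1] * 10 + [-0.1] * 9 + [2.0]

clean_result = find_outlier(typical)          # no value is flagged
flagged_index = find_outlier(with_outlier)    # position of the 2.0 difference
```

The appropriate critical value depends on the number of observations, so the default here should be treated as specific to this study.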

Bland–Altman plots were used to assess for systematic bias in variables between the first and second study day. The 95% CI for the mean difference between study days was calculated. If the 95% CI included zero, the variable was considered to show no systematic bias. Repeated-measures ANOVA with study day as the within-subject factor and glucose tolerance category as the between-subjects factor was performed on ln-transformed data. Based on ANOVA, the significance of differences between day 1 and day 2 was determined for each variable and differences by glucose tolerance category were assessed. The CV of the non-transformed variable was calculated from the mean square error of the ln-transformed variable as: $\mathrm{CV} = \sqrt{\exp(\mathrm{MS}_{\mathrm{within}}) - 1}$, where $\mathrm{MS}_{\mathrm{within}}$ is the within-subject mean square [12]. Differences in CV between glucose tolerance categories were assessed using the Kruskal–Wallis rank sum test.
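The conversion from the within-subject mean square to a CV can be sketched for the simplest case of two measurements per subject. This is an approximation under stated assumptions: the mean square is taken from a one-way layout (the sum of squared ln-differences divided by 2n), so any day effect removed by the repeated-measures ANOVA is ignored, and the function name is hypothetical.

```python
import math
from typing import Sequence


def within_subject_cv(day1: Sequence[float], day2: Sequence[float]) -> float:
    """CV of the untransformed variable from paired, positive measurements."""
    if len(day1) != len(day2) or not day1:
        raise ValueError("need two non-empty series of equal length")
    # Between-day differences on the ln scale, one per subject.
    diffs = [math.log(a) - math.log(b) for a, b in zip(day1, day2)]
    # For duplicates, the within-subject mean square is sum(d^2) / (2n).
    ms_within = sum(d * d for d in diffs) / (2 * len(diffs))
    return math.sqrt(math.exp(ms_within) - 1)


# Hypothetical example: one subject whose value differs by 0.2 ln units.
cv_example = within_subject_cv([100.0], [100.0 * math.exp(0.2)])
```

For small mean squares the result is close to the square root of the mean square itself, which is why the CV is often read as the within-subject SD on the ln scale.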

The intra-class correlation coefficient (ICC; defined as the proportion of the total variance that is due to between-subject variance) was computed for each variable using a one-way random effects model on ln-transformed data for all variables except 2 h glucose and rate sensitivity. The ICC demonstrates how the reproducibility of a measure compares with the variation in the specified population.
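As an illustration of the ICC definition above, the following sketch computes the one-way random effects ICC for single measurements from the between-subject and within-subject mean squares. The function name is hypothetical, the input is assumed to be already ln-transformed where appropriate, and the CI calculation is omitted.

```python
from typing import Sequence


def icc_one_way(groups: Sequence[Sequence[float]]) -> float:
    """One-way random effects ICC; each inner sequence is one subject."""
    n = len(groups)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(groups[0])
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("each subject needs the same number (>= 2) of measurements")
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("no variation in the data")
    return (ms_between - ms_within) / denominator


# Hypothetical data: perfectly reproducible subjects give an ICC of 1.
perfect = icc_one_way([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
```

Because the between-subject mean square depends on how heterogeneous the sample is, the same within-subject variability yields a lower ICC in a more homogeneous population.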

The confidence limits of the CVs were determined from 10,000 bootstrap replications of the original data, i.e. the original data were re-sampled with replacement. With the bootstrap method, re-sampling of data with replacement is approximately equivalent to sampling from the (unknown) original population; confidence limits can therefore be determined from multiple bootstrap replications. Using this approach, the 95% confidence limits were determined as the 2.5th and 97.5th percentiles of the replicate values [13].
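A percentile bootstrap of this kind can be sketched as follows. The sketch assumes that subjects (day 1 and day 2 pairs) are the re-sampling unit and uses a simple nearest-rank percentile; the function names, the seed and the example values are hypothetical.

```python
import math
import random
from typing import Sequence, Tuple

Pair = Tuple[float, float]


def within_subject_cv(pairs: Sequence[Pair]) -> float:
    """CV from the within-subject mean square of ln-transformed pairs."""
    diffs = [math.log(a) - math.log(b) for a, b in pairs]
    ms_within = sum(d * d for d in diffs) / (2 * len(diffs))
    return math.sqrt(math.exp(ms_within) - 1)


def bootstrap_cv_limits(pairs: Sequence[Pair], n_boot: int = 10_000,
                        seed: int = 0) -> Tuple[float, float]:
    """95% percentile limits of the CV from n_boot re-samples of subjects."""
    rng = random.Random(seed)
    n = len(pairs)
    replicates = sorted(
        # Draw n subjects with replacement and recompute the CV.
        within_subject_cv([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = replicates[int(0.025 * n_boot)]
    upper = replicates[int(0.975 * n_boot) - 1]
    return lower, upper


# Hypothetical paired measurements (day 1, day 2) for five subjects.
example_pairs = [(100.0, 110.0), (80.0, 70.0), (120.0, 125.0), (95.0, 90.0), (60.0, 75.0)]
lower_limit, upper_limit = bootstrap_cv_limits(example_pairs, n_boot=500, seed=1)
```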

Statistical analyses were performed using SPSS software (version 14; SPSS, Chicago, IL, USA). Data that are not normally distributed are presented as medians and interquartile range (IQR). A p value of <0.05 was considered statistically significant.

Results

Subject characteristics

Thirteen subjects were categorised as having NGT (seven men, six women), ten as having IGT (three men, seven women) and 14 as having diabetes (eight men, six women). The median time interval between study day 1 and study day 2 was 7 days (range, 5–14) and did not differ significantly between glucose tolerance categories. NGT subjects (age 44.3 ± 3.2 years; mean±SD, p < 0.05) were significantly younger than subjects with IGT (56.8 ± 4.5 years) or diabetes (60.6 ± 2.2 years). Subject characteristics are presented in Table 1. All variables showed significant differences between the glucose tolerance categories, with the exception of the rate sensitivity, potentiation parameters and measures of insulin clearance.

Table 1 Measures from the OGTT study day 1 and study day 2 for all study subjects

The glucose, insulin and C-peptide responses during the OGTT were similar between study day 1 and day 2 (Fig. 1). There were small but statistically significant differences between study day 1 and study day 2 when analysed for all subjects for the following variables (median [IQR]): fasting insulin (76.9 [38.7] vs 65.1 [45.4] pmol/l, p = 0.02), HOMA-IR (1.49 [0.82] vs 1.29 [0.85], p = 0.02), glucose sensitivity (76.4 [68.8] vs 84.0 [83.9] pmol [mmol/l]⁻¹ m⁻² min⁻¹, p = 0.04), and potentiation (1.57 ± 0.74 vs 1.31 ± 0.61, p = 0.04). When analysed by glucose tolerance category, BMI was significantly higher on study day 2 vs day 1 in the NGT group (p = 0.008), glucose sensitivity was significantly higher on study day 2 vs day 1 in the diabetes group (p = 0.008) and potentiation was higher on study day 1 vs day 2 in the diabetes group (p = 0.02). There were no statistically significant differences between study day 1 and day 2 for any of the other variables.

Fig. 1

Glucose (a), insulin (b) and C-peptide (c) curves for study days 1 and 2 for each glucose tolerance category. Data are presented as means±SEM. Circles, NGT; squares, IGT; triangles, diabetes; closed symbols represent study day 1 and open symbols represent study day 2

Bland–Altman plots

Bland–Altman plots for fasting and 2 h measures are presented in Fig. 2, and measures of insulin response, secretion and mathematical model-derived beta cell function are presented in Fig. 3. The ln-transformed values are shown, except for 2 h glucose and rate sensitivity.

Fig. 2

Bland–Altman plots for fasting and 2 h glucose, fasting insulin and fasting C-peptide. Variables that demonstrated heteroscedasticity (increasing variability with increasing mean) were ln-transformed. Circles, NGT; squares, IGT; triangles, diabetes; dashed lines indicate the 95% CIs for the mean difference between study days. a ln[fasting plasma glucose], b 2 h plasma glucose, c ln[fasting insulin], d ln[fasting C-peptide]. Fasting insulin showed a slight bias, with higher values on study day 1 vs 2

Fig. 3

Bland–Altman plots for measures of insulin secretion and beta cell function are presented. Variables demonstrating heteroscedasticity were ln-transformed. Circles, NGT; squares, IGT; triangles, diabetes; dashed lines indicate the 95% CIs for the mean difference between study days. a ln[ΔI0–30/ΔG0–30], b ln[ΔCP0–30/ΔG0–30], c ln[incAUCins/incAUCglu], d ln[IS/Glu0–120], e ln[IS-0], f ln[glucose sensitivity], g rate sensitivity, h ln[potentiation]. Glucose sensitivity and potentiation showed a slight bias as the 95% CI did not include 0

The 95% CI for the mean of the difference between study days did not include zero for the following variables: fasting insulin, glucose sensitivity and potentiation, although the 95% CI’s lower bounds for these variables were very close to zero, suggesting a relatively unimportant effect of the day of study on measure values.

Bland–Altman plots were also constructed for ln[HOMA-IR], ln[OGIS-120], ln[ISIM], ln[ISIS], ln[incAUCCP/incAUCglu] and ln[IC], and all demonstrated homoscedasticity (plots not shown). The 95% CIs included zero for ln[ISIS] (−0.056 to 0.137), ln[OGIS-120] (−0.055 to 0.020), ln[incAUCCP/incAUCglu] (−0.111 to 0.057), ln[basal IC] (−0.091 to 0.003) and ln[IC0–120] (−0.048 to 0.054), but did not include zero for ln[HOMA-IR] (0.016 to 0.169) or ln[ISIM] (−0.134 to −0.060).

Within-subject variability

Within-subject variability was computed, and the CVs for each variable are presented in Table 2. The insulinogenic index (ΔI0–30/ΔG0–30) showed marked within-subject variability. There was less variability when C-peptide rather than insulin was used. The values for incAUCins/incAUCglu, incAUCCP/incAUCglu and the integrated insulin secretion measure using C-peptide deconvolution (IS/Glu0–120), exhibited much less variability than either of the early 30 min responses. Glucose sensitivity showed moderate variability, whereas that for the potentiation factor was higher. Measures of insulin sensitivity, including fasting insulin, HOMA-IR, ISIM, ISIS and OGIS-120, had CVs that ranged from 7.3 to 24.0%.

Table 2 CV for OGTT measures

The CVs did not differ by glucose tolerance category, with the exception of fasting C-peptide (p < 0.05) and IS-0 (p < 0.05; Table 2). Exclusion of those subjects with diabetes who were on metformin did not alter the results appreciably (Table 2).

When the three subjects with IFG were removed from the NGT group, most CVs did not change appreciably (FPG 4.5%, OGIS-120 8.3%, ISIM 10.9%, ISIS 19.5%, IS-0 8.4%, ΔI0–30/ΔG0–30 43.0%, incAUCins/incAUCglu 29.6%, incAUCCP/incAUCglu 21.6%, glucose sensitivity 18.0%, potentiation 47.2%, basal IC 11.5%, IC0–120 11.2%). However, the CVs for fasting insulin (13.1%), HOMA-IR (12.7%), fasting C-peptide (8.4%) and ΔCP0–30/ΔG0–30 (21.9%) decreased somewhat when the subjects with IFG were removed.

The intra-class correlation coefficient

The ICC was determined for each variable for all subjects and the data are presented in Fig. 4. Most variables showed excellent ICCs (>0.75) in this population sample. Both the early insulin and C-peptide responses showed relatively lower ICCs, while rate sensitivity and potentiation demonstrated much poorer ICCs. The ICC for ln[incAUCCP/incAUCglu] (0.97; 95% CI 0.95–0.99) was very similar to that for IS/Glu0–120 and is not included in the figure.

Fig. 4

ICC for each variable. These were computed using a one-way random effects model (±95% CI). An ICC >0.75 is considered excellent; 0.4–0.75, fair to good; and <0.4, poor

Sample size calculations

Sample size estimates were calculated based on the within-subject SDs for all subjects for paired t test comparisons using an α-level of 0.05 and a power of 90% to detect a change from baseline of 20, 30 and 50%. Estimates used ln-transformed data for all subjects and are listed in Table 3. The early insulin and C-peptide responses would require much larger sample sizes than other measures.
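A rough version of such a calculation can be sketched from a within-subject CV alone. This is not the authors' procedure: it assumes a normal approximation to the paired t test, converts the CV back to a within-subject variance on the ln scale, and treats the variance of the paired difference as twice that value. The function name is hypothetical, and the results should not be expected to reproduce Table 3 exactly.

```python
import math
from statistics import NormalDist


def paired_sample_size(cv: float, change: float,
                       alpha: float = 0.05, power: float = 0.90) -> int:
    """Approximate number of subjects for a paired comparison on the ln scale.

    cv     -- within-subject CV as a fraction (0.571 for 57.1%)
    change -- relative change to detect as a fraction (0.20 for 20%)
    """
    within_var = math.log(cv ** 2 + 1)   # inverse of CV = sqrt(exp(MS) - 1)
    diff_var = 2 * within_var            # variance of a day 2 minus day 1 difference
    delta = math.log(1 + change)         # change to detect, on the ln scale
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(z_total ** 2 * diff_var / delta ** 2)


# Using the 57.1% CV reported for the insulinogenic index as an example input.
n_for_20_percent = paired_sample_size(cv=0.571, change=0.20)
n_for_50_percent = paired_sample_size(cv=0.571, change=0.50)
```

Because the required number scales with the within-subject variance and inversely with the square of the change to detect, halving the CV or doubling the detectable change reduces the sample size several-fold.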

Table 3 Sample size estimates for paired t test

Discussion

We have shown that there are important differences in within-subject variability for beta cell measures obtained from an OGTT. Thus, while easy to perform, the OGTT is hampered by high variability, particularly in measures of the early insulin response. Variability was decreased by using C-peptide rather than insulin and by using integrated measures. Use of a mathematical model to assess beta cell function showed that while the beta cell glucose sensitivity measure had modest variability, the potentiation parameter and rate sensitivity had higher variability.

We demonstrated a CV of 57.1% for the insulinogenic index. For comparison, the CV of the acute insulin response to intravenous glucose determined by this research group is less than half that (CV 20.6%) [14], which demonstrates the superior reproducibility of the intravenous test. Although within-subject variability may be affected by differences in protocol, technique or assay precision, our within-subject variability for the insulinogenic index in NGT subjects (CV 41.1%) is similar to that reported by Schousboe et al. (CV 34%) [15]. Thus, we believe that the high variability we observed for the insulinogenic index is genuine rather than an artefact of our methods. We eliminated inter-assay variability by measuring all samples from each individual in the same assay. Variability may be greater in studies in which the samples from the same subject are assayed at different times. In addition, the variability may be increased by other factors, such as prolonged storage between study visits.

The use of integrated AUC responses from 0 to 120 min and C-peptide measures decreased the within-subject variability compared with the insulinogenic index. In particular, the ratios of the incremental AUCs for insulin or C-peptide to the corresponding values for glucose (incAUCIns/incAUCglu and incAUCCP/incAUCglu), which are easy to calculate and do not require models, result in very acceptable within-subject variability and may be good alternative measures of the insulin response. The early C-peptide measure and the integrated insulin secretion measures using C-peptide also demonstrated less within-subject variability. The higher analytical variability of the insulin assay (intra-assay CV 8% for insulin vs 3.2% for C-peptide) and variability in insulin clearance may contribute to this finding. However, the C-peptide assay is relatively expensive, C-peptide is less stable in stored samples and C-peptide deconvolution techniques may not be readily available to all investigators. These factors must be balanced against the decreased sample size requirement when C-peptide is used.

While the beta cell glucose sensitivity measure derived from a mathematical model showed within-subject variability similar to that of the integrated insulin responses, estimates of the rate sensitivity and potentiation ratio showed higher variability. High variability for rate sensitivity has been reported in a previous study using this same model [1]. In spite of this considerable variability, rate sensitivity and potentiation may be useful to quantify treatment effects when the magnitude of change is large [16, 17].

To evaluate the degree to which within-subject variability contributes to the overall variability in our study population, we computed the ICC for each variable. In keeping with the higher CVs for certain measures, we observed poorer ICC values for the rate sensitivity and potentiation measures and lower ICC values for the early insulin and C-peptide responses compared with the integrated measures. The ICC is by definition population-based, because it depends on the between-subject variance of the sample studied; ICC results from this study may therefore not be applicable to other study populations. We therefore suggest that the CV values rather than the ICC results be used for study design or sample size computation.

While not the primary objective of the study, we also examined the variability of a number of surrogate measures of insulin sensitivity. Different measures of insulin sensitivity using fasting measures, formulas incorporating different time points during the OGTT and the OGIS-120 model all showed reasonable variability.

Measures of the insulin response cannot be interpreted alone, but must be assessed relative to the prevailing level of insulin sensitivity. The relationship between the acute insulin response and insulin sensitivity derived from an IVGTT has been shown to be hyperbolic in nature, and the product of the two (frequently called the disposition index) has been used as a measure of beta cell function [18]. It has therefore been assumed that the relationship between any two measures of insulin sensitivity and insulin release is hyperbolic. However, to truly determine whether a hyperbolic relationship exists, the variability of both measures needs to be accounted for in the analyses, an important statistical step that is often overlooked in analysis of this hyperbolic relationship.

It is important to recognise that this study was not designed to determine the validity of measures of beta cell function and insulin sensitivity or to compare one measure with another. However, our study does provide the data from which sample size calculations can be made for longitudinal or interventional studies where within-subject variability is important. Table 3 illustrates clearly that the use of measures such as the early insulin response will require much larger sample sizes than the incremental AUC approach to detect a similar percentage change in response. Further, the use of measures that are based on C-peptide requires smaller sample sizes for estimating both insulin secretion and insulin sensitivity.

Our study was not designed to determine the mechanisms underlying the variability in OGTT measures of beta cell function. Possible mechanisms include variation in the secretion of, or response to, incretin hormones; changes in gastrointestinal motility and absorption; the impact of changes in recent glucose exposure, leading to a priming effect; and changes in physical activity levels. In premenopausal women, glucose metabolism differs between phases of the menstrual cycle [19] and this could therefore be a factor. While subjects were instructed to maintain their usual activity and diet, these variables are somewhat difficult to control in free-living adults. An intervention, such as provision of, or instruction in, a standardised diet for the 3 days preceding the OGTT might have decreased variability, as would controlling for the phase of the menstrual cycle, but this is an impractical approach in large epidemiological or interventional studies. As the aim of the present study was to evaluate variability in OGTT-derived measures in free-living adults, no dietary or menstrual phase standardisation was included in the study design.

A limitation of this study was the exclusion of a few outlying data points. Outliers do contribute to the overall variability of any study; however, inclusion of the outliers dramatically skewed the CV results. To prevent investigator bias, we used a statistical approach based on the ESD statistic to exclude outliers. In addition, negative values were dropped, since ln-transformation cannot be performed on negative values. One out of 74 (1.4%) of the OGTTs in our study had a negative ΔI0–30/ΔG0–30. In general, such negative insulin responses are uncommon. For example, we found that 24/2027 (1.2%) subjects had negative or zero ΔI0–30/ΔG0–30 responses in the American Diabetes Association Genetics of Non-insulin Dependent Diabetes (GENNID) study (unpublished data), a percentage very similar to our findings here. Of interest, use of the ISIS resulted in seven subjects being excluded for having negative values (four with IGT and three with diabetes). This is not surprising, since high 120 min insulin values can result in a large negative term in the equation and an overall negative ISIS. This limitation is not present in HOMA-IR or the ISIM.

In conclusion, we have determined the within-subject variability for a number of measures that can be derived from results from an OGTT. We found that measures based on the early insulin response were highly variable, and that measures that utilise C-peptide or incorporate multiple time points demonstrate reduced variability, but clearly also increase the effort and cost of the test. Such factors must be considered when designing clinical studies, where within-subject variability can dramatically affect sample size requirements, study procedures and overall study costs.