Adequate Patient Characterization in COPD: Reasons to Go Beyond GOLD Classification
Tewe L Verhage, Yvonne F Heijdra, Johan Molema, Leonie Daudey, P.N Richard Dekhuijzen, Jan H Vercoulen*
Identifiers and Pagination: Year: 2009
First Page: 1
Last Page: 9
Publisher Id: TORMJ-3-1
Article History: Received Date: 24/11/2008
Revision Received Date: 14/12/2008
Acceptance Date: 2/1/2009
Electronic publication date: 13/2/2009
Collection year: 2009
open-access license: This is an open access article licensed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, provided the work is properly cited.
The Global Initiative for Chronic Obstructive Lung Disease (GOLD) serves as a guide to treat and manage different severity classes of patients with COPD. It has been suggested that the five categories of FEV1 % predicted (GOLD 0–4) can be used to select different therapeutic approaches. However, these selective properties have been poorly validated. To determine the relevance of the GOLD staging system for estimating the severity of clinical problems, GOLD 2 (n=70) and GOLD 3 (n=65) patients were drawn from a prospective cohort of patients with COPD and evaluated cross-sectionally with the newly developed Nijmegen Integral Assessment Framework (NIAF). The NIAF provides a detailed assessment of a wide range of aspects of health status (HS). Apart from the expected difference in Airflow, significant though small differences were found in Static Lung Volumes, Exercise Capacity, Subjective Pulmonary Complaints, Subjective Impairment, and Health-Related QoL. Moreover, the scores on these five HS sub-domains overlapped substantially between the two stages, indicating that the differences have little clinical relevance. No significant differences were found in nine other aspects of HS. It is concluded that GOLD stages do not discriminate in any clinically relevant way in aspects of HS other than airflow obstruction, and therefore do not help the clinician in deciding which treatment modalities are appropriate.
The GOLD classification for chronic obstructive pulmonary disease (COPD) has been introduced in order to facilitate comparability of clinical studies, by stratifying patients according to severity of airflow limitation, measured as forced expiratory volume in 1 second as percentage of predicted (FEV1%) [1,2]. Severity classes of COPD can be used to tailor diagnostic and therapeutic interventions in the management of this large patient group.
Clearly, airflow obstruction is only one aspect of COPD. Other key pathophysiological aspects include hyperinflation, diminished exercise capacity, malnourishment, and decreased muscle strength. In addition, there are clinical manifestations such as dyspnoea, functional impairment in daily life, and quality of life. All these aspects are key components of health status (HS) [3-5].
It is unclear to what extent GOLD stages discriminate in aspects of health status other than airflow obstruction. One study showed a statistically significant difference between GOLD 2 and 3 (previously designated as stages 2a and 2b) only in the impact section and the total score of the St. George’s Respiratory Questionnaire (SGRQ), but not in the symptoms and activities sections. Another study found higher exacerbation rates with increasing COPD severity stage (0-IV), but the correlation was weak (r=0.29). Other relationships between GOLD stages and aspects of health status have not been reported.
To date, studies on the relationship between health status and GOLD staging have measured only a few aspects of health status. The SGRQ is the most frequently used instrument, but as it contains only three sections, at best three aspects of health status are measured. In a previous study we developed and validated a conceptual framework of health status, the Nijmegen Integral Assessment Framework (NIAF). This framework provides a much more detailed definition of health status, and is formulated far more in terms of empirical observations than the definitions found in the literature. The NIAF covers the following main domains of health status: Physiological Functioning, Complaints, Functional Impairment, and Quality of Life. These four main domains were found to be subdivided into 15 more concrete and relatively independent sub-domains. In addition, the NIAF integrates many existing tests and instruments by indicating which aspect of health status is measured by each instrument.
The purpose of the present study was to evaluate the relevance of the GOLD classification in COPD to a broad spectrum of aspects of health status, as measured by the NIAF. We focused on GOLD stages 2 and 3, as patients in these stages account for the majority of patients seen in primary and secondary care, and because previous studies have shown that GOLD stages discriminated in aspects of health status between these two stages, and not between other consecutive stages.
Study Design and Recruitment of Patients
Cross-sectional data were collected from a prospective cohort of patients with COPD who visited three outpatient pulmonary clinics for respiratory complaints. The clinics comprised one university hospital (Radboud University Nijmegen Medical Centre, location Dekkerswald), a non-academic teaching hospital (Rijnstate Hospital), and a smaller city hospital (Maas Hospital). Patients were seen by pulmonologists, either for follow-up or as new referrals. All patients with an established diagnosis of COPD within GOLD stages 2 and 3 (post-bronchodilator FEV1 between 30% and 80% predicted) who were in a stable condition were selected by one of the authors (JM) through examination of the medical records. Exclusion criteria were: co-morbidity (e.g. cardiac, neurological, or oncological disease, or diabetes mellitus), acute exacerbation of COPD within six weeks before enrolment, participation in a pulmonary rehabilitation program within the last six months, or inability to speak Dutch.
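The selection criterion above can be illustrated with a minimal sketch that maps FEV1 % predicted to a GOLD stage. The text only states that stages 2 and 3 together span 30-80% predicted; the cut-off of 50% between stages 2 and 3, and the function name `gold_stage`, are assumptions taken from the general GOLD spirometric scheme rather than from this article. How boundary values are assigned is likewise an assumption.

```python
def gold_stage(fev1_pct_pred: float) -> int:
    """Return GOLD stage 1-4 from post-bronchodilator FEV1 % predicted.

    Assumptions (not stated in the article itself):
    - airflow obstruction has already been established
      (post-bronchodilator FEV1/FVC < 0.70);
    - cut-offs at 80, 50 and 30 % predicted, lower bound inclusive.
    """
    if fev1_pct_pred < 0:
        raise ValueError("FEV1 % predicted cannot be negative")
    if fev1_pct_pred >= 80:
        return 1
    if fev1_pct_pred >= 50:
        return 2
    if fev1_pct_pred >= 30:
        return 3
    return 4


# Group means reported for FEV1 %pred in the study sample
for value in (61.0, 40.9):
    print(f"FEV1 {value}% predicted -> GOLD {gold_stage(value)}")
```

Because the stage is a step function of a continuous measurement, small changes in spirometry near a cut-off move a patient into a different stage.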
The recruitment procedure resulted in 361 eligible patients (Fig. 1). Of these, 316 (88%) gave permission for a telephone call by one of the investigators (LD). One hundred and forty-eight patients did not consent to the study protocol for a variety of reasons, e.g. refusing cycle ergometry, travel problems, being too busy at work or at home, or feeling too old. Of the remaining 168 patients, all of whom had given written informed consent, 156 belonged to GOLD stages 2 and 3. Twelve patients had to be classified into GOLD 1 or 4 on the basis of their baseline post-bronchodilator spirometry (shortly after enrolment), which differed slightly from the spirometric values observed during the recruitment procedure. Finally, 21 non-smokers, all with a clear history of asthma extending from childhood, were excluded, resulting in 135 patients with smoking-related COPD. Subgroup analysis of these never-smoking chronic asthmatics revealed no significant differences in scores on the sub-domains of health status or in demographic variables, as compared to the total group (results not shown).
Fig. 1. Flow chart of the inclusion procedure for patient recruitment, with a list of reasons for non-participation.
Fig. 2. Box plots of the Sub-domainTotalScores (y-axis) that differed significantly in the t-tests, split by GOLD stage (x-axis).
Health Status was assessed by the Nijmegen Integral Assessment Framework, developed and validated in a previous study by our research group. This framework covers the following main domains of Health Status: (I) Physiological Functioning, (II) Complaints, (III) Functional Impairment and (IV) Quality of Life. These main domains were shown to be subdivided into 15 different sub-domains. Each sub-domain is measured by different existing tests and instruments, and for each sub-domain a Sub-domainTotalScore was calculated. Reliability of the different sub-domains of the framework was adequate to excellent. See Appendix for details on the instruments used for each sub-domain.
I. Main Domain Physiological Functioning
Routine pulmonary function tests were performed, including transfer capacity for carbon monoxide (CO), maximal exercise capacity, respiratory and skeletal muscle function, and indices of body composition. The parameters were grouped into the following six sub-domains: Airflow, Static Lung Volumes, Exercise Capacity, Gas Exchange, Muscle Force, and Body Composition. Maximal ergometry consisted of a symptom-limited, incremental bicycle test. Patients cycled at a rate of 60 rpm. After a 3-minute reference phase, workload was increased each minute by 10% of the estimated maximum work capacity. Whenever major abnormalities developed during the test in the ECG or in O2 saturation on pulse oximetry, the test was ended by the attending physician. The quadriceps leg pressure dynamometer was used to measure quadriceps force. Fat-free mass index (FFMI) was derived from the standard formula using bioelectrical impedance measurement (Bodystat 1997).
II. Main Domain Complaints
The sub-domains were measured by specific subscales of the following self-report questionnaires: Physical Activity Rating Scale-Dyspnoea (PARS-D), Dyspnoea Emotions Questionnaire (DEQ), and Quality of Life for Respiratory Illness Questionnaire (QoLRIQ).
III. Main Domain Functional Impairment
The following questionnaires were used: Sickness Impact Profile (SIP) [11, 12], Global Impairment (measuring general, subjectively experienced functional impairment), and Quality of Life for Respiratory Illness Questionnaire (QoLRIQ). In addition, an accelerometer was used to measure the actual physical activity level in daily life. This small electronic device is worn at the ankle for 10 consecutive days, 24 hours per day (except when bathing or swimming).
Differences between GOLD 2 and 3 with respect to nominal variables were tested by χ² test. Differences in the Sub-domainTotalScores of the Nijmegen Integral Assessment Framework between GOLD stages 2 and 3 were analyzed by t-tests. To limit the risk of Type I error due to the large number of tests, the significance level was set at p < 0.01 for all analyses. For Sub-domainTotalScores that reached significance, box plots were produced to assess the clinical relevance of the statistical difference found between GOLD stages 2 and 3: the larger the overlap between both stages, the smaller the clinical relevance. As the GOLD classification is a categorical system with relatively arbitrary boundaries, additional analyses were performed using the non-categorized FEV1% predicted. For this purpose, Pearson correlation coefficients between FEV1% predicted and the Sub-domainTotalScores were calculated.
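The group comparison and correlation steps described above can be sketched in a few lines. This is an illustration only: the article does not state which t-test variant or software was used, so the pooled-variance form, the normal approximation to the p-value, and all function names below are assumptions. The inputs to the t-test are the group means, SDs and sample sizes reported in Table 1; the data passed to `pearson_r` are hypothetical.

```python
import math
from statistics import NormalDist

ALPHA = 0.01  # significance level stated in the text


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and degrees of freedom from group summaries."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def two_sided_p_normal(t):
    """Two-sided p-value from a normal approximation to the t distribution.

    Assumption: with about 133 degrees of freedom the approximation is
    close, but it yields slightly smaller p-values than the exact t test.
    """
    return 2 * (1 - NormalDist().cdf(abs(t)))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Summary values from Table 1: mean, SD, n for GOLD 2, then for GOLD 3
t_bmi, df = pooled_t(26.5, 3.8, 70, 24.8, 4.0, 65)
t_tlco, _ = pooled_t(73.7, 21.5, 70, 59.4, 22.3, 65)
for label, t in (("BMI", t_bmi), ("TLCO %pred", t_tlco)):
    p = two_sided_p_normal(t)
    print(f"{label}: t({df}) = {t:.2f}, p ~ {p:.4f}, significant: {p < ALPHA}")

# Hypothetical FEV1 %pred and sub-domain scores, for illustration only
print(round(pearson_r([35, 45, 55, 65, 75], [9, 8, 8, 6, 5]), 2))
```

With the stricter threshold of 0.01, a difference such as the one in BMI falls just short of significance, whereas the difference in TLCO % predicted remains clearly significant.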
The main characteristics of the patient sample are presented in Table 1. No significant differences were found between patients in GOLD 2 and 3 in age, sex, body mass index (BMI), smoking status, or duration of a self-reported diagnosis of COPD. The same was true for inspiratory vital capacity (IVC)% predicted, TLC% predicted, and PaCO2 at rest. TLCO% predicted differed significantly between the two GOLD stages. As expected, FEV1% predicted and the Tiffeneau index also differed significantly, because the division into GOLD stages is based on FEV1% predicted. The two groups were of similar size (70 and 65 patients).
Table 1. Anthropometric, Basic Pulmonary Function, and Demographic Data of the Study Sample
|Variable||GOLD 2 (N=70)||GOLD 3 (N=65)|
|Age yrs||64.3 (10.2)||64.7 (8.3)|
|BMI kg/m²||26.5 (3.8)||24.8 (4.0)|
|IVC %pred||94.5 (15.2)||90.1 (14.2)|
|FEV1 %pred||61.0 (7.1)||40.9 (5.2)|
|Tiffeneau %pred||50.0 (8.0)||35.0 (8.0)|
|TLC %pred||100.9 (16.6)||106.0 (14.7)|
|TLCO %pred||73.7 (21.5)||59.4 (22.3)|
|PaO2 rest, kPa||11.1 (1.3)||10.6 (1.3)|
|PaCO2 rest, kPa||5.1 (0.5)||5.2 (0.5)|
|Self-reported COPD diagnosis, yrs.
1 − 10 yrs.
> 10 yrs.
Note: BMI= body mass index; IVC %pred= inspiratory vital capacity as percentage of predicted; FEV1 %pred= forced expiratory volume in one second as percentage of predicted; Tiffeneau %pred= FEV1/IVC as percentage of predicted. TLC %pred= total lung capacity as percentage of predicted; TLCO %pred= lung transfer capacity for carbon monoxide as percentage of predicted. Data are presented as N, or mean (SD).
Table 2. Health Status Main Domain (I) Physiological Functioning. Sub-Domain Total Scores (Mean (SD)) are Presented with p-Values for Difference Between GOLD Stage 2 and 3 and Values of Individual Parameters Composing Each Sub-Domain. Higher Scores of Sub-Domain Total Scores Indicate Worse Condition
|GOLD 2||GOLD 3||p-Value T-Test|
|Mean (SD)||Mean (SD)|
VE max %pred
Static lung volumes
Delta BE (kPa)
- 5.5 (2.2)
Δ (A-a)DO2 (kPa)
Δ PaCO2 (kPa)
Pi max %pred
Pe max %pred
Leg force %pred
FFMI (kg/m²)
Note: Delta BE= change in Base Excess during exercise test; Δ (A-a)DO2 (kPa)= change in (A-a)DO2 during exercise test; ΔPaCO2 (kPa): change in PaCO2 during exercise test calculated as value at maximum exercise minus value at rest; FFMI : fat free mass index; values for males and females separately, taking into account different lower limits of normal. p < .01 is considered statistically significant and is displayed in bold.
Table 3. Health Status (HS) Main Domains II Complaints, III Functional Impairment and IV Quality of Life. Sub-Domain Total Scores Mean (SD) for GOLD 2 and 3 and p-Values for Differences from T-Tests. Detailed Information on the Instruments that Constitute these Sub-Domains is Available in Appendix 1
|GOLD 2||GOLD 3||Diff. GOLD 2 vs 3. (T-Test)|
|Mean (SD)||Mean (SD)||p Value|
subjective pulmonary complaints
III Functional Impairment
actual physical activity
IV Quality of Life
satisfaction in social relations
Note: Significant p-values are depicted in bold.
Table 4. Pearson Correlations of Sub-Domains of Health Status (HS) Main Domains Physiological Functioning, Complaints, Functional Impairment, and Quality of Life (QoL) with FEV1% Predicted and p-Value
|Sub-Domains of Main Domains||Pearson Corr. with FEV1% pred.||p-Value|
Main domain Physiological Functioning
Static Lung Volumes
Main domain Complaints
Main domain Functional Impairment
Actual Physical Activity
Main domain Quality of Life
General Quality of Life
Satisfaction with relationships
Sub-Domains of Health Status in Relation to GOLD 2 and 3
Data of pulmonary function at rest and during maximal cycle ergometry, muscle function, and body composition are summarized by the six sub-domains of Physiological Functioning, and are shown in Table 2. Regarding FFMI, all values were in the normal range (> 16 kg/m² for males and > 15 kg/m² for females), although the mean value of females in GOLD 3 hardly exceeded the lower limit of normal. Results on the sub-domains of the main domains Complaints, Functional Impairment, and Quality of Life are presented in Table 3.
Although a clear difference existed between GOLD 2 and 3 in the sub-domain Airflow, differences in the other Sub-domainTotalScores were small or absent. Statistically significant differences were found in only five of the other 14 sub-domains: Static Lung Volumes, Exercise Capacity, Subjective Pulmonary Complaints, Subjective Impairment, and Health-Related QoL. The sub-domain Gas Exchange (Table 2) lacks a Sub-domainTotalScore, as the two composing variables were not linear in distribution. Box plots (Fig. 2) show that considerable overlap existed between GOLD stages 2 and 3 for all five sub-domains that reached statistical significance.
Sub-Domains of Health Status in Relation to FEV1% Predicted
Correlations between Sub-domainTotalScores and FEV1% predicted are presented in Table 4. Significant correlations were found only for Airflow, Exercise capacity, and an index of Gas Exchange (Δ PaCO2).
This is the first study that evaluated the relevance of the GOLD classification in relation to a broad range of aspects of health status in COPD. In the present study it was shown that the GOLD classification has no clinical relevance in aspects of health status other than airway obstruction.
The relevance of GOLD stages with respect to different aspects of health status (HS) was evaluated using the recently developed Nijmegen Integral Assessment Framework (NIAF) of health status in COPD. Most existing instruments contain only three to five subscales, and thus measure only a few aspects of HS. An integral assessment of HS therefore requires the use of multiple instruments. However, an evidence-based integration of existing instruments is lacking. The NIAF covers the main domains Physiological Functioning, Complaints, Functional Impairment, and Quality of Life. We found that these four main domains were conceptually distinct, and that they could be further subdivided into 15 more concrete and homogeneous sub-domains. The NIAF has four key characteristics. First, it covers a broad range of aspects of health status relevant to COPD. Second, it integrates existing instruments by indicating which instruments measure the same sub-domain of HS, and it provides information on the validity of these instruments by indicating which sub-domain is actually measured. Third, all 15 sub-domains were shown to be relatively independent, which means that each sub-domain represents a unique aspect of the patient’s health status. Fourth, the sub-domains are measured by existing tests and instruments, and for each sub-domain a single score is calculated. Taken together, the NIAF provides an empirical and detailed definition of health status, and allows a valid and integral assessment of 15 relatively unrelated and therefore unique aspects of health status in COPD, expressed in only 15 scores.
We found significant differences between GOLD stage 2 and stage 3 in only five of the 14 sub-domains other than Airflow. These were Static Lung Volumes, Exercise Capacity, Subjective Pulmonary Complaints, Subjective Impairment, and Health-Related QoL. Considering the large overlap in the scores on these five sub-domains between both stages, the clinical relevance of these differences is small. With respect to the sub-domain Exercise Capacity, several authors have reported significant but minor correlations with FEV1% predicted, based mostly on maximal oxygen consumption (VO2max) and varying between .22 and .44 [18-20]. In the present study the sub-domain Exercise Capacity contained three other factors in addition to VO2max, but we found a similar correlation with FEV1% predicted (r= .38). Hyperinflation commonly occurs in conjunction with decreasing FEV1, but empirical studies on this issue are lacking. The correlation between the sub-domain Static Lung Volumes and FEV1% predicted was low (r=-.22) and not statistically significant, indicating low shared variance. With respect to the main domain Complaints, there was a significant difference between both stages in the sub-domain Subjective Pulmonary Complaints, but not in the sub-domains Dyspnoea Emotions and Expected Dyspnoea. With respect to the main domain Functional Impairment, GOLD discriminated in the sub-domain Subjective Impairment, but not in the sub-domains Behavioural Impairment and Actual Physical Activity. Concerning the main domain Quality of Life, GOLD stages discriminated only in the sub-domain Health-Related QoL.
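The notion of shared variance can be made concrete by squaring the correlation coefficients reported above, since under a linear model r² is the proportion of variance that two variables have in common. A brief worked calculation with the two coefficients quoted in this section:

```latex
\begin{align*}
r_{\text{Static Lung Volumes}} &= -0.22 &\Rightarrow\quad r^2 &= 0.0484 \approx 4.8\% \\
r_{\text{Exercise Capacity}}   &= 0.38  &\Rightarrow\quad r^2 &= 0.1444 \approx 14.4\%
\end{align*}
```

Even the significant correlation with Exercise Capacity therefore leaves roughly 86% of the variance in that sub-domain unaccounted for by FEV1% predicted.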
Poor associations between airflow obstruction and other pathophysiological parameters, complaints, functional impairment, and quality of life have been reported in many studies [17, 20-27], but little research has been done on the clinical relevance of the GOLD classification system in relation to these different aspects of health status. A low correlation (r=0.29) was reported between GOLD staging and the number of hospital readmissions for exacerbations. Antonelli-Incalzi et al. found significant differences between GOLD stages regarding the St. George’s Respiratory Questionnaire (SGRQ). However, no significant differences were found for the 6-minute walk distance, quality of sleep, and cognitive and affective status. Also, the differences in SGRQ scores were found only for some subscales of the SGRQ, and only between stages 2 and 3 (previously stages 2a and 2b). Similar to the present study, the authors found large variability within all GOLD stages, resulting in major overlap in scores between consecutive stages. Strikingly, that study also demonstrated substantial problems in health status in patients with COPD in GOLD stage 0 and stage 1. The authors concluded that health status cannot be inferred from GOLD stage.
Two studies used criteria for disease staging similar to GOLD. Ferrer and colleagues found low to moderate correlations of the SGRQ subscales with COPD stages according to the American Thoracic Society (ATS) guidelines (r=.27 to .51). This relationship was stronger in patients without co-morbidity than in patients with co-morbidity (r=.68 versus .40, respectively). These authors also reported a large overlap in scores between the ATS stages and substantial health status problems in patients with mild disease severity. Hajiro et al. used British Thoracic Society (BTS) staging criteria. They found significant differences between moderate and severe COPD on all subscales of the SGRQ and on VO2max. Between mild and moderate COPD they found significant differences only on the activity subscale and the total score of the SGRQ. Several pulmonary function parameters, dyspnoea, anxiety, and depression were not significantly different between any of the stages.
The results of the present study and of previous studies show that a classification system based on the severity of airflow obstruction has no relevance with respect to aspects of health status other than airflow obstruction. Significant relationships were found for only a few aspects of health status, and considering the major overlap in scores between consecutive stages these differences have little clinical relevance. We would explicitly discard staging systems based on only one aspect of health status in favour of an assessment incorporating multiple aspects of health status.
This conclusion is not surprising, because many studies have shown that airflow obstruction (the parameter on which the GOLD stages are based) is poorly related to other pathophysiological processes, functional impairment, complaints, and quality of life. In addition, we previously found that the 15 sub-domains of HS represented by the NIAF are relatively independent. This means that scores on a particular sub-domain, such as Airflow, do not predict scores on other sub-domains.
Some methodological considerations should be discussed. First, we focused on patients with GOLD stages 2 and 3, as these patients constitute the majority of patients seen in medical care, and because a previous study showed that GOLD stages discriminated in aspects of health status between these stages, and not between other consecutive stages. Second, we excluded patients with primary co-morbidity to rule out its confounding effect, which was reported in a study on the relationship between ATS classification and health status. That study also found a relationship between disease stage and the SGRQ, which was most pronounced in patients without co-morbidity. Thus, if any differences between GOLD stages are present, they would have been found in the present sample, from which co-morbidity was excluded.
The present findings have important implications for clinical practice and research. The GOLD classification was designed to create more homogeneous subgroups for research purposes, and it was expected that this classification could guide diagnosis and treatment of COPD [1, 2]. Our findings show that GOLD staging is only clinically meaningful with respect to airway obstruction, but not to any other aspect of health status, such as other physiological processes, complaints, functional impairment in daily life, and quality of life. The relative independence of the different sub-domains of health status implies that treatment directed at only one aspect of HS, for example airflow obstruction, does not result in improvement in other aspects. Integral diagnosis, treatment, and management of COPD therefore should go beyond staging and management of airway obstruction alone, and should include additional and specific interventions aimed at improving other aspects of health status as well. In order to tailor treatment to the needs of an individual patient, GOLD staging proves to be inadequate. What is needed is an integral assessment of all aspects of health status.
The key characteristic of the NIAF is that multiple, relatively unique aspects of health status are integrated, which produces a more complete picture of the patient with COPD. The recently developed BODE index is another approach in which different aspects of COPD are combined to yield a more complete picture of the patient. The BODE index (Body Mass Index, airway obstruction, dyspnoea, and exercise capacity) predicted mortality and hospitalization significantly better than FEV1 alone. The difference between the BODE index and the NIAF is that the former combines the different aspects into a single parameter. The NIAF, in contrast, produces 15 different scores. In clinical practice, this has the advantage of providing a detailed picture of an individual patient, showing which aspects of health status are problematic and which are not. This enables the clinician to fine-tune treatment to the needs of an individual patient.
In conclusion, GOLD staging only has clinical relevance with respect to airflow obstruction, and not to any other sub-domain of HS. Therefore, the GOLD classification system is not useful in guiding treatment and management of COPD. Integral assessment of different aspects of health status is needed. The NIAF provides clinicians with a detailed picture of the patient’s health status, and therefore a guide to tailor treatment to the needs of the individual patient.
We are indebted to Dr. F van den Elshout (pulmonologist, Rijnstate Hospital, Arnhem) and Dr. R Bunnik (pulmonologist, Maas Hospital, Boxmeer) for their contribution to patient recruitment. The study was supported by grants from the Dutch Asthma Foundation, GlaxoSmithKline, and the Department of Medical Psychology and the Department of Pulmonary Diseases, Radboud University Nijmegen Medical Centre.
Appendix 1. List of instruments used to measure the HS main domains and sub-domains, with a description of the included subscales.
|Main domain (bold)
|FEV1%, MEF50%, VE%
VO2%, TLCO%, BE-Delta, HRmax%
Pi max %, Pe max %, Leg force %
BMI kg/m², FFMI kg/m²
|Subjective Pulmonary Complaints||Physical activity rating scale-Dyspnoea
Dyspnoea Emotions Questionnaire
Quality of Life for Respiratory Illness questionnaire
|Dyspnoea Emotions||Dyspnoea Emotions Questionnaire||frustration
|Expected Dyspnoea||Physical activity rating scale-Dyspnoea||expected dyspnoea|
|Actual physical activity||Accelerometer|
|Behavioural impairment||Sickness impact profile||body care & movements
|Subjective impairment||Quality of Life for Respiratory Illness questionnaire
Sickness impact profile
|Quality of life|
|General QoL||Satisfaction With Life Scale
Beck Depression Inventory