
Volume 105, Issue 1, Pages 81-91 (January 2008)



The responsiveness of EQ-5D utility scores in patients with depression: A comparison with instruments measuring quality of life, psychopathology and social functioning

Oliver H. Günther (a, b; corresponding author), Christiane Roick (b), Matthias C. Angermeyer (b), Hans-Helmut König (a, b)

Received 25 January 2007; received in revised form 20 April 2007; accepted 20 April 2007.

Abstract 

Introduction

The EQ-5D provides preference weights (utilities) for health-related quality of life to be used for calculating quality-adjusted life years (QALYs) in cost-utility analysis. The aim of this study was to compare differences in EQ-5D utility scores with differences in quality of life, psychopathology, and social functioning scores.

Methods

In an observational longitudinal cohort study, EQ-5D utilities (EQ visual analogue scale (EQ VAS), EQ-5D indices of the United Kingdom (EQ-5D index-UK) and Germany (EQ-5D index-D)) were compared with scores of the WHOQOL-BREF, CGI, and GAF at baseline and at 18 months (N=104). The patients' health status at follow-up was categorized as “worse”, “stable”, or “better” using the EQ-5D transition question (patient-based anchor) and the Bech–Rafaelsen melancholy scale (clinician-based anchor). Effect sizes (ES) were used to compare differences in scores within each group over time; regression analysis was used to derive meaningful difference scores in health status associated with a shift from “stable” to “better” health status.

Results

The most responsive instrument was the CGI (patient-based anchor: ES=|0.98|; clinician-based anchor: ES=|1.35|); responsiveness was large in EQ VAS (patient-based anchor: ES=|0.84|; clinician-based anchor: ES=|1.19|), but rather small to medium for EQ-5D index-UK (patient-based anchor: ES=|0.55|; clinician-based anchor: ES=|0.65|) and EQ-5D index-D (patient-based anchor: ES=|0.41|; clinician-based anchor: ES=|0.45|). Compared with the other instruments, the shift to a “better health status” was smaller if elicited by the EQ-5D indices.

Discussion

Both EQ-5D indices were less responsive and need larger patient samples to detect meaningful differences compared with EQ VAS and the other instruments.

Article Outline

Abstract

1. Introduction

2. Methods

2.1. Subjects

2.2. Measures

2.2.1. EQ-5D

2.2.2. EQ-5D index

2.2.3. WHOQOL-BREF

2.2.4. CGI-S, GAF

2.2.5. Bech–Rafaelsen melancholia scale (BRAMES)

2.3. Statistical analysis

2.4. Ethics

3. Results

3.1. Demographic characteristics

3.2. Health status assessed by the EQ-5D descriptive system

3.3. Relationship of difference scores in the EQ VAS and in the EQ-5D indices with other instruments

3.4. Responsiveness of the EQ VAS and the EQ-5D indices compared with other instruments

3.5. Meaningful differences of the EQ VAS and the EQ-5D indices compared with other instruments

4. Discussion

Acknowledgment

References

Copyright

1. Introduction 


The EQ-5D is a short generic patient rated questionnaire measuring health-related quality of life (HRQOL). It is often applied as a measure of outcome in studies comparing different treatments (The EuroQol Group provides a detailed reference list [April 2007]: http://www.euroqol.org). The EQ-5D provides two important aspects: a descriptive profile of HRQOL based on five dimensions including mobility, self-care, usual activities, pain/discomfort, anxiety/depression, and a valuation of the profile by a visual analogue scale (EQ VAS), i.e. a single score reflecting patients’ preferences (Brooks, 1996, The EuroQol Group, 1990). In addition, for various countries (including the United Kingdom, Germany, and the United States) an index score is available assigned to all possible health states described by the EQ-5D according to a particular set of preference values derived from surveys of the general population (Dolan, 1997, Greiner et al., 2004, Shaw et al., 2005). The patients' scores and/or index scores derived from the general population might be used in evaluating change in health status, with the former reflecting the preferences of beneficiaries of care and the latter reflecting community preferences (Dolan, 1999). For the purpose of cost-utility analysis in economic evaluation, with the consequences of treatment being measured in terms of quality-adjusted life years (QALYs), these preference weights are typically used for calculating QALYs (Drummond et al., 1997, Gold et al., 1996).

Decisions about the suitability of the EQ-5D in economic evaluation, especially concerning the EQ VAS and the various country-specific societal EQ-5D indices, need to be based on a clear conceptual framework; that is, the EQ-5D has to demonstrate its psychometric validity and reliability (Revicki et al., 2000). In the field of depressive disorders, several studies evaluated the suitability of the EQ-5D index-UK (Hayhurst et al., 2006, Lamers et al., 2006, Sapin et al., 2004). Sapin et al. showed that the EQ-5D index-UK varied significantly with disease severity level, with more severely ill patients having lower index scores (Sapin et al., 2004). Further studies demonstrated that the mean EQ-5D index-UK decreased with deterioration in health status and that the EQ-5D index-UK discriminates between groups with varying levels of depression (Hayhurst et al., 2006, Lamers et al., 2006). Overall, these results corroborate that the EQ-5D index-UK reflects psychopathology and mental aspects of quality of life in patients with major depression.

In addition to aspects of validity and reliability, the EQ-5D has to show evidence of responsiveness. Responsiveness reflects the ability of an instrument to detect a change in health status. It is determined by evaluating the relationship between changes in clinical endpoints and changes in an instrument’s outcome over time in either observational studies or clinical trials (Guyatt et al., 1987, Revicki et al., 2000). Recently, the Food and Drug Administration (FDA) in the United States issued a draft guidance on patient-reported outcomes, including methods to calculate responsiveness and to interpret the detected changes as meaningful (Food and Drug Administration, 2006). The recommended best practice in the evaluation of responsiveness is the calculation of various distribution-based estimates (i.e. effect size, standardized response mean, standard error of measurement) under several anchor-based criteria (i.e. patient or clinician ratings of global improvement) (Revicki et al., 2006). Here, the anchor-based criterion is used as an external indicator to assign patients to groups reflecting “no change” and a “(small) positive/negative change”. The distribution-based estimates describing responsiveness consist of a ratio in which the difference between the mean baseline and endpoint scores forms the numerator, and different estimates of variability form the denominator. Each of these statistical measures acts as a quantitative description of change within the groups. Guidance on interpreting the magnitude of a distribution-based estimate, for example whether differences in scores are viewed as meaningful, is available (Cohen, 1988, Norman et al., 2003).

Nonetheless, there is no gold standard for deciding whether differences in scores are meaningful from either the patient’s or the clinician’s perspective, but some methods are useful for interpretation. For example, one definition of a meaningful difference is based on the “minimal important difference” (MID) in scores perceptible to patients as a beneficial change (Guyatt et al., 2002, Jaeschke et al., 1989). In practice, the MID is viewed as the difference in scores between the group with “no change” and the group with “small positive/negative change”. The MID is thus an interpretation of change across the groups.

The objective of this study was to compare and contrast the responsiveness in preference-based scores of the EQ-5D for patients with depression with the responsiveness in composite summary scores of instruments measuring quality of life, psychopathology and social functioning. To facilitate interpreting whether the differences in scores are meaningful, we combined measurement information with respect to a patient and clinician anchor-based criterion. We hypothesized the following: (1) since the preference-based scores of the EQ-5D as well as the other instruments used for comparison reflect aspects of HRQOL, the difference scores should show at least a moderate relationship with each other; (2) since the studies published on quality of life in depression showed that HRQOL is related to depressive symptoms, instruments measuring a specific “facet of depression” are expected to reflect a more responsive, meaningful change than the generic preference-based scores derived from the EQ-5D (Papakostas et al., 2004, Wiebe et al., 2003).

2. Methods 


2.1. Subjects 

Patients suffering from an affective disorder according to the International Statistical Classification of Diseases (ICD-10) (World Health Organization, 2004) were recruited consecutively by clinicians at three departments of psychiatry in the federal state “Schleswig Holstein”, Germany. Only patients with a depressive episode according to the ICD classification F32.1, F32.2, F33.1 and F33.2 participated in this observational longitudinal cohort study. Consequently, patients with psychotic symptoms and patients with manic symptoms were excluded from participation. Baseline interviews were administered after admission from September 2003 until March 2004 by trained interviewers as part of a research study within the pilot project of the Regional Budget for Mental Health Care (Roick et al., 2005). All patients received therapy and were interviewed again after a follow-up period of eighteen months. Participants were at least 18 years old and resided in “Schleswig Holstein” or the city of “Hamburg”. Their participation was voluntary, and non-participation was not recorded.

2.2. Measures 

All subjects were assessed by a set of instruments containing self-rated instruments and scales to be completed by clinicians. The study measures will be briefly described.

2.2.1. EQ-5D 

A standard version of the EQ-5D was administered, comprising five items relating to problems in the following dimensions: “mobility”, “self-care”, “usual activities”, “pain/discomfort”, and “anxiety/depression” (Brooks, 1996, The EuroQol Group, 1990). Responses in each dimension are divided into three ordinal levels coded: (1) no problems, (2) moderate problems, and (3) extreme problems. Theoretically, 3^5 = 243 different health states can be defined.
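The count of possible health states follows from three levels on each of five dimensions. The short Python sketch below enumerates them; the five-digit string notation (for example "11223") and the function name are illustrative choices, not taken from the study.

```python
from itertools import product

# Dimension order and level coding as described in the text above.
DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = (1, 2, 3)  # 1 = no, 2 = moderate, 3 = extreme problems


def all_health_states():
    """Return every EQ-5D profile as a five-digit string such as '11223'."""
    return ["".join(str(level) for level in profile)
            for profile in product(LEVELS, repeat=len(DIMENSIONS))]


states = all_health_states()
print(len(states))            # 243
print(states[0], states[-1])  # 11111 33333
```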

The EQ-5D descriptive system was followed by a visual analogue scale (EQ VAS), similar to a thermometer ranging from 0 (worst imaginable health state) to 100 (best imaginable health state). The EQ VAS records the respondent's self-rated valuation of HRQOL, which is based on the respondent's preferences.

2.2.2. EQ-5D index 

Two different EQ-5D indices which represent societal preference values were used in the present study: the first index was obtained from a large United Kingdom population sample (n=2997), where the valuation of 42 EQ-5D health states by Time Trade Off (TTO) technique was used to derive an algorithm for societal preference values of all 243 possible EQ-5D health states (Dolan, 1997). The second index used resulted from TTO based valuations of 36 EQ-5D health states by a random sample of the German general population (n=334) (Greiner et al., 2004). According to each individual's health status on the self-classifier of the EQ-5D, one EQ-5D index-UK value and one EQ-5D index-D value were assigned.

2.2.3. WHOQOL-BREF 

The WHOQOL-BREF is a self-rated generic questionnaire for measuring quality of life over the previous two weeks (WHOQOL Group, 1998). In this study, only the WHOQOL-BREF score for overall quality of life was used. This score is calculated from two items ranging from 1 (worst) to 100 (best) (Angermeyer et al., 2000).

2.2.4. CGI-S, GAF 

The CGI-S is a single-item scale rated by the clinician. It is a standard measure for global assessment of severity of illness rated on a seven-point Likert scale ranging from 1 (“not ill at all”) to 7 (“among the most extremely ill individuals”) (Guy, 1976). The GAF single-item scale measures the overall level of occupational functioning (Goldman et al., 1992). Rated by the clinician, the GAF scale consists of a series of ranked sentences associated with numerical scores ranging from 1 (worst) to 100 (best).

2.2.5. Bech–Rafaelsen melancholia scale (BRAMES) 

The BRAMES is used to assess the severity of depression rated by the clinician (Bech et al., 1979). It consists of 11 items, all scored on a five-point Likert scale. The BRAMES fulfills the criteria of unidimensionality resulting in a total score range from 0 (best) to 44 (worst) (Licht et al., 2005).

2.3. Statistical analysis 

The relationship between difference scores in EQ-5D utilities and in the composite overall measures used in the study was determined by analyzing their level of correlation. Since the EQ VAS score and the EQ-5D indices did not follow a normal distribution, the non-parametric Spearman rank correlation coefficient (rs) was calculated. Correlation was considered to be small for 0.1<|rs|<0.3, medium for 0.3<|rs|<0.5 and large for |rs|>0.5.
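As an illustration of the correlation step, the following pure-Python sketch computes Spearman's rs as the Pearson correlation of rank vectors and labels its magnitude. It is not the authors' Stata code. The handling of values exactly on a threshold and the label "negligible" for |rs| below 0.1 are assumptions, and the example data are made up.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rs; assumes neither variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def correlation_size(rs):
    """Label |rs|; boundary handling is an assumption of this sketch."""
    magnitude = abs(rs)
    if magnitude >= 0.5:
        return "large"
    if magnitude >= 0.3:
        return "medium"
    if magnitude >= 0.1:
        return "small"
    return "negligible"  # label not used in the study


# Made-up change scores for two instruments.
rs = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rs, 3), correlation_size(rs))  # 0.8 large
```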

All patients were classified into three groups according to their change between baseline and follow-up on two anchor-based criteria. The first criterion was the EQ-5D transition question, with patients rating their health status as worse, stable or better at follow-up in comparison with the baseline interview. Patients answered the transition question after the EQ-5D descriptive profile and the EQ VAS. The second criterion was based on the clinical change in BRAMES score: an interval defined as half a standard deviation (SD) around zero change represented the category “stable health status”. Negative values beyond this interval represented the category “worse health status”, whereas positive values beyond this interval represented the category “better health status” (Norman et al., 2003).
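The clinician-based anchor can be sketched as a simple banding rule. The code below is a hedged illustration: the text does not state which SD defines the band, so the SD is a parameter here and defaults to the sample SD of the change scores (an assumption). Change scores are assumed to be oriented so that positive values mean improvement, and values exactly on the band edge are treated as stable (also an assumption).

```python
from statistics import stdev


def classify_by_half_sd(improvements, sd=None):
    """Label each change score 'worse', 'stable' or 'better'.

    improvements: change scores oriented so that positive = improvement.
    sd: SD that defines the band; defaults to the sample SD of the
        change scores (an assumption, not stated in the study).
    """
    if sd is None:
        sd = stdev(improvements)
    threshold = 0.5 * sd
    labels = []
    for value in improvements:
        if value > threshold:
            labels.append("better")
        elif value < -threshold:
            labels.append("worse")
        else:
            labels.append("stable")
    return labels


# Made-up change scores with a band of +/- 2 points (0.5 * 4.0).
print(classify_by_half_sd([-6, -1, 0, 2, 7], sd=4.0))
# ['worse', 'stable', 'stable', 'stable', 'better']
```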

Responsiveness was compared by paired t-test statistics, effect sizes (ES) and standardized response means (SRM). The ES was calculated as $ES = (M_2 - M_1)/SD_{baseline}$, where $M_1$ is the mean score of the baseline assessment, $M_2$ the mean score of the post-assessment, and $SD_{baseline}$ the pooled SD of the baseline assessment. The SRM was calculated as $SRM = (M_2 - M_1)/SD_{M_2 - M_1}$, where the numerator is the same as for the ES, but the denominator is the SD of the difference scores. Based on Cohen's interpretation guidelines (Cohen, 1988), we considered an absolute magnitude of difference scores expressed by ES and SRM of <|0.20| as trivial, from ≥|0.20| to <|0.50| as small, from ≥|0.50| to <|0.80| as medium, and of ≥|0.80| as large.
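A minimal Python sketch of the two responsiveness ratios and the magnitude labels follows. The function names are illustrative; the input numbers are the WHOQOL-BREF means and SDs reported in Table 2 for the “better” group of the patient-based anchor.

```python
def effect_size(mean_baseline, mean_followup, sd_baseline):
    """ES = (M2 - M1) / SD of the baseline scores."""
    return (mean_followup - mean_baseline) / sd_baseline


def standardized_response_mean(mean_change, sd_change):
    """SRM = mean change / SD of the individual change scores."""
    return mean_change / sd_change


def cohen_label(value):
    """Magnitude label following the thresholds given in the text."""
    magnitude = abs(value)
    if magnitude < 0.20:
        return "trivial"
    if magnitude < 0.50:
        return "small"
    if magnitude < 0.80:
        return "medium"
    return "large"


# WHOQOL-BREF, "better" group, patient-based anchor (Table 2):
# baseline 44.39 (SD 26.03), follow-up 67.60, change 23.21 (SD 25.26).
es = effect_size(44.39, 67.60, 26.03)
srm = standardized_response_mean(23.21, 25.26)
print(round(es, 2), cohen_label(es))    # 0.89 large
print(round(srm, 2), cohen_label(srm))  # 0.92 large
```

Both results agree with the corresponding entries in Table 3.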

Missing values (<1.3%) at one measurement point were replaced by the value from the other measurement point (the baseline value or the follow-up value, respectively). As a consequence, the difference between the baseline and the follow-up value was equal to zero. In the case of missing values at both measurement points, the difference in scores was set equal to zero (Sprangers et al., 2002).
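The replacement rule amounts to carrying the observed value over to the missing time point. A small sketch, assuming missing values are represented by `None` and that the change score is follow-up minus baseline (both are assumptions of this example):

```python
def impute_change(baseline, followup):
    """Apply the carry-over rule; None marks a missing value.

    Returns (baseline, followup, change) with change = followup - baseline.
    If both values are missing, only the change score is defined (zero).
    """
    if baseline is None and followup is None:
        return None, None, 0.0
    if baseline is None:
        baseline = followup
    elif followup is None:
        followup = baseline
    return baseline, followup, followup - baseline


print(impute_change(50, None))    # (50, 50, 0)
print(impute_change(None, None))  # (None, None, 0.0)
print(impute_change(40, 55))      # (40, 55, 15)
```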

“Meaningful differences” in health status were estimated by a linear regression model. The model is represented by the equation $\Delta T = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 + b_5 X_5 + e$, where $\Delta T$ is the difference in the instrument's score between the baseline and the follow-up assessment, $a$ is the constant, $b_1$ and $b_2$ are the regression coefficients for “worse” and “better” health status, $X_1$ and $X_2$ are the dummy variables for “worse” and “better” health status, $b_3$ to $b_5$ are the regression coefficients for the variables $X_3$ (score at baseline), $X_4$ (the period since first diagnosis, in years) and $X_5$ (age of the patient), and $e$ is the error term. This regression model generates coefficients that reflect the incremental difference score for a shift to better/worse health status compared with the stable health status, while controlling for the instrument's baseline score, the period since first diagnosis, and age.
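To make the dummy coding concrete, the sketch below builds the design matrix with “stable” as the reference group and fits it by ordinary least squares through the normal equations. It is a pure-Python illustration, not the authors' Stata analysis. The eight patients and the coefficients used to generate their change scores are invented, and the data contain no error term, so the fit simply recovers the invented coefficients.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(group, baseline, years_since_diagnosis, age):
    """[1, X1 (worse), X2 (better), X3, X4, X5]; 'stable' is the reference."""
    return [1.0, float(group == "worse"), float(group == "better"),
            float(baseline), float(years_since_diagnosis), float(age)]


# Invented patients: (group, baseline score, years since diagnosis, age).
patients = [("worse", 40, 2, 35), ("worse", 55, 10, 60),
            ("stable", 50, 5, 45), ("stable", 65, 1, 30),
            ("stable", 30, 12, 52), ("better", 35, 3, 41),
            ("better", 60, 8, 58), ("better", 45, 6, 27)]
true_beta = [40.0, -18.0, 12.0, -0.7, 0.1, -0.05]  # invented coefficients

rows = [design_row(*p) for p in patients]
delta_t = [sum(b * x for b, x in zip(true_beta, row)) for row in rows]

beta = ols(rows, delta_t)
print([round(b, 3) for b in beta])
```

In this coding, the second and third coefficients are the adjusted shifts in change score for the “worse” and “better” groups relative to the “stable” group.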

All calculations were performed using the software Stata (StataCorp, College Station, Texas, USA, Version 9.2).

2.4. Ethics 

The research protocol of this study was reviewed by the Committees of Research Ethics at the Medical Faculty of the University of Leipzig.

3. Results 


3.1. Demographic characteristics 

Of the 141 patients initially participating in the study at baseline, 37 dropped out at follow-up, resulting in 104 patients completing both interviews. At baseline the mean age of patients who completed the study was 47.2 (SD 13.9). The age ranged from 20 to 86 years, and the majority (70.2%) were female. More than half of all patients lived with a partner (51.9%). Most patients were diagnosed with moderate to severe levels of depressive episodes (61.6%), followed by repeated depressive episodes (38.5%). The mean duration of disease was 7.2 (SD 8.0) years and was associated with 1.8 (SD 2.1) inpatient stays on average. At the time of recruitment 40.8% of the patients were using inpatient care, 35.0% were outpatients, and 24.2% were day-patients.

At baseline, mean scores of patients who completed at follow-up were similar in measures of quality of life compared to patients not participating in the follow-up interview (p>0.05). Some mean scores in psychopathology (CGI, BRAMES) and social functioning (GAF) were significantly different in the two groups indicating a higher impairment in the drop-out group.

3.2. Health status assessed by the EQ-5D descriptive system 

Fig. 1 shows the frequency of patients indicating problems in the dimensions of the EQ-5D descriptive system. At baseline, patients indicated problems predominantly in the dimension “anxiety/depression” (78.8%), followed by “usual activities” (66.4%), “pain/discomfort” (66.0%), “mobility” (28.8%) and “self care” (27.9%). At follow-up, improvements in health status were clearly indicated in the dimensions “usual activities”, “pain/discomfort”, and “anxiety/depression”. The number of patients indicating an improvement in health status in one of the EQ-5D dimensions ranged from 16 patients in the dimension “mobility” to 40 patients in the dimension “usual activities”. Indicated deterioration in health status ranged from 13 patients in the dimension “usual activities” to 21 patients in the dimension “anxiety/depression”.



Fig. 1. Distribution of responses to items of the EQ-5D self-classifier in patient sample at baseline and eighteen months after (N=104).


At baseline, the most frequently reported EQ-5D self-classified health state showed no problems in the dimensions “mobility” and “self care”, but moderate problems in the other dimensions, which was indicated by 12.6% of all patients. The proportion of individuals with extreme problems in at least one of the dimensions was 25.4%. At follow-up the most frequently reported EQ-5D self-classified health state was no problems on any dimension, which was indicated by 17.3%. Extreme problems in at least one dimension were stated by 21.5%.

3.3. Relationship of difference scores in the EQ VAS and in the EQ-5D indices with other instruments 

Table 1 demonstrates that difference scores of the EQ VAS and the EQ-5D indices showed significant correlation with all other difference scores of instruments used for comparison. Correlations of the EQ VAS and the EQ-5D indices with the BRAMES total score indicate a medium to large relationship. The EQ-5D index-D and the EQ-5D index-UK showed a large correlation of rs=0.945. Correlations of the EQ VAS score with overall composite scales of the instruments tended to be stronger than correlations of the EQ-5D indices with these scales.

Table 1.

Correlation between change score of preference-based measures, and of other instruments’ overall scores (N=104)

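| | WHOQOL-BREF (total score) | CGI | GAF | EQ VAS | EQ-5D index UK | EQ-5D index D | BRAMES |
| WHOQOL-BREF (total score) | 1.00 | | | | | | |
| CGI | 0.568 | 1.00 | | | | | |
| GAF | 0.591 | 0.784 | 1.00 | | | | |
| EQ VAS | 0.642 | 0.444 | 0.508 | 1.00 | | | |
| EQ-5D index UK | 0.545 | 0.539 | 0.492 | 0.440 | 1.00 | | |
| EQ-5D index D | 0.429 | 0.441 | 0.382 | 0.321 | 0.945 | 1.00 | |
| BRAMES | 0.680 | 0.704 | 0.748 | 0.574 | 0.576 | 0.462 | 1.00 |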

p<0.01.

3.4. Responsiveness of the EQ VAS and the EQ-5D indices compared with other instruments 

Table 2 shows the mean scores and the SD of the baseline interview, at follow-up and the resulting difference scores classified by the levels “worse health status”, “stable health status” and “better health status” according to the two anchors “EQ-5D transition question” and “BRAMES psychopathology scale”. For both anchors and all instruments, the category “better health status” was associated with mean difference scores reflecting an improvement in health status. The category “worse health status” was associated with mean difference scores reflecting a deterioration in health status, except for the EQ VAS anchored by the EQ-5D transition question; here the EQ VAS difference score reflects an improvement in health, although the anchor is associated with “worse health status”. Another striking issue concerns both EQ-5D indices: the absolute values of mean difference scores reflecting a deterioration in health status in the category “worse health status” were considerably larger than the mean difference scores reflecting an improvement in health status in the category “better health state”.

Table 2.

Baseline, follow-up, and change scores of instruments for patients classified as deteriorated, stable and improved by criterion EQ-5D transition question and BRAMES anchor

| | | Anchor by EQ-5D transition question (patient based) | | | Anchor by BRAMES psychopathology scale (clinician based) | | |
| Instruments [possible range of scores: worst–best] | Time period | Worse HS | Stable HS | Better HS | Worse HS | Stable HS | Better HS |
| | | N=16 | N=39 | N=49 | N=10 | N=43 | N=51 |
| | | Mean (SD) | Mean (SD) | Mean (SD) | Mean (SD) | Mean (SD) | Mean (SD) |
| WHOQOL-BREF global score [0–100] | Baseline | 34.38 (23.50) | 45.19 (23.41) | 44.39 (26.03) | 52.50 (18.44) | 51.74 (25.96) | 34.07 (21.66) |
| | Follow-up | 26.56 (13.60) | 49.36 (18.57) | 67.60 (18.91) | 30.00 (18.82) | 52.61 (21.57) | 60.78 (21.94) |
| | Change | −7.81 (28.46) | 4.17 (23.18) | 23.21 (25.26) | −22.50 (24.15) | 0.87 (21.55) | 26.71 (22.36) |
| CGI [6–0] | Baseline | 3.31 (0.87) | 2.95 (1.34) | 3.02 (1.20) | 2.50 (0.85) | 2.65 (1.23) | 3.47 (1.10) |
| | Follow-up | 3.87 (0.93) | 2.49 (1.39) | 1.84 (1.28) | 3.70 (0.95) | 2.53 (1.28) | 1.98 (1.46) |
| | Change | 0.56 (0.89) | −0.46 (1.60) | −1.18 (1.44) | 1.20 (0.79) | −0.12 (1.07) | −1.49 (1.46) |
| GAF [1–100] | Baseline | 53.88 (12.60) | 61.13 (17.92) | 59.82 (16.76) | 68.30 (11.85) | 65.88 (15.34) | 52.18 (15.66) |
| | Follow-up | 51.56 (8.54) | 62.10 (13.68) | 67.57 (14.96) | 49.80 (8.61) | 61.88 (13.49) | 66.65 (15.03) |
| | Change | −2.31 (14.30) | 0.97 (20.73) | 7.76 (15.79) | −18.50 (10.81) | −4.0 (10.91) | 14.47 (16.49) |
| EQ VAS [0–100] | Baseline | 36.38 (24.36) | 51.82 (23.26) | 53.88 (22.96) | 51.90 (27.40) | 58.77 (23.94) | 43.08 (20.98) |
| | Follow-up | 38.25 (21.36) | 60.21 (18.67) | 73.73 (17.17) | 40.20 (20.53) | 62.05 (19.32) | 68.06 (21.59) |
| | Change | 1.88 (26.04) | 8.38 (21.58) | 19.20 (25.71) | −11.7 (20.48) | 3.28 (20.56) | 24.98 (22.48) |
| EQ-5D index UK [−0.59–1.0] | Baseline | 0.461 (0.290) | 0.583 (0.315) | 0.643 (0.283) | 0.677 (0.283) | 0.662 (0.284) | 0.518 (0.303) |
| | Follow-up | 0.171 (0.326) | 0.659 (0.256) | 0.798 (0.194) | 0.413 (0.462) | 0.626 (0.303) | 0.716 (0.285) |
| | Change | −0.290 (0.301) | 0.075 (0.301) | 0.155 (0.286) | −0.265 (0.310) | −0.037 (0.235) | 0.198 (0.336) |
| EQ-5D index D [−0.21–1.0] | Baseline | 0.661 (0.245) | 0.754 (0.265) | 0.808 (0.231) | 0.826 (0.207) | 0.811 (0.230) | 0.715 (0.266) |
| | Follow-up | 0.412 (0.282) | 0.810 (0.221) | 0.903 (0.144) | 0.645 (0.401) | 0.776 (0.253) | 0.836 (0.224) |
| | Change | −0.249 (0.246) | 0.056 (0.260) | 0.095 (0.243) | −0.181 (0.286) | −0.035 (0.207) | 0.121 (0.293) |

HS=Health status; SD=Standard deviation.

Table 3 shows the responsiveness statistics (T-statistics, effect size (ES), standardized response mean (SRM)) for all instruments used split by the two anchors. The results are mainly based on the mean scores and SD presented in Table 2. In general, responsiveness of instruments was larger according to the clinician-based anchor compared with the patient-based anchor. Moreover, the WHOQOL-BREF, the CGI, and the EQ VAS showed a large improvement in health in the category “better health status” according to both anchors. Concerning the category “better health status”, the EQ-5D index-UK demonstrated rather medium ES and SRM, whereas the responsiveness of the EQ-5D index-D was even smaller. Concerning the category “worse health status” the ES and SRM of both EQ-5D indices were almost twice as large, indicating rather large responsiveness.

Table 3.

Comparison of responsiveness statistics for all instruments' composite overall scores and EQ-5D's preference measures by external anchors of change

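| | | Anchor by EQ-5D transition question (patient based) | | | Anchor by BRAMES psychopathology scale (clinician based) | | |
| Statistics | Summary score | Worse HS N=16 | Stable HS N=39 | Better HS N=49 | Worse HS N=10 | Stable HS N=40 | Better HS N=51 |
| T-statistics (paired t-test) | WHOQOL BREF (total score) | 1.10 | 1.12 | 6.43⁎⁎ | 2.95 | 0.27 | 8.53⁎⁎ |
| | CGI | 1.96 | 1.80 | 5.76⁎⁎ | 4.81⁎⁎ | 0.71 | 7.28⁎⁎ |
| | GAF | 0.65 | 0.29 | 3.44⁎⁎ | 5.41⁎⁎ | 2.41 | 6.27⁎⁎ |
| | EQ VAS | 0.29 | 2.43 | 5.23⁎⁎ | 1.81 | 1.05 | 7.93⁎⁎ |
| | EQ-5D index UK | 3.85⁎⁎ | 1.53 | 3.79⁎⁎ | 2.70 | 1.03 | 4.21⁎⁎ |
| | EQ-5D index D | 4.07⁎⁎ | 1.34 | 2.73⁎⁎ | 2.06 | 1.11 | 2.93 |
| Effect size | WHOQOL BREF (total score) | 0.33 | 0.18 | 0.89 | 1.22 | 0.03 | 1.23 |
| | CGI | 0.64 | 0.34 | 0.98 | 1.41 | 0.10 | 1.35 |
| | GAF | 0.18 | 0.05 | 0.46 | 1.56 | 0.26 | 0.92 |
| | EQ VAS | 0.08 | 0.36 | 0.84 | 0.43 | 0.14 | 1.19 |
| | EQ-5D index UK | 1.00 | 0.24 | 0.55 | 0.94 | 0.13 | 0.65 |
| | EQ-5D index D | 1.02 | 0.21 | 0.41 | 0.87 | 0.15 | 0.45 |
| SRM | WHOQOL BREF (total score) | 0.27 | 0.18 | 0.92 | 0.93 | 0.04 | 1.19 |
| | CGI | 0.63 | 0.29 | 0.82 | 1.52 | 0.11 | 1.02 |
| | GAF | 0.16 | 0.05 | 0.49 | 1.71 | 0.37 | 0.88 |
| | EQ VAS | 0.07 | 0.39 | 0.75 | 0.57 | 0.16 | 1.11 |
| | EQ-5D index UK | 0.96 | 0.25 | 0.54 | 0.85 | 0.16 | 0.59 |
| | EQ-5D index D | 1.01 | 0.22 | 0.39 | 0.63 | 0.17 | 0.41 |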

Significant t-statistics (p<0.05; ⁎⁎p<0.01) as well as effect sizes and standardized response means >|0.8| are printed bold.

SRM=Standardized response mean.

HS=Health status.
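The paired t-statistic and the SRM are linked: for a group of size N, the paired t-statistic is the mean change divided by its standard error, which equals the SRM multiplied by the square root of N. The sketch below uses this identity to cross-check one Table 3 entry against the change scores in Table 2; the function name is illustrative, and small deviations for other cells can arise from rounding in the published values.

```python
from math import sqrt


def paired_t_from_summary(mean_change, sd_change, n):
    """Paired t statistic from the mean and SD of the change scores."""
    return mean_change / (sd_change / sqrt(n))


# WHOQOL-BREF, "better" group, patient-based anchor:
# change 23.21 (SD 25.26) in Table 2, N=49.
t = paired_t_from_summary(23.21, 25.26, 49)
print(round(t, 2))  # 6.43
```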

3.5. Meaningful differences of the EQ VAS and the EQ-5D indices compared with other instruments 

Table 4 shows the results of the regression analysis. In general, absolute values of regression coefficients associated with a shift to “better health status” were smaller than those associated with a shift to “worse health status” (for both anchors). Moreover, all coefficients indicating the influence of the baseline score were significant; that is, the smaller the score at baseline, the larger the difference between the two assessments. Age and period since first diagnosis had no significant influence on difference scores. In detail, the WHOQOL-BREF and the EQ VAS showed almost similar meaningful differences regarding both anchors compared to the group with stable health status. That the difference scores recording a meaningful improvement/deterioration in health status could differ across anchors became particularly apparent in the EQ-5D indices: the EQ-5D index-UK indicated a gain between 0.104 and 0.167 as a meaningful difference associated with a shift to a “better health status” and a reduction between 0.470 and 0.223 as a meaningful difference associated with a shift to a “worse health status”. Compared with the instruments measuring quality of life, psychopathology or social functioning, the shift to a “better health status” was smaller if elicited by the EQ-5D indices.

Table 4.

Results of the regression model estimating score differences of the instruments interpreted as meaningful according to the patient/clinician-based anchors, relative to the stable health status and controlled for the instrument’s baseline score, period since first diagnosis, and age (N=104)

| Instruments | WHOQOL-BREF | CGI | GAF | EQ VAS | EQ-5D index UK | EQ-5D index D |
| Regression coefficient | | | | | | |
| Anchor by EQ-5D transition question (patient based) | | | | | | |
| Constant [SE] | 36.48 [7.41] | 1.77 [0.55] | 39.97 [7.30] | 45.41 [7.81] | 0.560 [0.098] | 0.648 [0.093] |
| (95%-CI) | 21.76; 51.19 | 0.68; 2.86 | 25.48; 54.45 | 29.91; 60.92 | 0.366; 0.754 | 0.464; 0.833 |
| Shift in change scores from stable HS to worse HS [SE] | −18.78 [5.30] | 0.98 [0.37] | −6.77 [3.98] | −18.11 [5.50] | −0.470 [0.038] | −0.397 [0.057] |
| (95%-CI) | −29.28; −8.27 | 0.24; 1.71 | −14.67; 1.14 | −29.02; −7.20 | −0.605; −0.334 | −0.509; −0.284 |
| Shift in change scores from stable HS to better HS [SE] | 19.02 [3.76] | −0.75 [0.26] | 6.84 [2.83] | 12.29 [3.86] | 0.104 [0.049] | 0.059 [0.040] |
| (95%-CI) | 11.56; 26.48 | −1.28; −0.23 | 1.24; 12.45 | 4.62; 19.95 | 0.007; 0.200 | −0.022; 0.139 |
| Instrument's score at baseline [SE] | −0.75 [0.07] | −0.67 [0.10] | −0.72 [0.08] | −0.73 [0.07] | −0.661 [0.075] | −0.674 [0.075] |
| (95%-CI) | −0.89; −0.61 | −0.87; −0.47 | −0.88; −0.57 | −0.88; −0.58 | −0.809; −0.513 | −0.822; −0.526 |
| Period since first diagnosis (years) [SE] | −0.23 [0.23] | 0.03 [0.02] | −0.28 [0.17] | 0.06 [0.24] | 0.004 [0.003] | 0.004 [0.002] |
| (95%-CI) | −0.69; 0.23 | −0.004; 0.06 | −0.63; 0.07 | −0.41; 0.54 | −0.002; 0.009 | −0.0004; 0.009 |
| Age [SE] | 0.06 [0.13] | −0.01 [0.01] | 0.14 [0.10] | 0.01 [0.14] | −0.002 [0.002] | −0.002 [0.001] |
| (95%-CI) | −0.20; 0.32 | −0.03; 0.01 | −0.06; 0.33 | −0.26; 0.28 | −0.006; 0.001 | −0.005; 0.0006 |
| R2 | 0.63 | 0.41 | 0.51 | 0.53 | 0.58 | 0.58 |
| Anchor by BRAMES psychopathology scale (clinician based) | | | | | | |
| Constant [SE] | 29.13 [8.55] | 1.34 [0.51] | 21.48 [7.48] | 34.31 [8.39] | 0.328 [0.124] | 0.429 [0.121] |
| (95%-CI) | 12.17; 46.09 | 0.33; 2.34 | 6.63; 36.32 | 17.66; 50.96 | 0.082; 0.575 | 0.188; 0.669 |
| Shift in change scores from stable HS to worse HS [SE] | −23.18 [6.59] | 1.26 [0.41] | −13.18 [4.08] | −18.55 [6.33] | −0.223 [0.095] | −0.139 [0.080] |
| (95%-CI) | −36.25; −10.10 | 0.44; 2.08 | −21.29; −5.08 | −31.12; −5.99 | −0.411; −0.036 | −0.298; 0.021 |
| Shift in change scores from stable HS to better HS [SE] | 15.98 [4.25] | −1.00 [0.27] | 11.98 [2.73] | 13.49 [4.00] | 0.167 [0.059] | 0.106 [0.049] |
| (95%-CI) | 7.55; 24.40 | −1.53; −0.47 | 6.57; 17.39 | 5.54; 21.43 | 0.050; 0.283 | 0.008; 0.204 |
| Instrument's score at baseline [SE] | −0.54 [0.08] | −0.44 [0.11] | −0.49 [0.08] | −0.53 [0.08] | −0.444 [0.092] | −0.505 [0.094] |
| (95%-CI) | −0.70; −0.38 | −0.65; −0.23 | −0.65; −0.34 | −0.69; −0.38 | −0.627; −0.261 | −0.691; −0.319 |
| Period since first diagnosis (years) [SE] | −0.17 [0.25] | 0.02 [0.16] | −0.19 [0.16] | 0.07 [0.24] | 0.01 [0.04] | 0.02 [0.003] |
| (95%-CI) | −0.67; 0.33 | −0.01; 0.05 | −0.50; 0.12 | −0.40; 0.55 | −0.006; 0.008 | −0.004; 0.008 |
| Age [SE] | 0.021 [0.14] | −0.01 [0.01] | 0.17 [0.09] | −0.01 [0.14] | −0.02 [0.002] | −0.01 [0.002] |
| (95%-CI) | −0.26; 0.30 | −0.03; 0.01 | −0.001; 0.35 | −0.27; 0.26 | −0.006; 0.002 | −0.005; 0.002 |
| R2 | 0.56 | 0.45 | 0.60 | 0.51 | 0.37 | 0.34 |

HS=Health status; SE=Standard error; 95%-CI=95% confidence interval.

Significant coefficients (p<0.05) are printed bold.

4. Discussion 


Consistent with the first hypothesis, there was a medium to large overlap of constructs measuring aspects of quality of life, psychopathology, and social functioning. More specifically, the EQ VAS seemed to measure constructs similar to those of the WHOQOL-BREF, whereas the two EQ-5D indices showed less overlap with the instruments used for comparison. The EQ-5D index-UK revealed almost perfect correlation with the EQ-5D index-D. Thus, correlation analysis could lead to the impression that the measurement constructs recorded by the EQ-5D indices and the remaining instruments were not perfectly congruent. Results concerning the second hypothesis supported this impression: in comparison with the responsiveness statistics of the WHOQOL-BREF, the CGI, and the GAF, responsiveness was large in the EQ VAS, yet rather medium in the EQ-5D indices. The EQ-5D indices differed from the other instruments in quantifying a shift in health status according to both a patient-based and a clinician-based anchor. In this respect, preference weights elicited from the general population seemed either to underestimate a health improvement or to measure a construct different from that of the instruments used for comparison.

It is important to appreciate that particularly the characteristics of the EQ-5D preference-based measures, the definition of the anchors used and the nature of illness might account for the instrument’s responsiveness to record meaningful change in health status (Beaton et al., 2001, Krabbe et al., 2004).

The characteristics of EQ-5D index scores are rather complex, reflecting status on health dimensions as well as preferences of the general population. At baseline, patients were mainly affected by problems in the EQ-5D dimensions “usual activities”, “pain/discomfort”, and “anxiety/depression”; at follow-up, substantial improvement was indicated in the two EQ-5D dimensions “anxiety/depression” and “usual activities”, with fewer patients reporting moderate or extreme problems. Responsiveness and meaningful differences of the EQ-5D indices were partly determined by their scoring algorithm. It contains a term, N3, that reduces the index by 0.269 (EQ-5D index-UK) or 0.323 (EQ-5D index-D) units if patients indicate “extreme problems” in any dimension. Conversely, once a patient no longer reports “extreme problems” in any dimension, the EQ-5D index increases by the same amount. In this sample only four patients changed in this direction. In the case of the EQ-5D index-UK, a dimension-specific gain of 0.094 units is attached to the shift from “extreme problems” to “no problems” in the dimension “usual activities”, and a gain of 0.236 units is attached to such a shift in the dimension “anxiety/depression” (Dolan, 1997). Such a shift in the dimension “mobility”, which captures problems principally due to physical impairment, would lead to a gain of 0.314 units. Thus, the numerical weight of each dimension in the composite preference index suggests that the general population in the United Kingdom assigns “usual activities” and “anxiety/depression” lower importance than, for example, “mobility” and “pain/discomfort” within the five-dimensional health state described by the EQ-5D. As a consequence, it seems that disease-specific aspects of depression have a rather low impact on the EQ-5D index-UK (Willige van de et al., 2005).
With respect to the EQ-5D index-D, the shift from “extreme problems” to “no problems” leads to no change in the dimension “usual activities” and to a gain of only 0.065 units in the dimension “anxiety/depression” (Greiner et al., 2004). Unlike the two indices, the EQ VAS evaluation of health status seems to be influenced by disease-specific aspects of depression. This may be one reason why the EQ VAS was more responsive than the EQ-5D indices in the category “better” health status under both anchors.
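
The arithmetic of the EQ-5D index-UK scoring algorithm described above can be illustrated with a minimal Python sketch. Only the decrements 0.094, 0.236, and 0.314 and the N3 term of 0.269 are quoted in this article; the remaining coefficients (the level-2 decrements, the level-3 decrements for self-care and pain/discomfort, and the constant of 0.081 applied to any deviation from full health) are assumptions taken from the published UK time trade-off tariff (Dolan, 1997) and should be checked against that source before use. The function name and data layout are illustrative only.

```python
# Sketch of the EQ-5D index-UK scoring arithmetic (UK TTO tariff, Dolan 1997).
# Only 0.094, 0.236, 0.314 and N3 = 0.269 are quoted in the article text;
# all other coefficients are assumptions to be verified against the tariff.

DIMENSIONS = ("mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression")

# Decrements for level 2 ("moderate problems") and level 3 ("extreme problems").
DECREMENTS = {
    "mobility":           {2: 0.069, 3: 0.314},
    "self_care":          {2: 0.104, 3: 0.214},
    "usual_activities":   {2: 0.036, 3: 0.094},
    "pain_discomfort":    {2: 0.123, 3: 0.386},
    "anxiety_depression": {2: 0.071, 3: 0.236},
}
CONSTANT = 0.081  # subtracted once if any dimension deviates from level 1
N3 = 0.269        # subtracted once if any dimension is at level 3


def eq5d_index_uk(state):
    """Return the index for a state given as five levels (1, 2 or 3)."""
    if len(state) != 5 or any(level not in (1, 2, 3) for level in state):
        raise ValueError("state must contain five levels, each 1, 2 or 3")
    index = 1.0
    if any(level > 1 for level in state):
        index -= CONSTANT
    for dimension, level in zip(DIMENSIONS, state):
        if level > 1:
            index -= DECREMENTS[dimension][level]
    if 3 in state:
        index -= N3
    return round(index, 3)


if __name__ == "__main__":
    # Hypothetical patient: moderate problems with usual activities and
    # extreme anxiety/depression at baseline, no anxiety/depression at follow-up.
    before = eq5d_index_uk((1, 1, 2, 1, 3))
    after = eq5d_index_uk((1, 1, 2, 1, 1))
    print(before, after, round(after - before, 3))
```

In this hypothetical case the gain combines the dimension-specific decrement for “anxiety/depression” (0.236) with the N3 term (0.269), because no other dimension remains at the extreme level. A patient moving from moderate to no problems gains far less, which is consistent with the small changes in the indices discussed above.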

The EQ-5D index-UK and, to a greater extent, the EQ-5D index-D recorded change within the category “worse health status” with greater responsiveness than change within the category “better health status”. Intuitively, one explanation might be that the general population associates a greater loss in utility with a shift to a “worse health status”. This would support the validity of the EQ-5D index in this patient sample. Another possible explanation is the often reported ceiling effect of the EQ-5D indices (Brazier et al., 2004, König et al., 2006): the wider range at the bottom of the EQ-5D indices leaves more room to register change in health than the compressed top of the scale. It seems that the responsiveness of the EQ-5D indices depends not only on the patient's change in health status but also on the patient's degree of impairment.

An instrument's responsiveness and its ability to detect a “meaningful difference” depend to a considerable extent on how the anchor is defined (Guyatt et al., 2002). A clinician's opinion about a “meaningful difference” may differ from the patient's opinion and may lead to different results. For the patient-based anchor, the EQ-5D transition question determined the categories of change. Estimating “meaningful differences” with an instrument that serves as the anchor (EQ-5D transition question) and is also part of the analysis (EQ VAS, EQ-5D indices) may lead to confounded results. Moreover, the likelihood that a patient belongs to a category defined by one of the anchors might be affected by the follow-up score. A supplementary Spearman correlation analysis showed a large positive correlation of the EQ-5D transition question with follow-up scores, but a small, near-zero correlation with baseline scores. This may indicate that patients do not remember their baseline health status well when answering the transition question (Norman et al., 1997). In contrast, the clinician-based anchor showed a large positive correlation with follow-up scores and a large negative correlation with baseline scores, which was expected because this transition assessment is based on the distribution of a disease-specific instrument measuring psychopathology. Additionally, the time between baseline and follow-up assessment was up to 18 months, which is long compared with other longitudinal studies evaluating HRQOL in depression (Sapin et al., 2004). A response shift due to adaptation in health status over time is therefore likely, and the meaningful difference score might not be constant over time, especially if a large change in health status is expected at the beginning of therapy.

The selection of the five EQ-5D dimensions describing HRQOL was mainly based on the experience of the EuroQol group members along with a review of other generic HRQOL instruments (Brooks, 1996). For patients with depression, empirical results suggest that other attributes such as “sleep, cognition, energy, and participation in recreation” also have a notable impact on HRQOL (Skevington and Wright, 2001). In brief, quality of life in patients with depression is presumably more comprehensive than the EQ-5D is able to elicit, and the extent to which EQ-5D index scores of patients with depression reflect problems other than those in the dimension “anxiety/depression” needs further research (Supina et al., 2007).

It should be noted that the EQ-5D index was primarily developed as an instrument to measure outcome for QALY analysis in economic evaluation rather than to elicit HRQOL in clinical trials (Supina et al., 2007). Within the framework of economic evaluation, the incremental cost-effectiveness ratio (ICER) (i.e. additional cost per QALY gained) is more informative than the difference in HRQOL alone. Theoretically, even a small difference in the EQ-5D index might still be “cost-effective” if the additional cost for such a change in HRQOL is very low. Thus, despite the small “meaningful differences”, the economic information might be useful from a decision maker's perspective. However, the low responsiveness of the EQ-5D index tends to increase the uncertainty associated with the estimated ICER. Further research is needed, at least on how change in the health status of patients with depression is elicited by other preference-based questionnaires (e.g. the SF-6D, the HUI or the AQoL).
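
The ICER argument above can be made concrete with a short Python sketch. All numbers are hypothetical and chosen only to illustrate the arithmetic; they are not results of this study. The sketch assumes a constant utility difference over the time horizon, so that QALYs gained equal the utility difference multiplied by the duration in years; the function names are illustrative.

```python
# Sketch of the ICER arithmetic with purely hypothetical numbers.
# Assumption: the utility difference is constant over the time horizon.


def qaly_gain(utility_difference, years):
    """QALYs gained from a constant utility difference held for `years`."""
    return utility_difference * years


def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio: additional cost per QALY gained."""
    if incremental_qalys <= 0:
        raise ValueError("ICER is only computed here for a positive QALY gain")
    return incremental_cost / incremental_qalys


if __name__ == "__main__":
    years = 1.5              # hypothetical horizon matching an 18-month follow-up
    incremental_cost = 300   # hypothetical additional cost in currency units

    # A small utility difference can still give a moderate ICER if cost is low.
    print(round(icer(incremental_cost, qaly_gain(0.05, years))))

    # An imprecisely measured utility difference translates directly into a
    # wide range for the ICER: same cost, hypothetical bounds of 0.01 and 0.09.
    for utility_difference in (0.01, 0.09):
        print(utility_difference,
              round(icer(incremental_cost, qaly_gain(utility_difference, years))))
```

Because the utility difference sits in the denominator, small absolute errors in a small difference produce large swings in the ratio, which is the sense in which low responsiveness increases the uncertainty of the estimated ICER.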

A limitation of our study was the sample size, which may have precluded detection of the very small responsiveness of the EQ-5D indices. Further research should address this lack of statistical power by using a larger sample. However, the sample size was adequate to show the superior responsiveness of the instruments used for comparison. Therefore, when planning study resources, investigators have to consider that a larger sample is required to detect statistically significant meaningful differences in EQ-5D index scores of patients with depression.
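
How quickly the required sample grows as responsiveness falls can be sketched with a standard normal-approximation formula for a two-sided test on mean change within one group. This is an illustration, not the power analysis of this study. It assumes the effect size is expressed as mean change divided by the standard deviation of the change scores, which is not necessarily the same standardization as the effect sizes reported in this article; the function name, the significance level of 0.05, and the power of 0.80 are illustrative choices.

```python
# Sketch: approximate number of patients needed to detect a standardized
# mean change with a two-sided test (normal approximation).
# Assumption: effect_size = mean change / SD of change scores.
from math import ceil
from statistics import NormalDist


def required_n(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size for a within-group change of `effect_size`."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Cohen's conventional small, medium, and large effect sizes.
    for d in (0.2, 0.5, 0.8):
        print(d, required_n(d))
```

Under these assumptions the required number of patients rises roughly with the inverse square of the effect size, so an instrument with small responsiveness needs many times the sample that suffices for a highly responsive one.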

Acknowledgement 

This study was funded by the German Statutory Health Insurance (grant number 932000-050) and the German Federal Ministry of Education and Research (grant number 01ZZ0106).

References 

Angermeyer MC, Kilian R, Matschinger H. Handbuch für die deutsche Version der WHO Instrumente zur Erfassung der Lebensqualität. Hogrefe Verlag; 2000.

Beaton DE, Bombardier C, Katz JN, Wright JG. A taxonomy for responsiveness. J. Clin. Epidemiol. 2001;54:1204–1217.

Bech P, Bolwig TG, Kramp P, Rafaelsen OJ. The Bech–Rafaelsen mania scale and the Hamilton depression scale. Acta Psychiatr. Scand. 1979;59:420–430.

Brazier J, Roberts J, Tsuchiya A, Busschbach J. A comparison of the EQ-5D and SF-6D across seven patient groups. Health Econ. 2004;13:873–884.

Brooks R. EuroQol: the current state of play. Health Policy. 1996;37:53–72.

Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd edn. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.

Dolan P. Modeling valuations for EuroQol health states. Med. Care. 1997;35:1095–1108.

Dolan P. Whose preferences count? Med. Decis. Mak. 1999;19:482–486.

Drummond M, Jonsson B, Rutten F. The role of economic evaluation in the pricing and reimbursement of medicines. Health Policy. 1997;40:199–215.

Food and Drug Administration. Draft guidance for industry patient reported outcome measures: use in medical product development to support labeling claims. Fed. Regist. 2006;71:5862–5863.

Gold M, Siegel J, Russell L, Weinstein M. Cost-Effectiveness in Health and Medicine. New York: Oxford University Press; 1996.

Goldman HH, Skodol AE, Lave TR. Revising axis V for DSM-IV: a review of measures of social functioning. Am. J. Psychiatry. 1992;149:1148–1156.

Greiner W, Claes C, Busschbach JJ, Graf von der Schulenburg JM. Validating the EQ-5D with time trade off for the German population. Eur. J. Health Econ. 2004;6:124–130.

Guy W. CGI, ECDEU assessment manual for psychopharmacology. In: US Department of Health, Education, and Welfare. 1976; p. 76–338.

Guyatt G, Walter S, Norman G. Measuring change over time: assessing the usefulness of evaluative instruments. J. Chronic. Dis. 1987;40:171–178.

Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR; Clinical Significance Consensus Meeting Group. Methods to explain the clinical significance of health status measures. Mayo Clin. Proc. 2002;77:371–383.

Hayhurst H, Palmer S, Abbott R, Johnson T, Scott J. Measuring health related quality of life in bipolar disorder: relationship of the EuroQol (EQ-5D) to condition-specific measures. Qual. Life Res. 2006;15:1271–1280.

Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control. Clin. Trials. 1989;10:407–415.

König HH, Roick C, Angermeyer MC. Validity of the EQ-5D in assessing and valuing health status in patients with schizophrenic, schizotypal or delusional disorders. Eur. Psychiatr. 2006;22:177–187.

Krabbe PF, Peerenboom L, Langenhoff BS, Ruers TJ. Responsiveness of the generic EQ-5D summary measure compared to the disease-specific EORTC QLQ C-30. Qual. Life Res. 2004;13:1247–1253.

Lamers LM, Bouwmans CA, van Straten A, Donker MC, Hakkaart L. Comparison of EQ-5D and SF-6D utilities in mental health patients. Health Econ. 2006;15:1229–1236.

Licht RW, Qvitzau S, Allerup P, Bech P. Validation of the Bech–Rafaelsen melancholia scale and the Hamilton depression scale in patients with major depression; is the total score a valid measure of illness severity? Acta Psychiatr. Scand. 2005;111:144–149.

Norman GR, Stratford P, Regehr G. Methodological problems in the retrospective computation of responsiveness to change: the lesson of Cronbach. J. Clin. Epidemiol. 1997;50:869–879.

Norman GR, Sloan JA, Wyrwich KW. Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med. Care. 2003;41:582–592.

Papakostas GI, Petersen T, Mahal Y, Mischoulon D, Nierenberg AA, Fava M. Quality of life assessments in major depressive disorder: a review of the literature. Gen. Hosp. Psych. 2004;26:13–17.

Revicki DA, Osoba D, Fairclough D, Barofsky I, Berzon R, Leidy NK, et al. Recommendations on health-related quality of life research to support labeling and promotional claims in the United States. Qual. Life Res. 2000;9:887–900.

Revicki DA, Cella D, Hays RD, Sloan JA, Lenderking WR, Aaronson NK. Responsiveness and minimal important differences for patient reported outcomes. Health Qual. Life Outcomes. 2006;4:70.

Roick C, Deister A, Zeichner D, Birker T, König HH, Angermeyer MC. The regional budget for mental health care: a new approach to combine inpatient and outpatient care. Psychiatr. Prax. 2005;32:177–184.

Sapin C, Fantino B, Nowicki ML, Kind P. Usefulness of EQ-5D in assessing health status in primary care patients with major depressive disorder. Health Qual. Life Outcomes. 2004;2:20.

Shaw JW, Johnson JA, Coons SJ. US valuation of the EQ-5D health states: development and testing of the D1 valuation model. Med. Care. 2005;43:203–220.

Skevington SM, Wright A. Changes in the quality of life of patients receiving antidepressant medication in primary care: validation of the WHOQOL-100. Br. J. Psychiatry. 2001;178:261–267.

Sprangers MA, Moinpour CM, Moynihan TJ, Patrick DL, Revicki DA. Assessing meaningful change in quality of life over time: a users' guide for clinicians. Mayo Clin. Proc. 2002;77:561–571.

Supina AL, Johnson JA, Patten SB, Williams JV, Maxwell CJ. The usefulness of the EQ-5D in differentiating among persons with major depressive episode and anxiety. Qual. Life Res. 2007;16:749–754.

The EuroQol Group. EuroQol - a new facility for the measurement of health-related quality of life. Health Policy. 1990;16:199–208.

WHOQOL Group. Development of the World Health Organization WHOQOL-BREF quality of life assessment. Psychol. Med. 1998;28:551–558.

Wiebe S, Guyatt G, Weaver B, Matijevic S, Sidwell C. Comparative responsiveness of generic and specific quality-of-life instruments. J. Clin. Epidemiol. 2003;56:52–60.

Willige van de G, Wiersma D, Nienhuis FJ, Jenner JA. Changes in quality of life in chronic psychiatric patients: a comparison between EuroQol (EQ-5D) and WHOQoL. Qual. Life Res. 2005;14:441–451.

World Health Organization. International Statistical Classification of Diseases and Related Health Problems, Tenth Revision. 2004.

a Health Economics Research Unit, University of Leipzig, Johannisallee 20, 04317 Leipzig, Germany

b Department of Psychiatry, University of Leipzig, Johannisallee 20, 04317 Leipzig, Germany

Corresponding author. University of Leipzig, Health Economics Research Unit, Department of Psychiatry, Johannisallee 20, D-04317 Leipzig, Germany. Tel.: +49 341 97 24560; fax: +49 341 97 24569.

 Conflict of interest: All authors declare that they have no conflicts of interest.

Contributors: All authors contributed to and have approved the final manuscript:

Oliver H. Günther undertook the statistical analysis and wrote the first draft and the final version of the manuscript. Christiane Roick designed the study and managed the process of data collection. Matthias C. Angermeyer planned the study and reviewed the final manuscript. Hans-Helmut König analyzed the data and reviewed the first draft.

Role of funding source: Funding for this study was provided by German Statutory Health Insurance (grant number 932000-050) and the German Federal Ministry of Education and Research (grant number 01ZZ0106); both had no further role in study design; in the collection, analysis and interpretation of data; in the writing of the report; and in the decision to submit the paper for publication.

PII: S0165-0327(07)00142-5

doi:10.1016/j.jad.2007.04.018

