Clinical effectiveness of cognitive behavioral therapy for depression in routine care: A propensity score based comparison between randomized controlled trials and clinical practice

doi:10.1016/j.jad.2015.08.072

Journal of Affective Disorders

Volume 189, 1 January 2016, Pages 150-158

https://doi.org/10.1016/j.jad.2015.08.072

Highlights

• Patients with MDD from routine care were matched to patients from an RCT in order to compare treatment effects.
• Inclusion/exclusion criteria of the RCT led to different treatment effects.
• PSM revealed that CBT in the naturalistic sample was as effective as in the RCT.
• CBT in clinical practice might be equally effective as in RCTs.
• However, treatments lasted significantly longer under routine care conditions.

Abstract

Background

The efficacy of cognitive behavioral therapy (CBT) for the treatment of depressive disorders has been demonstrated in many randomized controlled trials (RCTs). This study investigated whether CBT can be expected to achieve similar effects under routine care conditions when the patients are comparable to those examined in RCTs.

Method

N=574 CBT patients from an outpatient clinic were stepwise matched to the patients undergoing CBT in the National Institute of Mental Health Treatment of Depression Collaborative Research Program (TDCRP). First, the exclusion criteria of the RCT were applied to the naturalistic sample of the outpatient clinic. Second, propensity score matching (PSM) was used to adjust the remaining naturalistic sample on the basis of baseline covariate distributions. Matched samples were then compared regarding treatment effects using effect sizes, average treatment effect on the treated (ATT) and recovery rates.
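The two-step procedure described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the `Patient` fields, the caliper of 0.1, the greedy 1:1 matching without replacement, and the toy values are all assumptions made for the example, and the propensity scores are taken as already estimated. The mean paired difference at the end is only one simple way to express an ATT-style contrast between matched samples.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    pid: str
    excluded: bool   # hypothetical flag: True if any RCT exclusion criterion applies
    pscore: float    # propensity score, assumed to be estimated beforehand
    change: float    # hypothetical pre-post symptom reduction (higher = more improvement)


def match_nearest(rct, routine, caliper=0.1):
    """Step 1: drop routine-care patients who meet an exclusion criterion.
    Step 2: greedy 1:1 nearest-neighbour matching on the propensity score,
    without replacement; candidates outside the caliper stay unmatched."""
    available = [p for p in routine if not p.excluded]
    pairs = []
    for target in rct:
        if not available:
            break
        best = min(available, key=lambda p: abs(p.pscore - target.pscore))
        if abs(best.pscore - target.pscore) <= caliper:
            pairs.append((target, best))
            available.remove(best)
    return pairs


def mean_paired_difference(pairs):
    """Average outcome difference (RCT patient minus matched routine-care patient)."""
    diffs = [t.change - m.change for t, m in pairs]
    return sum(diffs) / len(diffs)


# Toy data, invented purely for illustration.
rct = [Patient("r1", False, 0.30, 12.0), Patient("r2", False, 0.60, 10.0)]
routine = [
    Patient("n1", False, 0.32, 11.0),
    Patient("n2", True, 0.30, 20.0),   # removed in step 1
    Patient("n3", False, 0.58, 9.0),
    Patient("n4", False, 0.95, 5.0),   # no RCT patient within the caliper
]
pairs = match_nearest(rct, routine)
matched_ids = [(t.pid, m.pid) for t, m in pairs]
```

In this toy run, `r1` is paired with `n1` and `r2` with `n3`; `n2` is dropped by the exclusion step even though its score is the closest to `r1`, which shows why the order of the two steps matters.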

Results

CBT in the adjusted naturalistic subsample was as effective as in the RCT. However, treatments lasted significantly longer under routine care conditions.

Limitations

The samples included only a limited number of common predictor variables and stemmed from different countries. Additional covariates might further improve the matching between the samples.

Conclusions

CBT for depression in clinical practice might be equally effective as manual-based treatments in RCTs when they are applied to comparable patients. The fact that similar effects under routine conditions were reached with more sessions, however, points to the potential to optimize treatments in clinical practice with respect to their efficiency.

Introduction

With a lifetime prevalence of 9.5%, depressive disorders are the second most common mental disorder after anxiety disorders (18.1%; Kessler et al., 2005). According to the World Health Organization (WHO), depression is even the leading disorder in terms of the overall burden of disease, and it might be the second-leading cause of disability worldwide by 2020 (Murray and Lopez, 1996). Not surprisingly, depression is therefore one of the most intensively studied mental disorders (e.g. Cuijpers et al., 2008; Cuijpers et al., 2014): more than 350 randomized controlled trials (RCTs) on the efficacy of depression treatment have been published. However, the effects of well-standardized depression treatments found in highly controlled RCTs have to be compared to the effects of depression treatment delivered under routine care conditions. Several peculiarities of RCTs aim to strengthen the internal validity of study findings but may hamper their external validity, that is, the transfer of the findings to clinical practice:

RCTs usually use highly structured treatment manuals for psychosocial interventions, and therapists are intensively trained to ensure that all patients receive a comparable treatment. Therapists in clinical practice often do not follow treatment manuals that strictly. Strict standardization of psychotherapeutic procedures and their one-to-one transfer from RCTs to clinical practice is therefore much more difficult in psychotherapy research than for other medical interventions (e.g. pharmacotherapy). Moreover, RCTs usually include only patients who meet a series of highly specific inclusion criteria in order to generate homogeneous samples and hence to strengthen the validity of the causal inferences. Combined with the restriction to voluntary patients who agree to be randomly assigned to a treatment condition, these inclusion/exclusion criteria may lead to highly selective samples in RCTs that omit many patients encountered in clinical practice. For instance, studies on antidepressant medications often exclude more than 80% of the patients with major depressive disorder (MDD) because they fail to meet one or more inclusion criteria (e.g. Keitner et al., 2003; Zetin and Hoepner, 2007). While comorbid disorders commonly represent an exclusion criterion in RCTs, patients with more than one mental disorder are frequently seen in clinical practice. Consequently, well-conducted efficacy studies have increasingly been criticized in terms of their external validity (Rothwell, 2005), and several efforts have been made to improve the external validity of RCTs. The STAR*D research program, for example, used an equipoise-stratified randomized design and gave each patient the possibility to accept the assignment to a particular treatment strategy (e.g., pharmacotherapy and CBT) or to decline it and move to another study arm. This procedure was intended to be closer to what happens in routine care and to reduce the number of non-consenters, resulting in a higher external validity of the study's findings (Warden et al., 2007).

To date, it is generally accepted that both efficacy studies (strictly controlled RCTs) and effectiveness studies (studies in naturalistic clinical settings that strengthen external validity at the cost of internal validity) are necessary to evaluate the usefulness of a treatment protocol (Castonguay et al., 2013; Finger and Rand, 2003; Green and Glasgow, 2006; Rothwell, 2005; Taylor and Asmundson, 2008). Results on the transferability of findings from RCTs to naturalistic studies are mixed: while some studies found similar effects (Merrill et al., 2003; Minami et al., 2008), others report that efficacy studies tend to find larger effect sizes than naturalistic studies (Gibbons et al., 2010; Hansen et al., 2002; Weisz et al., 1992). Furthermore, the outcome variance in naturalistic samples tends to be larger than in RCTs (e.g. McEvoy and Nathan, 2007). These findings point to the need for further investigation of the comparability of treatment effects in RCTs and in naturalistic settings.

We therefore aimed to compare the effects of CBT for patients with MDD in (a) a high-quality RCT (Elkin et al., 1989) and (b) a naturalistic study performed under routine care conditions. As in previous research (e.g. Shadish et al., 1997; Shadish et al., 2000; Schindler et al., 2011), we first applied the inclusion/exclusion criteria of the RCT to the sample from routine care to enhance the comparability of the patients examined in both study designs. We then implemented propensity score matching (PSM) to adjust for confounding baseline variables between samples and to match the variable distributions (e.g. Rosenbaum and Rubin, 1983; West et al., 2015).
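To make the idea of a propensity score concrete: it is the estimated probability that a patient belongs to one sample (here, the RCT) given the baseline covariates. The sketch below estimates it with a plain logistic regression fitted by gradient descent. The two standardized covariates, the toy values, the learning rate, and the number of iterations are assumptions chosen for illustration; this excerpt does not specify the model or software the authors used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent.
    Returns the weights as [intercept, w1, w2, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def propensity(w, x):
    """Estimated probability of belonging to the RCT sample given covariates x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Hypothetical standardized covariates: [baseline symptom score, age].
X = [[1.2, 0.3], [0.8, -0.5], [0.1, 0.9], [-0.3, 0.2],      # RCT patients
     [0.4, -0.1], [-0.9, 0.6], [-1.1, -0.8], [-0.2, 0.0]]   # routine-care patients
y = [1, 1, 1, 1, 0, 0, 0, 0]                                # 1 = RCT, 0 = routine care

w = fit_logistic(X, y)
scores = [propensity(w, x) for x in X]
```

Patients from the two samples with similar scores have similar covariate profiles in the sense that matters for sample membership, which is what makes the score usable as a single matching variable.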

Methods

The current study was based on data from the National Institute of Mental Health Treatment of Depression Collaborative Research Program (TDCRP; Elkin et al., 1989), a large multicenter RCT in the US, as well as on naturalistic outcome data that were routinely assessed at the University Outpatient Clinic Trier in the southwest of Germany.

Results

Independent t tests and χ² tests were calculated to compare the baseline variables (BSI, BDI, DAS-K, sex, age, education and employment status; Table 1) and treatment length between the full naturalistic sample (N=574) and the CBT subsample of the TDCRP used in this study (n=40). Pretreatment scores in the BSI (t(612) =2.30, p=.02) and the BDI (t(53.28) =2.48, p=.02) both were significantly lower in the full naturalistic sample than in the RCT sample whereas education status was significantly
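The degrees of freedom reported in this snippet can be checked by hand. For a pooled-variance independent t test, df = 574 + 40 − 2 = 612, which matches the BSI comparison. A fractional value such as 53.28 is what a Welch-type correction for unequal variances produces, although the excerpt does not name the correction, so that reading is an assumption. The sketch below computes both quantities; the standard deviations in the example are hypothetical and are not taken from the paper.

```python
def pooled_df(n1, n2):
    """Degrees of freedom of the pooled-variance (Student) t test."""
    return n1 + n2 - 2


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite degrees of freedom for unequal variances.
    s1, s2 are sample standard deviations; n1, n2 are sample sizes."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


df_student = pooled_df(574, 40)

# Hypothetical standard deviations, used only to show that the corrected df
# falls between the smaller group's n - 1 and the pooled df.
df_welch = welch_df(10.0, 574, 6.0, 40)
```

When one group is much smaller than the other, the corrected df moves toward the smaller group's n − 1, which is consistent with a value in the fifties for a comparison of 574 against 40 patients.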

Discussion

The present study examined whether the effects of CBT for depressive patients in routine care are similar to the effects in a high-quality RCT, if the naturalistic sample is adjusted for inclusion/exclusion criteria of the RCT and matched for further baseline covariates that might affect treatment outcome. PSM, which is a sample matching procedure that takes the distribution of confounding baseline variables into account, was used to select a subsample of patients treated with CBT at a

Acknowledgements

We express our appreciation to the investigators in the Treatment of Depression Collaborative Research Program (TDCRP), especially Irene Elkin as the Coordinator, for providing access to their data set. Other leading collaborators at the National Institute of Mental Health (NIMH) were M. Tracie Shea (Associate Coordinator), John P. Docherty and Morris B. Parloff. The principal investigators and project coordinators at the three participating research sites were Stuart M. Sotsky and Davis Glass

References (60)

A.T. Beck et al. Psychometric properties of the Beck Depression Inventory: twenty-five years of evaluation. Clin. Psychol. Rev. (1988)
P. Cuijpers et al. The effects of psychotherapies for major depression in adults on remission, recovery and improvement: a meta-analysis. J. Affect. Disord. (2014)
C.J. Gibbons et al. The clinical effectiveness of cognitive therapy for depression in an outpatient clinic. J. Affect. Disord. (2010)
R.S. Lipman et al. The Hopkins Symptom Checklist (HSCL): factors derived from the HSCL-90. J. Affect. Disord. (1979)
W. Lutz et al. Patterns of early change and their relationship to outcome and follow-up among patients with major depressive disorders. J. Affect. Disord. (2009)
P.M. Rothwell. External validity of randomised controlled trials: “to whom do the results of this trial apply?” The Lancet (2005)
American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders (2000)
Augurzky, B., Schmidt, C.M., 2001. The propensity score: A means to an end. IZA Discussion paper series No....
J. Barabas. How deliberation affects policy opinions. Am. Polit. Sci. Rev. (2004)
A.T. Beck et al. Cognitive theory of depression (1979)
A.T. Beck et al. Comparison of Beck Depression Inventories-IA and-II in psychiatric outpatients. J. Personal. Assess. (1996)
A.T. Beck et al. An inventory for measuring depression. Arch. Gen. Psychiatry (1961)
C.L. Boyd et al. Untangling the causal effects of sex on judging. Am. J. Polit. Sci. (2010)
L.G. Castonguay et al. Practice-oriented research: approaches and application
P. Cuijpers et al. Psychotherapy for depression in adults: a meta-analysis of comparative outcome studies. J. Consult. Clin. Psychol. (2008)
R. Dehejia et al. Causal effects in non-experimental studies: re-evaluating the evaluation of training programs. J. Am. Stat. Assoc. (1999)
L.R. Derogatis. SCL-90-R:SCL-90. Administration, Scoring & Procedures. Manual-I for the R(evised) Version and other Instruments of the Psychopathology Rating Scale Series (1977)
L.R. Derogatis et al. The brief symptom inventory: an introductory report. Psychol. Med. (1983)
I. Elkin et al. NIMH treatment of depression collaborative research program. Arch. Gen. Psychiatry (1985)
I. Elkin et al. National Institute of Mental Health treatment of depression collaborative research program: general effectiveness of treatments. Arch. Gen. Psychiatry (1989)
M.S. Finger et al. Addressing validity concerns in clinical psychology research
First, M.B., Spitzer, R.L, Gibbon, M., Williams, J., 2002. Structured Clinical Interview for DSM IV-TR Axis I...
M. Floyd et al. The Dysfunctional Attitudes Scale: factor structure, reliability, and validity with older adults. Aging Mental Health (2004)
Franke, G., 2000. BSI: Brief Symptom Inventory von L.R. Derogatis (Kurzform der SCL-90-R) – Deutsche Version. Beltz...
S.L. Garfield. Research on client variables in psychotherapy
L.W. Green et al. Evaluating the relevance, generalization, and applicability of research: issues in external validation and translation methodology. Eval. Health Prof. (2006)
S. Guo et al. Propensity Score Analysis: Statistical methods and Applications (2014)
M. Hamilton. Development of a rating scale for primary depressive illness. Br. J. Soc. Clin. Psychol. (1967)
N.B. Hansen et al. The psychotherapy dose-response effect and its implications for treatment delivery services. Clin. Psychol.: Sci. Pract. (2002)
D.E. Ho et al. MatchIt: nonparametric preprocessing for parametric causal inference. J. Stat. Softw. (2011)
