Elsevier

Journal of Affective Disorders

Volume 189, 1 January 2016, Pages 150-158
Research report
Clinical effectiveness of cognitive behavioral therapy for depression in routine care: A propensity score based comparison between randomized controlled trials and clinical practice

https://doi.org/10.1016/j.jad.2015.08.072

Highlights

  • Patients with MDD were matched to an RCT in order to compare treatment effects.

  • Inclusion/exclusion criteria of the RCT led to different treatment effects.

  • PSM revealed that CBT in the naturalistic sample was as effective as in the RCT.

  • CBT in clinical practice might be equally effective as in RCTs.

  • However, treatments lasted significantly longer under routine care conditions.

Abstract

Background

The efficacy of cognitive behavioral therapy (CBT) for the treatment of depressive disorders has been demonstrated in many randomized controlled trials (RCTs). This study investigated whether similar effects of CBT can be expected under routine care conditions when the patients are comparable to those examined in RCTs.

Method

N=574 CBT patients from an outpatient clinic were stepwise matched to the patients undergoing CBT in the National Institute of Mental Health Treatment of Depression Collaborative Research Program (TDCRP). First, the exclusion criteria of the RCT were applied to the naturalistic sample of the outpatient clinic. Second, propensity score matching (PSM) was used to adjust the remaining naturalistic sample on the basis of baseline covariate distributions. Matched samples were then compared regarding treatment effects using effect sizes, average treatment effect on the treated (ATT) and recovery rates.
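The three outcome measures named above can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: the effect size is taken as a pre-post standardized mean difference using the pretreatment standard deviation, the ATT is reduced to the difference in mean symptom change between the matched groups, and the recovery cutoff of 10 is a placeholder. All scores are invented toy values, not study data.

```python
from statistics import mean, stdev

def pre_post_effect_size(pre, post):
    """Standardized pre-post change: (mean_pre - mean_post) / SD_pre (assumed definition)."""
    return (mean(pre) - mean(post)) / stdev(pre)

def att(change_routine, change_rct):
    """Difference in mean symptom change between matched groups (simplified ATT)."""
    return mean(change_routine) - mean(change_rct)

def recovery_rate(post, cutoff):
    """Share of patients whose posttreatment score is at or below a cutoff."""
    return sum(score <= cutoff for score in post) / len(post)

# Toy symptom scores for two matched groups of four patients each (invented).
pre_rct, post_rct = [28, 25, 30, 22], [10, 12, 15, 8]
pre_routine, post_routine = [27, 24, 31, 23], [11, 9, 16, 10]

# Per-patient improvement (positive values mean fewer symptoms after treatment).
change_rct = [a - b for a, b in zip(pre_rct, post_rct)]
change_routine = [a - b for a, b in zip(pre_routine, post_routine)]

d_rct = pre_post_effect_size(pre_rct, post_rct)
d_routine = pre_post_effect_size(pre_routine, post_routine)
att_estimate = att(change_routine, change_rct)
rec_rct = recovery_rate(post_rct, cutoff=10)          # cutoff is a placeholder
rec_routine = recovery_rate(post_routine, cutoff=10)

print(f"d (RCT) = {d_rct:.2f}, d (routine) = {d_routine:.2f}")
print(f"ATT = {att_estimate:.2f}, recovery = {rec_rct:.2f} vs {rec_routine:.2f}")
```

An ATT estimate near zero in such a comparison would indicate that matched routine care patients improved about as much as the RCT patients.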

Results

CBT in the adjusted naturalistic subsample was as effective as in the RCT. However, treatments lasted significantly longer under routine care conditions.

Limitations

The samples included only a limited number of common predictor variables and stemmed from different countries. There might be additional covariates that could further improve the matching between the samples.

Conclusions

CBT for depression in clinical practice might be equally effective as manual-based treatments in RCTs when they are applied to comparable patients. The fact that similar effects under routine conditions were reached with more sessions, however, points to the potential to optimize treatments in clinical practice with respect to their efficiency.

Introduction

With a lifetime prevalence of 9.5%, depressive disorders are the second most common mental disorder after anxiety disorders (18.1%; Kessler et al., 2005). According to the World Health Organization (WHO), depression is even the leading disorder with regard to the overall burden of disease, and it might be the second-leading cause of disability worldwide by 2020 (Murray and Lopez, 1996). Not surprisingly, depression is therefore one of the most intensively studied mental disorders (e.g. Cuijpers et al., 2008, Cuijpers et al., 2014). In fact, more than 350 randomized controlled trials (RCTs) on the efficacy of depression treatment have been published. However, the effects of well-standardized depression treatments found in highly controlled RCTs have to be compared with the effects of depression treatment delivered under routine care conditions. Several peculiarities of RCTs aim to strengthen the internal validity of study findings but may hamper their external validity, that is, the transfer of the study's findings to clinical practice:

RCTs usually use highly structured treatment manuals for psychosocial interventions, and therapists are intensively trained to ensure that all patients receive a comparable treatment. Therapists in clinical practice often do not follow treatment manuals that strictly. Strict standardization of psychotherapeutic procedures and their one-to-one transfer from RCTs to clinical practice is therefore much more difficult in psychotherapy research than for other medical interventions (e.g. pharmacotherapy). Moreover, RCTs usually include only patients who meet a series of highly specific inclusion criteria in order to generate homogeneous samples and hence to strengthen the validity of the causal inferences. Combined with the restriction to volunteers who agree to be randomly assigned to a treatment condition, these inclusion/exclusion criteria may lead to highly selective samples in RCTs that omit many patients encountered in clinical practice. For instance, studies on antidepressant medications often exclude more than 80% of the patients with major depressive disorder (MDD) because they fail to meet one or more of the inclusion criteria (e.g. Keitner et al., 2003; Zetin and Hoepner, 2007). While comorbid disorders commonly represent an exclusion criterion in RCTs, patients with more than one mental disorder are frequently seen in clinical practice. Consequently, well-conducted efficacy studies have increasingly been criticized in terms of their external validity (Rothwell, 2005), and several efforts have been made to improve the external validity of RCTs. The STAR*D research program, for example, used an equipoise-stratified randomized design and gave each patient the possibility to accept the assignment to a particular treatment strategy (e.g., pharmacotherapy and CBT) or to decline it and move to another study arm. This procedure was intended to be closer to what happens in routine care and to reduce the number of non-consenters, resulting in a higher external validity of the study's findings (Warden et al., 2007).

To date, it is generally accepted that both efficacy studies (strictly controlled RCTs) and effectiveness studies (studies in naturalistic clinical settings that strengthen external validity at the cost of internal validity) are necessary to evaluate the usefulness of a treatment protocol (Castonguay et al., 2013, Finger and Rand, 2003, Green and Glasgow, 2006, Rothwell, 2005, Taylor and Asmundson, 2008). Results on the transferability of findings from RCTs to naturalistic studies are mixed: while some studies found similar effects (Merrill et al., 2003, Minami et al., 2008), others report that efficacy studies tend to find larger effect sizes than naturalistic studies (Gibbons et al., 2010, Hansen et al., 2002, Weisz et al., 1992). Furthermore, the outcome variance in naturalistic samples tends to be larger than in RCTs (e.g. McEvoy and Nathan, 2007). These findings point to the need for further investigation of the comparability between treatment effects in RCTs and in naturalistic settings.

We therefore aimed to compare the effects of CBT for patients with MDD in (a) a high-quality RCT (Elkin et al., 1989) and (b) a naturalistic study performed under routine care conditions. As in previous research (e.g. Shadish et al., 1997, Shadish et al., 2000; Schindler et al., 2011), we first applied the inclusion/exclusion criteria of the RCT to the sample from routine care to enhance the comparability of the patients examined in both study designs. In addition, we subsequently implemented propensity score matching (PSM) to adjust for confounding baseline variables between samples and to match the variable distributions (e.g. Rosenbaum and Rubin, 1983; West et al., 2015).
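The matching step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the propensity score is estimated with a hand-rolled logistic regression on study membership (RCT = 1, routine care = 0), matching is greedy 1:1 nearest neighbor without replacement, and the caliper of 0.1 is an arbitrary illustrative choice. The helper names (`fit_logistic`, `propensity`, `match_nearest`) are hypothetical, and the covariates are synthetic.

```python
import math
import random

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient ascent; returns [intercept, *coefs]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - 1.0 / (1.0 + math.exp(-z))  # observed minus predicted
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def propensity(w, x):
    """Predicted probability of belonging to the RCT sample given covariates x."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def match_nearest(ps_target, ps_pool, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement, within a caliper."""
    available = set(range(len(ps_pool)))
    pairs = []
    for i, p in enumerate(ps_target):
        if not available:
            break
        j = min(available, key=lambda idx: abs(ps_pool[idx] - p))
        if abs(ps_pool[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

random.seed(0)
# Synthetic standardized baseline covariates (two per patient); not study data.
rct = [[random.gauss(0.5, 1.0), random.gauss(0.0, 1.0)] for _ in range(40)]
routine = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(200)]

X = rct + routine
y = [1] * len(rct) + [0] * len(routine)
w = fit_logistic(X, y)
ps_rct = [propensity(w, x) for x in rct]
ps_routine = [propensity(w, x) for x in routine]
pairs = match_nearest(ps_rct, ps_routine)
print(f"matched {len(pairs)} of {len(rct)} RCT patients")
```

Each pair links one RCT patient to one routine care patient with a similar propensity score, so the matched routine care subsample resembles the RCT sample on the modeled covariates. Covariate balance would still need to be checked after matching.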

Methods

The current study was based on data from the National Institute of Mental Health Treatment of Depression Collaborative Research Program (TDCRP; Elkin et al., 1989), a large multicenter RCT in the US, as well as on naturalistic outcome data that were routinely collected at the University Outpatient Clinic Trier in southwestern Germany.

Results

Independent t tests and χ² tests were calculated to compare the baseline variables (BSI, BDI, DAS-K, sex, age, education and employment status; Table 1) and treatment length between the full naturalistic sample (N=574) and the CBT subsample of the TDCRP used in this study (n=40). Pretreatment scores in the BSI (t(612)=2.30, p=.02) and the BDI (t(53.28)=2.48, p=.02) both were significantly lower in the full naturalistic sample than in the RCT sample whereas education status was significantly

Discussion

The present study examined whether the effects of CBT for depressive patients in routine care are similar to the effects in a high-quality RCT, if the naturalistic sample is adjusted for inclusion/exclusion criteria of the RCT and matched for further baseline covariates that might affect treatment outcome. PSM, which is a sample matching procedure that takes the distribution of confounding baseline variables into account, was used to select a subsample of patients treated with CBT at a

Acknowledgements

We express our appreciation to the investigators in the Treatment of Depression Collaborative Research Program (TDCRP), especially Irene Elkin as the Coordinator, for providing access to their data set. Other leading collaborators at the National Institute of Mental Health (NIMH) were M. Tracie Shea (Associate Coordinator), John P. Docherty and Morris B. Parloff. The principal investigators and project coordinators at the three participating research sites were Stuart M. Sotsky and Davis Glass

References (60)

  • A.T. Beck et al. Comparison of Beck Depression Inventories-IA and-II in psychiatric outpatients. J. Personal. Assess. (1996)
  • A.T. Beck et al. An inventory for measuring depression. Arch. Gen. Psychiatry (1961)
  • C.L. Boyd et al. Untangling the causal effects of sex on judging. Am. J. Polit. Sci. (2010)
  • L.G. Castonguay et al. Practice-oriented research: approaches and application
  • P. Cuijpers et al. Psychotherapy for depression in adults: a meta-analysis of comparative outcome studies. J. Consult. Clin. Psychol. (2008)
  • R. Dehejia et al. Causal effects in non-experimental studies: re-evaluating the evaluation of training programs. J. Am. Stat. Assoc. (1999)
  • L.R. Derogatis. SCL-90-R:SCL-90. Administration, Scoring & Procedures. Manual-I for the R(evised) Version and other Instruments of the Psychopathology Rating Scale Series (1977)
  • L.R. Derogatis et al. The brief symptom inventory: an introductory report. Psychol. Med. (1983)
  • I. Elkin et al. NIMH treatment of depression collaborative research program. Arch. Gen. Psychiatry (1985)
  • I. Elkin et al. National Institute of Mental Health treatment of depression collaborative research program: general effectiveness of treatments. Arch. Gen. Psychiatry (1989)
  • M.S. Finger et al. Addressing validity concerns in clinical psychology research
  • First, M.B., Spitzer, R.L, Gibbon, M., Williams, J., 2002. Structured Clinical Interview for DSM IV-TR Axis I...
  • M. Floyd et al. The Dysfunctional Attitudes Scale: factor structure, reliability, and validity with older adults. Aging Mental Health (2004)
  • Franke, G., 2000. BSI: Brief Symptom Inventory von L.R. Derogatis (Kurzform der SCL-90-R) – Deutsche Version. Beltz...
  • S.L. Garfield. Research on client variables in psychotherapy
  • L.W. Green et al. Evaluating the relevance, generalization, and applicability of research issues in external validation and translation methodology. Eval. Health Prof. (2006)
  • S. Guo et al. Propensity Score Analysis: Statistical Methods and Applications (2014)
  • M. Hamilton. Development of a rating scale for primary depressive illness. Br. J. Soc. Clin. Psychol. (1967)
  • N.B. Hansen et al. The psychotherapy dose-response effect and its implications for treatment delivery services. Clin. Psychol.: Sci. Pract. (2002)
  • D.E. Ho et al. MatchIt: nonparametric preprocessing for parametric causal inference. J. Stat. Softw. (2011)