Watson and Workman did not explicitly address threats to internal validity other than coincidental events. The assumption that maturation contacted all tiers is strong: participants were all exposed to maturational variables (i.e., unidentified biological events and environmental interactions) for the same amount of time. We challenge this assertion. The functional answer to this question is that there must be enough tiers that no threat to internal validity is a plausible explanation for the pattern of effects across the set of tiers. As we mentioned above, across-tier comparisons require the assumptions that coincidental events will (1) contact and (2) have similar effects on all tiers of the design. The logic of replicated within-tier analysis applies equally to concurrent and nonconcurrent designs. For example, for a child who is on the cusp of walking, a month of exposure to maturational variables may result in a significant improvement in walking, but much less change in fine motor skills. Every multiple baseline design in which potential treatment effects are observed in some but not all tiers demonstrates that tiers are not always equally sensitive to interventions. (p. 365) "Of course, the major problem with this [nonconcurrent multiple baseline] strategy is that the control for history (i.e., the ability to assess subjects concurrently) is greatly diminished."
https://doi.org/10.1037/a0029312
Watson, P. J., & Workman, E. A.
Threats to Internal Validity in Multiple-Baseline Design Variations, https://doi.org/10.1007/s40614-022-00326-1
Concurrence on Nonconcurrence in Multiple-Baseline Designs: A Commentary on Slocum et al.
Experimental and quasi-experimental designs for research.
Wacker, D., Berg, W., Harding, J., & Cooper-Brown, L. (2004).
Single case experimental designs: Strategies for studying behavior change (3rd ed.).
Perhaps a more general and powerful triad of processes that support demonstration of experimental control would be prediction, contradiction, and replication. Using a small number of participants has advantages, but it also has a downside: because the experimental trials are run on only one subject, it is difficult to show empirically from the experiment's data that the findings will generalize to larger populations. However, the specific issues in this controversy have never been thoroughly identified, discussed, and resolved; instead, a consensus emerged without the issues being explicitly addressed. The concurrent multiple baseline design opened up many new opportunities to conduct applied research in contexts that were not amenable to other SCDs. ... (1968), who emphasized the replicated within-tier comparison. ... and (2) Was any change the result of the independent variable? Additional replications further reduce the plausibility of extraneous variables causing change at approximately the same time that the independent variable is applied to each tier. The consensus in recent textbooks and methodological papers is that nonconcurrent designs are less rigorous than concurrent designs because of their presumed limited ability to address the threat of coincidental events (i.e., history). The author has no known conflicts of interest to disclose. We can identify at least three general categories of issues that influence the number of tiers required to render threats implausible: challenges associated with the phenomena under study, experimental design features, and data analysis issues.
Sidman, M. (1960).
Timothy A. Slocum, P. Raymond Joslyn, Sarah E. Pinkelman, Thomas R. Kratochwill, Joel R. Levin, Esther R. Lindström, Marc J. Lanovaz, Stéphanie Turgeon, Tara L. Wheatley, Jonathan Rush, Philippe Rast & Scott M. Hofer, Perspectives on Behavior Science
2023 Springer Nature Switzerland AG.
Thus, a multiple baseline with phase changes sufficiently lagged (in terms of number of sessions) provides rigorous control for this threat. Therefore, concurrent and nonconcurrent designs are virtually identical in control for testing and session experience. In yet a third version of the multiple-baseline design, multiple baselines are established for the same participant but in different settings. In a concurrent multiple baseline across participants, behaviors, or stimulus materials that takes place in a single setting, this kind of event would contact all the tiers of the multiple baseline. The definition states that there must be sufficient lag between phase changes; this is not further specified because the amount of lag necessary to ensure that any single amount of maturation, number of sessions, or coincidental event could not cause changes in multiple tiers must be determined in the context of the particular study. (p. 325) "Compared to its concurrent multiple baseline design sibling, a non-concurrent arrangement is inherently weaker ..." Independently of Watson and Workman (1981), Hayes (1981) published a lengthy article introducing SCDs to clinical psychologists and made the point that these designs are well suited to conducting research in clinical practice. If A changes after B is put into practice, a researcher can draw the conclusion that B caused A to change. This statement, of course, fails to satisfy the operational desire for a specific number of tiers that accomplishes this function. These variables share the key characteristic that their impact would be expected to accumulate as a function of the number of experimental sessions. The assumption that all tiers respond similarly to maturation may be somewhat more problematic.
Multiple baseline and multiple probe designs.
Kennedy, C. H. Single-case designs for educational research.
Single-case experimental designs: Strategies for studying behavior change.
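The session-based lag described above can be checked mechanically once the number of baseline sessions per tier is known. The following is a minimal sketch, not a procedure from the article: the function names, the tier labels, and the `min_lag` threshold are all hypothetical, and the text stresses that the lag needed must be judged in the context of the particular study.

```python
from itertools import combinations


def session_lags(baseline_sessions):
    """Return the absolute difference in baseline session counts for each pair of tiers.

    baseline_sessions maps a tier label to the number of sessions
    completed before that tier's phase change.
    """
    return {
        (a, b): abs(baseline_sessions[a] - baseline_sessions[b])
        for a, b in combinations(sorted(baseline_sessions), 2)
    }


def lag_is_sufficient(baseline_sessions, min_lag):
    """True if every pair of tiers differs by at least min_lag baseline sessions.

    min_lag is an assumption supplied by the researcher, not a fixed standard.
    """
    return all(lag >= min_lag for lag in session_lags(baseline_sessions).values())


# Hypothetical three-tier study (illustrative numbers only).
example = {"tier_1": 5, "tier_2": 10, "tier_3": 16}
print(session_lags(example))
print(lag_is_sufficient(example, min_lag=5))
```

Because the check uses session counts rather than dates, it applies in the same way to concurrent and nonconcurrent arrangements.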
"... because a non-concurrent design does not allow any AB comparisons across baselines, it omits the opportunity to see if responding under the control condition changes when the treatment condition is implemented in the other baseline." Coincidental events might be expected to be more variable in their effect than interventions that are designed to have consistent effects. An example of multiple baseline across behaviors might be to use feedback to develop a comprehensive exercise program that involves stretching, aerobic exercise, ... It is possible that a coincidental event may be present for all tiers but have different effects on different tiers. Kazdin and Kopel (1975) parallel much of Hersen and Barlow's (1976) commentary, but they also point out an apparent contradiction in the assumptions about behavior on which the multiple baseline design is built. This critical requirement is mainly addressed by the lag between phase changes in successive phases. Watson and Workman described a nonconcurrent multiple baseline design in which participants could begin a study as they became known to the researcher. They state, "the nonconcurrent multiple baseline across participants design is inherently weaker than other multiple baseline design variations." Second, the across-tier comparison assumes that extraneous variables will affect multiple tiers similarly. Thus, for any multiple baseline design to address the threat of maturation, it must show changes in multiple tiers after substantially differing numbers of days in baseline.
https://doi.org/10.1901/jaba.1968.1-91
Hayes, S. C. (1981).
Use of brief experimental analyses in outpatient clinic and home settings.
https://doi.org/10.1007/s40614-022-00343-0, SI: Commentary on Slocum et al., Threats to Internal Validity.
Thus, both of the articles introducing nonconcurrent multiple baselines made explicit arguments that replicated within-tier comparisons are sufficient to address the threat of coincidental events. With stable data, the range within which future data points will fall is ... Finally, we make recommendations for more rigorous use, reporting, and evaluation of multiple baseline designs. If an extraneous variable were to have a tier-specific effect, it would be falsely interpreted as a treatment effect. ... in their classic 1968 article that defined applied behavior analysis. If either of these assumptions is not valid for a coincidental event, then the presence and function of that event would not be revealed by the across-tier analysis. We have no known conflict of interest to disclose. What are the benefits and problems of these designs? A multiple baseline design across behaviors was used to examine intervention effects. ... treatment (Kazdin & Nock, 2003). The lag between phase changes must be long enough that maturation over any single amount of time cannot explain the results in multiple tiers. The purposes of this article are to (1) thoroughly examine the impact that threats to internal validity can have on concurrent and nonconcurrent multiple baseline designs; (2) describe the critical features of each design type that control for threats to internal validity; and (3) offer recommendations for use and reporting of concurrent and nonconcurrent multiple baseline designs. One area that has, in the past, been particularly controversial is the experimental rigor of concurrent versus nonconcurrent multiple baseline designs; that is, the degree to which each can rule out threats to internal validity.
Kazdin, A. E. (2021).
Recommendations for reporting multiple-baseline designs across participants. Behavioral Interventions, 20(3), 219-224.
https://doi.org/10.1177/001440290507100203
Johnston, J. M., Pennypacker, H. S., & Green, G. (2020).
These baseline-treatment comparisons, which we will refer to as tiers, differ from one another with respect to participants, behaviors, settings, stimulus materials, and/or other variables. This argument rests on the assumptions that any extraneous variable that affects one tier will (1) contact all tiers and (2) have a similar effect on all tiers. If this requirement is not met and a single extraneous event could explain the pattern of data in multiple tiers, then replications of the within-tier comparison do not rule out threats to internal validity as strongly. A potential treatment effect in any single tier could plausibly be explained as a result of a coincidental event. As a result, concurrent and nonconcurrent designs are virtually identical in their control for maturation threats. This has been the sharpest point of criticism of nonconcurrent multiple baselines. Based on the logic laid out in this article, we believe that the threats of maturation and of testing and session experience are controlled equivalently in concurrent and nonconcurrent designs. When conditions are less ideal, additional tiers may be necessary. Concurrent multiple baseline designs are multiple baseline designs in which the tiers are synchronized in real time. In particular, within-tier comparisons may be strengthened by isolating tiers from one another in ways that reduce the chance that any single coincidental event could coincide with a phase change in more than one tier (e.g., temporal separation).
Tactics of scientific research.
Nonconcurrent multiple baseline designs and the evaluation of educational systems.
Perspect Behav Sci 45, 647-650 (2022).
If we observe a potential treatment effect in one tier and corresponding changes in untreated tiers after similar amounts of time (i.e., number of days), maturation becomes a more plausible alternative explanation of the initial potential treatment effect. (Similar arguments can be made for comparisons across settings, persons, and other variables that might define tiers.) Maturation refers to extraneous variables such as physical growth, physiological changes, typical interactions with social and physical environments, academic instruction, and behavior management procedures that tend to cause changes in behavior over time (cf. Shadish et al., 2002). ... a potential treatment effect in the first tier would be vulnerable to the threat that the changes in data could be a result of ... However, in a concurrent multiple baseline across settings, a setting-level event would contact only a single tier; the design would be inherently insensitive to these coincidental events. In the current study, it is likely that exposure to some of the measures can affect scores on other measures, or repeated exposure to a measure can lead to socially desirable responding or ... Limitations of alternating treatment designs: (1) they are susceptible to multiple treatment interference; (2) rapid back-and-forth switching of treatments does not reflect the typical manner in which interventions are applied and may be viewed as artificial and undesirable.
Single-case research designs: Methods for clinical and applied settings (3rd ed.).
Journal of Behavioral Education, 13(4), 267-276.
Threats to Internal Validity in Multiple-Baseline Design Variations.
Nonconcurrent multiple baseline designs for educational program evaluation.
Carr, J. E. (2005).
Cooper, J. O., Heron, T. E., & Heward, W. L. (2020).
Further, for the across-tier comparison to detect the influence of a coincidental event, that event must not only contact multiple tiers, it must cause similar changes in the dependent measure across multiple tiers. This question cannot be addressed by data analysis alone; any pattern of data, no matter how dramatic, could be a result of an extraneous variable if the experimental design features are not properly arranged. "Although the design entails two of the three elements of baseline logic (prediction and replication), the absence of concurrent baseline measures precludes the verification of [the prediction]." This has at least two effects: first, the multiple baseline is seen as weaker than the withdrawal design because of this dependence on the across-tier analysis; and second, when nonconcurrent multiple baseline designs are introduced years later, their rigor will be understood by many methodologists in terms of control by across-tier comparisons only, without consideration of replicated within-tier comparisons. In general, a longer lag is better because it reduces the chance that an event could impact multiple tiers. Disadvantage: Covariance among subjects may emerge if individuals learn vicariously through the experiences of other subjects. Also, identifying multiple subjects in the same ... Concurrence is not necessary to detect and control for maturation. The lack of change in untreated tiers should be interpreted only as weak evidence supporting internal validity, given the plausible alternative explanations of this lack of change.
Interrater agreement on the visual analysis of individual tiers and functional relations in multiple baseline designs.
Coon, J. C., & Rapp, J. T. (2018).
Multiple baseline designs are intended to evaluate whether there is a functional (causal) relation between the introduction of the independent variable and changes in the dependent variable. This control assumes that the replications are sufficiently offset in real time (e.g., calendar days) to ensure that a single coincidental event could not plausibly cause the effects observed in multiple tiers. For both types of comparisons, addressing maturation begins with an AB contrast in a single tier. If a potential treatment effect is seen in one tier and on the same day there is no change in other tiers, this is taken as strong evidence that the potential treatment effect was not a result of a coincidental event, because a coincidental event would have had an effect on all tiers. Each replication requires an assumption of a separate event coinciding with a distinct phase change. The reversal model is fine for many questions, but in some instances, removing a type of treatment could be unwise or even unethical. Thus, the additional temporal separation that is possible in a nonconcurrent design is a strength rather than a weakness in controlling for coincidental events. Nonconcurrent multiple baseline designs, however, do not afford this comparison. Finally, practitioners whose work may be influenced by SCD research must understand these issues so they can give appropriate weight to research findings. The details of situations in which this across-tier comparison is valid for ruling out threats to internal validity are more complex than they may appear. Without these dimensions of lag explicitly stated in the definition, we cannot claim that multiple baseline designs will necessarily include the features required to establish experimental control.
https://doi.org/10.3758/s13428-011-0111-y
Hayes, S. C. (1985).
When determining whether a multiple baseline study demonstrates experimental control, researchers examine the data within and across tiers and also consider the extent to which alternative explanations (e.g., extraneous variables or confounds) could plausibly account for the obtained data patterns. With control for coincidental events in multiple baseline designs resting squarely on replicated within-tier comparisons, there is no basis for claiming that, in general, concurrent designs are methodologically stronger than nonconcurrent designs. Without the latter you cannot conclude, with confidence, that the intervention alone is responsible for observed behavior changes since baseline (or probe) data are not concurrently collected on all tiers from the start of the investigation. However, current practice provides little or no direct information on either the temporal duration (e.g., number of days) of baseline or the offset between phase changes in real time (i.e., number of calendar days between phase changes). In addition, arranging tiers that are isolated in other dimensions (e.g., location, behaviors, participants) confers overall strength, not weakness, for addressing coincidental events. Although publication dates would suggest that Kazdin and Kopel (1975) was published before Hersen and Barlow (1976), Kazdin and Kopel cite Hersen and Barlow, and not the other way around.
Perspect Behav Sci 45, 619-638 (2022).
Journal of Behavioral Education, 13(4), 213-226.
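The two quantities that the passage above says are rarely reported, baseline duration in days and the real-time offset between phase changes, are straightforward to compute when session dates are recorded. The following is a minimal sketch under stated assumptions: the function name `timing_report`, the participant labels, and all dates are hypothetical and serve only to illustrate the arithmetic.

```python
from datetime import date


def timing_report(tiers):
    """Summarize real-time baseline duration and phase-change offsets.

    tiers maps a tier label to (baseline_start, phase_change) dates.
    Rows are ordered by phase-change date; offset_days is the number of
    calendar days since the previous tier's phase change (None for the first).
    """
    ordered = sorted(tiers.items(), key=lambda item: item[1][1])
    rows = []
    previous_change = None
    for name, (start, change) in ordered:
        rows.append({
            "tier": name,
            "baseline_days": (change - start).days,
            "offset_days": None if previous_change is None
            else (change - previous_change).days,
        })
        previous_change = change
    return rows


# Hypothetical nonconcurrent study: tiers start on different dates.
example = {
    "participant_A": (date(2023, 1, 9), date(2023, 1, 23)),
    "participant_B": (date(2023, 3, 6), date(2023, 4, 3)),
    "participant_C": (date(2023, 2, 1), date(2023, 2, 22)),
}
for row in timing_report(example):
    print(row)
```

Reporting both columns lets a reader judge whether a single coincidental event, or a single amount of maturation, could plausibly account for changes in more than one tier.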
If each tier of a multiple baseline represents a different participant in a different environment (e.g., school versus clinic) located in a different city, this would further reduce the chance that any single event or pattern of events could have contacted the participants coincident with the phase changes. The key characteristic that maturational processes share is that they may produce behavioral changes that would be expected to accumulate as a function of elapsed time in the absence of participation in research. In order to control for maturation, we must attend to the passage of time, typically in calendar days. That is, session numbers do not necessarily correspond to the same periods of real time across tiers. When changes in data occur immediately after the phase change, are large in magnitude, and are consistent across tiers, threats to internal validity tend to be less plausible explanations of the data patterns, and fewer tiers would be required to rule them out. Under the proposed definition, such a study would not be considered a full-fledged multiple baseline. Textbooks commonly describe and characterize the design without clearly defining it. If the pattern of change shortly after implementation of the treatment is replicated in the other tiers after differing lengths of time in baseline (i.e., different amounts of maturation), maturation becomes increasingly implausible as an alternative explanation. We examine how these comparisons address maturation, testing and session experience, and coincidental events.
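The point that session numbers need not correspond to the same periods of real time across tiers can be illustrated with a small check. This is only a sketch: treating "synchronized in real time" as "each shared session number falls on the same date in every tier" is an assumption made here for illustration, and the function name and dates are hypothetical.

```python
from datetime import date


def sessions_aligned(session_dates):
    """True if every session number shared by all tiers falls on the same date.

    session_dates maps a tier label to a list of dates, where index 0 is
    session 1, index 1 is session 2, and so on.
    """
    schedules = list(session_dates.values())
    if not schedules:
        return True
    shared = min(len(schedule) for schedule in schedules)
    return all(
        len({schedule[i] for schedule in schedules}) == 1
        for i in range(shared)
    )


# Hypothetical data: same session numbers, different calendar dates.
synchronized = {
    "tier_1": [date(2023, 1, 9), date(2023, 1, 10), date(2023, 1, 11)],
    "tier_2": [date(2023, 1, 9), date(2023, 1, 10), date(2023, 1, 11)],
}
staggered = {
    "tier_1": [date(2023, 1, 9), date(2023, 1, 10), date(2023, 1, 11)],
    "tier_2": [date(2023, 2, 6), date(2023, 2, 8), date(2023, 2, 13)],
}
print(sessions_aligned(synchronized))
print(sessions_aligned(staggered))
```

When the check fails, a graph plotted against session number hides the real-time relations among tiers, which is why dates should be reported alongside session counts.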