Effect of students' emotional and behavioral disorder and pre-service teachers’ stress on judgments in a simulated class

December, 2021

Source: Teaching and Teacher Education, Volume 108

(Reviewed by the Portal Team)

In this study, the authors used an experimental setting to investigate the potential causal effect of the emotional and behavioral disorder (EBD) label on performance judgments.
They examined whether EBD influences pre-service teachers’ judgments about student performance in terms of estimated percentage of correct student responses and school grades assigned, controlling for actual performance.
Furthermore, they examined a possible indirect effect of EBD on judgments via frequency of calling on each student.
More precisely, they investigated whether EBD leads to a higher/lower call frequency and more/less accurate judgments.
Because students with EBD are a frequent source of teacher stress, they examined stress as a potential moderator of the assumed relationship between EBD and judgments.
They also examined a possible moderating effect of stress on the relationship between EBD and judgment mediated by call frequency (moderated mediation).
The following hypotheses were examined:
1. Students with EBD are given worse performance judgments than students without EBD with the same performance level.
1a. The relationship between EBD and judgments is mediated by the frequency with which a student is called on.
2. The assumed negative effect of EBD on judgments increases under stress.
3. The assumed negative effect of EBD on judgments via call frequency differs depending on stress level.
3a. The relationship between EBD and call frequency differs depending on stress level.
3b. The relationship between call frequency and judgments differs depending on stress level.
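
The hypotheses can be read as paths in a standard moderated mediation model. The following is a generic sketch and not the specification reported by the authors. All symbols are notation assumed here: X is the EBD label (0/1), W the stress condition (0/1), M the call frequency, Y the judgment (estimated percentage correct or grade), and P the student's actual performance, entered as a covariate. The sketch ignores that students are nested within participants.

```latex
\begin{align}
M &= a_0 + a_1 X + a_2 W + a_3 XW + e_M \\
Y &= b_0 + c' X + c_2 W + c_3 XW + b_1 M + b_3 MW + b_4 P + e_Y
\end{align}

% Conditional direct effect of EBD on judgments:
%   c' + c_3 W
% Conditional indirect effect of EBD on judgments via call frequency:
%   (a_1 + a_3 W)(b_1 + b_3 W)
```

In this notation, Hypothesis 1 concerns c', Hypothesis 1a the product a_1 b_1, Hypothesis 2 the interaction c_3, Hypothesis 3a the interaction a_3, and Hypothesis 3b the interaction b_3.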


The sample consisted of N = 102 pre-service teachers enrolled in either a bachelor's or master's program.
The experimental group, which received the stress manipulation, consisted of n = 56 participants, and the control group consisted of n = 46 participants.
Participants were enrolled in teacher education programs for different school subjects (e.g., science, mathematics, languages) and were in their fifth semester of studies on average at the University of Kiel, Germany. There were no significant differences between the experimental group and control group with regard to age, sex, or semester of studies.
Participants were randomly assigned to the experimental or control group.

Stress induction
In the experimental group, stress was induced using the Trier Social Stress Test (TSST; Kirschbaum et al., 1993).
The TSST is a well-evaluated and widely used instrument for the controlled induction of stress (Papen et al., 2017; Smith et al., 2018).
After completing the TSST, participants were taken to the computer lab to fill in a questionnaire and complete the lesson in the simulated classroom.
In the control group, no stress was induced.
The participants were taken to a room where they were offered cookies and magazines and were asked to wait there for the beginning of the experiment.
They were told that the experiment still needed to be prepared and sufficient time had been scheduled.
This was intended to prevent any inadvertent stress.
As in the experimental group, participants in the control group were taken to the computer lab after 15 min.

Assessment of stress level
In the computer lab, all participants were asked to first fill in a questionnaire concerning their perceived stress level and then complete the lesson in the simulated classroom.
The participants were asked to state how stressed they felt on two scales.

Simulated classroom
The stress level assessment was followed by a lesson in a simulated classroom (see Fiedler et al., 2002).
The simulated classroom is a computer-based instrument that has been previously used to experimentally vary factors that influence judgment accuracy in a classroom setting (e.g., Kaiser et al., 2017; Kramer & Zimmermann, 2020; Südkamp et al., 2008).
Participants complete a lesson with simulated students to gather information about the students' performance.
Different categories of questions can be posed to the class, and the participants can call on specific students to answer each question.
After the lesson, the participants are asked to assess each student's performance.

The study design incorporated a two-level between-subjects factor and a two-level within-subjects factor.
The between-subjects factor was the moderator stress level, which resulted from the stress manipulation via the TSST (Kirschbaum et al., 1993), with participants randomly assigned to the experimental group (stress condition) or the control group (no stress condition).
The within-subjects factor stemmed from the independent variable special educational needs within the simulated classroom, which was categorized into EBD vs. no special education support needed.
The second independent variable within the simulated classroom was students’ actual performance in mathematics, which served as a control variable (within-subjects factor).
Both independent variables were systematically varied by programming six classrooms (participants were randomly distributed to one of these), so that students with EBD were represented at all performance levels across all participants.
The dependent variables were the estimated percentage of correct responses and an assigned grade.
The program automatically counted the mediator call frequency.

Results and discussion
The main aim of the present study was to experimentally investigate the effect of students' EBD on pre-service teachers' judgments using the DiaCoM model (Loibl et al., 2020) as theoretical framework.
A further aim was to investigate the mediating role of call frequency and the moderating role of pre-service teachers' stress.
More specifically, the authors investigated whether students labelled as having EBD would be differently assessed compared to students without special educational needs.
They assumed the frequency with which a student was called on as a possible mediator between EBD and judgments.
As students’ EBD is a frequent source of teacher stress, they further investigated whether stress moderates the effect of EBD on judgments via call frequency.
In accordance with their first hypothesis, they found a negative effect of EBD on estimated percentage correct as well as on grades.
Students with EBD were judged to have lower performance than their peers without special educational needs at the same level of actual performance.
This result is comparable to other studies, predominantly field studies, demonstrating that teachers systematically underestimate the performance of students with disruptive or externalizing behavior, which resembles EBD in many key features, compared with students without behavioral problems (Bennett et al., 1993; Zimmermann et al., 2013).
According to the continuum model (Fiske & Neuberg, 1990), focusing on the EBD label may lead teachers to engage in more automatic information processing.
Teachers may associate EBD with poor school performance and make judgments based on generalized representations activated by the salient information about the students' special needs.
This could result in lower performance judgments for students with EBD compared to students without EBD.
Indeed, this might even take place unconsciously: in this study, the EBD label influenced not only participants' assigned grades, which leave more room for including behavioral judgments, but also direct performance judgments in the form of estimates of a student's percentage of correct responses.
When interpreting the results, it must be kept in mind that students with EBD were judged to have lower performance than students without EBD, but were actually judged more accurately overall in relation to their actual performance.
This finding suggests that a distinction must be made between biased judgments and judgment accuracy:
The authors showed in their study that judgments for students with EBD were biased relative to those for students without EBD, as the additional information “EBD” was integrated into performance-related judgments of the former group of students.
However, a closer look at the level component of judgment accuracy shows that the performance of students with EBD was judged more accurately: Across all students, the average performance level was 50% correct answers.
Participants estimated an average percentage correct of 51% for students with EBD and 58% for students without EBD.
Thus - as is generally the case - teachers tended to slightly overestimate students' performance in terms of the level component of judgment accuracy (Jenkins & Demaray, 2016; Urhahne & Wijnia, 2020).
This slight overestimation is considered conducive to learning, as performance gains are particularly likely when tasks are slightly above students’ actual performance levels (Hattie, 2009; Urhahne et al., 2011).
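
The arithmetic behind the level component can be made explicit with a minimal sketch. It uses only the percentages reported above and assumes that the overall average of 50% correct applies to both groups, which is plausible because students with EBD were represented at all performance levels. The function and variable names are illustrative and do not come from the study.

```python
# Level component of judgment accuracy: mean judged performance minus
# mean actual performance, in percentage points.
# Assumption: the overall actual mean (50%) holds for both student groups.


def level_component(mean_judged: float, mean_actual: float) -> float:
    """Return the signed over- or underestimation in percentage points."""
    return mean_judged - mean_actual


ACTUAL_MEAN = 50.0  # average actual percentage correct across all students

# Mean estimated percentage correct per group, as reported in the summary.
judged_means = {"EBD": 51.0, "no EBD": 58.0}

level = {
    group: level_component(judged, ACTUAL_MEAN)
    for group, judged in judged_means.items()
}

for group, diff in level.items():
    print(f"{group}: {diff:+.0f} percentage points")
# prints:
# EBD: +1 percentage points
# no EBD: +8 percentage points
```

Both values are positive, so both groups were overestimated, but the value for students with EBD lies closer to zero. The two values differ by 7 percentage points, which corresponds to the lower judgments for students with EBD described above.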
Not only did EBD have a direct effect on judgments, the authors also found an indirect effect of EBD on judgments via call frequency, in accordance with Hypothesis 1a.
Students with EBD were called on more often than students without EBD, and the more often a student was called on, the more accurately they were judged, but also the lower the judgments they received compared to students who were called on less frequently.
The relationship between call frequency and judgment accuracy is in line with Karst and Bonefeld (2020), who investigated immigrant background as a student characteristic.
Not only did EBD have an impact on the call frequency, but the call frequency also influenced pre-service teachers' judgments.
The more often a student was called on and thus the more information the teacher had, the more accurately (but lower compared to other students) their performance was judged.
Thus, in accordance with the authors’ assumptions, more information led to more accurate judgments.
Calling on students with EBD more frequently gives teachers the opportunity to differentiate within a class and provide individualized support to students with EBD. However, a negative side effect of being called on more frequently and thus receiving accurate judgment is that students with EBD do not benefit from the same slight overestimation as their classmates.
The second hypothesis could not be confirmed: Students with and without EBD were not evaluated differently by stressed and unstressed participants.
It can be assumed that the stress induced in this study did not limit the participants' cognitive capacity to a sufficient extent to lead to predominantly automatic information processing.
The participants in this study were thus able to assess students’ performance in spite of stress in a routine, professional way.
In accordance with Hypothesis 3, pre-service teachers’ stress level moderated the mediation of the relationship between EBD and judgments by call frequency.
While there was no moderating effect of stress on the relationship between call frequency and judgments (Hypothesis 3b), stress moderated the relationship between EBD and call frequency (Hypothesis 3a): Unstressed participants called on students with EBD more often than students without EBD, while stressed participants called on students with and without EBD equally frequently.

Bennett, R. E., Gottesman, R. L., Rock, D. A., & Cerullo, F. (1993). Influence of behavior perceptions and gender on teachers' judgments of students' academic skill. Journal of Educational Psychology, 85(2), 347-356.
Fiedler, K., Walther, E., Freytag, P., & Plessner, H. (2002). Judgment biases in a simulated classroom. A cognitive-environmental approach. Organizational Behavior and Human Decision Processes, 88(1), 527-561.
Fiske, S. T., & Neuberg, S. L. (1990). A continuum of impression formation, from category-based to individuating processes: Influences of information and motivation on attention and interpretation. Advances in Experimental Social Psychology, 23, 1-74.
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.
Jenkins, L. N., & Demaray, M. K. (2016). Teachers' judgments of the academic performance of children with and without characteristics of inattention, impulsivity, and hyperactivity. Contemporary School Psychology, 20, 183-191.
Kaiser, J., Südkamp, A., & Möller, J. (2017). The effects of student characteristics on teachers' judgment accuracy: Disentangling ethnicity, minority status, and performance. Journal of Educational Psychology, 109(6), 871-888.
Karst, K., & Bonefeld, M. (2020). Judgment accuracy of pre-service teachers: The influence of attention allocation. Teaching and Teacher Education, 94.
Kirschbaum, C., Pirke, K. M., & Hellhammer, D. H. (1993). The ‘Trier Social Stress Test’ - a tool for investigating psychobiological stress responses in a laboratory setting. Neuropsychobiology, 28(1-2), 76-81.
Kramer, S., & Zimmermann, F. (2020). Zum Einfluss von störendem Schülerverhalten im Unterricht auf Leistungsbeurteilungen: Explizite Einschätzungen und experimentelle Befunde [Influence of disturbing classroom behavior on judgments: Explicit estimates and experimental findings]. Zeitschrift für Pädagogische Psychologie, 34, 99-115.
Loibl, K., Leuders, T., & Dörfler, T. (2020). A framework for explaining teachers' diagnostic judgements by cognitive modeling (DiaCoM). Teaching and Teacher Education, 91.
Papen, M. C., Niemand, T., Siems, F. U., & Kraus, S. (2017). The effect of stress on customer perception of the frontline employee: An experimental study. Review of Managerial Science, 13, 1-23.
Smith, A. M., Dijkstra, K., Gordon, L. T., Romero, L. M., & Thomas, A. K. (2018). An investigation into the impact of acute stress on encoding in older adults. Aging, Neuropsychology, and Cognition, 26(5), 749-766.
Südkamp, A., Möller, J., & Pohlmann, B. (2008). Der Simulierte Klassenraum. Eine experimentelle Untersuchung zur diagnostischen Kompetenz [The simulated classroom: An experimental study on diagnostic competence]. Zeitschrift für Pädagogische Psychologie, 22, 261-276.
Urhahne, D., Chao, S. H., Florineth, M. L., Luttenberger, S., & Paechter, M. (2011). Academic self-concept, learning motivation, and test anxiety of the underestimated student. British Journal of Educational Psychology, 81(1), 161-177.
Urhahne, D., & Wijnia, L. (2020). A review on the accuracy of teacher judgments. Educational Research Review, 32.
Zimmermann, F., Schütte, K., Taskinen, P., & Köller, O. (2013). Reciprocal effects between adolescent externalizing problems and measures of achievement. Journal of Educational Psychology, 105, 747-761.

Updated: Mar. 10, 2022