Source: Journal of Teacher Education, 63(5), p. 335-355, November/December 2012
(Reviewed by the Portal Team)
In this paper, the authors review the approaches taken in several states that have already estimated the effects of teacher preparation programs (TPP) and analyze the proposals for incorporating students’ test score gains into the evaluations of TPP by states that have received federal Race to the Top funds.
They developed a framework to focus on three types of decisions that are required to implement these new accountability requirements:
(a) selection of teachers, students, subjects, and years of data;
(b) methods for estimating teachers’ effects on student test score gains; and
(c) reporting and interpretation of effects.
This article focuses on the ongoing work to create empirically grounded estimates of TPP effectiveness using student test scores that has been mandated by federal and state policies.
The complexity behind estimating and reporting TPP effects on student test scores goes beyond setting up the sophisticated data systems that link program graduates to practicing teachers to student test scores.
The decisions required to tie TPP graduates to the achievement of their students must be made with an understanding of the consequences of each of the myriad decisions involved.
A series of selection, estimation, and reporting decisions is required, including determining which tests teachers can reasonably be held accountable for, which teachers should be included and which excluded, whether simpler and more transparent methods should be preferred, and how much information to report.
Understanding the options and their strengths and weaknesses, together with open dialogue about the insights and consequences of particular decisions, will lead to better estimates of TPP effectiveness.
From the analysis of the descriptions of the various state systems in place and under development, the authors can discern a few important conclusions about the current efforts to include measures of TPP effects on student test scores.
The overall goal of most of these efforts to date has been to obtain unbiased estimates of the overall effects of TPP on student achievement.
In place of the strict standard of random assignment, criteria can be established to guide the selection of better choices.
The authors believe that several criteria should be considered as states move forward with TPP effect estimates:
It will be important to validate the effects on student test scores estimated for TPP evaluations with other, independent measures of high quality instruction by TPP graduates, such as direct observation of teachers using observation instruments and student surveys that have been shown to measure high quality instruction.
One way of viewing fairness is that TPP are neither advantaged nor penalized by decisions of their graduates that are beyond the programs’ control.
Preparing teachers who choose to teach in challenging schools or to teach challenging students should not be a factor that affects the TPP effect estimates.
All value-added models (VAMs) that are being used for the evaluation of TPP attempt to isolate the effects of teachers and remove the influences on student test scores that are beyond the control of TPP.
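The adjustment logic behind the fairness criterion can be illustrated with a minimal sketch. This is not the model used by any state or by the authors; it is a simplified student-level regression on synthetic data, and every number in it (the three hypothetical programs, their "true" effects, the sorting probabilities, and the coefficients) is an assumption chosen for illustration. The sketch regresses current test scores on prior scores, a challenging-school indicator, and program indicators, then compares the adjusted program effect with the raw difference in mean scores.

```python
import random

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

def mean(values):
    return sum(values) / len(values)

# Hypothetical "true" program effects in test-score SD units (program 0 = reference).
TRUE_EFFECTS = [0.0, 0.05, 0.10]

rng = random.Random(0)  # fixed seed so the synthetic data are reproducible
rows, scores = [], []
by_program = {0: [], 1: [], 2: []}
for _ in range(20000):
    program = rng.randrange(3)
    # Assumed nonrandom sorting: graduates of program 2 more often teach
    # in challenging schools, a choice outside the program's control.
    p_challenging = 0.7 if program == 2 else 0.3
    challenging = 1.0 if rng.random() < p_challenging else 0.0
    prior = rng.gauss(0, 1) - 0.5 * challenging
    score = (0.7 * prior - 0.3 * challenging
             + TRUE_EFFECTS[program] + rng.gauss(0, 0.5))
    # Design row: intercept, prior score, school context, program dummies.
    rows.append([1.0, prior, challenging,
                 float(program == 1), float(program == 2)])
    scores.append(score)
    by_program[program].append(score)

beta = ols(rows, scores)
adjusted_effects = beta[3:]  # programs 1 and 2 relative to program 0
raw_gap = mean(by_program[2]) - mean(by_program[0])

print("raw gap, program 2 vs 0:", round(raw_gap, 3))
print("adjusted effects, programs 1 and 2:",
      [round(e, 3) for e in adjusted_effects])
```

In this synthetic setup, the raw comparison penalizes the program whose graduates teach in challenging schools, while the adjusted estimate lands near the assumed effect. Operational VAMs involve many further choices that this sketch omits, such as linking students to teachers, handling clustering within classrooms and schools, and pooling multiple years of data.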
Transparency is an obvious, but difficult, criterion to satisfy because the most transparent approaches may fail the tests of accuracy and fairness.
Clear and understandable explanations will need to be offered when complex and sophisticated estimation approaches are used.
The inclusiveness criterion suggests that the current testing regime that has been put into place largely to meet NCLB requirements may fall short in the longer term and omit significant subprograms of larger traditional TPP, such as early education programs or special education.
In addition, student test scores are only one of many measures that could be used to gauge the effects of these programs’ graduates on student outcomes.
Student engagement, graduation rates, and direct measures of high quality instruction are but a few of the potential measures that could be added to TPP evaluation for a more inclusive and comprehensive perspective on the program’s effects on students.
It is also important to note that the current efforts to incorporate student test scores into TPP evaluation partially fulfill only one of the four purposes of evaluation (accountability/oversight) and do not directly or necessarily address the other three: program improvement, assessment of merit and worth, and knowledge development.
Finally, the empirical work to date has established “proof of concept” for incorporating student test scores into the evaluation of TPP.
It shows that a sufficient signal can be found in student test scores to reliably attribute effects to individual TPP even with the “noise” of confounding variables and nonrandom sorting of teachers and students into schools and classrooms.