The Next Generation of Teacher Evaluation: Measuring Teachers

Teacher evaluation systems are not perfect. Here are some ways to make them better.

In this series of posts, we’ll be looking at what Michael Toth and Dr. Robert Marzano’s new book, Teacher Evaluation that Makes a Difference, from ASCD, has to say about the future of teacher evaluation and how it can be a tool for teacher growth and development.

In Texas, fewer than 3 percent of teachers received evaluation ratings below the “proficient” level, leading to the obvious question: What’s the point? That’s what Stacey Hodge, a middle school teacher in Dallas, wonders. She received “exceeding expectations” ratings in four of the state’s criteria and “proficient” in the rest, with no feedback as to areas in which she could improve.

“I can’t go any higher than exceeds expectations,” she said. “I think I do a really good job, but am I where I need to be?”

It should come as no surprise that, given the choice, most educators would choose a teacher evaluation system that focuses on both development and measurement, with development as the primary purpose. An informal survey of 3,000+ educators showed that 76 percent of respondents want teacher evaluation to support teachers rather than simply quantify their performance.

In Chapter 3 of Michael Toth and Dr. Robert Marzano’s new book, Teacher Evaluation that Makes a Difference, the authors take a close look at the characteristics of a teacher evaluation system that privileges development over measurement.

There are three primary characteristics of such a system:
1.  It’s comprehensive and specific, meaning it includes a wide variety of instructional strategies associated with student achievement, and it also identifies classroom behaviors at a very granular level, allowing for a high degree of focus when developing skills.
2.  It employs a developmental scale or rubric that articulates stages of skill development.
3.  It acknowledges and rewards teacher growth, scoring teachers not just on their current level of proficiency but also on the extent to which they reach their growth targets across a school year.

So how do we know we’re doing it right?
Determining the success of a teacher evaluation system is a challenge. Marzano and Toth address the two primary reasons why relying on classroom observations can be problematic: sampling error and measurement error.

Sampling error: Sometimes, the lesson being observed does not adequately represent a teacher’s typical behavior. Perhaps the teacher’s typical level of use of a specific strategy is not exhibited during that particular observation. Or maybe a particular strategy is not easily observed during a single class period. Also, different types of lessons require different strategies. And sometimes the full use of a strategy isn’t evident until the end of the lesson, so if the entire lesson isn’t observed, crucial data is missed.

Measurement error: When observers inaccurately identify the type of strategy a teacher is using, or when they misidentify the level at which a teacher is using a particular strategy, a teacher’s performance evaluation suffers. Recommendations for decreasing measurement error include using multiple raters to score the same video recordings of teachers, and using very specific cut-points in the scale used for observations.

Toth and Marzano go into greater depth and detail on how to reduce sampling and measurement error, which we will look at in Part II of this post. We’ll also consider what other sources of data can be factored into the profile of teacher competence in the classroom: teacher tests and student surveys.

Coming soon: Measuring Teachers’ Classroom Skills, Part II (Chapter 3 of Teacher Evaluation that Makes a Difference).
