Zócalo Public Square is a magazine of ideas from Arizona State University Knowledge Enterprise.
Imagine one morning, coffee in hand, you head to the website of your local newspaper, type in your name, and up pops how you rank in relation to your colleagues at work. The ranking is based on some mysterious statistical model but the message is clear—you don’t measure up. Now imagine the sting of public humiliation when you run into your neighbors, colleagues, and family later that day.
This was the reality for Los Angeles Unified School District (LAUSD) elementary school teachers in August 2010, when the Los Angeles Times published a database with thousands of teachers’ names alongside a measure of how much “value” each had added to their students’ standardized test scores.
The fallout from the Times database, which relied solely on students’ math and English scores on the California Standards Tests, has been national in scope. The National Education Policy Center re-analyzed the data, using a different statistical model, and found that 54 percent of teachers in the Times’ database fell into a different effectiveness category. Calling the Times’ release “reckless,” the Center’s analysis joined a maelstrom of critiques and legal battles that continue today.
A notable calm in this storm is the near universal acceptance that teaching quality should be evaluated using multiple measures. These include observing teaching practice over time, asking students to report on the quality of their experience, and analyzing the rigor of assignments. The work of teaching is complex, as is the resulting arc of student learning. In order to capture what’s really happening in classrooms, we need a variety of tools to mitigate the error associated with any one measure. But, as schools and districts are discovering, the devil is in the details. Creating a system for collecting, analyzing, and using multiple data points to promote teacher learning and growth requires infrastructure, reliable measures, hours of administrative and teacher time, technical expertise, and, above all, faith and trust in the process.
Building that faith and trust is no easy feat. Many teachers in Los Angeles are wary of the centralized systems under construction to monitor their performance and enhance their growth. The Times’ misstep was followed by massive LAUSD layoffs due to budget cuts, and, in 2012, by the acrimonious Vergara v. California lawsuit, which continues to focus public attention on how to fire bad teachers. In this context, it takes strength and courage to open up your classroom door and invite others in to evaluate the quality of your practice.
That’s just what a group of teachers at one Los Angeles public school is doing. For the past five years, teachers at the UCLA Community School, in the central city neighborhood of Koreatown, have been mapping out their own process of evaluation based on multiple measures—and building both a new system and their faith in it.
As one of Los Angeles’ 50 “pilot schools”—district schools with charter-like autonomy to innovate—this school is the only one trying to create its own teacher evaluation infrastructure, building on the district’s groundwork. As the school’s research director, I’ve helped support data collection and analysis, but the evaluation process is owned by the teachers themselves.
Indeed, these teachers embrace their individual and collective responsibility to advance exemplary teaching practices and believe that collecting and using multiple measures of teaching practice will increase their professional knowledge and growth. They are tough critics of the measures under development, with a focus on making sure the measures help make teachers better at their craft.
When it came to student surveys, for instance, teachers added questions that were open-ended, pressing students to explain how the teachers could improve. Students made a variety of helpful suggestions, such as asking for more explanation of math strategies. Teachers also received scores in areas such as academic challenge and classroom engagement, which were further broken down by student groups. For example, a simple bar graph allowed teachers to see whether struggling students felt as supported or challenged as their high-achieving peers. I met with a few teachers and was impressed to hear them reflect on how they could better reach failing students. One teacher was moved to tears looking at her scores, remarking, “These are my students talking to me.” Throughout this feedback process, I was struck by how much teachers appreciated external, trustworthy data on their daily practice.
In addition to student surveys, the school’s principal and assistant principal spent hours observing the teachers’ classrooms, documenting their instructional moves and practices and later debriefing what went well and what could be improved. Teachers also assembled a portfolio containing an assignment they gave students, how they taught this assignment, and samples of the student work produced. This portfolio was scored by educators trained at UCLA to assess teaching quality on several dimensions, including academic rigor and relevance. Teachers then completed a reflection on the scores they received, what they learned from the data, and how they planned to improve their practice.
After receiving these three different kinds of data—student surveys, observations, and portfolio assessments—almost all teachers reported in a survey that they appreciated receiving multiple measures of their practice. Most teachers reported that the measures were a fair assessment of the quality of their teaching, and that the evaluation process helped them grow as educators. But there was also consensus that more information was needed to help them improve their scores. For example, some teachers wanted to know how to make assignments more relevant to students’ lives; others asked for more support reflecting on their observation transcripts.
Perhaps the most important accomplishment of this new system was that it restored teachers’ trust in the process of evaluation. Very few teachers trust that value-added measures—which are based on tests that are far removed from their daily work—can inform their improvement. This is an issue explored by researchers who are probing the unintended consequences of teacher accountability systems tied to value-added measures (such as the formula used by the L.A. Times). For example, Harvard researcher Susan Moore Johnson cautions that value-added evaluation methods may reduce trust and undermine collaboration, affirming schools as egg-crate organizations where teachers work in isolation. We know that schools flourish when the adults inside are working together, not apart. Long-term research on school reform affirms the central role that relational trust and respect play in improving schools.
The L.A. Times database and other rankings miss the most important qualities of great teachers. Great teachers open their classroom doors and make their practice public. And they trust their colleagues and others to tell them when they are calling on some students over others, to point out when their lessons don’t challenge all students, or to suggest ways to enliven classroom discussions. Embracing and acting upon this sort of feedback takes courage and isn’t easy, especially in today’s education climate. But focusing public attention on teacher learning and growth is the best route to restoring trust in teacher evaluation. That’s a story worth sharing with your neighbors.
Karen Hunter Quartz is research director at the UCLA Community School and a faculty member in the UCLA Graduate School of Education and Information Studies. She wrote this for Thinking L.A., a partnership of UCLA and Zócalo Public Square.