Can we grade the teachers?

The Los Angeles Times has announced it will publish a database to measure the effectiveness of each individual teacher in the Los Angeles Unified School District, the second-largest in the nation.

The newspaper obtained seven years of math and English test scores and applied a statistical method called value-added analysis.  This measured the improvement in each student’s performance from one school year to the next.  As the LA Times said, this controls for poverty, race, English proficiency and other factors blamed for poor school performance.
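To make the idea concrete, here is a minimal sketch of a gain-score calculation.  It is not the Times’s actual model, which is surely more elaborate; the teacher names, the scores and the `value_added` function are all hypothetical, invented for illustration.  It simply averages each teacher’s year-over-year student gains and compares that with the average gain across everyone.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher, student's score last year, score this year).
records = [
    ("Teacher A", 40, 55),
    ("Teacher A", 70, 80),
    ("Teacher B", 45, 40),
    ("Teacher B", 85, 84),
]

def value_added(records):
    """Average score gain per teacher, relative to the average gain overall."""
    gains = defaultdict(list)
    for teacher, before, after in records:
        gains[teacher].append(after - before)
    # Average gain across all students, regardless of teacher.
    overall = mean(g for gs in gains.values() for g in gs)
    return {teacher: mean(gs) - overall for teacher, gs in gains.items()}

scores = value_added(records)
for teacher, score in scores.items():
    print(f"{teacher}: {score:+.2f} points relative to the average gain")
```

Because each student is compared with his or her own earlier score, a teacher whose students start out behind is not penalized for that starting point, which is the sense in which the method is said to control for background factors.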

This is what the reporters concluded:

  • Highly effective teachers routinely propel students from below grade level to advanced in a single year.  There is a substantial gap at year’s end between students whose teachers were in the top 10% in effectiveness and the bottom 10%.  The fortunate students ranked 17 percentile points higher in English and 25 points higher in math.
  • Some students landed in the classrooms of the poorest-performing instructors year after year — a potentially devastating setback that the district could have avoided.  Over the period analyzed, more than 8,000 students got such a math or English teacher at least twice in a row.
  • Contrary to popular belief, the best teachers were not concentrated in schools in the most affluent neighborhoods, nor were the weakest instructors bunched in poor areas.  Rather, these teachers were scattered throughout the district.  The quality of instruction typically varied far more within a school than between schools.
  • Although many parents fixate on picking the right school for their child, it matters far more which teacher the child gets.  Teachers had three times as much influence on students’ academic development as the school they attend.  Yet parents have no access to objective information about individual instructors, and they often have little say in which teacher their child gets.
  • Many of the factors commonly assumed to be important to teachers’ effectiveness were not. Although teachers are paid more for experience, education and training, none of this had much bearing on whether they improved their students’ performance.

It would be a mistake to use these tests as a basis to reward and punish.  Instead, every elementary teacher in the Los Angeles area should be given a day off during the school year to observe a high-performance teacher.  Then at the end of the year, teachers should go on a retreat to talk about what they’ve learned.

The way you improve in industry is to learn the best practices, adopt them and figure out how to improve upon them.  But in the public school system, there is no systematic way that you learn the best practices.  Each teacher operates in isolation.

Michael O’Hare, professor of public policy at the University of California at Berkeley, put the case this way:

What the LA performance data does is highlight a batch of teachers at the top of the data whose classrooms need to be visited by their peers, perhaps by videotape, and discussed. The point is not that everyone should be completely focused on increasing these test scores, but that a successful record at that measurable result is a good (not perfect) indicator of teaching practices that, if observed and discussed, will lead to better outcomes for students on a variety of dimensions. …

Teachers never see each other work, almost never get to talk to each other about individual students, and have practically no opportunity for the core practice of quality assurance, which is observing and then discussing a particular practice, comparing alternatives, in a group of peers. …

via The Reality-Based Community.

A.J. Duffy, president of United Teachers Los Angeles, objected to the survey.  I think he is wrong, but I agree there is a danger the numbers will be misused.  The more test scores are used as a basis to reward and punish, the greater the temptation to manipulate the results.  I’m reminded of one of the favorite sayings of W. Edwards Deming, the father of Total Quality Management – that if you give a manager a numerical target, he’ll make it, even if he has to destroy the company in the process.

There also is a danger of taking one measurement and assuming it is the only thing that matters.  School performance is affected by many things, including the backgrounds of the students, how many days a year the schools are open, the textbooks and facilities, and whether having a diploma will in reality make a difference in getting a job.

Deming was a statistician, and he was aware of how statistics can be abused.  One abuse is judging people by how they rank (1st, 2nd, 3rd … last) rather than by their distance from the goal or from the average.  The performance of most people in most things is very similar, Deming said.  When someone’s performance is so outstanding that they are off the charts, they should be put to work instructing others.

My mother was a public school teacher in the primary grades for more than 40 years.  The affection and respect of her former students show she was an excellent teacher.  She developed her own teaching methods, which were private to her. When a supervisor came to observe her work, she would switch to whatever instructional method was in favor that year.  What a legacy she could have left if there had been a means by which she could have shared her experience and knowledge with others!

Click on Who’s teaching L.A.’s kids? for the full Los Angeles Times article.

Click on Teacher performance data and its discontents for Michael O’Hare’s full comment.

Click on Dukenfield’s Law of Incentive Management for the pitfalls of linking financial incentives to test results.

[P.S. 12/31/10]  Michael O’Hare in a later comment specifically related teacher evaluation to the Deming philosophy.

Deming – brilliant, tough-minded, and humane –  demonstrated that if you reward individual workers for performance, you are going to be rewarding random variation a lot of the time, with poisonous effects.  Right away, when the top salesman among twenty gets a trip to Hawaii with his wife, the response of the other nineteen is not to emulate him (and how could they, if they don’t see what he does, which is the case for teachers in spades), but to be pissed off and jealous, which is, like, really great for collaborative enterprise.  Next year, regression toward the mean sets in and he is only number five, or ten, so he looks like a slacker, coasting on his laurels. Even his wife starts giving him the fish eye; don’t be surprised if his lunch martini count starts to go up.

It is a universal, desperate, desire of lazy or badly trained managers to find a mechanistic device you can wind up like a clockwork, loose upon the organization, and go play golf.  Like testing and firing to get people to do good work.  Please, Lord, show me the way to manage without any actual heavy lifting!  But many desires, no matter how desperately we cleave to them, are not fated to be fulfilled, and this is one.  Teaching, like any complex production process, will get better when teachers watch each other work and talk about what they are doing, why, and how it works; what to watch is usefully indicated by statistical QA methods.  Period.

via The Reality-Based Community.
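O’Hare’s point about rewarding random variation can be illustrated with a small simulation.  This is a sketch under a deliberately extreme assumption, namely that all twenty salespeople have identical underlying skill, so that each year’s results are pure chance; the function name and parameters are hypothetical.

```python
import random

def average_next_year_rank(n_people=20, trials=10_000, seed=0):
    """Average year-2 rank of whoever ranked first in year 1.

    Assumption for illustration: everyone has identical underlying skill,
    so each year's result is pure random variation.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        year1 = [rng.gauss(0, 1) for _ in range(n_people)]
        year2 = [rng.gauss(0, 1) for _ in range(n_people)]
        # The person who "wins the trip to Hawaii" in year 1.
        winner = max(range(n_people), key=lambda i: year1[i])
        # Rank 1 is best: count how many people beat the winner in year 2.
        rank = 1 + sum(1 for s in year2 if s > year2[winner])
        total += rank
    return total / trials

avg_rank = average_next_year_rank()
print(f"Average next-year rank of this year's top performer: {avg_rank:.1f}")
```

Under that assumption, last year’s winner lands on average near the middle of the pack the following year.  Real workers do differ in skill, so the real effect is weaker, but the more chance contributes to a single year’s results, the more a prize for first place rewards luck.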

[P.S. 3/10/11]  However, I do agree with this observation by Conor Friedersdorf on the American Scene web log, and I think W. Edwards Deming would, too.

Is it difficult to develop a precise metric for ranking every teacher in a school from highest performing to lowest performing in order to divide up compensation by merit? Yes, very tough indeed. In extreme circumstances, however, it is very easy to evaluate teacher performance. Say that there’s a student at your school who attempts suicide, and on his first day back, one of his teachers tells him, “Carve deeper next time – you can’t even kill yourself.” Or imagine another teacher who is caught keeping a stash of marijuana, pornography, and vials with cocaine residue on school grounds. Ponder a case where a male middle school teacher is observed lying on top of a female student in shop class. Or a special education teacher who fails to report child abuse, yells insults at children, and inadequately supervises her class. These aren’t hyperbolic examples crafted to make a theoretical point that has little bearing on the real world. These are actual examples of misbehavior by Los Angeles Unified School District teachers who weren’t fired!

via The American Scene.

In a large group, there may be a couple of outstanding performers who are in a different category from all the rest, and there may be a couple who are grossly incompetent or worse.  But it is always obvious who they are.  You don’t need a complicated evaluation process to identify them.

[P.S. 5/7/11]  Click on The Testing Machine for an article by Barbara Renaud Gonzales in The Texas Observer about how high-stakes testing works at one Texas middle school.  (Hat tip to Steve B for sending me the article.)

The school is testing for the Texas Assessment of Knowledge and Skills (TAKS) benchmarks before the real TAKS test, which determines which students progress to the next grade.  The tests, administered by the Texas Education Agency, also determine how the school is rated academically. Benchmark testing is supposed to help schools project how students will perform on the actual TAKS. If too many students fail in the spring, the principal’s job, along with everyone else’s in the administration, is at stake. The school I’m visiting is considered at-risk for being labeled “low performing.”

The school district, out of desperation, has contracted with a prestigious university, my employer, to help the teachers in math, reading and science.  I’m here to gather data about attendance, behavior and grades—key to researching how to reduce dropouts.

At my university, researchers have spent almost 15 years examining the complexities of student success in at-risk schools.  We have found that standardized tests like the TAKS are not predictors for high school graduation. Students flunk the TAKS for reasons other than academic skills.  Some have oh-my-God! panic attacks.  Some, like the dyslexic Albert Einstein, can’t perform well on tests.  Many progressive educators believe that standardized tests should enhance the curriculum, not punish students by failing them. … …

Because of my job, I get to observe the different seventh-grade classes.  There are more than 30 students in most of the math and science classes, and the teachers try hard to ignore whispering, jostling and paper-shuffling.  One-third of the class seems to be at risk of failing because of emotional and academic problems.  Some are special education students who have been mainstreamed.  Some are wannabe gang members.  Some are just bored.  The teachers must get through their lessons in 45 minutes and don’t seem to breathe the whole time.   They are absorbed in their LCD boards, their colorful markers, swooping through the fractions and formulas once and again.  They give tips and shortcuts for solving the math problems likely to come up on the TAKS.  Pay attention!  The front of the class is quiet, but the back third is buzzing at the end of the day.  My university’s master teachers are helping teachers keep students engaged with the coursework.  Play games, they tell the teachers.  Give real-life problems.  But the TAKS seems to be the dark cloud in their classroom.

The math teachers call the last period of the day “the class from hell.”

At the end of the third six-week period, in early January, I tell my university that it doesn’t seem right to me that the grade reports show only six seventh-graders out of 350—the target group we’re following—are failing math.  I’ve been in those classrooms, observed how one-third aren’t paying attention.  How can this be?  I’m a product of working-class public schools and know how easy it is to fall behind in math. … …

At the end of the school year, seven of 357 seventh-graders have failed math, according to the official roster.  Six have failed English and reading.  The school meets the TAKS standards and receives a “recognized” rating.  With the help of master teachers, after-school and Saturday morning tutorials, 70 percent of the students have met the standard in math.  But do they know how to solve a problem that’s not on the TAKS?


84 Responses to “Can we grade the teachers?”

  1. Tomcat in the red room. Says:

    The problem with this is that, if one pupil of a so-called brilliant teacher fails, then that pupil’s parents will start questioning the results, demanding the teacher be re-assessed. Less than perfect teachers will feel threatened, parents won’t want anybody but the so-called best teaching their kids, etc. etc.


    • badkidsgoodgrammar Says:

      It seems to happen in private schools. I know of two teachers that were reprimanded (one eventually quit) because they, coming from the public system, expected students to live up to the expectations that were set. Apparently the parents and the administration expected them to be graded according to the amount of money that was spent on their education.


  2. bilal Says:

    i love this


  3. philebersole Says:

    I think that test results compiled by the Los Angeles Times, which measure average improvement in reading and math tests over the school year, are a good rough measure of teachers’ effectiveness looking backward, but not necessarily looking forward.

    When the tests were given, the results were what they were. From now on, when the tests are given, the teachers will know they are going to be judged on the basis of the results, which means they have an incentive to do things – not necessarily cheat – that will game the results.

    It is the same with student evaluations of teachers. They are useful information, provided they are not used as a basis to reward and punish.

    One of W. Edwards Deming’s 14 points is *8. DRIVE OUT FEAR*. If people are afraid that information relating to their performance will be taken down and used against them, they won’t gather that knowledge, and they’ll resist those who do.


  4. philebersole Says:

    Let me share observations about journalism, based on 40 years working on newspapers, which may be relevant to public school teaching.

    Most people who go into journalism do it for love of the work, at least at first. I never knew anybody who went into journalism because they hoped to get rich. But I knew a large number who left the field because they couldn’t earn enough money to support their families.

    The very best reporters have talent and dedication that few people could match. But anybody of reasonable intelligence and a reasonable work ethic can learn to be an adequate reporter and writer.

    Most of us in the newsroom would have agreed on who the best reporters were. But I don’t know of any metric – number of column-inches of published copy, number of stories on Page One, results of readership surveys, number of journalism awards – that would have captured this.

    I worked for many years under a merit pay system. I can honestly say that I never gave a thought, in any decision concerning my work, as to how it would affect my next pay raise. I think this was pretty much true of everybody in the newsroom.

    My greatest motivators were personal satisfaction, the praise and comments of readers and the respect of my peers. But then, unlike reporters today, I didn’t work in fear of losing my job.

    Some (not all) of the best reporting was done against the indifference and sometimes the resistance of newspaper management.

    Journalism-bashing and public hostility did not motivate us to improve. Instead it made us reporters defensive and resistant to change.


  5. familynurturingtree Says:

    My thoughts lean toward appreciative coaching or appreciative inquiry for a teacher and the school to explore those best practices, celebrate what’s working now, and instead of troubleshooting or punishing, inspire the teachers to be what they want to be, which will carry over to inspiring their students and create a whole new learning environment.

    For more information on appreciative coaching visit: the “services” page at

    All Blessings,


  6. homo symbolicus Says:

    Well, I am a teacher too and I am full of passion about it. I actually see it as my #1 social justice issue and by the way, sorry for my previous post full of errors and typos. I quickly posted my comment without rereading it while I was doing three other things 😉
    > Cameron Evans: Your central argument is can we grade the teachers? The real question is do we need to study the performance of teachers post-mortem?
    > mandymcadoo: The results also probably show which teachers spend the most time teaching test-taking methods and strategies.
    true, I have never been a good test taker myself, but hey! what would the other option be? What would be wrong with this? I never learned to be a good test taker, even though I have always academically excelled
    > cheneetot08: I remember way back when I was still schooling before classes ends, students are given the opportunity to rate their teachers and give comments on how to improve their teaching skills.
    Very helpful too
    > ladyjustine: There is still no evidence that ‘league tables’ in England have caused any difference to standards of teaching. Schools’ results have increased, but it has led to a narrowing of the curriculum, as only the things that can be measured are taught, and many other things are sacrificed.
    This can obviously be changed in an easy way, and it is not a problem with “league tables” per se, but with how the idea was implemented. Similar things happen in the States, where kids are drilled in the things that matter for the schools to get a better evaluation. To me this is not exactly “cheating”, or let’s call it “white cheating”
    > ladyjustine: … As always, some minorities under-perform
    Exactly, and this is why I did propose such a system. The extra money you would get is based not on the students’ scores, but on “their difference that matters”, namely the difference from their previous scores
    > ladyjustine: It’s flawed, ridiculous and divisive.
    Not if implemented right. Let me tell you about an example of a similar system (from an educational point of view) from one of my best-known hells. I grew up in a very poor country (Cuba), and there they used to have a solid basic education (people have told me that is no more) and a scoring schedule that accompanied students all the way to their college application. The educational system was strictly standardized, especially the tests and scoring. You knew very well you were competing based -SOLELY- on your scores and on how many openings there were for the three professions you could apply for. It was also a totally open and nationwide process, and the results were posted on huge billboards that looked like this: profession + a running number for the profession/entry score (based on how many openings they had for that profession) + your name + your scores + where you lived/school from which you applied (for identification purposes)
    There was -almost- no cheating in that system. The only cheating that happened was that a few of the children of the nomenklatura would get in (in blah, blah kinds of professions such as “Journalism”, “Philology” and “Law”; not medical school, Physics or technical subjects), and everybody in the schools knew who they were because they very well knew that their names were not listed and they very much looked it 😉 You would not hear the end of it about how unfair they found it that those other ones had made it in through their preferential venues …
    > ladyjustine: … only those who have ‘the passion’ should be allowed in to the profession
    I don’t believe in those “new man” kinds of philosophies. Things can be improved if passion, greed, the regular you and me, … are given a fair chance
    > Tomcat in the red room: The problem with this is that, if one pupil of a so-called brillitan teacher fails …
    Obviously, just one or a few children would not make a difference in the system I propose
    > badkidsgoodgrammar: Apparently the parents and the administration expected them to be graded according to the amount of money that was spent on their education.
    > philebersole: … I worked for many years under a merit pay system
    Come on philebersole! A merit pay system for journalists?!?! Give me a break!!! What a joke, especially for Journalism/media in the States!!! How on earth could that be!!! As people tell me comparing newspapers in the States to the government-owned media in Cuba … OK that was an open, daylight lie, but is this the truth?!?!?


  7. homo symbolicus Says:

    special talents + “cheating”, being good at test taking, true thing …
    I do believe Math/Sciences + L1s are important, but people have their own talents they want to professionally develop …
    Once you got your ticket to the University/trade school you applied for, you would have to sit extra exams and a special talents exam. Say if you were a flute player you would have to come with your flute and play in front of three students of the school and a teacher, and if you were applying to Literaturwissenschaft they wanted to hear about something you had written yourself and other related and unrelated etceteras … 😉
    I know of one case of someone trying to get her rear end into performance arts at the Superior Institute of Art and she was denied three times even though she had passed all other exams (regular exams you didn’t have to repeat and you could reapply next year only for the special talents one). I also know of cases of girls from remote mountainous places where they don’t even have electricity, who, out of plain hard work/studies and enthusiasm, got to Med School. My mother is an exceptionally good teacher so I got naturally educated, … call me whatever you want, but I really, really, really loved to see that these other kids who were not lucky enough to have a teacher as their mother still got a fair chance


  8. J Says:

    I think that being able to grade teachers is something that should be implemented all over. In every job you have, you have some type of evaluation to rate your performance. Teachers should be tested biannually for their effectiveness.


    • Greg Camp Says:

      But what will be the standard of evaluation? Is there one way to evaluate teachers across different communities, different subjects, and different groups of students? Standardization worries me because it will mean the kind of one-size-fits-all, mechanical teaching that No Child Left Behind has been promoting.


      • bryanreece Says:

        I am certain we could develop rigorous standards and processes for evaluating teaching. We have been able to do this with regard to our research. Submitting a paper for publication to peer reviewed journals is a very rigorous process. Colleagues are very honest and professional with regard to their comments and we have agreed in higher education that the standards are legitimate. Research is no more personal or idiosyncratic than teaching. If we can develop and sustain ongoing evaluation of research then we should be able to develop and sustain ongoing evaluation of all academic practices (e.g., teaching).


  9. bryanreece Says:

    Great post and very interesting string of comments. I like the idea of grading or measuring teacher performance; however, I also agree with one of the previous posts, suggesting we need to measure all the other factors that impact student performance. How do we measure student engagement? Parental (significant other) support? Academic infrastructure? Learning support services? If we could combine these measures into a composite score, I think we could develop better strategies and policies for improving student performance. Does anyone know of a model like this? Is anyone conducting an analysis like this?


  10. Measuring Pedagogy | On Student Success Says:

    […] 2010 by bryanreece Just read an interesting post at Phil Ebersole’s Blog titled “Can we grade the teachers?.” Ebersole talks about the recent LA Times coverage of LAUSD teachers and the evaluation of […]


  11. LD Says:

    Bry, yes, it’s called the Midas touch, and that level of accountability makes the system cold and inhuman. Learner-centered assessment, curriculum assessment, formative and summative assessments, quantitative analysis, dynamic quantitative evaluation and comparative analysis are all used in some way in education. But what really works well is using good teachers instead of the hacks HR offices across the state have been employing to fulfill staffing requirements. They are easy to spot in an interview. They smile a lot to the point of getting a headache. They only speak the language of their training-taught rhetoric. And they agree with everything they are told about their new job. Assessment is not the problem.

