
Can we grade the teachers?

August 24, 2010

The Los Angeles Times has announced it will publish a database to measure the effectiveness of each individual teacher in the Los Angeles Unified School District, the second-largest in the nation.

The newspaper obtained seven years of math and English test scores, and applied a statistical method called value-added analysis.  This measured the improvement in students’ performance in each school year from the previous year.  According to the LA Times, this controls for poverty, race, English proficiency and other factors blamed for poor school performance.
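As a rough illustration of the idea, and not the model the Times actually used, a teacher's value-added can be pictured as the average change in students' percentile rank from one year to the next. The function name and the sample numbers below are made up for the example.

```python
from statistics import mean

def value_added(prior_percentiles, current_percentiles):
    """Average year-over-year change in percentile rank for one teacher's students.

    A simplified stand-in for value-added analysis: each student is compared
    with his or her own score from the previous year.
    """
    if len(prior_percentiles) != len(current_percentiles):
        raise ValueError("each student needs a prior and a current score")
    gains = [now - before
             for before, now in zip(prior_percentiles, current_percentiles)]
    return mean(gains)

# Hypothetical class of four students: percentile ranks last year and this year.
prior = [40, 55, 30, 70]
current = [50, 60, 45, 75]
print(value_added(prior, current))
```

Because each student is measured against his or her own earlier score, a teacher with low-scoring students who improve can come out ahead of a teacher with high-scoring students who stand still.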

This is what the reporters concluded:

  • Highly effective teachers routinely propel students from below grade level to advanced in a single year.  There is a substantial gap at year’s end between students whose teachers were in the top 10% in effectiveness and the bottom 10%.  The fortunate students ranked 17 percentile points higher in English and 25 points higher in math.
  • Some students landed in the classrooms of the poorest-performing instructors year after year — a potentially devastating setback that the district could have avoided.  Over the period analyzed, more than 8,000 students got such a math or English teacher at least twice in a row.
  • Contrary to popular belief, the best teachers were not concentrated in schools in the most affluent neighborhoods, nor were the weakest instructors bunched in poor areas.  Rather, these teachers were scattered throughout the district.  The quality of instruction typically varied far more within a school than between schools.
  • Although many parents fixate on picking the right school for their child, it matters far more which teacher the child gets.  Teachers had three times as much influence on students’ academic development as the school they attend.  Yet parents have no access to objective information about individual instructors, and they often have little say in which teacher their child gets.
  • Many of the factors commonly assumed to be important to teachers’ effectiveness were not. Although teachers are paid more for experience, education and training, none of this had much bearing on whether they improved their students’ performance.

It would be a mistake to use these tests as a basis to reward and punish.  Instead, every elementary teacher in the Los Angeles area should be given a day off during the school year to observe a high-performance teacher.  Then at the end of the year, teachers should go on a retreat to talk about what they’ve learned.

The way you improve in industry is to learn the best practices, adopt them and figure out how to improve upon them.  But in the public school system, there is no systematic way that you learn the best practices.  Each teacher operates in isolation.

Michael O’Hare, professor of public policy at the University of California at Berkeley, put the case this way:

What the LA performance data does is highlight a batch of teachers at the top of the data whose classrooms need to be visited by their peers, perhaps by videotape, and discussed. The point is not that everyone should be completely focused on increasing these test scores, but that a successful record at that measurable result is a good (not perfect) indicator of teaching practices that, if observed and discussed, will lead to better outcomes for students on a variety of dimensions. …

Teachers never see each other work, almost never get to talk to each other about individual students, and have practically no opportunity for the core practice of quality assurance, which is observing and then discussing a particular practice, comparing alternatives, in a group of peers. …

via The Reality-Based Community.

A.J. Duffy, president of United Teachers Los Angeles, objected to the survey.  I think he is wrong, but I agree there is a danger the numbers will be misused. The more test scores are used as a basis to reward and punish, the greater the temptation to manipulate the results.  I’m reminded of one of the favorite sayings of W. Edwards Deming, the father of Total Quality Management – that if you give a manager a numerical target, he’ll make it, even if he has to destroy the company in the process.

There also is a danger of taking one measurement and assuming it is the only thing that matters.  School performance is affected by many things, including the backgrounds of the students, how many days a year the schools are open, the textbooks and facilities, and whether having a diploma will in reality make a difference in getting a job.

Deming was a statistician, and he was aware of how statistics can be abused.  One abuse is judging people by how they rank (1st, 2nd, 3rd … last) rather than by their distance from the goal or from the average.  The performance of most people in most things is very similar, Deming said.  When someone’s performance is so outstanding that they are off the charts, they should be put to work instructing others.
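The difference between rank and distance from the average can be sketched in a few lines. This is only an illustration: the scores are invented, and measuring distance in standard deviations is my assumption about what "distance from the average" would mean in practice.

```python
from statistics import mean, pstdev

def rank_and_distance(scores):
    """Return (name, rank, distance from average in standard deviations), best first."""
    avg = mean(scores.values())
    spread = pstdev(scores.values())
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    rows = []
    for rank, (name, score) in enumerate(ordered, start=1):
        # If everyone scored the same, nobody is any distance from the average.
        distance = 0.0 if spread == 0 else round((score - avg) / spread, 2)
        rows.append((name, rank, distance))
    return rows

# Hypothetical scores: five similar performers and one who is off the charts.
scores = {"A": 71, "B": 70, "C": 69, "D": 70, "E": 68, "F": 95}
for row in rank_and_distance(scores):
    print(row)
```

The ranking spreads A through E across second to last place even though they are within three points of each other. The distance column shows them bunched together and only F standing apart.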

My mother was a public school teacher in the primary grades for more than 40 years.  The affection and respect of her former students show she was an excellent teacher.  She developed her own teaching methods, which were private to her. When a supervisor came to observe her work, she would switch to whatever instructional method was in favor that year.  What a legacy she could have left if there had been a means by which she could have shared her experience and knowledge with others!

Click on Who’s teaching L.A.’s kids? for the full Los Angeles Times article.

Click on Teacher performance data and its discontents for Michael O’Hare’s full comment.

Click on Dukenfield’s Law of Incentive Management for the pitfalls of linking financial incentives to test results.

[P.S. 12/31/10]  Michael O’Hare in a later comment specifically related teacher evaluation to the Deming philosophy.

Deming – brilliant, tough-minded, and humane –  demonstrated that if you reward individual workers for performance, you are going to be rewarding random variation a lot of the time, with poisonous effects.  Right away, when the top salesman among twenty gets a trip to Hawaii with his wife, the response of the other nineteen is not to emulate him (and how could they, if they don’t see what he does, which is the case for teachers in spades), but to be pissed off and jealous, which is, like, really great for collaborative enterprise.  Next year, regression toward the mean sets in and he is only number five, or ten, so he looks like a slacker, coasting on his laurels. Even his wife starts giving him the fish eye; don’t be surprised if his lunch martini count starts to go up.

It is a universal, desperate, desire of lazy or badly trained managers to find a mechanistic device you can wind up like a clockwork, loose upon the organization, and go play golf.  Like testing and firing to get people to do good work.  Please, Lord, show me the way to manage without any actual heavy lifting!  But many desires, no matter how desperately we cleave to them, are not fated to be fulfilled, and this is one.  Teaching, like any complex production process, will get better when teachers watch each other work and talk about what they are doing, why, and how it works; what to watch is usefully indicated by statistical QA methods.  Period.

via The Reality-Based Community.
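The point about rewarding random variation can be checked with a small simulation. This sketch takes the extreme case, which is my assumption and not O’Hare’s: all twenty salesmen are equally skilled, so each year’s results are pure chance. The function name and parameters are made up for the example.

```python
import random

def average_next_year_rank(n_workers=20, trials=2000, seed=0):
    """Average year-two rank of the year-one winner when results are pure chance."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Every worker has the same skill; each year's result is random noise.
        year1 = [rng.gauss(0, 1) for _ in range(n_workers)]
        year2 = [rng.gauss(0, 1) for _ in range(n_workers)]
        winner = max(range(n_workers), key=lambda i: year1[i])
        # Rank 1 is best: count how many workers beat the former winner in year two.
        rank = 1 + sum(1 for result in year2 if result > year2[winner])
        total += rank
    return total / trials

print(average_next_year_rank())
```

Under these assumptions, last year’s winner lands on average near the middle of the pack the following year, although nothing about him has changed. Where real differences in skill exist, the drop would be smaller, but the same pull toward the average applies.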

[P.S. 3/10/11]  However, I do agree with this observation by Conor Friedersdorf on the American Scene web log, and I think W. Edwards Deming would, too.

Is it difficult to develop a precise metric for ranking every teacher in a school from highest performing to lowest performing in order to divide up compensation by merit? Yes, very tough indeed. In extreme circumstances, however, it is very easy to evaluate teacher performance. Say that there’s a student at your school who attempts suicide, and on his first day back, one of his teachers tells him, “Carve deeper next time – you can’t even kill yourself.” Or imagine another teacher who is caught keeping a stash of marijuana, pornography, and vials with cocaine residue on school grounds. Ponder a case where a male middle school teacher is observed lying on top of a female student in shop class. Or a special education teacher who fails to report child abuse, yells insults at children, and inadequately supervises her class. These aren’t hyperbolic examples crafted to make a theoretical point that has little bearing on the real world. These are actual examples of misbehavior by Los Angeles Unified School District teachers who weren’t fired!

via The American Scene.

In a large group, there may be a couple of outstanding performers who are in a different category from all the rest, and there may be a couple who are grossly incompetent or worse.  But it is always obvious who they are.  You don’t need a complicated evaluation process to identify them.

[P.S. 5/7/11]  Click on The Testing Machine for an article by Barbara Renaud Gonzales in The Texas Observer about how high-stakes testing works at one Texas middle school.  (Hat tip to Steve B for sending me the article.)

The school is testing for the Texas Assessment of Knowledge and Skills (TAKS) benchmarks before the real TAKS test, which determines which students progress to the next grade.  The tests, administered by the Texas Education Agency, also determine how the school is rated academically. Benchmark testing is supposed to help schools project how students will perform on the actual TAKS. If too many students fail in the spring, the principal’s job, along with everyone else’s in the administration, is at stake. The school I’m visiting is considered at-risk for being labeled “low performing.”

The school district, out of desperation, has contracted with a prestigious university, my employer, to help the teachers in math, reading and science.  I’m here to gather data about attendance, behavior and grades—key to researching how to reduce dropouts.

At my university, researchers have spent almost 15 years examining the complexities of student success in at-risk schools.  We have found that standardized tests like the TAKS are not predictors for high school graduation. Students flunk the TAKS for reasons other than academic skills.  Some have oh-my-God! panic attacks.  Some, like the dyslexic Albert Einstein, can’t perform well on tests.  Many progressive educators believe that standardized tests should enhance the curriculum, not punish students by failing them. … …

Because of my job, I get to observe the different seventh-grade classes.  There are more than 30 students in most of the math and science classes, and the teachers try hard to ignore whispering, jostling and paper-shuffling.  One-third of the class seems to be at risk of failing because of emotional and academic problems.  Some are special education students who have been mainstreamed.  Some are wannabe gang members.  Some are just bored.  The teachers must get through their lessons in 45 minutes and don’t seem to breathe the whole time.   They are absorbed in their LCD boards, their colorful markers, swooping through the fractions and formulas once and again.  They give tips and shortcuts for solving the math problems likely to come up on the TAKS.  Pay attention!  The front of the class is quiet, but the back third is buzzing at the end of the day.  My university’s master teachers are helping teachers keep students engaged with the coursework.  Play games, they tell the teachers.  Give real-life problems.  But the TAKS seems to be the dark cloud in their classroom.

The math teachers call the last period of the day “the class from hell.”

At the end of the third six-week period, in early January, I tell my university that it doesn’t seem right to me that the grade reports show only six seventh-graders out of 350—the target group we’re following—are failing math.  I’ve been in those classrooms, observed how one-third aren’t paying attention.  How can this be?  I’m a product of working-class public schools and know how easy it is to fall behind in math. … …

At the end of the school year, seven of 357 seventh-graders have failed math, according to the official roster.  Six have failed English and reading.  The school meets the TAKS standards and receives a “recognized” rating.  With the help of master teachers, after-school and Saturday morning tutorials, 70 percent of the students have met the standard in math.  But do they know how to solve a problem that’s not on the TAKS?