Friday, February 11, 2011

Adventures in Peer Review

A few weeks ago I was contacted by a large consulting firm that asked if I would be willing to sit on a peer-review panel for a large government agency program funding graduate school fellowships in a field generally familiar to me.  (I'm going to speak in generalizations here, because many of the details are meant to remain anonymous.)  As this is part of my professional obligation, and because I would receive a small amount of money to add to my research funds, I accepted.  Note here that the government agency contracts with a consulting firm to carry out the peer-review process.  They requested, and I sent, a copy of my curriculum vitae so that the firm could best judge which proposals to have me review.

A couple of weeks ago, I received a zip file with about 50 proposals, all 35-50 pages long, and our instructions.  Each panel member was given a list of about 18 proposals to evaluate and grade in three areas on a 5-point scale (call the grades hot $#!*, lukewarm $#!*, cold $#!*, bad $#!*, and deep $#!*), plus a shorter list of about five for which we were to present our reviews to the entire panel.

The three areas of review were 1) scientific value, 2) an evaluation of the student's prospects, and 3) "broader societal impacts" (largely how the work helps "diversity," broadly defined).  The three were to be given equal weight.

The reviews were due two days before the panel meeting.  It took me about three days to wade through my 18 proposals, since it took at least an hour apiece to read a proposal, consider the various criteria, and write and justify a grade on each of the three.  Honestly, I feel bad that each one got so little time, but there's only so much you can do...  My scores ranged from three hot $#!*s to two bad and one cold $#!*, a fair distribution mostly centered on lukewarm $#!*.

The consulting firm received the reviews from all of us, scored them together (I suppose), and threw out any proposal whose combined score fell below a cutoff (just about lukewarm $#!*).
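For the arithmetic-minded: the firm never told us exactly how the grades were combined, but with three equally weighted criteria on a five-point scale, the cull presumably amounted to something like the sketch below.  The grade-to-number mapping, the names, and the exact cutoff are my guesses, not anything the firm disclosed.

```python
# A minimal sketch of the combine-and-cull step, under assumptions:
# grades map to 1-5, the three criteria carry equal weight, and the
# cutoff sits "just about lukewarm". Not the firm's actual procedure.

GRADE = {"hot": 5, "lukewarm": 4, "cold": 3, "bad": 2, "deep": 1}

def combined_score(science: str, prospects: str, impacts: str) -> float:
    """Equal-weight average of the three criterion grades."""
    return (GRADE[science] + GRADE[prospects] + GRADE[impacts]) / 3.0

def cull(scores: dict, cutoff: float = 4.0) -> dict:
    """Keep only proposals at or above the cutoff (about lukewarm)."""
    return {name: s for name, s in scores.items() if s >= cutoff}

scores = {
    "proposal A": combined_score("hot", "lukewarm", "hot"),   # ~4.67
    "proposal B": combined_score("cold", "bad", "lukewarm"),  # 3.0
}
print(cull(scores))  # only proposal A makes it to the panel meeting
```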

This morning I drove 75 miles to the consulting firm's offices for the panel meeting at 8 AM.  We were given a set of instructions (including not to discuss the details of proposals outside the room).  We were told that our job was to evaluate the proposals and give our best judgment, not to guess what the agency might want to fund.  We were also told the agency would consider our views but would, in the end, do whatever it felt best.  We were pointedly not told how many proposals were likely to be funded.  (That gets into a little game theory.)

We were each given a list of proposals to present as first, second, or third reviewer.  I was first on the second proposal and second or third on five more.  We were given roughly 10 minutes to present our reviews and take questions from the panel; then the three presenters (and anyone else who wanted to) gave new scores, and those three or more new scores were averaged into a single $#!* score.  The first presenter was asked to write a single combined review reflecting the views of the voters.  Once that was done and every review you had voted on was signed, you were done.  The reviewing itself was finished by 3:30, but the paperwork took until 5 PM.
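That post-discussion step reduces to a simple average over however many panelists voted, with a floor of the three presenters.  Again, this is an illustrative sketch of the arithmetic as I understood it, not the firm's actual procedure:

```python
def panel_score(votes: list) -> float:
    """Average the post-discussion scores; at least the 3 presenters vote."""
    if len(votes) < 3:
        raise ValueError("need scores from at least the three presenters")
    return sum(votes) / len(votes)

print(panel_score([5, 4, 4]))     # presenters only -> ~4.33
print(panel_score([5, 4, 4, 3]))  # one extra panelist weighs in -> 4.0
```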

One thing that many of us noted was that there seemed to have been little or no effort to match reviewers' expertise to the subjects of the proposals.  There were several proposals for which I would have been better qualified than the reviewer chosen; not because of intelligence, but because of specialty.  We had engineers grading biology and biologists grading engineering, for example.

A pair of incidents I would like to relate together.  One reviewer noted that a student identified his family with a user group and suggested that he or she wanted to get into the science to help ensure that the regulatory burden on this group was warranted.  The reviewer objected, said the student's "bias" should stop at the door, and downgraded the proposal on that basis.  In another instance, a student was lauded and given high marks for dropping out of school to work for activist environmental organizations, and then going back to school to get a degree to further help those organizations.

Sadly, I confess I did not speak up and point out this inconsistency.

After all the mishmashing of scores, I think about half the proposals were judged hot $#!* and half lukewarm $#!* (remember, lower scores had been thrown out earlier).


I'm reasonably confident that the majority of the students in either category were worthy of support (frankly, there were some tremendously good proposals there).  I wish I knew how many will ultimately receive it.
