Thursday, June 4, 2009

Thoughts on gender differences in math

Reminded by this link:
http://blogs.discovermagazine.com/80beats/2009/06/02/more-evidence-that-girls-kick-ass-at-math-just-like-boys/

To copy the following older post off the lab blog (from 10/4/2007):

I vented about this at lab meeting the other day. I now think it's time to actually organize some information on it because the idea seems more pernicious than I initially realized.

I think of this as the "Larry Summers" hypothesis. That label is a bit inaccurate, but people would probably recognize it by that name. The core idea is that a reason women are underrepresented at the highest levels of success in math and science (e.g., faculty positions at top universities) is that the distributions of inherent ability between men and women are different. The mean inherent ability may be identical, but greater variance in the male distribution puts more men in the extremes of high and low ability. Thus at the highest levels of success, you would expect to find fewer women because there are fewer women at the very upper end of the distribution.
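To make the statistical claim concrete, here is a minimal sketch of the arithmetic behind it. It assumes both groups are normally distributed with the same mean, and the 10% difference in standard deviation is a made-up value chosen only for illustration, not an estimate from any data.

```python
import math

def upper_tail(threshold, mean=0.0, sd=1.0):
    """P(X > threshold) for a normal distribution with the given mean and sd."""
    z = (threshold - mean) / (sd * math.sqrt(2.0))
    return 0.5 * math.erfc(z)

def tail_ratio(threshold, sd_wide=1.1, sd_narrow=1.0):
    """Ratio of upper-tail fractions for two groups with equal means.

    sd_wide and sd_narrow are hypothetical values; only their ratio matters.
    """
    return upper_tail(threshold, sd=sd_wide) / upper_tail(threshold, sd=sd_narrow)

# The ratio is 1 at the shared mean and grows as the cutoff moves outward.
for t in (0.0, 2.0, 3.0, 4.0):
    print(f"cutoff at {t:.0f} SD above the mean: ratio {tail_ratio(t):.2f}")
```

The sketch only shows that a small variance difference would produce a large imbalance far out in the tail. It says nothing about where such a variance difference would come from, which is the actual question.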

This seems like a question of science and statistics, but there's a significant danger here. If the core hypothesis is believed, it argues against gender-based affirmative action at top universities. If the existing difference in representation of men and women in top universities is based on a genetic difference, increasing the representation of women will actually make those departments stupider on average.

The alternate hypothesis is that the existing differential representation is due to cultural (social, environmental) factors that can be ameliorated by affirmative action policies aimed at overcoming a historical cultural bias against women in these fields.

So we have a particularly difficult situation: a scientific question that is very hard to assess and that has a direct and immediate policy impact. My personal opinion is that in these cases, it is important to weigh the costs of error. If we have to make a binary policy decision (affirmative action, yes or no) that will be based on an evaluation of the balance of unclear evidence, which error is more costly? Is more damage done by inappropriately implementing or inappropriately eliminating affirmative action? I will note that this kind of consideration is broadly unpopular with scientists who see themselves as pure seekers of truth. But I'm not going to argue that philosophical point here, since this particular theory can just be evaluated on a balance-of-evidence basis and there doesn't seem to be much to it.

Some background
The original Summers situation is described evenhandedly on Wikipedia.

An excerpt:
Controversy

Another study performed by the American Psychological Association in response to the book The Bell Curve, which investigated the difference in intelligence between different social classes (strongly correlated with race in the U.S.), determined (as did the authors of the book) that the studies available in 1995 showed no major difference between males and females in regard to IQ scores.[24]

In January 2005, Lawrence Summers, president of Harvard University, unintentionally provoked a public controversy when MIT biologist Nancy Hopkins leaked comments he made at a closed economics conference at the National Bureau of Economic Research.[25] [26] [27] In analyzing the disproportionate numbers of men over women in high-end science and engineering jobs, he suggested that, after the conflict between employers' demands for high time commitments and women's disproportionate role in the raising of children, the next most important factor might be the above-mentioned greater variance in intelligence among men than women, and that this difference in variance might be intrinsic,[28], adding that he "would like nothing better than to be proved wrong". The controversy generated a great deal of media attention, forced Summers to make a number of apologies, and led Harvard to commit $50 million to the recruitment and hiring of women faculty.[29]

In May 2005, Harvard University psychology professors Steven Pinker and Elizabeth Spelke debated "The Science of Gender and Science".[30]

In July 2006, Stanford University neurobiologist Ben Barres, a transsexual man, wrote a provocative piece in Nature on his own experiences as both a male and female scientist.[31] Barres argued that prior to transition, he had succeeded as a female despite pervasive sexism. Barres wrote that numerous studies show female scientists are consistently rated lower than their male counterparts with the same levels of productivity and credentials.

In 2006, Danish psychologist Helmuth Nyborg was asked to vacate his position at Aarhus University after publishing a paper in Personality and Individual Differences that showed an 8 point IQ difference in favour of men. (Nyborg, Helmuth (2005). "Sex-related differences in general intelligence g, brain size, and social status". Personality and Individual Differences 39: 497-509.)


Summers is an economist. Where did he get this idea about variability? I'm not sure, but apparently this idea has some "mainstream" support in Psychology.

Baumeister's 2007 APA address
Steven Pinker taking this position in a debate with Liz Spelke

There's a lot of interesting stuff here. But I was really struck by the similarities between the Pinker and Baumeister arguments, because they share the same flaws, and those flaws seemed pretty obvious to me. I actually do hesitate when it seems like a smart person is arguing something transparently stupid. A plausible alternate hypothesis is that I'm wrong or misunderstanding something important. So pointing out the core problems here may help us evaluate which hypothesis is more plausible (they're being stupid or I am).

The variance hypothesis
I included the links so that you could check my account of the hypothesis, but it's not really complicated. There are plenty of data that say men are overrepresented at both tails of the distribution. You could probably argue about the data, or tackle the question of what is being distributed: IQ? intelligence? success? ability? Success is probably the best description (although all those constructs co-load), and I'm happy to stipulate that the data are what they are for the purpose of evaluating the rest of the argument.

The problem is the inference that the differential variability is inherent, i.e., it's based on a significant genetic contribution. The obvious alternate hypothesis is that the differential representation in the tails is predominantly cultural or societal based on differential treatment of men/women (boys/girls).

In case you're wondering, a simple way you get differential representation in the tails is via feedback loops. Let's say individuals vary in ability and some small percentage have the ability to become geniuses if provided with effective teaching/instruction/training (note this is a fairly nativist argument itself and not the only one). E.g., exhibiting ability -> attention & more teaching -> greater ability -> more specialized teaching... etc. until you push somebody out into the very upper tail of ability/success. If this trait were equally distributed across genders, but there was even a slightly lower chance that the feedback loop gets started for women than for men, you'd end up with women underrepresented in the tail due to cultural differences.
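A toy version of this feedback loop can be written down directly. This is only a sketch under stated assumptions: the function names, the probabilities (0.95 vs 0.90), and the number of rounds are all hypothetical, and applying the lower chance at every round (rather than only at the start) is an extra assumption of the sketch. With rounds=1 it reduces to the "loop gets started" case described above.

```python
def tail_fraction(p_potential, p_reinforced, rounds):
    """Expected fraction of a group that completes every round of the loop.

    p_potential:  fraction with the potential to benefit (same for both groups)
    p_reinforced: chance that exhibited ability is met with more attention
                  and teaching in a given round (hypothetical value)
    rounds:       rounds of reinforcement needed to reach the upper tail
                  (hypothetical value)
    """
    return p_potential * p_reinforced ** rounds

def representation_ratio(p_a, p_b, rounds, p_potential=0.01):
    """How many times more often group A reaches the tail than group B."""
    return tail_fraction(p_potential, p_a, rounds) / tail_fraction(p_potential, p_b, rounds)

# Same underlying potential in both groups; only the chance of reinforcement differs.
print(representation_ratio(0.95, 0.90, rounds=1))   # difference only at the start
print(representation_ratio(0.95, 0.90, rounds=10))  # difference compounds each round
```

With these made-up numbers, a gap of five percentage points per round compounds to a ratio of roughly 1.7 after ten rounds, even though the trait itself is distributed identically in both groups.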

Is that plausible? Maybe. There is abundant evidence that men and women (boys and girls) are treated differently, at least. What's the evidence that the overrepresentation in the tails is genetically based (inherent)?

There isn't any actually. Baumeister references greater variability in height among men, but not only can I not find any source for that (I looked up average height charts and the distributions look roughly identical) but you might even expect SD to go up with mean (men are taller).
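The point about SD rising with the mean can be illustrated with a few lines. The heights below are invented numbers, and the 8% scale factor is a hypothetical stand-in for "one group is taller on average"; the sketch only shows that scaling every value by a constant scales the SD by the same constant while leaving the relative spread (SD divided by mean) unchanged.

```python
import statistics

def coefficient_of_variation(values):
    """Relative spread: population SD divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

# Invented heights in cm, purely for illustration.
shorter_group = [158.0, 162.0, 165.0, 168.0, 172.0]
# Hypothetical taller group: the same shape, scaled up by 8%.
taller_group = [h * 1.08 for h in shorter_group]

print(statistics.pstdev(shorter_group), statistics.pstdev(taller_group))
print(coefficient_of_variation(shorter_group), coefficient_of_variation(taller_group))
```

So a larger raw SD in the taller group would not, by itself, show that the taller group is more variable in any interesting sense.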

Very weirdly, both Baumeister and Pinker argue that men have been selected for greater risk taking historically, as if that were related. Baumeister also argues that men are under greater selection pressure to pass their genes along (fewer men than women have passed their genes through time, apparently). But neither argument is remotely relevant to producing greater variability in success. In fact, greater selection should produce less variability (as any remotely clueful evolutionary psychologist should know), not more. You could use either of these to argue why you think men should be smarter on average (they aren't), but neither is related to increased variability.

Pinker gives us one pseudo-shred of data: "And biologists since Darwin have noted that for many traits and many species, males are the more variable gender."

So there are your alternative hypotheses to consider: men are the more variable gender, or social/cultural effects influence the degree to which women achieve upper-tail success.

Pinker's talk is the far less egregious of the two, but the absence of consideration of social or cultural effects is still staggering. He spends a considerable amount of time documenting existing differences between genders to prove that there are some. I don't believe there are reasonable people who disagree with that, but it still doesn't mean success/ability is genetically determined. It doesn't even mean all those differences are genetic/inherent.

There are a whole lot of examples of this in his talk, but here's my current favorite as a parent of a 13yo girl who is in an accelerated math track in high school:


Fifth, mathematical reasoning. Girls and women get better school grades in mathematics and pretty much everything else these days. And women are better at mathematical calculation. But consistently, men score better on mathematical word problems and on tests of mathematical reasoning, at least statistically. Again, here is a meta analysis, with 254 data sets and 3 million subjects. It shows no significant difference in childhood; this is a difference that emerges around puberty, like many secondary sexual characteristics. But there are sizable differences in adolescence and adulthood, especially in high-end samples. Here is an example of the average SAT mathematical scores, showing a 40-point difference in favor of men that's pretty much consistent from 1972 to 1997. In the Study of Mathematically Precocious Youth (in which 7th graders were given the SAT, which of course ordinarily is administered only to older, college-bound kids), the ratio of those scoring over 700 is 2.8 to 1 male to female. (Admittedly, and interestingly, that's down from 25 years ago, when the ratio was 13-to-1, and perhaps we can discuss some of the reasons.) At the 760 cutoff, the ratio nowadays is 7 males to 1 female.


"...this is a difference that emerges around puberty, like many secondary sexual characteristics." Can Dr. Pinker seriously not consider the possibility that the development of secondary sexual characteristics would increase the differential social/cultural treatment of young girls? Does he think the only thing that changes for girls at puberty is the concentration of hormones in the body? wtf?

And in his own data here, he admits the number of girls in the upper tail has changed dramatically over the recent period of greater concern about gender issues and more gender-based affirmative action.

So reconsider the binary policy question of "gender-based affirmative action, yes or no?" as it is impacted by this very difficult scientific question, "is the observed increased variability in success/ability of men relative to women based on inherent genetic factors or environmental/cultural factors?" And you'll see why I'm so thoroughly annoyed about this.

Larry Summers was President of Harvard at the time he suggested this was a major cause of differential representation of men and women on the faculty. Is it still safe to ignore the policy implications of weak science?

P.S. There is one claim in the midst of Pinker's laundry list aimed at proving men and women are inherently different that I should probably go find better data on.

Seventh, a lack of differential treatment by parents and teachers. These conclusions come as a shock to many people. One comes from Lytton and Romney's meta-analysis of sex-specific socialization involving 172 studies and 28,000 children, in which they looked both at parents' reports and at direct observations of how parents treat their sons and daughters — and found few or no differences among contemporary Americans. In particular, there was no difference in the categories "Encouraging Achievement" and "Encouraging Achievement in Mathematics."

There is a widespread myth that teachers (who of course are disproportionately female) are dupes who perpetuate gender inequities by failing to call on girls in class, and who otherwise having low expectations of girls' performance. In fact Jussim and Eccles, in a study of 100 teachers and 1,800 students, concluded that teachers seemed to be basing their perceptions of students on those students' actual performances and motivation.


A few basic points of reality: First, math and science teachers at the middle and high school level aren't predominantly female (elementary school teachers are, and they aren't dupes, but they are influenced by cultural expectations). Second, he's apparently never heard of the Implicit Association Test, which shows marked discrepancies between intentions and actual bias. People regularly report doing their best to be non-biased, but are not able to consistently hold to it. The fact that parents report they mean to be as encouraging to girls in school does not guarantee that they are. And the fact that teachers believe they are responding to their perceptions of student interest does not mean their perceptions aren't influenced by a bias to see women/girls as less interested in math/science material (which, incidentally, is a feedback loop that would push women away from the tails and back towards the mean of the distribution).

BTW, the best way to see the patently obvious difference in treatment of girls/boys in school is to look at how they treat each other. You get a better picture of the cultural bias bleeding through because they don't have the frontal lobes to effectively inhibit the implicit attitudes they pick up from the world. But I should get some actual data on this.

Wednesday, April 1, 2009

Research Methods

I teach Research Methods in Psychology. The class is structured to be a lot of work for the undergraduates at NU. It is often viewed with dread by the students. There's a lot of writing and the material seems like it might be pretty dry.

So I get jealous when I see Brad DeLong and Paul Krugman occasionally post their class syllabi online to their blogs. Their classes look interesting, but I wouldn't post a Research Methods syllabus online. Who'd want to read that on purpose?

Reflecting on this, I think maybe I'm doing it wrong. People actually like science. There are science columns in all sorts of weeklies, in newspapers, on the "most emailed" list at Yahoo or other news aggregators. And a lot of that science is not so good. People should be better at distinguishing the good stuff from the weak stuff and in theory, Research Methods is the class where they would learn to do this.

I haven't figured out how to do it right yet, though. I'm not posting the syllabus. Mythbusters on the Discovery Channel seems to manage to teach some science and get people to watch on purpose. I can't think of a way to blow stuff up in Research Methods, though.

Newspapers and colleges

Title copied from DeLong:
http://delong.typepad.com/sdj/2009/03/newspapers-and-colleges.html

Who linked the key idea from Kevin Carey:
http://chronicle.com/weekly/v55/i30/30a02101.htm

Question: Are colleges in trouble the way newspapers are in trouble? Will digital access to content wreck their business model?

My answer: No.

First, if you are teaching such that the YouTube or iTunes videos of your lectures are equivalent to taking the class, something seems to be going wrong. This is especially true if you have a reasonably small class size. Students know how to read. If the written material is sufficiently unclear that you need to then come to class and explain it, maybe you need a better text. The lecturer should be adding value over a good text one way or another.

In big classes where interaction is limited, the lecturer can add value by shaping the information from the text (emphasis/paraphrasing). Because of writing/printing lags, we can assume the text isn't necessarily up to date in every way. Maybe places where the lecturer challenges a claim of the text signals areas of interesting debate. But what about fields where the basic information doesn't change much year to year (e.g. calculus, but not psychology)? And what happens when students get their textbooks from a Kindle that can update more rapidly?

There's a learning and memory question in there: is it better to read, hear, or read+hear for long-term memory of the content? The last of these implies multiple repetitions, which is better. But is podcast+text < lecture+text? If not, maybe intro classes do end up changing a lot with technology.

Second, there are definitely classes and probably also extracurricular activities that depend entirely on discussion and face-to-face interactions. These become more available and potentially more important as the big intro classes become easier. One good example from the Psych department at NU is that a lot of our Psych majors get to participate directly in psychological research via an independent study. Neither Psychology nor scientific methods are as generally well understood as they could be and hands-on work in a lab is not only fun, but very valuable in both of those domains.

Third, there is a lot more to being at a college than just sitting in lectures. I'm not entirely convinced that the social networking being done outside the classroom isn't occasionally as valuable as (or more valuable than) the facts being absorbed from the courses taken. I think some professors/lecturers forget this on occasion. We're a small piece of undergraduate education even beyond the fact that each of our classes is just one in the midst of many that students take.

Teaching and technology

Comments on other people's blogs feel like they disappear too quickly. I can keep better track of my thoughts and maybe even share them here. FWIW, the name comes from something else altogether but I like the phrase.

The motivation to set this up is to describe some ideas I've recently had about teaching. The idea and new things I'm trying in teaching were really all inspired by Brad DeLong's blog. Although our fields are different (economics for him, cognitive neuroscience and psychology for me), we share a certain technophilia and interest in how technological change influences information transmission and the consequences of those changes.

The current teaching experiment is using podcasts as part of my teaching of Research Methods in Psychology. When I first mentioned my idea to the academic technology staff here at Northwestern University, they assumed I wanted to record my lectures. No, that's not the idea. My idea is that as video is easy and cheap, why are we lecturing to students? In my own teaching, I spend part of the time talking at the students (lecturing) and part in more interactive discussions and exercises. The latter part of that seems to be where face-to-face interaction is actually useful.

So the idea is to pre-record a podcast of the 10-15m I used to spend at the beginning of the class lecturing based on the text. The lecture was to emphasize parts, de-emphasize other parts, paraphrase and review content they were supposed to have already read. This was followed by analysis and discussion of good and bad research examples. If they pre-watch the lecture part, I can spend the whole class on the interactive parts. In theory, this may enhance education, but in practice - ?

The seed for the core idea comes from DeLong's economic history note on Universities:
http://delong.typepad.com/sdj/2008/08/why-are-we-here.html
Excerpted:

The Pre-Gutenberg University:

  • Universities have their origins in the medieval need of the powerful to train theologians (for the church) and to train judges (for the emperor and the kings of France, England, Castile, and other kingdoms).
  • A manuscript hand-copied book back in 1000 cost roughly the same share of average annual income as $50,000 is today.
  • Hence if you have a "normal" college--eight semesters, four courses a semester--and demand that people buy and read one book a course, you are talking the equivalent of $1.6M in book outlay. Can't be done.
  • Hence you assemble the hundred or so people who want to read Boethius's The Consolation of Philosophy in a room, and have the professor read to them--hence lecture, lecturer, from the Latin lector, reader--while they frantically take notes because they are likely to never see a copy of that book again once they are out in the world administering justice in Wuerzburg or wherever...
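The book-outlay figure in the excerpt is straightforward arithmetic. A small sketch that reproduces it, using only the numbers quoted above (the variable names are mine):

```python
# Figures taken from the excerpt; variable names are illustrative.
semesters = 8
courses_per_semester = 4
books_per_course = 1
cost_per_book_usd = 50_000  # modern-dollar equivalent of one hand-copied book

total_books = semesters * courses_per_semester * books_per_course
total_outlay_usd = total_books * cost_per_book_usd

print(f"{total_books} books, ${total_outlay_usd:,}")  # 32 books, $1,600,000
```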

Once the printing press arrives, book prices drop, and yet universities and lecturing persist.

In theory, classroom lecturing persists because the professor is adding something not immediately available just from the written text. Some of that comes from the ability of students and professors to interact in the classroom. Some of it may come from the immediacy of having somebody standing in front of you telling you the information -- maybe that's better for learning and memory than reading in some cases? Maybe the lecturer can rephrase, react to the audience and provide better emphasis?

The podcast+more interaction theory is based on the idea that hearing it is a good supplement to reading it and interaction is useful and worth spending as much time as possible on.