This is the first in a series of blog posts by me and James Heathers on the research of Dr. Nicolas Guéguen, of the Université Bretagne-Sud in France. We will be examining one of Dr. Guéguen's studies in each post. Cathleen O'Grady at Ars Technica has written an excellent overview of the situation.
We start with this article:
Guéguen, N. (2012). Color and women hitchhikers’ attractiveness: Gentlemen drivers prefer red. Color Research & Application, 37, 76–78. http://dx.doi.org/10.1002/col.20651
Brief summary of the article
The author's hypothesis was that male (but not female) drivers would be more likely to offer a lift to a female hitchhiker if she was wearing a red (versus any other colour) t-shirt. The independent variables were the colour of the woman's t-shirt and the sex of the driver, and the dependent variable was the driver's decision to stop or not.
Participants were drivers on a road in Brittany (western France). A female confederate wearing one of six colours of t-shirt (black, white, red, blue, green, or yellow) stood by the side of the road posing as a hitchhiker (with a male confederate stationed out of sight nearby for security purposes). She noted whether each driver who stopped to offer her a lift was male or female. In order to establish the number of drivers of each sex who drove along the road (whether or not they stopped), two other confederates were stationed 500 metres away in a car by the side of the road, facing the approaching traffic. As each car passed, they noted whether the driver was a man or a woman. Using the count of male vs female drivers who stopped, and the count of male vs female drivers who passed, it was found that male drivers were considerably more likely to stop when the hitchhiker was wearing a red t-shirt compared to any other colour.
There are some puzzling aspects to this article.
1. Number of volunteers
The article states (p. 77) that the five female confederates were chosen by a group of men who rated the facial attractiveness of “18 female volunteers with the same height, with the same breast size (95 cm of bust measurement and bra with a ‘B’ size cup), and same hair color”. It is interesting to think about how many women there must be in the volunteer pool at the Université Bretagne-Sud in order for 18 with the same height, bra size, and hair colour to put themselves forward to stand for hours on end (see point 4, below) to stop passing drivers. Once the attractiveness of the volunteers had been established, the five who were rated closest to the middle of the scale were chosen, and "precautions were taken to verify that the rates of attractiveness were not statistically different between the confederates", whatever that means. Oh, and "All of the women stated that they were heterosexuals", presumably to ensure that they gave off the right vibes through the windscreens of approaching cars.
2. Two different sample sizes
There is a curious inconsistency between Table 1 and the main text. In Table 1, the numbers of male and female drivers are listed as 3,474 and 1,776, respectively. However, these two numbers sum to 5,250, rather than 4,800 (which was the sample size reported elsewhere in the article, with 3,024 male and 1,776 female drivers). It is not clear how such an error might creep in by accident, since it requires two very different digits (4 instead of 0 and 7 instead of 2) to be mistyped.
3. The colours of the t-shirts
The article reports the colours of the t-shirts worn by the hitchhikers in very precise terms, even going so far as to give their HSL (hue, saturation, luminance) values:
| Hue | Saturation | Luminance | Stated colour | Hex colour |
|---|---|---|---|---|
| 0 | 0 | 0 | Black | #000000 |
| 0 | 0 | 100 | White | #ffffff |
| 16 | 92 | | Red | #f88a62 |
| 19 | 100 | 94 | Yellow | #ffeae0 |
| 210 | 100 | 100 | Blue | #ffffff |
| 99 | 66 | 87 | Green | #d7f4c8 |
It would be interesting to learn how these HSL numbers were obtained, since several of them are so badly wrong. Indeed, it is not clear why it was considered necessary to report the colours with such precision; it would surely have been enough to state that bright, unambiguous examples of each colour had been selected. For that matter, given how long it must have taken to test so many drivers (see the next point) and that the author had a clear hypothesis about the effects of the colour red, it is not clear why so many different colours of t-shirt were tested.
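One way to see what colours these numbers actually describe is to convert them to RGB. The following is a minimal sketch using Python's standard `colorsys` module. It assumes that hue is expressed in degrees (0 to 360) and that saturation and luminance are percentages; the helper name `hsl_to_hex` is ours, not something taken from the article. Only the rows for which all three HSL values are listed are included.

```python
import colorsys


def hsl_to_hex(hue_deg, sat_pct, lum_pct):
    """Convert HSL (degrees, percent, percent) to a '#rrggbb' string."""
    # colorsys takes its arguments in the order (hue, lightness, saturation),
    # each scaled to the range 0..1.
    r, g, b = colorsys.hls_to_rgb(hue_deg / 360, lum_pct / 100, sat_pct / 100)
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in (r, g, b)))


# HSL values as listed for each stated colour (assumed units: degrees, %, %).
reported = {
    "Black": (0, 0, 0),
    "White": (0, 0, 100),
    "Yellow": (19, 100, 94),
    "Blue": (210, 100, 100),
    "Green": (99, 66, 87),
}

for name, hsl in reported.items():
    print(f"{name:6s} {hsl} -> {hsl_to_hex(*hsl)}")
```

Under these assumptions the "blue" comes out as pure white, because a luminance of 100 is white whatever the hue, and the "yellow" and "green" come out as very pale tints rather than saturated colours.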
4. How long did all this take?
The article states (p. 77) that “Each hitchhiker was instructed to test 960 drivers. After the passage of 240 drivers, the confederate stopped and was replaced by another confederate.” No indication is given of how long it took for 240 drivers to pass. However, the article also tells us (p. 77) that the research was conducted "at the entry of a famous peninsula of Brittany in France". So perhaps we can get a clue from another Guéguen article:
Guéguen, N. (2007b). Bust size and hitchhiking: A field study. Perceptual and Motor Skills, 105, 1294–1298. http://dx.doi.org/10.2466/pms.105.4.1294-1298
Yes, you read that right. Dr. Guéguen did indeed conduct a study to see whether women with larger breasts get more offers of lifts from men. (I bet you can't guess what the result was.) Anyway, that study—which had a very similar procedure to the one we're discussing here, except that the, er, manipulated (!) independent variable was the apparent size of the female hitchhiker's breasts, rather than the colour of her t-shirt—was conducted "at the entry of a famous peninsula ('Presqu'Île de Rhuys') of Brittany in France". Assuming that the “famous peninsula” mentioned in both studies is the same place (which would make sense, if that location is indeed particularly propitious for lone female hitchhikers), and assuming similar traffic flows to those reported in the "bust size" study, in which the passage of 100 cars took “about 40 to 50 minutes” (p. 1296), we assume that it took between 1.5 and 2 hours for 240 cars to pass. In order to test 4,800 drivers, then, a total of 30 to 40 hours of testing would be required. The "t-shirt colour" article also states (p. 77) that the experiment “took place during summer weekends on clear sunny afternoons between 2 and 5 PM.” With three hours being available on Saturday and three more on Sunday, the experiment would thus have taken between five and seven complete weekends, assuming that every hour of testing time was sunny (a contingency that is far from guaranteed in Brittany). Yet none of the confederates who gave up multiple weekends to accomplish this Herculean task on behalf of psychological science are listed as co-authors, or even acknowledged in any way, in the resulting article.
Additionally, the design of the experiment appears to require that only drivers who were alone in their cars should be counted, since the purported effect of the red t-shirt was to increase the sexual attractiveness of the wearer. You might expect that if a male driver's wife is in the car, it could affect any sexually-motivated enthusiasm he might have for offering a lift to the hitchhiker, whatever the colour of her t-shirt; alternatively, a female driver might have been willing to help the hitchhiker had all of the seats in her car not been full of children. We have seen a statement by Dr. Guéguen in which he confirmed that "L’expérience n’incluait effectivement que des personnes seules. Les automobiles avec plusieurs personnes ne sont pas prises en compte dans l’étude" ("The experiment only included people [driving] on their own. Cars containing multiple people were not counted in the study"). So the figure calculated above for the number of hours and days taken to test the required number of drivers needs to be multiplied by some factor to take into account the percentage of cars with multiple occupants. Given that the study was carried out on sunny weekend afternoons in summer in an area with a substantial number of tourists, it seems reasonable to suppose that perhaps half of the cars driving along the road might have had more than one occupant. That would double the number of weekends required for collection of the data to between 10 and 14, or at the very least more than compensate in our calculations for any growth in local traffic since the "variable bust size" study was conducted.
Another way to think about the time involved is to consider the interactions of the hitchhiker with the drivers who stopped. Even if it took an average of only one minute to catch up to the car where it stopped (probably some distance along the road from her), introduce herself, explain that there was an experiment taking place, "warmly thank" the driver, and return to her starting point, that would require nearly 10 hours (i.e., four afternoons) just for the 579 drivers (450 male, 129 female) who were reported as having stopped, even assuming that in every case a new driver then stopped immediately afterwards. If drivers were only stopping at the rate of one every five minutes overall (12 per hour), it would take 48 solid hours to test 579 drivers.
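These totals scale linearly with the assumed time per stop, so they are easy to check. A minimal sketch follows; `hours_needed` is a hypothetical helper, and the per-driver times are assumptions for illustration, not figures reported in the article.

```python
STOPPED = 450 + 129  # male + female drivers reported as having stopped


def hours_needed(minutes_per_driver, drivers=STOPPED):
    """Total hours if each stopping driver accounts for the given minutes."""
    return drivers * minutes_per_driver / 60


# Assumed: one minute per interaction, with stops following back to back.
print(f"1 min per stop: {hours_needed(1):.2f} hours")
# Assumed: one driver stopping every five minutes (12 per hour).
print(f"5 min per stop: {hours_needed(5):.2f} hours")
```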
5. Problems with recording the sex of the drivers
As mentioned in the introduction, there were two observers whose job was to observe every passing car and record its driver's sex. (Per the previous point, it is worth thinking about the challenge of determining whether or not a driver is on their own in the vehicle, which requires, for example, determining whether a car driving past at around 20 metres per second does or does not have a small child in the back seat.) The article states (p. 77) that “[t]he convergence between the two observers’ evaluation was high (r = 0.97)”.
There is a major problem here. In order for a correlation coefficient to be calculated, we need more information than the simple total numbers of male and female drivers. Specifically, the two observers would need to independently record both the sex of each driver and the sequence in which those drivers were observed; for example, with ten drivers and disagreement about the sex of the third, the correlation between MFFFMMMFMM and MFMFMMMFMM would be .80. However, the article reports (p. 77) that each of the observers “used two hand-held counters, one to count the female motorists and the other to count the male motorists”. The term “hand-held counters” suggests simple mechanical devices, such as those used to count attendees at sporting events. But without synchronized timestamps across all four of these counters, or some other form of sequential tracking, it is not possible to establish the order in which the drivers passed each of the observers. More sophisticated methods of collecting and correlating these data can be imagined (for example, using laptop computers), but of course both observers had their hands full with the counters. With just a count of male and female drivers from each observer, stating a correlation coefficient makes no sense. It is therefore entirely unclear how the author could have established the correlation coefficient that he reported.
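To illustrate why the totals alone are not enough, here is a minimal sketch that codes M as 1 and F as 0 and computes the Pearson correlation for the ten-driver example. The third sequence, `reordered_b`, is invented for illustration: it has exactly the same counts as the second observer's record (seven M, three F) but in a different order, and it produces a very different coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def encode(sequence):
    """Code each driver as 1 (male) or 0 (female)."""
    return [1 if sex == "M" else 0 for sex in sequence]


observer_a = encode("MFFFMMMFMM")
observer_b = encode("MFMFMMMFMM")   # disagrees with observer A on driver 3
reordered_b = encode("FMMMFFMMMM")  # hypothetical: same totals, other order

print(round(pearson(observer_a, observer_b), 2))   # ordered records agree well
print(round(pearson(observer_a, reordered_b), 2))  # same counts, poor agreement
```

Both versions of the second record would leave the hand-held counters showing the same two numbers, so the counters by themselves cannot distinguish between them.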
Conclusion
In view of the above points, it is not clear how the study can actually have taken place as described in the article. As noted above, we have seen a statement by Dr. Guéguen (with whom we have been in indirect contact for almost two years now, via the good offices of the French Psychological Society, about a number of problems in several of his published articles; more on this to come in a subsequent post) concerning the question of whether only drivers who were on their own were tested. That statement did not, however, provide any specific or relevant answers to any of the other issues about this article that I have discussed here.
[[Update 2017-11-29 22:08 UTC: Added link to Cathleen O'Grady's article. ]]