I loved this book, and I want to say at the outset that if you need “fodder” for starting conversations at social gatherings, this book is replete with it!
The author, Christian Rudder, is President and Co-founder of the dating site OKCupid. He began collating and charting data from his own site, and then expanded his database by looking at comparative data from “rival” dating sites, Facebook, Twitter, and other social media sites. What he has found out is just amazing. The underlying theme is that, no matter what people claim about themselves and how they express themselves in surveys, what they actually believe and do is often quite different, but can be discovered from their online interactions. Even in the privacy of one’s own home, what one searches for in Google is revelatory.
I think my favorite aspect of this book is how it demonstrates the remarkable revolution in sociological research. It makes me gnash my teeth in regret that I did my own studies before this sort of data had become available – what fun it would have been! (Not that I didn’t have fun, in a bizarre, pedantic sort of way, but just saying….) For example, you can analyze tweets to see which people celebrate certain traditions, and how closely these mirror political borders. Using the program DOLLY (Digital Online Life and You), to cite just one example, researchers found that the Dutch holiday of Sint Maarten is celebrated not only in the northern Netherlands but also in western Belgium: “the tweets reconnect old Holland to Flanders, its cultural cousin.” As the author observes: “Thus we watch an animated visualization of GPS-enabled data points, and see shadows of the Hapsburgs.” Just imagine, he says, if we could have tracked the tweets in Alsace-Lorraine over the years as it changed hands from German to French to German to French, with each government trying to impose its culture and language on the people. [When we traveled to that area, it was clear the mix was still trying to sort itself out!]
Other entertaining discoveries: research on Facebook has now verified that most of us are in fact connected by six degrees of separation; the majority of searches for “missed connections” are from sightings at Walmart (and most of those are in the South); when white men write essays about themselves for dating sites, the most commonly used word after “the” is “pizza”; the most antithetically used words (words used most by everyone else but least by a specific group) for Asian men include “layed back” [spelled wrong] and “6’4” (oddly, the second most typical phrase for Asian women is “tall for an asian”); and the Centers for Disease Control coordinates with Google to track epidemics because when people are getting sick, they search for symptoms and remedies.
Far and away the most revelatory data have to do with race and gender preference. The author explains, for instance, that a variety of indications (searches, friend connections, etc.) suggest the figure of 5% of the population being gay is pretty accurate and holds true across the states. But the number of people who self-report as gay varies with the level of acceptance in each state. So for example, if you see a state in which only 1.5% of respondents self-report as gay, you can probably pretty safely assume that another 3.5% are in the closet. (He provides a lot of documentation to substantiate this.)
The details on race are the saddest, and show the extent to which race is in fact still a problem in the U.S. (in case you could possibly doubt it). Rudder reports data (not only from OKCupid but also DateHookup and Match.com – a total of around 20 million Americans) on how members of each group (white, black, Asian, and Hispanic) and each sex rate the attractiveness of the other sex by race alone. Every single category and sex rates black women the lowest. Claiming to be part white elevates one’s rating substantially. Perhaps most significantly, data outside the U.S. reveal no such bias! He also talks about spikes in Google searches for things like jokes about [the “n” word] that correspond precisely to peaks in Obama’s presidential campaign cycle.
In addition, as the author explains, you can find out a lot about people’s prejudices by watching the operation of Google’s sentence completion function. Google will fill in the most popular responses as you begin questions like, “Why do all blacks….” “Why do all gays….” “Why do husbands….,” etc.
Finally, the author includes a very thorough discussion about privacy, even bringing the Edward Snowden revelations to bear.
Discussion: I have a couple of small quibbles. As one illustration, the author made a chart correlating the age at which a woman looks “most attractive” to a man with the age of the man. As the age of the man increases, the age of the woman by and large does not. But does that mean men find aging unattractive, or could there be a conscious or subconscious consideration that older women either might already have children (a.k.a. “baggage”) or conversely, might not be able to have children, which the man might want? [Or am I just trying to come up with reasons why aging women aren’t really seen as less attractive?]
And speaking of the constraints of the data, the way questions are formulated doesn’t necessarily allow for all possible variables that might come into play. [Example from a recent Facebook “test” I took: “Do you prefer acid rock, pop, or rap?” Those were all the choices; no “other”; no “none of the above.” I was forced to make a choice and provide an answer that wasn’t at all accurate.]
In other words, I sometimes want more “data” to understand the data. (Rudder says in the Afterword that he deliberately omitted statistical details to make the book more readable, because “mathematical wonkiness” wasn’t what he was trying to get across.) He does add references in the back whenever possible for further study.
Another small criticism I have is that the author presents so many arresting data findings that he sometimes goes from one to the other without full elucidation.
On the other hand, I am confident the author is aware of all of this. He acknowledges an intellectual debt to Edward Tufte, who is a (perhaps “the”) leading authority on the uses, misuses, ambiguities, and deceptiveness of data, and Rudder acknowledges:
“…behind every number there’s a person making decisions: what to analyze, what to exclude, what frame to set around whatever pictures the numbers paint.”
As he concludes, this science of data analysis is just in its beginning stages; he is trying to give us a taste of what is already out there, and what is to come. It is aggregate data, he cautions; we still have to account for individual differences and quirks. But it sure is fascinating to find out what the numbers show about broad trends.
Evaluation: This book is full of stunning and provocative information about who we are, as well as who we want to be (but aren’t, at least not yet). Learning “sociology” has never been this fun!
A Few Notes on the Audio Production:
The narrator, Kaleo Griffith, and those in charge of the production did an outstanding job. This is a book which, in hard copy, is full of maps and charts and graphs and data lists, and you would think it couldn’t possibly be turned into an audiobook. But someone decided how to render the data intelligible to the ear, and how much to read and how much to suggest by “and so on.” Griffith reads in an engaging way, and is so enthusiastic it’s hard to believe it’s not his own book!
Published unabridged on 6 CDs (7 1/2 listening hours) by Random House Audio, an imprint of the Penguin Random House Audio Publishing Group, 2014