CAM, The Cultural Anthropology Methods Journal, Vol. 8, No. 1, 1996
H. Russell Bernard
University of Florida(1)
Every cultural anthropologist ought to be
interested in finding, or creating, and analyzing texts. By
"finding" texts I mean things like diaries, property
transactions, food recipes, personal correspondence, and so on.
By "creating" texts I mean recording what people say
during interviews.
But by creating text, I also mean doing what
Franz Boas did with George Hunt, and what Paul Radin did with Sam
Blowsnake. In 1893, Boas taught Hunt to write Kwakiutl, Hunt's
native language. By the time Hunt died in 1933, he had produced
5,650 pages of text -- a corpus from which Boas produced most of
his reports about Kwakiutl life (Rohner 1966).
Sam Blowsnake was a Winnebago who wrote the
original manuscript (in Winnebago) that became, in translation, Crashing
Thunder: An Autobiography of a Winnebago Indian (Radin
1926). More recently, Fadwa El Guindi (1986), James Sexton
(1981), and I (Bernard and Salinas 1989), among others, have
helped indigenous people create narratives in their first
languages.
Original texts provide us with rich data --
data that can be turned to again and again through the years as
new insights and new methods of analysis become available. Robert
Lowie's Crow texts and Margaret Mead's hours and hours of cinéma
vérité footage of Balinese dance are clear examples of the value of
original text. Theories come and go but, like the Pentateuch, the
Christian Gospels, the Qur'an, and other holy writ, original texts
remain for continued analysis and exegesis.
If we include all the still and moving images
created in the natural course of events (all the television
sitcoms, for example), and all the sound recordings (all the jazz
and rock and country songs, for example), as well as all the
books and magazines and newspapers, then most of the recoverable
information about human thought and human behavior is
naturally-occurring text. In fact, only the tiniest fraction of
the data on human thought and behavior was ever collected for the
purpose of studying those phenomena. I suppose that if we piled
up all the ethnographies and questionnaires in the world we'd
have a pretty big hill of data. But it would be dwarfed by the
mountain of naturally-occurring texts that are available right
now, many of them in machine-readable form.(2)
One of the things I like best about texts is
that they are as valuable to positivists as they are to
interpretivists. Positivists can tag text and can study
regularities across the tags. This is pretty much what content
analysis (including cross-cultural hypothesis testing) is about.
Interpretivists can study meaning and (among other things) look
for the narrative flourishes that authors use in the (sometimes
successful, sometimes unsuccessful) attempt to make texts
convincing.
Scholars of social change have lots of
longitudinal quantitative data available (the Gallup poll for the
last 50 years, the Bureau of Labor Statistics surveys for the
last couple of decades, baseball statistics for over a hundred
years, to name a few well-studied data sets), but longitudinal
text data are produced naturally all the time. For a window on
American popular culture, take a look at the themes dealt with in
country music and in Superman comics over the years.
Or look at sitcoms and product ads from the
1950s and from the 1990s. Notice the differences in, say, the way
women are portrayed or in the things people think are funny in
different eras. In the 1950s, Lucille Ball created a furor when
she became pregnant and dared to continue making episodes of the I
Love Lucy show. Now think about almost any episode of Seinfeld.
Or scan some of the recent episodes of popular soap operas and
compare them to episodes from 30 years ago. Today's sitcoms and
soaps contain much more sexual innuendo.
How much more? If you were interested in
measuring that, you could code a representative sample of
exemplars (sitcoms, soaps) from the 1950s and another
representative sample from the 1990s, and compare the codes
(content analysis again). Interpretivists, on the other hand,
might be more interested in understanding the meaning across time
of concepts like "flirtation," "deceit,"
"betrayal," "sensuality," and
"love," or the narrative mechanisms by which any of
these concepts is displayed or responded to by various
characters.
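To make the content-analytic version of this concrete, here is a
minimal sketch, in Python, of the kind of comparison involved. The
episode counts are entirely hypothetical, and the chi-square test
merely stands in for whatever statistic suits the actual sampling
design; the sketch assumes each episode has already been coded for
the presence or absence of sexual innuendo.

    # A hedged sketch: compare hypothetical coded samples from two decades.
    # Assumes each episode was coded 1 (innuendo present) or 0 (absent).
    from scipy.stats import chi2_contingency

    # rows: decade; columns: [episodes with innuendo, episodes without]
    observed = [
        [12, 88],   # hypothetical sample of 100 episodes from the 1950s
        [61, 39],   # hypothetical sample of 100 episodes from the 1990s
    ]

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")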
Suppose you ask a hundred women to describe
their last pregnancy and birth, or a hundred labor migrants to
describe their last (or most dangerous, or most memorable)
illegal crossing of the border, or a hundred hunters (in New
Jersey or in the Brazilian Amazon region) to describe their last
(or greatest, or most difficult, or most thrilling) kill. In the
same way that a hundred episodes of soap operas will contain
patterns about culture that are of interest, so will a hundred
texts about pregnancies and hunts and border crossings.
The difficulty, of course, is in the coding of
texts and in finding the patterns. Coding turns qualitative data
(texts) into quantitative data (codes), and those codes can be
just as arbitrary as the codes we make up in the construction of
questionnaires.
When I was in high school, a physics teacher
put a bottle of Coca-Cola on his desk and challenged our class to
come up with interesting ways to describe that bottle. Each day
for weeks that bottle sat on his desk as new physics lessons were
reeled off, and each day new suggestions for describing that
bottle were dropped on the desk on the way out of class.
I don't remember how many descriptors we came
up with, but there were dozens. Some were pretty lame (pour the
contents into a beaker and see if the boiling point was higher or
lower than that of sea water) and some were pretty imaginative
(let's just say that they involved anatomically painful
maneuvers), but the point was to show us that there was no end to
the number of things we could measure (describe) about that Coke
bottle, and the point sank in. I remember it every time I try to
code a text.
Coding is one of the steps in what is often
called "qualitative data analysis," or QDA. Deciding on
themes or codes is an unmitigated qualitative act of analysis in
the conduct of a particular study, guided by intuition and
experience about what is important and what is unimportant. Once
data are coded, statistical treatment is a matter of data
processing, followed by further acts of data analysis.
When it comes right down to it, qualitative
data (text) and quantitative data (numbers) can be analyzed by
quantitative and qualitative methods. In fact, in the phrases
"qualitative data analysis" and "quantitative data
analysis," it is impossible to tell if the adjectives
"qualitative" and "quantitative" modify the
simple noun "data" or the compound noun "data
analysis." It turns out, of course, that both QDA phrases
get used in both ways. Consider the following table:
             |              Data
Analysis     | Qualitative  | Quantitative
-------------+--------------+--------------
Qualitative  |      a       |      b
Quantitative |      c       |      d
Cell a is the qualitative analysis of
qualitative data. Interpretive studies of texts are of this kind.
At the other extreme, studies of the cell d variety
involve, for example, the statistical analysis of questionnaire
data, as well as more mathematical kinds of analysis.
Cell b is the qualitative analysis of
quantitative data. It's the search for, and the presentation of,
meaning in the results of quantitative data processing. It's what
quantitative analysts do after they get through doing the work in
cell d. Without the work in cell b, cell d
studies are puerile.
Which leaves cell c, the quantitative
analysis of qualitative data. This involves turning the data from
words or images into numbers. Scholars in communications, for
example, might tag a set of television ads from Mexico and the
U.S. in order to test whether consumers are portrayed as older in
one country than in the other. Political scientists might code
the rhetoric of a presidential debate to look for patterns and
predictors. Archeologists might code a set of artifacts to
produce emergent categories or styles, or to test whether some
intrusive artifacts can be traced to a source. Cultural
anthropologists might test hypotheses across cultures by coding
data from the million pages of ethnography in the Human Relations
Area Files and then doing a statistical analysis on the set of
codes.
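By way of illustration only, here is a toy sketch of that first
step, turning text into codes. The codebook, the keywords, and the
two ad texts below are all invented; real content analysis relies on
trained coders and a much richer codebook, but the principle of
assigning theme labels to units of text is the same.

    # A toy, hypothetical codebook: themes and keywords that stand for them.
    codebook = {
        "flirtation": ["wink", "flirt", "tease"],
        "deceit":     ["lie", "lied", "secret", "cheat"],
    }

    # Two invented text units (say, transcribed TV ads).
    texts = {
        "ad_01": "She winked at him and teased him about the secret.",
        "ad_02": "He lied about where he had been.",
    }

    # Tag each unit with every theme whose keywords appear in it.
    codes = {
        unit: [theme for theme, words in codebook.items()
               if any(w in text.lower() for w in words)]
        for unit, text in texts.items()
    }
    print(codes)  # {'ad_01': ['flirtation', 'deceit'], 'ad_02': ['deceit']}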
Strictly speaking, then, there is no such thing
as a quantitative analysis of qualitative data. The qualitative
data (artifacts, speeches, ethnographies, TV ads) have to be
turned first into a matrix, where the rows are units of analysis
(artifacts, speeches, cultures, TV ads), the columns are
variables, and the cells are values for each unit of analysis on
each variable.
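Under that description, a coded data set is just a small rectangular
table. The sketch below shows what such a matrix might look like for
four hypothetical sitcom episodes; it happens to use the pandas
library, but a spreadsheet or a plain array would serve just as well.

    # A hypothetical unit-of-analysis-by-variable matrix: rows are episodes,
    # columns are variables, cells are the values assigned by a coder.
    import pandas as pd

    coded = pd.DataFrame(
        {
            "decade":      ["1950s", "1950s", "1990s", "1990s"],
            "innuendo":    [0, 1, 1, 1],   # 1 = theme judged present
            "deceit":      [1, 0, 1, 0],
            "female_lead": [0, 0, 1, 1],
        },
        index=["episode_01", "episode_02", "episode_03", "episode_04"],
    )

    # Once the matrix exists, counting and cross-tabulating are routine.
    print(coded.groupby("decade")["innuendo"].mean())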
On the other hand, the idea of a qualitative
analysis of qualitative data is not so clear-cut, either. It's
tempting to think that qualitative analysis of text (analysis of
text without any recourse to coding and counting) keeps you
somehow "close to the data." I've heard a lot of this
kind of talk, especially on e-mail lists about working with
qualitative data.
Now, when you do a qualitative analysis of a
text, you interpret it. You focus on and name themes and tell the
story, as you see it, of how the themes got into the text in the
first place (perhaps by telling your audience something about the
speaker whose text you're analyzing). You talk about how the
themes are related to one another. You may deconstruct the text,
look for hidden subtexts, and in general try to let your audience
know the deeper meaning or the multiple meanings of the text.
In any event, you have to talk about
the text and this means you have to produce labels for themes and
labels for articulations between themes. All this gets you away
from the text, just as surely as numerical coding does.
Quantitative analysis involves reducing people (as observed
directly or through their texts) to numbers, while qualitative
analysis involves reducing people to words -- and your words, at
that.
I don't want to belabor this, and I certainly
don't want to judge whether one reduction is better or worse than
the other. It seems to me that scholars today have at their
disposal a tremendous set of tools for collecting, parsing,
deconstructing, analyzing, and understanding the meaning of data
about human thought and human behavior. Different methods for
doing these things lead us to different answers, insights,
conclusions and, in the case of policy issues, actions. Those
actions have consequences, irrespective of whether our input
comes from the analysis of numbers or of words.
1. This was written while I was at the University of Cologne (July 1994-July 1995). I thank the Alexander von Humboldt Foundation, the Institut für Völkerkunde at the University of Cologne, and the College of Arts and Sciences, University of Florida, for support during this time.
2. The Human Relations Area Files (HRAF) consists of about one million pages of text on about 550 societies around the world. All the data on a 60-culture sample from that database are now available on CD-ROM. HRAF plans to convert the entire million-page corpus of text to machine-readable form over the next few years. The Center for Electronic Texts in the Humanities at Rutgers University is bringing together hundreds of machine-readable corpora (the Bible, all of Shakespeare's work, all the ancient Greek and Latin plays and epics). Lexis has placed the entire corpus of Supreme Court opinions on line. The list goes on and on. The conversion of text corpora to on-line databases proceeds at a breathtaking pace.