Analyzing words in brief descriptions:
Fathers and mothers describe their children

 

Gery Ryan and Thomas Weisner
Dept. of Psychiatry and Biobehavioral Sciences
UCLA, Los Angeles, CA 90095-1759
gryan@ucla.edu, tweisner@ucla.edu



Introduction

How much can we learn from a simple word analysis of qualitative data? Judging from the literature on content analysis (Krippendorff 1980, Weber 1990) and recent articles in CAM by Jehn & Doucet (1996) and Schnegg & Bernard (1996), the answer is "a lot." Here we extend what can be done with words by examining parents' descriptions of their adolescents. We ask two questions. First, what do the words parents use in their descriptions tell us about the goals they have for their children? Second, what do differences and similarities in word use tell us about the differences and similarities in informants' perceptions of their children?

We rely on standard word processing programs and other readily available software. No special formatting or coding of the data is required. The methods we describe are useful for discovering patterns in any body of text, whether fieldnotes or responses to open-ended questions, and are particularly helpful when used along with ethnographic data and with other sources of information. Word analysis can tell us about salience, patterning, and context of words, and the relationships between words, but word analysis cannot produce a holistic interpretation of cultural data.

Systematic Descriptions of Children

We know that parental perceptions of adolescents and children vary across cultures. When Super and Harkness (1986) asked Kipsigis parents in western Kenya to describe boys, the descriptions included the terms "warrior" and "fierce." When Raghavan (1993) asked South Asian parents living in the United States about their daughters, the descriptions included "hospitable" and "responsible." Such words would strike most American parents as unusual or odd. Instead, American parents are more likely to use such terms as "athletic," "independent," "argumentative," and "well-rounded" -- terms which would seem odd to most Kipsigis or South Asians.

Seeking to understand parental perceptions and attitudes toward their adolescent children, we asked parents of adolescents in the United States to write a short description of their sons or daughters. This was a relatively easy, comfortable task for most of the people whom we interviewed. Our informants are all participants in the Family Lifestyles Project (FLS) -- a 20-year longitudinal study of nonconventional and countercultural families and their children. [See Eiduson & Weisner (1978), Weisner (1986), Weisner & Garnier (1992), and Weisner et al. (1994) for reviews and key findings from the project].

In 1974 and 1975, investigators contacted 200 mothers during their third trimester of pregnancy. Mothers were involved in conventional and nonconventional living arrangements. Nonconventional arrangements included single mothers, social contract couples (not legally married), and mothers in communes or group living situations. Members of the research team have followed the mothers, their mates, and their children ever since.

Attrition has been remarkably low. In 1992-1994, the FLS researchers conducted a follow-up study of the adolescent children and reached 100% of the mothers, 98% of the teenagers, and 48% of the fathers or other mates. The central question of the adolescent follow-up was: How did these "children of the children of the 60's" turn out? Investigators asked parents about their child's performance in school, personal relationships, political attitudes, gender identity, drug use, and other characteristics.

As part of a larger questionnaire, parents were asked, "What is your teenager like now? Does she or he have any special qualities or abilities?" Parents wrote their answers in short phrases. It is on these data that we focus now. How did parents describe their children? Did mothers and fathers describe their children differently?

In 82 of the 200 families interviewed, both a male and female parent independently described their child. Nearly all the descriptions came from the biological parents. In three cases, the male parent was a step-father, and in one case the female parent was a step-mother. Since we are interested in how parents raise their children, we treat biological and step-parents as equal in our analysis. We thus have two descriptions for each of 82 children, one from the mother and the other from the father, for a total of 164 descriptions.

Each of the 82 children is different (some are more artistic, social, academic, or temperamental than others), but we can make comparisons across children because: a) we were systematic in how we asked parents to describe their experiences (we asked exactly the same question each time); b) each pair of parents described the same child; and c) we have the same number of descriptions (82) in each file. Of course, the child is not "the same" to each parent. Children respond to and identify with each parent differently, as parents do with each child. So when parents are asked to describe their child, they are not reacting to exactly the same stimuli, but rather to a comparable family situation that has different meanings to each family member.

Handling the data

We transcribed the parents' verbatim answers into a word processor (in this case, WordPerfect 6.0). For each answer, we typed in the family identification number, the type of family, the sex of the child being described, the sex of the parent who gave the description, and the complete description. Each description was followed by a single hard return. Figure 1 shows the first three descriptions in our master file (MASTER.WP).

To facilitate analysis, we separated each unique phrase/descriptor by a period and a space. The period/space combination has two advantages. First, a period indicates the end of a sentence, and we can then use the word processor or style checker to count the number of sentences in a document (Harris 1996). Second, we can use the period as a delimiter for importing the text data into a spreadsheet or a database (like Excel or Quattro Pro).





ID009. F1030. Boy. Fthr. Loving. Obedient. Maintains own identity. Likes being home. Independent. Anxious to go to California to school.

ID016. F1130. Boy. Fthr. Smart. Energetic. Arrogant. Dependent. Slick. Passive. Lack of imagination. Attraction to inner-city lifestyle.

ID124. F1130. Girl. Mthr. Great kid. Willing to communicate with parents. Listens. Motivated in school. Helpful around the house. Healthy. Active. Lots of friends. She tends to play it safe.

Figure 1. Examples of master file of parents' descriptions of their children
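The period-and-space convention also makes each record easy to split mechanically outside a word processor. The following is a minimal sketch in Python, not part of the WordPerfect workflow described here; the function name `parse_record` and the dictionary keys are hypothetical choices made for illustration. It assumes that no descriptor contains a period followed by a space.

```python
def parse_record(line):
    """Split one master-file paragraph into four header fields and its descriptors."""
    # Fields are delimited by a period followed by a space; the final
    # descriptor ends with a bare period, which is stripped first.
    parts = [p.strip() for p in line.strip().rstrip(".").split(". ")]
    parts = [p for p in parts if p]
    family_id, family_type, child_sex, parent_sex = parts[:4]
    return {
        "id": family_id,
        "family_type": family_type,
        "child_sex": child_sex,
        "parent_sex": parent_sex,
        "descriptors": parts[4:],
    }


record = parse_record(
    "ID009. F1030. Boy. Fthr. Loving. Obedient. Maintains own identity. "
    "Likes being home. Independent. Anxious to go to California to school."
)
print(record["parent_sex"], len(record["descriptors"]))
```

Keeping the header fields in a fixed order is what makes this work; a record with a missing field would be misread, so a real script would want to validate each line.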

Once we had our master file of descriptions, we sorted the descriptions by parent's sex. Since we consistently made parent's sex the fourth word of the paragraph, we can do this with our word processor. Select all your text, and tell the word processor to use the fourth word to sort the highlighted paragraphs.(1) (Before sorting, back up your file.)

We then copied mothers' and fathers' responses to separate files (MOTHER.WP and FATHER.WP). At this point we were only interested in the descriptors, so we stripped out the extraneous information in each file. This is easily semi-automated with a macro that goes to the beginning of each paragraph and deletes the first four words (ID, family type, child's sex, and parent's sex). Our two stripped files contained only the verbatim descriptions provided by mothers and fathers.
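The sort-and-strip step can also be sketched in a few lines of Python. This is an illustration under assumptions rather than the macro we used: the function name `split_by_parent` is hypothetical, and the sample lines are shortened versions of the records in Figure 1.

```python
from collections import defaultdict


def split_by_parent(lines):
    """Group descriptions by the fourth field (parent's sex), dropping the header fields."""
    groups = defaultdict(list)
    for line in lines:
        # At most four splits: four header fields plus the rest of the paragraph.
        fields = line.strip().split(". ", 4)
        if len(fields) < 5:
            continue  # skip blank or malformed paragraphs
        parent_sex, description = fields[3], fields[4]
        groups[parent_sex].append(description)
    return dict(groups)


master = [
    "ID009. F1030. Boy. Fthr. Loving. Obedient. Maintains own identity.",
    "ID016. F1130. Boy. Fthr. Smart. Energetic. Arrogant.",
    "ID124. F1130. Girl. Mthr. Great kid. Listens. Healthy.",
]
by_parent = split_by_parent(master)
for sex, descriptions in sorted(by_parent.items()):
    print(sex, len(descriptions))
```

Each list in `by_parent` plays the role of one stripped file: it holds only the verbatim descriptions, in their original order.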

Simple tricks you can do with a word processor

We used WordPerfect's document information function to calculate some general statistics.(2) Document information is located under File on the top menu. Among other things, it calculates the number of characters, words and sentences, plus the average word length, the average number of words per sentence, and the maximum words per sentence. Table 1 compares these statistics for mothers' and fathers' responses.

                              Mothers   Fathers   Total
Characters                       9748      7625   17373
Word Count                       1692      1346    3038
Sentence Count                    528       411     939
Average Word Length              5.76      5.66    5.72
Average Words per Sentence       3.20      3.27    3.24
Maximum Words per Sentence         14        17      17

Table 1. Text statistics generated from WordPerfect 6.0
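Statistics of this kind are straightforward to approximate in code. The sketch below is an assumption-laden illustration, not a reimplementation of WordPerfect's document information function: the name `text_stats` and the definitions of "word" and "sentence" are our own guesses, so its counts will not necessarily match those of any word processor.

```python
import re

# Assumed definitions: a word is a run of letters or digits, optionally joined
# by internal apostrophes or hyphens; a sentence is any non-empty run of text
# that ends at a period.
WORD = re.compile(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*")


def text_stats(text):
    """Return basic text statistics for a non-empty block of descriptions."""
    sentences = [s for s in re.split(r"\.\s*", text) if s.strip()]
    words_per_sentence = [len(WORD.findall(s)) for s in sentences]
    words = WORD.findall(text)
    characters = sum(len(w) for w in words)  # characters inside words only
    return {
        "characters": characters,
        "words": len(words),
        "sentences": len(sentences),
        "avg_word_length": round(characters / len(words), 2),
        "avg_words_per_sentence": round(len(words) / len(sentences), 2),
        "max_words_per_sentence": max(words_per_sentence),
    }


sample = "Great kid. Willing to communicate with parents. Listens. Motivated in school."
stats = text_stats(sample)
for name, value in stats.items():
    print(name, value)
```

Counting only the characters inside words is a choice suggested by Table 1, where the character count divided by the word count gives the average word length.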



These simple statistics tell us that:

1) Mothers use more words to describe their children than do fathers. Of all the words used to describe the 82 children, 56% come from mothers and 44% come from fathers.

2) On average, mothers used 28% more sentences than did fathers. [Mothers used 528/82=6.4 phrases to describe their children, while fathers used 411/82=5.0 phrases. Mothers and fathers use the same number of words per phrase, but mothers said more things about their children.]

3) Mothers and fathers use roughly the same size words, about 5.7 characters each.
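The arithmetic behind these three observations can be checked directly from the counts in Table 1. The short Python sketch below uses only those published counts; the variable names are illustrative.

```python
children = 82
words = {"mothers": 1692, "fathers": 1346}
sentences = {"mothers": 528, "fathers": 411}

total_words = sum(words.values())

# Share of all words contributed by each group, in whole percent.
word_share = {k: round(100 * v / total_words) for k, v in words.items()}

# Average number of phrases (sentences) written per child.
phrases_per_child = {k: round(v / children, 1) for k, v in sentences.items()}

# How many more sentences mothers wrote than fathers, in whole percent.
extra_sentences = round(100 * (sentences["mothers"] / sentences["fathers"] - 1))

# Average number of words per phrase.
words_per_phrase = {k: round(words[k] / sentences[k], 2) for k in words}

print(word_share, phrases_per_child, extra_sentences, words_per_phrase)
```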

Fathers and mothers are more similar in this sample than they are different. Mothers use more words, but not very much more, and on other measures, fathers and mothers are about equal. Clearly, parents used the same "standard social science questionnaire schema" to answer our questions -- writing a series of terse phrases and words for a minute or so.

Learning from unique word lists

We next examine whether mothers and fathers use different words to describe their children. WORDS 2.0 (Johnson 1995) is a useful program that counts the number of running words in a text, identifies the number of unique word forms, and lists the number of occurrences of each unique form.(3) (See Bernard 1995 for a review of WORDS 2.0.) Other programs, such as CATPAC, also count the frequency of unique words. (See Doerfel and Barnett 1996 for a review of CATPAC.)

To get the files ready for WORDS 2.0, we first saved our WordPerfect files (MOTHER.WP and FATHER.WP) in ASCII format (calling them MOTHER.ASC and FATHER.ASC so as not to overwrite the original files). When we analyzed each file, we used WORDS 2.0's "common word list" to exclude 125 of the most-used English terms. Figure 2 shows a portion of the two outputs. Each output tells us how many words each file contained originally,(4) how many unique words were found (including unique common words), and how many words were removed when we eliminated the common ones. WORDS 2.0 outputs the list of unique words with their respective frequency of occurrence. We indicate the rank order of each word under the # sign. (You can do this in your word processor by turning on the line numbering option.)(5)

Figure 2 shows that the MOTHER file contained a total of 1,721 words in 734 unique word forms. It contained 542 instances of the 125 common words that were eliminated from further consideration. In the end, there were 666 unique words in the file and mothers mentioned the words good, friends, loving, out, and people at least 11 times. The last word on the mothers' list, zest, was mentioned only once.
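For readers without access to a dedicated program, the same kind of count can be sketched in Python. This is not how WORDS 2.0 works internally, which we do not know; the tokenizing rule is an assumption, and `COMMON_WORDS` is a short hypothetical stand-in for the 125-word common word list, whose contents are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical stand-in for a common-word list; NOT the list shipped with WORDS 2.0.
COMMON_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def unique_word_counts(text, common=COMMON_WORDS):
    """Return (running words, unique word forms, counts of the non-common forms)."""
    # Assumed rule: lowercase everything; keep internal apostrophes and hyphens.
    tokens = re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())
    all_forms = Counter(tokens)
    kept = Counter({w: n for w, n in all_forms.items() if w not in common})
    return len(tokens), len(all_forms), kept


text = "Good student. Good sense of humor. Loving. Lots of friends. Good with people."
running, unique_forms, kept = unique_word_counts(text)
print(running, unique_forms, len(kept))
for word, count in kept.most_common(3):
    print(count, word)
```

As in the WORDS 2.0 output, the count of unique forms includes the common words, while the frequency list excludes them.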

We can think of unique word lists as concentrated data or, as Tesch (1990:138-139) called them, distillations. We can produce different measures of concentration and we can compare those measures across the MOTHER and FATHER files. With 734 unique words in a corpus of 1721 words, mothers have a type-token rate, or concentration rate, of 57% (1-734/1721). Fathers have a concentration rate of 55% (1-607/1355). If we use only the 666 unique substantive words (eliminating all occurrences of words in the common-word file), then the concentration rate for mothers is 1-666/1721=61% and for fathers 1-548/1355=60%. Just 207 of the 666 substantive words occur more than once in the MOTHER file. This produces a concentration rate of 1-207/1721=88%, identical to the rate (1-159/1355=88%) for fathers.
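The three concentration rates use the same formula with a different numerator each time. A short Python sketch, using only the counts reported above (the function and dictionary key names are illustrative), makes the calculation explicit:

```python
def concentration_rate(word_forms, running_words):
    """Concentration rate as used in the text: 1 - (word forms / running words)."""
    return 1 - word_forms / running_words


figures = {
    "mothers": {"running": 1721, "unique": 734, "substantive": 666, "repeated": 207},
    "fathers": {"running": 1355, "unique": 607, "substantive": 548, "repeated": 159},
}

rates = {
    parent: {
        kind: concentration_rate(counts[kind], counts["running"])
        for kind in ("unique", "substantive", "repeated")
    }
    for parent, counts in figures.items()
}

for parent, by_kind in rates.items():
    print(parent, {kind: f"{rate:.0%}" for kind, rate in by_kind.items()})
```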

We lose a lot of information when we examine unique words. We do not know the context in which the words occurred, nor whether informants used words negatively or positively. Nor do we know how the words related to each other. But distillations like these introduce very little investigator bias (we do have to choose what words to leave out of the analysis), and they can help us identify constructs used by parents to describe their children.

The word lists suggest things about parents' values and goals for their children and the lists can be compared across fathers and mothers. For example, from Table 1 we do not know if fathers have less to say about their children or they just have less to say about all topics. From Figure 2, however, we see that men's vocabulary for describing children is as rich as women's vocabulary. (The ratio of unique words to total words is roughly equivalent for men (607/1355=.45) and for women (734/1721=.43)).

Figure 2 allows us to make crude comparisons between men's and women's use of different words. (The measures are crude because they represent rank order data and do not take into consideration the total number of words used by each group.) Both mothers and fathers use the word good a lot more than any other word. Women, for example, use good almost twice as often as friends, their second most popular word. Antonyms of good are not prevalent in the word lists, suggesting that people may tend to be optimistic in describing their children, may have a response bias toward positive words on questionnaires, and may be accessing a cultural model for describing one's child that emphasizes positive, growing cultural careers.

Figure 2 also suggests that men and women focus on different characteristics of their children. A comparison of the most-frequently-used words shows that friends, loving, people, and responsible are ranked higher for women than they are for men. In contrast school, hard, intelligent, bright, and independent are ranked higher for men than for women. This suggests that mothers, on first mention, express concern over interpersonal issues, while men appear to give priority to achievement-oriented and individualistic issues.

The rank ordering of word frequencies, however, is somewhat deceptive since it does not take into consideration the total number of words mentioned by men and women. We can, however, standardize the word frequencies according to what we expect to find if men and women used the same number of words. Table 2 shows the results of such a process.

Table 2. Word frequencies sorted by standardized frequency difference in gender

TERM          Both  Mother  Father  Expected Father  Standardized difference

school          26      10      16            20.32                   -10.32
good            45      22      23            29.21                    -7.21
lack             9       2       7             8.89                    -6.89
student          9       2       7             8.89                    -6.89
enjoys           6       1       5             6.35                    -5.35
independent     13       5       8            10.16                    -5.16
extremely        4       0       4             5.08                    -5.08
like             4       0       4             5.08                    -5.08
ability          7       2       5             6.35                    -4.35
own              7       2       5             6.35                    -4.35
wants            7       2       5             6.35                    -4.35
high             5       1       4             5.08                    -4.08
interested       5       1       4             5.08                    -4.08

great           11       6       5             6.35                    -0.35
mature          11       6       5             6.35                    -0.35
humor            9       5       4             5.08                    -0.08
times            9       5       4             5.08                    -0.08
attitude         7       4       3             3.81                     0.19
caring          14       8       6             7.62                     0.38

adult            4       4       0                0                        4
average          4       4       0                0                        4
difficulty       4       4       0                0                        4
goes             4       4       0                0                        4
kid              4       4       0                0                        4
lots             4       4       0                0                        4
respect          4       4       0                0                        4
talented         4       4       0                0                        4
uses             4       4       0                0                        4
honest           9       7       2             2.54                     4.46
time             9       7       2             2.54                     4.46
creative         6       6       0                0                        6
friends         16      12       4             5.08                     6.92



Figure 2. Counts of words used five or more times by mothers and fathers

Mothers' Descriptions                           Fathers' Descriptions

File MOTHER.ASC:                                File FATHER.ASC:
Total number of running words in file: 1,721    Total number of running words in file: 1,355
Number of unique word forms in file: 734        Number of unique word forms in file: 607
The following counts exclude 542 occurrences    The following counts exclude 419 occurrences
of 125 common word forms.                       of 125 common word forms.

  #  Count  Word                                  #  Count  Word
  1     22  good                                  1     23  good
  2     12  friends                               2     16  school
  3     11  loving                                3     11  hard
  4     11  out                                   4      9  intelligent
  5     11  people                                5      8  bright
  6     10  doesn't                               6      8  independent
  7     10  hard                                  7      8  out
  8     10  school                                8      8  well
  9      9  responsible                           9      7  doesn't
 10      9  sense                                10      7  lack
 11      8  caring                               11      7  loving
 12      8  intelligent                          12      7  people
 13      8  lacks                                13      7  sensitive
 14      8  sensitive                            14      7  sports
 15      7  bright                               15      7  student
 16      7  honest                               16      6  caring
 17      7  others                               17      6  does
 18      7  self                                 18      6  life
 19      7  time                                 19      6  others
 20      7  well                                 20      6  work
 21      7  work                                 21      5  ability
 22      6  creative                             22      5  enjoys
 23      6  does                                 23      5  great
 24      6  great                                24      5  lacks
 25      6  mature                               25      5  likes
 26      6  sports                               26      5  mature
 27      5  academically                         27      5  own
 28      5  artistic                             28      5  sense
 29      5  cares                                29      5  social
 30      5  concerned                            30      5  wants
 31      5  goals
 32      5  going
 33      5  humor
 34      5  independent
 35      5  other
 36      5  social
 37      5  times
...                                             ...
666      1  zest                                548      1  zero

 

 

To create Table 2, we put mothers' and fathers' responses in a single ASCII file and counted the words again. We then selected the 131 words that informants mentioned at least four times. We put these words in the first column of a spreadsheet and put their frequency counts in the second column. In the third and fourth columns, we put the number of times each word was mentioned by women and by men, respectively. Next we calculated the expected word frequencies for men if men had used the same number of words as women. Since women used 1.27 (1721/1355) times as many words as men, we multiplied the observed men's frequency by 1.27. We put the result in the fifth column.

To compare mothers' and fathers' word use, we subtracted the expected men's frequencies in column five from the observed women's frequencies in column three. We put the results in column six. This gave us a more accurate difference in word use between the files. Negative numbers in column six mean that the word was more likely to be used by fathers than by mothers. Positive numbers mean that the word was more likely to be used by mothers. Numbers close to zero mean that there was little difference between the men's and women's descriptions.

Finally, we sorted the rows in the spreadsheet by the values in column six. In Table 2, we show some selected results of this comparison: words whose frequencies varied a lot between men and women as well as some examples of words whose frequencies varied little.
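The spreadsheet steps above translate directly into code. The following Python sketch is one way to reproduce the six columns; it is an illustration rather than the spreadsheet we used, the function name `standardized_rows` is hypothetical, and the four sample words and their counts are taken from Table 2.

```python
# Mothers' running words / fathers' running words (1721/1355), rounded as in the text.
RATIO = 1.27


def standardized_rows(counts, ratio=RATIO):
    """counts maps word -> (mother count, father count); rows come back sorted by column six."""
    rows = []
    for word, (mother, father) in counts.items():
        expected_father = round(father * ratio, 2)        # column five
        difference = round(mother - expected_father, 2)   # column six
        rows.append((word, mother + father, mother, father, expected_father, difference))
    return sorted(rows, key=lambda row: row[5])


counts = {"friends": (12, 4), "caring": (8, 6), "school": (10, 16), "good": (22, 23)}
rows = standardized_rows(counts)
for row in rows:
    print(*row)
```

Sorting on the last column puts the words favored by fathers (negative differences) first and the words favored by mothers (positive differences) last, as in Table 2.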

When we standardize for the number of words used, we find that fathers use the words school, good, lack, student, enjoys, independent, extremely, like, ability, own, wants, high, and interested more than do mothers. On the other hand, mothers use the words friends, creative, time, honest, uses, talented, respect, lots, kid, goes, difficulty, average, and adult more than do fathers. Men and women, however, are equally likely to use the words great, mature, humor, times, attitude, and caring.

Notice the differences between the standardized measures and the rank orders shown in Figure 2. The rank-order data tell us about the relative priority of words within each gender while the standardized data allow us to compare use rates across genders. For instance, the word good was the most-used word for both women and men. When we compare across genders, we find that men tend to use the word more often. In contrast, men and women are equally likely to use the word caring in their descriptions but the simple rank-order for the word is higher for women than it is for men.

Our findings are similar to other research on gender differences. On many measures, men and women, boys and girls show substantial overlap in behavioral tendencies. Although mean or modal differences often are relatively small, specific measures (in our case, emphasis on different concerns in describing teens) are quite constant and are found cross-culturally (Best et al. 1994).

The word counting techniques described here do not require complex and expensive text analysis programs. These simple methods help researchers concentrate often confusing data into a more manageable form, and are relatively bias free. The techniques can be used for exploring central themes and for systematically comparing within and across groups.

Of course, these are just the first univariate, exploratory steps in a more detailed qualitative analysis. We still want to examine the context in which these words occur and how key words are related to each other. For example, how does the sex of the teen as well as the parent influence word use? We also want to explore some of the hypotheses that we have formed in this simple first step. Treating words as units of analysis offers researchers a simple way of exploring text and confirming hypotheses.

References

Endnotes

1. To do this in WordPerfect for Windows 6.1: Select Tools/Sort from the menu. When the menu for sorting appears, tell WordPerfect to sort by paragraph. (Note: WordPerfect assumes that paragraphs are separated by two hard returns). Make sure that the appropriate settings are marked as follows: Type = Alpha, Sort Order = Ascending, Line = 1, Field = 1, and Word = 4. After making the changes, select OK. WordPerfect will put all the fathers' responses on top of the file and all the mothers' responses on the bottom. To do this in Word 6.0: Select Table/Sort Text. Select Options. In the "Separates fields at:" dialogue box, select "Other" and fill in the box with a single period. Select OK. In the "Sort by" dialogue box, select "Field 4." Select OK.

2. Similar statistics can be obtained in Microsoft's Word. Word, however, does not automatically count the number of sentences in a document. To do so, you need to build a macro, as follows:

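Sub MAIN
    REM Start at the top and step through the document one sentence at a time.
    StartOfDocument
    count = 0
    While SentRight(1, 1) <> 0
        REM Count the step unless the selection ends in a paragraph mark.
        If Right$(Selection$(), 1) <> Chr$(13) Then count = count + 1
    Wend
    MsgBox "Number of sentences in document:" + Str$(count)
End Sub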

3. WORDS 2.0 was created by Eric Johnson and is distributed by TEXT Technology, 114 Beadle Hall, Dakota State University, Madison, SD, 57042-1799. Email: langners@columbia.dsu.edu. For information on other programs that Johnson has created, check out the website http://www.dsu.edu/~johnsone/ericpgms.html.

4. The total number of words identified by WordPerfect 6.0 for the MOTHER.WP file (1,692) differs from the total number of words identified by WORDS 2.0 (1,721). For the same file, Word 6.0 counts 1,731 words. Discrepancies occur because each program has a slightly different definition of what counts as a word. In our case, single hyphens (-) are the leading culprits. WordPerfect 6.0 counts the hyphen as a word while WORDS 2.0 and Word 6.0 do not. Since most of our calculations use the total number of words as a fixed denominator and this denominator tends to be quite large, slight increases or decreases have little effect on the overall analysis. Be aware, however, that these differences do exist -- and are rarely documented.
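A small Python sketch shows how two plausible definitions of a "word" give different totals for the same text. Neither rule is claimed to be the one used by WordPerfect, Word, or WORDS 2.0; both are assumptions chosen only to illustrate the effect of a free-standing hyphen, and the sample text is invented.

```python
import re

text = "Well - rounded. Self-motivated. Doesn't give up."

# Rule A (assumed): anything separated by whitespace is a word,
# so a free-standing hyphen is counted.
whitespace_tokens = text.split()

# Rule B (assumed): a word must start and end with a letter,
# so a free-standing hyphen is ignored.
word_tokens = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)

print(len(whitespace_tokens), len(word_tokens))
```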

5. In WordPerfect 6.0 this is found under Format/Line. In Word 6.0 it is located under File/Page Setup/Layout.

 
