Gender differences in language use

Newman et al. (2009)

http://emanjavacas.github.io/slides/

1 General research question

-> To what extent do women and men differ in their language use?

-> Try to analyze gender differences across multiple linguistic categories and multiple contexts

1.1 Spot the difference!

female_raw.jpg

male_raw.jpg

1.2 What we now know: previous findings (since the 80s)

1.2.1 Turn-taking conversation

(Mulac, Bradac, Gibbons 1988)

  • Women used more questions (“Does anyone want to get some food?”)
  • Men used more directives (“Let’s go get some food”)
  • Women used longer sentences
  • Men took more and longer turns

1.2.2 Specific phrase-level features of female language

(Robin Lakoff 1975)

  • hedges and tag questions (“it seems like”, “…, aren’t you?”)
  • Uncertainty verb phrases (“I wonder if…”)

1.2.3 Corpus-based research

(Biber, Conrad, Reppen 1998)

  • Females

– more intensive adverbs

– more conjunctions (“but”)

– more modal and auxiliary verbs (“could”)

– more first person plural

  • Males

– longer words

– more articles

– more references to locations

1.3 Shortcomings

1.3.1 Non-replicability, contradictory reports

  • Thomson and Murachver (2001), e-mail communication:

– men and women were equally likely to ask questions

  • Mulac, Seibold, & Farris (2000), role play:

– men used more negations and asked more questions

– women posed more directives

  • Explanations:

– studies do not take context into account

– coding schemes are not always consistent across studies

“One researcher’s uncertainty verb phrase is another’s hedge”

1.3.2 Non-significant results (due to the small effect sizes)

Analysis of the correlation between gender and sentence length:

-> significant (p < .05), with an effect size (d) of .12

-> sample of over 2,000 people

-> mean words-per-sentence was 23.4 for men and 19.1 for women

-> with a standard deviation of 35.1

Pennebaker and Stone (2003)

sentence_length_200.jpg
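
The reported effect size can be checked by hand from the sentence-length figures above. The following is a minimal sketch, assuming that the standard deviation of 35.1 is the pooled one (the slide does not say) and that d is computed as the difference of means divided by that value. The function name `cohens_d` is only illustrative.

```python
def cohens_d(mean_a: float, mean_b: float, pooled_sd: float) -> float:
    """Standardized mean difference: (mean_a - mean_b) / pooled_sd."""
    if pooled_sd <= 0:
        raise ValueError("pooled_sd must be positive")
    return (mean_a - mean_b) / pooled_sd


# Mean words per sentence: 23.4 (men) vs. 19.1 (women), SD = 35.1.
# Assumption: 35.1 is treated here as the pooled standard deviation.
d_sentence_length = cohens_d(23.4, 19.1, 35.1)
print(f"d = {d_sentence_length:.2f}")  # d = 0.12
```

A gap of more than four words per sentence therefore shrinks to roughly an eighth of a standard deviation, which is why a sample of over 2,000 people is needed before it reaches significance. The sign of d only reflects the order in which the two means are passed in.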

2 Methodology

2.1 Corpus

2.1.1 Description

-> 11,609 participants

-> approximately 45,700,000 words

-> 14,324 final text files, with 5,971 written by men and 8,353 written by women

2.1.2 Taking context into account

registers.jpg

2.2 LIWC (Linguistic Inquiry and Word Count)

-> Used as a text preprocessing step to extract the categories (or features) that might be relevant for gender differentiation

–> LIWC defines a total of 74 categories

–> LIWC takes into account both function words and content words

–> but also more abstract entities like POS.

–> 54 language dimensions.

2.2.1 Snapshot of the official documentation

Click here –> Link to the categories

Link to the LIWC website

2.2.2 Example output of LIWC

female_raw_coloured.png male_raw_coloured.png

                         Women   Men   Example
Linguistic Processes
Total Function Words        37    26
Adverbs                      5     3   “very”, “really”
Article                      2     2   “a”, “the”
Prepositions                 3    10   “to”, “with”
Personal Pronoun            11     7   “us”, “them”
First Pers. Pronoun          8     7   “I”
Total Verbs                 10     3
Present tense verbs          5     3   “is”, “does”
Auxiliary Verbs              7     3   “am”, “will”
Psychological Processes
Affective processes          3     1   “cried”, “abandon”
Positive emotions            2     1   “love”, “nice”
Social processes             8     1   “mate”, “talk”
Cognitive processes          9    10   “cause”, “know”
Relativity                   4     3   “exit”, “area”
Total                      134    96
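
Counts like the ones above come from matching each word of a text against per-category word lists. The sketch below illustrates that idea only. `TOY_LEXICON`, `count_categories` and `category_rates` are hypothetical names, and the word lists contain just the example words from the table, not the real (much larger) LIWC dictionary. Reporting rates per 100 words is an assumption made here so that texts of different lengths stay comparable.

```python
import re
from collections import Counter

# Hypothetical toy lexicon built from the example words in the table above.
TOY_LEXICON = {
    "adverbs": {"very", "really"},
    "articles": {"a", "the"},
    "prepositions": {"to", "with"},
    "first_person": {"i"},
    "positive_emotion": {"love", "nice"},
}


def count_categories(text, lexicon=TOY_LEXICON):
    """Return (per-category hit counts, total number of tokens)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    return counts, len(tokens)


def category_rates(text, lexicon=TOY_LEXICON):
    """Category hits per 100 tokens (0.0 for every category if the text is empty)."""
    counts, total = count_categories(text, lexicon)
    if total == 0:
        return {category: 0.0 for category in lexicon}
    return {category: 100.0 * counts[category] / total for category in lexicon}


sample = "I really love the walk to the park with a friend"
counts, total = count_categories(sample)
print(total, dict(counts))
```

A word may belong to several categories at once, so the per-category counts do not have to add up to the number of tokens.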

3 Results

3.1 Multivariate analysis

-> MANOVA (Multivariate Analysis Of VAriance)

–> F(53; 14,270) = 30.66 –> p < 0.05

Friendly explanation of MANOVA

3.2 Summary of interesting results

Females               Males          Null effect      Effect size (d)
pronouns                                              .36
present-tense verbs                                   .18
                      word length                     .24
                      numbers                         .15
                      articles                        .24
                      prepositions                    .17
social words                                          .21
positive feelings                                     .15
anxiety                                               .16
                      swear words                     .22
                      occupation                      .12
                                     word count
                                     question marks
                                     sex

3.3 Function words are better discriminants

-> comparison of the average effect size for the function-word categories with the average effect size for the content-word categories:

– Content words: verbs (feeling, hearing, insight…) => d = .10

– Content words: nouns (friends, family, occupation, money, metaphysical…) => d = .11

– Function words (articles, prepositions, pronouns…) => d = .20

3.4 Taking context into account (Crossover)

3.4.1 Crossover effects

-> When single registers are considered, effect sizes can change their sign (positive d: more frequent among women; negative d: more frequent among men)

               Overall  Emotion  Time M.    SoC  Fiction    TAT  Exams  Conversation
pronouns           .36      .25      .20    .33      .79    .30    .46           .03
present verbs      .18      .14      .04    .17      .28    .09    .41          -.21
word length       -.24     -.16     -.04   -.09     -.32   -.17   -.45          -.44
numbers           -.15     -.09     -.15   -.13     -.37   -.10   -.09           .14
articles          -.24     -.21     -.07   -.33     -.70   -.22   -.05          -.77
prepositions      -.17     -.11     -.09   -.12     -.26   -.09   -.11          -.74
anxiety            .16      .13      .05   -.09      .04    .12   -.30           .16
swear words       -.22     -.14     -.10   -.24     -.28   -.12   -.06          -.43
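
A crossover can be read off the table mechanically: a register shows one when its d has the opposite sign of the overall d. The sketch below is a rough illustration with two assumptions. First, the unlabeled first value column is taken to be the overall effect size, since it matches the values in 3.2. Second, the "14" for numbers in Conversation is read as .14. The names `EFFECT_SIZES` and `crossovers` are illustrative only.

```python
REGISTERS = ["Emotion", "Time M.", "SoC", "Fiction", "TAT", "Exams", "Conversation"]

# category: (overall d, [d per register, in REGISTERS order])
# Assumption: the first value column of the table is the overall effect size.
EFFECT_SIZES = {
    "pronouns":      (.36,  [.25, .20, .33, .79, .30, .46, .03]),
    "present verbs": (.18,  [.14, .04, .17, .28, .09, .41, -.21]),
    "word length":   (-.24, [-.16, -.04, -.09, -.32, -.17, -.45, -.44]),
    "numbers":       (-.15, [-.09, -.15, -.13, -.37, -.10, -.09, .14]),  # "14" read as .14
    "articles":      (-.24, [-.21, -.07, -.33, -.70, -.22, -.05, -.77]),
    "prepositions":  (-.17, [-.11, -.09, -.12, -.26, -.09, -.11, -.74]),
    "anxiety":       (.16,  [.13, .05, -.09, .04, .12, -.30, .16]),
    "swear words":   (-.22, [-.14, -.10, -.24, -.28, -.12, -.06, -.43]),
}


def crossovers(effect_sizes, registers):
    """Map each category to the registers whose d has the opposite sign of the overall d."""
    flips = {}
    for category, (overall, per_register) in effect_sizes.items():
        flipped = [name for name, d in zip(registers, per_register) if d * overall < 0]
        if flipped:
            flips[category] = flipped
    return flips


for category, registers in crossovers(EFFECT_SIZES, REGISTERS).items():
    print(f"{category}: {', '.join(registers)}")
```

Under these assumptions only present verbs, numbers and anxiety change sign in some register; the other categories keep the same direction everywhere, even though their magnitude varies considerably.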

3.4.2 Discriminant contexts/registers

Register Effect size (d)
Fiction .31
Conversation .26
Exams .22
SoC .11
TAT .09
Emotion .08
Time Management .08

4 Conclusions

-> Small and subtle differences but consistent and systematic

-> Function words are found to be the best discriminators

-> Mainly, the following variables: word length, articles, swear words, social words and pronouns

-> Further work is needed (the article was published in 2009!) in revealing the “why”

4.1 Discussion question

The phenomenon analyzed consists of an entire range of variables that:

-> act together, holistically (as opposed to the markers analyzed before)

-> are rather abstract and non-salient (frequencies of function words, etc…)

-> to what extent can this be considered relevant to the active construction of identity?

-> could we explain this as the result of an unconsciously inherited language style?

5 Bibliography

Check the paper!

“Gender differences in language use” Newman et al. (2009)