Introduction to Crosstabs


You are investigating whether women in a given company are being discriminated against. You look at the salaries of women and find this:

Company 1    Female
Under 50K       400
Over 50K        200
Total           600

At another company, you find this:

Company 2    Female
Under 50K       300
Over 50K        300
Total           600

What would you conclude about the first company? It looks like the first company discriminates while the second doesn't, right? 
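To make that first impression concrete, each count can be turned into a share of the women surveyed. The Python sketch below is only an illustration: the counts are copied from the two tables above, while the dictionary layout and the function name `share_under_50k` are made up for this example.

```python
# Counts of women by salary band, copied from the two tables above.
# The dictionary layout and the names are illustrative assumptions.
women = {
    "Company 1": {"Under 50K": 400, "Over 50K": 200},
    "Company 2": {"Under 50K": 300, "Over 50K": 300},
}

def share_under_50k(counts):
    """Fraction of the women in `counts` who earn under 50K."""
    total = sum(counts.values())
    return counts["Under 50K"] / total

for company, counts in women.items():
    print(f"{company}: {share_under_50k(counts):.1%} of women earn under 50K")
```

This prints 66.7% for Company 1 and 50.0% for Company 2. Note that it says nothing yet about how the men at either company are paid.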

Let's look at it from another point of view. Let's consider only the people who are making good money (over 50K), starting with the first company:

Company 1     Male   Female   Total
Over 50K       100      200     300

Notice that twice as many women as men are making good money. If we look at the second company, we see more or less the same story:

Company 2     Male   Female   Total
high pay       200      300     500

So, in both companies, it looks like more women than men are making good money. But from the first two tables, it appears that in Company 1 most women are making low pay, while in Company 2 it's about half and half.
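The comparison among high earners can also be written as a share and a ratio. This is a minimal Python sketch using the counts from the two high-earner tables; the names `high_earners` and `female_share` are invented for the illustration.

```python
# High earners by sex, copied from the two high-earner tables above.
# Names and layout are illustrative assumptions.
high_earners = {
    "Company 1": {"Male": 100, "Female": 200},
    "Company 2": {"Male": 200, "Female": 300},
}

def female_share(counts):
    """Fraction of the people in `counts` who are female."""
    return counts["Female"] / (counts["Male"] + counts["Female"])

for company, counts in high_earners.items():
    ratio = counts["Female"] / counts["Male"]
    print(f"{company}: {female_share(counts):.1%} of high earners are women "
          f"({ratio:.1f} women per man)")
```

Company 1 comes out at 66.7% (2.0 women per man) and Company 2 at 60.0% (1.5 women per man). These figures describe only the high-pay row, so on their own they cannot say how each sex is split between low and high pay.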

Now let's bring the men into the picture and look at the full table for each company:

Company 1     Male   Female   Total
Under 50K      200      400     600
Over 50K       100      200     300
Total          300      600     900

Is there discrimination at Company 1? The biggest group of people is females who make less than 50K!

Company 2     Male   Female   Total
low pay        100      300     400
high pay       200      300     500
Total          300      600     900

How about at Company 2?
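The row and column totals in both full tables follow from the four inner counts of each company. The Python sketch below rebuilds them with plain dictionaries. The helper name `with_margins` is hypothetical, and "low" and "high" stand for the under-50K and over-50K rows.

```python
# Inner cell counts copied from the two full tables above: table[pay][sex].
# "low" = under 50K / low pay, "high" = over 50K / high pay.
company_1 = {"low": {"Male": 200, "Female": 400},
             "high": {"Male": 100, "Female": 200}}
company_2 = {"low": {"Male": 100, "Female": 300},
             "high": {"Male": 200, "Female": 300}}

def with_margins(table):
    """Return (row_totals, column_totals, grand_total) for a table."""
    row_totals = {pay: sum(cells.values()) for pay, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for sex, n in cells.items():
            col_totals[sex] = col_totals.get(sex, 0) + n
    return row_totals, col_totals, sum(row_totals.values())

for name, table in [("Company 1", company_1), ("Company 2", company_2)]:
    rows, cols, total = with_margins(table)
    print(name, "row totals:", rows, "column totals:", cols, "N =", total)
```

Both companies have 300 men and 600 women, 900 people in all, but the row totals differ: 600 low and 300 high at Company 1, against 400 low and 500 high at Company 2.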

If we convert to column percentages, dividing each count by the total for its column, we get:

Company 1        Male    Female
Under 50K      66.67%    66.67%
Over 50K       33.33%    33.33%
Total         100.00%   100.00%

Company 2        Male    Female
low pay        33.33%    50.00%
high pay       66.67%    50.00%
Total         100.00%   100.00%
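The column percentages in the two tables above can be reproduced by dividing every cell by the total of its column. The following is a small Python sketch of that calculation, not a reference implementation; the function name `column_percentages` and the `table[pay][sex]` layout are assumptions made for the example.

```python
# Cell counts copied from the full frequency tables shown earlier,
# stored as table[pay][sex]. Names and layout are illustrative.
company_1 = {"low": {"Male": 200, "Female": 400},
             "high": {"Male": 100, "Female": 200}}
company_2 = {"low": {"Male": 100, "Female": 300},
             "high": {"Male": 200, "Female": 300}}

def column_percentages(table):
    """Express each cell as a percentage of its column (sex) total."""
    col_totals = {}
    for cells in table.values():
        for sex, n in cells.items():
            col_totals[sex] = col_totals.get(sex, 0) + n
    return {pay: {sex: 100 * n / col_totals[sex] for sex, n in cells.items()}
            for pay, cells in table.items()}

for name, table in [("Company 1", company_1), ("Company 2", company_2)]:
    print(name)
    for pay, cells in column_percentages(table).items():
        print(" ", pay, {sex: f"{p:.2f}%" for sex, p in cells.items()})
```

Because every column is rescaled to 100%, the fact that there are twice as many women as men no longer affects the comparison between the two columns.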

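Now it is clear that Company 2 is the one where women are paid less: half of its women are in the low-pay group, compared with only a third of its men, while at Company 1 the split is identical for both sexes. So it seems like we should always look at the percentages, right?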

Have a closer look at the raw frequencies for Company 2. What if we had sampled three times as many low-paid people as high-paid people? Then the results would have looked like this:

Company 2     Male   Female   Total
low pay        300      900    1200
high pay       200      150     350
Total          500     1050    1550

And the percentages are these:

Company 2        Male    Female
low pay        60.00%    85.71%
high pay       40.00%    14.29%
Total         100.00%   100.00%
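To see how far the column percentages move, the original Company 2 counts and the resampled counts can be put side by side. This Python sketch uses the numbers exactly as they appear in the tables of this handout; the names `original`, `resampled`, and `percent_low_pay` are hypothetical.

```python
# Company 2 counts as table[pay][sex]: the original sample and the
# resampled table shown above, copied as given in the handout.
original = {"low": {"Male": 100, "Female": 300},
            "high": {"Male": 200, "Female": 300}}
resampled = {"low": {"Male": 300, "Female": 900},
             "high": {"Male": 200, "Female": 150}}

def percent_low_pay(table, sex):
    """Percentage of one sex's column that falls in the low-pay row."""
    column_total = sum(row[sex] for row in table.values())
    return 100 * table["low"][sex] / column_total

for sex in ("Male", "Female"):
    before = percent_low_pay(original, sex)
    after = percent_low_pay(resampled, sex)
    print(f"{sex}: {before:.2f}% low pay originally, "
          f"{after:.2f}% in the resampled table")
```

The male figure goes from 33.33% to 60.00% and the female figure from 50.00% to 85.71%, so the column percentages depend on how many people were sampled from each pay group.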

These percentages are different! Somehow, we need to get a handle on the fact that there are different numbers of males and females, AND different numbers of high and low paying jobs. We can't see the pattern in the data because of these different sizes of the groups. A way to deal with this is given in the next handout.