Using Indicator Variables

Consider first a model in which house price depends only on square footage:

– β2 is the value of an additional square foot of living area and β1 is the value of the land alone

How do we account for location, which is a qualitative variable?

– Indicator variables are used to account for qualitative factors in econometric models

– They are often called binary or dichotomous variables, because they take just two values, usually one or zero, to indicate the presence or absence of a characteristic or to indicate whether a condition is true or false

– They are also called dummy variables, to indicate that we are creating a numeric variable for a qualitative, non-numeric characteristic

– We use the terms indicator variable and dummy variable interchangeably

Generally, we define an indicator variable D as:

– So, to account for location, a qualitative variable, we would have:

Adding our indicator variable to our model:

If our model is correctly specified, then:

Adding an indicator variable causes a parallel shift in the relationship by the amount δ

An indicator variable like D that is incorporated into a regression model to capture a shift in the intercept as the result of some qualitative factor is called an intercept indicator variable, or an intercept dummy variable

The least squares estimator’s properties are not affected by the fact that one of the explanatory variables consists only of zeros and ones

– D is treated as any other explanatory variable.

– We can construct an interval estimate for δ, the coefficient on D, or we can test the significance of its least squares estimate
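To illustrate that D enters the regression like any other explanatory variable, here is a minimal sketch in plain Python. The data are hypothetical and noise-free (PRICE = 50 + 20·D + 7.5·SQFT, with SQFT in made-up units), and the helper names `solve` and `ols` are illustrative, not taken from any library. Least squares is computed from the normal equations, with the zero-one column handled exactly like the SQFT column.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical sample: four houses outside and four inside the desirable neighborhood.
sqft = [10.0, 15.0, 20.0, 25.0, 12.0, 18.0, 22.0, 30.0]
d = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]

# Assumed "true" coefficients, chosen only for illustration.
price = [50.0 + 20.0 * di + 7.5 * s for di, s in zip(d, sqft)]

# Design matrix columns: intercept, indicator D, SQFT.
X = [[1.0, di, s] for di, s in zip(d, sqft)]
b1_hat, delta_hat, b2_hat = ols(X, price)
print(b1_hat, delta_hat, b2_hat)
```

Because the data contain no error term, the estimates reproduce the assumed coefficients up to rounding; with real data the same steps apply and δ gets a standard error like any other coefficient.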

FIGURE 7.1 An intercept indicator variable

The value D = 0 defines the reference group, or base group

Either group could serve as the base

For example:

Then our model would be:

Suppose we included both D and LD:

– The variables D and LD are such that D + LD = 1

– Since the intercept variable x1 = 1, we have created a model with exact collinearity

– We have fallen into the dummy variable trap.

– By including only one of the indicator variables, the omitted variable defines the reference group and we avoid the problem
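The exact collinearity can be checked numerically. The sketch below uses a small hypothetical sample and two illustrative helpers (`det3`, `cross_product_matrix`); it shows that D + LD reproduces the intercept column, so X'X for the columns (1, D, LD) is singular, while replacing LD with a size variable gives a nonsingular matrix.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cross_product_matrix(columns):
    """Return X'X for a design matrix supplied as a list of columns."""
    return [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]


# Hypothetical sample of six houses: 1 = desirable neighborhood.
d = [1, 0, 1, 1, 0, 0]
ld = [1 - di for di in d]   # LD is the complement of D
ones = [1] * len(d)         # intercept variable x1 = 1

# D + LD equals the intercept column for every observation.
sums_to_intercept = all(di + li == xi for di, li, xi in zip(d, ld, ones))

# With intercept, D and LD together, X'X is singular (the dummy variable trap).
det_trap = det3(cross_product_matrix([ones, d, ld]))

# Dropping LD and using a hypothetical size variable instead removes the problem.
sqft = [10, 15, 20, 12, 18, 25]
det_ok = det3(cross_product_matrix([ones, d, sqft]))

print(sums_to_intercept, det_trap, det_ok)
```

A zero determinant means the normal equations have no unique solution, which is why software either refuses to estimate the model or drops one of the variables.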

Suppose we specify our model as:

– The new variable (SQFT × D) is the product of house size and the indicator variable

– It is called an interaction variable, as it captures the interaction effect of location and size on house price

– Alternatively, it is called a slope-indicator variable or a slope dummy variable, because it allows for a change in the slope of the relationship

Now we can write:

FIGURE 7.2 (a) A slope-indicator variable (b) Slope- and intercept-indicator variables

The slope can be expressed as:

If house location affects both the intercept and the slope, both effects can be incorporated into a single model:

– The variable (SQFT × D) is the product of house size and the indicator variable, and is called an interaction variable

– Alternatively, it is called a slope-indicator variable or a slope dummy variable
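The combined intercept and slope shift can be sketched as a small function. The coefficient values below are hypothetical, chosen only to make the arithmetic visible; `expected_price` is an illustrative name.

```python
def expected_price(sqft, d, beta1, delta, beta2, gamma):
    """E(PRICE) = beta1 + delta*D + beta2*SQFT + gamma*(SQFT*D)."""
    return beta1 + delta * d + beta2 * sqft + gamma * (sqft * d)


# Hypothetical coefficients (not estimates from any data set).
params = dict(beta1=25.0, delta=27.0, beta2=7.6, gamma=1.3)

base = expected_price(20, 0, **params)        # beta1 + beta2*SQFT
desirable = expected_price(20, 1, **params)   # (beta1 + delta) + (beta2 + gamma)*SQFT

# Effect of one more unit of SQFT in each group: beta2 versus beta2 + gamma.
slope_base = expected_price(21, 0, **params) - base
slope_desirable = expected_price(21, 1, **params) - desirable

print(base, desirable, slope_base, slope_desirable)
```

Setting gamma to zero recovers the parallel shift of the intercept-only case, and setting delta to zero recovers the pure slope-indicator case.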

We can see that:

Consider the wage equation:

– The expected value is:

Applying Indicator Variables

Table 7.3 Wage Equation with Race and Gender

Recall that the test statistic for a joint hypothesis is:

To test the J = 3 joint null hypotheses H0: δ1 = 0, δ2 = 0, γ = 0, we use SSE_U = 130194.7 from Table 7.3

– SSE_R comes from fitting the model:

for which SSE_R = 135771.1

Therefore:

– The 1% critical value (i.e., the 99th percentile value) is F(0.99,3,995) = 3.80.

– Since F = 14.21 exceeds 3.80, we reject the joint null hypothesis and conclude that race and/or gender affect the wage equation.
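The arithmetic of this test can be reproduced in a few lines. The sketch uses only the figures quoted above (SSE_R, SSE_U, J = 3, N − K = 995, and the 1% critical value 3.80); the function name `f_statistic` is illustrative.

```python
def f_statistic(sse_r, sse_u, j, n_minus_k):
    """F = [(SSE_R - SSE_U) / J] / [SSE_U / (N - K)]."""
    return ((sse_r - sse_u) / j) / (sse_u / n_minus_k)


sse_r = 135771.1   # restricted model: WAGE on EDUC only
sse_u = 130194.7   # unrestricted model from Table 7.3
f_value = f_statistic(sse_r, sse_u, j=3, n_minus_k=995)

critical_1pct = 3.80          # F(0.99, 3, 995), as quoted in the text
reject_h0 = f_value > critical_1pct

print(round(f_value, 2), reject_h0)
```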

Now consider our wage equation:

“Are there differences between the wage regressions for the south and for the rest of the country?”

– If there are no differences, then the data from the south and other regions can be pooled into one sample, with no allowance made for differing slope or intercept

– To test this, we specify:

(7.10)

Now examine this version of Eq. 7.10:

Table 7.5 Comparison of Fully Interacted to Separate Models

From the table, we note that:

We can test for a southern regional difference.

We estimate Eq. 7.10 and test the joint null hypothesis

Against the alternative that at least one θi ≠ 0

This is the Chow test

The F-statistic is:

– The 10% critical value is Fc = 1.85, and thus we fail to reject the hypothesis that the wage equation is the same in the southern region and the remainder of the country at the 10% level of significance

– The p-value of this test is p = 0.9009
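A short sketch of the Chow test arithmetic, using only numbers reported in this section: the unrestricted SSE of the fully interacted model is the sum of the SSEs from the separate non-south and south regressions, and the restricted SSE is that of the pooled model without the SOUTH terms. Variable names are illustrative.

```python
# SSE from the two separate regressions (Table 7.5).
sse_nonsouth = 89088.5
sse_south = 40895.9

# The fully interacted model has the same SSE as the two separate fits combined.
sse_full = sse_nonsouth + sse_south

# Restricted model: no SOUTH terms, so J = 5 restrictions (theta1 ... theta5 = 0).
sse_restricted = 130194.7
j = 5
n_minus_k = 990

chow_f = ((sse_restricted - sse_full) / j) / (sse_full / n_minus_k)

critical_10pct = 1.85         # 10% critical value quoted in the text
reject_h0 = chow_f > critical_10pct

print(round(sse_full, 1), round(chow_f, 4), reject_h0)
```

The statistic falls well below the critical value, which matches the conclusion that the south and non-south data can be pooled.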

Remark:

– The usual F-test of a joint hypothesis relies on the assumptions MR1–MR6 of the linear regression model

– Of particular relevance for testing the equivalence of two regressions is assumption MR3, that the variance of the error term, var(e_i) = σ², is the same for all observations

– If we are considering possibly different slopes and intercepts for parts of the data, it might also be true that the error variances are different in the two parts of the data

– In such a case, the usual F-test is not valid.

Consider the wage equation in log-linear form:

– What is the interpretation of δ?

Expanding our model, we have:

Log-linear Models

Let’s first write the difference between females and males:

– This is approximately the percentage difference

The estimated model is:

– Using this approximation, we estimate a differential of about 24.32% between male and female wages

For a better calculation, the wage difference is:

– But, by the property of logs:

Subtracting 1 from both sides:

– The percentage difference between wages of females and males is 100(e^δ − 1)%

– We estimate the wage differential between males and females to be:

100(e^δ − 1)% = 100(e^(−0.2432) − 1)% = −21.59%
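The gap between the approximate and the exact calculation is easy to check. This sketch uses the estimated coefficient on FEMALE, −0.2432, from the model above; variable names are illustrative.

```python
import math

delta_hat = -0.2432   # estimated coefficient on FEMALE in the log-linear model

# Approximate percentage difference: 100 * delta
approx_pct = 100 * delta_hat

# Exact percentage difference: 100 * (e^delta - 1)
exact_pct = 100 * (math.exp(delta_hat) - 1)

print(round(approx_pct, 2), round(exact_pct, 2))
```

The approximation 100δ is close only when δ is small in absolute value; here the two figures differ by almost three percentage points.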

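Equations referenced in the text, in order of appearance:

House price as a function of square footage:

\[ PRICE = \beta_1 + \beta_2 SQFT + e \]

General definition of an indicator variable:

\[ D = \begin{cases} 1 & \text{if characteristic is present} \\ 0 & \text{if characteristic is not present} \end{cases} \]

Indicator variable for location:

\[ D = \begin{cases} 1 & \text{if property is in the desirable neighborhood} \\ 0 & \text{if property is not in the desirable neighborhood} \end{cases} \]

Model with the indicator variable added:

\[ PRICE = \beta_1 + \delta D + \beta_2 SQFT + e \]

If the model is correctly specified:

\[ E(PRICE) = \begin{cases} (\beta_1 + \delta) + \beta_2 SQFT & \text{when } D = 1 \\ \beta_1 + \beta_2 SQFT & \text{when } D = 0 \end{cases} \]

Alternative choice of base group:

\[ LD = \begin{cases} 1 & \text{if property is not in the desirable neighborhood} \\ 0 & \text{if property is in the desirable neighborhood} \end{cases} \]

Model using LD:

\[ PRICE = \beta_1 + \lambda LD + \beta_2 SQFT + e \]

Model including both D and LD (the dummy variable trap):

\[ PRICE = \beta_1 + \delta D + \lambda LD + \beta_2 SQFT + e \]

Model with a slope-indicator variable:

\[ PRICE = \beta_1 + \beta_2 SQFT + \gamma (SQFT \times D) + e \]

Its expected value:

\[ E(PRICE) = \beta_1 + \beta_2 SQFT + \gamma (SQFT \times D) = \begin{cases} \beta_1 + (\beta_2 + \gamma) SQFT & \text{when } D = 1 \\ \beta_1 + \beta_2 SQFT & \text{when } D = 0 \end{cases} \]

The slope:

\[ \frac{\partial E(PRICE)}{\partial SQFT} = \begin{cases} \beta_2 + \gamma & \text{when } D = 1 \\ \beta_2 & \text{when } D = 0 \end{cases} \]

Model with both intercept and slope indicators:

\[ PRICE = \beta_1 + \delta D + \beta_2 SQFT + \gamma (SQFT \times D) + e \]

Its expected value:

\[ E(PRICE) = \begin{cases} (\beta_1 + \delta) + (\beta_2 + \gamma) SQFT & \text{when } D = 1 \\ \beta_1 + \beta_2 SQFT & \text{when } D = 0 \end{cases} \]

Wage equation with race and gender:

\[ WAGE = \beta_1 + \beta_2 EDUC + \delta_1 BLACK + \delta_2 FEMALE + \gamma (BLACK \times FEMALE) + e \]

Its expected value by group:

\[ E(WAGE) = \begin{cases} \beta_1 + \beta_2 EDUC & \text{WHITE-MALE} \\ (\beta_1 + \delta_1) + \beta_2 EDUC & \text{BLACK-MALE} \\ (\beta_1 + \delta_2) + \beta_2 EDUC & \text{WHITE-FEMALE} \\ (\beta_1 + \delta_1 + \delta_2 + \gamma) + \beta_2 EDUC & \text{BLACK-FEMALE} \end{cases} \]

Test statistic for a joint hypothesis:

\[ F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N - K)} \]

Restricted model used to obtain SSE_R (standard errors in parentheses):

\[ \widehat{WAGE} = -6.7103 + 1.9803\,EDUC \]
\[ (se) \quad (1.9142) \quad (0.1361) \]

F-statistic for the joint test of race and gender effects:

\[ F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N - K)} = \frac{(135771.1 - 130194.7)/3}{130194.7/995} = 14.21 \]

Eq. 7.10, the wage equation fully interacted with SOUTH:

\[ WAGE = \beta_1 + \beta_2 EDUC + \delta_1 BLACK + \delta_2 FEMALE + \gamma (BLACK \times FEMALE) + \theta_1 SOUTH + \theta_2 (EDUC \times SOUTH) + \theta_3 (BLACK \times SOUTH) + \theta_4 (FEMALE \times SOUTH) + \theta_5 (BLACK \times FEMALE \times SOUTH) + e \]

Expected value under Eq. 7.10:

\[ E(WAGE) = \begin{cases} \beta_1 + \beta_2 EDUC + \delta_1 BLACK + \delta_2 FEMALE + \gamma (BLACK \times FEMALE) & SOUTH = 0 \\ (\beta_1 + \theta_1) + (\beta_2 + \theta_2) EDUC + (\delta_1 + \theta_3) BLACK + (\delta_2 + \theta_4) FEMALE + (\gamma + \theta_5)(BLACK \times FEMALE) & SOUTH = 1 \end{cases} \]

From Table 7.5:

\[ SSE_{full} = SSE_{nonsouth} + SSE_{south} = 89088.5 + 40895.9 = 129984.4 \]

Joint null hypothesis for the Chow test:

\[ H_0: \theta_1 = \theta_2 = \theta_3 = \theta_4 = \theta_5 = 0 \]

F-statistic for the Chow test:

\[ F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N - K)} = \frac{(130194.7 - 129984.4)/5}{129984.4/990} = 0.3203 \]

Wage equation in log-linear form:

\[ \ln(WAGE) = \beta_1 + \beta_2 EDUC + \delta FEMALE \]

Expanded by group:

\[ \ln(WAGE) = \begin{cases} \beta_1 + \beta_2 EDUC & \text{MALES } (FEMALE = 0) \\ (\beta_1 + \delta) + \beta_2 EDUC & \text{FEMALES } (FEMALE = 1) \end{cases} \]

Difference between females and males:

\[ \ln(WAGE)_{FEMALES} - \ln(WAGE)_{MALES} = \delta \]

Estimated model (standard errors in parentheses):

\[ \widehat{\ln(WAGE)} = 1.6539 + 0.0962\,EDUC - 0.2432\,FEMALE \]
\[ (se) \quad (0.0844) \quad (0.0060) \quad (0.0327) \]

Exact wage difference:

\[ \ln(WAGE)_{FEMALES} - \ln(WAGE)_{MALES} = \ln\left(\frac{WAGE_{FEMALES}}{WAGE_{MALES}}\right) = \delta \]

By the property of logs:

\[ \frac{WAGE_{FEMALES}}{WAGE_{MALES}} = e^{\delta} \]

Subtracting 1 from both sides:

\[ \frac{WAGE_{FEMALES}}{WAGE_{MALES}} - \frac{WAGE_{MALES}}{WAGE_{MALES}} = \frac{WAGE_{FEMALES} - WAGE_{MALES}}{WAGE_{MALES}} = e^{\delta} - 1 \]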
