Published by the Department of Criminal Justice, California State University, San Bernardino
© 2004, The Western Criminology Review. ISSN 1096-4886. All Rights Reserved.
Estimation Issues Associated with Time-Series--Cross-Section Analysis in Criminology*
John L. Worrall
California State University, San Bernardino
Travis C. Pratt
Washington State University
Online Citation: Worrall, J., and Pratt, T. 2004. "Estimation Issues Associated with Time-Series--Cross-Section Analysis in Criminology." Western Criminology Review 5 (1) http://wcr.sonoma.edu/v5n1/worrall.html
[Printed version available in PDF format with
page numbers.]
ABSTRACT

In this paper we offer a relatively comprehensive introduction to estimation issues associated with time-series-cross-section analysis in criminology. We divide the estimation issues into two categories: (1) those that have received a fair amount of attention in the literature and (2) those that have not. Issues that have received attention are heterogeneity, autocorrelation, panel heteroskedasticity, nonstationarity, and unit-specific trends. Issues that have not received much attention are spatial autocorrelation and contemporaneous correlation. Using county-level data from the state of California, focusing in particular on the crimes of assault, robbery, and burglary, we control for the first set of estimation problems, then we explore the effects of the latter set. We conclude that contemporaneous correlation deserves more attention than spatial autocorrelation. Also, we found that assault is more sensitive to the estimation issues raised than either robbery or burglary.

KEYWORDS: time-series--cross-section data; heterogeneity; autocorrelation; panel heteroskedasticity; nonstationarity.

Several criminologists have moved beyond the limits of the cross-sectional research design. Specifically, the time dimension has been incorporated into many cross-sectional research designs, resulting in what is frequently termed "time-series-cross-section" (TSCS) analysis. Quantitative research that tacks on time to the traditional cross-sectional research design is besieged by problems of estimation. Most introductory econometrics texts cover these problems, and the techniques for rectifying them are both accessible and available. Unfortunately, it appears to us that the criminological literature lacks an introduction to the many problems inherent in time-series-cross-section analysis. More importantly, many criminologists have still ignored some of these estimation issues. We intend to provide a relatively comprehensive introduction to the various estimation problems associated with time-series-cross-section analysis. We will review many of the issues already familiar in criminological research; however, we will also introduce two additional estimation issues that have received little attention in the criminological literature. In addition, we will provide an overview of various methods for dealing with these problems. Finally, we will estimate a series of "generic" models of crime, focusing on the consequences of ignoring the estimation problems introduced throughout the article. Many of the issues discussed in this paper apply only to macro-level criminological research. Indeed, the models we estimate at the end of the paper are macro-level in nature. Nevertheless, some of the problems discussed here are relevant to all TSCS designs, and should therefore be of interest to criminologists studying micro-level dynamics. To avoid becoming too "methodological," we will use criminological examples throughout this paper in order to give it a "real world" focus. 
We will explain the topics in conceptual terms and conclude by discussing their importance for macro-level criminological research that consists of repeated observations on the same units of analysis.

TIME-SERIES-CROSS-SECTION MODELS

TSCS data combine observations both cross-sectionally and over time, an approach that increases sample size over either cross-section or time-series data alone. Many of the same statistical techniques are used to model both TSCS and panel data. As such, the techniques we explore in this article should be of interest to all researchers whose statistical models contain observations on the same units over time.

Advantages of TSCS Data

Panel data and TSCS data share many of the same advantages; only the statistical techniques differ. Panel and TSCS data models have long been considered among the best designs for the study of causation, next to a purely random experiment. Campbell and Stanley (1967), for example, refer to panel/TSCS models as "excellent quasi-experimental design[s], perhaps the best of the more feasible designs." Lempert (1966) stated that panel/TSCS designs are research designs "par excellence." Still other researchers have argued that panel/TSCS techniques are well-suited to causal analysis (e.g., Stimson).

In addition to their potential for detecting causal relationships, panel/TSCS techniques offer a number of distinct advantages over cross-sectional estimators. As Hsiao (1986) points out, "panel data provides major benefits for...estimation in at least three areas: (1) identification of...models and discriminating between competing...hypotheses, (2) eliminating or reducing estimation bias, and (3) reducing problems of data multicollinearity." Panel/TSCS data also give the researcher a large number of observations (one unit can actually have several observations), thereby increasing degrees of freedom.
A final advantage of panel/TSCS techniques, more relevant in the present context, is: "By utilizing information on both the intertemporal dynamics and the individuality of the entities being investigated, one is better able to control in a more natural way for the effects of missing or unobserved variables" (Hsiao, 1986). Accordingly, panel/TSCS data are well-suited to the detection of population heterogeneity (time-stable characteristics of the units of analysis, such as conservatism in Orange County, California) because such variables, to the extent they exist, are typically unmeasurable and unobservable.

The Foundation of TSCS Estimation

The TSCS models we focus on are based on the following generic form:

$$ y_{i,t} = x_{i,t}\beta + \varepsilon_{i,t} \qquad (1) $$

where i indexes units and t indexes time periods. Equation (1) closely resembles a typical OLS regression model, except that subscripts for unit and time are incorporated into the model. This is an important point because, subscripts aside, equation (1) implies that time is irrelevant. The reason a time variable (such as a lagged dependent variable) is not included in equation (1) is that time may have no bearing on the dependent variable. Researchers must first test whether it is an important factor that needs to be controlled. We now turn our attention to some of the estimation issues posed by equation (1).

ESTIMATION ISSUES

A pooled analysis of the data based on equation (1) would be seriously flawed, in part because such analysis assumes that repeated observations on each unit are independent. Thus, several estimation issues need to be considered in the TSCS context. Some of these are familiar to macro-level criminologists, but others have been effectively ignored. We begin by focusing on familiar estimation issues, including techniques for dealing with them. We then move into less familiar territory and consider additional TSCS estimation issues that have received little attention in the criminological literature.

Familiar Issues in Macro-Level Criminological Research

Heterogeneity.
Heterogeneity refers to unobserved variables that remain constant over time.2 Heterogeneity does not always need to be modeled, but when it is present, specific techniques must be used to control for it. Heterogeneity is typically detected by comparing the F statistics resulting from equation (1) to the results from an equation of the following more specific form:

$$ y_{i,t} = x_{i,t}\beta + v_i + \tau_t + \varepsilon_{i,t} \qquad (2) $$

Equation (2) differs from equation (1) by the addition of vectors of dummy variables, $v_i$ and $\tau_t$, marking unit and time, respectively. Equation (2) can be described as a two-way fixed-effects model, which controls for unmeasured time-invariant differences between units and unit-invariant differences between time periods.

The addition of unit-specific dummy variables acknowledges that there may be inherent features of individual units (e.g., counties) that affect the outcome of interest and are not adequately captured by any of the regressors included in the model (i.e., heterogeneity). For example, Cherry (1999; see also Cornwell and Trumbull 1994) has pointed out that cross-jurisdictional variations introduce "noise" into macro-level crime analysis, which may bias results.
Another example of heterogeneity at the macro-level could be the political orientation of a specific city or county. It is often the case that such characteristics as "conservatism" or "liberalism" remain relatively constant over time. Likewise, such characteristics as "rural" or "urban" remain effectively constant for long periods of time. This "heterogeneity" needs to be modeled in time-series-cross-section analysis.

The addition of time-specific dummy variables acknowledges that all the units in the model could be subject to common events in any given year. This time effect is frequently taken into account through a series of dummy variables for year; otherwise, random year-to-year variations could contaminate the X-Y relationship specified in the model. A criminological example of this "time effect" could be a downturn in the economy in a single state. A significant downturn would likely affect the whole state at the same time. The result could be increased unemployment and a concomitant increase in crime.

In formal terms, the two-way fixed effects model in equation (2) assumes that the slope estimates for the variables in the model remain constant across the unit and time dimensions but that intercepts vary by both unit and time. So, instead of a single constant term, we have a series of dummy variables in the equation (either omitting one dummy that is captured by the constant or omitting the constant term).

Admittedly, the fixed effects approach to modeling heterogeneity is somewhat crude; it suggests that important differences between counties and over time can be captured by simply including dummy variables in the specification. Since we do not know the extent of macro-level heterogeneity, however, and since there is no established theoretical basis suggesting it needs attention, we are simply starting "at the beginning" by exploring the influence on crime of macro-level heterogeneity.3

Temporal Autocorrelation.
It is rarely the case that observations within TSCS data are independent along the time dimension. We expect serial dependence (often referred to as serial autocorrelation and/or temporal autocorrelation). It is not uncommon for the values of a particular unit from one time period to be associated with values for the same unit from another period (Hanushek and Jackson 1977; Maddala 1992). For example, the budgets of public agencies, such as police departments, are closely tied to one another over time. Likewise, prison populations, while fluctuating greatly over the long term, tend to be closely related from one year to the next.

There are several ways to test for temporal (serial) autocorrelation, including the calculation of the Durbin-Watson d statistic from the residuals generated by OLS regression models. Importantly, this test for serial autocorrelation is valid only with strictly exogenous regressors; additional tests are required if the model contains a lagged dependent variable. Another easy method for detecting serially correlated errors is the TSCS analog of the standard Lagrange multiplier test. This is accomplished by estimating an OLS regression equation and then regressing the residuals on all of the independent variables and the lagged residual. If the coefficient on the lagged residual is significant, then the null hypothesis of independent errors can be rejected. These steps detect a first-order autoregressive process, yet the test can be refined to detect higher-order serial autocorrelation by adding multiple lags of the OLS residuals (see Beck and Katz 1996).

Numerous methods are available for dealing with serial autocorrelation. One well-known approach for a traditional time series that follows one unit over time is feasible generalized least squares (FGLS). This involves running the OLS regression of Y on all X variables and obtaining the OLS residuals.
Then regress the OLS residuals on the one-period lagged residuals (for the AR[1] model) for all t = 2, ..., n and obtain the coefficient on the lagged residual, rho. Finally, quasi-difference the data by subtracting rho times the previous period's value from the current value of each X and Y variable, and conclude by applying OLS to the transformed equation. There are several names for FGLS estimation of the AR(1) model. These come from the different methods of estimating rho and of treating the first observation. Cochrane-Orcutt (CO) estimation omits the first observation and uses the estimated rho from the regression of the OLS residuals on the lagged residuals. In contrast, Prais-Winsten (PW) estimation uses the first observation instead of dropping it.

Another method to correct for serial autocorrelation, this one specifically geared toward dealing with multiple units over time, is a simple extension of the "basic" FGLS approach. The so-called "Parks method" (Beck and Katz 1995) estimates a specific autocorrelation coefficient for each unit of analysis. In other words, this method assumes that autocorrelation differs by unit, whereas the basic FGLS approach assumes that a single autocorrelation coefficient applies to all units. The Parks approach has been criticized by Beck and Katz (1995, 1996) because it assumes that the errors for each unit follow a unit-specific autoregressive process. They argue, instead, that it is better to assume a common autoregressive process (Beck and Katz 1996). In their words, "TSCS analysts start with the assumption that the parameters of interest, β, do not vary by unit; this 'pooling' is at the heart of TSCS analysis. Why then should the 'nuisance' serial correlation parameters vary by unit?" They showed through a series of Monte Carlo experiments that the assumption of a common serial correlation process leads to superior estimates of β.
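The single-series FGLS steps described above can be sketched in a few lines. This is a minimal illustration, not a reproduction of any particular package's routine: the function names (`ols`, `estimate_rho`, `fgls_ar1`) and the simulated data are our own assumptions, rho is estimated once rather than iterated to convergence, and the Prais-Winsten rescaling of the first observation by the square root of (1 - rho squared) is the standard textbook form.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def estimate_rho(resid):
    """Regress residuals on their one-period lag (no constant); return rho."""
    lagged, current = resid[:-1], resid[1:]
    return float(lagged @ current / (lagged @ lagged))

def fgls_ar1(X, y, keep_first=True):
    """One-step FGLS for one unit's time series with AR(1) errors.

    keep_first=True  -> Prais-Winsten (first observation rescaled and kept)
    keep_first=False -> Cochrane-Orcutt (first observation dropped)
    Assumes the estimated rho lies strictly between -1 and 1.
    """
    b_ols = ols(X, y)
    rho = estimate_rho(y - X @ b_ols)
    # Quasi-difference: z*_t = z_t - rho * z_{t-1} for t = 2, ..., n
    X_star = X[1:] - rho * X[:-1]
    y_star = y[1:] - rho * y[:-1]
    if keep_first:
        scale = np.sqrt(1.0 - rho ** 2)
        X_star = np.vstack([scale * X[:1], X_star])
        y_star = np.concatenate([scale * y[:1], y_star])
    return ols(X_star, y_star), rho

# Hypothetical data: one unit, 200 periods, AR(1) errors with rho = 0.6
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])

beta_pw, rho_pw = fgls_ar1(X, y, keep_first=True)    # Prais-Winsten
beta_co, rho_co = fgls_ar1(X, y, keep_first=False)   # Cochrane-Orcutt
```

Because the constant column is quasi-differenced along with the other regressors, the first coefficient in each result is still the intercept on the original scale. Applying the same rho to every unit in a pooled data set corresponds to the common-process assumption, while estimating rho separately per unit corresponds to the unit-specific variant discussed in the text.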
Beck (forthcoming) actually criticizes both methods of FGLS because they "treat the interesting properties of TSCS data [e.g., serial autocorrelation] as nuisances which cause estimation difficulties." In place of the FGLS approach, Beck and Katz propose simply including a one-period lagged dependent variable on the right-hand side of the equation, for two reasons. First, a lagged dependent variable allows researchers to consider the issue of unit roots in TSCS data. TSCS models have a unit root when the estimated value of the coefficient on the lagged dependent variable is one. We discuss this problem further below. Second, lagged dependent variables can serve as proxies for other variables not included in the model (see Wooldridge 2000). We adopt the lagged dependent variable approach below.

Panel Heteroskedasticity. A unique form of heteroskedasticity frequently presents itself in the analysis of TSCS data. Panel heteroskedasticity4 can affect whole units at a time since error variances for a given unit may display time dependence. Non-constant variance is a likely violation of Gauss-Markov assumptions in TSCS data. To use a criminological example, variance estimates for rural county crime rates are likely to differ significantly from those of urban counties, which, in turn, are likely to contribute to nonconstant error variances. Similarly, if police departments are the units of analysis, then having large and small agencies in the same sample could be problematic.

There are several acceptable methods available to detect heteroskedasticity, including the use of auxiliary regressions (Franzese 2002) and the Breusch-Pagan test. Methods for dealing with panel heteroskedasticity, however, are less clear with TSCS data (see Beck and Katz 1995, 1996 for a review; also see Wooldridge 2000). The most popular (and most easily computed) method is to weight the data by the square root of the variable thought to be responsible for heteroskedasticity.
In many macro-level criminological models, this variable is usually a measure of population size (see, e.g., Chamlin and Cochran 1997; Marvell and Moody 2001; Pratt and Godsey, forthcoming; Sampson and Groves 1989; Shepherd, forthcoming).

Another method of dealing with panel heteroskedasticity is to adjust the standard errors as opposed to weighting the data. Beck and Katz (1995, 1996) call this approach regression with "panel corrected standard errors" (PCSEs). PCSEs inflate the standard errors in light of the panel structure of the data. The PCSE approach leaves the data in their original form and so is desirable for those who do not wish to engage in empirical weighting of the data. Some regression routines in popular statistics packages (e.g., Stata) allow researchers to weight the data by the square root of a specified variable as well as opt for the PCSE approach. This means that any heteroskedasticity remaining after weighting can be "controlled" for with panel corrected standard errors.

Unit-Specific Trends. Assuming that all the units in a TSCS model follow the same pattern can be problematic. Indeed, certain units may depart from the norm, requiring the inclusion of unit-specific trend variables to control for fluctuations in a unit (e.g., a county) that depart from the trends captured by year dummies as in equation (2). Unit-specific trend variables are therefore proxies for factors that make crime rates (or other dependent variables) vary more or less than the overall trend. There is no easy way (that we know of) to determine whether unit-specific trends are necessary.5 Such variables are nevertheless worth considering because of the possibility of units that do not "behave" like the rest. Usually the unit-specific time trend is assumed to be linear and is coded from, say, 1 to t (quadratic and other trends can be included as well). The need for modeling unit-specific trends can be made clearer with a hypothetical example.
Assume, for example, that a single county elects a Democratic sheriff. The sheriff advocates reduced enforcement of drug laws and an increase in attention to treatment for drug users. Assume further that the voters approve of this strategy (however unlikely) and the result is a steady decrease in property crime during the period the sheriff remains in office. This effect would probably not carry over to other counties in the same state, which suggests that it needs to be modeled. Unit-specific trends accomplish this.

Nonstationarity. A fundamental assumption underlying the analysis of TSCS data is that they are stationary. In formal terms, data are stationary if their means, variances, and autocovariances (at various lags) remain constant across all time points. Stationarity is often assessed through a Dickey-Fuller unit root test. If a time series has a unit root, it is deemed nonstationary. One way to perform this test is to regress the dependent variable in an equation on the one-period lagged dependent variable. If the coefficient on the lagged dependent variable, ρ, equals one, then we face what is known as a unit root problem (i.e., a nonstationarity situation). Another method for detecting nonstationarity examines whether the coefficient on the lagged dependent variable, δ, in an equation with a first-differenced dependent variable, equals zero.

Unit root tests can (and usually should) also be used in models with more complicated dynamics. For example, an augmented Dickey-Fuller test can be performed by regressing the first-differenced dependent variable on the one-period lag of the dependent variable and on one or more lagged first-differenced dependent variables (i.e., lagged changes). The inclusion of the lagged changes is intended to clean up any serial correlation in the first-differenced dependent variable. Stated in more concrete terms, for this Dickey-Fuller test statistic to be valid, dynamics must be completely modeled.
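The augmented Dickey-Fuller regression just described can be written out directly. The sketch below is illustrative only: the function name `adf_regression`, the inclusion of a constant, and the simulated series are our assumptions, and it handles a single series (in a TSCS setting it would have to be applied unit by unit or replaced by a panel unit root test). The returned t-ratio must be compared with Dickey-Fuller critical values (roughly -2.86 at the 5 percent level in large samples when a constant is included), not with the usual t distribution.

```python
import numpy as np

def adf_regression(y, n_lags=1):
    """Regress the change in y on a constant, the lagged level of y,
    and n_lags lagged changes. Returns (delta_hat, t_ratio), where
    delta_hat is the coefficient on the lagged level."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[k] = y[k+1] - y[k]
    dep = dy[n_lags:]                     # changes with enough history
    cols = [np.ones(dep.size), y[n_lags:-1]]   # constant, lagged level
    for k in range(1, n_lags + 1):
        cols.append(dy[n_lags - k: dy.size - k])   # k-period lagged change
    X = np.column_stack(cols)
    coef = np.linalg.lstsq(X, dep, rcond=None)[0]
    resid = dep - X @ coef
    sigma2 = resid @ resid / (dep.size - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    delta_hat = coef[1]
    return float(delta_hat), float(delta_hat / np.sqrt(cov[1, 1]))

# Hypothetical series: white noise (stationary) and a random walk (unit root)
rng = np.random.default_rng(1)
stationary = rng.normal(size=500)
random_walk = np.cumsum(rng.normal(size=500))

delta_s, t_s = adf_regression(stationary, n_lags=2)
delta_rw, t_rw = adf_regression(random_walk, n_lags=2)
```

For the stationary series the coefficient on the lagged level is far below zero and the t-ratio is strongly negative, so the unit root null is rejected; for the random walk the coefficient sits close to zero, which is the pattern the test is designed to flag.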
One way to determine the proper lag length is to start with several lags, then drop those without significant coefficients, using standard t-tests (see, e.g., Enders 1995).

One of the more common methods for dealing with nonstationarity is, as with autocorrelation, to first-difference the data. Since many criminological time series appear to be stationary based on unit-root tests, however, this transformation is rarely undertaken (see, e.g., Marvell and Moody 1996, 1995, 2001). Unit root tests such as the augmented Dickey-Fuller test have low power (Enders 1995). Since the null hypothesis in a unit root test is that of a unit root, this means that Dickey-Fuller statistics cause researchers to conclude, more often than they should, that data contain a unit root.

Stationarity is not easily understood with an example. The reason for this is that it is a "statistical" factor that needs to be controlled for, not a conceptual problem. Nevertheless, a hypothetical scenario may help explain this phenomenon. It is well-known, for instance, that crime rates fluctuate over time. That is, they trend upward and downward over the long haul. Time series analysis assumes that this is not the case, which means that the data need to be forcibly made stationary.

Issues Frequently Ignored in Macro-Level Criminological Research6

Spatial Autocorrelation. Because TSCS data contain observations on several cross-sections, or units, spatial autocorrelation is frequently a problem, especially when the units are contiguous (Mencken and Barnett 1999). When the units of analysis are geographic aggregates, the potential for error correlation between units increases. For example, when two counties border one another, both may face many of the same problems. Thus, the second requirement before establishing an adequate model is to account for spatial autocorrelation.
This is a critical issue in our analysis because we focus on all counties in a particular state, each of which is contiguous to several others.

A criminological example of spatial autocorrelation is easy to conceive. If one city experiences a sudden and dramatic increase in homicide, it is not unrealistic to assume that the problem could be pushed into surrounding cities. Alternatively, regional factors can lead to spatial autocorrelation. For example, cities in Southern California are markedly different in size, weather, and other factors from cities in Central and Northern California. It is likely that there is a degree of correlation between Southern California cities that would not carry over to cities to the north.

Spatial autocorrelation is the result of the geographic clustering of values across observations, beyond what would be expected in a random distribution of values across geographical units (e.g., Anselin 1998). Spatial autocorrelation exists when the value of a variable X at location j is dependent on (or associated with) the value of X at location i. Spatial autocorrelation is generally not a problem in macro-level studies of crime when units, such as cities or counties, are randomly sampled. Since many researchers analyze contiguous units, however, the (potential) problem of spatial autocorrelation must often be addressed. When spatial autocorrelation is not addressed, it can produce deflated standard errors (i.e., exaggerated efficiency) and therefore artificially increase the chances of finding statistically significant relationships.

Several techniques are available for the detection of spatial autocorrelation (Mencken and Barnett 1999). Moran's I, perhaps the best-known technique, is interpreted like a correlation coefficient; the greater the I, the greater the autocorrelation. The I statistic is calculated as follows:7

$$ I = \frac{N}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^2} $$

Getis and Ord (1992) point out several limitations associated with Moran's I.
First, Moran's I is a global spatial autocorrelation statistic, which means it may be unable to detect localized pockets of spatial autocorrelation. In addition, Moran's I is calculated based on the covariation between unit values on a specific variable. This means that a positive and significant Moran's I can be explained by a clustering of higher values in surrounding units, lower values in surrounding units, or both high and low clustering within a specified distance.

An alternative to Moran's I proposed by Getis and Ord (1992) is known as the G statistic. It is calculated as follows:

$$ G(d) = \frac{\sum_{i}\sum_{j} w_{ij}(d)\, x_i x_j}{\sum_{i}\sum_{j} x_i x_j}, \qquad j \neq i $$

The G statistic differs from Moran's I by the inclusion of a distance band, d, an area of geographic interest in which spatial autocorrelation may be problematic. The spatial weights matrix, $w_{ij}$, is a binary matrix of ones and zeros, where ones indicate units within the distance band, d. Even Getis and Ord (1992) concede that the G statistic has its faults in that it may fail to identify localized clusters of positive and negative spatial autocorrelation. They therefore propose the $G_i$ statistic:

$$ G_i(d) = \frac{\sum_{j} w_{ij}(d)\, x_j}{\sum_{j} x_j}, \qquad j \neq i $$

Two relatively simple methods exist for dealing with spatial autocorrelation, the latter of which is adopted in this paper. The first is known as a "spatial disturbances model" (Doreian 1980a, 1980b). In this model the effects of spatial autocorrelation among disturbances are incorporated into the model in much the same way that a first-order autoregressive model is estimated. In place of the one-period lagged disturbances, however, one multiplies a binary N x N spatial weights matrix (where the weights equal one if area j shares a common boundary with area i, zero otherwise) by the disturbances. The weight matrix is intended to represent the pattern of interaction between the disturbances at locations i and j. This method is computationally labor intensive in the TSCS context because of the element of time.
For example, 40 units observed at 10 time periods require a spatial weights matrix of size 400 x 400. Since the researcher must construct the spatial weights matrix, and because we are not aware of statistical packages that estimate spatial disturbance models for TSCS data, we select the following approach.

A "spatial effects model" is one in which the effects of autocorrelation within the dependent variable are incorporated. Instead of multiplying the disturbances by a spatial weights matrix, values of the dependent variable on all contiguous units are averaged and entered into the model as another variable. Researchers can substitute such a variable with lagged mean values to explore the possibility of delayed spatial autocorrelation. Lagged variables used in this fashion result in what is known as a "spatial lag model."

The choice between the two methods is a substantive rather than empirical one. The researcher must decide whether the dependent variable is spatially autocorrelated or whether the errors are. Accepting the view that the dependent variable is spatially autocorrelated is akin to suggesting that there is no spatial relationship between the independent variables. By contrast, spatial autocorrelation in the errors suggests that much more than the dependent variable is correlated across spatial units.

Contemporaneous Correlation. Time-series-cross-section data are often plagued by the problem of contemporaneous correlation. That is, the observations from certain units may be correlated with the observations from other units during the same time period. As indicated in the heterogeneity section above, time-specific dummy variables are often incorporated into TSCS models in order to control for events that affect all units of analysis in a given year. Contemporaneous correlation is a markedly different problem. Contemporaneous correlation refers to error correlation between two or more units.
In other words, contemporaneously correlated errors exist if there are unobserved features of some units that are related to the unobserved features of other units. Or, as Beck and Katz (1995) observe, "we might expect TSCS errors to be contemporaneously correlated in that large errors for unit i at time t will often be associated with large errors for unit j at time t." Furthermore, "these contemporaneous correlations may differ by unit" (p. 636). For example, the errors in two units may be linked together but remain independent of errors in the remaining units.

Contemporaneous correlation can be understood by referring back to the concept of time-specific heterogeneity. As we already saw, it is possible that all counties in a given state can be affected by the same event at the same time. Contemporaneous correlation is basically the same thing, but with the possibility that fewer than all counties are affected. If, for example, there was a sudden and unexpected cold snap, then agricultural counties could see destruction of crops, increased unemployment, and crime, but this effect would probably not manifest itself in larger, urban counties. Alternatively, contemporaneous correlation can refer to differing levels of correlation between all units of analysis during the same time period (as opposed to the same level of correlation that dummy variables for time assume).

Breusch and Pagan (1980) proposed a test for detecting contemporaneous correlation in regression residuals, a Lagrange multiplier (LM) statistic of the following form:

$$ \lambda_{LM} = T \sum_{i=2}^{N} \sum_{j=1}^{i-1} r_{ij}^{2} $$

where T is the number of time periods, N is the number of units, and $r_{ij}$ is the correlation between the residuals of units i and j. It is a test of the null hypothesis that the off-diagonal elements of the relevant correlation matrix are zero.

One approach to dealing with contemporaneous correlation is to treat it as a nuisance and correct for it using Feasible Generalized Least Squares (FGLS).
Beck and Katz (1996) call this approach "old fashioned" because, instead of regarding contemporaneous correlation as a substantively important characteristic to be modeled, it more or less ignores it. Beck and Katz (1995, pp. 644-45) have also pointed out that "the downward bias in standard errors makes the [FGLS] technique unusable unless there are substantially more time points (T) than there are cross-sectional units (N)." The FGLS approach, then, would be inappropriate to control for contemporaneous correlation.

In place of FGLS, Beck and Katz (1995) propose that analysts deal with the complicated TSCS error process by using OLS regression with panel corrected standard errors (PCSE). Their Monte Carlo simulations showed that PCSEs are accurate in the presence of contemporaneous correlation (as well as panel heteroskedasticity). Beck and Katz (1995, 1996) go on to note that PCSEs can only be estimated after first modeling dynamic processes. Thus, they propose that the following model first be estimated:

$$ y_{i,t} = \delta y_{i,t-1} + x_{i,t}\beta + \varepsilon_{i,t} \qquad (3) $$

PCSEs are calculated using the OLS residuals from equation (3).8 The Monte Carlo simulations reported by Beck and Katz (1995, 1996) showed that panel-corrected standard errors are accurate in the presence of contemporaneously correlated errors and/or panel heteroskedastic errors. In other words, if the errors are serially independent, PCSEs provide good estimates of the standard errors for equation (3). A key component of equation (3) is the lagged dependent variable. Beck and Katz (1995, 1996) argue that a "modern" approach is to model the dynamics as part of the specification, and the simplest way to do this is to include a lagged dependent variable in the specification.9

DATA, COVARIATES, AND MODELS

We now call attention to the importance of the foregoing estimation issues by estimating a series of "generic" macro-level models of crime with TSCS data, using aggravated assault, robbery, and burglary rates as our dependent variables.
The models are "generic" in the sense that we are not testing any particular macro-level theory of crime; rather, we are assessing the potential impacts of these estimation issues on a variety of empirical relationships that are commonly tested by criminologists.

The Data

We used county-level data supplied by the California State Attorney General's Office from 1989 to 2000 (58 counties yielding 696 usable observations). Although our results cannot be expected to generalize to other states, the main advantage of focusing on a particular state is that we were able to obtain data in yearly increments. Thus, we are simply extending what is already a long criminological history of county-level studies of crime (see, e.g., Baller et al. 2001; Gillis 1996; Guthrie 1995; Hannon and DeFronzo 1998; Kowalski and Duffield 1990; Kposowa and Breault 1993; Kposowa et al. 1995; Lee 1996; Petee and Kowalski 1993; Phillips and Votey 1975).

Dependent Variables

The dependent variables used in our analysis are: (1) the aggravated assault rate; (2) the robbery rate; and (3) the burglary rate. The rates were calculated as the number of each offense reported to the police divided by population. The dependent variables were quite highly skewed; accordingly, we used the natural logarithm of each variable in the equations. Some researchers argue that crimes are discrete events and, as such, necessitate models that take into account the discrete data-generating process. As Osgood (2000) and Osgood and Chambers (2000) note, however, the natural logarithm transformation is appropriate provided that the counties analyzed are not all characterized by low offense rates relative to population size. That is to say, when population size grows smaller, the crime rate becomes less precise as well as skewed. Our models include several highly populous counties with substantially "meaningful" crime rates.
As such, we believe the transformed dependent variable approach is an acceptable alternative to the models-for-discrete-outcomes approach advocated by some analysts (e.g., see the discussions by Brame et al., 1999; Osgood, 2000).

Independent Variables

We use both social-structural and criminal justice system-related covariates in the analyses that follow. The social-structural covariates we included were: (1) the high school dropout rate; (2) the welfare rate; (3) the unemployment rate; (4) per capita income; (5) the percentage of black residents; (6) the percentage of Hispanic residents; (7) the percentage of males between 13 and 17; (8) the percentage of males between the ages of 18 and 25; and (9) the percentage of families claiming the homeowners exemption on their state tax returns, a proxy for population mobility.10 Each of these variables has been employed as a structural covariate in past macro-level criminological research (see, e.g., Allan and Steffensmeier 1989; Bailey 1984; Cantor and Land 1985; Chamlin 1989; Decker and Kohfeld 1984; Fowles and Merva 1996; Kapuscinski et al. 1998; Kovandzic et al. 1998; Land et al. 1995; Osgood 2000; Smith and Parker 1980; Warner and Roundtree 1997; Williams and Flewelling 1988).

The criminal justice system-related covariates included in our models were: (1) the probability of arrest, depending on which specific crime is modeled; (2) per capita people held in custody; and (3) per capita law enforcement expenditures for the entire county. While other criminal justice system-related covariates exist, each of these has been used by criminologists in previous research (see, e.g., Lynch et al., 1994; Mathur, 1978; Pogue, 1975; Swimmer, 1974; see also the similar measures employed by Greenberg and Kessler, 1982; Harer and Steffensmeier, 1992; Langworthy, 1989; Marvell and Moody, 1996; Sampson and Cohen, 1988; Stack, 1984; Yu and Liska, 1993).
Several of the independent variables also displayed positive skew; as such, we took the natural log of each variable, so the coefficients in our equations are to be interpreted as elasticities.

Problems in the Data

We determined that our data require that heterogeneity be modeled.11 Also, using the various tests discussed to this point, we determined that the data were characterized by panel heteroskedasticity, serial autocorrelation, spatial autocorrelation, and contemporaneous correlation. The data were, however, stationary according to the augmented Dickey-Fuller test.

Models

We begin by estimating models with heterogeneity (controlled for with fixed effects for unit and time), unit-specific trends (controlled for with unit-specific trend variables coded from 1 to t for each unit i), panel heteroskedasticity (weighted by the square root of population), and serial autocorrelation (controlled for with a lagged dependent variable). These models will hereafter be referred to as the "baseline models." The baseline model is as follows:

Equation (4) is identical to equation (2) with the exception of two added terms, $dy_{i,t-1}$ and $v_t$. The former denotes the lagged dependent variable, and the latter denotes unit-specific trends. Panel heteroskedasticity is modeled by multiplying the independent and dependent variables by the square root of population.

We then turn our attention to the less familiar estimation issues introduced above, namely spatial autocorrelation and contemporaneous correlation, estimating the following alternative models: (1) baseline models with a spatial autocorrelation term (coded as the mean value of the dependent variable for contiguous units in the same year) and (2) baseline models with panel corrected standard errors (for contemporaneous correlation). In all we estimate twelve models, three each (assault, robbery, and burglary) for the baseline and each of the aforementioned extensions of the baseline models.
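To make the two extensions just described more concrete, the following is a minimal sketch in Python with NumPy. It is not the code used for the analyses reported here. The function names `spatial_lag` and `ols_pcse` are hypothetical, the panel is assumed to be balanced (every unit observed in every period), and the covariance formula is the general form described by Beck and Katz (1995): the contemporaneous covariance matrix is estimated from the OLS residuals and then placed in a sandwich around the usual OLS cross-product matrix. Fixed effects, trends, the lagged dependent variable, and population weighting would enter as columns of (or transformations to) the regressor array before these functions are called.

```python
import numpy as np


def spatial_lag(y, adjacency):
    """Mean of y over contiguous units in the same period.

    y         : (N, T) array, one row per unit, one column per period.
    adjacency : (N, N) 0/1 contiguity matrix with a zero diagonal.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n_neighbors = adjacency.sum(axis=1, keepdims=True)
    # Row-standardize so each row averages over that unit's neighbors.
    # Assumption: a unit with no neighbors receives a lag of zero.
    weights = np.divide(
        adjacency,
        n_neighbors,
        out=np.zeros_like(adjacency),
        where=n_neighbors > 0,
    )
    return weights @ y


def ols_pcse(y, X):
    """OLS coefficients with panel-corrected standard errors.

    y : (N, T) dependent variable.
    X : (N, T, k) regressors (include a constant or dummies as needed).
    """
    N, T, k = X.shape
    X2 = X.reshape(N * T, k)  # rows stacked by unit, then by period
    y2 = y.reshape(N * T)

    xtx_inv = np.linalg.inv(X2.T @ X2)
    beta = xtx_inv @ X2.T @ y2

    # Residuals arranged as one row per unit, one column per period.
    resid = (y2 - X2 @ beta).reshape(N, T)
    # (N, N) estimate of the contemporaneous error covariance.
    sigma = resid @ resid.T / T

    # X' (Sigma kron I_T) X, computed without building the NT x NT matrix.
    meat = np.einsum("ij,itk,jtl->kl", sigma, X, X)
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))
```

Because the sandwich changes only the estimated covariance of the coefficients, `beta` is identical to ordinary OLS; what differs from the conventional output is the second return value. In practice an established routine in a statistical package would normally be preferred to a hand-rolled version such as this.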
Given these various models we then compare the patterns of statistical significance/non-significance, as well as the magnitudes of the relationships across successive model specifications. In short, we are exploring the degree to which the substantive interpretation of certain empirical relationships may change as the estimation issues discussed previously in this article are taken into account.12

RESULTS

In Table 1 we compare the coefficients between the baseline models and the spatial autocorrelation models. The first two columns are for the assault rate, the second two columns are for the robbery rate, and the third two columns are for the burglary rate. Turning attention to the assault rate columns, the first presents the results from the baseline model and the second presents the results from the spatial autocorrelation model. Columns three and four (robbery) and five and six (burglary) are to be interpreted similarly. Table 2 is identical to Table 1 except that it presents the results of the baseline models and the panel corrected standard error models (for contemporaneous correlation), without the spatial autocorrelation terms.

As can be seen in Table 1, including a spatial autocorrelation term does not substantially influence any of the empirical relationships assessed here. Across all three dependent variables, the slope estimates and standard errors for each of the independent variables are virtually identical between the baseline models and the models that include the spatial autocorrelation term.13

On the other hand, Table 2, which displays our comparison of the baseline models to those with panel corrected standard errors (PCSEs), indicates that adjusting for contemporaneous correlation in this context is quite important. Indeed, the effect of contemporaneous correlation is rather complex across the assault, robbery, and burglary models.
In particular, in comparing the baseline models to the PCSE models, certain relationships are mediated (the lagged dependent variable and the percent male aged 13 to 17 in the assault models; percent home exemption in the robbery models; and the lagged dependent variable and the percent home exemption in the burglary models). Other relationships, however, demonstrate a "suppression" effect (Sharpe and Roberts 1997), where the relationships actually became stronger as a result of introducing the PCSEs (percent Hispanic in the assault models; unemployment,14 both age distribution variables, and per capita law enforcement expenditures in the robbery models; and unemployment, per capita held, and per capita law enforcement expenditures in the burglary models).

CONCLUSION

In this article we have raised a number of issues relevant to the estimation of unbiased parameters in TSCS research designs. While the issues of spatial autocorrelation, contemporaneous correlation, and time trends have been discussed elsewhere, there has yet to be any single work that brings these discussions together and assesses their relevance for criminological research. Accordingly, the major purpose of this article was to review these estimation issues and to then empirically examine the degree to which they may impact the strength and significance of a number of relationships commonly studied by criminologists.

Given the analyses presented here, three major conclusions can be reached. First, an absence of controls for spatial autocorrelation did not substantially affect any of the relationships examined in this study. This does not mean that spatial autocorrelation has no relevance in TSCS designs; rather, in the present context (a county-level analysis in California) models with and without a spatial autocorrelation term were virtually identical. Second, of the three dependent variables included here, assaults appear to be the least sensitive to the estimation issues discussed above.
This may indicate that "instrumental" offenses (burglary and robbery) are more susceptible to the biases associated with contemporaneous correlation than assaults. Finally, contemporaneous correlation "matters" in the present context. Although the pattern of impact was complex (both mediating and suppression effects were revealed), the broader implication is that contemporaneous correlation issues significantly influenced multiple empirical relationships routinely assessed by criminologists.

Why does this matter? The standard approach of only using dummy variables for each time period in time-series-cross-section analysis assumes that the correlation between each unit of observation is the same across all units. For example, dummy variables for each time period force a downturn in the economy to have the same effect on all counties at the same time. The panel corrected standard errors approach we used here (that controls for contemporaneous correlation) allows this correlation to vary, which permits more accurate coefficient estimates. Again, while we are at a loss to explain the changes in the coefficients (because of inadequate theory development in this area), contemporaneous correlation appears to be worthy of consideration for criminologists.

In all, the work presented here highlights the importance of testing whether the estimation problems discussed throughout the paper are present. If they are, correcting them is necessary to avoid errors in estimation and, in turn, errors in interpretation as to which factors are, or are not, related to crime.

1 There is a wide array of techniques for the analysis of panel and TSCS data (e.g., Baltagi, 1995; Greene, 1993; Hsiao, 1986; Beck and Katz, 1995). There are special techniques for qualitative as opposed to quantitative dependent variables. Also, there are dynamic panel models for lagged dependent variables.
Furthermore, depending on the expected "error structure," one can choose from a large list of techniques designed to detect error correlations between and within individual units of observation.
2 An alternative definition of heterogeneity is diversity within a context, in this case each county. However, this diversity is expected to remain constant over time periods.
3 It is important to point out that the coefficients for the dummy variables for unit and time are practically uninterpretable. A significant coefficient on a single unit-specific dummy variable provides little information other than an indication that some unobserved (perhaps unknowable) time-stable feature of the unit exists. As such, the coefficients on unit-specific and time-specific dummy variables are nearly always suppressed. We follow this approach in the analysis section of this paper.
4 Panel heteroskedasticity, compared to ordinary heteroskedasticity, allows the error variances to vary from unit to unit while requiring that they be constant within each unit.
5 One possible technique would be to conduct a likelihood ratio test comparing the restricted and non-restricted models.
6 One issue that we do not devote attention to in the present context is the assumption of constant coefficients across time periods. Stated simply, TSCS models require constant coefficients across time and space. Thorough discussions concerning violations of this assumption as well as methods for dealing with it can be found in Pesaran et al. (1999), Pesaran and Smith (1995), and Robertson and Symons (1992).
7 A similar statistic is Geary's C. Moran's I is based on cross products to measure value association. Geary's C employs squared differences. We opt for Moran's I merely because it is the more popular of the two.
8 The actual derivation of PCSEs is rather complicated.
Readers are advised to consult the appendix in Beck and Katz (1996) for a complete explanation.
9 Although this approach has been labeled by Beck and Katz (1996) as being "modern," the same point was made two decades ago by Kessler and Greenberg (1981) in their discussion of panel analysis.
10 Three variables were fairly collinear: (1) the welfare rate; (2) the unemployment rate; and (3) per capita income. Each was correlated with the others at approximately .60; however, tolerance estimates consistently fell above .30, which led us to conclude that multicollinearity was not a problem.
11 The test used for determining whether heterogeneity is present is an F-test. The null for this test is that the coefficients for all the unit-specific dummies equal zero. The null was rejected in all of our models.
12 Although a number of empirical tests for the equality of coefficients across regression models exist (see, e.g., Brame et al., 1998; Clogg et al., 1995), given the large sample in the present case the pooled standard errors for the coefficient comparisons were rarely distinguishable from zero (most often differing from zero only beyond five decimal places), making even small differences between coefficients statistically significant (yet perhaps substantively unimportant). Thus, we opt for visual inspections of the changes in coefficients and, since our standard errors are included in the tables, the reader should be able to calculate equality of coefficients estimates if necessary.
13 A possible explanation for this finding is that spatial autocorrelation is only detectable with lower-level units of analysis (e.g., Census blocks). We thank a reviewer for pointing this out.
14 It appears as though the unemployment rate "bounces" across the robbery and burglary models. Such behavior is often treated as an indicator of multicollinearity (Hanushek and Jackson, 1977).
Diagnostic procedures (tolerance levels and condition indexes), however, did not reveal the presence of multicollinearity in these models or in those presented in either Table 1 or Table 2.

REFERENCES

Allan, E.A. and Steffensmeier, D.J. (1989). "Youth, Underemployment, and Property Crime: Differential Effects of Job Availability and Job Quality on Juvenile and Young Adult Arrests." American Journal of Sociology 54:107-123.
Anselin, L. (1998). "Exploratory Spatial Data Analysis in a Geocomputational Environment. Interactive Techniques and Exploratory Spatial Data Analysis." Working Paper 9801, Regional Research Institute, West Virginia University, Morgantown.
Bailey, W.C. (1984). "Poverty, Inequality, and City Homicide Rates." Criminology 22:531-550.
Baller, R.D., Anselin, L., Messner, S.F., Deane, G., and Hawkins, D.F. (2001). "Structural Covariates of U.S. County Homicide Rates: Incorporating Spatial Effects." Criminology 39:561-90.
Baltagi, B. (1995). Econometric Analysis of Panel Data. New York: John Wiley and Sons.
Beck, N. (forthcoming). "Time-Series-Cross-Section Data: What Have We Learned in the Last Few Years." Statistica Neerlandica.
Beck, N. and Katz, J.N. (1995). "What to Do (and Not to Do) With Time-Series Cross-Section Data." American Political Science Review 89:634-647.
Beck, N. and Katz, J.N. (1996). "Nuisance vs. Substance: Specifying and Estimating Time-Series-Cross-Section Models." Political Analysis 6:1-34.
Brame, R., Paternoster, R., Mazerolle, P., and Piquero, A. (1998). "Testing for the Equality of Maximum-Likelihood Regression Coefficients Between Two Independent Equations." Journal of Quantitative Criminology 14:245-261.
Brame, R., Bushway, S., and Paternoster, R. (1999). "On the Use of Panel Research Designs and Random Effects Models to Investigate Static and Dynamic Theories of Criminal Offending." Criminology 37:599-640.
Breusch, T. and Pagan, A. (1980). "The LM Test and Its Applications to Model Specification in Econometrics." Review of Economic Studies 47:239-254.
Campbell, D.T. and Stanley, J.C. (1967). Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally.
Cantor, D. and Land, K.C. (1985). "Unemployment and Crime Rates in the Post-World War II United States: A Theoretical and Empirical Analysis." American Sociological Review 50:317-32.
Chamlin, M.B. (1989). "Conflict Theory and Police Killings." Deviant Behavior 10:353-68.
Chamlin, M.B. and Cochran, J.K. (1997). "Social Altruism and Crime." Criminology 35:203-28.
Cherry, T.L. (1999). "Unobserved Heterogeneity Bias When Estimating the Economic Model of Crime." Applied Economics Letters 6:753-757.
Clogg, C.C., Petkova, E., and Haritou, A. (1995). "Statistical Models for Comparing Regression Coefficients Between Models." American Journal of Sociology 100:1261-1293.
Cornwell, C. and Trumbull, W.N. (1994). "Estimating the Economic Model of Crime With Panel Data." The Review of Economics and Statistics 76:360-366.
Decker, S.H. and Kohfeld, C.W. (1984). "A Deterrence Study of the Death Penalty in Illinois, 1933-1980." Journal of Criminal Justice 12:367-77.
Doreian, P. (1980a). "Linear Models With Spatially Distributed Data: Spatial Disturbances or Spatial Effects?" Sociological Methods and Research 9:29-60.
Doreian, P. (1980b). "Estimating Linear Models With Spatially Distributed Data." Pp. 359-385 in Samuel Leinhardt, (ed.), Sociological Methodology. San Francisco: Jossey-Bass.
Enders, W. (1995). Applied Econometric Time Series. New York: Wiley.
Fowles, R. and Merva, M. (1996). "Wage Inequality and Criminal Activity: An Extreme Bounds Analysis for the United States, 1975-1990." Criminology 34:163-82.
Franzese, R.J. (2002). Macroeconomic Policies of Developed Democracies. Cambridge University Press.
Getis, A. and Ord, J.K. (1992). "The Analysis of Spatial Association by use of Distance Statistics." Geographical Analysis 24:189-206.
Gillis, A.R. (1996). "So Long as They Both Shall Live: Marital Dissolution and the Decline of Domestic Homicide in France, 1852-1909." American Journal of Sociology 101:1273-1305.
Greenberg, D.F. and Kessler, R.C. (1982). "The Effect of Arrests on Crime: A Multivariate Panel Analysis." Social Forces 60:771-790.
Greene, W.H. (1993). Econometric Analysis (3rd edition). Upper Saddle River, NJ: Prentice Hall.
Guthrie, D.J. (1995). "From Cultures of Violence to Social Control: An Analysis of Violent Crime in US Counties With Implications for Social Policy." Berkeley Journal of Sociology 39:67-99.
Hannon, L. and DeFronzo, J. (1998). "The Truly Disadvantaged, Public Assistance, and Crime." Social Problems 45:383-392.
Hanushek, E.A., and Jackson, J.E. (1977). Statistical Methods for Social Scientists. San Diego, CA: Academic Press.
Harer, M.D. and Steffensmeier, D. (1992). "The Differing Effects of Economic Inequality on Black and White Rates of Violence." Social Forces 70:1035-54.
Hsiao, C. (1986). Analysis of Panel Data. New York: Cambridge University Press.
Kapuscinski, C.A., Braithwaite, J., and Chapman, B. (1998). "Unemployment and Crime: Toward Resolving the Paradox." Journal of Quantitative Criminology 14:215-44.
Kessler, R.C. and Greenberg, D.F. (1981). Linear Panel Analysis: Models of Quantitative Change. New York: Academic Press.
Kovandzic, T., Vieraitis, L.M., and Yeisley, M.R. (1998). "The Structural Covariates of Urban Homicide: Reassessing the Impact of Income Inequality and Poverty in the Post-Reagan Era." Criminology 36:569-599.
Kowalski, G.S. and Duffield, D. (1990). "The Impact of the Rural Population Component on Homicide Rates in the United States: A County-Level Analysis." Rural Sociology 55:76-90.
Kposowa, A.J. and Breault, K.D. (1993). "Reassessing the Structural Covariates of U.S. Homicide Rates: A County Level Study." Sociological Focus 26:27-46.
Kposowa, A.J., Breault, K.D., and Hamilton, B. (1995). "Reassessing the Structural Covariates of Violent and Property Crimes in the USA: A County Level Analysis." British Journal of Sociology 46:79-105.
Land, K.C., Cantor, D., and Russell, S.T. (1995). "Unemployment and Crime Rate Fluctuations in the Post-World War II United States." In Hagan, J., and Peterson, R.D. (Eds.), Crime and Inequality. Stanford, CA: Stanford University Press.
Langworthy, R.H. (1989). "Do Stings Control Crime? An Evaluation of a Police Fencing Operation." Justice Quarterly 6:27-45.
Lee, R.S. (1996). "The Ecology of Violence in the United States." International Journal of Group Tensions 26:3-20.
Lempert, R. (1966). "Strategies of Research Design in the Legal Impact Study: The Control of Plausible Rival Hypotheses." Law and Society Review 1:111-132.
Lynch, M.J., Groves, W.B., and Lizotte, A. (1994). "The Rate of Surplus Value and Crime: A Theoretical and Empirical Examination of Marxian Economic Theory and Criminology." Crime, Law and Social Change 21:15-48.
Maddala, G.S. (1992). Introduction to Econometrics. New York: MacMillan.
Marvell, T.B. and Moody, C.E. (1995). "The Impact of Enhanced Prison Terms for Felonies Committed with Guns." Criminology 33:247-278.
Marvell, T.B. and Moody, C.E. (1996). "Specification Problems, Police Levels, and Crime Rates." Criminology 34:609-46.
Marvell, T.B. and Moody, C.E. (2001). "The Lethal Effects of Three-Strikes Laws." Journal of Legal Studies 30:89-106.
Mathur, V.K. (1978). "Economics of Crime: An Investigation of the Deterrent Hypothesis for Urban Areas." Review of Economics and Statistics 60:459-466.
Mencken, F.C. and Barnett, C. (1999). "Murder, Nonnegligent Manslaughter, and Spatial Autocorrelation in Mid-South Counties." Journal of Quantitative Criminology 15:407-422.
Osgood, D.W. (2000). "Poisson-Based Regression Analysis of Aggregate Crime Rates." Journal of Quantitative Criminology 16:21-43.
Osgood, D.W. and Chambers, J.M. (2000). "Social Disorganization Outside the Metropolis: An Analysis of Rural Youth Violence." Criminology 38:81-115.
Pesaran, M., Shin, Y., and Smith, R. (1999). "Pooled Mean Group Estimation of Dynamic Heterogeneous Panels." Journal of the American Statistical Association 94:621-34.
Pesaran, M. and Smith, R. (1995). "Estimating Long-Run Relationships From Dynamic Heterogeneous Panels." Journal of Econometrics 68:79-113.
Petee, T.A. and Kowalski, G.S. (1993). "Modeling Rural Violent Crime Rates: A Test of Social Disorganization Theory." Sociological Focus 26:87-89.
Phillips, L. and Votey, H.L. (1975). "Crime Control in California." Journal of Legal Studies 4:327-350.
Pogue, T.F. (1975). "Effect of Police Expenditures on Crime Rates: Some Evidence." Public Finance Quarterly 3:14-44.
Pratt, T.C. and Godsey, T.W. (forthcoming). "Social Support and Homicide: A Cross-National Test of an Emerging Criminological Theory." Journal of Criminal Justice.
Robertson, D. and Symons, J. (1992). "Some Strange Properties of Panel Data Estimators." Journal of Applied Econometrics 7:175-189.
Sampson, R.J. and Cohen, J. (1988). "Deterrent Effects of the Police on Crime: A Replication and Theoretical Extension." Law and Society Review 22:163-89.
Sampson, R.J. and Groves, W.B. (1989). "Community Structure and Crime: Testing Social-Disorganization Theory." American Journal of Sociology 94:774-802.
Sharpe, N.R. and Roberts, R.A. (1997). "The Relationship Among Sums of Squares, Correlation Coefficients, and Suppression." American Statistician 51:64-66.
Shepherd, J.M. (forthcoming). "Fear of the First Strike: The Full Deterrent Effect of California's Two- and Three-Strikes Legislation." Journal of Legal Studies.
Smith, D.A. and Parker, R.N. (1980). "Type of Homicide and Variation in Regional Rates." Social Forces 59:136-47.
Stack, S. (1984). "Income Inequality and Property Crime: A Cross-National Analysis of Relative-Deprivation Theory." Criminology 22:229-57.
Stimson, J. (1985). "Regression in Space and Time: A Statistical Essay." American Journal of Political Science 29:914-947.
Swimmer, E. (1974). "The Relationship of Police and Crime: Some Methodological and Empirical Results." Criminology 12:293-314.
Warner, D. and Wilcox Roundtree, P. (1997). "Local Ties in a Community and Crime Model: Questioning the Systemic Nature of Informal Social Control." Social Problems 44:520-36.
Williams, K.R. and Flewelling, R.L. (1988). "The Social Production of Criminal Homicide." American Sociological Review 53:421-31.
Wooldridge, J.M. (2000). Introductory Econometrics: A Modern Approach. New York: South-Western College Publishing.
Yu, J. and Liska, A.E. (1993). "The Certainty of Punishment: A Reference Group Effect and Its Functional Form." Criminology 31:447-64.
*The authors would like to thank David Greenberg for his helpful comments on a previous version of this manuscript.

ABOUT THE AUTHOR

Travis C. Pratt is an Assistant Professor in the Department of Political Science/Criminal Justice at Washington State University. His research interests focus on correctional policy and criminological theory, with particular attention to structural theories of crime. His recent work has appeared in Criminology, the Journal of Research in Crime and Delinquency, and Justice Quarterly.

Contact Information: Direct Correspondence to John Worrall, Department of Criminal Justice, California State University, 5500 University Parkway, San Bernardino, CA 92407. email: jworrall@csusb.edu
Last modified January 2004