# Explaining regional unemployment differences in Germany: a spatial panel data analysis


SFB 649 Discussion Paper 2012-026

Franziska Lottmann (Humboldt-Universität zu Berlin, Germany)

This research was supported by the Deutsche Forschungsgemeinschaft through the SFB 649 "Economic Risk".

http://sfb649.wiwi.hu-berlin.de, ISSN 1860-5664

SFB 649, Humboldt-Universität zu Berlin, Spandauer Straße 1, D-10178 Berlin

Franziska Lottmann∗†

Abstract

This paper analyzes the determinants of regional differences in German unemployment rates. We specify a spatial panel model to avoid biased and inefficient estimates due to spatial dependence. Additionally, we control for temporal dynamics in the data. Our study covers the whole of Germany as well as East and West Germany separately. We exploit district-level data on 24 possible explanatory variables for the period from 1999 until 2007. Our results suggest that the spatial dynamic panel model is the best model for this analysis. Furthermore, we find that German regional unemployment is of a disequilibrium nature, which justifies political interventions.

Keywords: regional unemployment, spatial dependence, spatial panel models, Germany

JEL classification: C23, R12, R23

∗ Corresponding author: Franziska Lottmann, Humboldt-Universität zu Berlin, School of Economics and Business, Institute of Statistics and Econometrics, Chair of Econometrics. Spandauer Str. 1, 10178 Berlin, Germany. Tel: +49-30-2093 5705, Fax: +49-30-2093 5712, Email: franziska.lottmann@wiwi.hu-berlin.de.

† I would like to thank Nikolaus Hautsch and Bernd Droge as well as the participants of the 5th World Conference of the Spatial Economic Association (SEA) for helpful comments on this project. Financial support of the Deutsche Forschungsgemeinschaft via SFB 649 "Economic Risk" for the provision of data is gratefully acknowledged.

## 1 Introduction

The unemployment rate is a widely used and often discussed indicator of the economic well-being of a country. However, the discussion mostly concentrates on national unemployment rates, which give no information about the regional structure of unemployment. Yet data on regional unemployment rates show substantial differences between regions. According to Taylor and Bradley (1997), regional differences within a country are stronger than differences between countries. Regional differences are of particular interest in Germany due to the specific history of the country. Until 1990, Germany was divided into two separate countries with different economic systems. The division of Germany caused structural differences, resulting in adjustment processes that are still not complete today.

This paper analyzes the determinants of regional differences in German unemployment rates using spatial econometric methods. We identify the driving factors in the whole of Germany as well as in East and West Germany separately. Twenty years after German reunification, this study is, to the best of our knowledge, the first contribution investigating regional unemployment in Germany.

A specific feature of regional labor markets is their correlation over space. The presence of spatial (auto-)correlation implies that the level of regional unemployment in one particular region is correlated with that of neighboring regions. On the one hand, firms do not restrict their recruiting activities to their own location and, on the other hand, job searchers might accept a job in a different area. The spatial econometric literature shows that ignoring spatial effects yields biased and inefficient estimates (see Anselin and Bera (1998) among others). Therefore, we apply a spatial econometric model to avoid these shortcomings.
To model regional unemployment, we take into account 24 possible explanatory variables covering equilibrium and disequilibrium factors and derive our set of regressors by model selection. We have panel data on 412 German districts, corresponding to the German NUTS¹ III regions, for the period from 1999 until 2007. As labor market data exhibit not only spatial but also temporal dynamics, we use both a static and a dynamic modeling approach, while most contributions in the literature consider only static model specifications. Both the application of a spatial panel model and dynamic modeling are novel in the context of regional unemployment.

Regional unemployment differentials have been the subject of intensive research. From a methodological point of view, the empirical literature can be divided into two strands. On the one hand, models for regional unemployment are estimated using (non-spatial) panel data techniques. Examples are Partridge and Rickman (1997), who use data on state unemployment for the United States, and Taylor and Bradley (1997), who provide a comparative study of regional unemployment disparities in Germany, Italy and the United Kingdom. Their data for Germany cover only the Western part for the period from 1984 until 1994. They use data at the level of the German Länder, which correspond to the NUTS I level. On the other hand, contributions apply spatial econometric models in a cross-sectional setting. The first contribution in this direction is Molho (1995), who provides evidence for the presence of significant spillovers in the adjustment to local shocks using data on 280 Local Labor Market Areas in Great Britain. Further examples of this strand of literature are Aragon et al. (2003), who analyze district-level data for the Midi-Pyrénées region of France, and Cracolici et al. (2007), who explore the geographical distribution of unemployment in Italy. Finally, Elhorst (2003) provides a survey of theoretical models and explanatory variables for regional unemployment differences.
¹ NUTS (from the French abbreviation) stands for "Nomenclature of Territorial Units for Statistics"; it is a hierarchical classification of regional units for statistical purposes.

We contribute to the existing literature in two respects. First, we apply both a static and a dynamic spatial panel model. We thereby exploit the panel dimension of the data and, in addition, account for both spatial and temporal dependence in the data. Our results show that the spatial dynamic panel model fits our data best. Second, we provide evidence that regional unemployment in Germany is of a disequilibrium nature, which provides a justification for political interventions on regional labor markets.

The structure of this paper is as follows. The second section briefly reviews theoretical explanations for regional unemployment differentials, while the third presents the data set and explains how the spatial weights matrix is defined. The econometric model is introduced in the fourth section, which covers model selection, specification testing and spatial econometric modeling. The fifth section is dedicated to the estimation results for the whole of Germany as well as for East and West Germany individually. Finally, the last section concludes.

## 2 Theoretical explanations for regional unemployment differentials

Classical economic theory suggests that differences in regional unemployment should not occur, because unemployed persons living in a region with high unemployment are expected to move to an area with lower unemployment. A similar reasoning holds for firms, which are assumed to move from low-unemployment to high-unemployment regions because they can benefit from a larger pool of workers. However, regional unemployment data show substantial differences.

### 2.1 Why do regional unemployment rates differ?

The literature provides different explanations for the existence of regional unemployment differentials, which can be summarized in two views. The equilibrium view assumes the existence of a stable equilibrium in which regions have different unemployment rates. According to Molho (1995), this equilibrium is characterized by “uniform utility across areas for homogeneous labor group” (p. 642). In this setting, there is no incentive for further migration. Hence, households (and firms) need to be compensated for high (low) unemployment by other positive factors, so-called amenities. Such amenities are, for example, a better climate, reasonable housing prices or a higher quality of life. Hence, the equilibrium rate of unemployment in region i is a function of the amenity endowment in this region (Marston (1985)). The equilibrium view has received theoretical and empirical support from (among others) Marston (1985), drawing on ideas from Hall (1970).

Contrary to the equilibrium view, the disequilibrium view assumes that regional unemployment will equalize in the long run. However, the adjustment process might be slow. The speed of adjustment depends on different factors that are connected to both labor supply and labor demand. Such factors are, for example, the age structure and the educational attainment of the population. Young people are more likely to migrate as they have lower opportunity costs and are less risk averse (Aragon et al. (2003)). People holding a degree of higher education are also more likely to move because the labor market for high-skilled workers is larger and these persons are expected to be better informed (Aragon et al. (2003)). The structure of the labor force also influences the relocation behavior of firms. Moreover, population density affects the adjustment process to the long-run equilibrium. Unemployment is expected to be lower in urban areas because the matching process between unemployed persons and vacant jobs is more efficient. Furthermore, the migration behavior of people is clearly influenced by migration costs. For example, housing prices and the structure of the housing market influence how easy it is for a household to change its location.
These explanations for regional unemployment differences give rise to different conclusions for policy makers. According to Marston (1985), government efforts to reduce regional unemployment differentials are “useless” (p. 58), since they cannot reduce unemployment anywhere for long when the level of regional unemployment can be considered an equilibrium state. By contrast, the disequilibrium view delivers an “implicit justification for programs that target government funds to depressed areas” (Marston (1985), p. 58). In light of these different consequences for policy, it is important to assess whether regional unemployment can be considered an equilibrium phenomenon or not.

However, the two explanatory approaches are not necessarily mutually exclusive. Marston (1985) states that “it may be that an equilibrium relationship exists, but that equilibrating forces are so weak that individual areas spend a long period of time away from their equilibrium” (p. 59). For the German case, there are arguments for both theoretical approaches to explain the regional labor market situation. On the one hand, about twenty years after German reunification, the economic catching-up process of East Germany is not yet complete. On the other hand, regional unemployment rates are not expected to equalize in the long run because of structural differences between regions. Structural differences exist not only between East and West Germany but also within East and West Germany and other areas.

Partridge and Rickman (1997) combine both approaches and extend the set of factors that might influence regional unemployment. In contrast to the equilibrium view, they do not assume that household utility in terms of income and amenities will equalize across areas in equilibrium. They add monetary and psychological costs of household relocation to the household utility function. These costs can be sufficiently high to limit the relocation of households. As regional unemployment in Germany has both equilibrium and disequilibrium aspects, we base our empirical analysis on Partridge and Rickman (1997).

### 2.2 Set of possible determining factors

Following Partridge and Rickman (1997), we assume that unemployment in region i in year t depends on disequilibrium variables and an equilibrium component, which is a function of market equilibrium effects, demographic characteristics as well as producer and consumer amenities. For the choice of the actual variables in these categories, we take into account the empirical regional unemployment literature. However, the set of our variables is limited by data availability.

**Disequilibrium effects**

We use the employment growth rate which, according to the literature, has turned out to be an important determinant of regional unemployment. This is not surprising because the change in employment directly affects unemployment.² Other variables capturing disequilibrium effects are wages or unit labor costs. Unfortunately, these data are not available at the regional level required for our analysis.

² It would be interesting to analyze the impact of (temporally) lagged values of employment growth on regional unemployment. However, to the best of our knowledge, employment data for periods prior to 1999 are not available at the district level.

**Market equilibrium effects**

To account for the sectoral structure of regions, we use employment shares of different sectors. According to Martin (1997), industrial composition effects are a “primary reason” (p. 244) for labor demand and regional unemployment to differ across regions.

**Demographic characteristics**

Demographic characteristics influence both labor demand and labor supply by affecting the number of new hires, quits, and workers leaving the labor force (Partridge and Rickman (1997)). We use the shares of young and older persons to account for the age structure of the labor force. In contrast to studies on other countries, for example the United States, German labor market data do not contain any information on ethnicity in general. However, we have data on the share of foreigners in the labor force. Another important demographic variable is labor force participation, especially female labor force participation. Due to the different social roles of women in the two German states before 1990, labor force participation of women differs substantially between East and West Germany. Unfortunately, the data on female labor force participation are only available at the level of Regierungsbezirke, which partly correspond to the NUTS II regions of Germany; this variable therefore exhibits less regional variation than the others. To include information on human capital, we use data on three levels of educational attainment: a university degree, a vocational qualification and no professional qualification at all. Furthermore, we use the balance of incoming and outgoing commuters of district i to control for a region’s linkages with other regions. A positive commuting balance in region i indicates that labor supply in region i is increased by incoming commuters. Moreover, a positive commuting balance indicates positive demand for labor in region i.

**Amenities**

On the one hand, the impact of amenities is captured by population density. It is a proxy for consumer and producer amenities because urban areas provide more amenities than rural areas. Unemployed persons have more employment opportunities, and the matching process is expected to be more efficient in urban areas. However, urban areas are also associated with pollution and congestion. On the other hand, we consider three amenity variables which, to the best of our knowledge, have not been considered in the regional unemployment literature so far. First, we use the public debt ratio of a district, because high public debt in relation to gross domestic product (GDP) indicates a limited ability of a region to finance public goods and subsidies. Additionally, strongly indebted communities are not attractive locations for firms to create new businesses. Second, we use data on the number of business registrations. This variable is a proxy for producer amenities.
A higher number of new businesses will result in a higher demand for labor. Third, we use the number of overnight stays to capture a region’s attractiveness to tourists. Additionally, a high number of overnight stays may be related to high business activity.³

³ In contrast to other studies (such as Cracolici et al. (2007) or Molho (1995)), we do not consider housing prices in our analysis because the majority of Germans live in rented apartments. In 2006, 58% of the German population lived in rented apartments (see Timm (2008)). Until now, there exists no comprehensive database of rental prices in German districts.

## 3 Data and spatial weights matrix

### 3.1 Regional unemployment and its determining factors

The data on regional unemployment rates used in this analysis are provided online by the Federal Employment Office (Bundesagentur für Arbeit). As these are official data, the underlying definition of unemployment corresponds to the regulations in the German Social Security Code (Sozialgesetzbuch). Moreover, we use a large regional data set of possible explanatory variables. All these variables are taken from the regional database of the Federal Statistical Office of Germany (Statistisches Bundesamt). Since some values were missing in this database, we requested them directly from the corresponding regional statistical institutions. A detailed description of the data and sources can be found in Table 12 in the appendix. Our data set covers the period from 1999 until 2007.⁴ The end of our sample period is determined by a change in the sectoral classification in 2008, i.e. data on employment in different industries are not comparable before and after this change of classification. The data are available for all 412 German districts (Landkreise and kreisfreie Städte), which correspond to German NUTS III regions.⁵ During our sample period, there were two reforms of district allocation. We allocate the data for the whole period in such a way that they correspond to the situation after these reforms. Details on the district reforms can be found in the appendix.

⁴ In 2005, a labor market reform ("Hartz reform") became effective which changed the definition of unemployment. Therefore, the number of unemployed increased by definition in this year.

⁵ Baddeley et al. (1998) state that NUTS III regions "most closely approximate meaningful labor markets" (p. 204). However, Eckey et al. (2007b) explain that travel-to-work areas are the relevant regional level for analyses of regional production and unemployment.

To visualize regional differences in unemployment rates of German districts, Figure 1 presents a map of Germany which is colored according to the extent of regional unemployment in 2009.⁶ Additionally, Table 3 shows summary statistics of regional unemployment rates over time. Based on these exploratory tools, we can summarize the following major facts. First, there is substantial variation in regional unemployment rates in Germany. In 2004, the district with the lowest unemployment exhibited a rate of 4.4% (Eichstätt district), while the highest regional unemployment rate amounted to 31.4% (Uecker-Randow district). Second, the German labor market is characterized by strong differences between East and West Germany, which can still be considered consequences of German division. Regional unemployment rates are higher in East Germany. However, in a ranking of German districts with respect to unemployment, some East German districts are placed ahead of West German districts. Third, besides the East-West differences, there is a slight North-South divide.

To test for stationarity of the data, we apply panel unit root tests. The results of the Im et al. (2003) (IPS) test and the Fisher-type (ADF) test, which was proposed in Maddala and Wu (1999) and in Choi (2001), clearly reject the hypothesis of a unit root in regional unemployment rates at all reasonable significance levels. In addition, we apply the IPS test and the Fisher-type (ADF) test to our set of explanatory variables and find that all explanatory variables are stationary as well. However, Baltagi et al. (2007a) show that there can be considerable size distortions in panel unit root tests when the true model exhibits spatial error correlation. Hence, these test results can only serve as tentative evidence regarding stationarity of the data.

### 3.2 Spatial autocorrelation on German labor markets

An important component of spatial econometric models is the spatial weights matrix.
It is a nonstochastic matrix that exogenously specifies the spatial relations between observations. Hence, the spatial weights matrix determines the neighborhood of district i. Accordingly, the term ‘neighboring’ always refers to the neighborhood set defined by the corresponding spatial weights matrix. On the one hand, we use a binary spatial weights matrix with entries zero and one and, on the other hand, matrices with general weights.

⁶ The map of Germany shows that some of the NUTS III regions lie within others, i.e. these districts have only one physical neighbor.

The simplest version of a spatial weights matrix is the binary contiguity matrix. When two districts share a common border, the corresponding entry in the spatial weights matrix is one, and zero otherwise. The elements on the main diagonal are zero by definition. This matrix induces a simple spatial structure which might not reflect actual spatial linkages appropriately. Therefore, we construct spatial weights matrices with general weights. On the one hand, we use data on geographic distances between districts and, on the other hand, we use a combination of geographic distance and size, as proposed in Molho (1995), to define spatial weights.

Geographic distance has frictional effects on labor market activity. Workers prefer to find a job in their closer environment because commuting and moving entail monetary and psychological costs. Therefore, we use great circle distances between centroids of districts to define the entries of the spatial weights matrix. Summary statistics of the geographic distances are provided in Table 1.

Table 1: Summary statistics of geographic distances (in kilometers) between centroids of German districts

| Min | 1st Qu. | Median | Mean | 3rd Qu. | Max | Std. dev. |
|---|---|---|---|---|---|---|
| 1.18 | 191.7 | 298 | 310.6 | 417.1 | 845.6 | 155.52 |

The weights of the distance-based matrix are defined by

$$
w_{ij} =
\begin{cases}
\exp(-\tau d_{ij}) & \text{for } i \neq j \\
0 & \text{for } i = j,
\end{cases}
\tag{1}
$$

where τ is a distance decay parameter and $d_{ij}$ is the geographic distance between districts i and j. The resulting spatial weights matrix crucially depends on the choice of τ. To determine the distance decay parameter, we use a grid search over different values of τ and decide according to the Bayesian and Akaike information criteria which parameter value is most suitable for our data. Niebuhr (2003) also uses this distance decay function to define the weights for her analysis of regional unemployment in Europe.

However, the distance decay function neglects the labor market size of districts. Spatial dependence differs when the extent of employment opportunities differs, even if the distances between districts are the same. We expect that the spatial impact of a district with high employment on a low-employment district is stronger than vice versa. Therefore, we use the weighting scheme proposed by Molho (1995), which combines size with the distance decay effect. According to Molho (1995), the spatial weights are defined by

$$
w_{ij} =
\begin{cases}
\dfrac{E_j \exp(-\eta d_{ij})}{\sum_{k \neq i} E_k \exp(-\eta d_{ik})} & \text{for } i \neq j \\
0 & \text{for } i = j,
\end{cases}
\tag{2}
$$

where E denotes the employment level and η is the distance decay parameter. As Molho (1995) points out, this weighting scheme implies that the spillover effect of the labor market situation in region j on region i increases with the size of region j (measured in terms of employment) and decreases with the distance between the two districts. Again, the impact of distance on the strength of the spatial relation crucially depends on the distance decay parameter η. We perform a grid search for η and decide on the appropriate value for our model according to information criteria.

Labor market activity, and hence labor market data, is expected to be correlated over space. To verify this, we perform the Moran I test for spatial autocorrelation using regional unemployment rates. As this test is not specified for a particular spatial

process, we can apply it directly to our data. The null hypothesis of this test is the absence of spatial autocorrelation, while the alternative is not exactly specified. The test statistic can be expressed as (Moran (1950))

$$
I = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (u_i - \bar{u})(u_j - \bar{u})}{\sum_{i=1}^{n} (u_i - \bar{u})^2},
\tag{3}
$$

where $u_i$ and $u_j$ are the regional levels of unemployment in districts i and j, $\bar{u} = \frac{1}{n} \sum_{i=1}^{n} u_i$ is the mean unemployment rate, and $w_{ij}$ is the element of the spatial weights matrix indicating the spatial impact of region j on region i. For the computation of the Moran I statistic we use the binary contiguity matrix.⁷

As the Moran I statistic is designed to detect spatial autocorrelation in cross-sectional data, we compute it for every year of our sample separately. The results of the Moran I test are presented in Table 2. They show that regional unemployment rates are positively spatially autocorrelated during the period from 1999 until 2007. Furthermore, the values of the Moran I statistic show an overall decreasing trend after a peak in 2001, i.e. the extent of spatial autocorrelation in regional unemployment rates decreases between 1999 and 2007.

⁷ We also tried the other spatial weights matrices to compute the Moran I statistic and obtained qualitatively the same results.

## 4 Econometric Model

In order to control for spatial autocorrelation in the data, we specify a spatial econometric model for our analysis of regional unemployment. We apply a panel data model, which allows us to account for unobserved individual heterogeneity in the data. We obtain our model in two steps. First, we use a model selection procedure to decide which variables from our set of possible explanatory variables actually have a significant impact on regional unemployment. Second, we use the specification test by Debarsy

Table 2: Results of the Moran I test for spatial autocorrelation (1999-2007)

| Year | Moran I | Z | p-value |
|---|---|---|---|
| 1999 | 0.874 | 26.48 | 0 |
| 2000 | 0.875 | 29.02 | 0 |
| 2001 | 0.890 | 29.51 | 0 |
| 2002 | 0.882 | 29.25 | 0 |
| 2003 | 0.863 | 28.61 | 0 |
| 2004 | 0.846 | 28.05 | 0 |
| 2005 | 0.799 | 26.5 | 0 |
| 2006 | 0.810 | 26.86 | 0 |
| 2007 | 0.793 | 26.29 | 0 |

Notes: Z denotes the standard deviate of the Moran I statistic, i.e. $Z = (I - E[I]) / \mathrm{sd}(I)$. The null hypothesis is the absence of spatial autocorrelation, whereas the alternative is positive spatial autocorrelation. The Moran I values are computed assuming normality.

Table 3: Summary statistics of regional unemployment rates (1999-2009)

| Year | Min | 1st Qu. | Median | Mean | 3rd Qu. | Max | Std. dev. | National |
|---|---|---|---|---|---|---|---|---|
| 1999 | 4 | 7.8 | 10 | 11.41 | 14.3 | 24.8 | 4.815 | 11.7 |
| 2000 | 3 | 6.7 | 8.8 | 10.46 | 13.3 | 25.6 | 5.158 | 10.7 |
| 2001 | 3 | 6.3 | 8.4 | 10.19 | 12.7 | 26.7 | 5.356 | 10.3 |
| 2002 | 3.9 | 6.9 | 9 | 10.69 | 12.9 | 27.6 | 5.279 | 10.8 |
| 2003 | 4.6 | 7.7 | 9.8 | 11.57 | 13.9 | 29.7 | 5.424 | 11.6 |
| 2004 | 4.4 | 7.7 | 9.8 | 11.66 | 14 | 31.4 | 5.467 | 11.7 |
| 2005 | 4.7 | 8.7 | 11.4 | 12.84 | 16.1 | 29.7 | 5.323 | 13 |
| 2006 | 3.7 | 7.7 | 10.5 | 11.81 | 15 | 27.6 | 5.084 | 12 |
| 2007 | 2.4 | 6.1 | 8.5 | 9.868 | 12.6 | 24.2 | 4.733 | 10.1 |
| 2008 | 1.9 | 4.8 | 7.2 | 8.435 | 11 | 21.5 | 4.306 | 8.7 |
| 2009 | 2.5 | 5.7 | 7.9 | 8.843 | 11.4 | 20.1 | 3.908 | 9.1 |
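To make the statistic reported in Table 2 more concrete, the following Python sketch evaluates the ratio in equation (3) on a toy example. The four-district layout, the unemployment values and the function names `row_standardize` and `morans_i` are hypothetical and not taken from the paper. The sketch also assumes that the binary contiguity matrix is row-standardized, in which case the ratio in equation (3) needs no further scaling; it does not compute the Z values or p-values of Table 2.

```python
import numpy as np

def row_standardize(A):
    """Divide each row of a contiguity matrix by its row sum (rows without neighbors stay zero)."""
    A = np.asarray(A, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

def morans_i(u, W):
    """Ratio from equation (3): sum_ij w_ij (u_i - ubar)(u_j - ubar) / sum_i (u_i - ubar)^2.

    Assumption of this sketch: W is row-standardized.
    """
    z = np.asarray(u, dtype=float) - np.mean(u)
    return float(z @ W @ z / (z @ z))

# Hypothetical layout: four districts on a line, neighbors share a border.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
W = row_standardize(A)

# Similar rates next to each other versus alternating high and low rates.
i_clustered = morans_i([4.0, 5.0, 12.0, 13.0], W)
i_alternating = morans_i([4.0, 12.0, 5.0, 13.0], W)
```

In this toy setting, rates that cluster in space give a positive value and alternating rates give a negative one, which is the sense in which the positive values in Table 2 indicate positive spatial autocorrelation.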

[Figure 1: Regional unemployment in Germany in 2009. Map of German districts; color legend with ticks at 5, 10, 15 and 20.]
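The two weighting schemes of Section 3.2, the distance decay weights of equation (1) and the size-weighted scheme of Molho (1995) in equation (2), can be sketched in Python as follows. This is only an illustration under stated assumptions: the haversine formula and the Earth radius are one common way to obtain great circle distances, and the three centroid coordinates, the employment figures and the decay parameter value are hypothetical placeholders, not values from the paper.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

def great_circle_km(lat_deg, lon_deg):
    """Pairwise great circle (haversine) distances in km between centroids given in degrees."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def distance_decay_weights(d, tau):
    """Equation (1): w_ij = exp(-tau * d_ij) for i != j and 0 on the diagonal."""
    W = np.exp(-tau * np.asarray(d, dtype=float))
    np.fill_diagonal(W, 0.0)
    return W

def molho_weights(d, employment, eta):
    """Equation (2): w_ij = E_j exp(-eta d_ij) / sum_{k != i} E_k exp(-eta d_ik)."""
    E = np.asarray(employment, dtype=float)
    K = E[None, :] * np.exp(-eta * np.asarray(d, dtype=float))
    np.fill_diagonal(K, 0.0)          # exclude k = i from numerator and denominator
    return K / K.sum(axis=1, keepdims=True)

# Hypothetical centroids (roughly Berlin, Hamburg, Munich) and employment levels.
lat = [52.52, 53.55, 48.14]
lon = [13.40, 9.99, 11.58]
employment = [1000.0, 800.0, 900.0]

d = great_circle_km(lat, lon)
W_dist = distance_decay_weights(d, tau=0.01)        # tau chosen arbitrarily here
W_molho = molho_weights(d, employment, eta=0.01)    # eta chosen arbitrarily here
```

Note that the Molho weights are normalized within each row and are in general not symmetric, which reflects the expectation that a large labor market affects a small one more strongly than vice versa. The grid search over τ and η would wrap these functions in a loop and compare the information criteria of the resulting model fits; that step is not shown here.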

and Ertur (2010) to assess which spatial process best captures the spatial dynamics in our data.

### 4.1 Model selection

Our model selection procedure is based on the standard two-way fixed effects panel model (Baltagi (2008)), i.e.

$$
u_{it} = \sum_{k=1}^{K} \beta_k x_{kit} + \mu_i + \alpha_t + e_{it}; \quad i = 1, \ldots, N; \; t = 1, \ldots, T,
\tag{4}
$$

where $u_{it}$ is the regional unemployment rate, $\beta_k$ are unknown parameters and $x_{kit}$ are the values of the K explanatory variables. $\mu_i$ denotes district-specific effects and $\alpha_t$ represents time effects. We assume the district-specific effects to be fixed as our data set contains information on all German districts. The time effects capture national factors, for example business cycle effects, that affect all regions in the same way. $e_{it}$ are the disturbances, for which it is assumed that $e_{it} \sim (0, \sigma_e^2)$. The indices of the variables denote district i and year t.

Model (4) controls neither for spatial autocorrelation nor for temporal dynamics in the data. Therefore, we refer to this model as the basic model. If spatial dependence in the data is ignored, standard OLS regression will provide biased parameter estimates in the case of spatial lag dependence and in the case of spatially lagged exogenous variables. For the spatial error model, however, OLS estimation produces unbiased but inefficient estimates. Neglecting a spatial lag term is similar to an omitted variable bias (Franzese and Hays (2007)). As the spatial lag term is correlated with the error term, OLS estimation of the associated coefficient will be inconsistent (Franzese and Hays (2007), Anselin and Bera (1998)).

In order to choose the relevant variables, we divide our set of explanatory variables into three groups according to theoretical importance. Then, we regress regional unemployment rates on different combinations of variables, where the variables with the strongest theoretical support are always included. To keep the computational effort manageable, we base these regressions on the basic model (equation (4)), although OLS estimation produces biased and/or inefficient results for spatially autocorrelated data. Finally, we compute Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC) to assess the goodness-of-fit of the regressions.

Table 4 provides an overview of the division of explanatory variables into these groups. The first group contains variables which are essential for our model. We include in this group the employment shares in manufacturing and in the construction industry (%IND and %CON), the age-related demographic variables (YOUNG and OLD) as well as one of the human capital variables (H0). Additionally, we include employment growth (EG) in this group to account for disequilibrium effects.⁸ The second group contains variables that are expected to be important for the explanation of regional unemployment rates. We assign to this group our amenity variables (DENS, DEBTR, STAY and REG). Furthermore, we consider the employment shares of agriculture (%AGR), electricity, gas and water supply (%ENERW), financial business (%FIN), transport, storage and communication (%TRANS), real estate (%REAL) and public administration (%PUB) for this group. Moreover, female labor force participation (FP) as well as the remaining educational variables (H1 and H2) are part of this group. The last group consists of variables that are expected to have a weaker influence on regional unemployment. These variables are the share of foreign employed persons (FOREIGN) and the employment shares of mining and quarrying (%MINE), wholesale and retail trade (%TRADE), hotels and restaurants (%HOT) as well as education, health and social work (%EDUHEALTH).
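The selection step described above, regressing on combinations of candidate variables while always keeping a core group and then comparing information criteria, might look roughly like the following Python sketch. It is not the author's implementation: the Gaussian forms of AIC and BIC (up to additive constants), the choice of BIC as the deciding criterion, the function names and the synthetic data are all assumptions made for illustration, and the regressors are taken to be already demeaned so that no intercept or fixed effects appear.

```python
import itertools
import numpy as np

def info_criteria(y, X):
    """OLS fit and Gaussian AIC/BIC up to additive constants (an assumed form)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

def select_model(y, data, core, optional):
    """Try every subset of `optional` on top of the always-included `core` variables."""
    results = []
    for r in range(len(optional) + 1):
        for extra in itertools.combinations(optional, r):
            names = list(core) + list(extra)
            X = np.column_stack([data[name] for name in names])
            aic, bic = info_criteria(y, X)
            results.append((bic, aic, names))
    # Assumption: rank candidate models by BIC only.
    return min(results, key=lambda item: item[0])

# Synthetic illustration with hypothetical variable names (not the paper's data).
rng = np.random.default_rng(0)
n_obs = 200
data = {name: rng.normal(size=n_obs) for name in ["core", "opt_a", "opt_b", "opt_c"]}
y = 1.0 * data["core"] + 2.0 * data["opt_a"] + rng.normal(size=n_obs)

best_bic, best_aic, best_vars = select_model(
    y, data, core=["core"], optional=["opt_a", "opt_b", "opt_c"]
)
```

With 24 candidate variables an exhaustive search over all subsets would be very large, which is one plausible reason for fixing a core group and ordering the remaining variables by theoretical importance.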
⁸ Note that we have not assigned female labor force participation to this group as its regional variation is small because of limited data availability.

Our model selection procedure selects a model containing thirteen variables. The

summary statistics of these variables are in Table 10 in the appendix. To check for possible multicollinearity in our model, we analyze both the correlation matrix of the regressors and variance inflation factors; neither gives any indication of multicollinearity. Hence, our final model is

$$
\begin{aligned}
u_{it} = {} & \beta_1 \mathit{EG}_{it} + \beta_2 \%\mathit{IND}_{it} + \beta_3 \%\mathit{ENERW}_{it} + \beta_4 \%\mathit{CON}_{it} + \beta_5 \%\mathit{HOT}_{it} \\
& + \beta_6 \%\mathit{FIN}_{it} + \beta_7 \%\mathit{PUB}_{it} + \beta_8 \mathit{YOUNG}_{it} + \beta_9 \mathit{OLD}_{it} + \beta_{10} \mathit{H0}_{it} \\
& + \beta_{11} \mathit{H1}_{it} + \beta_{12} \mathit{REG}_{it} + \beta_{13} \mathit{DEBTR}_{it} + \mu_i + \alpha_t + e_{it}; \\
& i = 1, \ldots, N; \; t = 1, \ldots, T,
\end{aligned}
\tag{5}
$$

where the variables are defined as before. The time effects ($\alpha_t$) are strongly correlated with the national unemployment rate (correlation: 0.95).

Our final model contains all variables of group one. The model selection procedure selects the share of employed persons holding a vocational qualification as an additional demographic variable. Hence, we account for two of the three educational variables. Of our amenity variables, only the public debt ratio and the number of business registrations are contained in our model. Since amenities are central to the equilibrium view, our model selection results give a first indication that regional unemployment is a disequilibrium phenomenon. Furthermore, the age-related demographic variables and the educational variables are contained in our final model. Regarding the market equilibrium effects, employment shares in electricity, gas and water supply, hotels and restaurants, financial business and public administration are selected into our model in addition to the sectoral variables of group one. The significance of the employment share in hotels and restaurants can be explained by the fact that a significant part of the work in this industry is done by persons holding no specific training qualification for this field. Hence, it might be easier for unemployed persons to get a job in this field.
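For a balanced panel, the two-way fixed effects structure shared by equations (4) and (5) can be estimated by a within transformation that removes district means and year means before running OLS. The following Python sketch illustrates this on synthetic data; the function names, dimensions and coefficient values are hypothetical, and the sketch ignores spatial dependence entirely, as the basic model does.

```python
import numpy as np

def two_way_demean(a):
    """Within transformation for a balanced panel stored as (N, T, ...):
    subtract unit means and period means, add back the grand mean."""
    return (a
            - a.mean(axis=1, keepdims=True)       # district means (over t)
            - a.mean(axis=0, keepdims=True)       # year means (over i)
            + a.mean(axis=(0, 1), keepdims=True)) # grand mean

def two_way_fe(y, X):
    """OLS on two-way demeaned data; y has shape (N, T), X has shape (N, T, K)."""
    N, T, K = X.shape
    y_w = two_way_demean(y).reshape(N * T)
    X_w = two_way_demean(X).reshape(N * T, K)
    beta, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return beta

# Synthetic balanced panel (hypothetical sizes and coefficients).
rng = np.random.default_rng(1)
N, T, K = 50, 9, 2
beta_true = np.array([0.5, -1.0])
X = rng.normal(size=(N, T, K))
mu = rng.normal(size=(N, 1))      # district-specific effects
alpha = rng.normal(size=(1, T))   # time effects
y = X @ beta_true + mu + alpha + 0.1 * rng.normal(size=(N, T))

beta_hat = two_way_fe(y, X)
```

The transformation wipes out both $\mu_i$ and $\alpha_t$, so the slope coefficients are recovered without estimating the effects themselves; this shortcut relies on the panel being balanced.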

Table 4: Division of explanatory variables for model selection group 1 group 2 group 3 - employment growth (EG) - female labor force participation (FP) - share of foreign employed persons (FOREIGN) share of persons working share of employed persons share of persons working - in manufacturing (%IND) - with vocational training (H1) - in mining and quarrying (%MINE), - and in construction industry (%CON) - and with university degree (H2) - in hotels and restaurants (%HOT), share of - population density (DENS) - in wholesale and retail trade (%TRADE), - young (YOUNG) - public debt ratio (DEBTR) - in education, health and social work (%EDUHEALTH) - and old persons (OLD) - business registrations (REG) - employed persons without - number of overnight stays (STAY) any vocational training (H0) share of persons working - in agriculture, hunting and forestry (%AGR), - in electricity, gas and water supply (%ENERW), - in transport, storage and communication (%TRANS), - in financial business (%FIN), - in real estate, renting and business activities (%REAL), - in public administration and defence; compulsory social security (%PUB)

4.2 Spatial econometric modeling

To capture the spatial dependence in the data, we specify a spatial panel model. The spatial econometric literature provides different models for data with spatial autocorrelation: the model with spatially lagged exogenous variables (SLX model), the spatial error model, the spatial lag model and combinations of them. The SLX model is, from a methodological perspective, the simplest model because the additional regressors are exogenous and the error term remains spherical. We estimated this model for our data and found that none of the coefficients of the spatially lagged regressors is significant. Furthermore, the results are, according to information criteria, slightly worse than those of the basic model.⁹

4.2.1 Testing for the spatial model specification

As the model with spatially lagged exogenous variables is not appropriate for our data, we need to specify one of the other spatial processes. Hence, we perform the specification test by Debarsy and Ertur (2010) to differentiate between the spatial models. To the best of our knowledge, the test by Debarsy and Ertur (2010) is the only specification test that discriminates between the spatial lag model, the spatial error model and the model including both a spatial lag and spatially autocorrelated errors. Baltagi et al. (2003) extend the Lagrange multiplier (LM) test by Breusch and Pagan (1980) to the spatial error component model to test simultaneously for the existence of spatial error correlation and for random region effects. Additionally, they derive conditional tests for spatial error correlation and random region effects. Baltagi et al. (2007b) generalize the underlying model to a spatial panel model that controls for serial correlation over time for each spatial unit. We use this test to motivate our spatial dynamic model. Finally, Baltagi and Liu (2008) derive a test for autoregressive spatial lag dependence instead of spatial error terms.
⁹ The results can be obtained from the author upon request.

The starting point of the test by Debarsy and Ertur (2010) is the model with both a spatial lag term and spatially autocorrelated errors including fixed effects. It is called the spatial autoregressive model with spatially autocorrelated disturbances of order (1, 1) (SARAR(1,1) model) and can be described by

$$U_t = \lambda W U_t + X_t \beta + \mu + V_t; \quad V_t = \rho W V_t + \Xi_t; \quad t = 1, \dots, T, \tag{6}$$

where $U_t = (u_{1,t}, u_{2,t}, \dots, u_{n,t})'$ is an $(n \times 1)$ vector containing regional unemployment rates. $X_t$ is the $(n \times k)$ matrix containing all explanatory variables from our selected model (equation (5)), $\beta$ is the $(k \times 1)$ coefficient vector and $\mu = (\mu_1, \dots, \mu_n)'$. $W$ is the $(n \times n)$ spatial weights matrix.¹⁰ $\Xi_t = (\xi_{1,t}, \dots, \xi_{n,t})'$ is the $(n \times 1)$ vector of innovations where $\xi_{i,t}$ are i.i.d. across $i$ and $t$ and $\xi_{i,t} \sim (0, \sigma_\xi^2)$. Finally, $\lambda$ is the spatial autoregressive coefficient and $\rho$ is the spatial autocorrelation coefficient.

¹⁰ Debarsy and Ertur (2010) specify the model in their original contribution using different spatial weights matrices for the spatial lag and spatial error part. But they note that the test also works when these are equal.

Debarsy and Ertur (2010) consider five different hypotheses in their paper:

• $H_0^a$: $\rho = \lambda = 0$. This joint hypothesis tests whether there is spatial dependence in the data at all. If it cannot be rejected, there is no need for a spatial econometric model.

• $H_0^b$: $\lambda = 0$. Under the alternative, the specification is the spatial lag model. However, spatial errors may exist.

• $H_0^c$: $\rho = 0$. Under the alternative, the model contains spatially autocorrelated errors. However, a spatial lag term may exist.

• $H_0^d$: $\rho = 0$, with $\lambda$ possibly different from 0. Under the alternative, the general specification (equation (6)) has to be estimated.

• $H_0^e$: $\lambda = 0$, with $\rho$ possibly different from 0. Under the alternative, the general specification (equation (6)) has to be estimated.

The test statistics for the hypotheses $H_0^a$ to $H_0^e$ are in the appendix. Table 5 shows the results of the Debarsy and Ertur (2010) test using the binary contiguity matrix.¹¹ According to the results, we can reject all five hypotheses even at the 1% significance level. Hence, the SARAR(1,1) model is the most appropriate model for our data.

Table 5: Test results of the specification test by Debarsy and Ertur (2010) using the binary contiguity matrix

| | $H_0^a$ | $H_0^b$ | $H_0^c$ | $H_0^d$ | $H_0^e$ |
|---|---|---|---|---|---|
| LM | 1353.8 | 1285.7 | 967.19 | 7.86 | 3771.1 |
| p-value | 0 | 0 | 0 | 0.0051 | 0 |

¹¹ We also performed this test using the other spatial weights matrices and obtained qualitatively the same results.

4.2.2 Static model specification

In accordance with the results of the test by Debarsy and Ertur (2010), we include a spatial lag term and spatially autocorrelated errors in our model. Additionally, we incorporate time effects in our static spatial panel model in order to have a two-way specification as in our basic model. The static model specification is

$$
\begin{aligned}
U_t ={}& \lambda W U_t + \beta_1 EG_t + \beta_2 \%IND_t + \beta_3 \%ENERW_t + \beta_4 \%CON_t + \beta_5 \%HOT_t \\
& + \beta_6 \%FIN_t + \beta_7 \%PUB_t + \beta_8 YOUNG_t + \beta_9 OLD_t + \beta_{10} H0_t + \beta_{11} H1_t \\
& + \beta_{12} REG_t + \beta_{13} DEBTR_t + \mu + \alpha_t 1_n + V_t; \\
V_t ={}& \rho W V_t + \Xi_t; \quad t = 1, \dots, T,
\end{aligned}
\tag{7}
$$

where the variables are defined as before. The elements of the $(n \times 1)$ disturbance vector $\Xi_t = (\xi_{1,t}, \dots, \xi_{n,t})'$ are assumed to be i.i.d. across $i$ and $t$ with zero mean and constant variance $\sigma_\xi^2$. $1_n$ denotes an $(n \times 1)$ vector of ones.
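To make the structure of the static specification in equation (7) more concrete, the following sketch simulates data from its reduced form, $U_t = (I_n - \lambda W)^{-1}(X_t \beta + \mu + \alpha_t 1_n + V_t)$ with $V_t = (I_n - \rho W)^{-1} \Xi_t$. It is only an illustration: the ring-shaped, row-normalized weights matrix, all parameter values, and the function names `ring_weights` and `simulate_static_sarar` are assumptions, not taken from the paper.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized contiguity matrix for n regions on a ring (toy W, an assumption)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 1.0   # left neighbor
    W[idx, (idx + 1) % n] = 1.0   # right neighbor
    return W / W.sum(axis=1, keepdims=True)

def simulate_static_sarar(W, X, beta, mu, alpha, lam, rho, sigma, rng):
    """Draw U_t and V_t for t = 1..T from the SARAR(1,1) model with two-way effects."""
    T, n, _ = X.shape
    eye = np.eye(n)
    U = np.empty((T, n))
    V = np.empty((T, n))
    for t in range(T):
        xi = sigma * rng.standard_normal(n)            # i.i.d. innovations
        V[t] = np.linalg.solve(eye - rho * W, xi)      # spatially autocorrelated errors
        rhs = X[t] @ beta + mu + alpha[t] + V[t]
        U[t] = np.linalg.solve(eye - lam * W, rhs)     # spatial lag in the dependent variable
    return U, V

# Hypothetical dimensions and parameters, chosen only for illustration.
rng = np.random.default_rng(1)
n, T, k = 8, 9, 2
W = ring_weights(n)
X = rng.standard_normal((T, n, k))
beta = np.array([-0.05, 0.1])
mu = rng.standard_normal(n)       # region fixed effects
alpha = rng.standard_normal(T)    # time effects
U, V = simulate_static_sarar(W, X, beta, mu, alpha, lam=0.8, rho=0.4, sigma=0.5, rng=rng)
```

Solving the linear systems instead of inverting $I_n - \lambda W$ explicitly is a numerical convenience and does not change the model.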

Lee and Yu (2010b) show that for the (static) model with fixed individual and time effects the direct quasi-maximum likelihood estimation method yields inconsistent estimates for the common parameters unless $n$ is large. In addition, they show that even when both $n$ and $T$ are large, the distribution of the estimates of the common parameters is not properly centered.

Moreover, Lee and Yu (2010b) show that the use of the typical within transformation to eliminate fixed effects causes the errors in the within-transformed model to be linearly dependent. Therefore, they apply an orthogonal transformation to eliminate the individual effects, which produces independent error terms. The standard within transformation uses the deviation from time mean operator, i.e. $J_T = I_T - \frac{1}{T} 1_T 1_T'$, where $I_T$ is the identity matrix of dimension $T$. Lee and Yu (2010b) define the orthonormal eigenvector matrix of $J_T$, i.e. $[F_{T,T-1}, \frac{1}{\sqrt{T}} 1_T]$. $F_{T,T-1}$ is the $(T \times (T-1))$ submatrix corresponding to the eigenvalues of one. They suggest transforming the original data by $F_{T,T-1}$, i.e.

$$[Y_{n1}^*, \dots, Y_{n,T-1}^*] = [Y_{n1}, \dots, Y_{nT}] F_{T,T-1}. \tag{8}$$

Note that the dimension of the transformed model is $n(T-1)$. To remove the time effects from the model, they propose a similar transformation which is based on the orthogonal transformation using $J_n = I_n - \frac{1}{n} 1_n 1_n'$. Correspondingly, the model has dimension $(n-1)(T-1)$ after both transformations. Lee and Yu apply this transformation approach in various contributions (Lee and Yu (2010a), Lee and Yu (2010b), Lee and Yu (2010c)). We apply it to both our static and our dynamic model. Finally, the transformed model can be estimated by quasi-maximum likelihood.¹²

¹² For more details on the estimation methodology, see Lee and Yu (2010b).
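The orthonormal transformation described above can be sketched numerically. The example below builds $F_{T,T-1}$ and the analogous matrix for the cross-section from the eigenvectors of the demeaning operators and applies them to a toy panel. It is a minimal illustration under assumed dimensions ($n = 6$, $T = 9$) and made-up data; the helper name `orthonormal_within_matrix` is hypothetical, and the estimation step itself is not shown.

```python
import numpy as np

def orthonormal_within_matrix(m):
    """Return F (m x (m-1)): orthonormal eigenvectors of J = I - (1/m) 1 1' with eigenvalue one."""
    J = np.eye(m) - np.ones((m, m)) / m
    eigvals, eigvecs = np.linalg.eigh(J)      # J is symmetric, so eigh applies
    return eigvecs[:, eigvals > 0.5]          # drop the eigenvector with eigenvalue zero

# Toy panel (assumption): rows are regions, columns are periods.
rng = np.random.default_rng(2)
n, T = 6, 9
mu = rng.standard_normal(n)                   # individual effects
alpha = rng.standard_normal(T)                # time effects
noise = rng.standard_normal((n, T))
Y = mu[:, None] + alpha[None, :] + noise

F_T = orthonormal_within_matrix(T)            # (T x (T-1))
F_n = orthonormal_within_matrix(n)            # (n x (n-1))

Y_star = Y @ F_T                              # as in equation (8): removes mu, shape (n, T-1)
Y_star2 = F_n.T @ Y_star                      # removes alpha, shape (n-1, T-1)
```

Because the columns of each $F$ are orthonormal and orthogonal to the vector of ones, both sets of fixed effects drop out, and i.i.d. errors remain uncorrelated after the transformation. This is the property that distinguishes the approach from the usual within transformation.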

4.2.3 Dynamic model specification

Labor market data are correlated not only over space but also over time. To motivate the dynamic approach, we use the test by Baltagi et al. (2007b) because it allows for serial correlation in the error terms (in addition to spatial autocorrelation). Details on hypotheses and test statistics are in the appendix. The test results clearly show the following three aspects of our data. Firstly, there is serial dependence in our data. Hence, a dynamic model specification is reasonable in our context. Secondly, the test results give an indication for the presence of spatially autocorrelated errors. This is in line with the results of the Moran's I test, which also show significant spatial autocorrelation in regional unemployment rates. Thirdly, the test results support our assumption of a fixed effects model because we cannot reject the hypothesis that the standard deviation of the fixed effects is equal to zero.

The literature on spatial dynamic panel models provides various model specifications. Elhorst (2012) provides a survey of the literature on specification and estimation of spatial dynamic panel data models. For our analysis of regional unemployment, we include a spatial lag term, a temporally lagged term as well as a combined spatially and temporally lagged term in our dynamic model. The resulting model can be described by

$$
\begin{aligned}
U_t ={}& \lambda W U_t + \gamma U_{t-1} + \delta W U_{t-1} + \beta_1 EG_t + \beta_2 \%IND_t + \beta_3 \%ENERW_t + \beta_4 \%CON_t \\
& + \beta_5 \%HOT_t + \beta_6 \%FIN_t + \beta_7 \%PUB_t + \beta_8 YOUNG_t + \beta_9 OLD_t + \beta_{10} H0_t + \beta_{11} H1_t \\
& + \beta_{12} REG_t + \beta_{13} DEBTR_t + \mu + \alpha_t 1_n + \Xi_t; \quad t = 1, \dots, T,
\end{aligned}
\tag{9}
$$

where $\gamma$ captures the pure time-dynamic effects and $\delta$ captures the combined spatial-temporal effect. The assumptions about the error term $\Xi_t$ are as before.

Yu et al. (2008) propose a bias corrected quasi-maximum likelihood estimator for the

spatial dynamic panel data model including a spatial lag, a temporal lag and a combined spatial-temporal term. However, they only allow for individual-specific fixed effects but not for fixed time effects. Lee and Yu (2010a) provide an estimator for the same model but extended to include time period fixed effects. Lee and Yu (2010a) show that direct quasi-maximum likelihood estimation of all parameters in the model with time effects yields an additional bias of order $O(n^{-1})$. They apply their transformation approach and show that it can avoid the additional bias with the same asymptotic efficiency as the direct quasi-maximum likelihood estimates when $n$ is not relatively smaller than $T$. Furthermore, Lee and Yu (2010a) show that the direct estimates have a degenerate limit distribution while the transformed estimates are properly centered and asymptotically normal. Therefore, we apply the estimation methodology of Lee and Yu (2010a) to our dynamic model.

5 Estimation results

Firstly, we estimate the basic model, i.e. the model without any terms controlling for spatial or temporal dependence. The basic model is specified as a two-way fixed effects panel data model and is estimated using the standard within-estimator (see Baltagi (2008)). Secondly, we estimate the static spatial panel specification and, thirdly, the spatial dynamic model, both using the binary contiguity matrix, the distance decay matrix as well as the Molho (1995) weights matrix. Hence, we perform seven regressions for the whole of Germany. The regression results for the basic and the static model are in Table 6 and the results for the dynamic model are in Table 7. In addition, we perform the same regressions for the Eastern and Western part of Germany individually. Elhorst (2012) discusses stationarity issues and proposes stationarity conditions for spatial dynamic panel data models.
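One common way to think about stability in a model such as equation (9) is through its reduced form, $U_t = A U_{t-1} + \dots$ with $A = (I_n - \lambda W)^{-1}(\gamma I_n + \delta W)$: the process is stable when all eigenvalues of $A$ lie inside the unit circle. The sketch below checks this for a toy weights matrix. It is an illustration under assumptions and not a reproduction of the exact conditions in Elhorst (2012) or Lee and Yu (2010c); the ring-shaped $W$ and the function names are hypothetical, and the first parameter set merely borrows the point estimates of the binary contiguity column in Table 7.

```python
import numpy as np

def ring_weights(n):
    """Row-normalized contiguity matrix for n regions on a ring (toy W, an assumption)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 1.0
    W[idx, (idx + 1) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def spectral_radius_dynamic(W, lam, gamma, delta):
    """Largest absolute eigenvalue of A = (I - lam W)^{-1} (gamma I + delta W)."""
    eye = np.eye(W.shape[0])
    A = np.linalg.solve(eye - lam * W, gamma * eye + delta * W)
    return np.max(np.abs(np.linalg.eigvals(A)))

W = ring_weights(10)

# Point estimates from the binary contiguity column of Table 7, combined with the toy W.
radius_stable = spectral_radius_dynamic(W, lam=0.5, gamma=0.78, delta=-0.42)

# Hypothetical parameter values that violate stability, for contrast.
radius_explosive = spectral_radius_dynamic(W, lam=0.5, gamma=0.9, delta=0.0)
```

With the toy matrix the first radius is below one and the second is above one. Results for the actual district-level weights matrices may differ, since the eigenvalues of $A$ depend on $W$.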
These conditions as well as the conditions stated in Lee and Yu (2010c) are satisfied in the regression results for the

whole of Germany. However, the regression results for East and West Germany using the distance decay matrix do not meet the stationarity conditions. Therefore, we only present the results using the other spatial weights matrices for the separate analyses.

5.1 Results for the whole of Germany

Economic interpretation

As expected, regional unemployment rates are influenced negatively by employment growth. Furthermore, the shares of employed persons working in manufacturing and in the construction industry also have a negative impact on regional unemployment. Hence, districts that are specialized in these industries exhibit lower unemployment than districts with a different sectoral structure. Our estimation results reveal no indication of discrimination against older workers as the associated coefficient is also negative. However, this coefficient should not be overinterpreted because it can simply be related to effects of demographic change, i.e. an aging labor force. By contrast, the impact of younger employees on regional unemployment is positive. But this does not necessarily imply youth unemployment because the majority of persons aged 15 to 25 are still in the educational system. The share of employed persons without any professional qualification influences regional unemployment positively, which is in line with expectation from theory. Interestingly, this also holds for the share of employed persons with vocational training.

Our model contains only a few of the amenity variables. Additionally, the signs of the amenity variables are against expectation from theory. According to the equilibrium view, consumers are expected to stay in regions with high unemployment when these regions offer a high level of amenities. Hence, high unemployment should be related negatively to public debt because heavily indebted districts are not able to finance public goods to improve life quality. If high public debts result from high investments in the past, consumers expect lower expenditures in the future. However, our results show

a significant positive coefficient for the public debt ratio. A similar reasoning holds for producer amenities. Firms are expected to move to districts with high unemployment, i.e. the level of producer amenities should be higher when regional unemployment is lower. But the coefficient of business registrations is positive in our empirical results. Even if the public debt ratio is interpreted as a proxy for producer amenities, its coefficient does not have the expected sign. Thus, our results reveal no indication for regional unemployment to be of equilibrium nature in Germany. Nonetheless, some of the market equilibrium variables, i.e. employment shares, are significant in our model.

Spatial econometric interpretation

Ignoring spatial dependence in the data results in biased and inefficient estimates. The estimated coefficients of the basic model are mostly upward-biased in absolute value in comparison with the results of the static model. In an earlier contribution (Lottmann (2012)) we get a similar result for the estimation of matching functions. The existence of this bias is theoretically shown in Franzese and Hays (2007). In addition, the information criteria show that the spatial models are more appropriate for our data than the basic one. Hence, a spatial model is needed for the analysis of regional unemployment.

The dynamic model fits our data better than the static model according to information criteria. Thus, in order to model regional unemployment, a dynamic modeling approach needs to be applied. To the best of our knowledge, most of the contributions to the regional unemployment literature apply only a static model. However, most of the explanatory variables are not significant in the dynamic model. Hence, the temporal lag is able to explain a large part of the variability in regional unemployment rates. Only employment growth, the employment shares of manufacturing, construction industry and electricity, gas and water supply as well as the age-related demographic variables have a significant impact on regional unemployment. Interestingly, the sign of the coefficient

for the share of people working in the construction industry differs between the static and the dynamic model.

The spatial autoregressive coefficient ($\lambda$) and the spatial autocorrelation coefficient ($\rho$) measuring the spatial influence in our static spatial panel model are both significant, and both are positive in most cases. Hence, district-level unemployment is influenced positively by unemployment in neighboring districts. The spatial autocorrelation coefficient indicates the impact of regional effects that affect a region consisting of more than one district. Examples in the context of regional unemployment are exogenous shocks such as the closure of a production site. The spatial autoregressive coefficient of the dynamic model is also significant and positive. The same holds for the pure time-dynamic effect. This result underlines the fact that our data exhibit not only spatial but also temporal autocorrelation. In contrast, the combined spatial-time effect is negative and significant.

Furthermore, the results are fairly sensitive to the choice of the spatial weights matrix. In the spatial econometric literature, Bell and Bockstael (2000) (among others) find that estimation results are more sensitive to the specification of the spatial weights matrix than to the estimation technique. According to information criteria, the binary spatial weights matrix captures the spatial structure of the data best for the static model, while the distance decay function is most appropriate for the dynamic model.

5.2 Differences between East and West Germany

Due to German history, it is worthwhile to analyze the differences between the Western and Eastern part of the country. We use a two-regime regression, i.e. we estimate the model for both parts separately. This procedure rests on the assumption that coefficients of the explanatory variables differ between East and West Germany. From an economic perspective, we find no reason why a particular coefficient, for example the

Table 6: Regression results of regional unemployment model - basic and static model specification for the period from 1999 until 2007

dependent variable: $u_{it}$

| | basic | static, binary | static, distance (τ = 0.02) | static, Molho (1995) (η = 0.01) |
|---|---|---|---|---|
| $EG_{it}$ | -0.066∗∗∗ (-7.12) | -0.033∗∗∗ (-6.2) | -0.04∗∗∗ (-5.41) | -0.05∗∗∗ (-6.15) |
| $\%IND_{it}$ | -0.11∗∗∗ (-7.95) | -0.071∗∗∗ (-7.35) | -0.08∗∗∗ (-7.11) | -0.09∗∗∗ (-7.05) |
| $\%ENERW_{it}$ | 0.17∗∗∗ (2.6) | 0.098∗∗ (1.98) | 0.08 (1.47) | 0.12∗ (1.93) |
| $\%CON_{it}$ | -0.29∗∗∗ (-11.85) | -0.133∗∗∗ (-10.73) | -0.12∗∗∗ (-5.58) | -0.17∗∗∗ (-7.46) |
| $\%HOT_{it}$ | 0.16∗∗∗ (2.96) | 0.072∗ (1.95) | -0.01 (-0.17) | 0.09∗ (1.96) |
| $\%FIN_{it}$ | 0.17∗∗∗ (3.06) | 0.046 (1.13) | 0.102∗∗ (2.21) | 0.14∗∗∗ (2.75) |
| $\%PUB_{it}$ | 0.12∗∗∗ (4.36) | 0.053∗∗∗ (2.74) | 0.056∗∗ (2.49) | 0.073∗∗∗ (3.05) |
| $YOUNG_{it}$ | 0.35∗∗∗ (9.75) | 0.021 (0.96) | -0.008 (-0.24) | 0.057 (1.64) |
| $OLD_{it}$ | -0.16∗∗∗ (-5.86) | -0.13∗∗∗ (-7.28) | -0.2∗∗∗ (-8.03) | -0.22∗∗∗ (-8.73) |
| $H0_{it}$ | 0.103∗∗∗ (3.8) | 0.098∗∗∗ (7.78) | 0.088∗∗∗ (4.52) | 0.089∗∗∗ (4.15) |
| $H1_{it}$ | 0.081∗∗∗ (4.14) | 0.081∗∗∗ (7.56) | 0.079∗∗∗ (4.72) | 0.084∗∗∗ (4.74) |
| $REG_{it}$ | 0.17∗∗∗ (4.44) | 0.08∗∗∗ (3.35) | 0.14∗∗∗ (4.44) | 0.11∗∗∗ (3.96) |
| $DEBTR_{it}$ | 0.054∗∗ (2.2) | 0.015 (0.87) | 0.02 (0.97) | 0.026 (1.16) |
| $\lambda$ | | 0.83∗∗∗ (71.59) | 0.79∗∗∗ (16.41) | 0.78∗∗∗ (14.56) |
| $\rho$ | | -0.46∗∗∗ (-13.68) | 0.67∗∗∗ (8.77) | 0.71∗∗∗ (9.06) |
| $\sigma^2$ | 0.61 | 0.34 | 0.44 | 0.5 |
| log-like | -4123.08 | -3274.95 | -3361.05 | -3525.43 |
| AIC | 2.23 | 1.78 | 1.82 | 1.82 |
| BIC | 2.25 | 1.80 | 1.85 | 1.85 |
| obs. | 3708 | 3708 | 3708 | 3708 |

Notes: t-statistics are in parentheses. t-statistics for the static model are computed according to Anselin (1988). $\lambda$ is the spatial autoregressive coefficient and $\rho$ is the spatial autocorrelation coefficient. ∗∗∗, ∗∗ and ∗ indicate coefficients that are significant at 1%, 5% and 10%, respectively.

Table 7: Regression results of regional unemployment model - dynamic model specification for the period from 1999 until 2007

dependent variable: $u_{it}$

| | binary | distance (τ = 0.02) | Molho (1995) (η = 0.01) |
|---|---|---|---|
| $EG_{it}$ | -0.055∗∗∗ (-7.93) | -0.063∗∗∗ (-8.98) | -0.069∗∗∗ (-9.23) |
| $\%IND_{it}$ | -0.021∗∗ (-1.99) | -0.011 (-1.05) | -0.015 (-1.31) |
| $\%ENERW_{it}$ | 0.209∗∗∗ (4.25) | 0.23∗∗∗ (4.63) | 0.24∗∗∗ (4.56) |
| $\%CON_{it}$ | 0.058∗∗∗ (2.7) | 0.019 (0.87) | 0.047∗∗ (2.03) |
| $\%HOT_{it}$ | -0.08∗ (-1.85) | -0.02 (-0.36) | -0.056 (-1.22) |
| $\%FIN_{it}$ | 0.055 (1.16) | 0.07 (1.39) | 0.065 (1.29) |
| $\%PUB_{it}$ | 0.0107 (0.53) | 0.0095 (0.47) | 0.0078 (0.36) |
| $YOUNG_{it}$ | 0.046∗ (1.74) | -0.0001 (-0.0039) | 0.078∗∗∗ (2.7) |
| $OLD_{it}$ | -0.08∗∗∗ (-3.8) | -0.104∗∗∗ (-4.9) | -0.104∗∗∗ (-4.61) |
| $H0_{it}$ | 0.006 (0.28) | 0.0091 (0.43) | -0.016 (-0.71) |
| $H1_{it}$ | -0.0149 (-0.97) | -0.014 (-0.91) | -0.029∗ (-1.73) |
| $REG_{it}$ | 0.0073 (0.28) | 0.013 (0.51) | 0.013 (0.48) |
| $DEBTR_{it}$ | 0.0088 (0.48) | 0.0043 (0.23) | 0.0081 (0.41) |
| $\lambda$ | 0.5∗∗∗ (26.55) | 0.88∗∗∗ (42.41) | 0.79∗∗∗ (32.87) |
| $\gamma$ | 0.78∗∗∗ (49.04) | 0.78∗∗∗ (52.11) | 0.8∗∗∗ (55.29) |
| $\delta$ | -0.42∗∗∗ (-15.98) | -0.68∗∗∗ (-17.55) | -0.71∗∗∗ (-14.31) |
| $\sigma^2$ | 0.27 | 0.27 | 0.31 |
| log-like | -2270.7 | -1251.6 | -1444.4 |
| AIC | 1.39 | 0.77 | 0.89 |
| BIC | 1.42 | 0.80 | 0.92 |
| obs. | 3296 | 3296 | 3296 |

Notes: t-statistics are in parentheses. t-statistics of the dynamic spatial panel model are computed using the asymptotic distribution derived in Lee and Yu (2010c). $\lambda$ is the spatial autoregressive coefficient, $\gamma$ captures the pure time effect and $\delta$ captures the combined spatial-time effect. ∗∗∗, ∗∗ and ∗ indicate coefficients that are significant at 1%, 5% and 10%, respectively. The reduced number of observations results from Lee and Yu's transformation approach.
