On the Origins of Entrepreneurship: Evidence from Sibling Correlations

Promoting entrepreneurship has become an increasingly important part of the policy agenda in many countries. The success of such policies, however, rests in part on the assumption that entrepreneurship outcomes are not fully determined at a young age by factors that are unrelated to current policy. We test this assumption and assess the importance of family background and neighborhood effects as determinants of entrepreneurship, by estimating sibling correlations in entrepreneurship. We find that between 20 and 50 percent of the variance in different entrepreneurial outcomes is explained by factors that siblings share (i.e., family background and neighborhood effects). The average is 28 percent. Hence, entrepreneurship is far less than fully determined at a young age. Our estimates increase only a little when allowing for differential treatment within families by gender and birth order. We then investigate a comprehensive set of mechanisms that explain sibling similarities. Parental entrepreneurship plays a large role in explaining sibling similarities, as do shared genes. We show that neighborhood effects matter, but are rather small, particularly when compared with the overall importance of family factors. Sibling peer effects, and parental income and education matter even less.


Introduction
Entrepreneurship has been hailed as an avenue for upward social mobility and a driver of innovation, job creation, and growth (Acemoglu et al., 2013; Decker et al., 2014); policies aimed at encouraging successful entrepreneurship have therefore been adopted in many countries (Acs et al., 2016). For example, vast amounts of money are spent in the attempt to facilitate access to finance for entrepreneurs with great ideas, but limited personal capital (Lerner, 2009; Lelarge et al., 2010). Entrepreneurial education has also permeated academic and training curricula, from primary school (Huber et al., 2014), through tertiary education (Oosterbeek et al., 2010), and to active labor market programs (Fairlie et al., 2015).
The goals of these programs are usually to increase participants' levels of entrepreneurship relevant skills, such as business planning or creativity, or alternatively, to make them familiar with the possibility of careers in entrepreneurship. The potential success of policy measures and education programs, however, rests in part on the assumption that individuals' entrepreneurship selection and performance are not predetermined. If entrepreneurship is largely determined at a young age by factors outside of an individual's control, such as parental entrepreneurship experience, then adult-stage policies and education efforts may miss their target.
In this paper, we assess the importance of family background and neighborhood effects as determinants of entrepreneurship. Do individuals start life with equal chances of engaging and succeeding in entrepreneurship? If not, then what family- and/or community-wide factors do children and young adults face that may later limit, or promote, their opportunities of becoming successful entrepreneurs?
Previous literature suggests that entrepreneurial preferences may be formed at a young age.
In fact, entrepreneurship education has, so far, only been shown to influence the entrepreneurship relevant skills (Huber et al., 2014) and future entrepreneurial performance (Elert et al., 2015) of relatively young pupils (in their early teens), but not of individuals in tertiary education (Oosterbeek et al., 2010) or adults (Fairlie et al., 2015). Moreover, strong intergenerational associations in entrepreneurship have attracted considerable attention; while part of this relationship has been shown to be genetic (Nicolaou et al., 2008), parental role-modeling appears to be the main driver of the intergenerational association in entrepreneurship (Sørensen, 2007;Lindquist et al., 2015). Additionally, exposure to a dense entrepreneurial environment during formative years also increases the likelihood of entry into entrepreneurship (Guiso et al., 2015).
More generally, family influences are critical in human capital formation and occupational choices (Becker, 1988;Björklund and Jäntti, 2012).
Going beyond the parent-child intergenerational transmission literature, we quantify the importance of family- and community-wide factors (experienced when young) as determinants of entrepreneurship by estimating sibling correlations in various measures of entrepreneurship and entrepreneurial success. Sibling correlations provide us with much broader measures of the importance of family background and neighborhood effects than do intergenerational associations (Solon, 1999; Björklund and Jäntti, 2012).
To compute sibling correlations in entrepreneurship, we use detailed data drawn from Sweden's Multigenerational Register. Our data set includes information on more than 700,000 siblings. We have extensive data on individual and family-wide socio-economic variables, including information on parental education and income, parental entrepreneurship, family structure and parish of residence when individuals were young. As outcomes, we create a wide set of variables at the extensive and intensive margins, which allow us to obtain a complete picture of entrepreneurial outcomes: we recognize that self-employment and incorporation may capture different aspects of entrepreneurial engagement. Moreover, we analyze not only to what extent youth environment affects the decision to become an entrepreneur, but also to what extent it affects the ability to survive and thrive as an entrepreneur.
Our main results show that close to 25 percent of the variance in individuals' decisions to become self-employed is explained by family background and community influences; for incorporation this is close to 35 percent. These percentages are slightly higher when we consider above median years of self-employment and incorporation. Brother correlations are always larger than sister correlations, with the largest correlation for males with above median years of incorporation being close to 50 percent. Mixed sibling correlations are consistently smaller than same-sex correlations.
Sibling correlations may be viewed as lower bounds on the importance of family background, since families may also generate differences between siblings if parents treat children differently based on, for example, birth order and/or gender (see also Conley, 2004;Björklund and Jäntti, 2012). 1 As differential treatment by gender or birth order may impact entrepreneurship (Black et al., 2005;Lindquist et al., 2015), and since these family-generated differences are not captured by our sibling correlations, we apply the two-step estimation method proposed by Björklund and Jäntti (2012) to assess the potential downward 'bias' of our measures of the importance of family background. On average, our revised estimates increase by 4 percent after controlling for within-family differential treatment.
We then go on to investigate the role played by specific factors in generating sibling similarities in entrepreneurship. We examine the roles played by (i) neighborhood effects, (ii) parental characteristics, (iii) sibling peer effects, and (iv) genes.
To investigate the role of neighborhood factors we estimate neighborhood correlations in entrepreneurship (see, e.g., Solon et al., 2000). Less than 10 percent of sibling correlations or 2.5 percent of total variance can be explained by neighborhood effects. Since we do find some room for neighborhoods to play a role, we do not directly contradict the findings of Giannetti and Simonov (2009) and Guiso et al. (2015), but the scope for such effects in our setting is rather limited.
To assess the role of parental characteristics in generating sibling similarities we run an accounting exercise, where we re-estimate our sibling correlations controlling for observable parental characteristics (Mazumder, 2008;Björklund et al., 2010). Our results show that parental entrepreneurship status is quite important, with parental education and income a distant second; together, these factors account for 20 percent of sibling correlations, and around 5 percent of total variance. Interestingly, parental self-employment is a prime explanatory force in individual self-employment, but not incorporation, and parental incorporation explains best individual incorporation, but not self-employment. This suggests that self-employment and incorporation are different aspects of entrepreneurship (in line with Levine and Rubinstein, 2013;Henrekson and Sanandaji, 2014;van Praag and Raknerud, 2014;Åstebro and Tåg, 2015;Guzman and Stern, 2015), and that there are different transmission mechanisms depending on the type of entrepreneurial engagement of the parents. Parental immigration status, which is typically associated with higher participation in entrepreneurship, does not contribute more than 1.5 percent of sibling correlations, while family structure accounts for less than 1 percent.
We examine how correlations vary with the age difference between siblings to get a first indication of potential peer effects. Unlike the bulk of the sibling correlation literature, where closely spaced siblings tend to have more similar outcomes than widely spaced siblings (see, e.g., Eriksson et al., 2016), we find that sibling correlations are unaffected by birth spacing.
We then compute an upper bound on the potential contribution of sibling peer effects to our sibling correlations by performing a correlated random effects exercise (Altonji et al., 2016).
Our estimated peer effects are generally non-significant and very small in magnitude: less than 10 percent of the sibling correlation could potentially be explained by sibling peer effects.
Lastly, to investigate the role of genes (in general) and shared genes (in particular), we perform the genetic decomposition developed by Nicoletti and Rabe (2013). Our results imply a relatively important role for genes, similar to Nicolaou et al. (2008). The genes that non-twin siblings share may account for a large part of sibling correlation in entrepreneurship; on average, between 56 and 78 percent of the sibling correlations in self-employment may be due to shared genes, and between 38 and 46 percent for incorporation, although not all our estimates of heritability are significant.
We contribute to the literature on the determinants of entrepreneurship in several ways. We apply a set of methods that have not been used to analyze entrepreneurial outcomes before.
Our methods generate a novel measure of the extent to which entrepreneurial entry and success are predetermined by genetic and domestic factors when people are young. Our approach allows us to investigate a quite comprehensive set of potential mechanisms in order to ascertain the channels through which family environment influences children's entrepreneurial choices and success. In doing so, we put many previous results in the literature (regarding the role of neighborhoods, parental income, parental entrepreneurship, and genes) into perspective.
Unlike previous studies, we are able to investigate the relative importance of various mechanisms within a unified empirical framework. We conclude that parental entrepreneurship and genes are the two most important factors generating sibling similarities in entrepreneurship.
The remainder of this paper is structured as follows. Section 2 describes the data and presents descriptive statistics. In Section 3, we lay out our empirical strategy and report baseline sibling correlations. In Section 4, we assess the potential role of differential treatment of siblings within the family and we present the results of this exercise. Section 5 looks at the mechanisms that drive sibling similarities: (i) neighborhood effects, (ii) parental characteristics, (iii) sibling peer effects, and (iv) genes. Section 6 concludes.

Data
To create the dataset used in this paper, we start with a 25 percent random sample from Sweden's Multigenerational Register, which includes all persons born from 1932 onwards who have lived in Sweden at any time since 1961. We then match on all of their brothers and sisters, where siblings are defined as those having the same biological or adoptive mother.
This matching is made possible by the fact that all family ties (biological and adoptive) are recorded in Sweden's Multigenerational Register. We later check that results are similar when defining the family on the basis of the father. Given the years for which entrepreneurship data are available, we restrict our sample to those born between 1960 and 1970: thus, we follow the oldest cohort from age 25 to 52, and the youngest cohort from age 15 to 42. Those who died or left Sweden before 1985 are dropped from the sample. These cohort restrictions imply that siblings are born at most 10 years apart and that some individuals have siblings that are not included in our sample. 2
Consistent with the Swedish tax authorities, we define individuals as self-employed when they derive the majority of their taxable labor income from a business they own in full or in part. For the years 1985 to 2012, we have information on (sole and shared) business ownership for unincorporated businesses. For the years 1993 to 2012, we also know if a person received the majority of his or her taxable labor income from an incorporated enterprise owned in part or in full by him- or herself (and possibly employing personnel). An incorporated business in our data is a privately owned, non-listed, limited liability stock company. This type of business better captures a form of entrepreneurship that is more likely to be associated with job creation, innovation and growth than simple self-employment (Levine and Rubinstein, 2013).
We characterize various types of entrepreneurs. 3,4,5 Our first measure of entrepreneurship, Self-employed, is a dichotomous variable equal to one if the individual is ever categorized as self-employed in an unincorporated business that they own in full or in part, and zero otherwise.
Our second measure of entrepreneurship, Incorporated, is equal to one if the individual has ever been incorporated, and zero otherwise. In any given year, no individual is classified as both self-employed and incorporated. 6 We add two stricter definitions of entrepreneurship by using the median of the number of years individuals spend in self-employment (incorporation) as a threshold in labeling an individual as self-employed (incorporated). In our sample, Self-employed ≥ 4y takes a value of one for individuals who have been self-employed for at least 4 years, and Incorporated ≥ 5y equals one for individuals incorporated for a minimum of 5 years. 7 We define a High income self-employed as an individual whose self-employment income (defined below) is above the median.
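The duration-based definitions can be illustrated with a minimal sketch. It assumes, for illustration only, that the median is taken over individuals with at least one year in the relevant state; the function name `duration_indicator` and the data are hypothetical and not part of our estimation code.

```python
from statistics import median

def duration_indicator(years_by_person):
    """Return (threshold, flags) for a stricter, duration-based definition.

    years_by_person maps a person id to the number of years spent
    self-employed (or incorporated).
    """
    # Assumption: the median is computed over people with at least one year.
    positive_years = [y for y in years_by_person.values() if y > 0]
    threshold = median(positive_years)
    # Flag equals 1 when the person reaches the median number of years.
    flags = {pid: int(y >= threshold) for pid, y in years_by_person.items()}
    return threshold, flags

# Made-up example data, not drawn from the register.
years_self_employed = {"p1": 0, "p2": 1, "p3": 4, "p4": 9}
threshold, flags = duration_indicator(years_self_employed)
```

With these made-up values the threshold is 4 years, so only the last two individuals are flagged.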
2 We impose these sample restrictions such that we are able to observe the occupational choices of the individuals in all cohorts and their parents for a number of years. In Appendix Table C.1, column (5), we show that our estimated sibling correlations do not change if we focus only on the smaller sample of complete families.
3 We do not have information before 1993 on those working in their own incorporated enterprise. This implies we are underestimating the true extent of entrepreneurship for the years 1985-1992. For 1993-2001, roughly 2 percent of the sample is in this position. This might be approximately true for 1985-1992, as well.
4 Farmers are included in Statistics Sweden's definition of business owners, since farms are typically run as companies (either incorporated or unincorporated). 16 percent of our sample of self-employed are farmers, while 4 percent of the incorporated are farmers. Sibling correlations are the same for siblings with parents who were farmers and for siblings whose parents were not farmers.
5 In 2004, Statistics Sweden changed their routines for collecting information on business ownership, as well as its definition. Since then, it includes business owners who report zero profits or even losses.
6 Individuals who are Incorporated are not a subsample of the Self-employed. In any given year, these two variables are mutually exclusive. Some individuals have been incorporated, but not self-employed, and vice versa, and other individuals have experienced both types of entrepreneurship at some point in their careers.
7 Note that our measure of duration could, in principle, consist of separate spells.
We also create several continuous measures of entrepreneurship. We start with the number of years individuals have been self-employed or incorporated: Years self-employed and Years incorporated. These variables represent established measures of entrepreneurial duration used in the literature. We also study income from self-employment activities. This variable is created from annual tax register data for the years 1985-2012. For each person in each year, we have a measure of pre-tax total factor income, which includes earnings, taxable benefits (e.g. unemployment insurance, parental insurance, sick pay, etc.) and net capital gains (e.g. dividends, interest received or paid, etc.). In each year a person is self-employed, we label this income as 'self-employment income'. We then take the average across all years of self-employment income and then take the log of this average. This is our measure of Self-employment income.
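The construction of Self-employment income (average first, then log) can be sketched as follows. The record layout, the function name `log_mean_self_employment_income`, and the handling of non-positive averages are assumptions made for illustration only.

```python
import math

def log_mean_self_employment_income(records):
    """Log of the average income over the years a person is self-employed.

    records: iterable of (year, income, is_self_employed) tuples for one person.
    Returns None when the measure is undefined.
    """
    # Keep only income earned in years of self-employment.
    incomes = [income for _, income, self_employed in records if self_employed]
    if not incomes:
        return None  # never self-employed
    mean_income = sum(incomes) / len(incomes)
    if mean_income <= 0:
        return None  # assumption: log undefined, treat as missing
    # Average across years first, then take the log of that average.
    return math.log(mean_income)

# Made-up example: two self-employed years and one year as an employee.
person = [(1990, 200_000.0, True), (1991, 300_000.0, True), (1992, 250_000.0, False)]
value = log_mean_self_employment_income(person)
```

Taking the log of the average, rather than the average of annual logs, keeps years with zero income in the calculation as long as the multi-year average is positive.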
We have also created a set of family-wide background variables to use in our accounting exercise. We define parental self-employment and incorporation in the same way as we do for their children. We also have information on parental education, immigrant status and income. The latter is defined as the log of the average of a parent's pre-tax total factor income for all available years from 1968 to 2012, calculated separately for mothers and fathers.
Parental education is measured in seven different levels spanning the old minimum, seven-year compulsory level to graduate school. These indicate the highest degree completed in Sweden, and as such, it is missing for older immigrants who have not attended school in Sweden. 8 We include several measures of family type and family structure. We have created a variable for the mother's age at the birth of her first child (including those born before 1960 that do not appear in our sample of siblings). Similarly, we also create correct variables for the number of children in a family and a variable indicating whether a sibling is the first born child or not.
We create a dummy variable indicating whether the father is unknown and a variable for family structure at age 15, possibly varying across siblings from the same family. This variable is based on information we have about actual cohabitation; it contains six categories: missing, both parents present, single mother, single father, mother with new husband, father with new wife. We have also constructed two other family structure variables, namely the mother's partner count (i.e. the number of individuals she has conceived children with) and whether the household includes both biological and adoptive children, a motive usually found in the bequests literature. Lastly, we use information on the parishes siblings live in at age 15 to define neighborhoods. 9

Descriptive Statistics
Panel C of Table 2 shows the father is missing for 1.6 percent of our sample; 2 percent of individuals are twins and 1.4 percent have been adopted by either mother, father, or both, leading to only 0.6 percent of households that include both biological and adoptive children. The average number of children is 2.8 per family, of which we capture in our sample 1.6 children per family on average. Mothers tend to give birth in their mid-20s, and are unlikely to conceive children with more than one man (only around 3 percent do so). Our family structure variable reveals that the lion's share of families, almost 70 percent, consists of intact families. Single mothers represent the second most frequent family type (18.54 percent), followed by mothers with a new husband (5 percent), single fathers (3.79 percent), and fathers with a new wife (1.97 percent). This variable is missing for 1.39 percent of our sample.
Panel D shows that our average parish, out of a total of 2,650 parishes, comprises 262 individuals, while the largest includes 5,359 individuals. 10 On average, 17.5 percent of individuals in a parish become self-employed at some point, while 8 percent found an incorporated business.
In Table 3, we examine differences in the observable characteristics of employees, self-employed and incorporated. Employees are defined as those labor market participants who have never been self-employed or incorporated. The incorporated have (on average) higher incomes and more education than the other two groups, while the self-employed have (on average) lower incomes and less education. The parents of incorporated entrepreneurs also have more education and more income (on average) than the parents of the other two groups.
Furthermore, we see that the self-employed are more likely to have parents who were self-employed (but not incorporated) than those in the other two groups. Similarly, we see that incorporated entrepreneurs are more than three times as likely to have parents who were incorporated compared with the other two categories. Clearly, incorporated and self-employed are different types of entrepreneurs in terms of their observable characteristics and in terms of their family backgrounds, including the type of entrepreneurial experiences they were exposed to as children. These differences are very much in line with those noted by Levine and Rubinstein (2013); also, the differences between employees and the incorporated are larger than those between employees and the self-employed, implying that employees and the self-employed are more substitutable than employees and the incorporated. This is expected to translate into larger sibling correlations in incorporation than in self-employment.

Sibling Correlations
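Entrepreneurship, $E_{if}$, for sibling $i$ from family $f$ can be modeled as:
$$E_{if} = X_{if}'\beta + \varepsilon_{if}, \tag{1}$$
where $X_{if}$ includes individuals' birth year and a gender dummy for individual $i$ from family $f$.
The residual term, $\varepsilon_{if}$, is an individual-specific component representing a person's position in the overall distribution of entrepreneurship, whose population variance is given by $\sigma_\varepsilon^2$. Following Solon (1999), the individual variance component, $\varepsilon_{if}$, is assumed to be composed of two linearly additive and independent variance components:
$$\varepsilon_{if} = a_f + b_{if}, \tag{2}$$
where $a_f$ is a family component shared by siblings, with variance $\sigma_a^2$, and $b_{if}$ is an individual component, with variance $\sigma_b^2$, so that
$$\sigma_\varepsilon^2 = \sigma_a^2 + \sigma_b^2. \tag{3}$$
The share of the variance in an individual's long-run probability of being an entrepreneur (or in his or her innate propensity to choose entrepreneurship over wage employment) that can be attributed to family background effects is:
$$\rho = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_b^2}. \tag{4}$$
This share coincides with the correlation in entrepreneurship of randomly drawn pairs of siblings, which is why $\rho$ is called a sibling correlation.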
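The equivalence between the variance share and the sibling correlation follows in one step from the variance components. The short derivation below is a sketch under the assumption that the individual components $b_{if}$ are uncorrelated across siblings and with the family component $a_f$, writing $\sigma_a^2$ and $\sigma_b^2$ for the variances of the family and individual components:

```latex
\begin{align*}
\operatorname{Cov}(\varepsilon_{if}, \varepsilon_{jf})
  &= \operatorname{Cov}(a_f + b_{if},\; a_f + b_{jf}) \\
  &= \operatorname{Var}(a_f) = \sigma_a^2, \qquad i \neq j, \\
\operatorname{Corr}(\varepsilon_{if}, \varepsilon_{jf})
  &= \frac{\sigma_a^2}{\sqrt{(\sigma_a^2 + \sigma_b^2)(\sigma_a^2 + \sigma_b^2)}}
   = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_b^2} = \rho .
\end{align*}
```

Only the shared component contributes to the covariance between two siblings, which is why a larger family variance relative to the individual variance translates directly into a larger correlation.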
This sibling correlation can be thought of as an omnibus measure of the importance of family and community effects. It includes family-wide influences that are shared by siblings, such as parental entrepreneurship, parental income, parental aspirations, cultural inheritance, genes, etc. However, it also includes shared influences that are not directly experienced in the home, such as school, church, and neighborhood effects. Genetic traits not shared by siblings, differential treatment of siblings, time-dependent changes in neighborhoods, schools, etc., are captured by the individual component $b_{if}$. If such non-shared factors are relatively more important than shared factors for determining entrepreneurship, the variance of the family effects will be small relative to the variance of the individual effects and the sibling correlation will be low; in other words, the more important the effects of factors that siblings share are, the larger the sibling correlation will be.
The existence of non-shared family factors, such as differential treatment based on birth order and/or gender, implies that the sibling correlation should be viewed as a lower bound on the importance of family-background and neighborhood effects. 11 We return to this argument when we discuss sibling differences in Section 4.
11 Björklund and Jäntti (2012) discuss this issue and provide a quantitative example for the case of birth order. In particular, they examine the size of the advantage that first born children have over their younger siblings in terms of cognitive and non-cognitive skills, height, schooling, and earnings. They find only minor effects, which we also confirm, see Section 4.
An estimate of the sibling correlation in long-run entrepreneurship, $\rho$, can be constructed using estimates of the between-family variation, $\sigma_a^2$, and the individual variation, $\sigma_b^2$. These can be obtained by estimating the following mixed-effects model:
$$E_{if} = X_{if}'\beta + a_f + b_{if}. \tag{5}$$
When the outcome variable is continuous (e.g. self-employment income), we estimate this model using Stata's mixed command under the assumption that the two random components are independent realizations from a multivariate normal distribution with mean zero and constant variance. The variance components are estimated using restricted maximum likelihood.
When the outcome variable is dichotomous (e.g. Self-employed or Incorporated), we reformulate equation (5) as a latent linear response model:
$$E_{if}^{*} = X_{if}'\beta + a_f + b_{if}, \tag{6}$$
where we only observe $E_{if} = I(E_{if}^{*} > 0)$. We estimate equation (6) using Stata's melogit command under the assumption that the random effect $a_f$ is a realization from a normal distribution with mean zero and constant variance, while the individual variance component, $b_{if}$, is drawn from the logistic distribution with mean zero and variance $\pi^2/3$.
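Once the variance components are estimated, the sibling correlation is a one-line calculation. The sketch below assumes the correlation is the family variance divided by the sum of the family and individual variances, with the individual variance fixed at π²/3 in the logistic case; the function names and the input value are hypothetical and do not reproduce our Stata code or estimates.

```python
import math

# Variance of the standard logistic distribution, used for dichotomous outcomes.
LOGISTIC_VARIANCE = math.pi ** 2 / 3

def sibling_correlation(var_family, var_individual):
    """Share of total variance attributable to the shared family component."""
    if var_family < 0 or var_individual <= 0:
        raise ValueError("variance components must be non-negative / positive")
    return var_family / (var_family + var_individual)

def sibling_correlation_logit(var_family):
    """Latent-scale sibling correlation when the individual term is logistic."""
    return sibling_correlation(var_family, LOGISTIC_VARIANCE)

# Hypothetical family-level variance on the latent (log-odds) scale.
rho_latent = sibling_correlation_logit(1.0)
# Continuous case with hypothetical variance components 1 and 3.
rho_linear = sibling_correlation(1.0, 3.0)
```

In the logistic case the correlation refers to the latent propensity rather than to the observed binary outcome.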

Results
Sibling correlations are reported in Table 4. In column (1), we see that family background and community influences account for 23 percent of the likelihood of ever becoming Self-employed and 34 percent of the likelihood of ever becoming Incorporated. Looking at stricter measures of entrepreneurship, we see that 26 percent of the variation in Self-employed ≥ 4y and 42 percent of the variation in Incorporated ≥ 5y can be attributed to family-wide influences that siblings share. Becoming a High income self-employed, however, appears to be less influenced by family background. Columns (2)-(4) also show that shared family background is a more important determinant of entrepreneurship for men than for women and that outcomes for mixed sex siblings are less similar than those of same sex siblings. 12
Overall, these sibling correlations imply that family background is quite important, especially for explaining the likelihood of becoming an entrepreneur according to stricter measures of entrepreneurship, such as owning an incorporated firm for a longer period of time. They are of similar magnitude to the sibling correlations for earnings and education in Sweden (see, e.g., Björklund and Jäntti, 2012). Yet, one cannot conclude that entrepreneurship is mostly determined when young, since the majority of the variance in our outcome variables can be attributed to factors that siblings do not share.
12 The higher sibling correlation we observe for males in Table 4 may be due to heightened interaction between brothers, see Section 5.3.

Sensitivity Analyses
Sensitivity analyses are reported in Table C.1 in Appendix C. As a first robustness check, we exclude singletons from our analysis, as they only contribute to the precision of the standard error of our between-family variation estimates. Their exclusion has little impact on the estimated sibling correlations: all the coefficients vary within a small margin and the standard errors are virtually identical with or without singletons. Additionally, we define siblings as having the same father: the estimated sibling correlations are very close to the ones for siblings having the same mother. The pattern of higher sibling correlations for males remains across these alternative specifications. Sibling correlations for two-child families, closely spaced siblings, and complete families reveal a consistent picture. A placebo test for 'fake' families yields insignificant 'sibling' correlations. 13
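The logic of a placebo test on 'fake' families can be sketched with simulated data. The sketch assumes that fake families are formed by randomly re-pairing individuals, restricts attention to two-child families, and uses a simple Pearson correlation between sibling pairs instead of the mixed-effects estimator; the variances, the seed, and all names are illustrative assumptions.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)
n_families = 5000
first, second = [], []
for _ in range(n_families):
    family = rng.gauss(0.0, 1.0)  # shared component, variance 1
    # Individual components with variance 3, so the implied correlation is 0.25.
    first.append(family + rng.gauss(0.0, 3 ** 0.5))
    second.append(family + rng.gauss(0.0, 3 ** 0.5))

true_corr = pearson(first, second)

# Placebo: break the family link by randomly re-pairing the second siblings.
shuffled = second[:]
rng.shuffle(shuffled)
placebo_corr = pearson(first, shuffled)
```

Real pairs share the family component and should correlate at roughly 0.25 in this setup, while re-paired individuals share nothing and should show a correlation close to zero.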

Sibling Differences
Sibling correlations are designed to measure sibling similarities in a given outcome. However, the estimation outlined above can only generate positive (or zero) estimates of sibling similarity, meaning that families can only make siblings alike, consistent with an assumption of parental equal concern for children across birth order and gender. In reality, families may actually act as an important source of inequality between siblings (Conley, 2004); that is, some things that families do, willingly or not, may increase the difference in outcomes between siblings. If this is the case, then the sibling correlation should be viewed as a lower bound on the importance of family background. Behrman and Taubman (1986) show that the model of parental resource allocation to children produces better predictions once the assumption of equal concern is relaxed and birth order effects and parental preferences are explicitly considered. While the relationship of birth order and outcomes is a priori ambiguous, 14 the literature examining the impact of birth order on education and income generally reveals a significant premium associated with first-borns in developed economies (Black et al., 2005; Booth and Kee, 2009; Mechoulan and Wolff, 2015).
Early investments in children are sometimes reinforced by financial transfers during adulthood. While most bequests tend to be relatively equal, inter-vivos transfers display a much larger degree of inequality and may relieve financial constraints for some siblings relative to others (McGarry, 1999;Bernheim and Severinov, 2003;Behrman and Rosenzweig, 2004). These transfers appear to favor children closer to home in terms of caring for parents, who are more likely to be first-borns acting as implicit insurance during parental old age. In addition, parents are more likely to transfer resources to children involved in home production, farm or business work and to biological, rather than adoptive children (Light and McGarry, 2004).
An alternative hypothesis regarding birth order comes from psychology: even though siblings experience shared events, they have divergent interpretations of the same events, given their different ages or circumstances. Moreover, first born children tend to be associated with conformity, since they usually accept the roles that parents envision for them and embrace a more traditional outlook on life. All subsequent siblings then engage in a process of mutual differentiation or 'de-identification', whereby they adopt different roles within the family. Later born children tend to be more creative and less likely to conform to norms, which may make them more 'entrepreneurial' (Sulloway, 1996).
The (perhaps unintended) preferential treatment of first-borns, in parental attention and investments over time, may reduce the constraints these individuals face (in skill, signaling ability, or liquidity) and may facilitate their transition into entrepreneurship. Conversely, later-borns may be intrinsically more entrepreneurial, since they are more likely to be risk-taking and disruptive (Sulloway, 1996). In any case, families create differences between siblings that may have implications for entrepreneurship: we thus expect our measure of family background to increase once birth order effects are accounted for. 15
In terms of gender, same-sex homophily suggests that daughters respond more to mothers' entrepreneurship and that sons respond more to fathers' entrepreneurship (Lindquist et al., 2015; Hoffmann et al., 2015). While this does not necessarily imply overall differential treatment, a pervasive preference for sons may still exist in a business context. Bennedsen et al. (2007) show that if the first born child of a family firm's CEO is male, his replacement is more likely to come from within the family rather than to be external. The interaction of gender, first born and parental entrepreneurship may thus have an impact on the sibling correlations; its magnitude, however, is fairly limited, as Lindquist et al. (2015, footnote 2) show that less than 10 percent of firms in Sweden are inherited.
In order to assess the potential effects of birth order and gender differences, we employ a two-step procedure, following Björklund and Jäntti (2012). 16 We estimate equations (5) and (6), and then predict the individual-level residuals. We subsequently regress these residuals on birth order (a dummy for being first born), gender and their interactions with parental entrepreneurship and incorporation. The R² from these regressions reveals how much of the individual-level variance is driven by differential treatment within the family and should thus be counted towards the explanatory power of family background. We use it to compute a 'revised' estimator of the importance of family background, ρ̃. The difference ρ̃ − ρ reflects the importance of sibling gender and birth order when these characteristics lead to differences in outcomes between siblings.

Results
Estimates in Table 5 suggest that the second-stage R² is smaller than 0.03, and that less than 2 percent of total variance should be transferred from the individual to the family level.
Alternatively put, sibling correlations increase on average by 4 percent once differential treatment by gender and birth order within the family is accounted for. However, the confidence intervals of the original and the revised estimates largely overlap, suggesting that the 'bias' created by not controlling for sibling differences is rather small: the sibling correlation is thus a sufficiently tight bound on the importance of family background if we (for the moment) disregard the potential role played by the non-shared genes of siblings (more on this below).
It appears that differential treatment of siblings by parents according to either gender or birth order is small or hardly influential in entrepreneurship.
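The revision described in this subsection can be illustrated with a short sketch. It assumes, as one reading of the two-step procedure, that the second-stage R² is applied to the individual-level share of variance, 1 − ρ, and that this amount is moved to the family level. The function name and the input values are hypothetical and are not estimates from Table 5.

```python
def revised_sibling_correlation(rho: float, r2_second_stage: float) -> float:
    """Move the part of individual-level variance explained in the second
    stage to the family level (an assumed reading of the two-step procedure)."""
    if not 0.0 <= rho <= 1.0 or not 0.0 <= r2_second_stage <= 1.0:
        raise ValueError("rho and r2_second_stage must lie in [0, 1]")
    individual_share = 1.0 - rho  # share of total variance at the individual level
    return rho + r2_second_stage * individual_share


# Hypothetical inputs: a sibling correlation of 0.25 and a second-stage R² of 0.02.
rho = 0.25
rho_revised = revised_sibling_correlation(rho, 0.02)
transferred = rho_revised - rho        # share of total variance moved to the family level
relative_increase = transferred / rho  # increase relative to the original correlation
print(f"revised rho = {rho_revised:.3f}, relative increase = {relative_increase:.1%}")
```

Under this reading, a small second-stage R² can only move a small share of total variance, which is why the revised and original estimates stay close.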

Accounting for Sibling Similarities
What is it that makes the outcomes of siblings so similar? In this section, we investigate the extent to which our sibling correlation can be accounted for by (i) neighborhoods, (ii) parental characteristics, (iii) sibling peer effects, and (iv) genes. While this accounting exercise does not allow for a causal interpretation of the determinants of entrepreneurship, it provides clues about what is potentially important in explaining sibling similarities in entrepreneurship.

Neighborhoods
In his review of the determinants of entrepreneurship, Parker (2009) comments that "[a]ll major economies exhibit regional differences in rates of entrepreneurship" (p. 147), a note echoed by research into clusters of entrepreneurship (Glaeser et al., 2010). Giannetti and Simonov (2009) show that in Sweden the between-municipality variance is much larger than within-municipality variance in entrepreneurship and that a standard deviation increase in the proportion of entrepreneurs in the local labor market is associated with about 25 percent more entry into entrepreneurship. Similarly, Guiso et al. (2015) find a positive effect of local firm density (in the individuals' region or province of residence at age 18) on entry into entrepreneurship. In contrast to Giannetti and Simonov (2009), Guiso et al. (2015) also show that a higher firm density leads to higher income in entrepreneurship and the adoption of better management practices, which suggests that exposure to entrepreneurship when young aids learning.
As discussed above, the sibling correlations in Table 4 include these kinds of neighborhood and community effects. What share of the sibling correlation can be accounted for by community influences that are experienced outside of the home, but still shared by siblings? To answer this question, we estimate neighborhood correlations in entrepreneurship, using data on the parish the individual resided in at age 15. We substitute a neighborhood variance component (c_n, where n indexes parishes) for the family variance component and estimate the resulting two-level mixed-effects model, in which we include our full set of parental characteristics in X_in in order to correct for parental sorting into neighborhoods (Solon et al., 2000). The neighborhood correlation is then the share of the total variance that is attributable to the neighborhood component.

Results
Panel B in Table 6 shows the results for extensive and intensive outcome margins: the neighborhood correlation ranges between 0.007 and 0.024, meaning that shared neighborhood characteristics can explain at most 2.4 percent of entrepreneurial variance. These neighborhood correlations are small relative to our baseline sibling correlations (reproduced in panel A of Table 6), with an explanatory power of usually less than 5 percent of the sibling correlations in incorporation, but up to 11 percent of the sibling correlations in self-employment. Overall, we provide evidence that the scope for neighborhood effects in entrepreneurship in general, and incorporation in particular, is rather limited, usually accounting for less than 11 percent of the sibling correlation. This result is in line with previous literature using income, education or crime as outcomes: neighborhood effects typically explain less than 10 percent of variance (Solon et al., 2000; Page and Solon, 2003a,b; Lindahl, 2011; Nicoletti and Rabe, 2013; Eriksson et al., 2016). Our results also help put into perspective the previous results of Giannetti and Simonov (2009) and Guiso et al. (2015).
Although we do find some effects of community, most learning and role-modeling appear to take place closer to home (within the family). Thus entrepreneurship policy at the community level may have only a small role in fostering entrepreneurship among youths.
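The two quantities discussed in this subsection can be sketched as follows: the neighborhood correlation as a variance share, and its size relative to the sibling correlation. The function names and all numerical inputs are hypothetical and serve only as illustration; they are not the estimates in Table 6.

```python
def variance_share(component_var: float, residual_var: float) -> float:
    """Share of total variance attributable to a group-level component
    (here: the neighborhood component)."""
    total = component_var + residual_var
    if component_var < 0 or residual_var < 0 or total == 0:
        raise ValueError("variances must be non-negative and not both zero")
    return component_var / total


def share_of_sibling_correlation(neighborhood_corr: float, sibling_corr: float) -> float:
    """Neighborhood correlation as a percentage of the sibling correlation."""
    if sibling_corr <= 0:
        raise ValueError("sibling_corr must be positive")
    return 100.0 * neighborhood_corr / sibling_corr


# Hypothetical variance components: neighborhood 0.003, everything else 0.147.
neighborhood_corr = variance_share(0.003, 0.147)
# Hypothetical baseline sibling correlation of 0.25.
pct_of_sibling_corr = share_of_sibling_correlation(neighborhood_corr, 0.25)
print(f"neighborhood correlation = {neighborhood_corr:.3f}")
print(f"share of sibling correlation = {pct_of_sibling_corr:.1f} percent")
```

The comparison is a ratio of two correlations, so a neighborhood correlation of a few hundredths can still amount to a visible share of a sibling correlation that is itself well below one.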

Parental Characteristics
Which parental characteristics are most responsible for generating sibling similarities in entrepreneurship? We study this question by including potentially important family-wide variables, either one at a time or simultaneously, in the X_if matrix of equation (5). Suppose, for example, that we include two measures of parental entrepreneurship in X_if. These two additional control variables (fixed effects) should reduce the residual variation in the outcome variable and produce a lower estimate of the between-family variation, σ²*_a, than the estimate produced without the added controls, σ²_a. Abstracting from measurement error, we can interpret the difference between these two estimates, σ²_a − σ²*_a, as an upper bound on the amount of the variance in the family component that can be explained by parental entrepreneurship. It is viewed as an upper bound since it includes other factors affecting children's entrepreneurship that are correlated with parental entrepreneurship (for instance, education, occupation, residence). 17 This exercise also produces a new sibling correlation, ρ*.
From what we know about the relationship between parents' and children's entrepreneurship (Lindquist et al., 2015), we expect this new sibling correlation to be significantly lower.
The degree to which any particular control variable lowers the sibling correlation after being included in the fixed part of the mixed-effects model provides a metric for judging its importance in explaining sibling similarities (Mazumder, 2008;Björklund et al., 2010), but does not allow for a causal interpretation. Specifically, we explore the potential roles played by: (i) parental education and income, (ii) parental self-employment and incorporation, (iii) parents' immigration status, and (iv) family structure.
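The accounting metric described above can be sketched in a few lines. The sketch assumes the usual definition of the sibling correlation as the family variance component divided by total variance; the function names and variance components are hypothetical and are not estimates from the paper.

```python
def sibling_correlation(var_family: float, var_individual: float) -> float:
    """Family variance component as a share of total variance."""
    total = var_family + var_individual
    if var_family < 0 or var_individual < 0 or total == 0:
        raise ValueError("variances must be non-negative and not both zero")
    return var_family / total


def explained_share(rho_baseline: float, rho_with_controls: float) -> float:
    """Percentage of the baseline sibling correlation accounted for by the
    added controls (an accounting measure, not a causal effect)."""
    if rho_baseline <= 0:
        raise ValueError("rho_baseline must be positive")
    return 100.0 * (rho_baseline - rho_with_controls) / rho_baseline


# Hypothetical variance components before and after adding parental controls.
rho = sibling_correlation(var_family=0.25, var_individual=0.75)       # baseline
rho_star = sibling_correlation(var_family=0.21, var_individual=0.75)  # with controls
pct_explained = explained_share(rho, rho_star)
print(f"rho = {rho:.3f}, rho* = {rho_star:.3f}, explained = {pct_explained:.1f} percent")
```

Because correlated omitted factors also load on the added controls, a share computed this way is best read as an upper bound, as the text notes.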
Previous research has suggested an important role of parental income and education (Lentz and Laband, 1990; Blanchflower and Oswald, 1998; Fairlie and Robb, 2007a); finding a large role for these variables would be consistent with the existence of capital constraints (Holtz-Eakin et al., 1994; Blanchflower and Oswald, 1998). Parental self-employment and incorporation are likely to influence the occupational preferences of individuals, not only through the acquisition of general or specific business human or social capital, but also through role-modeling (Dunn and Holtz-Eakin, 2000; Fairlie and Robb, 2007b; Sørensen, 2007; Colombier and Masclet, 2008; Parker, 2009; Hoffmann et al., 2015; Lindquist et al., 2015).
Ethnicity and parental immigration are also likely to play a role in entrepreneurship decisions, in terms of the location of new immigrants and their subsequent choice of business (Dunn and Holtz-Eakin, 2000; Edin et al., 2003; Andersson and Hammarstedt, 2010; Kerr and Mandorff, 2015). Finally, although family structure is potentially associated with personality developments affecting entrepreneurial decisions, it has been understudied as a determinant of entrepreneurship, mainly because of a lack of reliable data. Previous studies find only a limited association of family structure with entrepreneurship (De Wit and Van Winden, 1989; Dunn and Holtz-Eakin, 2000; Hout and Rosen, 2000; Hundley, 2006; Tervo, 2006). Controlling for these observables one by one and then jointly, we can assess both their relative and their total contribution to entrepreneurial variance.

Results
We proceed to address these specific family factors that contribute to sibling similarities by adding a set of covariates to the fixed part of the models, as explained above. These covariates include: mother's and father's education, income, immigration, self-employment and incorporation, family size, the mother's age at first birth and partner count, whether both biological and adoptive children were present in the household, and our dedicated measure of family structure. To simplify the exposition, we have performed a factor analysis, 18 generating six orthogonal factors that load onto (i) parental education and income, (ii) parental immigration, (iii) parental self-employment, (iv) parental incorporation, (v) a composite measure of family structure, based on family size, mother's age at first birth and the mother's partner count, and (vi) the presence of different types of genetic siblings and our objective measure of family structure (these latter two factors explain little of the sibling correlation, and will be added together to assess the importance of family structure). We then add these factors separately and jointly in the fixed part of the model to obtain the new sibling correlations, ρ * .
Since the factor analysis requires individuals to have information on all these variables, our sample size is slightly reduced (for extensive margin outcomes, for instance, it is reduced from 705,626 to 665,665 individuals). Therefore, we re-estimate the sibling correlations for this particular sample and report them in panel A of Table 7. While these sibling correlations are significantly different in a statistical sense from the baseline sibling correlations (given our large sample), they have the same order of magnitude as before.
18 The results of this analysis, i.e. factor loadings, are given in Table C.2 in Appendix C.
Panel B of Table 7 shows that parental education and income explain less than 5 percent of the sibling correlations in entrepreneurial outcomes, with the exception of Self-employment income, where the corresponding value is 11.38 percent. This means that while there is some degree of association between parents' incomes and the subsequent income an individual makes as an entrepreneur, the explanatory power is very low (the difference between the sibling correlation before and after controlling for education and incomes is 0.024, or 2.4 percentage points). While we do not possess wealth data, our results tentatively imply that capital constraints arguments building on parents' incomes as a determinant of entrepreneurial success lack strong empirical evidence (Holtz-Eakin et al., 1994; Blanchflower and Oswald, 1998; Hurst and Lusardi, 2004). 19 Having non-native parents does not have a large separate impact on entrepreneurship outcomes, usually less than 1 percent, at first glance in contrast to Andersson and Hammarstedt (2010). However, since our factors are orthogonal, we can expect higher rates of immigrant entrepreneurship to be captured by our entrepreneurship factor. The latter has a higher explanatory power, especially in Self-employed, Self-employed ≥ 4y, and Years self-employed, with the factor explaining as much as 15 percent of sibling similarities. Conversely, parental incorporation has a very small impact on self-employment outcomes, but is a strong predictor of incorporation outcomes, explaining as much as 16.4 percent of variation. 20 The slightly larger effects found for incorporation than for self-employment are consistent with the results in Lindquist et al. (2015). Turning to our (composite and direct) measures of family structure, we find their explanatory power to be extremely limited, up to 1 percent.
It therefore appears unlikely that family structure drives the sibling correlations we observe; economic, rather than purely sociological, family factors appear to matter for entrepreneurship outcomes. In that sense, our results echo those that Björklund et al. (2007) obtain for schooling and earnings.
Table 7, panel C shows the sibling correlations we obtain when we add the six factors pertaining to family characteristics jointly to the fixed part of the model. The explanatory power of family observables ranges between 9.91 percent for years incorporated and 20.52 percent for entrepreneurial income. 21 In terms of total variance of entrepreneurial outcomes, back-of-the-envelope calculations show that only between 2 and 8 percent of variation can be explained by observable family characteristics, a very limited role indeed.
19 To capture non-linearities in the contribution of parental wealth, we also experimented with a dummy for the family being in the top 5 percent of capital incomes (capital income is the difference between earnings and income). This results in a separate factor, which only explains about 2 percent of the sibling correlation in Ever incorporated and High income self-employed, and 4 percent in Self-employment income. The total explanatory power of the 7 factors is, however, unchanged from that reported in Table 7. In addition, using dummies for quintiles of parental socio-economic status (instead of the continuous factor) adds little explanatory power.
20 The same pattern is observed when the sample is split by sibship gender, see Table C.3 in Appendix C.
21 Controlling for all the interactions of the six factors only adds about 1.5 percent more explanatory power.
As a robustness check, we have also estimated the joint contribution of the separate variables to the sibling correlations (instead of the factors obtained through factor analysis), and results are slightly larger, but have the same order of magnitude. In this case, observable family characteristics explain at most 8 percent of the total variation in entrepreneurial outcomes.
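One plausible reading of the back-of-the-envelope calculation mentioned above is that the share of the sibling correlation explained by observables is scaled by the sibling correlation itself. The sketch below follows that reading; the function name is hypothetical, the 20.52 percent figure is taken from the text, and the sibling correlation of 0.30 is an assumed value used only for illustration.

```python
def share_of_total_variance(pct_of_sibling_corr: float, sibling_corr: float) -> float:
    """Convert a percentage of the sibling correlation into a percentage of
    total outcome variance (assumed reading of the calculation)."""
    if not 0.0 <= sibling_corr <= 1.0:
        raise ValueError("sibling_corr must lie in [0, 1]")
    return pct_of_sibling_corr * sibling_corr


# 20.52 percent of the sibling correlation (from the text) combined with an
# assumed sibling correlation of 0.30 (hypothetical, not taken from Table 7).
pct_total = share_of_total_variance(20.52, 0.30)
print(f"{pct_total:.2f} percent of total variance")
```

Because the sibling correlation is well below one, even a sizeable share of it translates into a single-digit share of total variance.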
To sum up: parental education and income, family structure, and immigrant status account for very little of the sibling correlations in entrepreneurship; parental self-employment, however, explains a large share of the sibling correlations in self-employment (but not incorporation); parental incorporation explains a large share of the sibling correlations in incorporation (but not self-employment).

Sibling Peer Effects
Sibling correlations also capture inter-sibling interactions; while these could be treated as a nuisance in estimating the impact of shared family background, we consider such sibling peer effects to be an integral part of shared environments. In the entrepreneurship literature, peer effects have been convincingly identified within the workplace (Nanda and Sørensen, 2010) and within universities (Lerner and Malmendier, 2013; Kacperczyk, 2013), based on (quasi-)random assignment of employees to workplaces or students to classes. In addition, within-family role-modeling has been proposed as a mechanism for intergenerational transmission of entrepreneurship (Lindquist et al., 2015; Hoffmann et al., 2015). Here, we assess the potential role of sibling peer effects in generating sibling correlations.
We first examine sibling correlations at different birth spacings based on month of birth data, from twins (zero spacing), through siblings born at least 12 months apart in rolling intervals of 12 months, and up to sibling spacings of 120 months. 22 There are two competing expectations about the relationship between spacings and sibling correlations (Eriksson et al., 2016). On the one hand, siblings born closer together interact more intensively, which should lead to higher sibling correlations at low birth spacings. Also, closely spaced siblings may share a more similar family environment while growing up. On the other hand, much older siblings can potentially act as stronger role models, and in phenomena like entrepreneurship, it may well be that it takes longer for the older sibling to establish him/herself as an entrepreneur.
Thus, sibling correlations may increase or decrease with sibling spacing.
22 We omit spacings between 1 and 11 months, and larger than 120 months as these are quite rare. Labels in Figure 1 imply 12-month rolling intervals, i.e. the label 12 months covers spacings between 12 and 24 months. In addition, we restrict the non-twins to full siblings in families with two children in our sample. Sibling correlations for this sample are the same as the baseline sibling correlations. Compare column (1) in Table 4 to column (3) in Appendix

Results for entrepreneurship and incorporation in Figure 1 suggest that while twin correlations are higher than non-twin correlations, the latter do not display an evident relationship with birth spacing (this pattern is common across outcomes, see Figure C.1 in Appendix C). This is quite interesting, given that in the bulk of the sibling correlation literature sibling spacing tends to matter quite a lot: the outcomes of closely spaced siblings are typically much more similar than those of widely spaced siblings (see, e.g., Eriksson et al., 2016). However, entrepreneurship may often materialize when individuals have long ago left the household. Many of the outcomes studied earlier, such as education and crime, are realized at an earlier stage.
This may explain the smaller role of sibling spacing in our case.
What we can take away from this exercise is the following: (i) time-varying, family-wide factors do not appear to be important, and (ii) close (day-to-day) interactions between siblings may not be important. These results also suggest that twins are more similar in entrepreneurial outcomes than non-twins, either because of genetic effects, more similar treatment by parents, or stronger inter-sibling interactions. We return more formally to genetic effects in Section 5.4, and turn to a second peer effects exercise next.
While we lack a formal randomization process, by exploiting differences in the timing of entrepreneurial entry for sibling pairs, we may gain information about spillovers from one sibling to the other. A useful method for exploring such peer effects has been proposed by Altonji et al. (2016), who apply it to the study of illegal substance abuse, and subsequently used by Eriksson et al. (2016) to look at criminal activity. The method relies on the relatively strong assumptions that only older siblings can influence younger siblings and that parental influences are not a mediating channel. While their method is intuitively applicable to situations where peer effects are likely to dominate other causes and where individuals are active when young, entrepreneurship represents an occupational choice with long term consequences, and it is not clear that older siblings necessarily engage in entrepreneurship earlier than younger siblings. 23 Since our exercise focuses on explaining the variance of entrepreneurial outcomes due to the influence of sibling peers rather than on identifying causal effects, we take an agnostic approach to applying the Altonji et al. (2016) model. We estimate both the effect of the older sibling on the younger one, and the effect of the younger sibling on the older one, subsequently converting the results into correlations to assess the contribution of peer effects to the sibling correlation (Bonett, 2007). A more detailed description of our empirical model is given in Appendix A.
Results
Table 8 summarizes the results of our sibling peer effects exercise on the subsample of sibling pairs, with panel A referring to self-employment and panel B to incorporation. Column (1) shows how much the impact of the older sibling's entrepreneurship status at time t−1 on the younger sibling's entrepreneurship status at time t contributes (at most) to the baseline sibling correlation, and column (2) does so while controlling for a contemporaneous effect. Columns (3) and (4) do the same for the impact of the younger sibling on the older sibling.
The lagged effect of the older sibling's self-employment on the younger sibling represents at most 6.20 to 6.31 percent of the baseline sibling correlation. Conversely, the effect of the younger sibling on the older one in self-employment appears largely negative; this implies that peer effects may actually generate sibling dissimilarity. When we disaggregate by type of sibling pair, we note that the direct peer effects for the subsample are driven by peer influences between brothers, which reflect the same pattern, with very similar magnitudes. With regard to other sibling types and incorporation, most estimated peer effects are not significant. 24 All in all, the timing of entrepreneurial entry and the subsequent peer influence are only significant (at 5 percent) for male pairs, and even then contribute less than 10 percent to our sibling correlations. Thus, peer effects are too small in magnitude to drive the sibling correlation, and at times may even create sibling dissimilarities.

Genes
As noted by Björklund et al. (2005) and Conley and Glauber (2005), sibling correlations also capture shared genetic endowment. Despite disagreement on the distribution of nature and nurture, the literature has established a definite role for genetic endowment in entrepreneurship (Nicolaou et al., 2008;Lindquist et al., 2015). In a study of Swedish adoptees, Lindquist et al. (2015) find that the intergenerational association in entrepreneurship is driven by nurture (roughly two thirds), rather than nature (roughly one third). By contrast, in their Swedish twins study, Zhang et al. (2009) find a strong genetic effect and no effect of shared environment for women, but a large shared environment influence for men, with a zero genetic effect.
One would be tempted to consider that sibling correlations place an upper bound on family influences, and thus implicitly a maximum maximorum upper bound on genetic influences.
As such, this upper bound (25 percent for self-employment) would be lower than previously estimated: for instance, Nicolaou et al. (2008) and Nicolaou and Shane (2010) suggest that around 40 percent of the total entrepreneurial variation (in the UK and the US) is due to genetic influences. However, the results from the twins literature only speak to sharing an entire genome -it is then unclear how to interpret the results at the level of the entire population.
Indeed, for two non-twin full siblings who share (on average) half their genes, it can be that they share all or none of the genes that influence entrepreneurship. It is thus difficult to compute the exact relationship between genetic effects and the sibling correlation itself. We can, however, place an upper bound on the contribution of shared genes to the sibling correlation.
To do this, we first need to calculate sibling correlations for mono- (MZ) and di-zygotic (DZ) twins. We can identify all twins in our data (via month of birth). But, unfortunately, we have no indicator of their zygosity (other than knowing that non-same-sex twins are di-zygotic).
In the absence of knowledge about which pairs of twins are mono- or di-zygotic, we impose a series of assumptions in order to identify correlations for these pairs of twins (i.e., we use the approach outlined in Nicoletti and Rabe, 2013). 25 In order to estimate the contribution of genes to sibling correlations, we assume that: (i) gender differences for DZ twins can be approximated by gender differences in closely spaced non-twin sibling pairs, (ii) the variance of the family component for all same sex twins is a weighted average of corresponding variances for MZ and DZ twin pairs, and (iii) boys and girls are conceived with almost equal probabilities.
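Assumptions (ii) and (iii) can be illustrated with a minimal sketch. It is not necessarily the exact procedure of Nicoletti and Rabe (2013): it assumes that, with boys and girls equally likely, the expected number of same-sex DZ pairs equals the number of opposite-sex pairs, and that the DZ family component has already been approximated as in assumption (i). All function names and numbers are hypothetical.

```python
def mz_share_among_same_sex(n_same_sex_pairs: int, n_opposite_sex_pairs: int) -> float:
    """Implied share of MZ pairs among same-sex twin pairs, assuming DZ pairs
    are same-sex and opposite-sex with equal probability."""
    if n_same_sex_pairs <= 0 or not 0 <= n_opposite_sex_pairs <= n_same_sex_pairs:
        raise ValueError("need 0 <= opposite-sex pairs <= same-sex pairs, same-sex > 0")
    return (n_same_sex_pairs - n_opposite_sex_pairs) / n_same_sex_pairs


def implied_mz_component(same_sex_component: float, dz_component: float,
                         mz_share: float) -> float:
    """Solve same_sex = w * MZ + (1 - w) * DZ for the MZ component."""
    if not 0.0 < mz_share <= 1.0:
        raise ValueError("mz_share must lie in (0, 1]")
    return (same_sex_component - (1.0 - mz_share) * dz_component) / mz_share


# Hypothetical counts: 700 same-sex and 300 opposite-sex twin pairs.
w = mz_share_among_same_sex(700, 300)
# Hypothetical family components: 0.40 for all same-sex twins, 0.25 for DZ twins.
mz_component = implied_mz_component(0.40, 0.25, w)
print(f"MZ share among same-sex twins = {w:.3f}, implied MZ component = {mz_component:.4f}")
```

The weighted-average assumption is what allows the unobserved MZ quantity to be backed out from two observable ones; it becomes fragile when the implied MZ share is small, since the division amplifies sampling noise.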
One important caveat we share with most other twin studies is that while, ideally, this decomposition would only capture genetic influences, in practice it may also partially account for sibling peer effects. If genetically more similar pairs of twins also interact more intensively (or are treated more similarly by parents), then inter-sibling peer effects are also captured by the decomposition. For this reason, Zhang et al. (2009) control for twins' interaction intensity in their estimation of genetic influences (in their data, interactions are 50 percent stronger for identical twins). Unfortunately, we do not have direct information on sibling interaction. However, we did not find any evidence for strong sibling peer effects in the previous section.

Results
The results in Table 9 show that, overall, genes can potentially explain between 26 percent and 63 percent of total variance in entrepreneurship. 26 While we can acknowledge an impact of genes on total entrepreneurial variation that is well in line with those found in the literature, most of our heritability estimates are insignificant (i.e. we cannot actually reject zero genetic effects), especially for stricter definitions of entrepreneurship. Additionally, there is no clear pattern for heritability estimates by gender. 27 For instance, female heritability in being Self-employed is 0.628 (with a p-value of 0.001), but only 0.264 (with a p-value of 0.444) for Self-employed ≥ 4y; while male heritability in being Self-employed ≥ 4y is 0.506 (with a p-value of 0.007), it is only 0.127 (with a p-value of 0.614) for being Incorporated ≥ 5y.
We can use the MZ-DZ correlations reported in Table 9, together with our previous results that both sibling spacing and sibling peer effects were unimportant, to calculate the maximum share of the sibling correlation that could potentially be due to shared genes. We calculate this share as 100 × (ρ_MZ − ρ_DZ)/ρ_DZ and report these percentages in Table 9 for men and women separately. Shared genes contribute at most 40 percent to the sibling correlation in Incorporated, while they can explain up to half of the sibling correlation in Self-employed for males and potentially all of the sibling correlation in Self-employed for females. At most 66 percent of the sibling correlation in Self-employed ≥ 4y for men and 47 percent for women can be explained by shared genes. Lastly, 12 percent and 81 percent of the sibling correlation in Incorporated ≥ 5y can be explained for men and women, respectively.
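The share formula above is simple enough to check by hand; a minimal sketch follows. The function name and the two input correlations are hypothetical and are not values from Table 9, and no cap at 100 percent is applied.

```python
def max_gene_share(rho_mz: float, rho_dz: float) -> float:
    """Maximum percentage of the sibling correlation attributable to shared
    genes, computed as 100 * (rho_MZ - rho_DZ) / rho_DZ."""
    if rho_dz <= 0:
        raise ValueError("rho_dz must be positive")
    return 100.0 * (rho_mz - rho_dz) / rho_dz


# Hypothetical twin correlations: 0.45 for MZ pairs and 0.30 for DZ pairs.
gene_share = max_gene_share(0.45, 0.30)
print(f"shared genes account for at most {gene_share:.0f} percent")
```

When the MZ correlation is at least twice the DZ correlation, the formula returns 100 percent or more, which corresponds to the case where shared genes could potentially explain all of the sibling correlation.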
We conclude, perhaps unsurprisingly, that shared genes likely play a large role in generating sibling similarities. But, of course, this also implies that non-shared genes and variations in gene patterns also generate sibling differences. Björklund and Jäntti (2012) argue that such genetic differences should be added on top of the sibling correlation when discussing the importance of family background for determining adult outcomes.
26 We use twins and non-twin pairs where the siblings are born 12-24 months apart. Results when using non-twins spaced 12 to 18 months, or 12 to 48 months are relatively similar, albeit on average slightly attenuated. One could consider our results as conservative, and the heritability estimates as upper bounds on the importance of genes, given violations of the equal environment assumption used to justify the equality of the environment components for MZ and DZ twins.
27 The higher heritability of self-employment compared to incorporation could perhaps be driven by an innate 'taste for entrepreneurship'. The higher heritability of females in self-employment can be taken to suggest a stronger reliance on genes in a less favorable environment, in line with Zhang et al. (2009), but not with Zunino (2016).

Conclusion
In this paper, we have quantified the importance of family background and neighborhood effects as determinants of entrepreneurship by estimating sibling correlations in entrepreneurial outcomes. We also explored the extent to which families make siblings different. We then presented a series of exercises designed to help us determine the extent to which these correlations could be explained by (i) neighborhood effects, (ii) parental characteristics, (iii) sibling peer effects, and/or (iv) shared genes. The empirical results are summarized in Table 10.
Sibling correlations tell us that 18 to 26 percent of the variance in self-employment is due to family background and neighborhood effects. These same factors explain 34 to 42 percent of the variance of our variables concerning owner/managers of incorporated businesses. For males this number is almost 50 percent. These sibling correlations are similar to those obtained for incomes and earnings in Sweden (Björklund and Jäntti, 1997, 2012; Björklund et al., 2009, 2010). They are somewhat smaller (on average) than those found for height, cognitive and non-cognitive ability, grades or years of schooling, but sibling correlations in incorporation do show values in the high range of previous estimates for other outcomes, particularly for men.
While sibling correlations focus attention on sibling similarities, we have also considered that families can act in ways that make siblings different, e.g. through differential treatment of first born children or sons versus daughters. In our sample of siblings born between 1960 and 1970, we do not find any strong evidence of differential treatment (for example, towards the first born sons of entrepreneurs). Adding this kind of differential treatment effect on top of our sibling correlations would only increase our measure of the importance of family background by about 4 percent. The main sibling differences that can be traced back to the family are likely those differences due to the role of non-shared family genes.
Given the large sibling correlations in entrepreneurship we have uncovered, the key question for understanding the origins of entrepreneurship focuses on explaining the determinants of these important sibling similarities. Neighborhood effects can account for (at most) 10 to 12 percent of the sibling correlations. Thus, we do find positive effects of entrepreneurial neighborhoods on entrepreneurship, either in learning or social prestige (Giannetti and Simonov, 2009;Guiso et al., 2015), but their total impact on entrepreneurial entry and attainment is likely to be rather small.
Parental self-employment and incorporation can account for (at most) 15 to 16 percent of the sibling correlations. Interestingly, parental self-employment is a prime explanatory force in individual self-employment, but not incorporation, and parental incorporation best explains individual incorporation, but not self-employment. This suggests that individual self-employment and incorporation are different aspects of entrepreneurship, in line with Levine and Rubinstein (2013), Henrekson and Sanandaji (2014), van Praag and Raknerud (2014), Åstebro and Tåg (2015), and Guzman and Stern (2015), and that there are different transmission mechanisms contingent on the type of parental entrepreneurial engagement. Parental education and incomes explain a further 5 percent of the sibling correlation. Family structure and immigrant status explain almost none of the similarities in sibling outcomes.
We also investigated the possibility that sibling peer effects might be driving the sibling correlation, and found only weak (non-causal) evidence that brothers influence each other's entrepreneurial choices. Sibling peer effects are generally limited in magnitude and rarely statistically significant. This result may explain why we find no distinctive pattern for the relationship between birth spacing and sibling correlations. Ultimately, it appears unlikely that sibling peer-effects drive sibling similarities.
Since siblings share not only the family environment, but also part of their genes, our fourth exercise was designed to measure the potential importance of such genetic effects. We cannot reject the existence of genetic effects; while heritability explains up to 60 percent of total variation for some of the outcomes, it is often not significant. At the extremes, heritability could potentially explain all or none of the sibling correlation. On average, however, it appears that between 56 and 78 percent of the sibling correlations in self-employment can (at most) be due to the genes that siblings share, while for incorporation these numbers are 38 to 46 percent. Loosely speaking, summing up the results of these various accounting exercises allows us to explain nearly all of the sibling correlations in entrepreneurship. Perhaps more importantly, we are able to compare the relative importance of different factors in explaining sibling similarities.
We do this within a single unified framework that allows us to put some perspective on the relative importance of different effects reported in the existing literature. We conclude that shared genes are likely the most important factor, followed by parental entrepreneurship, and then neighborhood effects. Parental income, education and immigrant status account for a surprisingly small share of the sibling correlations. The same holds true for family structure.
There may, of course, be factors other than those we address here that contribute to sibling similarities. These may include, but are not limited to, parents' managerial ability (Lucas Jr., 1978), risk and time preferences (Dohmen et al., 2012; Björklund et al., 2010), a wider set of family values (Albanese et al., 2016), and even latent health (Ahlburg, 1998). Capturing such variation would be an interesting avenue for future research, although parts of these effects are arguably captured through the various observable parental characteristics we account for (e.g., parental risk preferences may determine parental entrepreneurship) and may have a genetic component as well. In addition, a future reconciliation of heritability and sibling correlations could shed more light on the importance of genes in generating sibling similarity. To do this, researchers need to identify a set of specific entrepreneurial genes and gene patterns and then study the extent to which these specific genes and gene patterns are actually shared by siblings.
There are, of course, some limitations associated with the exercises presented in this paper. First and foremost, when 'explaining' the determinants of sibling similarities, we cannot claim that we have presented a set of precise causal estimates. Instead, we view our results as part of an exploratory accounting exercise that can point us towards those factors which can potentially explain the largest share of sibling similarities. Second, since we measure the degree to which siblings are similar, we cannot exclude the possibility that single-child families operate in a different manner and that lone children are influenced in different ways by family and community-wide factors. Third, our results pertain to a highly developed economy, with specific cultural and economic traits, and notably egalitarian policies. Our results are likely to hold in the other Nordic countries, since we observe similar sibling correlations in other outcomes such as income and education across these countries (Solon, 1999; Björklund and Jäntti, 2011; Black and Devereux, 2011), but they may not apply to Southern Europe, the US, or developing countries. Last but not least, while it would be desirable to base policy on our results (for instance, investments in entrepreneurship for one generation would spill over to other generations, thereby generating a multiplier), it is also likely that the sibling correlations we obtain are themselves (in part) the product of a long history of various policies (tax, education, business, etc.). It would be interesting to track changes both over time (Björklund et al., 2009) and across countries (Schnitzlein, 2014) in sibling correlations in entrepreneurship.
Among other things, this would help us to decide whether the sibling correlations that we have documented are 'high' or 'low', and whether they hold across time and space.
We tend to view our findings optimistically. We do not believe that the existence of substantial, pre-determined family-wide factors means that policy is doomed to fail. A large share of the variation in entrepreneurship is, in fact, individual-specific and not solely determined by genes.
Furthermore, children appear to be able to 'learn' about entrepreneurship through their family and community environment, which implies that it may be possible to 'teach' entrepreneurship to young people. Policy may even generate a social multiplier effect if the behavior of a successfully treated person also affects the behavior of her untreated family members.
All children of the same mother are defined as belonging to the same family.
Standard errors in parentheses. All differences (with one exception for employees and the self-employed with regard to maternal income) are significant at less than 1 percent. Please note that some individuals have been both self-employed and incorporated at different points in time. They are omitted from this analysis, but the results in columns (4) and (5) are very similar if they are counted both as Self-employed and as Incorporated.
Standard errors in parentheses.
a The second stage explanatory variables are: a dummy for being first born, the two-way interactions of i) first born and parental entrepreneurship, ii) first born with gender, iii) gender and parental entrepreneurship, and the three-way interaction of first born, gender, and parental entrepreneurship.
b The confidence intervals for the baseline and 'revised' sibling correlations are largely overlapping.
c We obtain very similar results if we restrict the sample to i) complete families (to ensure the first born does not lie outside our observation window) and ii) complete families with at least two children (to ensure results are not driven by lack of variation in the first born variable).
(6) and (7) and 55,828 individuals in 50,804 families in column (8).
(1) and (2), and the younger sibling, columns (3) and (4), once controls are added and correlated random effects are accounted for. For the full set of results see Table C.4 and appendix Tables C.5-C.8 (the results in this table are based on columns (3), (4), (8) and (9) in those tables).
Sibling peer effects a -7 -7% -3 -3%
Shared genes b 56-78% 38-46%
a Only significant for self-employment and male pairs, see Table 8.
b For the significance of heritability estimates, see Table 9. We take the average of the separate values for men and women.
as a contemporaneous effect, but rather as a transitory and common shock to both siblings in the same family. Hence, we do not sum the lagged and contemporaneous sibling effect when analyzing the contribution of peers to the sibling correlation (in contrast to Eriksson et al. (2016), for instance).
(3), (4), (8) and (9) of Tables C.4-C.8, while columns (5) and (10)
(Table C.4, column (3)); this translates into a sibling correlation ρ = 0.014 as given by the lagged sibling effect, representing 6.20 percent of the baseline sibling correlation.

B Appendix: Genetic Decomposition
In the absence of knowledge about which pairs of twins are monozygotic (MZ) or dizygotic (DZ), we impose a series of assumptions in order to identify correlations for these pairs of twins, following Nicoletti and Rabe (2013). The most important source of information comes from directly observable mixed-sex twins, who must be DZ twins. Thus, we can directly estimate $\sigma^2_{DZ,FM}$, that is, the variance of the family component for DZ twins of mixed sexes (with subscripts $F$ for female and $M$ for male). In order to calculate the corresponding variances for same-sex DZ twins, $\sigma^2_{DZ,MM}$ and $\sigma^2_{DZ,FF}$, we make the following assumptions: where NT denotes non-twins. Intuitively, we assume we can approximate gender differences in the variance of the family component for DZ twins reasonably well by gender differences in non-twin sibling pairs, using closely spaced non-twins (born between 12 and 24 months apart). [30]
[30] We also experiment with different non-twin pair spacings as robustness checks (12 to 18 months, 12 to 48 months) and obtain very similar results. Figure C.1 in Appendix C also shows that only small differences are to be expected. The figures also reveal that twin correlations are not higher just due to different sibling interaction patterns, as these seem to be relatively constant across the spacing distribution. The correlations at different spacings for the intensive margin produce noisier estimates, given their much smaller sample size.
In order to identify the corresponding expressions for MZ twins, we make use of the weak assumption that the variance of the family component for all same-sex twins is a weighted average of the corresponding variances for MZ and DZ twin pairs, with weights provided by their incidence in the population of same-sex twins. [31] Denoting these proportions with $P$ (and noting that $P_{MZ,MM} + P_{DZ,MM} = P_{MZ,FF} + P_{DZ,FF} = 1$), we use: Now we need to approximate these proportions in order to solve the MZ-variances from equations
In order to measure the relative contribution of genes to the sibling correlation, we compute an indicator of heritability by exploiting differences in shared genetic endowment between MZ and DZ twins (Guo and Wang, 2002; Björklund et al., 2005; Rabe-Hesketh et al., 2008). As sibling correlations represent the total contribution of shared factors to variation in the outcomes, we express them as a linearly additive function of common genes and common environment: $a_f = a_{genes} + a_{env}$. These correspond closely to the A and C factors included in the structural equation model approach used by Nicolaou et al. (2008).
Since MZ twins share 100 percent of their genes, we decompose the sibling correlation into genetic variation (heritability) and environmental variation: $h^2 + c^2_T = \rho_{MZ}$, where the subscript $T$ indicates twins' common environment. Analogously, for DZ twins, who share 50 percent of their genes, we obtain $0.5h^2 + c^2_T = \rho_{DZ}$, with the crucial assumption that the proportion of variance owed to shared environmental influences is the same for MZ and DZ twins. We then back out measures of heritability
[31] An implicit assumption is that the means of the family components for the two types of twins are identical.
[32] We do not analyze the intensive margin outcomes since sample sizes are extremely small.
Standard errors in parentheses. Column (1) eliminates singletons, column (2) redefines the family on the basis of the father, column (3) restricts the sample to two-child families, and column (4) restricts it to closely spaced siblings (12-24 months) in two-child families. Column (5) only uses families we entirely capture in our sample. Column (6) shows an example of results from an exercise that defines random clusters that match the mother-defined families in number and size; as expected, none of the sibling correlations in this column are significantly different from zero, with the exception of the one for High income entrepreneur, which is still an order of magnitude below the original sibling correlation (and is an artifact of the particular randomization seed used in this example). An additional robustness check where we consider only entrepreneurship events between the ages 25 and 40 also produces results very similar to the baseline sibling correlations: 0.222 (0.005) for Self-employed and 0.367 (0.007) for Incorporated.
(2) Main factor loadings appear in bold.
Standard errors in parentheses. Numbers in bold refer to the percentage decrease in the sibling correlation.
a The fixed part of these models controls for gender.
b The sample is restricted to siblings for whom both parents' entrepreneurial status is available.
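As an illustration of the two steps described in Appendix B, the sketch below first solves the weighted-average assumption for the MZ variance and then solves the two twin equations, $h^2 + c^2_T = \rho_{MZ}$ and $0.5h^2 + c^2_T = \rho_{DZ}$, for heritability and shared environment. It is a minimal Python sketch and not the estimation code behind the paper: the function names, the argument `var_same_sex` (standing for the family-component variance of all same-sex twins of a given gender), and all input numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwinDecomposition:
    heritability: float        # h^2
    shared_environment: float  # c^2_T


def mz_variance(var_same_sex: float, var_dz: float, share_mz: float) -> float:
    """Back out the MZ variance from the weighted-average assumption.

    Solves var_same_sex = share_mz * var_mz + (1 - share_mz) * var_dz
    for var_mz, where share_mz is the proportion of MZ pairs among
    same-sex twins (hypothetical argument names).
    """
    if not 0.0 < share_mz <= 1.0:
        raise ValueError("share_mz must lie in (0, 1]")
    return (var_same_sex - (1.0 - share_mz) * var_dz) / share_mz


def decompose(rho_mz: float, rho_dz: float) -> TwinDecomposition:
    """Solve h2 + c2 = rho_mz and 0.5 * h2 + c2 = rho_dz for (h2, c2)."""
    # Subtracting the second equation from the first gives 0.5 * h2.
    h2 = 2.0 * (rho_mz - rho_dz)
    # Substituting h2 back into the first equation gives c2.
    c2 = 2.0 * rho_dz - rho_mz
    return TwinDecomposition(heritability=h2, shared_environment=c2)


if __name__ == "__main__":
    # Illustrative inputs only; these are not estimates from the paper.
    print(mz_variance(var_same_sex=0.30, var_dz=0.20, share_mz=0.5))
    print(decompose(rho_mz=0.40, rho_dz=0.25))
```

Because $h^2$ is computed as twice the difference between two estimated correlations, any sampling noise in that difference is scaled by a factor of two, which is one reason heritability estimates of this kind can be large in magnitude yet imprecise.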