Optimal Government Policies in Models with Heterogeneous Agents

In this paper we develop a new methodology for finding optimal government policies in economies with heterogeneous agents. The methodology is solely based on three classes of equilibrium conditions from the government's and individual agent's optimization problems: 1) the first order conditions; 2) the stationarity condition on the distribution function; and, 3) the aggregate market clearing conditions. These conditions form a system of functional equations which we solve numerically. The solution takes into account simultaneously the effect of government policy on individual allocations and (from the government's point of view) optimal distribution of agents in the steady state. This general methodology is applicable to a wide range of optimal government policies in models with heterogeneous agents. We illustrate it on a steady state Ramsey problem with heterogeneous agents, finding the optimal tax schedule.


Introduction
This paper provides a new methodology for computing equilibria which allows the stationary distribution of wealth and income to be a part, perhaps the most important part, of a government optimization problem. This general solution method is applicable to a wide range of optimal government policies in models with heterogeneous agents. We formulate the optimal government policy problem as an "operator" problem subject to a system of constraints: 1) the first order conditions from the individual agent's problem; 2) the stationarity condition on the distribution function; and, 3) the aggregate market clearing conditions. The first order conditions of the government operator problem form a system of functional equations in individual agents' and the government's policies and in the distribution function over agents' individual state variables. We solve this system numerically by the projection method.
It should be emphasized that our approach does not use any additional restrictions on or assumptions of the equilibrium allocations but is strictly derived from the first order and envelope conditions and from the stationarity of the endogenous distribution in the steady state. Our main contribution is in the formulation of the functional equations for the government problem and for the stationary distribution over individual state variables. In this way, we are able to solve simultaneously for the optimal government policy, for the optimal individual allocations, and (from the government's point of view) for the optimal distribution of agents in the steady state. To our knowledge, this paper is the first one that provides a solution method for this kind of problem.
We illustrate this general methodology on a steady state Ramsey problem with heterogeneous agents. We recast the original Ramsey (1927) and Lucas (1990) normative question for an economy with heterogeneous agents: What choice of a tax schedule will lead to maximal social welfare in the steady state, consistent with given government consumption and with market determination of quantities and prices? What is the welfare differential relative to the social welfare resulting from the existing progressive tax schedule in the U.S. economy and from the usual flat-tax reform?
The tradeoff between efficiency and income distribution plays a central role in analyzing tax policy. In dynamic, general equilibrium models with household heterogeneity from uninsurable idiosyncratic risk, the optimal tax schedule balances the gains from offering insurance against the efficiency losses from distorting taxes. In our example, we find a welfare maximizing tax schedule on total income from labor and capital which takes into account simultaneously its effects on agents' allocations and on the stationary distribution of agents in a steady state. Previous models analyzing the effects of government policies in this class of models were limited to a sub-optimal policy reform exogenously imposed on the model. Within the context of optimal taxation, several papers have analyzed the steady state implications (and transition paths) resulting from an ad hoc flat-tax reform or from an ad hoc removal of double taxation of capital income.
In this paper, we solve for the optimal steady state tax schedule chosen by a government maximizing the average welfare. In order to evaluate the benefits of the optimal tax schedule, we compare the steady state aggregate levels, welfare, efficiency and distribution of resources associated with the optimal tax schedule to a simulated steady state of the U.S. economy with the existing progressive tax schedule and to a steady state resulting from a standard flat-tax reform.
The optimal tax schedule we find is a function that is neither progressive nor monotone. It is a positive, U-shaped function, taxing the lowest income at 45%, decreasing to a minimum of 19% and rising to 62% at the highest level of total income. It provides incentives for agents to accumulate a high level of individual assets while preserving the equality measures in the economy. Its impact on aggregate levels and welfare is large. Compared to the progressive tax schedule steady state, average welfare increases by 4.4%, capital stock by 49%, output by 15.8%, and consumption by 5.8%. Relative to the flat-tax steady state, welfare goes up by 0.8%, capital stock by 15%, output by 4.5%, and consumption by 1.1%. The marginal tax rate is also a U-shaped function, but almost flat at low incomes, reaching negative levels around the average income, and then rising to positive levels.
The efficiency and distributional effects of the optimal tax are the main mechanisms behind these large changes. Related to the former are the general equilibrium effects: a higher stock of capital increases productivity of labor and, therefore, the income of poor agents. For the latter, the optimal tax schedule concentrates agents around the mean at high levels of wealth, something that a social planner with access to lump-sum transfers would do. The high tax rate at low income levels provides incentives for these agents to save more. The even higher tax rate on high income discourages further savings by the wealthiest agents. In the middle of the total income levels, the tax rate is lower than the one found for the flat-tax reform. In this way, the optimal tax schedule solves the tradeoff between efficiency and equality by altering individual real incomes. For comparison, the flat tax also increases aggregate levels but does not take into account the distribution of agents. On the other hand, the progressive tax schedule provides too much short-run insurance at the cost of long-run average levels.
Finally, in order to evaluate the short run costs of the optimal tax reform, we compute transitions from the progressive and the flat-rate tax schedule steady states to the steady state of the optimal tax schedule. One of our main results is that a majority of the population, 73%, would benefit from a reform that replaces the progressive tax schedule with the optimal tax schedule. On the other hand, only one third of agents would support the reform in the flat-rate steady state. These results, as well as a detailed efficiency and distributional analysis, are described in the following sections.
We limit our example to the optimal tax schedule on total income from labor and capital that is needed to raise a given fraction of GDP. There are several reasons why we choose this setup. First, the tax on total income enables us to study a distortionary tax system with a non-degenerate distribution of agents in a steady state. If the government had access to lump-sum, first best taxation, the model would collapse to a representative agent one. Second, to a large extent the current U.S. tax code does not distinguish between sources of taxable income. The last reason for a simple tax on the total income is the complexity of the problem we solve.
By focusing on a steady state analysis and by imposing a single tax rate on labor and capital income, we are for now avoiding two important issues related to optimal taxation: the issue of time-consistency and the issue of the optimal capital income tax rate in the steady state. With respect to the former, our government is fully and credibly committed, and the tax schedule is constant over time.$^{1}$ The latter issue is to some extent mitigated by the findings of Aiyagari (1995), who showed that for our class of models with incomplete insurance markets and borrowing constraints, the optimal tax rate on capital income is positive even in the long run.$^{2}$
Our approach is different from the current contributions to the optimal taxation literature. Based on the original work of Mirrlees (1971) and Mirrlees (1976), Kocherlakota (2003), Golosov, Kocherlakota, and Tsyvinski (2003) and ?) study optimal social planner policies with asymmetric information. In such an environment, positive capital income taxes are optimal despite the associated efficiency loss, and informational frictions are necessary for a characterization of the optimal policies. Compared to these papers, we solve for the optimal tax schedule within the standard neoclassical, general equilibrium, full information and full commitment economy with heterogeneous agents and incomplete markets. Moreover, we characterize the set of admissible tax functions that satisfy the definition of stationary recursive competitive equilibria and are consistent with each agent's maximization problem.
The closest paper to ours is Conesa and Krueger (2004), who compute the optimal progressivity of the income tax code in an overlapping generations economy. They search in a class of monotone tax functions to find a welfare maximizing tax schedule. We show in this paper that, first, limiting the analysis to monotone functions seems rather restrictive with respect to welfare maximization. Second, our computational method based on the government optimality conditions improves the welfare criterion by several percentage points compared to the one found by simple search methods.
1 For the time-consistency problem, see Kydland and Prescott (1977) and a recent contribution by Klein and Rios-Rull (2004).
2 The optimal capital tax in an economy with two types of agents is discussed in Chamley (1986) and Judd (1985).
We are aware of the other issues that are important for the analysis of optimal taxation. We do not consider technology or population growth; we do not model a life-cycle earnings process like Ventura (1999). The tax revenues are not given back to the agents; we abstract from public goods. We study the simplest utility maximization problem on the consumption-investment margin. However, we would like to stress here that our methodology can be applied to any optimal government policy problem, including that with endogenous labor supply or separate taxation of labor and capital incomes.
Finally, in future research we also plan to analyze a much more difficult problem, that of a stationary competitive equilibrium which is the limit of the optimal dynamic tax schedule (again, simple search methods cannot be used for this task).
The paper is organized as follows. The following section describes the economy with heterogeneous agents, defines the stationary recursive competitive equilibrium and the Ramsey problem. Section 3 formulates the equilibrium as a system of functional equations and defines the operator Ramsey problem. Section 4 characterizes the optimal tax schedule by the first order necessary conditions for the operator Ramsey problem.
The computational projection method is described in Section 5. Section 6 presents the results and Section 7 concludes. The Appendix contains the proof of the main proposition.

The Steady State Ramsey Problem
This Section describes the economy, defines the stationary recursive competitive equilibrium and formulates the Steady State Ramsey Problem. The economy is populated by a continuum of infinitely lived agents on a unit interval. Each agent has preferences over consumption given by the utility function
\[ E_0 \sum_{t=0}^{\infty} \beta^{t} u(c_t), \]
where $\beta \in (0, 1)$ and $u : \mathbb{R}_+ \to \mathbb{R}$ is a twice continuously differentiable, strictly increasing, and strictly concave function. We assume that the utility function satisfies the Inada conditions.
In each period, each agent receives an idiosyncratic labor productivity shock which takes on values in a finite set of real numbers, $z \in Z = \{z_1, z_2, \ldots, z_J\}$. The shock is measured in efficiency units and follows a first-order Markov chain with a transition function $Q(z, z') = \mathrm{Prob}(z_{t+1} = z' \mid z_t = z)$. We assume that Q is monotone, satisfies the Feller property and the mixing condition defined in Stokey, Lucas, and Prescott (1989).
As the labor productivity shock is independent across agents, there is no uncertainty at the aggregate level.
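Because the shocks are independent across agents, the cross-sectional distribution of shocks settles at the stationary distribution of Q, and aggregate effective labor is constant. The sketch below illustrates this with a hypothetical two-state chain; the transition matrix, the shock values, and the function name `stationary_distribution` are illustrative assumptions, not taken from the paper.

```python
from typing import List


def stationary_distribution(Q: List[List[float]],
                            tol: float = 1e-12,
                            max_iter: int = 10_000) -> List[float]:
    """Iterate pi <- pi * Q from the uniform distribution until it stops changing."""
    J = len(Q)
    pi = [1.0 / J] * J
    for _ in range(max_iter):
        new = [sum(pi[i] * Q[i][j] for i in range(J)) for j in range(J)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("power iteration did not converge")


# Hypothetical two-state example (values are assumptions for illustration only).
Q_example = [[0.9, 0.1],
             [0.2, 0.8]]
Z_example = [0.5, 1.5]

pi_example = stationary_distribution(Q_example)
# Aggregate effective labor: average shock under the stationary distribution.
labor = sum(p * z for p, z in zip(pi_example, Z_example))
```

With the stationary shares in hand, aggregate effective labor is a fixed number, which is why only the distribution over capital has to be solved for jointly with the agents' decisions.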
All agents are initially endowed with a nonnegative stock of capital. In each period, each agent supplies his realized labor endowment and accumulated capital stock k to competitive firms operating constant returns to scale production technology. We restrict the accumulated capital to be nonnegative, $k \in B = [k_L, \infty)$, where $k_L = 0$. The capital stock depreciates at the rate $\delta \in (0, 1)$.
Finally, there is a government that finances its expenditures by taxing the agents in the economy. We assume that the government expenditures are a fixed percentage of total output, that they are not returned to the agents, and that the government cannot use the first best, lump-sum taxation. We assume that the government has access only to proportional taxation of total income from labor and capital.$^{3}$
The ultimate goal is to analyze the economy as a steady-state Ramsey problem with heterogeneous agents. In particular, we seek an optimal, time invariant tax schedule in a stationary recursive competitive equilibrium that will lead to maximal average utility in the steady state, consistent with given government consumption and with market determination of quantities and prices. Throughout the whole paper, the value function, the policy functions, the distribution function, aggregate levels and prices all depend on the tax schedule.
In the competitive equilibrium, given any time-invariant tax schedule τ, agents accumulate aggregate capital stock K used by a representative firm together with inelastically supplied aggregate effective labor L in the production technology $F(K, L) = A K^{\alpha} L^{1-\alpha}$, with technology parameters $A > 0$ and $\alpha \in (0, 1)$. Profit maximization implies the following factor prices:
\[ r = \alpha A \left(\frac{K}{L}\right)^{\alpha - 1} - \delta, \qquad w = (1 - \alpha) A \left(\frac{K}{L}\right)^{\alpha}. \tag{1} \]
In each period, an agent inelastically supplies his labor endowment at wage w, rents capital stock at interest rate r, and maximizes his or her utility by choosing consumption and a level of capital stock for the next period. We preserve heterogeneity in the economy by closing insurance markets so that the agents can only imperfectly insure against idiosyncratic labor productivity shocks by precautionary savings.
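As a small illustration of the profit-maximization step, the sketch below computes factor prices for the Cobb-Douglas technology. It assumes that r is measured net of depreciation (an assumption inferred from the household budget constraint, in which capital is carried over undepreciated); the parameter values and the function name `factor_prices` are hypothetical.

```python
from typing import Tuple


def factor_prices(K: float, L: float, A: float = 1.0,
                  alpha: float = 0.36, delta: float = 0.08) -> Tuple[float, float]:
    """Competitive factor prices for F(K, L) = A * K**alpha * L**(1 - alpha).

    Assumption: r is the marginal product of capital net of depreciation.
    """
    ratio = K / L
    r = alpha * A * ratio ** (alpha - 1.0) - delta
    w = (1.0 - alpha) * A * ratio ** alpha
    return r, w


# Hypothetical aggregates, for illustration only.
K_example, L_example = 4.0, 1.0
r_example, w_example = factor_prices(K_example, L_example)
output_example = 1.0 * K_example ** 0.36 * L_example ** 0.64
```

Under constant returns to scale, gross factor payments exhaust output, so `(r_example + 0.08) * K_example + w_example * L_example` should equal `output_example` up to rounding.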
Given competitive factor prices (r, w), the government finances its expenditures by a proportional tax on each agent's total income y from labor and capital, y(k, z) = rk + wz.
Then, the tax schedule on total income is the function $\tau : \mathbb{R}_+ \to \mathbb{R}$, so that an agent with total income y faces tax rate $\tau(y)$ and receives after-tax income $(1 - \tau(y))\,y$.
We will model the economy in a stationary recursive competitive equilibrium. For time-invariant tax schedule τ, such an equilibrium exhibits constant factor prices, constant levels of aggregate variables, and a stationary distribution of agents over their individual states. An agent's individual state is the pair $(k, z) \in B \times Z$ denoting his or her accumulated stock of capital and the realized labor productivity shock, respectively.
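Taking the factor prices (r, w) and the tax schedule τ as given, an agent (k, z) solves the following dynamic programming problem
\[ v(k, z) = \max_{c,\, k'} \Big\{ u(c) + \beta \sum_{z' \in Z} Q(z, z')\, v(k', z') \Big\} \tag{2} \]
subject to the budget constraint and the no-borrowing constraint
\[ c + k' = (1 - \tau(y))\, y + k, \qquad k' \geq k_L, \]
where taxable income is defined as $y = rk + wz$.
The policy function for next-period capital $k'(k, z)$ and the Markov process for the productivity shock generate a law of motion
\[ \lambda'(B', z') = \sum_{z \in Z} Q(z, z') \int_{\{k \,:\, k'(k, z) \in B'\}} d\lambda(k, z) \tag{5} \]
for all $(B', z')$ with $B' \subseteq B$ and $z' \in Z$. According to this law of motion, the fraction of agents that will begin the next period with capital stock in set $B'$ and productivity shock $z'$ is given by all those agents that transit from their current shock z to shock $z'$ and whose optimal decision for capital accumulation belongs to $B'$.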
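To make the household problem concrete, the following sketch solves it by value function iteration on a discrete capital grid. This is a swapped-in textbook technique used here only for illustration, not the projection method of the paper; log utility, the flat 20% tax schedule, the prices, the shock process, and the grid are all hypothetical assumptions.

```python
import math
from typing import Callable, List, Tuple


def solve_household(tau: Callable[[float], float], r: float, w: float,
                    Z: List[float], Q: List[List[float]], k_grid: List[float],
                    beta: float = 0.9, tol: float = 1e-6
                    ) -> Tuple[List[List[float]], List[List[int]]]:
    """Value function iteration with log utility on an increasing capital grid.

    k_grid[0] plays the role of the borrowing limit k_L. Returns the value
    function v[i][j] and the index of next-period capital policy[i][j].
    """
    nk, nz = len(k_grid), len(Z)
    v = [[0.0] * nz for _ in range(nk)]
    while True:
        # Expected continuation value E[v(k', z') | z] for every (k', z).
        ev = [[sum(Q[j][m] * v[n][m] for m in range(nz)) for j in range(nz)]
              for n in range(nk)]
        v_new = [[0.0] * nz for _ in range(nk)]
        policy = [[0] * nz for _ in range(nk)]
        for i, k in enumerate(k_grid):
            for j, z in enumerate(Z):
                y = r * k + w * z                    # taxable income
                resources = (1.0 - tau(y)) * y + k   # after-tax gross income
                best = -math.inf
                for n, k_next in enumerate(k_grid):
                    c = resources - k_next
                    if c <= 0.0:                     # grid is increasing: stop here
                        break
                    value = math.log(c) + beta * ev[n][j]
                    if value > best:
                        best, policy[i][j] = value, n
                v_new[i][j] = best
        gap = max(abs(v_new[i][j] - v[i][j]) for i in range(nk) for j in range(nz))
        v = v_new
        if gap < tol:
            return v, policy


# Hypothetical inputs, for illustration only.
k_grid = [0.25 * i for i in range(41)]          # capital from 0 to 10
Z = [0.5, 1.5]
Q = [[0.9, 0.1], [0.2, 0.8]]
v, policy = solve_household(lambda y: 0.2, r=0.04, w=1.0, Z=Z, Q=Q, k_grid=k_grid)
```

The returned `policy` holds grid indices of next-period capital; entries equal to 0 correspond to agents at the borrowing limit in this discretized example.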
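The government budget constraint is, for government expenditures G expressed as the fixed fraction g of the total output Y,
\[ G = gY = \sum_{z \in Z} \int \tau(y(k, z))\, y(k, z)\, d\lambda(k, z), \]
and the allocations are feasible,
\[ \sum_{z \in Z} \int \big[ c(k, z) + k'(k, z) \big]\, d\lambda(k, z) + G = F(K, L) + (1 - \delta) K. \]
Note that the aggregate feasibility constraint is implied from the other market clearing conditions by Walras' law. Since labor supply is inelastic, the labor market clears by construction.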

The Existence of Stationary Recursive Competitive Equilibrium
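Given tax schedule τ and prices, each agent's optimal policy function $k'(k, z)$ for all $(k, z) \in B \times Z$ can be derived from the first order condition
\[ u'(c) \geq \beta \sum_{z' \in Z} Q(z, z')\, u'(c') \Big[ 1 + \Big( 1 - \tau(y') - \frac{\partial \tau(y')}{\partial y'}\, y' \Big) r \Big], \quad \text{with equality if } k' > k_L, \]
where $c' = (1 - \tau(y'))\, y' + k' - k''$ and $y' = rk' + wz'$. Because we study a stationary recursive competitive equilibrium, the two-period-ahead saving function is $k'' = k'(k'(k, z), z')$.
Note that the agent also takes into account the effect of his current savings decisions on tomorrow's marginal tax, $\tau(y')$.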
Because the tax schedule is an arbitrary function, we must ensure that the first order approach is valid. In order to characterize the admissible tax functions and to prove the Schauder Theorem for economies with distortions, we follow the notation in Stokey, Lucas, and Prescott (1989), Chapter 18. For each agent (k, z) ∈ B × Z with taxable income y(k, z) = rk + wz, denote the after-tax gross income as ψ(k, z) ≡ (1 − τ (y(k, z))) y(k, z) + k.
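Using ψ(k, z), rewrite the Euler equation$^{4}$ as
\[ u'(\psi(k, z) - k') \geq \beta \sum_{z' \in Z} Q(z, z')\, u'(\psi(k', z') - k'')\, \psi_1(k', z'), \]
where
\[ \psi_1(k', z') = 1 + \Big( 1 - \tau(y') - \frac{\partial \tau(y')}{\partial y'}\, y' \Big) r \]
is the marginal after-tax return on an extra unit of investment. In the following theorem we establish the validity of the first order approach and the existence of the competitive equilibrium.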
Theorem 1 If for each (k, z) ∈ B × Z, given a tax schedule τ : R + → R, then the solution to each agent's maximization problem and the stationary recursive competitive equilibrium exist.
The proof of Theorem 1 is in the Appendix.
The following corollary characterizes the set of admissible tax schedules that satisfy the conditions of Theorem 1. For this purpose, define $\bar{k}$ as the maximal sustainable capital for any agent (for a detailed definition see the proof of Theorem 1). Let w and r denote some exogenously imposed, non-binding lower and upper bounds for the equilibrium wage and interest rate, respectively, and let $\underline{z}$ and $\bar{z}$ stand for the lowest and highest productivity shocks. Finally, $\varepsilon^{\tau}_{y}(y) \equiv \frac{\partial \tau(y)}{\partial y} \frac{y}{\tau(y)}$ is the elasticity of the tax rate with respect to the taxable income.
Corollary 1 (Admissible Tax Schedule Functions) Let C 2 (R + ) be a set of con- belongs to the set of admissible tax schedules Υ, for all y ∈ [wz, rk + wz], then it satisfies the conditions of Theorem 1.
The above statement follows directly from the fact that $\psi_1(k, z) > 0$ and that ψ is quasiconcave. The corollary implies that there exists an upper bound on the marginal tax rate, $\tau(y)\,(1 + \varepsilon^{\tau}_{y})$. This upper bound is not likely to bind for a very wide range of tax schedules.$^{5}$ We want to stress here that while numerically solving for the optimal tax schedule we do not impose any of these exogenous bounds but we check the admissibility of the optimal tax schedule ex post.
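An ex post check of this kind can be sketched as follows. Differentiating the after-tax gross income ψ(k, z) = (1 − τ(y)) y + k with respect to k, with y = rk + wz, gives 1 + r(1 − τ(y) − τ_y(y) y), and the code verifies that this is positive on a grid of incomes. It covers only the monotonicity part of the conditions (quasiconcavity is not tested), and the U-shaped schedule, the income grid, and the interest rate are hypothetical values, not the optimal schedule reported in the paper.

```python
from typing import Callable, Iterable


def marginal_after_tax_return(tau: Callable[[float], float], y: float,
                              r: float, h: float = 1e-6) -> float:
    """1 + r * (1 - tau(y) - tau_y(y) * y), with tau_y by central difference."""
    dtau = (tau(y + h) - tau(y - h)) / (2.0 * h)
    return 1.0 + r * (1.0 - tau(y) - dtau * y)


def passes_monotonicity_check(tau: Callable[[float], float],
                              incomes: Iterable[float], r: float) -> bool:
    """True if the marginal after-tax return is positive at every income level."""
    return all(marginal_after_tax_return(tau, y, r) > 0.0 for y in incomes)


def u_shaped(y: float) -> float:
    """Hypothetical U-shaped schedule with its minimum rate of 19% at y = 2."""
    return 0.19 + 0.05 * (y - 2.0) ** 2


def extreme(y: float) -> float:
    """Deliberately extreme schedule used to show a failing case."""
    return 10.0 * y


incomes = [0.5 + 0.05 * i for i in range(91)]   # incomes from 0.5 to 5.0
ok_u_shaped = passes_monotonicity_check(u_shaped, incomes, r=0.04)
ok_extreme = passes_monotonicity_check(extreme, incomes, r=0.04)
```

In this example the U-shaped schedule passes even though it is non-monotone, while the extreme schedule fails because the marginal tax rate exceeds (1 + r)/r at high incomes.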
The goal of our paper is to solve for the following Steady State Ramsey Problem.
Definition 2 (Steady State Ramsey Problem) A solution to the Ramsey problem for a stationary economy with heterogeneous agents is an admissible time-invariant tax schedule τ ∈ Υ that maximizes social welfare in the steady state,
\[ \max_{\tau \in \Upsilon} \sum_{z \in Z} \int v(k, z)\, d\lambda(k, z). \]
It is easy to show that the above specification of the Steady State Ramsey Problem is equivalent to maximizing the average current period utility,
\[ W = \sum_{z \in Z} \int u(c(k, z))\, d\lambda(k, z). \]
In the following Sections, we will characterize the optimal tax schedule using this latter specification.
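The equivalence can be sketched in a few lines. The derivation below assumes that steady state social welfare is the average of the agents' value functions under the stationary distribution λ (the symbol $V$ is introduced here only for illustration) and uses the Bellman equation together with stationarity of λ.

```latex
\begin{align*}
V &\equiv \sum_{z}\int v(k,z)\,d\lambda(k,z)
   = \sum_{z}\int \Big[ u(c(k,z)) + \beta \sum_{z'} Q(z,z')\, v\big(k'(k,z),z'\big) \Big]\, d\lambda(k,z) \\
  &= \sum_{z}\int u(c(k,z))\,d\lambda(k,z)
     + \beta \sum_{z'}\int v(k',z')\,d\lambda(k',z')
   = \sum_{z}\int u(c(k,z))\,d\lambda(k,z) + \beta V ,
\end{align*}
so that
\[
V = \frac{1}{1-\beta} \sum_{z}\int u(c(k,z))\,d\lambda(k,z).
\]
```

The second line uses the fact that, in a stationary equilibrium, next period's distribution over $(k', z')$ coincides with λ. Since $1/(1-\beta)$ is a positive constant, maximizing $V$ and maximizing average current period utility select the same tax schedule.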

The Operator Steady State Ramsey Problem
In this Section we study the Steady State Ramsey Problem of choosing the tax schedule to maximize utilitarian social welfare subject to the constraints that the government's budget is balanced and that the resulting allocation is a stationary competitive equilibrium. Since the problem is to find an optimal, admissible time-invariant function τ : R + → R, we transform the Steady State Ramsey Problem into an operator problem characterizing the stationary competitive equilibrium.
First, we specify the operator equation for the Euler equation from the individual household optimization problem (2),
\[ u'(c) \geq \beta \sum_{z' \in Z} Q(z, z')\, u'(c') \Big[ 1 + \Big( 1 - \tau(y') - \frac{\partial \tau(y')}{\partial y'}\, y' \Big) r \Big], \]
where $c' = (1 - \tau(y'))\, y' + k' - k''$ and $y' = rk' + wz'$. Again, the solution of the Euler equation is a time invariant policy function for the next period capital, $k'(k, z)$, and $k'' = k'(k'(k, z), z')$. In principle, there exist agents who are unconstrained in their savings decision (i.e., $k'(k, z) > k_L$ and the above Euler equation holds with equality) and agents who are borrowing constrained (i.e., $k'(k, z) = k_L$ and the Euler equation is satisfied with inequality). An example with only two shock levels, $Z = \{\underline{z}, \bar{z}\}$, is depicted in Figure 1. In this figure, agents with the low shock and accumulated assets $k \in [k_L, k(\underline{z})]$ are borrowing constrained.
In general, for all $z \in Z$, there exists a current minimal accumulated asset level $k(z)$ above which agents are not borrowing constrained. If for a given $z \in Z$ the minimal asset level $k(z)$ is greater than $k_L$, then agents with accumulated assets $k \in [k_L, k(z)]$ are borrowing constrained. If, on the other hand, $k(z) = k_L$, all agents with that shock level are unconstrained in their saving decision.
In order to express the Euler equation in the operator form, we introduce an operator F defined on two functions: the tax schedule $\tau : \mathbb{R}_+ \to \mathbb{R}$ and the next period capital policy function $k' : B \times Z \to B$. The operator F is a mapping from a two-dimensional space of continuous functions into a one-dimensional space of continuous functions. This operator Euler equation must be equal to zero in expectation for each $(k, z) \in B \times Z$,
\[ \sum_{z' \in Z} Q(z, z')\, F(\tau, k')(k, z, z') = 0. \]
First, for the agents with unconstrained savings decisions, i.e., those with $(k, z, z') \in [k(z), \infty) \times Z \times Z$, the operator is given by
\[ F(\tau, k')(k, z, z') = u'(c) - \beta\, u'(c') \Big[ 1 + \Big( 1 - \tau(y') - \frac{\partial \tau(y')}{\partial y'}\, y' \Big) r \Big]. \]
Second, for the agents who are borrowing constrained, i.e., for those with $(k, z, z') \in [k_L, k(z)] \times Z \times Z$ and the next period savings $k'$ equal to $k_L$, the operator F is equal to $k' - k_L$, which is zero.
Next we turn to the operator equation for the stationary distribution, L, defined on three functions: the tax schedule $\tau : \mathbb{R}_+ \to \mathbb{R}$, the next period capital policy function $k'$, and the distribution function λ. Because the stationary distribution is derived from agents' savings decisions, we have to again distinguish between the constrained and unconstrained agents. First, for the agents with unconstrained savings decisions the operator L for equation (5) is given by . From the proof of Theorem 1, the savings function is a monotone function in k over the whole interval $[k(z), \infty)$ and thus for any z, there exists an inverse function $(k')^{-1}$ assigning the current value of capital k to the value of the next period capital x according to $k = (k')^{-1}(x, z)$.
Second, for the agents who are borrowing constrained with next period capital equal to $k_L$, the operator L has to be defined as where $\lambda(k_L, z')$ is the mass of agents with next period capital $k_L$.
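On a discrete grid, the stationarity condition can be illustrated by pushing the mass of agents forward through the savings policy and the shock transition until it stops changing. The sketch below is such a grid-based stand-in, not the operator equation of the paper; the five-point grid, the policy rule (step down after the low shock, step up after the high shock), and the transition matrix are hypothetical.

```python
from typing import List


def push_forward(mass: List[List[float]], policy: List[List[int]],
                 Q: List[List[float]]) -> List[List[float]]:
    """One application of the law of motion: agents at (k_i, z_j) move to
    capital index policy[i][j] and draw next shock z_m with probability Q[j][m]."""
    nk, nz = len(policy), len(Q)
    new = [[0.0] * nz for _ in range(nk)]
    for i in range(nk):
        for j in range(nz):
            n = policy[i][j]
            for m in range(nz):
                new[n][m] += mass[i][j] * Q[j][m]
    return new


def stationary_mass(policy: List[List[int]], Q: List[List[float]],
                    tol: float = 1e-12, max_iter: int = 100_000) -> List[List[float]]:
    """Iterate the law of motion from a uniform start until a fixed point."""
    nk, nz = len(policy), len(Q)
    mass = [[1.0 / (nk * nz)] * nz for _ in range(nk)]
    for _ in range(max_iter):
        new = push_forward(mass, policy, Q)
        gap = max(abs(new[i][j] - mass[i][j]) for i in range(nk) for j in range(nz))
        mass = new
        if gap < tol:
            return mass
    raise RuntimeError("distribution iteration did not converge")


# Hypothetical policy on a five-point capital grid, for illustration only.
nk_example = 5
policy_example = [[max(i - 1, 0), min(i + 1, nk_example - 1)] for i in range(nk_example)]
Q_shock = [[0.9, 0.1], [0.2, 0.8]]
mass_example = stationary_mass(policy_example, Q_shock)
```

In this example `mass_example[0][0]` is the mass of low-shock agents sitting at the lowest grid point, the discrete counterpart of the mass at the borrowing limit discussed above.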

Definition 3 (Operator Stationary Recursive Competitive Equilibrium)
Given admissible time-invariant tax schedule τ ∈ Υ, an operator stationary recursive competitive equilibrium is prices (r, w); policy function k : subject to a system of operator equations (10)-(11), consistent with equilibrium prices (1) and the market clearing conditions (7)- (8) In a similar way, we define so-called 'sensitivity' functions Dk and Dλ, which capture the effect of marginal changes in the tax function on the policy function and on the distribution function, respectively.
consistent with equilibrium prices (1), the balanced budget, and the market clearing conditions (7)-(8) in Definition 1. The above first order necessary conditions (12)-(14), together with two additional operator equations, form a system of five operator equations in the five unknown time-invariant functions $k'$, λ, $Dk'$, Dλ, and τ.
The proof of Proposition 1 is in the Appendix.
Intuitively, the first order condition for the Operator Steady State Ramsey Problem in equation (12) resembles a total derivative of W with respect to τ equal to zero, i.e., a first order condition for an unconstrained optimization problem. This comes from the fact, formally stated in the proof of Proposition 1 and discussed already in Theorem 1, that the stationary recursive competitive equilibrium is properly defined for any admissible tax function from the set Υ.$^{7}$ Therefore, the remaining two first order conditions (13)-(14) are, as we proved in the previous section, necessary as well as sufficient conditions. Finally, the equations (15)-(16) are not first order conditions but serve rather as additional conditions on the unknown sensitivity functions.
The exact formulas for terms $\partial X / \partial \phi$, for $X \in \{W, F, L\}$ and $\phi \in \{\tau, k', \lambda\}$, are derived in Lemma 1 in the Appendix. Here we will discuss the results using simplified notation only.

Effects of τ on Social Welfare
Equation (12) describes the effect of the income tax schedule on social welfare. There is a direct effect of the income tax schedule given by the first term and two indirect effects via the next period capital decision and the stationary distribution.
Using (25) and (32) in Lemma 1 in the Appendix, the direct effect of tax on social welfare can be rewritten as
\[ \frac{\partial W}{\partial \tau} = \sum_{z \in Z} \int u'(c) \Big[ (1 - \tau) \frac{\partial y}{\partial \tau} - y - \frac{\partial \tau}{\partial y}\, y\, \frac{\partial y}{\partial \tau} \Big]\, d\lambda(k, z). \]
The direct cumulative effect (integrated over the distribution of capital and shocks) reflects the change in current consumption weighted by the marginal utility. The direct change in current consumption can be decomposed into two parts: the first is the change in pre-tax income $(1 - \tau) \frac{\partial y}{\partial \tau}$ (mainly through returns to capital and labor; see equation (34) in the Appendix); the second captures the change in disposable income proportional to the pre-tax income y.
Interestingly, there is also an additional feedback effect that must be taken into account when we consider a nonlinear income tax schedule contingent on the current pre-tax income: the tax change which influences income y also influences the tax rate via the change in income, thus the effect is $\frac{\partial \tau}{\partial y}\, y\, \frac{\partial y}{\partial \tau}$. The unknown function τ is determined from conditions (12)-(16) of Proposition 1.
The first indirect effect of the income tax schedule on social welfare is via the next period capital decision. It is equal to
\[ -\sum_{z \in Z} \int u'(c)\, Dk'(k, z)\, d\lambda(k, z). \]
The negative sign in the formula follows from $\frac{\partial c}{\partial k'} = -1$. The effect of changes in the tax schedule on the savings decision is captured by the unknown sensitivity function $Dk'$, characterized by the implicit functional equation (15).
The second indirect effect of the income tax schedule on social welfare is via the distribution of capital. Using (24), it is simply equal to
\[ \sum_{z \in Z} \int u(c)\, d(D\lambda)(k, z). \]
The unknown function Dλ contains the effects of the tax schedule on the distribution of capital, characterized by the implicit functional equation (14).

Effects of τ on the Euler Equation
The total effect of the tax schedule on an individual agent's Euler equation (15) can be decomposed into three effects: a direct effect, an indirect effect via the next period capital, and an indirect effect via the two-period ahead capital.
By Lemma 1, the direct effect can be expressed as The first line of this formula measures the direct effect of a tax change in the current period due to the change in consumption weighted by the marginal change of the marginal utility u (c). The second line captures the same effect in the next period.
The third line is the direct effect on the return to capital next period, weighted by the marginal utility of the next period consumption $u'(c')$. To obtain the total effect, these partial effects are summed over all possible future values of shocks $z'$.
We can see a similar pattern from the direct effect of a change in the tax schedule on current consumption c, next period consumption $c'$, and after tax return to capital $(1 - \tau')\, r$. There is a decline in disposable income proportional to pre-tax income y, or in return to capital r. Similarly, there is the effect of a tax change via pre-tax income $(1 - \tau) \frac{\partial y}{\partial \tau}$ and pre-tax return to capital $(1 - \tau') \frac{\partial r}{\partial \tau}$. And finally, there is the feedback effect $\frac{\partial \tau}{\partial y} \frac{\partial y}{\partial \tau}\, y$ and $\frac{\partial \tau'}{\partial y'} \frac{\partial y'}{\partial \tau}\, r$. An additional part of the effect on the next-period after-tax return to capital is the term $-\frac{\partial \tau'}{\partial y'}\, y'$. It captures the fact that a positive slope of a tax schedule creates an additional incentive to save less in order to be taxed at a lower rate. This means that the two additional terms, the first-order and the second-order feedback effects, $-2 \frac{\partial \tau'}{\partial y'} \frac{\partial y'}{\partial \tau}\, r - \frac{\partial^2 \tau'}{\partial y'^2} \frac{\partial y'}{\partial \tau}\, y'\, r$, represent a disincentive to earn too high an income. The first term says that the disincentive depends on how the next period income changes with the tax schedule. The second term takes into account how the slope of the tax schedule changes with the change of the tax schedule.
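Again, the unknown function τ can be determined from conditions (12)-(16) of Proposition 1. Once we know τ, we also know $\frac{\partial \tau}{\partial y}$ and $\frac{\partial^2 \tau}{\partial y^2}$; $\tau'$, $\frac{\partial \tau'}{\partial y'}$, and $\frac{\partial^2 \tau'}{\partial y'^2}$ are just the functions τ, $\frac{\partial \tau}{\partial y}$, and $\frac{\partial^2 \tau}{\partial y^2}$ applied to the next period; i.e., $\tau' = \tau(y')$, $\frac{\partial \tau'}{\partial y'} = \frac{\partial \tau(y')}{\partial y'}$, and $\frac{\partial^2 \tau'}{\partial y'^2} = \frac{\partial^2 \tau(y')}{\partial y'^2}$. According to Lemma 1, $\frac{\partial y'}{\partial \tau} \equiv \frac{\partial r}{\partial \tau}\, k' + \frac{\partial w}{\partial \tau}\, z'$.
Now we turn to the indirect effects. The second term of equation (15) is the indirect effect via next period capital on the individual Euler equation. We rewrite it in simplified notation as
\[ \Big( \frac{\partial F}{\partial k'} + \frac{\partial F}{\partial k''} \frac{\partial k''}{\partial k'} \Big)\, Dk'. \tag{17} \]
The function τ affects the Euler equation directly via $k'$ (the first term) and indirectly via $k'' = k'(k')$ (the second term). The expressions $\frac{\partial F}{\partial k'}$, $\frac{\partial F}{\partial k''}$, and $\frac{\partial k''}{\partial k'}$ are described below. The term $Dk'$, which describes how a change in the tax function changes the savings decision, is the unknown sensitivity function.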
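The first indirect effect in equation (17) is equal to
\[ \frac{\partial F}{\partial k'} = -u''(c) - \beta\, u''(c') \Big[ 1 + \Big( 1 - \tau' - \frac{\partial \tau'}{\partial y'}\, y' \Big) r \Big]^2 + \beta\, u'(c') \Big( 2 \frac{\partial \tau'}{\partial y'} + \frac{\partial^2 \tau'}{\partial y'^2}\, y' \Big) r^2 . \]
It can be again decomposed into three parts: the effect through the current consumption $-u''(c)$, the next period consumption, and the feedback effect on the tax schedule $\big( 2 \frac{\partial \tau'}{\partial y'} + \frac{\partial^2 \tau'}{\partial y'^2}\, y' \big) r^2$, since $\frac{\partial y'}{\partial k'} = r$. In the second indirect effect in equation (17),
\[ \frac{\partial F}{\partial k''} = \beta\, u''(c') \Big[ 1 + \Big( 1 - \tau' - \frac{\partial \tau'}{\partial y'}\, y' \Big) r \Big] \]
is the effect of the two period ahead capital decision function $k'' = k'(k')$, using the fact that $\partial c' / \partial k'' = -1$. Finally, the second term in equation (17) also contains the effect of the policy function $k'$ on $k''$. In full notation,
\[ \frac{\partial k''}{\partial k'} = \frac{\partial k'(k'(k, z), z')}{\partial k'} . \]
Exploiting the time invariant structure of the model, the knowledge of the decision function $k'$ allows us to construct its derivative with respect to k, $\frac{\partial k'}{\partial k}$, and thus $\frac{\partial k''}{\partial k'}$.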
The last term in the individual Euler equation (15), the indirect effect via two period ahead capital, can be expressed as $\frac{\partial F}{\partial k''} \frac{\partial k''}{\partial \tau}$.
It describes the effect of $k''$ on the Euler equation and the effect of the tax schedule on $k''$. The first part $\frac{\partial F}{\partial k''}$ was already discussed above. The direct effect of the tax schedule on the two period ahead capital decision function can be understood in its full notation:
\[ \frac{\partial k''}{\partial \tau} = Dk'(k'(k, z), z') . \]
Again, knowledge of $Dk'$ is sufficient for determining $\frac{\partial k''}{\partial \tau}$.

Effects of τ on Stationary Distribution
The total effect of the tax schedule on the stationary distribution of capital in equation (14), which must be, by definition, equal to zero, can be decomposed into three effects: a direct effect, an indirect effect via next period capital, and an indirect effect via the stationary distribution.
When the lower bound on savings is not binding, there is no direct effect of taxes on the stationarity condition and ∂L ∂τ = 0. For the other case, see Lemma 1 in the Appendix. The indirect effect via the next period capital from the second term of equation (16) is, according to Lemma 1, equal to For an individual savings function associated with a shock z today, it is simply composed of the effect of current capital on the distribution, ∂λ ∂k , and of the effect of the tax schedule on the current level of capital. In the Appendix we show that the effect of the tax on current period capital ∂(k ) −1 (k ,z) ∂τ , given a pair (k , z), can be expressed as

The Least Squares Projection Method
The solution to the Operator Steady State Ramsey Problem from Proposition 1 can be found numerically by using the least squares projection method. In this Section we outline its application to our problem and the approximation of the optimal tax schedule. 8 The solutions to the Operator Steady State Ramsey Problem are the zeros of the given operator equations. First, we approximate the unknown functions by combinations of polynomials from a polynomial base. The approximated solutions are therefore specified by unknown parameters, which transforms the original infinite-dimensional problem into a finite-dimensional one. After substituting the approximated functions into the original operator equations, we construct the residual equations. Ideally, the residual functions should be uniformly equal to zero. In practice, however, this is not achievable, and we limit the problem to a finite number of conditions, the so-called projections, whose satisfaction guarantees a reasonably good approximation. There are many ways to define the projections. 9 We have chosen the least squares projection method for its good convergence properties and its advantage in solving systems of nonlinear operator equations. We search for parameters approximating the functional equations that minimize the squared residual functions.
8 For a detailed explanation of projection methods applied to stationary equilibria in economies with a continuum of heterogeneous agents, see Bohacek and Kejak (2002).
9 For an excellent survey and description of these methods, see Chapter 11 in Judd (1998).
The system of operator equations given by (12)-(14) and (15) is defined for each shock value $z_1, z_2, \dots, z_J$. We interpret the policy function $k_i$ as the next-period capital function of an agent who was hit by shock level $z_i$. Analogously, the distribution function $\lambda_i$ is the distribution of agents with shock $z_i$, etc. Similarly, we assign the Euler and distribution function operators to every shock level, $F_i$ and $L_i$, respectively.
We approximate all unknown functions by the orthogonal Chebyshev polynomial base, 10 where the vector of parameters $a \equiv (a_1, a_2, \dots, a_J, a_{J+1}, \dots, a_{J+2})$ is of size $S = J \times (M + N + O + P) + Q$, $k \equiv (k_1, k_2, \dots, k_J)$, and $\Lambda \equiv (\lambda_1, \lambda_2, \dots, \lambda_J)$. The formula for $F_i$, for any $i = 1, \dots, J$, is obtained from equation (10), where we substitute $k_i(k; a)$ for $k'(k, z)$ and $k_j(k_i(k; a); a)$ for $k'(k'(k, z), z')$, as well as $z_i$ for $z$ and $z_j$ for $z'$. We can similarly obtain formulas for $L$ by using equation (11). 11
The least squares projection method looks for a vector of parameters $a$ that minimizes the sum of weighted residuals. After approximating the integrals by Gauss-Chebyshev quadrature, we obtain a minimization problem in which the nodes $k$ are the zeros of the polynomial $\varphi$ of a degree greater than the largest degree of approximation, $\max\{M, N, O, P, Q\}$.
10 Details on Chebyshev polynomials can be found in Judd (1992) and Judd (1998).
11 To obtain equation (15)
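The steps above (Chebyshev approximation, residuals at Chebyshev zeros, least squares minimization) can be illustrated on a much simpler functional equation. The following is a minimal sketch, not the system solved in this paper: the toy equation $f'(x) - f(x) = 0$ with $f(0) = 1$, the function name `least_squares_projection`, and the penalty weight `bc_weight` are all assumptions made for illustration. Because the toy residual is linear in the coefficients, ordinary linear least squares suffices here, whereas the Ramsey problem requires a nonlinear optimizer.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def least_squares_projection(degree=10, n_nodes=24, bc_weight=1e3):
    """Toy problem (an assumption, not the paper's system):
    find f on [-1, 1] with f'(x) - f(x) = 0 and f(0) = 1, so f(x) = exp(x)."""
    # Nodes: zeros of the Chebyshev polynomial T_{n_nodes},
    # with n_nodes larger than the degree of approximation.
    nodes = C.chebpts1(n_nodes)
    # Basis values T_n(x_i) and derivatives T_n'(x_i), n = 0..degree.
    T = C.chebvander(nodes, degree)
    dT = np.zeros_like(T)
    for n in range(degree + 1):
        unit = np.zeros(degree + 1)
        unit[n] = 1.0
        dT[:, n] = C.chebval(nodes, C.chebder(unit))
    # Residual R(x_i; a) = sum_n a_n (T_n'(x_i) - T_n(x_i)) is linear in a.
    # Gauss-Chebyshev quadrature gives every node the same weight pi / n_nodes.
    quad = np.sqrt(np.pi / n_nodes)
    A = np.vstack([quad * (dT - T),
                   bc_weight * C.chebvander(np.array([0.0]), degree)])
    b = np.append(np.zeros(n_nodes), bc_weight * 1.0)
    # Minimize the sum of squared weighted residuals plus the penalized condition.
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef


coef = least_squares_projection()
grid = np.linspace(-1.0, 1.0, 201)
max_error = float(np.max(np.abs(C.chebval(grid, coef) - np.exp(grid))))
```

Using more nodes than basis functions makes the system overdetermined, which is what distinguishes the least squares projection from collocation at exactly as many nodes as coefficients.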
Since the least squares projection method sets up an optimization problem, we can use standard methods of numerical optimization, e.g., the Gauss-Newton or the Levenberg-Marquardt methods. Again, the discussion of these methods is not the aim of our paper. However, we found that these traditional methods did not work in our high-dimensional problem mainly due to possible multiple local solutions. We tried several other methods (simulated annealing or genetic algorithm with quantization, for example) and finally succeeded with a genetic algorithm with multiple populations and local search.
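To make the idea of a genetic algorithm with multiple populations and local search concrete, the sketch below applies one simple variant to a standard multimodal test function. It is not the implementation used for the results in this paper: the Rastrigin objective, the operators (elitist selection, blend crossover, Gaussian mutation, ring migration), the coordinate search, and all tuning constants are assumptions chosen for brevity.

```python
import numpy as np


def rastrigin(x):
    # Standard multimodal test function with many local minima; minimum 0 at x = 0.
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + float(np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x)))


def local_search(f, x, step=0.1, shrink=0.5, n_iter=40):
    # Derivative-free coordinate search: accept improving moves of +/- step,
    # shrink the step when no coordinate move improves. Bounds are not enforced.
    x = x.copy()
    fx = f(x)
    for _ in range(n_iter):
        improved = False
        for i in range(x.size):
            for s in (step, -step):
                y = x.copy()
                y[i] += s
                fy = f(y)
                if fy < fx:
                    x, fx, improved = y, fy, True
        if not improved:
            step *= shrink
    return x, fx


def multi_population_ga(f, dim=2, n_pops=4, pop_size=20, n_gen=30,
                        bounds=(-5.0, 5.0), seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pops = [rng.uniform(lo, hi, size=(pop_size, dim)) for _ in range(n_pops)]
    f_start = min(f(x) for pop in pops for x in pop)
    n_elite = pop_size // 2
    for gen in range(n_gen):
        for p in range(n_pops):
            pop = pops[p]
            order = np.argsort([f(x) for x in pop])
            elite = pop[order[:n_elite]]                      # selection
            pa = elite[rng.integers(n_elite, size=pop_size - n_elite)]
            pb = elite[rng.integers(n_elite, size=pop_size - n_elite)]
            w = rng.uniform(size=(pop_size - n_elite, 1))
            children = w * pa + (1.0 - w) * pb                # crossover
            children += rng.normal(scale=0.3, size=children.shape)  # mutation
            pops[p] = np.clip(np.vstack([elite, children]), lo, hi)
        if gen % 5 == 4:
            # Migration on a ring: each population's best replaces the last
            # (non-elite) member of the next population.
            bests = [pop[np.argmin([f(x) for x in pop])].copy() for pop in pops]
            for p in range(n_pops):
                pops[(p + 1) % n_pops][-1] = bests[p]
    # Local search started from the best member of every population.
    candidates = [pop[np.argmin([f(x) for x in pop])] for pop in pops]
    results = [local_search(f, x) for x in candidates]
    x_best, f_best = min(results, key=lambda r: r[1])
    return x_best, f_best, f_start


x_best, f_best, f_start = multi_population_ga(rastrigin)
```

Keeping several populations that exchange only their best members preserves diversity longer than a single population would, which is one common way to reduce the risk of stalling in a local solution.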

Results
In this Section we solve for the optimal tax schedule and compare the associated steady state allocations to those resulting from the existing progressive tax schedule in the U.S. economy and from the usual flat-tax reform. In order to evaluate tax reforms, we conduct the usual transition analysis.

Parameterization
Given the complexity of our Steady State Ramsey Problem, for now we do not model the earnings process as well as Ventura (1999). Each agent supplies labor inelastically and the uninsurable idiosyncratic shock to labor productivity follows a two-state, first order Markov chain. We use the results of Heaton and Lucas (1996) who, using the PSID labor market data, estimate the household annual labor income process between 1969 and 1984 by a first-order autoregression with coefficient $\rho$ and innovations $\epsilon \sim N(0, \sigma^2)$. They find that $\rho = 0.53$ and $\sigma^2 = 0.063$. The Tauchen and Hussey (1991) approximation procedure for a two-state Markov chain implies $z_L = 0.665$, $z_H = 1.335$ and $Q(z_L, z_L) = Q(z_H, z_H) = 0.74$. These values imply an aggregate effective labor supply equal to one with agents evenly split over the two shocks. 12 We set the discount factor at $\beta = 0.95$. The rest of the parameters are taken from Prescott (1986), in particular $\alpha = 0.36$, $\delta = 0.1$, and the preference parameter $\sigma = 1$.
Finally, for all steady states we consider a Ramsey problem in which the government is required to raise a predetermined amount of tax revenues equal to 20% of the total output, i.e., g = 0.2.
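The claim that the two-state chain splits agents evenly and normalizes aggregate effective labor to one can be checked directly from the reported values. The short sketch below does so; the helper name `stationary_distribution` is introduced here for illustration only.

```python
import numpy as np

# Shock values and transition matrix reported in the Parameterization section.
z = np.array([0.665, 1.335])
Q = np.array([[0.74, 0.26],
              [0.26, 0.74]])


def stationary_distribution(Q):
    # Left eigenvector of Q associated with the unit eigenvalue, normalized to sum to one.
    eigvals, eigvecs = np.linalg.eig(Q.T)
    v = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    return v / v.sum()


pi = stationary_distribution(Q)
aggregate_labor = float(pi @ z)
```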

The U.S. Progressive Tax Schedule
We model the progressive tax schedule as in Ventura (1999), the closest model analyzing a flat-tax reform in an economy with heterogeneous agents. 13 An agent's budget constraint is written in terms of the total tax paid, which depends on the bracket $(I_{m-1}, I_m]$ containing the agent's income $I$. The government budget constraint is cleared by finding an equilibrium value of the tax exemption level $I^*$. Aggregate statistics of the steady state are shown in the left column of Table 1.
12 A similar parameterization is used by Storesletten, Telmer, and Yaron (1999) and Ventura (1999).
13 Compared to his model, our agents are infinitely lived, so we omit the life-cycle variables, accidental bequests, government transfers, and social security tax and benefits. Except for capital depreciation, we do not consider tax deductions.

A Flat-Tax Reform
The flat-tax reform consists of replacing the progressive tax schedule with a single flat tax τ on the total income from labor and capital, and the budget constraint of each agent changes accordingly. Note that the flat-tax reform, as in Ventura (1999), does not eliminate taxation of capital income. We find that the equilibrium flat tax rate is τ = 0.254.
The middle column in Table 1 describes the steady state results. Relative to the progressive tax schedule steady state, the flat-tax reform increases the steady state levels by magnitudes found in the literature: capital stock increases by 30%, output by 10.8%, consumption by 4.6%, and welfare by 3.9%. As in Ventura (1999), the flat-tax reform increases inequality: Gini income coefficients rise from 0.22 to 0.31 before tax and from 0.21 to 0.32 after tax. 14
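For readers who want to reproduce inequality statistics of this kind from a simulated or discretized distribution, a Gini coefficient can be computed from the Lorenz curve. The sketch below is a generic implementation, not code from this paper; the function name `gini` and the optional `weights` argument (for example, the mass of agents at each grid point) are assumptions.

```python
import numpy as np


def gini(values, weights=None):
    # Gini coefficient as one minus twice the area under the Lorenz curve.
    x = np.asarray(values, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    # Cumulative population and income shares, starting from the origin.
    cum_pop = np.insert(np.cumsum(w) / w.sum(), 0, 0.0)
    cum_inc = np.insert(np.cumsum(x * w) / np.sum(x * w), 0, 0.0)
    # Trapezoid rule for the area under the Lorenz curve.
    area = np.sum((cum_pop[1:] - cum_pop[:-1]) * (cum_inc[1:] + cum_inc[:-1]) / 2.0)
    return 1.0 - 2.0 * area


gini_equal = gini([1.0, 1.0, 1.0, 1.0])
gini_concentrated = gini([0.0, 0.0, 0.0, 1.0])
```

With four equal incomes the coefficient is zero, and when one of four agents holds all income it equals 0.75.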

The Optimal Tax Schedule
Finally, we use our methodology described in the previous Sections to solve for the optimal tax schedule that maximizes average steady state welfare.
The right column in Table 1 summarizes the optimal tax schedule steady state. The impact of the optimal tax schedule is very large. Steady state average welfare increases by 4.4%. Aggregate capital stock rises by 49%, output by 15.8%, and consumption by 5.8%. Inequality increases too, but not as much as in the flat-tax reform: Gini income coefficients are 0.28 before and 0.27 after tax, respectively. General equilibrium effects cause the interest rate to drop by almost one half and the wage to increase, because labor is more productive when combined with such a high capital stock. Compared to the flat-tax steady state, capital stock increases by 15%, output by 4.5%, consumption by 1.1%, and welfare by 0.8%.
Figure 2 shows the optimal tax schedule and the marginal tax rate function. The average tax rate is a U-shaped function taxing the lowest total income at 45%, decreasing to a minimum of 19% and rising to 62% at the highest level of total income. Although the whole shape of the tax function is important for the resulting allocations, the majority of agents face the decreasing or the flat part of the tax schedule. The marginal tax rate is also U-shaped, almost flat and close to zero for low incomes, falling to negative levels around the average total income and then rising at high income levels. Note that the maximal marginal rate is 2.5 and that the optimal tax schedule easily satisfies the admissibility condition from Corollary 1 (the interest rate implies an upper bound equal to 19.7).
14 Elimination of capital income tax in Lucas (1990) increases capital stock by 30-34% and consumption by 6.7%. A flat-rate reform with heterogeneous agents in Ventura (1999) increases the total capital stock by one third and output by 15%. Without a well calibrated life-cycle earnings process, we are not able to match well the inequality coefficients, especially that of wealth.
The optimal tax function τ is strictly positive and highly nonlinear. Both results differ from the static model of Mirrlees (1971) with a fixed distribution of skills, where the welfare maximizing tax schedule is close to a linear, non-decreasing function. In his model, the marginal tax rate is between zero and one, and zero at both ends of the distribution. In contrast to our model with insurance and savings incentives, the results in Mirrlees (1971) follow from labor incentives related to the distribution of skills and consumption-leisure preferences.
We want to emphasize that our stationary distribution is endogenous and there are no restrictions on the optimal tax schedule to be positive or to be of any particular shape. Conesa and Krueger (2004), also in a general equilibrium framework but with added life cycle features, studied the optimal progressivity of a tax schedule, limiting their class of tax schedules to monotone functions as in Gouveia and Strauss (1994).
In this class of functions, the optimal tax schedule is basically a flat tax with a fixed deduction, delivering a welfare gain of 1.7% compared to the existing progressive tax in the United States.
The class of monotone functions seems rather restrictive for the optimal tax schedule. Our class of admissible functions includes all progressive tax schedules, but these were found to be significantly inferior with respect to the welfare criterion. Also, a simple search method can be greatly improved by using the optimality conditions developed in this paper. In a simple experiment, we applied our solution method starting from parameters obtained from a similar simple search procedure: the social welfare criterion improved by 4.6%. Finally, the simple search method cannot be used for computation of a stationary competitive equilibrium, which is the limit of the optimal dynamic tax schedule. 15
15 We show in Bohacek and Kejak (2004) that under some parameterization the first order conditions of the general dynamic Ramsey problem can simplify to the first order conditions of the Steady State Ramsey Problem analyzed in this paper.

The Tradeoff Between Efficiency and Distribution
Apart from the general equilibrium effects, the huge impact of the optimal tax schedule arises from the distributional effects. The stationary distributions of capital in the three steady states are shown in Figure 3. Although both the flat and the optimal tax schedules increase the aggregate levels, the difference between them is that the flat tax schedule does not take into account the distribution of agents. The flat tax reform helps agents with high incomes more: the mean wealth increases much more than the median, so that the median/mean ratio falls to 0.77. In the flat-tax steady state, the aggregate levels increase but, from "the optimal distribution" point of view, the mass of agents moves too much to the left while wealthy agents emerge at the right tail of the distribution. The progressive tax schedule has the lowest inequality measures because the high taxes on rich agents narrow the distribution towards the mean. However, the low tax rates on low incomes do not provide incentives for the poor households to save and move to higher income levels. In other words, it provides too much short-run insurance at the cost of the long-run average levels.
This is exactly what the optimal income tax schedule improves. The main mechanism behind the large growth in the aggregate levels is the incentive effect of the optimal tax schedule. The U-shaped function in the top panel of Figure 2 effectively concentrates the agents around the mean, something that a social planner with access to lump-sum transfers would do. 16 The high tax rate at low income levels provides incentives for these agents to save more and move to higher income levels. On the other hand, the even higher tax rate on high income discourages further savings by the wealthiest agents.
16 In many countries, marginal taxes are favorable to middle income groups. In practice, high rates on the rich can break large fortunes while on the poor they provide a floor for poverty. The result is a more equal distribution. Saez (2002) studies the optimal progressivity of capital income tax in a partial equilibrium model with exogenous labor and distribution. He finds that a progressive tax is a powerful tool to redistribute accumulated wealth.
In the middle of the total income levels, the tax rate is lower than that found for the flat-tax reform. The optimal tax schedule preserves the median/mean wealth ratio of the progressive tax schedule by increasing the median by 47% and the mean by 49%.
The support of the invariant distribution becomes wider but inequality measures do not increase as much as in the flat-tax reform. 17 To further analyze the tradeoff between efficiency and distribution, we adopt the approach in Domeij and Heathcote (2004) to distinguish the efficiency gain from distributional gains. The efficiency gain for an individual agent is the percentage of the original consumption that would allow the agent to consume the same fraction of the aggregate consumption after the reform as he or she was consuming in the original steady state. In the case of logarithmic utility, the gain is the same for all agents (see Domeij and Heathcote (2004) for a simple proof and other details). The distributional gain is the difference between the individual welfare gain and the efficiency gain. 18 Table 1 displays the average efficiency and distributional gains of the optimal steady state relative to the other two steady states. It is apparent that the steady state associated with the optimal, U-shaped tax function is superior in welfare and efficiency to the other two steady states: both average welfare and efficiency measures are positive, and naturally greater for the comparison with the progressive steady state. As noted before, the optimal tax schedule yields an average distributional loss relative to the progressive tax (-0.57%) but a gain relative to the flat tax steady state (2.86%).
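The decomposition described above can be written in a few lines for the logarithmic case. The sketch below is an illustration under stated assumptions, not the computation behind Table 1: it assumes log utility with discount factor β = 0.95, a consumption equivalent applied as a constant percentage in every period, and a steady-state comparison in which the efficiency gain reduces to the ratio of aggregate consumption levels. The function names and the numerical example are hypothetical.

```python
import numpy as np

BETA = 0.95  # discount factor from the Parameterization section


def welfare_gain(v_old, v_new, beta=BETA):
    # With log utility, scaling consumption by (1 + g) in every period raises
    # lifetime utility by log(1 + g) / (1 - beta); solve v_old + that = v_new.
    diff = np.asarray(v_new, dtype=float) - np.asarray(v_old, dtype=float)
    return np.exp((1.0 - beta) * diff) - 1.0


def efficiency_gain(c_agg_old, c_agg_new):
    # Percentage of original consumption that keeps an agent's share of
    # aggregate consumption unchanged; identical for all agents.
    return c_agg_new / c_agg_old - 1.0


def distributional_gain(v_old, v_new, c_agg_old, c_agg_new, beta=BETA):
    return welfare_gain(v_old, v_new, beta) - efficiency_gain(c_agg_old, c_agg_new)


# Hypothetical agent: constant consumption 1.0 before and 1.1 after the reform,
# while aggregate consumption rises by 5%.
v_old = np.log(1.0) / (1.0 - BETA)
v_new = np.log(1.1) / (1.0 - BETA)
w_gain = float(welfare_gain(v_old, v_new))
e_gain = efficiency_gain(1.0, 1.05)
d_gain = float(distributional_gain(v_old, v_new, 1.0, 1.05))
```

In this example the agent's welfare gain of 10% splits into an efficiency gain of 5% and a distributional gain of 5%.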
The individual gains, for agents with high and low labor productivity shocks, are shown in Figure 4. They are monotonically decreasing functions for all agents at all asset and labor income levels (with some exceptions). Most of the asset-poor agents have both welfare and distributional gains while the rich have losses relative to the flat tax schedule steady state. Two forces are present: the tax rate (especially for the rich agents in the flat tax steady state) and general equilibrium effects. The huge welfare gains (5-20%) for poor agents are mostly due to the higher wage in the optimal steady state. Note that the big efficiency gain from the optimal tax schedule is not sufficient to compensate all agents for the more unequal distribution (compared to the progressive tax steady state, an agent with a low productivity shock always has a distributional loss in Figure 4, top panel).
17 Table 1 also shows the fraction of agents constrained in their borrowing: only 1.16% of agents are constrained in the progressive tax schedule steady state. The flat tax schedule increases this number to 1.88%, while in the optimal tax steady state it is 1.42% (Domeij and Heathcote (2004) obtained similar results).
18 The individual welfare gain is the percentage of the original consumption level that would make an agent as well off as in the optimal tax steady state.
Table 2 shows the distribution of resources for quintiles of the wealth distribution.
Because of the high tax rate on incomes in the bottom quintile, agents in the optimal tax schedule steady state consume 6.5% less than those of the progressive tax schedule.
From the second quintile of the optimal tax schedule steady state, agents consume on average more than in the other two steady states. Dividing these levels by the average consumption in each steady state, we can calculate average quintile consumption relative to the steady state average. In the optimal tax schedule, the bottom quintile consumes 73% of the average consumption, in the flat tax it is 77%, in the progressive it is 82%.
This shows that the savings incentives of the optimal tax schedule outweigh the insurance aspects (i.e., redistribution) in both the progressive and the flat tax schedules.
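The relative quintile figures quoted above come from dividing average consumption in each wealth quintile by the economy-wide average. A minimal sketch of that calculation for a sample of equally weighted agents follows; the function name and the sample data are hypothetical and are not taken from Table 2.

```python
import numpy as np


def relative_consumption_by_quintile(wealth, consumption):
    # Sort agents by wealth, split them into five equal-sized groups, and
    # express each group's mean consumption relative to the overall mean.
    wealth = np.asarray(wealth, dtype=float)
    consumption = np.asarray(consumption, dtype=float)
    order = np.argsort(wealth, kind="stable")
    groups = np.array_split(consumption[order], 5)
    means = np.array([g.mean() for g in groups])
    return means / consumption.mean()


# Hypothetical sample of ten agents.
sample_wealth = np.arange(10.0)
sample_consumption = np.arange(1.0, 11.0)
relative = relative_consumption_by_quintile(sample_wealth, sample_consumption)
```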
The distribution of capital reveals that the incentives contained in the optimal-tax schedule move the distribution to higher capital levels. The poorest quintile owns on average 17% more assets than in the progressive steady state. This increase is even larger for the other quintiles (40% on the top). Again, the flat-rate steady state leads to a lower level of savings by the bottom two quintiles. These levels are reflected in the shares of the total capital stock. For all steady states the bottom quintile owns only around 5% of the total stock while the top quintile around one third (43% in the flat-tax steady state).
The investment-to-income ratios reveal that agents in the bottom quintile of the optimal schedule invest much more than similar agents in the other two steady states.
Agents in the optimal tax schedule steady state invest 30% of their income, more than those in the flat-tax (27%) and progressive (22%) steady states. The investment is also more evenly distributed over the quintiles. Note also that the flat-rate tax schedule favors capital accumulation by the top quintile.
The income and after-tax income distribution show the differences between the three tax schedules. The progressive tax helps the bottom quintile while the flat tax helps the top quintile. The U-shape of the optimal tax provides the right incentives at the cost of the lowest after-tax income for the poor agents. Finally, the optimal tax actually equalizes the tax contribution share of total tax revenues across the quintiles. Both the flat-tax and progressive-tax steady states put more relative burden on the higher income quintiles.
Finally, Figure 5 shows the sensitivity functions Dk and Dλ. The top panel shows the effect of a change in the optimal tax schedule on the savings decision of agents. For the low shock it is close to zero, for the high shock it is negative and monotonically decreasing. The bottom panel displays the same effects on the probability density function of the stationary distribution λ, again for each shock. We know from the stationarity condition of the distribution that the integral of these functions must be zero. 19
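The zero-integral property can be seen in one line. The following is a sketch under assumptions that go beyond the text: $\pi_i$ denotes the stationary probability of shock $z_i$ under the exogenous Markov chain (a symbol introduced here), any mass point at the lower bound on capital is understood to be included in the integral, and differentiation under the integral sign is taken to be valid.

```latex
\int \lambda_i(k;\tau)\,dk = \pi_i
\quad\Longrightarrow\quad
\int D\lambda_i(k)\,dk
  = \frac{\partial}{\partial \tau}\int \lambda_i(k;\tau)\,dk
  = \frac{\partial \pi_i}{\partial \tau}
  = 0,
\qquad i = 1,\dots,J .
```

Since the shock process does not depend on the tax schedule, a change in τ can only reallocate the mass of agents across capital levels within each shock group.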

Transition to the Optimal Tax Schedule Steady State
Pure welfare steady-state comparisons could be misleading because tax changes imply substantial redistribution in the short run. Changes in the mix between capital and labor income taxes redistribute the tax burden across households. In the Domeij and Heathcote (2004) model of capital tax cuts, the welfare losses during transition, in expected discounted present value, are so large that they overturn the steady state welfare improvement.
The short-run cost in the form of higher labor taxes is too heavy a price to pay for all except for the wealth-richest households. 20
19 Our numerical solution is only very close to zero due to approximation errors and the complexity of the problem we face.
Table 3 shows the results for our tax reform experiment. It compares the expected present discounted value from an unanticipated optimal tax reform of the progressive and flat-tax steady state. In each case the optimal tax schedule is imposed on the stationary distribution of the initial steady state (it is not the optimal transition to the optimal tax schedule steady state). We guess a sufficiently large number of convergence periods and iterate on paths of equilibrium interest rates and wages to clear markets in each period of the transition, returning possible excess tax revenues to all agents in each period. The convergence is relatively fast, lasting around thirty periods.
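The iteration on price paths can be sketched in a deliberately simplified setting. The code below is not the transition computation of this paper: the household block is replaced by a hypothetical fixed saving rule (`SAVING_RATE`), there are no taxes, rebates, or heterogeneous agents, and only the technology parameters α and δ are taken from the Parameterization section. It shows only the outer loop: guess a path, compute factor prices, recompute the capital path households supply at those prices, and repeat until the two paths coincide.

```python
import numpy as np

ALPHA, DELTA = 0.36, 0.1   # technology parameters from the Parameterization section
SAVING_RATE = 0.25         # hypothetical stand-in for the household problem


def prices(k_path):
    # Competitive factor prices with aggregate labor normalized to one.
    r = ALPHA * k_path ** (ALPHA - 1.0) - DELTA
    w = (1.0 - ALPHA) * k_path ** ALPHA
    return r, w


def household_capital_path(r, w, k0):
    # Stand-in household block: save a fixed share of wage plus gross asset income.
    k = np.empty(len(r))
    k[0] = k0
    for t in range(len(r) - 1):
        k[t + 1] = SAVING_RATE * (w[t] + (1.0 + r[t]) * k[t])
    return k


def solve_transition(k0, n_periods=30, damping=1.0, tol=1e-12, max_iter=500):
    guess = np.full(n_periods, k0)          # initial guess for the capital path
    for iteration in range(max_iter):
        r, w = prices(guess)                # prices implied by the guess
        supply = household_capital_path(r, w, k0)
        if np.max(np.abs(supply - guess)) < tol:
            return guess, iteration         # capital market clears in every period
        guess = damping * supply + (1.0 - damping) * guess
    raise RuntimeError("price paths did not converge")


k_path, n_iter = solve_transition(k0=0.1)
k_star = (SAVING_RATE / (1.0 - SAVING_RATE * (1.0 - DELTA))) ** (1.0 / (1.0 - ALPHA))
```

In this toy economy full updating (`damping=1.0`) is enough; in a model with forward-looking heterogeneous agents a damping factor below one is often used to stabilize the iteration.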
Contrary to Domeij and Heathcote (2004), we find that the reform makes both the mean and the median agents in the progressive tax schedule economy better off. Their welfare gains are positive but smaller than in the pure steady-state comparison (3.44% and 3.86%, respectively, measured as per period consumption transfers as a percentage of the initial steady state average consumption). The top panel in Figure 6 shows the expected present discounted values in the progressive-rate steady state and at the moment of the unanticipated reform to the optimal tax schedule. While 73% of the population is better off due to the reform, it is not Pareto improving as the poorest 27% of all households are worse off (they are hit by the high tax rates the optimal schedule imposes on low income levels).
On the other hand, a transition from the flat-tax steady state would be supported by neither the mean nor the median agent (they lose 1.81% and 1.97%, respectively). The poor and the wealthy, for whom the tax increases dramatically, are worse off during the transition. The bottom panel in Figure 6 shows the expected present discounted values of the flat-rate steady state and of the transition to the optimal tax schedule. Political support is not sufficient, equal only to 33% of the population. We do not know whether an optimal transition would be welfare improving from this steady state.
20 This is similar to Garcia-Mila, Marcet, and Ventura (1995) and Auerbach and Kotlikoff (1987), who find that reducing capital income taxation shifts the tax burden away from households who receive a large fraction of their income from capital and towards those who receive a disproportionate fraction from labor. Transition costs in Lucas (1990) reduce the welfare gains from zero capital tax reform to 0.75-1.25 percent of average consumption in the initial steady state.
As usual, this transition exercise shows that a tax reform is not Pareto improving. However, the gains from the optimal tax reform of the existing progressive tax schedule are so large that the reform is supported by the majority of agents despite its transitional costs. Conesa and Krueger (2004) also find that the majority of the population would benefit from their optimal tax reform. However, in their case the poor and rich benefit, while it is the middle class (38%) who would be against the reform.
Finally, Figure 7 shows the efficiency and distributional individual gains from transition. 21 Relative to the steady state analysis, the averages for the progressive steady state reform decline: while the average welfare and efficiency gains remain positive, the average distributional gain falls to -7%. A reform from the flat rate steady state delivers average welfare and efficiency losses but improves the distribution. Note that due to sizeable general equilibrium effects, the functions are still positive for poor agents and monotonically declining.

Conclusions
Quah (2003) shows that average levels are of the first order importance for economic growth and welfare, much more important than inequality. Government policies focusing on aggregate levels, including obviously optimal fiscal policy and taxation, are essential.
However, it is the distribution of agents that delivers these aggregate levels. This paper clearly shows that it is crucial to think of policies that target the distribution of agents.
Only in this way can the high aggregate levels and welfare improvements be achieved.
To our knowledge, this paper is the first one that provides a solution method for such optimal government policies in heterogeneous agent economies. We think of these policies as optimal because they take into account their effects on the distribution of agents. As an example, we find the optimal tax schedule for a Steady State Ramsey Problem in an economy with heterogeneous agents. The optimal tax schedule is U-shaped. It increases all aggregate levels by providing the right incentives for the agents to accumulate high aggregate levels, but not at the cost of increased inequality. The welfare gain in the steady state is large, and it is positive for both mean and median agents in a transition following an unanticipated optimal tax reform of the progressive tax schedule steady state.
The methodology developed in this paper can be applied to any optimal government policy. Within the field of optimal taxation, in our future research we plan to study the optimal tax schedule with elastic labor supply and realistic life-cycle income profiles. An endogenous labor-leisure decision might significantly affect the shape of the optimal tax schedule, the aggregate labor supply and the distribution of labor hours. We would also like to explore different (Rawlsian) welfare functions. Another topic that has received a lot of attention is optimal capital taxation in models with heterogeneous agents (see Aiyagari (1995) for the initial contribution). Finally, we plan to use this methodology to analyze optimal dynamic taxation.
Notes to Table 1: Constrained agents is the fraction of agents whose wealth equals the exogenous lower bound on capital. Welfare is measured as the consumption level corresponding to the average utility. Average Welfare Gain is measured as a percentage of the average consumption each agent would have to receive in the progressive and the flat-rate tax steady state so that the average welfare equals that in the optimal tax steady state. Average Efficiency and Distributional Gains are defined in the text.
Notes: Welfare in the steady state is defined as in the note to Table 1. Welfare in transition is measured as the consumption level corresponding to the average expected present discounted value from transition. Average Welfare Gain from transition is a percentage of the average consumption each agent would have to receive in each period of transition so that the average welfare from transition equals in expected present discounted value that of the initial steady state. Average Efficiency and Distributional Gains from transition are defined in the text.