Approximaths

  • Padé approximants: Convergence I

For row sequences of the Padé table, Montessus de Ballore’s theorem (1902) proves convergence for functions meromorphic on a disk. Before giving the statement of the theorem, we recall a few definitions:

    Holomorphic function: A holomorphic function is a complex-valued function of one or more complex variables that is complex differentiable in a neighborhood of each point in a given domain.

    Analytic function: An analytic function is a function that is locally given by a convergent power series.

Meromorphic function: A meromorphic function on an open subset D of the complex plane \mathbb{C} is a function that is holomorphic on all of D except for a set of isolated points. These points are called the ‘poles’ of the function.

Here is Montessus’s theorem as stated by E. B. Saff in 1972:

Let f(z) be analytic at z = 0 and meromorphic with precisely \nu poles (multiplicity counted) in the disk |z| < \tau. Let D be the domain obtained from |z| < \tau by deleting the \nu poles of f(z). Then, for all m sufficiently large, there exists a unique rational function R_{m, \nu} of type (m, \nu) which interpolates f(z) at the point z = 0 considered with multiplicity m+\nu+1. Each R_{m, \nu} has precisely \nu finite poles and, as m \to \infty, these poles approach the \nu poles of f(z) in |z| < \tau. The sequence R_{m, \nu} converges in D to f(z), uniformly on any compact subset of D.

    Alternative formulation:

    Let f(z) be meromorphic in |z| < \tau, analytic at z = 0, and with a total of \nu poles \zeta_1, \zeta_2, \dots, \zeta_{\nu} (with multiplicity included) in |z| < \tau. Then, as m \to \infty, the Padé approximants P(m,\nu) of f converge on:

    \displaystyle S_f := \{ z \in \mathbb{C} \mid |z| < \tau \} \setminus \{ \zeta_1, \zeta_2, \dots, \zeta_\nu \}

    to f, uniformly on every compact subset K of S_f. In particular:

    \displaystyle P(m,\nu)(z) \to f(z)

    The Padé table is represented as follows:

P(m,n)   n=0   n=1   n=2   \dots   n=\nu   \dots
m=0   P(0,0)   P(0,1)   P(0,2)   \dots   \boxed{P(0,\nu)}   \dots
m=1   P(1,0)   P(1,1)   P(1,2)   \dots   \boxed{P(1,\nu)}   \dots
m=2   P(2,0)   P(2,1)   P(2,2)   \dots   \boxed{P(2,\nu)}   \dots
\vdots   \vdots   \vdots   \vdots   \dots   \vdots   \dots
m \to \infty   \dots   \dots   \dots   \dots   \boxed{P(m,\nu)}   \dots

Montessus’s theorem is crucial in approximation theory: it guarantees the uniform convergence of row sequences of Padé approximants for functions meromorphic on a disk, which justifies using rational approximations for such functions.
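To make the theorem concrete, here is a small numerical sketch (my addition, not part of the original post). For f(z) = e^z/(1-z), which has a single simple pole at z = 1, the [m/1] approximant’s denominator 1 + B_1 z satisfies C_m B_1 = -C_{m+1}, so its pole sits at z = C_m/C_{m+1}; Montessus’s theorem predicts that these poles converge to 1 as m grows.

```python
from fractions import Fraction
from math import factorial

def coeff(k):
    """Taylor coefficient C_k of f(z) = exp(z)/(1 - z): C_k = sum_{j<=k} 1/j!."""
    return sum(Fraction(1, factorial(j)) for j in range(k + 1))

def pole_of_row_approximant(m):
    """Pole of the [m/1] Pade approximant: C_m B_1 = -C_{m+1} => pole at C_m/C_{m+1}."""
    return coeff(m) / coeff(m + 1)

# The poles of the row sequence approach the true pole z = 1
print([float(pole_of_row_approximant(m)) for m in (2, 5, 10, 20)])
```

The distance to the true pole shrinks rapidly with m, as the theorem leads one to expect.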

  • Two properties of Padé Approximants

    Property 1

Let g(x) = \frac{1}{f(x)}, with f(0) \neq 0, and assume f is at least C^{m+n} at x = 0. If P(m,n)_f = \frac{P(x)}{Q(x)}, then the [n/m] Padé approximant of g(x) is:

    \displaystyle P(n,m)_g = \frac{Q(x)}{P(x)},

    provided P(0) \neq 0.

    Proof: Given P(m,n)_f = \frac{P(x)}{Q(x)}, we have:

    \displaystyle f(x) Q(x) - P(x) = \epsilon(x) x^{m+n+1}.

    Since g(x) = \frac{1}{f(x)}, consider:

    \displaystyle g(x) P(x) - Q(x) = \frac{1}{f(x)} P(x) - Q(x) = \frac{P(x) - f(x) Q(x)}{f(x)} = -\frac{\epsilon(x) x^{m+n+1}}{f(x)}.

    Since f(0) \neq 0, \frac{1}{f(x)} is bounded near x = 0, and:

    \displaystyle g(x) P(x) - Q(x) = O(x^{m+n+1}),

    indicating that \frac{Q(x)}{P(x)} is the [n/m] Padé approximant of g(x), as it matches the Taylor series of g(x) up to x^{m+n}. For example, for f(x) = \sqrt{1+x}, the [2/2] Padé approximant can be computed, and g(x) = \frac{1}{\sqrt{1+x}} yields a consistent [2/2] approximant by taking the reciprocal.
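As a concrete check of this property (a sketch I am adding; the [2/2] coefficients below are the ones derived in the post on \sqrt{1+x}), we can verify in exact rational arithmetic that swapping the numerator and denominator of the [2/2] approximant of \sqrt{1+x} reproduces the Maclaurin series of \frac{1}{\sqrt{1+x}} through x^4:

```python
from fractions import Fraction as F

def binom_series(alpha, order):
    """Maclaurin coefficients of (1 + x)^alpha up to x^order."""
    c, out = F(1), []
    for k in range(order + 1):
        out.append(c)
        c = c * (alpha - k) / (k + 1)
    return out

def series_of_ratio(num, den, order):
    """Maclaurin coefficients of num(x)/den(x) (den[0] != 0) up to x^order."""
    num = num + [F(0)] * (order + 1 - len(num))
    out = []
    for k in range(order + 1):
        s = num[k] - sum(den[j] * out[k - j] for j in range(1, min(k, len(den) - 1) + 1))
        out.append(s / den[0])
    return out

# [2/2] approximant of sqrt(1+x): P(x)/Q(x) (coefficients computed elsewhere in these notes)
P = [F(1), F(5, 4), F(5, 16)]
Q = [F(1), F(3, 4), F(1, 16)]

# Property 1: Q(x)/P(x) must match the series of 1/sqrt(1+x) through x^4 ...
assert series_of_ratio(Q, P, 4) == binom_series(F(-1, 2), 4)
# ... just as P(x)/Q(x) matches the series of sqrt(1+x) through x^4
assert series_of_ratio(P, Q, 4) == binom_series(F(1, 2), 4)
print("Property 1 verified for sqrt(1+x)")
```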

    Property 2

    If f is even (f(x) = f(-x)) and at least C^{m+n}, and the [m/n] Padé approximant exists and is unique (guaranteed if the Hankel determinant is non-zero), then P(m,n)_f = \frac{P(x)}{Q(x)} is even, i.e., P(x) = P(-x) and Q(x) = Q(-x).

    Proof: Since f(x) = f(-x), the Taylor series of f contains only even powers. For P(m,n)_f = \frac{P(x)}{Q(x)}, we have:

    \displaystyle f(x) Q(x) - P(x) = O(x^{m+n+1}).

    Evaluate at -x:

    \displaystyle f(-x) Q(-x) - P(-x) = f(x) Q(-x) - P(-x) = O(x^{m+n+1}),

    since f(-x) = f(x), and the error term remains of order x^{m+n+1}. Thus, \frac{P(-x)}{Q(-x)} satisfies the same Padé condition as \frac{P(x)}{Q(x)}. By uniqueness of the [m/n] approximant (assuming non-zero Hankel determinant), we conclude:

    \displaystyle P(x) = P(-x), \quad Q(x) = Q(-x).
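A quick computational illustration of Property 2 (my addition, using the 2×2 linear system from the post Computing Padé approximants, solved by Cramer’s rule): for the even function cos(x), the odd coefficients of the [2/2] approximant come out exactly zero.

```python
from fractions import Fraction as F

# Maclaurin coefficients of cos(x): C0..C4
C = [F(1), F(0), F(-1, 2), F(0), F(1, 24)]

# [2/2]: solve [[C1, C2], [C2, C3]] (B2, B1)^T = (-C3, -C4)^T by Cramer's rule
det = C[1] * C[3] - C[2] * C[2]
B2 = (-C[3] * C[3] + C[2] * C[4]) / det
B1 = (-C[1] * C[4] + C[3] * C[2]) / det
A0 = C[0]
A1 = C[1] + C[0] * B1
A2 = C[2] + C[1] * B1 + C[0] * B2

# cos is even, so the approximant is even: the odd coefficients vanish
assert A1 == 0 and B1 == 0
print(A0, A1, A2, B1, B2)  # 1 0 -5/12 0 1/12
```

The result is the well-known even approximant \frac{1 - \frac{5}{12}x^2}{1 + \frac{1}{12}x^2}.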
  • Padé approximant of 1/(1-x)

    In this post we will have a look at the Padé approximants of \frac{1}{1-x}. The Maclaurin series of \frac{1}{1-x} is:

    \displaystyle 1 + x + x^2 + x^3 + x^4 + x^5 + \cdots

which converges for x \in ]-1,1[. We can calculate the corresponding P(1,1):

\displaystyle \frac{A_0 + A_1 x}{1 + B_1 x} = 1 + x + x^2 + \cdots
\displaystyle A_0 + A_1 x = (1 + B_1 x)(1 + x + x^2 + \cdots)
\displaystyle = 1 + x + x^2 + \cdots + B_1 x + B_1 x^2 + B_1 x^3 + \cdots
\displaystyle = 1 + (1 + B_1) x + (1 + B_1) x^2 + (1 + B_1) x^3 + \cdots

    Keeping only degrees up to 2:

    \displaystyle = 1 + (1 + B_1) x + (1 + B_1) x^2

    This implies:

    \displaystyle A_0 = 1
    \displaystyle A_1 = 1 + B_1
    \displaystyle 1 + B_1 = 0

    Therefore, A_0 = 1, A_1 = 0 and B_1 = -1 and the P(1,1) is:

    \displaystyle \boxed{P(1,1)(x) = \frac{1}{1-x}}

    This is an exceptional result since the Padé approximant P(1,1) is equal to the function it is supposed to approximate. This result is very attractive since it suggests a way to solve very hard problems using series up to some terms. Then using Padé approximation we may hope to recover the exact solution (or at least a sufficient approximation for a specific application).

    Let’s imagine that we would like to solve the following differential equation:

    \displaystyle y' = y^2 \text{ with initial condition } y(0) = 1

    This differential equation can be solved exactly since it is separable. The solution is y(x) = \frac{1}{1-x}. Let’s pretend for a moment that solving this equation is a very difficult problem because we don’t know how to solve separable differential equations.

    A possible approach is to consider a solution of the form:

    \displaystyle y(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots

    Then we have:

    \displaystyle y(x)' = a_1 + 2 a_2 x + 3 a_3 x^2 + \cdots
    \displaystyle y(x)^2 = (a_0 + a_1 x + a_2 x^2 + \cdots)^2
    \displaystyle = a_0^2 + a_0 a_1 x + a_0 a_2 x^2 + \cdots
    \displaystyle + a_0 a_1 x + a_1^2 x^2 + \cdots
    \displaystyle + a_2 a_0 x^2 + \cdots
    \displaystyle = a_0^2 + 2 a_0 a_1 x + (a_1^2 + 2 a_0 a_2) x^2 + \cdots

We can write the differential equation keeping only terms up to degree two:

    \displaystyle y(x)' = y(x)^2
    \displaystyle a_1 + 2 a_2 x + 3 a_3 x^2 = a_0^2 + 2 a_0 a_1 x + (a_1^2 + 2 a_0 a_2) x^2

    we obtain:

    \displaystyle a_1 = a_0^2
    \displaystyle 2 a_2 = 2 a_0 a_1
    \displaystyle 3 a_3 = a_1^2 + 2 a_0 a_2

    Setting a_0 = 1 (since y(0) = 1) implies a_1 = a_2 = a_3 = 1. An approximation of the solution to the differential equation is therefore:

    \displaystyle 1 + x + x^2 + x^3

    As presented above, the corresponding P(1,1) = \frac{1}{1-x} which is the exact solution to the differential equation. In fact, the diagonal sequence of Padé approximants (P(1,1), P(2,2), … P(n,n)) recovers \frac{1}{1-x}. This is, of course, a special case.

So, for this differential equation, truncating the power-series solution and passing to a Padé approximant recovers the exact solution.
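The whole procedure can be sketched in a few lines (an added illustration, not from the original post): generate the series coefficients of y' = y^2 recursively, then form P(1,1) from the first three coefficients.

```python
from fractions import Fraction as F

# Power-series solution of y' = y^2, y(0) = 1:
# matching the coefficient of x^k gives (k+1) a_{k+1} = sum_{i=0}^{k} a_i a_{k-i}
a = [F(1)]
for k in range(4):
    a.append(sum(a[i] * a[k - i] for i in range(k + 1)) / (k + 1))
assert a == [1, 1, 1, 1, 1]  # the Maclaurin coefficients of 1/(1-x)

# P(1,1) from C0, C1, C2: B1 = -C2/C1, A0 = C0, A1 = C1 + C0*B1
C0, C1, C2 = a[0], a[1], a[2]
B1 = -C2 / C1
A0, A1 = C0, C1 + C0 * B1
assert (A0, A1, B1) == (1, 0, -1)
print("P(1,1) = 1/(1 - x), the exact solution")
```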

  • Padé approximants of sqrt(1+x) and 1/sqrt(1+x)

    In this post we will have a look at the Padé approximants of \sqrt{1+x} and \frac{1}{\sqrt{1+x}}. The Maclaurin series of \sqrt{1+x} is:

    \displaystyle 1+ \frac{1}{2}x - \frac{1}{8} x^2 + \frac{1}{16}x^3 - \frac{5}{128} x^4 + \frac{7}{256}x^5 + \dots

The Maclaurin series of \frac{1}{\sqrt{1+x}} is:

    \displaystyle 1 - \frac{1}{2}x + \frac{3}{8}x^2 - \frac{5}{16}x^3 + \frac{35}{128}x^4 + \dots

According to the section concerning the calculation of Padé approximants using a matrix notation (see post Computing Padé approximants), we have to solve two linear systems sequentially. For \sqrt{1+x} we therefore have to first solve:

    \displaystyle \begin{pmatrix} C_1 & C_2 \\ C_2 & C_3 \end{pmatrix} \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = -\begin{pmatrix} C_3 \\ C_4 \end{pmatrix}
    \displaystyle \begin{pmatrix} \frac{1}{2} & -\frac{1}{8} \\ -\frac{1}{8} & \frac{1}{16} \end{pmatrix} \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{16} \\ \frac{5}{128} \end{pmatrix}
    \displaystyle \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & -\frac{1}{8} \\ -\frac{1}{8} & \frac{1}{16} \end{pmatrix}^{-1} \begin{pmatrix} -\frac{1}{16} \\ \frac{5}{128} \end{pmatrix}
    \displaystyle \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} 4 & 8 \\ 8 & 32 \end{pmatrix} \begin{pmatrix} -\frac{1}{16} \\ \frac{5}{128} \end{pmatrix}
    \displaystyle \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} \frac{1}{16} \\ \frac{3}{4} \end{pmatrix}

Substituting the B_n coefficients calculated above into the remaining linear system, we have:

    \displaystyle \begin{pmatrix} C_0 & 0 & 0 \\ C_1 & C_0 & 0 \\ C_2 & C_1 & C_0 \end{pmatrix} \begin{pmatrix} B_0 \\ B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \end{pmatrix}
    \displaystyle \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{2} & 1 & 0 \\ -\frac{1}{8} & \frac{1}{2} & 1 \end{pmatrix} \begin{pmatrix} 1 \\ \frac{3}{4} \\ \frac{1}{16} \end{pmatrix} = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \end{pmatrix}

    From this system we obtain A_0 = 1, A_1 = \frac{5}{4}, A_2 = \frac{5}{16}. Therefore:

    \displaystyle P(2,2) = \frac{A_0 + A_1 x + A_2 x^2}{1 + B_1 x + B_2 x^2}
    \displaystyle = \frac{1 +\frac{5}{4} x + \frac{5}{16} x^2}{1 + \frac{3}{4} x + \frac{1}{16} x^2}

    For \frac{1}{\sqrt{1+x}} we have to first solve:

    \displaystyle \begin{pmatrix} C_1 & C_2 \\ C_2 & C_3 \end{pmatrix} \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = -\begin{pmatrix} C_3 \\ C_4 \end{pmatrix}
    \displaystyle \begin{pmatrix} -\frac{1}{2} & \frac{3}{8} \\ \frac{3}{8} & -\frac{5}{16} \end{pmatrix} \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} \frac{5}{16} \\ -\frac{35}{128} \end{pmatrix}
    \displaystyle \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{2} & \frac{3}{8} \\ \frac{3}{8} & -\frac{5}{16} \end{pmatrix}^{-1} \begin{pmatrix} \frac{5}{16} \\ -\frac{35}{128} \end{pmatrix}
    \displaystyle \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} -20 & -24 \\ -24 & -32 \end{pmatrix} \begin{pmatrix} \frac{5}{16} \\ -\frac{35}{128} \end{pmatrix}
    \displaystyle \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} \frac{5}{16} \\ \frac{5}{4} \end{pmatrix}

Substituting the B_n coefficients calculated above into the remaining linear system, we have:

    \displaystyle \begin{pmatrix} C_0 & 0 & 0 \\ C_1 & C_0 & 0 \\ C_2 & C_1 & C_0 \end{pmatrix} \begin{pmatrix} B_0 \\ B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \end{pmatrix}
    \displaystyle \begin{pmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ \frac{3}{8} & -\frac{1}{2} & 1 \end{pmatrix} \begin{pmatrix} 1 \\ \frac{5}{4} \\ \frac{5}{16} \end{pmatrix} = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \end{pmatrix}

    From this system we obtain A_0 = 1, A_1 = \frac{3}{4}, A_2 = \frac{1}{16}. Therefore:

    \displaystyle P(2,2) = \frac{A_0 + A_1 x + A_2 x^2}{1 + B_1 x + B_2 x^2}
    \displaystyle = \frac{1 + \frac{3}{4} x + \frac{1}{16} x^2}{1 + \frac{5}{4} x + \frac{5}{16} x^2}

    We observe that:

    \displaystyle P(2,2)_{\frac{1}{\sqrt{1+x}}} = \frac{1}{P(2,2)_{\sqrt{1+x}}}

These calculations suggest that, if g = \frac{1}{f}, then:

P(n,m)_{g} = \frac{1}{P(m,n)_{f}}

Where P(n,m)_{g} and P(m,n)_{f} are the Padé approximants of g and f respectively (the numerator and denominator degrees swap; here m = n = 2, so the swap is invisible). This proposition is actually true and is proved formally as Property 1 in the post Two properties of Padé Approximants.
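The two resolutions above can be reproduced in exact rational arithmetic (an added sketch; the 2×2 system is solved by Cramer’s rule instead of inverting the matrix):

```python
from fractions import Fraction as F

def pade22(C):
    """P(2,2) coefficients (A, B) from series coefficients C = [C0..C4]."""
    # Second system: [[C1, C2], [C2, C3]] (B2, B1)^T = (-C3, -C4)^T (Cramer)
    det = C[1] * C[3] - C[2] * C[2]
    B2 = (-C[3] * C[3] + C[2] * C[4]) / det
    B1 = (-C[1] * C[4] + C[2] * C[3]) / det
    # First system: A_i = sum_{j<=i} C_{i-j} B_j, with B0 = 1
    B = [F(1), B1, B2]
    A = [sum(C[i - j] * B[j] for j in range(i + 1)) for i in range(3)]
    return A, B

sqrt_series = [F(1), F(1, 2), F(-1, 8), F(1, 16), F(-5, 128)]       # sqrt(1+x)
inv_sqrt_series = [F(1), F(-1, 2), F(3, 8), F(-5, 16), F(35, 128)]  # 1/sqrt(1+x)

A_f, B_f = pade22(sqrt_series)
A_g, B_g = pade22(inv_sqrt_series)
assert A_f == [F(1), F(5, 4), F(5, 16)] and B_f == [F(1), F(3, 4), F(1, 16)]
# The reciprocal series swaps numerator and denominator
assert A_f == B_g and B_f == A_g
print("P(2,2) of 1/sqrt(1+x) is the reciprocal of P(2,2) of sqrt(1+x)")
```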

  • Padé approximants of sec(x)

In the previous post we computed the Padé approximants of the exp(x) function. The approximation was not very impressive compared with the Maclaurin series of exp(x), since the latter converges for all x. In this post we will have a look at the Padé approximants and Maclaurin series of sec(x).

    First let’s recall the graph of sec(x) (see figure 1). sec(x) is a function with vertical asymptotes that cannot be approximated globally by a Taylor series. The Maclaurin series of sec(x) is:

    \displaystyle sec(x) = 1 + \frac{1}{2!}x^2 + \frac{5}{4!} x^4 + \frac{61}{6!} x^6 + \frac{1385}{8!} x^8 + \cdots
    \displaystyle = \sum_{n=0}^{\infty} \frac{E_n}{(2n)!} x^{2n}

Where E_n are the so-called Euler (secant) numbers, with E_0 = 1:

    \displaystyle E_1 = 1, \quad E_2 = 5, \quad E_3 = 61, \quad E_4 = 1385, \quad E_5 = 50521, \quad E_6 = 2702765

    We would like to compute the P(4,4) approximant of sec(x). The first step is to have a look at the corresponding Hankel determinant (which is the determinant of the Hankel matrix):

    \displaystyle H_{n,m}(f) = \det \left( \begin{array}{cccc} C_{m-n+1} & C_{m-n+2} & \cdots & C_m \\ C_{m-n+2} & C_{m-n+3} & \cdots & C_{m+1} \\ \vdots & \vdots & \ddots & \vdots \\ C_m & C_{m+1} & \cdots & C_{m+n-1} \end{array} \right)

    For m = 4 and n = 4 we have:

    \displaystyle H_{4,4}(f) = \det \left( \begin{array}{cccc} C_1 & C_2 & C_3 & C_4 \\ C_2 & C_3 & C_4 & C_5 \\ C_3 & C_4 & C_5 & C_6 \\ C_4 & C_5 & C_6 & C_7 \end{array} \right)

    and the corresponding Hankel determinant for the P(4,4) of sec(x) is:

    \displaystyle H_{4,4}(sec(x)) = \det \left( \begin{array}{cccc} 0 & \frac{1}{2!} & 0 & \frac{5}{4!} \\ \frac{1}{2!} & 0 & \frac{5}{4!} & 0 \\ 0 & \frac{5}{4!} & 0 & \frac{61}{6!} \\ \frac{5}{4!} & 0 & \frac{61}{6!} & 0 \end{array} \right) = 1.08507 \times 10^{-6}

Since the determinant is not equal to zero, we can invert the Hankel matrix and solve the systems to compute the Padé coefficients of P(4,4) for sec(x) (see post Computing Padé approximants). The inverse of the Hankel matrix is (entries rounded to four decimals; -81.3333 stands for the exact value -\frac{244}{3}):

    \displaystyle \left( \begin{array}{cccc} 0 & -81.3333 & 0 & 200 \\ -81.3333 & 0 & 200 & 0 \\ 0 & 200 & 0 & -480 \\ 200 & 0 & -480 & 0 \end{array} \right)

    We have to solve this first linear system:

    \displaystyle \left( \begin{array}{cccc} 0 & \frac{1}{2!} & 0 & \frac{5}{4!} \\ \frac{1}{2!} & 0 & \frac{5}{4!} & 0 \\ 0 & \frac{5}{4!} & 0 & \frac{61}{6!} \\ \frac{5}{4!} & 0 & \frac{61}{6!} & 0 \end{array} \right) \left( \begin{array}{c} B_4 \\ B_3 \\ B_2 \\ B_1 \end{array} \right) = - \left( \begin{array}{c} C_5 \\ C_6 \\ C_7 \\ C_8 \end{array} \right) = \left( \begin{array}{c} 0 \\ -\frac{61}{6!} \\ 0 \\ -\frac{1385}{8!} \end{array} \right)
    \displaystyle \left( \begin{array}{cccc} 0 & \frac{1}{2!} & 0 & \frac{5}{4!} \\ \frac{1}{2!} & 0 & \frac{5}{4!} & 0 \\ 0 & \frac{5}{4!} & 0 & \frac{61}{6!} \\ \frac{5}{4!} & 0 & \frac{61}{6!} & 0 \end{array} \right)^{-1} \left( \begin{array}{c} 0 \\ -\frac{61}{6!} \\ 0 \\ -\frac{1385}{8!} \end{array} \right) = \left( \begin{array}{c} B_4 \\ B_3 \\ B_2 \\ B_1 \end{array} \right)
\displaystyle \left( \begin{array}{cccc} 0 & -81.3333 & 0 & 200 \\ -81.3333 & 0 & 200 & 0 \\ 0 & 200 & 0 & -480 \\ 200 & 0 & -480 & 0 \end{array} \right) \left( \begin{array}{c} 0 \\ -\frac{61}{6!} \\ 0 \\ -\frac{1385}{8!} \end{array} \right) = \left( \begin{array}{c} \frac{313}{15120} \\ 0 \\ -\frac{115}{252} \\ 0 \end{array} \right)

Solving the system above (in exact arithmetic) gives the B_n coefficients (see results below). Now, according to the calculations presented in the post Computing Padé approximants, we have to solve the second linear system to compute the A_m coefficients:

    \displaystyle \left( \begin{array}{ccccc} C_0 & 0 & 0 & 0 & 0 \\ C_1 & C_0 & 0 & 0 & 0 \\ C_2 & C_1 & C_0 & 0 & 0 \\ C_3 & C_2 & C_1 & C_0 & 0 \\ C_4 & C_3 & C_2 & C_1 & C_0 \end{array} \right) \left( \begin{array}{c} 1 \\ B_1 \\ B_2 \\ B_3 \\ B_4 \end{array} \right) = \left( \begin{array}{c} A_0 \\ A_1 \\ A_2 \\ A_3 \\ A_4 \end{array} \right)
\displaystyle \left( \begin{array}{ccccc} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ \frac{1}{2!} & 0 & 1 & 0 & 0 \\ 0 & \frac{1}{2!} & 0 & 1 & 0 \\ \frac{5}{4!} & 0 & \frac{1}{2!} & 0 & 1 \end{array} \right) \left( \begin{array}{c} 1 \\ 0 \\ -\frac{115}{252} \\ 0 \\ \frac{313}{15120} \end{array} \right) = \left( \begin{array}{c} A_0 \\ A_1 \\ A_2 \\ A_3 \\ A_4 \end{array} \right)
\displaystyle A_0 = 1, \quad A_1 = 0, \quad A_2 = \frac{1}{2!} - \frac{115}{252} = \frac{11}{252}, \quad A_3 = 0, \quad A_4 = \frac{5}{24} - \frac{115}{504} + \frac{313}{15120} = \frac{13}{15120}
\displaystyle B_0 = 1, \quad B_1 = 0, \quad B_2 = -\frac{115}{252}, \quad B_3 = 0, \quad B_4 = \frac{313}{15120}

    This implies that for sec(x):

    \displaystyle P(4,4) = \frac{1 + A_2 x^2 + A_4 x^4}{1 + B_2 x^2 + B_4 x^4}
\displaystyle = \frac{1 + \frac{11}{252} x^2 + \frac{13}{15120} x^4}{1 - \frac{115}{252} x^2 + \frac{313}{15120} x^4}

The graph of P(4,4) for sec(x) is presented in figure 2. The graph of sec(x) together with its P(4,4) approximant is presented in figure 3. We can see that the P(4,4) approximant (in green) keeps approximating sec(x) beyond its vertical asymptotes (pink vertical lines), in contrast to the Taylor series presented in figure 1. Moreover, P(4,4) converges faster than the Taylor series.

    It is also interesting to note that the ‘information’ needed to construct the Padé approximants of the function sec(x) has been extracted from the truncated Taylor series. Despite this fact, the Padé approximant provides a better approximation of the sec(x) function than the series from which it is derived.
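Because sec(x) is even, the systems above collapse to a 2×2 problem in B_2 and B_4, which can be solved exactly (an added sketch; it confirms the exact values \frac{313}{15120} and \frac{13}{15120} behind the decimals):

```python
from fractions import Fraction as F
from math import factorial, cos

# Even Maclaurin coefficients of sec(x): C_{2n} = E_n / (2n)!, with E_0 = 1
E = [1, 1, 5, 61, 1385]
C = {2 * n: F(E[n], factorial(2 * n)) for n in range(5)}

# Matching x^6 and x^8:  C4*B2 + C2*B4 = -C6  and  C6*B2 + C4*B4 = -C8  (Cramer)
det = C[4] * C[4] - C[2] * C[6]
B2 = (-C[6] * C[4] + C[2] * C[8]) / det
B4 = (-C[4] * C[8] + C[6] * C[6]) / det
A2 = C[2] + B2
A4 = C[4] + C[2] * B2 + B4
print(A2, A4, B2, B4)  # 11/252 13/15120 -115/252 313/15120

# Sanity check against sec(1) = 1/cos(1), evaluating P(4,4) at x = 1
p44 = (1 + float(A2) + float(A4)) / (1 + float(B2) + float(B4))
assert abs(p44 - 1 / cos(1)) < 1e-5
```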

  • Padé approximants of exp(x)

We can organize and present the Padé approximants P(m,n) in a table like this:

     P(m,n)   0   1   2   3   ... 
     0   P(0,0)   P(1,0)   P(2,0)   P(3,0)   ... 
     1   P(0,1)   P(1,1)   P(2,1)   P(3,1)   ... 
     2   P(0,2)   P(1,2)   P(2,2)   P(3,2)   ... 
     3   P(0,3)   P(1,3)   P(2,3)   P(3,3)   ... 
     ...   ...   ...   ...   ...   ... 

The table above shows the first Padé approximants, in order. This is one way to present and organize the Padé approximants. We can use the procedure presented in the previous posts to compute the Padé approximants of the exponential function \exp(x). Solving the systems presented in the previous posts leads to the following table:

     P(m,n)   0   1   2   3   ... 
     0   1   1 + x   1 + x + \frac{x^2}{2}   1 + x + \frac{x^2}{2} + \frac{x^3}{6}   ... 
     1   \frac{1}{1 - x}   \frac{1 + \frac{1}{2}x}{1 - \frac{1}{2}x}   \frac{1 + \frac{2}{3}x + \frac{1}{6}x^2}{1 - \frac{1}{3}x}   \frac{1 + \frac{3}{4}x + \frac{1}{4}x^2 + \frac{1}{24}x^3}{1 - \frac{1}{4}x}   ... 
     2   \frac{1}{1 - x + \frac{1}{2}x^2}   \frac{1 + \frac{1}{3}x}{1 - \frac{2}{3}x + \frac{1}{6}x^2}   \frac{1 + \frac{1}{2}x + \frac{1}{12}x^2}{1 - \frac{1}{2}x + \frac{1}{12}x^2}   \frac{1 + \frac{3}{5}x + \frac{3}{20}x^2 + \frac{1}{60}x^3}{1 - \frac{2}{5}x + \frac{1}{20}x^2}   ... 
     3   \frac{1}{1 - x + \frac{1}{2}x^2 - \frac{1}{6}x^3}   \frac{1 + \frac{1}{4}x}{1 - \frac{3}{4}x + \frac{1}{4}x^2 - \frac{1}{24}x^3}   \frac{1 + \frac{2}{5}x + \frac{1}{20}x^2}{1 - \frac{3}{5}x + \frac{3}{20}x^2 - \frac{1}{60}x^3}   \frac{1 + \frac{1}{2}x + \frac{1}{10}x^2 + \frac{1}{120}x^3}{1 - \frac{1}{2}x + \frac{1}{10}x^2 - \frac{1}{120}x^3}   ... 
     ...   ...   ...   ...   ...   ... 

    Setting x = 1, we obtain the following values:

 P(m,n)   0   1   2   3 
 0   1.000000   2.000000   2.500000   2.666667 
 1   ∞   3.000000   2.750000   2.722222 
 2   2.000000   2.666667   2.714286   2.717949 
 3   3.000000   2.727273   2.718750   2.718310 

(The entry for P(0,1) is marked ∞ because P(0,1) = \frac{1}{1-x} has a pole at x = 1.)

    The ‘relative error’ is defined as:

    \displaystyle \text{Relative error} := \frac{\text{Approximation} - \text{Exact value}}{\text{Exact value}}

Relative errors of Padé approximants  P(m,n)  of  e^x  evaluated at x = 1, using the exact value  e \approx 2.718282 , are shown in the table below:

 P(m,n)   0   1   2   3 
 0   -0.632121   -0.264241   -0.080301   -0.018988 
 1   ∞   0.103638   0.011662   0.001449 
 2   -0.264241   -0.018988   -0.001471   -0.000122 
 3   0.103638   0.003307   0.000172   0.000010 

    The Padé approximants exhibit an alternating sign pattern in their relative errors. This indicates that the Padé approximants oscillate around the true value  e , approaching it from both sides. In contrast, the Taylor approximations converge monotonically from below.
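The oscillation can be checked directly (a short added sketch; the diagonal approximants below are transcribed from the table above and evaluated at x = 1):

```python
from fractions import Fraction as F
from math import e

# Diagonal Pade approximants of exp evaluated at x = 1 (from the table above)
p00 = F(1)
p11 = (1 + F(1, 2)) / (1 - F(1, 2))
p22 = (1 + F(1, 2) + F(1, 12)) / (1 - F(1, 2) + F(1, 12))
p33 = (1 + F(1, 2) + F(1, 10) + F(1, 120)) / (1 - F(1, 2) + F(1, 10) - F(1, 120))
assert (p11, p22, p33) == (3, F(19, 7), F(193, 71))

errors = [(float(p) - e) / e for p in (p00, p11, p22, p33)]
print(errors)

# Signs alternate: the diagonal oscillates around e ...
assert errors[0] < 0 < errors[1] and errors[2] < 0 < errors[3]
# ... while the error magnitude shrinks rapidly
assert abs(errors[1]) < abs(errors[0]) and abs(errors[3]) < abs(errors[2]) < abs(errors[1])
```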

  • Hankel determinant

In the previous posts, we have seen that in order to compute the Padé coefficients of P(m,n) corresponding to a given power series, we have to invert the following matrix:

    \displaystyle \begin{pmatrix} C_{m-n+1} & C_{m-n+2} & \ldots & C_{m} \\ C_{m-n+2} & C_{m-n+3} & \ldots & C_{m+1} \\ \vdots & & & \vdots \\ C_{m} & C_{m+1} & \ldots & C_{m+n-1} \end{pmatrix}

Where C_l are the coefficients of the Taylor series. The matrix above is called a ‘Hankel matrix’: a symmetric square matrix in which each ascending skew-diagonal from left to right is constant. For example, a Hankel matrix of size 5 can be written like this:

    \displaystyle \begin{pmatrix} a & b & c & d & e \\ b & c & d & e & f \\ c & d & e & f & g \\ d & e & f & g & h \\ e & f & g & h & i \end{pmatrix}

    Let’s make some observations on the Hankel determinant H_{n,m}(f):

    \displaystyle H_{n,m}(f) := \begin{vmatrix} C_{m-n+1} & C_{m-n+2} & \ldots & C_{m} \\ C_{m-n+2} & C_{m-n+3} & \ldots & C_{m+1} \\ \vdots & & & \vdots \\ C_{m} & C_{m+1} & \ldots & C_{m+n-1} \end{vmatrix}

This determinant has n columns and n rows. We can also index the entries of the determinant using the notation:

    \displaystyle H_{n,m}(f) := \begin{vmatrix} d(1,1) & d(1,2) & \ldots & d(1,n) \\ d(2,1) & d(2,2) & \ldots & d(2,n) \\ \vdots & & & \vdots \\ d(n,1) & d(n,2) & \ldots & d(n,n) \end{vmatrix}

Where d(1,1) := C_{m-n+1}, etc. This indexing of the entries implies that:

    \displaystyle d(i,j) = C_{m-n+i+j-1}

    So that:

    \displaystyle d(1,1) = C_{m-n+1+1-1} = C_{m-n+1}
    \displaystyle d(1,2) = C_{m-n+1+2-1} = C_{m-n+2}
    \displaystyle d(2,1) = C_{m-n+2+1-1} = C_{m-n+2}
    \displaystyle \ldots
    \displaystyle d(n,n) = C_{m-n+n+n-1} = C_{m+n-1}

If f(x) is an even function of class C^{\infty}, we see that the odd coefficients C_{2p+1} = \frac{f^{(2p+1)}(0)}{(2p+1)!} = 0. In this case, every second entry in the Hankel matrix is zero. If f(x) is even, we can establish that if m and n are both odd then the Hankel determinant is zero:

The entry d(i,j) of the Hankel determinant is C_{m-n+i+j-1}. As stated before, this entry is zero for an even function if m-n+i+j-1 is odd. Now, if m and n are both odd, then m-n is even and m-n+i+j-1 is odd exactly when i + j is even. It follows that, in the case of an even function, the Hankel determinant is of the form:

\displaystyle \begin{vmatrix} 0 & b & 0 & d & \cdots \\ b & 0 & d & 0 & \cdots \\ 0 & d & 0 & f & \cdots \\ \vdots & \vdots & \vdots & & \ddots \end{vmatrix}

We observe that the odd-numbered rows of this determinant (which vanish in the odd-numbered positions) are all linear combinations of:

\displaystyle \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ \vdots \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ \vdots \end{pmatrix} \text{, etc.}

Since n is odd, there are \frac{n+1}{2} odd-numbered rows but only \frac{n-1}{2} such basis vectors, so the odd-numbered rows are linearly dependent and therefore the determinant is zero.

As an example we will derive the P(3,3) approximant of the cosine function. First we have to consider the Taylor series of cosine up to degree 3 + 3 = 6:

    \displaystyle \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \mathcal{O}(x^8)

    For m=3 and n=3 (m and n are both odd) the Hankel determinant is:

    \displaystyle H_{3,3}(f) = \begin{vmatrix} C_{1} & C_{2} & C_{3} \\ C_{2} & C_{3} & C_{4} \\ C_{3} & C_{4} & C_{5} \end{vmatrix}

The corresponding Hankel determinant for calculating the coefficients of the P(3,3) Padé approximant of the cosine series is therefore:

    \displaystyle H_{3,3}(cos) = \begin{vmatrix} 0 & -\frac{1}{2} & 0 \\ -\frac{1}{2} & 0 & \frac{1}{4!} \\ 0 & \frac{1}{4!} & 0 \end{vmatrix} = 0

    This implies that we cannot calculate the P(3,3) approximant for the cosine function.

In conclusion, for an even function like cosine, when m and n are both odd, the Hankel determinant H_{n,m}(f) is zero due to the linear dependence of its rows.
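This can be checked computationally (an added sketch using exact arithmetic; the determinant is expanded over permutations, which is fine for the small sizes involved):

```python
from fractions import Fraction as F
from itertools import permutations
from math import factorial

def cos_coeff(k):
    """k-th Maclaurin coefficient of cos(x)."""
    return F(0) if k % 2 else F((-1) ** (k // 2), factorial(k))

def perm_sign(p):
    """Sign of a permutation, by counting transpositions."""
    s, p = 1, list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def hankel_det(coeff, n, m):
    """Determinant of the n x n Hankel matrix with entries d(i,j) = C_{m-n+i+j-1}."""
    M = [[coeff(m - n + i + j + 1) for j in range(n)] for i in range(n)]  # 0-based i, j
    total = F(0)
    for p in permutations(range(n)):
        term = F(perm_sign(p))
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

assert hankel_det(cos_coeff, 1, 1) == 0         # m = n = 1: zero
assert hankel_det(cos_coeff, 3, 3) == 0         # m = n = 3: zero, as derived above
assert hankel_det(cos_coeff, 2, 2) == F(-1, 4)  # m = n = 2: non-zero
print("Hankel determinants of cos behave as predicted")
```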

  • Padé approximants of sin(x)

We are interested in the P(2,2) Padé approximant of the sine function. We first have to consider the Taylor series of sin(x) up to degree 2+2=4:

    \displaystyle sin(x) = x - \frac{x^3}{3!} + \mathcal{O}(x^5)

    For m=2 and n=2, the Hankel determinant (see previous post) is:

    \displaystyle H_{2,2}(f) = \begin{vmatrix} C_{1} & C_{2} \\ C_{2} & C_{3} \\ \end{vmatrix}

The Hankel determinant corresponding to the sine series above is therefore:

    \displaystyle H_{2,2}(sin) = \begin{vmatrix} 1 & 0 \\ 0 & -\frac{1}{6} \\ \end{vmatrix} = -\frac{1}{6}

    According to the previous section we have to first solve:

    \displaystyle \begin{pmatrix} 1 & 0 \\ 0 & -\frac{1}{6} \end{pmatrix} \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = - \begin{pmatrix} C_{3} \\ C_{4} \\ \end{pmatrix}

    Since C_3 = -\frac{1}{6} and C_4 = 0 we have to solve:

    \displaystyle \begin{pmatrix} 1 & 0 \\ 0 & -\frac{1}{6} \end{pmatrix} \begin{pmatrix} B_2 \\ B_1 \end{pmatrix} = \begin{pmatrix} \frac{1}{6} \\ 0 \\ \end{pmatrix}

    So B_2 = \frac{1}{6} and B_1 = 0.

    Finally we have to solve this system (B_0 being set to one):

    \displaystyle \begin{pmatrix} C_0 & 0 & 0 \\ C_1 & C_0 & 0 \\ C_2 & C_1 & C_0 \end{pmatrix} \begin{pmatrix} 1 \\ B_1 \\ B_2 \end{pmatrix} = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \end{pmatrix}
    \displaystyle \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ \frac{1}{6} \end{pmatrix} = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \end{pmatrix}

    This implies that A_0 = 0, A_1 = 1 and A_2 = 0. The P(2,2) Padé approximant for sin(x) is therefore:

    \displaystyle P(2,2) = \frac{x}{1 + \frac{1}{6}x^2}

    The graph of sin(x) and its corresponding P(2,2) approximant are shown in figure 1.
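A quick verification (my addition): expanding x/(1 + \frac{1}{6}x^2) by long division reproduces the sine series through x^4, and the approximant is already accurate near the origin.

```python
from fractions import Fraction as F
from math import sin

# Long division of x by (1 + x^2/6): out[k] = num[k] - B2 * out[k-2]
B2 = F(1, 6)
num = [F(0), F(1), F(0), F(0), F(0)]
out = []
for k in range(5):
    out.append(num[k] - (B2 * out[k - 2] if k >= 2 else 0))

# The expansion matches sin(x) = x - x^3/6 + O(x^5) through x^4
assert out == [0, 1, 0, F(-1, 6), 0]

# Numerical check near the origin
x = 0.5
assert abs(x / (1 + x**2 / 6) - sin(x)) < 1e-3
print(out)
```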

  • Computing Padé approximants

    We have seen that Padé approximants offer a more efficient and flexible method for approximating functions than the Taylor expansion. They have been used in many areas of mathematics and physics.

    We would like to find a more convenient way to calculate the Padé coefficients of any Taylor or Maclaurin series.

In the following posts, we assume the existence of the Padé approximants P(m,n). It is generally possible to find the coefficients of the Padé approximants, except in cases of degeneracy due to particular values of the coefficients of the corresponding series. We will not consider these cases in the rest of our study. Remember that we defined Padé approximants as rational functions:

    \displaystyle P(m,n) := \frac{A_0 + A_1 x + A_2 x^2 + \dots + A_m x^m}{1 + B_1 x + B_2 x^2 + \dots + B_n x^n}

If we want to solve a very hard, or even impossible-to-solve-exactly, problem using perturbation theory, we will end up with a power series like \sum_{k=0}^{l} C_k x^k. To convert this (possibly divergent) series to its corresponding Padé approximant we write (as described in the previous post):

    \displaystyle \frac{A_0 + A_1 x + \dots + A_m x^m}{1 + B_1 x + \dots + B_n x^n} = C_0 + C_1 x + \dots + C_{n+m}x^{n+m} + \mathcal{O}(x^{m+n+1})

    From a purely algorithmic point of view and in order to calculate the Padé coefficients we drop \mathcal{O}(x^{m+n+1}):

    \displaystyle (C_0 + C_1 x + \dots + C_{n+m}x^{n+m})(1 + B_1 x + \dots + B_n x^n) - (A_0 + A_1 x + \dots + A_m x^m) = 0
    \displaystyle \sum_{i=0}^{n+m}\sum_{j=0}^{n}C_iB_jx^{i+j} - (A_0 + A_1 x + \dots + A_m x^m)= 0

Comparing the coefficients of each power of x on the two sides of this equation (uniqueness of power series coefficients), we obtain the following set of equations:

    \displaystyle (C_0 B_0 - A_0) = 0 \quad (i+j = 0)
    \displaystyle (C_1 B_0 + C_0 B_1 - A_1)x = 0 \quad (i+j = 1)
    \displaystyle (C_2 B_0 + C_1 B_1 + C_0 B_2 - A_2)x^2 = 0 \quad (i+j = 2)
    \dots
    \displaystyle (C_m B_0 + C_{m-1} B_1 + \dots + C_{m-n} B_n - A_m)x^m = 0 \quad (i+j = m)
    \displaystyle (C_{m+1} B_0 + C_m B_1 + \dots + C_{m-n+1} B_n)x^{m+1} = 0 \quad (i+j = m+1)
    \dots
    \displaystyle (C_{m+n} B_0 + C_{m+n-1} B_1 + \dots + C_m B_n)x^{m+n} = 0 \quad (i+j = m+n)

Here we used A_k = 0 for k > m. We can split this system of equations into two parts and get rid of the x-terms: one part containing the A_i coefficients (up to i+j = m) and one part without A_i coefficients. The first system of equations is:

    \displaystyle C_0 B_0 - A_0 = 0
    \displaystyle C_1 B_0 + C_0 B_1 - A_1 = 0
    \displaystyle C_2 B_0 + C_1 B_1 + C_0 B_2 - A_2 = 0
    \dots
    \displaystyle C_m B_0 + C_{m-1} B_1 + \dots + C_{m-n} B_n - A_m = 0

    The second system of equations:

    \displaystyle C_{m+1} B_0 + C_m B_1 + \dots + C_{m-n+1} B_n = 0
    \dots
    \displaystyle C_{m+n} B_0 + C_{m+n-1} B_1 + \dots + C_m B_n = 0

    Setting B_0 = 1 without loss of generality the second system of equations becomes:

    \displaystyle C_m B_1 + \dots + C_{m-n+1} B_n = -C_{m+1}
    \displaystyle C_{m+1} B_1 + \dots + C_{m-n+2} B_n = -C_{m+2}
    \dots
    \displaystyle C_{m+n-1} B_1 + \dots + C_m B_n = -C_{m+n}

    Changing the order of the coefficients gives:

    \displaystyle C_{m-n+1} B_n + \dots + C_m B_1 = -C_{m+1}
    \displaystyle C_{m-n+2} B_n + \dots + C_{m+1} B_1 = -C_{m+2}
    \dots
    \displaystyle C_m B_n + \dots + C_{m+n-1} B_1 = -C_{m+n}

    We can write both systems using matrix notation:

    \displaystyle \left( \begin{array}{cccccc} C_0 & 0 & \dots & & & 0 \\ C_1 & C_0 & 0 & \dots & & 0 \\ C_2 & C_1 & C_0 & 0 & \dots & 0 \\ \vdots & & & & & \vdots \\ C_m & \dots & & & \dots & C_{m-n} \end{array} \right) \left( \begin{array}{c} B_0 \\ B_1 \\ B_2 \\ \vdots \\ B_n \end{array} \right) = \left( \begin{array}{c} A_0 \\ A_1 \\ A_2 \\ \vdots \\ A_m \end{array} \right)
    \displaystyle \left( \begin{array}{ccccc} C_{m-n+1} & C_{m-n+2} & \dots & & C_m \\ C_{m-n+2} & C_{m-n+3} & \dots & & C_{m+1} \\ \vdots & & & & \vdots \\ C_m & C_{m+1} & & \dots & C_{m+n-1} \end{array} \right) \left( \begin{array}{c} B_n \\ B_{n-1} \\ \vdots \\ B_1 \end{array} \right) = - \left( \begin{array}{c} C_{m+1} \\ C_{m+2} \\ \vdots \\ C_{m+n} \end{array} \right)

In practice, we set C_k = 0 whenever k < 0 (so the calculation starts where m-n+k \geq 0). The resolution of the second system gives the values of the B_j coefficients, which are then injected into the first system to obtain the A_i coefficients (using B_0 = 1). So if the following determinant (called a ‘Hankel determinant’):

    \displaystyle H_{n,m}(f) := \left| \begin{array}{ccccc} C_{m-n+1} & C_{m-n+2} & \dots & & C_m \\ C_{m-n+2} & C_{m-n+3} & \dots & & C_{m+1} \\ \vdots & & & & \vdots \\ C_m & C_{m+1} & & \dots & C_{m+n-1} \end{array} \right|

is non-zero, i.e.:

    \displaystyle H_{n,m}(f) \neq 0

the Padé approximant is unique. The calculation of the Padé coefficients yields a unique solution for any Taylor or Maclaurin series, provided the Hankel determinant is non-zero and excluding cases of degeneracy due to specific coefficient values.
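The two matrix systems translate directly into code. Below is a minimal generic solver in exact rational arithmetic (a sketch under the assumption that the Hankel determinant is non-zero; Gaussian elimination stands in for the matrix inversion, and the helper names are my own):

```python
from fractions import Fraction as F

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]  # augmented matrix
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)  # assumes non-singular
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def pade(C, m, n):
    """Coefficients (A, B) of P(m,n) from series coefficients C[0..m+n]."""
    c = lambda k: C[k] if k >= 0 else F(0)  # C_k = 0 for k < 0
    # Second system: row i is C_{m-n+1+i} B_n + ... + C_{m+i} B_1 = -C_{m+1+i}
    M = [[c(m - n + 1 + i + j) for j in range(n)] for i in range(n)]
    Brev = solve(M, [-c(m + 1 + i) for i in range(n)])  # (B_n, ..., B_1)
    B = [F(1)] + Brev[::-1]                             # (B_0, B_1, ..., B_n)
    # First system: A_i = sum_j C_{i-j} B_j
    A = [sum(c(i - j) * B[j] for j in range(n + 1)) for i in range(m + 1)]
    return A, B

# Example: P(2,2) and P(1,1) of exp(x) from C = [1, 1, 1/2, 1/6, 1/24]
C = [F(1), F(1), F(1, 2), F(1, 6), F(1, 24)]
assert pade(C, 2, 2) == ([F(1), F(1, 2), F(1, 12)], [F(1), F(-1, 2), F(1, 12)])
assert pade(C, 1, 1) == ([F(1), F(1, 2)], [F(1), F(-1, 2)])
print("matches the exp(x) table entries")
```

The two assertions reproduce the (2,2) and (1,1) entries of the exp(x) table from the earlier post.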

  • Padé approximants

We would like to represent functions using continued fractions (similarly to what we did for numbers). For example, a well-known continued fraction representing tan(x) is:

    \displaystyle tan(x) = \frac{x}{1 - \frac{x^2}{3 - \frac{x^2}{5 - \frac{x^2}{7 - \cdots}}}}

Continued fractions like this one typically have a bigger domain of validity than the corresponding Taylor series. This is illustrated graphically in figure 1 and figure 2.

The continued fraction representation of a function seems a promising approach, since the radius of convergence ‘seems’ at first glance much bigger and the approximation ‘looks’ much better (i.e. converges faster) than for Taylor and Maclaurin series. We will illustrate these statements.

    In order to make some progress in the construction of continued fractions corresponding to functions, we’ll start by defining the following rational functions, which will prove to have interesting properties in their own right:

    \displaystyle P(1,1) := \frac{A_0 + A_1 x}{1 + B_1 x}
    \displaystyle P(2,1) := \frac{A_0 + A_1 x + A_2 x^2}{1 + B_1 x}
    \displaystyle P(2,2) := \frac{A_0 + A_1 x + A_2 x^2}{1 + B_1 x + B_2 x^2}
    \displaystyle \cdots
    \displaystyle P(m,n) := \frac{A_0 + A_1 x + A_2 x^2 + \cdots + A_m x^m}{1 + B_1 x + B_2 x^2 + \cdots + B_n x^n}
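
    Once the A_i and B_j coefficients are known, such a rational function is straightforward to evaluate; a minimal helper (the name pade_eval is an assumption of this illustration):

    ```python
    def pade_eval(A, B, x):
        """Evaluate P(m,n)(x) = (A_0 + ... + A_m x^m) / (1 + B_1 x + ... + B_n x^n).

        A = (A_0, ..., A_m);  B = (B_1, ..., B_n), the constant term 1 is implicit.
        """
        num = sum(a * x**i for i, a in enumerate(A))
        den = 1 + sum(b * x**i for i, b in enumerate(B, start=1))
        return num / den
    ```

    For instance, with A = (0, 1, 0) and B = (0, -1/3) this evaluates x / (1 - x²/3), giving 1.5 at x = 1.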

    Using the definitions above, we work through some examples based on the first terms of the Maclaurin series of tan(x). We begin with P(2,2):

    \displaystyle \frac{A_0 + A_1 x + A_2 x^2}{1 + B_1 x + B_2 x^2} = x + \frac{1}{3} x^3
    \displaystyle A_0 + A_1 x + A_2 x^2 = (x + \frac{1}{3} x^3) (1 + B_1 x + B_2 x^2)
    \displaystyle A_0 + A_1 x + A_2 x^2 = x + B_1 x^2 + B_2 x^3 + \frac{1}{3} x^3 + \frac{1}{3} B_1 x^4 + \frac{1}{3} B_2 x^5
    \displaystyle A_0 + A_1 x + A_2 x^2 = x + B_1 x^2 + \left(B_2 + \frac{1}{3}\right) x^3 + \frac{1}{3} B_1 x^4 + \frac{1}{3} B_2 x^5

    In the example above we are considering a P(2,2) rational function, so we match coefficients up to the maximum degree 2 + 2 = 4. Keeping only degrees up to 4, the equation above becomes:

    \displaystyle A_0 + A_1 x + A_2 x^2 = x + B_1 x^2 + \left(B_2 + \frac{1}{3}\right) x^3 + \frac{1}{3} B_1 x^4

    Comparing the coefficients of both sides we obtain:

    \displaystyle A_0 = 0 \quad \text{and} \quad B_2 + \frac{1}{3} = 0
    \displaystyle A_1 = 1 \quad \text{and} \quad \frac{1}{3} B_1 = 0
    \displaystyle A_2 = B_1

    Solving the set of equations above gives:

    \displaystyle A_0 = 0, \ A_1 = 1, \ A_2 = 0, \ B_1 = 0, \ B_2 = -\frac{1}{3}
    \displaystyle \implies P(2,2) := \frac{A_0 + A_1 x + A_2 x^2}{1 + B_1 x + B_2 x^2} = \frac{x}{1 - \frac{1}{3} x^2}

    This is precisely the continued fraction of tan(x) shown above, truncated after its first two levels.
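
    The coefficient-matching above can be replayed symbolically; a sketch with sympy (the symbol names mirror the derivation; this is an illustrative check, not part of the original text):

    ```python
    import sympy as sp

    x, A0, A1, A2, B1, B2 = sp.symbols('x A0 A1 A2 B1 B2')

    # Cross-multiplied form: numerator = (truncated series) * denominator,
    # matched coefficient by coefficient through degree m + n = 4.
    num = A0 + A1*x + A2*x**2
    rhs = sp.expand((x + x**3 / 3) * (1 + B1*x + B2*x**2))
    eqs = [sp.Eq(num.coeff(x, k), rhs.coeff(x, k)) for k in range(5)]
    sol = sp.solve(eqs, [A0, A1, A2, B1, B2])
    ```

    Solving this linear system should recover A_0 = 0, A_1 = 1, A_2 = 0, B_1 = 0, B_2 = -1/3, as above.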

    The P(m,n) defined above are called Padé approximants. A Padé approximant is the “best” approximation of a function near a specific point by a rational function of given order. As we will see, Padé approximants can even converge beyond the radius of convergence of the corresponding series.

    Let’s calculate the P(1,2) corresponding to the first terms of the Maclaurin series of tan(x). 1 + 2 = 3 implies that we have to consider the Taylor series up to degree 3.

    \displaystyle \frac{A_0 + A_1 x}{1 + B_1 x + B_2 x^2} = x + \frac{1}{3} x^3
    \displaystyle A_0 + A_1 x = (x + \frac{1}{3} x^3) (1 + B_1 x + B_2 x^2)
    \displaystyle A_0 + A_1 x = x + B_1 x^2 + B_2 x^3 + \frac{1}{3} x^3 + \frac{1}{3} B_1 x^4 + \frac{1}{3} B_2 x^5
    \displaystyle A_0 + A_1 x = x + B_1 x^2 + \left(B_2 + \frac{1}{3}\right) x^3 + \frac{1}{3} B_1 x^4 + \frac{1}{3} B_2 x^5

    Keeping only degrees up to 3:

    \displaystyle A_0 + A_1 x = x + B_1 x^2 + \left(B_2 + \frac{1}{3}\right) x^3

    Comparing the coefficients of both sides we obtain:

    \displaystyle A_0 = 0 \quad \text{and} \quad B_1 = 0
    \displaystyle A_1 = 1 \quad \text{and} \quad B_2 + \frac{1}{3} = 0 \implies B_2 = -\frac{1}{3}
    \displaystyle \implies P(1,2) := \frac{A_0 + A_1 x}{1 + B_1 x + B_2 x^2} = \frac{x}{1 - \frac{1}{3} x^2}

    We observe that (in this case):

    \displaystyle P(1,2) = P(2,2)

    Now we calculate the P(3,3) corresponding to the first terms of the Maclaurin series of tan(x). Since 3 + 3 = 6, we have to match coefficients up to degree 6; as the degree-6 Maclaurin coefficient of tan(x) is zero, the series through degree 5 supplies all the data we need:

    \displaystyle \frac{A_0 + A_1 x + A_2 x^2 + A_3 x^3}{1 + B_1 x + B_2 x^2 + B_3 x^3} = x + \frac{1}{3} x^3 + \frac{2}{15} x^5
    \displaystyle A_0 + A_1 x + A_2 x^2 + A_3 x^3 = (x + \frac{1}{3} x^3 + \frac{2}{15} x^5) (1 + B_1 x + B_2 x^2 + B_3 x^3)
    \displaystyle A_0 + A_1 x + A_2 x^2 + A_3 x^3 = x + B_1 x^2 + B_2 x^3 + B_3 x^4 + \frac{1}{3} x^3 + \frac{1}{3} B_1 x^4 + \frac{1}{3} B_2 x^5 + \frac{1}{3} B_3 x^6 + \frac{2}{15} x^5 + \frac{2}{15} B_1 x^6 + \frac{2}{15} B_2 x^7 + \frac{2}{15} B_3 x^8
    \displaystyle A_0 + A_1 x + A_2 x^2 + A_3 x^3 = x + B_1 x^2 + \left(B_2 + \frac{1}{3}\right) x^3 + \left(B_3 + \frac{1}{3} B_1\right) x^4 + \left(\frac{1}{3} B_2 + \frac{2}{15}\right) x^5 + \left(\frac{1}{3} B_3 + \frac{2}{15} B_1\right) x^6 + \frac{2}{15} B_2 x^7 + \frac{2}{15} B_3 x^8

    Keeping only degrees up to 6:

    \displaystyle A_0 + A_1 x + A_2 x^2 + A_3 x^3 = x + B_1 x^2 + \left(B_2 + \frac{1}{3}\right) x^3 + \left(B_3 + \frac{1}{3} B_1\right) x^4 + \left(\frac{1}{3} B_2 + \frac{2}{15}\right) x^5 + \left(\frac{1}{3} B_3 + \frac{2}{15} B_1\right) x^6

    This implies:

    \displaystyle A_0 = 0 \quad \text{and} \quad B_3 + \frac{1}{3} B_1 = 0
    \displaystyle A_1 = 1 \quad \text{and} \quad \frac{1}{3} B_2 + \frac{2}{15} = 0
    \displaystyle A_2 = B_1 \quad \text{and} \quad \frac{1}{3} B_3 + \frac{2}{15} B_1 = 0
    \displaystyle A_3 = B_2 + \frac{1}{3}

    Solving the set of equations above gives:

    \displaystyle A_0 = 0, \ A_1 = 1, \ A_2 = 0, \ A_3 = -\frac{1}{15}, \ B_1 = 0, \ B_2 = -\frac{2}{5}, \ B_3 = 0
    \displaystyle \implies P(3,3) := \frac{A_0 + A_1 x + A_2 x^2 + A_3 x^3}{1 + B_1 x + B_2 x^2 + B_3 x^3} = \frac{x - \frac{1}{15} x^3}{1 - \frac{6}{15} x^2}

    Now consider the first terms of the continued fraction of tan(x):

    \displaystyle \frac{x}{1 - \frac{x^2}{3 - \frac{x^2}{5}}} = \frac{x}{1 - \frac{x^2}{\frac{15 - x^2}{5}}} = \frac{x}{1 - \frac{5 x^2}{15 - x^2}} = \frac{15 x - x^3}{15 - x^2 - 5 x^2} = \frac{15 x - x^3}{15 - 6 x^2}
    \displaystyle \implies P(3,3) = \frac{x - \frac{1}{15} x^3}{1 - \frac{6}{15} x^2} = \frac{x}{1 - \frac{x^2}{3 - \frac{x^2}{5}}}

    Therefore, the P(1,1), P(2,2) and P(3,3) Padé approximants of tan(x) (see fig. 3, 4 and 5) are equivalent to successive truncations of its continued fraction as presented above.
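
    Montessus’s theorem from the previous section predicts that the poles of these approximants approach the poles of tan(x). A quick numerical check (an illustration added here, not part of the original derivation): the pole of P(2,2), from 1 - x²/3 = 0, sits at √3 ≈ 1.732, while the pole of P(3,3), from 1 - (2/5)x² = 0, sits at √(5/2) ≈ 1.581, already close to the true pole of tan(x) at π/2 ≈ 1.571.

    ```python
    import math

    # Positive poles: zeros of the denominators of P(2,2) and P(3,3)
    pole_p22 = math.sqrt(3)        # 1 - x^2/3 = 0
    pole_p33 = math.sqrt(5 / 2)    # 1 - (2/5) x^2 = 0
    true_pole = math.pi / 2        # nearest pole of tan(x)

    print(abs(pole_p22 - true_pole), abs(pole_p33 - true_pole))
    ```

    The error in the pole location drops from about 0.16 to about 0.01 between the two approximants, exactly the behaviour the theorem describes.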

    The results above for tan(x) suggest that a “Padé transformation” (i.e. converting a series into a rational function using the procedure described above) can converge beyond the radius of convergence of the original Maclaurin or Taylor series.

    This gives hope for the use of divergent series as often encountered in perturbation theory.

    We would like to calculate the P(m,n) of exp(x), starting with P(1,1). We therefore consider the Maclaurin series of exp(x) up to degree 1 + 1 = 2:

    \displaystyle \frac{A_0 + A_1 x}{1 + B_1 x} = 1 + x + \frac{1}{2} x^2
    \displaystyle A_0 + A_1 x = (1 + x + \frac{1}{2} x^2)(1 + B_1 x)
    \displaystyle A_0 + A_1 x = 1 + B_1 x + x + B_1 x^2 + \frac{1}{2} x^2 + \frac{1}{2} B_1 x^3
    \displaystyle A_0 + A_1 x = 1 + (B_1 + 1) x + \left(B_1 + \frac{1}{2}\right) x^2 + \frac{1}{2} B_1 x^3

    Keeping only degrees up to 2:

    \displaystyle A_0 + A_1 x = 1 + (B_1 + 1) x + \left(B_1 + \frac{1}{2}\right) x^2

    This implies:

    \displaystyle A_0 = 1 \quad \text{and} \quad B_1 + \frac{1}{2} = 0
    \displaystyle A_1 = B_1 + 1

    Solving the set of equations above gives:

    \displaystyle A_0 = 1, \ A_1 = \frac{1}{2}, \ B_1 = -\frac{1}{2}
    \displaystyle \implies P(1,1) := \frac{A_0 + A_1 x}{1 + B_1 x} = \frac{1 + \frac{1}{2} x}{1 - \frac{1}{2} x}

    We observe that P(1,1) is equivalent to the first term of the continued fraction of exp(x) as presented above:

    \displaystyle P(1,1) = \frac{1 + \frac{1}{2} x}{1 - \frac{1}{2} x} = 1 + \frac{x}{1 - \frac{x}{2}}
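
    Even this smallest approximant illustrates the gain over the series it was built from. A quick numerical comparison at x = 0.5 (an illustrative check added here, not from the original text): P(1,1) uses the same degree-2 Maclaurin data as the truncated series, yet is already slightly more accurate.

    ```python
    import math

    def p11(x):
        # P(1,1) of exp(x) derived above: (1 + x/2) / (1 - x/2)
        return (1 + x / 2) / (1 - x / 2)

    x = 0.5
    taylor2 = 1 + x + x**2 / 2   # the same data: Maclaurin series through degree 2
    print(abs(p11(x) - math.exp(x)), abs(taylor2 - math.exp(x)))
    ```

    At x = 0.5 the Padé error is roughly 0.018 versus roughly 0.024 for the truncated series; both are exact at x = 0, and the gap widens as x grows toward the approximant’s pole at x = 2.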

    Figure 6 shows 16 Padé approximants for exp(x). Figure 7 shows P(0,0), P(1,1), P(2,2), P(3,3) and exp(x).