# Sage vs SymPy – integration

During the recent GSoC summit I had the chance to participate in many fascinating discussions. One such occasion was while meeting the Sage representative.

A detail he mentioned, was that during his tests SymPy frequently failed to solve integrals that Sage (using Maxima) was able to solve. An explanation, in which I like to believe, would be that he was testing an old version of SymPy lacking the new integration routines implemented during the last few GSoC projects. Hence my decision the compare the most recent versions of both projects.

The tested versions are SymPy 0.7.2 and Sage 5.3.

Depending on screen size and wordpress theme the table might be badly formatted so here is a link to the wiki html page and a pdf version.

It should be noted that Sage is more rigorous about the assumptions on its symbols and so it fails to integrate something like $x^n$ if $n$ is not explicitly different than -1. I personally think that this is a feature and not a bug. Due to this difference however the script used to test Sage differs from the one used for SymPy.

Another minor methodological difference in the tests is the fact that the timeout pattern that I used failed to work for the Sage interpreter. Hence, SymPy integration timeouts at about 120 seconds while Sage integration is manually interrupted when it takes too much time.

Final methodological difference is that I purge the SymPy cache between each integral as otherwise the RAM usage becomes too great.

The results show that SymPy is slightly better in using special functions to solve integrals, but there are also a few integrals that Sage solves while SymPy fails to do so. On few occasions Sage fails disgracefully, meaning  that it returns an error instead of unevaluated integral. When both packages fail to evaluate the integral SymPy is much slower to say so (timeout for SymPy compared to 1 or 2 seconds for Sage to return an unevaluated integral). Finally, on some occasions the results by Sage seem better simplified.

Integrals solved better by SymPy (if you consider special functions “better”):

• $\frac{1}{a x^n + 1}$ with the use of a special function while Sage returns unevaluated integrals
• $\frac{a x^n}{b x^m + 1}$ with the use of a special function while Sage returns unevaluated integrals
• $\frac{a x^n + 1}{b x^m + 1}$ with the use of a special function while Sage returns unevaluated integrals
• $\frac{a x^5 + x^3 + 1}{b x^5 + x^3 + 1}$ with the use of a special function while Sage returns unevaluated integrals

Integrals solved better by Sage:

• $\frac{a x^2}{b x^2 + 1}$ solved by both but Sage’s result is simpler (it uses arctan instead of log)
• $\frac{1}{\sqrt{x^2 + 1}}$ SymPy fails this simple integral
• $\log\left(\frac{a x^3}{b x^3 + 1}\right)$ solved by both but Sage’s result is much simpler
• $\log\left(\frac{a x^2 + 1}{b x^2 + 1}\right)$ SymPy fails
• $\log\left(\frac{a x^3 + 1}{b x^3 + 1}\right)$ SymPy fails
• $\frac{1}{\sin x + 1}$ SymPy fails
• $\frac{a \sin^2 x + 1}{b \sin^2 x + 1}$ SymPy fails

The code for the SymPy tests:

import signal
from time import clock
from sympy import *
from sympy.core.cache import clear_cache

class TimeoutException(Exception):
pass

def timeout_handler(signum, frame):
raise TimeoutException()

a, b, x = symbols('a b x')
n, m = symbols('n m', integer=True)

integrants = [
x,
a*x**n,
a*x**n + 1,
a*x**b + 1,

1/x,
1/(x + 1),
1/(x**2 + 1),
1/(x**3 + 1),
1/(a*x**n),
1/(a*x**n + 1),
1/(a*x**b + 1),

a*x**2/(b*x**2 + 1),
a*x**3/(b*x**3 + 1),
a*x**n/(b*x**m + 1),
(a*x**2 + 1)/(b*x**2 + 1),
(a*x**3 + 1)/(b*x**3 + 1),
(a*x**n + 1)/(b*x**m + 1),
(a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1),

sqrt(1/x),
sqrt(1/(x + 1)),
sqrt(1/(x**2 + 1)),
sqrt(1/(x**3 + 1)),
sqrt(1/(a*x**n)),
sqrt(1/(a*x**n + 1)),
sqrt(1/(a*x**b + 1)),
sqrt(a*x**2/(b*x**2 + 1)),
sqrt(a*x**3/(b*x**3 + 1)),
sqrt(a*x**n/(b*x**m + 1)),
sqrt((a*x**2 + 1)/(b*x**2 + 1)),
sqrt((a*x**3 + 1)/(b*x**3 + 1)),
sqrt((a*x**n + 1)/(b*x**m + 1)),
sqrt((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

log(x),
log(1/x),
log(1/(x + 1)),
log(1/(x**2 + 1)),
log(1/(x**3 + 1)),
log(1/(a*x**n)),
log(1/(a*x**n + 1)),
log(1/(a*x**b + 1)),
log(a*x**2/(b*x**2 + 1)),
log(a*x**3/(b*x**3 + 1)),
log(a*x**n/(b*x**m + 1)),
log((a*x**2 + 1)/(b*x**2 + 1)),
log((a*x**3 + 1)/(b*x**3 + 1)),
log((a*x**n + 1)/(b*x**m + 1)),
log((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

sin(x),
sin(x)**n*cos(x)**m,
sin(a*x)**n*cos(b*x)**m,
1/sin(x),
1/(sin(x) + 1),
1/(sin(x)**2 + 1),
1/(sin(x)**3 + 1),
1/(a*sin(x)**n),
1/(a*sin(x)**n + 1),
1/(a*sin(x)**b + 1),
a*sin(x)**2/(b*sin(x)**2 + 1),
a*sin(x)**3/(b*sin(x)**3 + 1),
a*sin(x)**n/(b*sin(x)**m + 1),
(a*sin(x)**2 + 1)/(b*sin(x)**2 + 1),
(a*sin(x)**3 + 1)/(b*sin(x)**3 + 1),
(a*sin(x)**n + 1)/(b*sin(x)**m + 1),
(a*sin(x)**5 + sin(x)**3 + 1)/(b*sin(x)**5 + sin(x)**3 + 1),
]

integrated = []
durations = []

f_integrants = open('dump_integrants', 'w')
f_integrated = open('dump_integrated', 'w')
f_durations = open('dump_duration', 'w')

for index, integrant in enumerate(integrants):
clear_cache()
print '====================================='
print index, ' of ', len(integrants)
print '###', integrant
start = clock()
try:
old_handler = signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(120)
integrated.append(integrate(integrant, x))
signal.alarm(0)
except TimeoutException:
integrated.append(TimeoutException)
finally:
signal.signal(signal.SIGALRM, old_handler)
durations.append(clock() - start)
print '###', integrated[-1]
print 'in %f seconds'%durations[-1]

f_integrants.write(str(integrant))
f_integrated.write(str(integrated[-1]))
f_durations.write(str(durations[-1]))
f_integrants.write('\n')
f_integrated.write('\n')
f_durations.write('\n')


And for Sage:

import signal
from time import clock
from sage.all import *
from sage.symbolic.integration.integral import indefinite_integral

class TimeoutException(Exception):
pass

def timeout_handler(signum, frame):
raise TimeoutException()

a, b, x = var('a b x')
n, m = var('n m')
assume(n, 'integer')
assume(m, 'integer')

assume(n != 1)
assume(n != -1)
assume(n != 2)
assume(n>0)
assume(b != 1)
assume(b != -1)
assume(b>0)
assume(a>0)

integrants = [
x,
a*x**n,
a*x**n + 1,
a*x**b + 1,

1/x,
1/(x + 1),
1/(x**2 + 1),
1/(x**3 + 1),
1/(a*x**n),
1/(a*x**n + 1),
1/(a*x**b + 1),

a*x**2/(b*x**2 + 1),
a*x**3/(b*x**3 + 1),
a*x**n/(b*x**m + 1),
(a*x**2 + 1)/(b*x**2 + 1),
(a*x**3 + 1)/(b*x**3 + 1),
(a*x**n + 1)/(b*x**m + 1),
(a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1),

sqrt(1/x),
sqrt(1/(x + 1)),
sqrt(1/(x**2 + 1)),
sqrt(1/(x**3 + 1)),
sqrt(1/(a*x**n)),
sqrt(1/(a*x**n + 1)),
sqrt(1/(a*x**b + 1)),
sqrt(a*x**2/(b*x**2 + 1)),
sqrt(a*x**3/(b*x**3 + 1)),
sqrt(a*x**n/(b*x**m + 1)),
sqrt((a*x**2 + 1)/(b*x**2 + 1)),
sqrt((a*x**3 + 1)/(b*x**3 + 1)),
sqrt((a*x**n + 1)/(b*x**m + 1)),
sqrt((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

log(x),
log(1/x),
log(1/(x + 1)),
log(1/(x**2 + 1)),
log(1/(x**3 + 1)),
log(1/(a*x**n)),
log(1/(a*x**n + 1)),
log(1/(a*x**b + 1)),
log(a*x**2/(b*x**2 + 1)),
log(a*x**3/(b*x**3 + 1)),
log(a*x**n/(b*x**m + 1)),
log((a*x**2 + 1)/(b*x**2 + 1)),
log((a*x**3 + 1)/(b*x**3 + 1)),
log((a*x**n + 1)/(b*x**m + 1)),
log((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

sin(x),
sin(x)**n*cos(x)**m,
sin(a*x)**n*cos(b*x)**m,
1/sin(x),
1/(sin(x) + 1),
1/(sin(x)**2 + 1),
1/(sin(x)**3 + 1),
1/(a*sin(x)**n),
1/(a*sin(x)**n + 1),
1/(a*sin(x)**b + 1),
a*sin(x)**2/(b*sin(x)**2 + 1),
a*sin(x)**3/(b*sin(x)**3 + 1),
a*sin(x)**n/(b*sin(x)**m + 1),
(a*sin(x)**2 + 1)/(b*sin(x)**2 + 1),
(a*sin(x)**3 + 1)/(b*sin(x)**3 + 1),
(a*sin(x)**n + 1)/(b*sin(x)**m + 1),
(a*sin(x)**5 + sin(x)**3 + 1)/(b*sin(x)**5 + sin(x)**3 + 1),
]

integrated = []
durations = []

f_integrants = open('dump_integrants', 'w')
f_integrated = open('dump_integrated', 'w')
f_durations = open('dump_duration', 'w')

for index, integrant in enumerate(integrants):
print '====================================='
print index, ' of ', len(integrants)
print '###', integrant
start = clock()
try:
old_handler = signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(120)
integrated.append(indefinite_integral(integrant, x))
signal.alarm(0)
except Exception, e:
integrated.append(e)
finally:
signal.signal(signal.SIGALRM, old_handler)
durations.append(clock() - start)
print '###', integrated[-1]
print 'in %f seconds'%durations[-1]

f_integrants.write(str(integrant))
f_integrated.write(str(integrated[-1]))
f_durations.write(str(durations[-1]))
f_integrants.write('\n')
f_integrated.write('\n')
f_durations.write('\n')


Below is the complete table (available as a wiki html page and a pdf).

 Legend: Disgraceful failure Timeout or manual interupt Return an unevalued integral Asking for assumptions Solution with fancy special functions Integrant Sympy integrate Sage indefinite_integral Comments Result cpu time Result cpu time x x**2/2 0 1/2*x^2 0 a*x**n a*x**(n + 1)/(n + 1) 0 a*x^(n + 1)/(n + 1) 0 Sage asks whether the denominator is zero before solving. a*x**n + 1 a*x**(n + 1)/(n + 1) + x 0 a*x^(n + 1)/(n + 1) + x 0 Sage asks whether the denominator is zero before solving. a*x**b + 1 a*x**(b + 1)/(b + 1) + x 0 a*x^(b + 1)/(b + 1) + x 0 Sage asks whether the denominator is zero before solving. 1/x log(x) 0 log(x) 0 1/(x + 1) log(x + 1) 0 log(x + 1) 0 1/(x**2 + 1) atan(x) 0 arctan(x) 0 1/(x**3 + 1) log(x + 1)/3 – log(x**2 – x + 1)/6 + sqrt(3)*atan(2*sqrt(3)*x/3 – sqrt(3)/3)/3 0 1/3*sqrt(3)*arctan(1/3*(2*x – 1)*sqrt(3)) + 1/3*log(x + 1) – 1/6*log(x^2 – x + 1) 0 x**(-n)/a x**(-n + 1)/(a*(-n + 1)) 0 -x^(-n + 1)/((n – 1)*a) 0 Sage asks whether the denominator is zero before solving. 1/(a*x**n + 1) x*gamma(1/n)*lerchphi(a*x**n*exp_polar(I*pi), 1, 1/n)/(n**2*gamma(1 + 1/n)) 2 integrate(1/(x^n*a + 1), x) 1 Sympy solves with special functions an integral that Sage cannot solve. 1/(a*x**b + 1) x*gamma(1/b)*lerchphi(a*x**b*exp_polar(I*pi), 1, 1/b)/(b**2*gamma(1 + 1/b)) 1 integrate(1/(x^b*a + 1), x) 1 Sympy solves with special functions an integral that Sage cannot solve. a*x**2/(b*x**2 + 1) a*(sqrt(-1/b**3)*log(-b*sqrt(-1/b**3) + x)/2 – sqrt(-1/b**3)*log(b*sqrt(-1/b**3) + x)/2 + x/b) 0 (x/b – arctan(sqrt(b)*x)/b^(3/2))*a 0 a*x**3/(b*x**3 + 1) a*(RootSum(_t**3 + 1/(27*b**4), Lambda(_t, _t*log(-3*_t*b + x))) + x/b) 0 -1/6*(2*sqrt(3)*arctan(1/3*(2*b^(2/3)*x – b^(1/3))*sqrt(3)/b^(1/3))/b^(4/3) – 6*x/b – log(b^(2/3)*x^2 – b^(1/3)*x + 1)/b^(4/3) + 2*log((b^(1/3)*x + 1)/b^(1/3))/b^(4/3))*a 0 Interesting examples deserving more study as Sympy uses the sum of the roots of a high order polynomial while Sage uses elementary special functions. a*x**n/(b*x**m + 1) a*(n*x*x**n*gamma(n/m + 1/m)*lerchphi(b*x**m*exp_polar(I*pi), 1, n/m + 1/m)/(m**2*gamma(1 + n/m + 1/m)) + x*x**n*gamma(n/m + 1/m)*lerchphi(b*x**m*exp_polar(I*pi), 1, n/m + 1/m)/(m**2*gamma(1 + n/m + 1/m))) 3 (m*integrate(x^n/((m – n – 1)*b^2*x^(2*m) + 2*(m – n – 1)*x^m*b + m – n – 1), x) – x^(n + 1)/((m – n – 1)*x^m*b + m – n – 1))*a 0 Sympy solves with special functions an integral that Sage cannot solve. (a*x**2 + 1)/(b*x**2 + 1) a*x/b + sqrt((-a**2 + 2*a*b – b**2)/b**3)*log(-b*sqrt((-a**2 + 2*a*b – b**2)/b**3)/(a – b) + x)/2 – sqrt((-a**2 + 2*a*b – b**2)/b**3)*log(b*sqrt((-a**2 + 2*a*b – b**2)/b**3)/(a – b) + x)/2 0 a*x/b – (a – b)*arctan(sqrt(b)*x)/b^(3/2) 0 Sage symplifies better (log-to-trig formulas). (a*x**3 + 1)/(b*x**3 + 1) a*x/b + RootSum(_t**3 + (a**3 – 3*a**2*b + 3*a*b**2 – b**3)/(27*b**4), Lambda(_t, _t*log(-3*_t*b/(a – b) + x))) 0 a*x/b – 1/3*(a*b – b^2)*sqrt(3)*arctan(1/3*(2*b^(2/3)*x – b^(1/3))*sqrt(3)/b^(1/3))/b^(7/3) + 1/6*(a*b^(2/3) – b^(5/3))*log(b^(2/3)*x^2 – b^(1/3)*x + 1)/b^2 – 1/3*(a*b^(2/3) – b^(5/3))*log((b^(1/3)*x + 1)/b^(1/3))/b^2 0 Interesting examples deserving more study as Sympy uses the sum of the roots of a high order polynomial while Sage uses elementary special functions. (a*x**n + 1)/(b*x**m + 1) a*(n*x*x**n*gamma(n/m + 1/m)*lerchphi(b*x**m*exp_polar(I*pi), 1, n/m + 1/m)/(m**2*gamma(1 + n/m + 1/m)) + x*x**n*gamma(n/m + 1/m)*lerchphi(b*x**m*exp_polar(I*pi), 1, n/m + 1/m)/(m**2*gamma(1 + n/m + 1/m))) + x*gamma(1/m)*lerchphi(b*x**m*exp_polar(I*pi), 1, 1/m)/(m**2*gamma(1 + 1/m)) 5 a*m*integrate(x^n/((m – n – 1)*b^2*x^(2*m) + 2*(m – n – 1)*x^m*b + m – n – 1), x) – a*x^(n + 1)/((m – n – 1)*x^m*b + m – n – 1) + integrate(1/(x^m*b + 1), x) 1 Sympy solves with special functions an integral that Sage cannot solve. (a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1) a*x/b + RootSum(_t**5 + _t**3*(500*a**2*b**3 + 27*a**2 – 1000*a*b**4 – 54*a*b + 500*b**5 + 27*b**2)/(3125*b**6 + 108*b**3) + _t**2*(27*a**3 – 81*a**2*b + 81*a*b**2 – 27*b**3)/(3125*b**6 + 108*b**3) + _t*(9*a**4 – 36*a**3*b + 54*a**2*b**2 – 36*a*b**3 + 9*b**4)/(3125*b**6 + 108*b**3) + (a**5 – 5*a**4*b + 10*a**3*b**2 – 10*a**2*b**3 + 5*a*b**4 – b**5)/(3125*b**6 + 108*b**3), Lambda(_t, _t*log(x + (3662109375*_t**4*b**12 + 3986718750*_t**4*b**9 + 242757000*_t**4*b**6 + 3779136*_t**4*b**3 – 1054687500*_t**3*a*b**9 – 72900000*_t**3*a*b**6 – 1259712*_t**3*a*b**3 + 1054687500*_t**3*b**10 + 72900000*_t**3*b**7 + 1259712*_t**3*b**4 + 410156250*_t**2*a**2*b**9 + 655340625*_t**2*a**2*b**6 + 51267654*_t**2*a**2*b**3 + 944784*_t**2*a**2 – 820312500*_t**2*a*b**10 – 1310681250*_t**2*a*b**7 – 102535308*_t**2*a*b**4 – 1889568*_t**2*a*b + 410156250*_t**2*b**11 + 655340625*_t**2*b**8 + 51267654*_t**2*b**5 + 944784*_t**2*b**2 – 48828125*_t*a**3*b**9 – 186046875*_t*a**3*b**6 + 16774290*_t*a**3*b**3 + 629856*_t*a**3 + 146484375*_t*a**2*b**10 + 558140625*_t*a**2*b**7 – 50322870*_t*a**2*b**4 – 1889568*_t*a**2*b – 146484375*_t*a*b**11 – 558140625*_t*a*b**8 + 50322870*_t*a*b**5 + 1889568*_t*a*b**2 + 48828125*_t*b**12 + 186046875*_t*b**9 – 16774290*_t*b**6 – 629856*_t*b**3 – 2812500*a**4*b**6 + 3596400*a**4*b**3 + 104976*a**4 + 11250000*a**3*b**7 – 14385600*a**3*b**4 – 419904*a**3*b – 16875000*a**2*b**8 + 21578400*a**2*b**5 + 629856*a**2*b**2 + 11250000*a*b**9 – 14385600*a*b**6 – 419904*a*b**3 – 2812500*b**10 + 3596400*b**7 + 104976*b**4)/(9765625*a**4*b**8 + 26493750*a**4*b**5 + 746496*a**4*b**2 – 39062500*a**3*b**9 – 105975000*a**3*b**6 – 2985984*a**3*b**3 + 58593750*a**2*b**10 + 158962500*a**2*b**7 + 4478976*a**2*b**4 – 39062500*a*b**11 – 105975000*a*b**8 – 2985984*a*b**5 + 9765625*b**12 + 26493750*b**9 + 746496*b**6)))) 106 -(a – b)*integrate((x^3 + 1)/(b*x^5 + x^3 + 1), x)/b + a*x/b 0 Sympy solves with special functions an integral that Sage cannot solve. sqrt(1/x) 2*x*sqrt(1/x) 0 2*x*sqrt(1/x) 0 sqrt(1/(x + 1)) 2*x*sqrt(1/(x + 1)) + 2*sqrt(1/(x + 1)) 0 2/sqrt(1/(x + 1)) 0 sqrt(1/(x**2 + 1)) Integral(sqrt(1/(x**2 + 1)), x) 0 arcsinh(x) 0 Sympy cannot solve this simple integral while Sage can. sqrt(1/(x**3 + 1)) Integral(sqrt(1/(x**3 + 1)), x) 3 integrate(sqrt(1/(x^3 + 1)), x) 0 sqrt(x**(-n)/a) -2*x*sqrt(1/a)*sqrt(x**(-n))/(n – 2) 0 -2*x*sqrt(x^(-n)/a)/(n-2) 0 Sage asks whether the denominator is zero before solving. sqrt(1/(a*x**n + 1)) Integral(sqrt(1/(a*x**n + 1)), x) 29 integrate(sqrt(1/(x^n*a + 1)), x) 0 When both Sage and Sympy fail, Sage is quicker. sqrt(1/(a*x**b + 1)) Integral(sqrt(1/(a*x**b + 1)), x) 35 integrate(sqrt(1/(x^b*a + 1)), x) 1 When both Sage and Sympy fail, Sage is quicker. sqrt(a*x**2/(b*x**2 + 1)) sqrt(a)*x*sqrt(x**2)*sqrt(1/(b*x**2 + 1)) + sqrt(a)*sqrt(x**2)*sqrt(1/(b*x**2 + 1))/(b*x) 2 (sqrt(a)*b*x^2 + sqrt(a))/(sqrt(b*x^2 + 1)*b) 0 sqrt(a*x**3/(b*x**3 + 1)) Integral(sqrt(a*x**3/(b*x**3 + 1)), x) 7 integrate(sqrt(a*x^3/(b*x^3 + 1)), x) 0 sqrt(a*x**n/(b*x**m + 1)) Timeout 115 integrate(sqrt(x^n*a/(x^m*b + 1)), x) 1 When both Sage and Sympy fail, Sage is quicker. sqrt((a*x**2 + 1)/(b*x**2 + 1)) Timeout 110 integrate(sqrt((a*x^2 + 1)/(b*x^2 + 1)), x) 0 When both Sage and Sympy fail, Sage is quicker. sqrt((a*x**3 + 1)/(b*x**3 + 1)) Timeout 109 integrate(sqrt((a*x^3 + 1)/(b*x^3 + 1)), x) 0 When both Sage and Sympy fail, Sage is quicker. sqrt((a*x**n + 1)/(b*x**m + 1)) Timeout 114 integrate(sqrt((x^n*a + 1)/(x^m*b + 1)), x) 1 When both Sage and Sympy fail, Sage is quicker. sqrt((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)) Timeout 104 integrate(sqrt((a*x^5 + x^3 + 1)/(b*x^5 + x^3 + 1)), x) 0 When both Sage and Sympy fail, Sage is quicker. log(x) x*log(x) – x 0 x*log(x) – x 0 log(1/x) -x*log(x) + x 0 -x*log(x) + x 0 log(1/(x + 1)) -x*log(x + 1) + x – log(x + 1) 0 -(x + 1)*log(x + 1) + x + 1 0 log(1/(x**2 + 1)) -x*log(x**2 + 1) + 2*x – 2*I*log(x + I) + I*log(x**2 + 1) 2 -x*log(x^2 + 1) + 2*x – 2*arctan(x) 0 log(1/(x**3 + 1)) -x*log(x**3 + 1) + 3*x – 3*log(x + 1)/2 + sqrt(3)*I*log(x + 1)/2 + log(x**3 + 1)/2 – sqrt(3)*I*log(x**3 + 1)/2 + sqrt(3)*I*log(x – 1/2 – sqrt(3)*I/2) 6 -x*log(x^3 + 1) – sqrt(3)*arctan(1/3*(2*x – 1)*sqrt(3)) + 3*x – log(x + 1) + 1/2*log(x^2 – x + 1) 0 log(x**(-n)/a) -n*x*log(x) + n*x – x*log(a) 0 n*x + x*log(x^(-n)/a) 0 log(1/(a*x**n + 1)) Integral(log(1/(a*x**n + 1)), x) 68 n*x – n*integrate(1/(a*e^(n*log(x)) + 1), x) – x*log(x^n*a + 1) 1 When both Sage and Sympy fail, Sage is quicker. log(1/(a*x**b + 1)) Timeout 91 b*x – b*integrate(1/(a*e^(b*log(x)) + 1), x) – x*log(a*e^(b*log(x)) + 1) 2 When both Sage and Sympy fail, Sage is quicker. log(a*x**2/(b*x**2 + 1)) x*log(a) + 2*x*log(x) – x*log(b*x**2 + 1) + 2*I*log(x – I*sqrt(1/b))/(b*sqrt(1/b)) – I*log(b*x**2 + 1)/(b*sqrt(1/b)) 10 x*log(a*x^2/(b*x^2 + 1)) – 2*arctan(sqrt(b)*x)/sqrt(b) 0 log(a*x**3/(b*x**3 + 1)) -216*b**4*x**6*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 216*(-1)**(2/3)*b**4*x**6*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 72*(-1)**(1/6)*sqrt(3)*b**4*x**6*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 72*sqrt(3)*I*b**4*x**6*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 588*b**2*x**5*(1/b)**(2/3)*log(a)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 1764*b**2*x**5*(1/b)**(2/3)*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 588*b**2*x**5*(1/b)**(2/3)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 630*b**2*x**5*(1/b)**(2/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 420*(-1)**(5/6)*sqrt(3)*b**2*x**5*(1/b)**(2/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 210*sqrt(3)*I*b**2*x**5*(1/b)**(2/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 981*b**2*x**3*(1/b)**(4/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 981*(-1)**(2/3)*b**2*x**3*(1/b)**(4/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 327*(-1)**(1/6)*sqrt(3)*b**2*x**3*(1/b)**(4/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 327*sqrt(3)*I*b**2*x**3*(1/b)**(4/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 135*b**2*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 135*(-1)**(2/3)*b**2*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 45*(-1)**(1/6)*sqrt(3)*b**2*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 45*sqrt(3)*I*b**2*(1/b)**(7/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 147*(-1)**(2/3)*b*x**4*log(a)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 49*(-1)**(1/6)*sqrt(3)*b*x**4*log(a)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 49*(-1)**(5/6)*sqrt(3)*b*x**4*log(a)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 147*(-1)**(1/3)*b*x**4*log(a)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 882*(-1)**(2/3)*b*x**4*log(a)*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(1/6)*sqrt(3)*b*x**4*log(a)*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(5/6)*sqrt(3)*b*x**4*log(a)*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 882*(-1)**(1/3)*b*x**4*log(a)*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(1/3)*b*x**4*log(a)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 98*(-1)**(5/6)*sqrt(3)*b*x**4*log(a)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 98*(-1)**(1/6)*sqrt(3)*b*x**4*log(a)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(2/3)*b*x**4*log(a)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 588*(-1)**(2/3)*b*x**4*log(a)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 1323*(-1)**(2/3)*b*x**4*log(x)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 441*(-1)**(1/6)*sqrt(3)*b*x**4*log(x)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 441*(-1)**(5/6)*sqrt(3)*b*x**4*log(x)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 1323*(-1)**(1/3)*b*x**4*log(x)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 882*(-1)**(1/3)*b*x**4*log(x)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(5/6)*sqrt(3)*b*x**4*log(x)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(1/6)*sqrt(3)*b*x**4*log(x)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 882*(-1)**(2/3)*b*x**4*log(x)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 882*(-1)**(1/3)*b*x**4*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(5/6)*sqrt(3)*b*x**4*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(1/6)*sqrt(3)*b*x**4*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 882*(-1)**(2/3)*b*x**4*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(5/6)*sqrt(3)*b*x**4*log(x – (-1)**(1/3)*(1/b)**(1/3))/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 882*(-1)**(1/3)*b*x**4*log(x – (-1)**(1/3)*(1/b)**(1/3))/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 147*(-1)**(2/3)*b*x**4*log(b*x**3 + 1)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 49*(-1)**(1/6)*sqrt(3)*b*x**4*log(b*x**3 + 1)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 49*(-1)**(5/6)*sqrt(3)*b*x**4*log(b*x**3 + 1)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 147*(-1)**(1/3)*b*x**4*log(b*x**3 + 1)**2/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 588*(-1)**(2/3)*b*x**4*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(1/6)*sqrt(3)*b*x**4*log(x – (-1)**(1/3)*sqrt(3)*I*(1/b)**(1/3)/2 + (-1)**(1/3)*(1/b)**(1/3)/2)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 882*(-1)**(2/3)*b*x**4*log(x – (-1)**(1/3)*sqrt(3)*I*(1/b)**(1/3)/2 + (-1)**(1/3)*(1/b)**(1/3)/2)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(2/3)*b*x**4/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 98*(-1)**(1/6)*sqrt(3)*b*x**4/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 98*(-1)**(5/6)*sqrt(3)*b*x**4/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(1/3)*b*x**4/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 588*b*x**2*(1/b)**(2/3)*log(a)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 1764*b*x**2*(1/b)**(2/3)*log(x)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 588*b*x**2*(1/b)**(2/3)*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 945*b*x**2*(1/b)**(2/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 630*(-1)**(5/6)*sqrt(3)*b*x**2*(1/b)**(2/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 315*sqrt(3)*I*b*x**2*(1/b)**(2/3)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 98*(-1)**(1/6)*sqrt(3)*x*log(a)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 98*(-1)**(5/6)*sqrt(3)*x*log(a)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(2/3)*x*log(a)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(1/3)*x*log(a)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 294*(-1)**(5/6)*sqrt(3)*x*log(x – (-1)**(1/3)*(1/b)**(1/3))/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 882*(-1)**(1/3)*x*log(x – (-1)**(1/3)*(1/b)**(1/3))/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(1/3)*x*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(2/3)*x*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 98*(-1)**(5/6)*sqrt(3)*x*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 98*(-1)**(1/6)*sqrt(3)*x*log(b*x**3 + 1)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) – 294*(-1)**(1/6)*sqrt(3)*x*log(x – (-1)**(1/3)*sqrt(3)*I*(1/b)**(1/3)/2 + (-1)**(1/3)*(1/b)**(1/3)/2)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) + 882*(-1)**(2/3)*x*log(x – (-1)**(1/3)*sqrt(3)*I*(1/b)**(1/3)/2 + (-1)**(1/3)*(1/b)**(1/3)/2)/(588*b**2*x**4*(1/b)**(2/3) + 588*b*x*(1/b)**(2/3)) 26 x*log(a*x^3/(b*x^3 + 1)) – 1/2*(2*sqrt(3)*a*arctan(1/3*(2*b^(2/3)*x – b^(1/3))*sqrt(3)/b^(1/3))/b^(1/3) – a*log(b^(2/3)*x^2 – b^(1/3)*x + 1)/b^(1/3) + 2*a*log((b^(1/3)*x + 1)/b^(1/3))/b^(1/3))/a 0 Sage simplifies better. log(a*x**n/(b*x**m + 1)) Timeout 96 (m – n + log(a))*x – m*integrate(1/(b*e^(m*log(x)) + 1), x) – x*log(x^m*b + 1) + x*log(x^n) 2 When both Sage and Sympy fail, Sage is quicker. log((a*x**2 + 1)/(b*x**2 + 1)) Integral(log((a*x**2 + 1)/(b*x**2 + 1)), x) 72 x*log((a*x^2 + 1)/(b*x^2 + 1)) + 2*arctan(sqrt(a)*x)/sqrt(a) – 2*arctan(sqrt(b)*x)/sqrt(b) 0 Sage asks whether a and b are positive and then returns an answer. Sympy fails irrespective of the assumptions. log((a*x**3 + 1)/(b*x**3 + 1)) Timeout 89 x*log((a*x^3 + 1)/(b*x^3 + 1)) + sqrt(3)*arctan(1/3*(2*a^(2/3)*x – a^(1/3))*sqrt(3)/a^(1/3))/a^(1/3) – sqrt(3)*arctan(1/3*(2*b^(2/3)*x – b^(1/3))*sqrt(3)/b^(1/3))/b^(1/3) – 1/2*log(a^(2/3)*x^2 – a^(1/3)*x + 1)/a^(1/3) + log((a^(1/3)*x + 1)/a^(1/3))/a^(1/3) + 1/2*log(b^(2/3)*x^2 – b^(1/3)*x + 1)/b^(1/3) – log((b^(1/3)*x + 1)/b^(1/3))/b^(1/3) 0 Sage asks whether a and b are positive and then returns an answer. Sympy fails irrespective of the assumptions. log((a*x**n + 1)/(b*x**m + 1)) Timeout 89 (m – n)*x – m*integrate(1/(b*e^(m*log(x)) + 1), x) + n*integrate(1/(x^n*a + 1), x) – x*log(x^m*b + 1) + x*log(x^n*a + 1) 4 When both Sage and Sympy fail, Sage is quicker. log((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)) Integral(log((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)), x) 42 -x*log(b*x^5 + x^3 + 1) + x*log(a*x^5 + x^3 + 1) – integrate((2*x^3 + 5)/(b*x^5 + x^3 + 1), x) + integrate((2*x^3 + 5)/(a*x^5 + x^3 + 1), x) 1 When both Sage and Sympy fail, Sage is quicker. sin(x) -cos(x) 0 -cos(x) 0 sin(x)**n*cos(x)**m Timeout 102 No result 110 Disgraceful failure by Sage. sin(a*x)**n*cos(b*x)**m Timeout 81 No result 112 Disgraceful failure by Sage. 1/sin(x) log(cos(x) – 1)/2 – log(cos(x) + 1)/2 0 1/2*log(cos(x) – 1) – 1/2*log(cos(x) + 1) 0 1/(sin(x) + 1) -2/(tan(x/2) + 1) 1 -2/(sin(x)/(cos(x) + 1) + 1) 0 1/(sin(x)**2 + 1) Timeout 96 1/2*sqrt(2)*arctan(sqrt(2)*tan(x)) 0 Sage simply beats Sympy. 1/(sin(x)**3 + 1) Timeout 87 Maxima: quotient’ by zero’ 78 Disgraceful failure by Sage. sin(x)**(-n)/a Integral(sin(x)**(-n)/a, x) 36 No result 227 Disgraceful failure by Sage. 1/(a*sin(x)**n + 1) Timeout 98 Maxima: expt: undefined: 0 to a negative exponent. 1 Disgraceful failure by Sage. 1/(a*sin(x)**b + 1) Timeout 83 No result 140 Disgraceful failure by Sage. a*sin(x)**2/(b*sin(x)**2 + 1) Timeout 93 (x/b – arctan(sqrt(b + 1)*tan(x))/(sqrt(b + 1)*b))*a 0 Sage simply beats Sympy. a*sin(x)**3/(b*sin(x)**3 + 1) Timeout 82 No result 568 Disgraceful failure by Sage. a*sin(x)**n/(b*sin(x)**m + 1) Integral(a*sin(x)**n/(b*sin(x)**m + 1), x) 24 Manual Interupt 1527 Both Sage and Sympy fail, however Sympy is quicker. (a*sin(x)**2 + 1)/(b*sin(x)**2 + 1) Timeout 98 a*x/b – (a – b)*arctan(sqrt(b + 1)*tan(x))/(sqrt(b + 1)*b) 0 Sage simply beats Sympy. (a*sin(x)**3 + 1)/(b*sin(x)**3 + 1) Timeout 96 Manual Interupt 203 (a*sin(x)**n + 1)/(b*sin(x)**m + 1) Timeout 83 Maxima: expt: undefined: 0 to a negative exponent. 1 Disgraceful failure by Sage. (a*sin(x)**5 + sin(x)**3 + 1)/(b*sin(x)**5 + sin(x)**3 + 1) Timeout 89 Manual Interupt 142

# Graph of the Relations between Objects in the diffgeom Module

This graph, besides showing naively in a rather simplistic way the structure of the theory of differential geometry (and most of what I have implemented in the diffgeom module), brings attention to the one non-trivial part of the module on which I have spent most of my time lately. Namely, implementing covariant derivatives.

All directional derivatives are defined as a limiting procedure on a transport operator. Besides the Lie derivatives which use a certain transport operator that is easy to express in a coordinate free way, all other derivatives, called covariant derivatives have to be expressed using something called Christoffel symbols. And these are the ugly coordinate-dependent sources of pain, as the module structure becomes very cumbersome when such dependence must be accounted for. Thankfully, I think I have found a nice way to implement them in a new CovariantDerivativeOperator class on its own, that will contain all the logic in the same way in which the Base*Field classes do it. This will also require rewrite of the LieDerivative into a LieDerivativeOperator class.

# The diffgeom Module – Status Report

I have written already a few posts about the theory behind the module, the structure of the module, etc. However, besides some rare examples, I have not described in much details how the work progresses. So here is a short summary (check the git log for more details):

• The basics about coordinate systems and fields are already in. There are numerous issues with all the simplify-like algorithms inside SymPy, however they are slowly ironed out.
• Some simplistic methods for work with integral curves are implemented.
• The basics of tensor/wedge products are in. Many simplification routines can be added. Contraction between tensor products and vectors is possible (special case of “lowering of an index”).
• Over-a-map, pushforwards and pullbacks are not implemented yet.
• Instead of them I have focused my work on derivatives and curvature tensors. For the moment work on these can be done in a limited coordinate-dependent way. A longer post explaining the theory is coming and with it an implementation slightly less dependent on coordinates (working with Christoffel symbols is a pain).
• Hodge star operator – still not implemented.

An example that I want to implement is a theorem that in irrotational cosmology isotropy implies homogeneity. Doing this will be the first non-trivial example in this module.

A serendipitous detour from the project was my work on the differential equations solver. Aaron had implemented a very thorough solver for single equations. I had tried to extend it in a few simple ways in order to work with systems of ODEs and initial conditions. However this led me to Jordan forms of matrices, generalized eigenvectors and a bunch of interesting details on which I work in my free time (especially this week).

# Form Fields and Vector Fields do not form a Vector Space

Form fields or vector fields over a manifold (as opposed to forms and vectors) do not form a vector space. They form a module.

The difference is that the scalars of a vector space form an algebraic field while the scalars of a module form a ring. For us humans (as opposed to “those higher beings that I do not understand (a.k.a. mathematicians)”) this means that the scalars in the vector field can divide each other while the scalars in the spaces spanned by fields (i.e. a module) can not.

And just so we all can become even more confused: This has nothing to do with the fact that the “components” of each form field or vector field in certain basis are functions, i.e. themselves elements of a vector space with infinite number of dimensions.

The first way to see this module-not-a-vector-space characteristic is by showing directly that the scalars that form the “coordinate components” of a vector field can not always be divided, even if they are not identically zero. Take, for instance the, manifold $\mathbb{R}^2$ with the polar coordinate system and look at the vector $\begin{bmatrix} r \\ r\cos(\theta) \end{bmatrix}$. The “scalars” are $r$ and $r\cos(\theta)$. Obviously we can not divide the former by the latter because it will be undefined at $\theta=\frac{\pi}{2}+n\pi$.

Another, more amusing way to show that the space spanned by these fields is not a vector space is to explicitly show that a property expected from vector spaces is not fulfilled. Namely, that in $n$ dimensions an $n$-uple of linearly independent elements forms a basis. However, in the case of fields over a manifold we can easily have a number of fields that are linearly independent over the manifold as a whole, and are at the same time linearly dependent (or simply equal to zero) on a subdomain. Hence, we have an $n$-uple of linearly independent fields that can not be linearly combined to represent another arbitrary field.

# Objects Implemented in the diffgeom Module

This post provides a summary of all mathematical types of expression implemented in the diffgeom module. I have chosen not to mention any python classes or other implementation details at all. This table shows how an object expected by the user to be of certain mathematical type operates on another object. If the expectations of a user familiar with differential geometry do not meet the actual implementation, this is a bug in the implementation.

The Argument
point scalar field vector field 1-form field higher form field
The Operator scalar field scalar NA NA NA NA
vector field NA scalar field NA NA NA
1-form field (linear combination of differentials of scalar fields) NA NA scalar field NA NA
higher form field (linear combination of tensor products of lower form fields) NA NA it takes a tuple of vector fields and returns a scalar field NA NA
commutator of vector fields Behaves as a regular vector field.
Lie derivative (the argument is “called” on construction time) NA You specify the object to be derived on creation. The Lie derivative of any object is an object of the same type.

# The Schwarzschild Solution

An “easy” solution to the Einstein equation (in free space) is the spherically symmetric Schwarzschild solution. The pdf bellow shows how one can use the diffgeom module in order to get the equations describing this solution.

One starts with the most general spherically symmetrical metric and by using Einstein equation $R_{\mu \nu}=0$ deduces the equations that must be fulfilled by the components of the metric (in the chosen basis).

schwarzschild.pdf

# Tensor vs Tensor Field, Basis vs Coordinate System

In most of my posts that discuss the SymPy diffgeom module I do not try to make a distinction between a tensor and a tensor field, as it is usually obvious from the context. However, it would be nice to spell it out at least once.

I have two favorite ways to define a tensor/tensor field: either as an object with a representation (in the form of a multidimensional array) that transforms in a precise way when one switches from one basis to another, or instead as (sum of) tensor products of some vectors and 1-forms (i.e. an element of some tensor product of the vector space and its dual).

### In Terms of Transformation Rules

With regard to the first definition, Wikipedia has this to say:

A tensor of type (n, m−n) is an assignment of a multidimensional array $T^{i_1\dots i_n}_{i_{n+1}\dots i_m}[\mathbf{f}]$ to each basis $f = (e_1,...,e_N)$ such that, if we apply the change of basis $\mathbf{f}\mapsto \mathbf{f}\cdot R = \left( R_1^i \mathbf{e}_i, \dots, R_N^i\mathbf{e}_i\right)$ then the multidimensional array obeys the transformation law $T^{i_1\dots i_n}_{i_{n+1}\dots i_m}[\mathbf{f}\cdot R] = (R^{-1})^{i_1}_{j_1}\cdots(R^{-1})^{i_n}_{j_n} R^{j_{n+1}}_{i_{n+1}}\cdots R^{j_{m}}_{i_{m}}T^{j_1,\ldots,j_n}_{j_{n+1},\ldots,j_m}[\mathbf{f}]$.

A tensor field then is a way to map a tensor to each point of a manifold (the tensor is wrt the tangent space at that point).

When we switch from tensors to tensor fields a new object becomes important: the coordinate system. Before proceeding, one must know what a manifold and a tangent space mean. Then we can illuminate the relation between what one calls a “basis” when speaking about tensors and the “coordinate system” in the context of tensor fields. Firstly, a coordinate system gives a way to continuously map a tuple of numbers to a point on the manifold. This continuous map is what physicist love to work with (Cartesian or polar coordinates for instance). The nice thing is that each coordinate system brings with itself a canonical basis for each point of the manifold.

What can be confusing, is that the basis can change from point to point. For example, one can take the $R^2$ manifold that has $R^2$ as its tangent space. Take for instance two points $(x=1, y=0)$ and $(x=0, y=1)$. The basis vectors in the Cartesian coordinate system are the same for both points: $(e_x, e_y)$. However in the polar coordinate system the basis vectors for the first point are $(e_x, e_y)$ and $(e_y, -e_x)$ for the second point.

Anyway, the only thing that changes in the definition, is that the change-of-basis matrix mentioned above now depends on the coordinate systems.

$\hat{T}^{i_1\dots i_n}_{i_{n+1}\dots i_m}(\bar{x}_1,\ldots,\bar{x}_k) = \frac{\partial \bar{x}^{i_1}}{\partial x^{j_1}} \cdots \frac{\partial \bar{x}^{i_n}}{\partial x^{j_n}} \frac{\partial x^{j_{n+1}}}{\partial \bar{x}^{i_{n+1}}} \cdots \frac{\partial x^{j_m}}{\partial \bar{x}^{i_m}} T^{j_1\dots j_n}_{j_{n+1}\dots j_m}(x_1,\ldots,x_k)$

### In Terms of Tensor Products

I prefer this definition, as it relies on the geometrical meaning of vectors and forms. According to Wikipedia, one can express it as:

A type (n, m) tensor T is defined as a map $T: \underbrace{ V^* \times\dots\times V^*}_{n \text{ copies}} \times \underbrace{ V \times\dots\times V}_{m \text{ copies}} \rightarrow \mathbf{R}$, where V is a vector space and V* is the corresponding dual space of covectors, which is linear in each of its arguments.

One can again try to translate this to the case of tensor fields. The straightforward way is just to say that this map is parametrized, thus it depends on which point on the manifold it is evaluated.

However, a more “geometrical” approach would be to keep the part about “a tensor field is the sum of tensor products of vector fields and 1-form fields” but define vector fields and 1-form fields “geometrically”. Vector fields become differential operators over the manifold instead of maps to elements of the tangent space and 1-forms are defined in terms of differentials instead of duals of vectors.

### The Magic

The magic is that this parametrization in terms of tuples of real numbers (a coordinate system) brings automatically a canonical basis and and the transformation matrix for change of basis. Hence defining a coordinate system provides a basis for free. Otherwise the generalization of the first definition would have been clumsier.

# Part 1: What is a Tensor and How is it Implemented in the diffgeom SymPy module?

## The Math

The notion of “a tensor” is commonly defined in two different ways. The first definition goes roughly like this (“roughly” means “do not tell this to your math teacher”):

A tensor is a geometrical object that can be represented in some coordinate system as an n-dimensional array (it is not the array itself). The quantities in that array depend on the coordinate system in which the representation is done, however there is a precise rule on how these quantities change if we switch to another coordinate system. It is this rule that defines what a tensor (and in particular a vector or a 1-form) is.

The other definition, less used by physicists and more used by mathematicians is (again roughly) the following:

A tensor is the sum of tensor products of forms and vectors. Forms and vectors are themselves given nice geometrical definitions.

It is this second definition that is used in the diffgeom SymPy module. Naturally, we need to define “tensor product”, “vector” and “form” in order to use this definition. Indeed, the structure of the module follows closely these mathematical definitions.

Disclaimer: I have used and will use the words tensor and tensor field interchangeably. In order for this post to make any sense to you, ensure that you understand the difference and are able to find out the exact meaning from the context. The same goes also for vector / vector field and form / form field.

## The Implementation

To create the mathematical structure of differential geometry or its implementation in a CAS like SymPy we need to build up the ladder of object definitions. Each new and more interesting notion will depend on the definition of the previous. Hence we start with the boilerplate object “Manifold” and on it we define a “Patch” (the diffgeom module implements all this boilerplate for commonly used manifold, however in order to explain how it works, we will redo everything from scratch):

m = Manifold('my_manifold', 2) # A 2D manifold called 'my_manifold'
p = Patch('my_patch', m) # A patch called 'my_patch'


The first object that does something marginally interesting is the “Coordinate System”. Its role is to permit the parametrization of points on the patch by a tuple of numbers:

cs_r = CoordSystem('R', p) # A coordinate system called 'R' (for rectangular)
point = cs_r.point([1,1]) # A point with coordinates (1, 1)


If you have two or more coordinate systems you can tell the computer how to transform a tuple of numbers from one system to another:

cs_p = CoordSystem('P', p)
cs_r.connect_to(cs_p, [x, y], [sqrt(x**2+y**2), atan2(y,x)])
cs_p.connect_to(cs_r, [r, t], [r*cos(t), r*sin(t)], inverse=False)
# Now the point instances know how to transform their coordinate tuples:
point.coords(cs_p)
# output:
#⎡  ___⎤
#⎢╲╱ 2 ⎥
#⎢     ⎥
#⎢  π  ⎥
#⎢  ─  ⎥
#⎣  4  ⎦


However, differential geometry is not about calculating coordinates of points in different systems. It is about working with fields. Thus, we will focus on a single coordinate system from now on, and to be explicit about its complete independence of whether we want rectangular or other coordinates, we will just call it ‘c’ and leave it unconnected to other coordinate systems.

c = CoordSystem('c', p)


### Scalar Field

We continue to build up our ladder of definitions in order to obtain a more interesting and useful theory/implementation. The next step is the “scalar field”. A scalar field is a mapping from the manifold (the set of points) to the real numbers (yes, just reals (maybe complex), the rest brings unnecessary complexity). Each coordinate system brings with itself the basic scalar fields (i.e. coordinate functions), that correspond to the mappings from a point to an element of its coordinate tuple. These basic scalar fields are implemented internally as BaseScalarField instances (this is however invisible to the user).

 c.coord_functions()
# output: [c₀, c₁]
point = c.point([a, b])
c.coord_function(0)(point)
# output: a


You can build more complicated fields by using the base scalar fields. This does not produce an instance of some new ScalarField class. The BaseScalarField instances just become a part of the expression tree of an ordinary Expr instance (the base for building expressions in SymPy).

c0, c1 = c.coord_functions()
field = f(c0, 3*c1)/sin(c0)*c1**2
f
# output:
#               -1       2
#f(c₀, 3⋅c₁)⋅sin  (c₀)⋅c₁
field(point)
# output:
# 2
#b ⋅f(a, 3⋅b)
#────────────
#   sin(a)


### Vector Field

Then comes the vector field which is defined as an element of the set of differential operators over the scalar fields. All elements of this set can be build up as linear combinations of base vector fields. The base vector fields correspond to the partial derivatives with respect to the base scalar fields. They are implemented in the BaseVectorField class, which also is invisible to the user.

c.base_vectors()
# output: [∂_c_0, ∂_c_1]
e_c0, e_c1 = c.base_vectors()
e_c0(c0)
# output: 1
e_c0(c1)
# output: 0


One can also use more complicated fields (again, no need for new VectorField class, just being part of the expression tree of Expr):

v_field = c1*e_c0 + f(c0)*e_c1
v_field
# output: f(c₀)⋅∂_c_1 + c₁⋅∂_c_0
s_field = g(c0, c1)
v_field(s_field)
# output:
#      ⎛ d            ⎞│           ⎛ d            ⎞│
#f(c₀)⋅⎜───(g(c₀, ξ₂))⎟│      + c₁⋅⎜───(g(ξ₁, c₁))⎟│
#      ⎝dξ₂           ⎠│ξ₂=c₁      ⎝dξ₁           ⎠│ξ₁=c₀
v_field(s_field)(point)
# output:
#  ⎛ d           ⎞│            ⎛ d           ⎞│
#b⋅⎜───(g(ξ₁, b))⎟│     + f(a)⋅⎜───(g(a, ξ₂))⎟│
#  ⎝dξ₁          ⎠│ξ₁=a        ⎝dξ₂          ⎠│ξ₂=b


### 1-Form Field and Differential

The final ingredient needed for the basis of differential geometry is the 1-form field. A 1-form field is a linear mapping from the set of vector fields to the set of reals. The interesting thing is that all 1-forms can be build-up from linear combinations of the differentials of the base scalar fields.

There is the need to define what a differential is. The differential $df$ of the scalar field $f$ is the 1-form field which has the following property: for every vector field $v$ one has $df(v) = v(f)$.

In the diffgeom module the differential is implemented as the Differential class. The differentials of the base scalar fields are accessible with the base_oneforms() method, however one can construct the differential of whatever scalar field they wish. There is, as always, no dedicated OneFormField class. Everything is build up with Expr instances.

c.base_oneforms()
# output: [ⅆ c₀, ⅆ c₁]
dc0, dc1 = c.base_oneforms()
dc0(e_c0), dc1(e_c0)
# output: (1, 0)


And building up more complicated expressions:

f_field = g(c0)*dc1 + 2*dc0
f_field(v_field)
# output: g(c₀)⋅f(c₀) + 2⋅c₁


### The Rest

Now that we have the basic building blocks in order to construct higher order tensors we use tensor and wedge products. More about them in part 2.

## What if I Need to Work with a Number of Different Coordinate Systems

The only difference is that the chain rule of differentiation will be necessary. One can express the same statement in a more implementation independent way as “The rule for transformation of coordinates will need to be applied”. Anyhow, it works:

Examples from the already implemented $R^2$ module.

### Points Defined in one Coordinate System and Evaluated in Another

point_r = R2_r.point([x0, y0])
point_p = R2_p.point([r0, theta0])
R2.x(point_r)
# output: x₀
R2.x(point_p)
# output: r₀⋅cos(θ₀)
trigsimp((R2.x**2 + R2.y**2 + g(R2.theta))(point_p))
# output:
#  2
#r₀  + g(θ₀)


### Vector Fields Operating on Scalar Fields

R2.e_x(R2.theta)
# output:
#            -1
#   ⎛ 2    2⎞
#-y⋅⎝x  + y ⎠
R2.e_theta(R2.y)
# output: cos(θ)⋅r


### 1-Form Fields Operating on Vector Fields

R2.dr(R2.e_x)
# output:
#         -1/2
#⎛ 2    2⎞
#⎝x  + y ⎠    ⋅x


## What if I Need an Unspecified Arbitrary Coordinate System

If it is just one coordinate system, nothing; all the examples in the first part were for a completely arbitrary system. If you want two systems with an arbitrary transformation rules between them, just use an arbitrary function when you connect them:

m = Manifold('my_manifold', 2)
p = Patch('my_patch', m)
cs_a = CoordSystem('a', p)
cs_b = CoordSystem('b', p)
cs_a.connect_to(cs_b, [x, y], [f(x,y), g(x,y)], inverse=False)
cs_a.base_vector(1)(cs_b.coord_function(0))
# output:
#⎛ d            ⎞│
#⎜───(f(a₀, ξ₁))⎟│
#⎝dξ₁           ⎠│ξ₁=a₁


## How Does This Relate to the Scheme Code by Gerald Jay Sussman and Jack Wisdom

The diffgeom module is based in its entirety on the work of Gerald Jay Sussman and Jack Wisdom on “Functional Differential Geometry”. The only substantial difference (in what is already implemented) is how the diffgeom module treats operations on fields. Both the diffgeom module and the original Scheme code behave like this:

scalar_field(point) ---> an expression representing a real number


However, the original scheme implementation and the SymPy module behave differently in these cases:

 vector_field(scalar_field)
form_field(vector_field)


As far as I know, the original Scheme code produces an opaque object. It indeed represents a scalar field, however $\partial_x(x)$ will not produce directly $1$. Instead it produces an object that returns 1 when evaluated at a point. The diffgeom module does this evaluation at the time at which one calls $\partial_x(x)$ without the need to evaluate at a point, thus the result is explicit and not encapsulated in an opaque object.

# Printing in SymPy (for the Differential Geometry Module)

This week I was doing some interesting refactoring, that brings quite a bit of new possibilities, however I will write about this in the coming days. For now… printing. Most importantly, any suggestions for improvements are very welcomed.

Printing in SymPy is done really easily. You just add a _print_Whatever() method to the printer and your new class is printed in whatever manner you wish.

For the moment I am printing scalar fields just as the name of the coordinate in bold non-italic, vector fields as $\partial$ with the name as a subscript and differentials as a fancy $d$ followed by the name.

First of all the unicode printer:

There are some obvious problems, like the fact that in unicode there is no subscript for $\theta$ or $y$.

And then the $\LaTeX$ printer:

Now I must find a nice way to print a Point() instance.

# Integral Curves of Vector Fields in SymPy

A week or two ago I implemented some basic functionality for work with integral curves of vector fields. However, I needed to make additional changes in other parts of SymPy in order for the ODE solver to work with systems of equations and with initial conditions. I also wanted to get my plotting module merged so I can show some visualizations if necessary.

Now that all this is ready (even though not everything is merged in SymPy master) I can show you some of the most basic capabilities implemented in the differential geometry module. First, we start with the boilerplate:

### A Simple Field

from sympy.diffgeom import *
from sympy.diffgeom.Rn import * # This gives me:
#   - R2_p - the polar coord system
#   - R2_r - the rectangular coord system
#   - x,y,r,theta - the base scalar fields
#   - e_x, ... - the base vector fields
# Define some fields to play with
# (these are the same fields, defined in two different ways):
vector_field_circular_p = R2_p.e_theta
vector_field_circular_r = -R2.y*R2.e_x + R2.x*R2.e_y
# Define the same point in two different ways
point_p = R2_p.point([1,pi/2])
point_r = R2_r.point([0,1])


The r index is for rectangular coordinate systems and the p index is for polar.

Now using intcurve_diffequ we can generate the differential equations for the integral curve. This function also generates the required initial conditions:

#        vector field      free parameter for    starting point
#                  |        the curve     |     /          coord system
#                  v                      v    v      v-- for the equation
intcurve_diffequ(vector_field_circular_p, t, point_p, R2_p)
# output:
#   d            d
#([ ──(f₀(t)),   ──(f₁(t)) - 1  ],
#   dt           dt
#
#                        π
# [f₀(0) - 1,    f₁(0) - ─      ])
#                        2

intcurve_diffequ(vector_field_circular_p, t, point_p, R2_r)
# output:
#           d                     d
#([ f₁(t) + ──(f₀(t)),   -f₀(t) + ──(f₁(t))  ],
#           dt                    dt
#
# [ f₀(0),               f₁(0) - 1           ])


Here we have equations for the functions $f_0$ and $f_1$ which are by convention the names that intcurve_diffequ gives for the first and second coordinate.

The cool thing is that we can mix the coordinate systems in any way we wish. The code will automatically make the needed coordinate transformation and return the equations in the demanded coordinate system independently of the coordinate systems in which the input objects were defined (at worst you will need to call some simplification routines):

a = intcurve_diffequ(vector_field_circular_p, t, point_p, R2_r)
a == intcurve_diffequ(vector_field_circular_p, t, point_r, R2_r)
# True
a == intcurve_diffequ(vector_field_circular_r, t, point_r, R2_r)
# True
a == intcurve_diffequ(vector_field_circular_r, t, point_p, R2_r)
# True


Solving the equations actually gives (this solver is not yet in SymPy master as of the time of writing):

equ_r, init_r = intcurve_diffequ(vector_field_circular_r, t, point_r, R2_r)
sol_r = dsolve(equ_r+init_r, [Function('f_0')(t), Function('f_1')(t)])
[simplify(s.rewrite(sin)) for s in sol_r] # some simplification
#[f₀(t) = -sin(t), f₁(t) = cos(t)]            # is necessary because
# dsolve returned complex
# exponentials


Even simpler:

equ_p, init_p = intcurve_diffequ(vector_field_circular_p, t, point_p, R2_p)
dsolve(equ_p+init_p, [Function('f_0')(t), Function('f_1')(t)])
# output:
#[f₀(t) = 1,
#             π
# f₁(t) = t + ─
#             2]


This is obviously just a circle (did I mentioned that the vector field that I defined is circular). There is no need to plot it as it is fairly simple. However a slight change will render the field a bit more interesting:

# A circular field that also pushes in radial direction
v_field = R2.e_theta + (r0 - R2.r)*R2.e_r
# An initial position slightly away from the
# equilibrium one.
start_point = R2_p.point([r0+delta, 0])
equ, init = intcurve_diffequ(v_field, t, start_point)
equ
#                d            d
#[ -r₀ + f₀(t) + ──(f₀(t)),   ──(f₁(t)) - 1 ]
#                dt           dt

init
#[-δ - r₀ + f₀(0), f₁(0)]

dsolve(equ+init, [Function('f_0')(t), Function('f_1')(t)])
#            -t
#[f₀(t) = δ⋅ℯ   + r₀, f₁(t) = t]


This gives a spiral tending towards the equilibrium radius $r_0$. Let us extract the coordinates from these equations and plot the resulting curve:

intcurve_coords = [eq.rhs for eq in dsolve(equ+init, [Function('f_0')(t), Function('f_1')(t)])]
intcurve_coords
#    -t
#[δ⋅ℯ   + r₀, t]

# We need this in Cartesian coordinates for the plot routine.
# We could have solved for Cartesian coordinates since the
# beginning, however our current approach permits us to see
# how to use the CoordSys classes to change coordinate systems:
coords_in_cartesian = R2_p.point(intcurve_coords).coords(R2_r)
coords_in_cartesian
#⎡⎛   -t     ⎞       ⎤
#⎢⎝δ⋅ℯ   + r₀⎠⋅cos(t)⎥
#⎢                   ⎥
#⎢⎛   -t     ⎞       ⎥
#⎣⎝δ⋅ℯ   + r₀⎠⋅sin(t)⎦

# Substitute numerical values for the plots:
x,y = coords_in_cartesian.subs({delta:0.5, r0:1})
plot(x,y, (t,0,4*pi))
#Plot object containing:
#[0]: parametric cartesian line:
#      ((1 + 0.5*exp(-t))*cos(t), (1 + 0.5*exp(-t))*sin(t))
#      for t over (0.0, 12.566370614359172)


This is all great, but what happens if one has to work with more complicated fields. For instance the following simple field will not permit analytical solution:

### No Analytical Solution

v_field = R2.e_theta + r0*sin(1 - R2.r/r0)*R2.e_r
equ, init = intcurve_diffequ(v_field, t, start_point)
equ
#[
#        ⎛    f₀(t)⎞   d
#- r₀⋅sin⎜1 - ─────⎟ + ──(f₀(t))
#        ⎝      r₀ ⎠   dt       ,
#
#d
#──(f₁(t)) - 1
#dt           ]


For cases like this one the user can take advantage of one of the numerical ODE solvers from scipy. Or sticking to symbolic work he can use the intcurve_series function that gives the series expansion for the curve:

intcurve_series(v_field, t, start_point, n=1)
#⎡δ + r₀⎤
#⎢      ⎥
#⎣  0   ⎦

intcurve_series(v_field, t, start_point, n=2)
#⎡            ⎛    δ + r₀⎞     ⎤
#⎢δ + r₀⋅t⋅sin⎜1 - ──────⎟ + r₀⎥
#⎢            ⎝      r₀  ⎠     ⎥
#⎢                             ⎥
#⎣              t              ⎦

intcurve_series(v_field, t, start_point, n=4, coeffs=True)
#[
#⎡δ + r₀⎤
#⎢      ⎥
#⎣  0   ⎦,
#
#⎡        ⎛    δ + r₀⎞⎤
#⎢r₀⋅t⋅sin⎜1 - ──────⎟⎥
#⎢        ⎝      r₀  ⎠⎥
#⎢                    ⎥
#⎣         t          ⎦,
#
#⎡     2    ⎛    δ + r₀⎞    ⎛    δ + r₀⎞⎤
#⎢-r₀⋅t ⋅sin⎜1 - ──────⎟⋅cos⎜1 - ──────⎟⎥
#⎢          ⎝      r₀  ⎠    ⎝      r₀  ⎠⎥
#⎢──────────────────────────────────────⎥
#⎢                  2                   ⎥
#⎢                                      ⎥
#⎣                  0                   ⎦,
#
#⎡      ⎛     2                  2            ⎞                ⎤
#⎢    3 ⎜      ⎛    δ + r₀⎞       ⎛    δ + r₀⎞⎟    ⎛    δ + r₀⎞⎥
#⎢r₀⋅t ⋅⎜- sin ⎜1 - ──────⎟ + cos ⎜1 - ──────⎟⎟⋅sin⎜1 - ──────⎟⎥
#⎢      ⎝      ⎝      r₀  ⎠       ⎝      r₀  ⎠⎠    ⎝      r₀  ⎠⎥
#⎢─────────────────────────────────────────────────────────────⎥
#⎢                              6                              ⎥
#⎢                                                             ⎥
#⎣                              0                              ⎦]


However these series do not always converge to the solution, so care should be taken.

There are other amusing possibilities already implemented, however I will write about them another time.

If you want to suggest more interesting examples, please do so in the comments.

# Consistent output from the SymPy solvers (and some ideas about the ODE solver)

The work on the differential geometry module has not progressed much this week. I have fixed some minor issues, docstrings and naming conventions, however I have not done much with respect to the implementation of form fields as there are still some questions about the design to be ironed out.

Instead I focused on studying two of the features that SymPy presents and that I will use heavily. The first is the simplification routine that I will discuss another time. The second one is the different solvers implement in SymPy.

First of all, I have a very hard time getting used to the various output that the solvers provide. The algebraic equations solver, for example, can return either an empty list or a None instance if there is no solution. If there are solutions it can return a list of tuples of solutions, or a list of dictionaries of solutions, or a list of solutions if there is only one variable to solve for, or the solution itself if the solution is unique… Thankfully, a remedy for this was implemented by Christopher Smith. In pull request 1324 he provided some flags that force the solver to return the solutions in a canonical form. I am very grateful for his work and I hope that in the not too distant future what he has done will become the default behavior. I also hope that the solver will get refactored, because internally it is still a mess of different possible outputs that are canonicalized only at the very end. It is possible that I will work on this later.

Then there is the ODE module. I already need this solver in order to work with the integral curves that my code produces[1]. It is a very advanced solver written a few years ago by Aaron Meurer as part of his GSoC project. However, it still does not support systems of ODEs or solving for initial conditions. With Aaron’s help I have started those. The main difficulty is that I am covering only the simplest cases, however the new API must be futureproof. Moreover, here I again have a problem with the various outputs that can be produced by the ODE solver. Solutions are always returned as Equation instances (which is necessary, as some solutions can be in implicit form), however if there are multiple solutions they are returned in a list, while single solutions are returned themselves (not in a list). Anyway, the structure of the ODE module is straightforward so this should not be too hard to work around.

This week I will probably finish my work with the ODE solver and proceed to the form fields.

[1] The code on the differential geometry side is ready, however before showing it I will first extend the ODE solver in order to have more interesting examples.

# Scalar and Vector Fields in SymPy – First Steps

The Differential Geometry module for SymPy already supports some interesting basic operations. However, it would be appropriate to describe its structure before giving any examples.

First of all, there are the Manifold and Patch classes which are just placeholders. They contain all the coordinate charts that are defined on the patch and do not provide, for instance, any topological information. This leads us to the CoordSystem class which contains all the coordinate transformation logic. For example, if I want to define the $\mathbb{R}^2$ euclidean manifold together with the polar and Cartesian coordinate systems I would do:

R2 = Manifold('R^2', 2)
# Patch and coordinate systems.
R2_origin = Patch('R^2_o', R2)
R2_r = CoordSystem('R^2_r', R2_origin)
R2_p = CoordSystem('R^2_p', R2_origin)

# Connecting the coordinate charts.
x, y, r, theta = [Dummy(s) for s in ['x', 'y', 'r', 'theta']]
R2_r.connect_to(R2_p, [x, y],
[sqrt(x**2 + y**2), atan2(y, x)],
inverse=False, fill_in_gaps=False)
R2_p.connect_to(R2_r, [r, theta],
[r*cos(theta), r*sin(theta)],
inverse=False, fill_in_gaps=False)


All following examples will be about the $\mathbb{R}^2$ manifold which is already implemented in the code for the module. Also, notice the use of the inverse and fill_in_gaps flags. When they are set to True the CoordSystem classes try to automatically deduce the inverse transformations using SymPy’s solve function.

Now that we have a manifold we would like to create some fields on it and define some points that belong to the manifold. The points are implemented in the Point class. You need to specify some coordinates when you define the point, however after that the object is completely coordinate-system-idependent.

# You need to specify coordinates in some coordinate system
p = Point(R2_p, [r0, theta0])


Then one can define fields. ScalarField takes points to real numbers and VectorField is an operator on ScalarField taking a scalar field to another scalar field by applying a directional derivative. For example, here x and y are the scalar fields taking a point and returning it’s coordinate and d_dx and d_dy are the vector fields $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$. R2_r is the Cartesian coordinate system and R2_p is the polar one.

R2_r.x(p) == r0*cos(theta0)
# R2_r.d_dx(R2_r.x) is a also scalar field
R2_r.d_dx(R2_r.x)(p) == 1


Looking at how can these fields be defined:

# For a ScalarField you provide the transformation in some coordinate system
R2_r.x = ScalarField(R2_r, [x0, y0], x0)
#                     /      |        ^-------- the result
#     the coord system     the coordinates

# For a VectorField you provide the components in some coordinate system
R2_r.d_dx = VectorField(R2_r, [x0, y0], [1, 0])
#                        /      |         ^-------- the components
#         the coord system     the coordinates


Obviously one can define much more interesting fields. For instance the potential due to a point charge at the origin is:

potential = ScalarField(R2_p, [r0, thata0], -1/r0)
# And to reiterate, the definition does not limit you
# to use it only in this coordinate system. For instance:
potential(R2_r.point([x0, y0])) == 1/sqrt(x0**2 + y0**2)


However there is another more intuitive way to do it:

# R2_p.r is the scalar field that takes a point and returns the r coordinate
potential2 = 1/R2_p.r
potential2(R2_r.point([x0, y0])) == 1/sqrt(x0**2 + y0**2))


And this new object potential2 is not an instance of ScalarField. It is actually a normal SymPy expression tree that contains a ScalarField somewhere in its leafs (namely in this case it is Pow(R2_p.r, -1)). However, due to the change to one of the base classes of SymPy that I did in this pull request it is now possible for such tree to be a python callable, by recursively applying the argument to each callable leaf in the tree. This change is still debated and it may be reverted.

Vector fields can also be build in this manner. However, they pose a problem. What happens when you multiply a vector field and a scalar field? This operation should give another vector field. And here is a possible problem with the approach of recursively callable expressions trees:

# Naively this operation will call a scalar field on
# another scalar field which is nonsense:
(R2_r.x * R2_r.d_dx)(R2_r.x) == R2_r.x(R2_r.x) * R2_r.d_dx(R2_r.x)
#                         nonsense----^


The current solution is for scalar_field(not_point) to return the callable itself. Thus we have:

(R2_r.x * R2_r.d_dx)(R2_r.x) == R2_r.x * R2_r.d_dx(R2_r.x)
#\________________/ \______/    \_______________________/
#   vector field        ^---scalar fields---^


This way there is no need for complicated logic in __mul__ nor is there need for addition subclasses of Expr in order to accommodate this behavior.

There is not much more to be said about the structure of the module. There are some other nice things already implemented like integral curves, however I will discuss these in a later post.

Among the things that should be done at some point:

• Should vector fields be callable on points? If yes, what the result should be? An abstract vector, a tuple of coordinates in a certain coordinate system, something else?
• There are many expressions generated by this code that are not simple enough. I should work on the simplification routines and on the differential geometry module itself in order to get more canonical expressions.
• The last point is also valid about the solvers: some coordinate transformations are too complicated for the solvers to find the inverse transformation.
• Manifold and Patch have name attributes. Are these necessary? What is the role of name attributes in SymPy besides printing?
• Start using Lambda where applicable.
• Follow better the class structure of SymPy.

# Differential Geometry in SymPy – my GSoC project

The next few moths will be interesting. I got accepted in the Google Summer of Code program and I am already starting to worry (irrationally) about the project and the schedule. I will be working on a differential geometry module for SymPy (and time permitting, some more advanced tensor algebra).

Basically, I want to create the boilerplate that will permit defining some scalar/vector/form/tensor field in an arbitrary coordinate system, then doing some coordinate-system-independent operations on the field (with hopefully coordinate-system-independent simplifications) and, finally, getting the equations describing the final result in another arbitrary coordinate system.

With this in mind, the details about the project can be seen on the proposal page. Most of it (all except the tensor algebra that I may work on at the end) is based on the work of Gerald Jay Sussman and Jack Wisdom on “Functional Differential Geometry”. I suppose that this project started as a part of their superb book “Structure and Interpretation of Classical Mechanics” (I really have to read this book if I am to call myself a physicist) and the accompanying “Scheme Mechanics” software. By the way, reading the Scheme code is a wonderful experience. This language is beautiful! The authors are also actively updating their code and a newer, more detailed paper on the project can be found here.

Most of my work will be reading the Scheme code and tracing corner cases in SymPy. My workflow will probably consist of implementing some notion from “Functional Differential Geometry” in SymPy and only when I get to semi-working state comparing with the original Scheme code for ideas, then repeating the process on the next part of the system. This way I will be less susceptible to implementing Scheme idioms in Python.

Writing the final version of each function/class of my module will probably take very little time. Most of the time will be dedicated to removing/studying corner cases and assumptions in SymPy’s codebase (more about these later) and experimenting with different approaches for the module structure (and of course reading/deciphering the work of Wisdom and Sussman).

Finally, I will speak a bit about the aforementioned corner cases and assumptions in the SymPy’s codebase. There are the obvious things like having to derive from Expr if you want to be able to have your class as a part of a symbolic expression. Then there is the fact that Basic (and its subclasses like Expr) do some magic with the arguments for the constructor (saved in expr._args) in order to automagically have:

• rebuildable expression with eval(srepr(expr))==expr
• rebuildable expression with type(expr)(*expr._args)
• some magic with the _hashable_content() method in order to (presumably) have efficient cashing

These details make it a bit unclear how to implement things like CoordinateSystem objects which learn during their existence how to transform to other coordinate systems (thus their implementation in code is a mutable object) but at the same time they are the same mathematical object. Anyway, from what I have seen just having a persistent hash and a correct srepr should be enough. I wonder how tabu it is to change your _args after the creation of the class. Why I need to worry about caching (thus the hash) and rebuilding (thus the srepr) is still unclear to me, but I will dedicate whole posts to them later on when I have the explanation. The caching is presumably for performance. It is the need for all that fancy magic that does not permit duck typing in SymPy. If you do not subclass Basic, you can not be part of SymPy, no matter the interfaces that you support.

Then there is the question of using the container subclasses of Expr. Things like Add and Mul, which I would have expected to be just containers. However, they are not. They also do some partial canonicalization, but at the moment their exact role (and more importantly, what they don’t do) is very unclear to me. There was much discussion about AST trees and canonicalization on the mailing list, if you are interested, and how exactly to separate the different duties that Add and Mul have, but as this is enough work for another GSoC I decided to just stop thinking about that and use them in the simples way possible: just as containers.

There is one drawback to this approach. The sum of two vector fields for example is still a vector field and the object that represents the sum should have all the methods of the object representing one of the fields, however Add does not have the same methods as VectorField. The solution that was already used in the matrix module was to create classes like MatrixAdd, and the same was done in the quantum physics module. However, I fear such proliferation of classes for it becomes unsustainable as the number of different modules grows. What happens when I want to combine two objects from the disjoint modules? This is why I simply use Add and Mul and implement helper functions that are not part of the class. These helper functions will ideally be merged in some future canonicalizer that comes about from separating the container and canonicalization parts of Add and Mul.

One last remark is that I will probably have to work on sympify and the sympification of matrices, as I will use coordinate tuples (column vectors) quite often. Then there is the distinction between Application and Function and all the magic with metaclasses that seems very hard to justify. But probably I will write entire posts in which I try to understand why the metaclasses in the core are necessary.

# Gaussian optics in Sympy: studying a lens

In the last post I presented the small module for gaussian optics that I’ve created for sympy. Here I’ll try to show a real problem solved with it and with sympy.

[update]: The api is still changing. Also here I’m using ABCD matrices for something that is more easily done with conjugation relations. It’s that way so I can show how the matrices are used in the module.

The problem is finding how a laser beam is modified by passing trough a lens. What is the new waist? Where is it obtained?

### Making the boilerplate

Let us first introduce the variables for the input beam and the lens. Notice that as input for the creation of the BeamParameter the Rayleigh length $z_r$ is used and not the waist $w_0 = \sqrt{\frac{z_r \lambda}{\pi}}$. That’s the way for now just because it gives simpler expressions:

In [1]: from sympy.physics.gaussopt import *

In [2]:

In [3]: z_r = Symbol("z_r", positive=True)

In [4]: l = Symbol("lambda", positive=True)

In [5]: z = Symbol("z", real=True)

In [6]: f = Symbol("f", real=True)

In [7]: z_r
Out[7]: zᵣ

In [8]: l
Out[8]: λ


Defining the parameter and the lens and then calculating the parameter for the output beam:

In [9]: q_in = BeamParameter(z, z_r, l)

In [10]: tl = ThinLens(f)

In [11]:

In [12]: q_out= tl*q_in


Let’s see what does this give for the waist of the output parameter:

In [13]: w_0_out = q_out.w_0

In [14]: z_out = q_out.z

In [15]: w_0_out
Out[15]:
⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽
⎽⎽⎽   ⎽⎽⎽⎽        ╱         1
╲╱ λ ⋅╲╱ zᵣ ⋅      ╱  ──────────────────
╱              2     2
╱        2⋅z   z    zᵣ
╱     1 - ─── + ── + ───
╱           f     2     2
╲╱                 f     f
─────────────────────────────────────────
⎽⎽⎽
╲╱ π

In [16]: z_out
Out[16]:
2                        2
z                      z                       zᵣ
────────────────── - ────────────────────── - ──────────────────────
2     2     ⎛           2     2⎞     ⎛           2     2⎞
2⋅z   z    zᵣ      ⎜    2⋅z   z    zᵣ ⎟     ⎜    2⋅z   z    zᵣ ⎟
1 - ─── + ── + ───   f⋅⎜1 - ─── + ── + ───⎟   f⋅⎜1 - ─── + ── + ───⎟
f     2     2     ⎜     f     2     2⎟     ⎜     f     2     2⎟
f     f      ⎝          f     f ⎠     ⎝          f     f ⎠

In [17]: together(z_out) # to simplify it a bit
Out[17]:
⎛       2     2⎞
f⋅⎝f⋅z - z  - zᵣ ⎠
─────────────────────
2            2     2
f  - 2⋅f⋅z + z  + zᵣ


Those are still mostly unreadable. To simplify the expressions further I’ll introduce the dimensionless parameters $Z_i = \frac{z}{f}$ and $Z_r = \frac{z_r}{f}$. Also there is the obvious substitution for the input beam waist.

In [19]: Z_in = Symbol("Z_i", real=True)

In [20]: Z_r = Symbol("Z_r", real=True)

In [21]:

In [22]: Z_out = z_out/f

In [23]: Z_out = z_out.subs({z : Z_in*f, z_r : Z_r*f})

In [24]: Z_out = together(Z_out)

In [25]: Z_out
Out[25]:
⎛    2          2⎞
f⋅⎝- Zᵢ  + Zᵢ - Zᵣ ⎠
────────────────────
2            2
Zᵢ  - 2⋅Zᵢ + Zᵣ  + 1


And same for the waist:

In [26]: w_0_in = Symbol("w_0^i", positive=True)

In [27]: w_0_in
Out[27]: wⁱ₀

In [28]: w_0_in_expression = q_in.w_0

In [29]: w_0_in_expression
Out[29]:
⎽⎽⎽   ⎽⎽⎽⎽
╲╱ λ ⋅╲╱ zᵣ
────────────
⎽⎽⎽
╲╱ π

In [30]:

In [31]: w_0_out = w_0_out.subs({w_0_in_expression : w_0_in, z : Z_in*f, z_r : Z_r*f})

In [32]: w_0_out
Out[32]:
wⁱ₀
─────────────────────────
⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽
╱   2            2
╲╱  Zᵢ  - 2⋅Zᵢ + Zᵣ  + 1


Let’s study how the waist changes. First of all, there better be no solutions to the quadratic equations in the denominator. We want it always positive in order to have real waist. Let’s extract the expression.

In [41]: quad_equ = w_0_out.args[1].args[0]

Out[42]:
2            2
Zᵢ  - 2⋅Zᵢ + Zᵣ  + 1


Ok, obviously the discriminant is $-Z_r^2 < 0$ and we have no real solutions. But just as a crosscheck let’s pass it trough sympy’s solve.

In [44]: solve(quad_equ, Z_in)
Out[44]: [-ⅈ⋅│Zᵣ│ + 1, ⅈ⋅│Zᵣ│ + 1]


Great. The waist is real and positive.

### Some useful results

Until now we haven’t done anything really interesting from physical point of view. We have just derived some expressions. Here are some more interesting questions.

#### What is the maximal possible waist obtainable with a single lens?

To find it we minimize the denominator. I lost a few minutes searching for a maximize function in sympy but didn’t find one. But obviously the maximum is at $Z_i = 1$ and so $w_{out}^{max} = \frac{w_{in}}{Z_r}=\frac{f \lambda}{\pi w_{in}}$.

It’s quite interesting that the maximum is for $Z_i = 1$ (i.e. $z_i = f$). Let’s see what does that mean for a convergent lens (it’s mostly the same for a divergent one). The beam is converging to the focal point of the lens so after refraction the beam is parallel. And we know that in order to have parallel beam (i.e. minimize divergence) we must have big waist:

In [39]: q_in.divergence.subs({z_r : waist2rayleigh(w_0_in,l)})
Out[39]:
λ
─────
π⋅wⁱ₀


#### What about the minimal waist?

The predicted waist goes to zero for very big $Z_i$. But the gauss beam is a solution to the paraxial equation which is invalid for great curvatures of the beam front. We have the following limit:

In [41]: q_out.waist_aproximation_limit
Out[41]:
2⋅λ
───
π


It would be interesting to know for what distance is this obtainable. The sad thing is that currently sympy can solve only very explicit equations. To rewrite it I do the following:

In [48]: waist_limit = q_out.waist_aproximation_limit

In [49]: equation = (w_0_out.args[0] / waist_limit)**2 - (1/w_0_out.args[1])**2

In [50]: equation
Out[50]:
2    2
2            2       π ⋅wⁱ₀
- Zᵢ  + 2⋅Zᵢ - Zᵣ  - 1 + ───────
2
4⋅λ

In [51]:

In [52]: solve(equation, Z_in)
Out[52]:
⎡       ⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽         ⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎽⎤
⎢      ╱       2  2    2    2         ╱       2  2    2    2 ⎥
⎢    ╲╱  - 4⋅Zᵣ ⋅λ  + π ⋅wⁱ₀        ╲╱  - 4⋅Zᵣ ⋅λ  + π ⋅wⁱ₀  ⎥
⎢1 - ─────────────────────────, 1 + ─────────────────────────⎥
⎣               2⋅λ                            2⋅λ           ⎦


There are many more interesting questions but that’s all for now.

# Gaussian optics in Sympy

[update] The API shown here will most definitely change.

During my internship at LKB, Paris I’m expected to prepare a number of optical setups. I thought that it would be interesting to do the work in SymPy. That’s why I started working on the gaussopt module in sympy.physics (for the foreseeable future it won’t be upstream). I plan to make it small and basic (containing only the most basic functionality).

For the moment the only things on the blueprint are ray transfer matrices, geometric and gaussian beams.

To define a ray transfer matrix for free space or for a thin lens you just use the build in classes (a general matrix is also available):

In [1]: from sympy.physics.gaussopt import *

In [2]: distance = Symbol('distance', positive=True)

In [3]: focus = Symbol('focus', real=True)

In [4]: fs = FreeSpace(distance)

In [5]: tl = ThinLens(focus)

In [6]: fs
Out[6]:
⎡1  distance⎤
⎢           ⎥
⎣0     1    ⎦

In [7]: tl
Out[7]:
⎡  1    0⎤
⎢        ⎥
⎢  -1    ⎥
⎢─────  1⎥
⎣focus   ⎦



The module supports geometric beams but gaussian beams are more interesting, so I’ll show only them. The formalism that supports them is based on the complex beam parameter. Here is how you define a beam at its waist:

In [8]: waist = Symbol('waist', positive=True)

In [9]: wavelen = Symbol('wavelen', positive=True)

In [10]: z_rayleigh = waist**2 * pi / wavelen

In [11]: p = BeamParameter(0, z_rayleigh, wavelen)

In [12]: p.q
Out[12]:
2
ⅈ⋅π⋅waist
──────────
wavelen

In [13]: p.w
Out[13]: waist

In [14]: p.w_0
Out[14]: waist

In [15]: p.R
Out[15]: 0

In [16]: p.z
Out[16]: 0

In [17]: p.z_r
Out[17]:
2
π⋅waist
────────
wavelen



For the different properties shown here the names should be suggestive enough. The pretty_printer supports Greek letters for the wavelen but this is not so important here.

Now I would like to calculate how a telescopic system will act on the beam.  For example

In [18]: tl1 = ThinLens(focus/10)

In [19]: p_out = tl1*( fs*( tl*p))



The last line takes about 60 seconds and produces very UNsimplified expression. I suppose that I should do some implicit simplification. Also notice the parentheses. I should check if they are really required.

simplify(p_out) takes too much time so I’ll cheat a bit (from symbolics to numerics):

In [20]: p_out1 = p_out.subs({focus:1, distance:1*11/10, wavelen:532e-9, waist:5e-3})

In [21]: N(p_out1.q)
Out[21]: -0.110000000001061 + 1.47631233721549⋅ⅈ

In [22]: N(p_out1.w_0)
Out[22]: 0.000500000000000380

In [23]: N(p_out1.z)
Out[23]: -0.110000000001061



So with balanced telescopic system of power 10 the waist diminished 10 times and will be obtained at 11 cm after the second lens. Nice.

It would be nice to post some more realistic problems later on.