3) Second, improved version.
For this one, I changed realloc so that it attempts a speculative realloc on the stack (and if that fails, we reallocate everything and waste a bit of space for nothing), which gives:
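MAY_INLINE void *MAY_REALLOC(void *x, size_t o, size_t n)
{
  size_t n_a = MAY_ALIGNED_SIZE (n);
  /* Shrinking or same size: the existing block is already big enough. */
  if (MAY_UNLIKELY (o >= n))
    return x;
  size_t o_a = MAY_ALIGNED_SIZE (o), diff = n_a - o_a;
  char *old_top, *tmp;
  /* Speculative in-place growth: x must be the last block on the heap,
     and no other thread may have moved the top in the meantime
     (MAY_ATOMIC_ADD is used here as returning the previous top). */
  if (may_c.Heap.current_mark <= (char*) x
      && (old_top = may_c.Heap.top) == (char*) x + o_a
      && (tmp = MAY_ATOMIC_ADD (may_c.Heap.top, diff))
      && tmp == old_top
      && tmp + diff < may_c.Heap.limit)
    return x;
  else
    /* Speculation failed: allocate a fresh block and copy. If the atomic
       add had already been done, the diff bytes it reserved are lost. */
    return memcpy (MAY_ALLOC(n_a), x, o);
}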
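To play with the idea outside MAY, here is a minimal sketch of the same speculative realloc on a shared bump heap, written with C11 atomics. Everything in it (the fixed `arena`, `heap_top`, `bump_alloc`, `bump_realloc`, the 16-byte alignment) is a hypothetical stand-in, not MAY's actual code, and the arena never grows.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for may_c.Heap: a fixed arena with an atomic top. */
#define ARENA_SIZE 4096
#define ALIGN_UP(n) (((n) + 15u) & ~(size_t)15u)

static _Alignas(16) char arena[ARENA_SIZE];
static _Atomic uintptr_t heap_top;

static void heap_reset(void)
{
    atomic_store(&heap_top, (uintptr_t)arena);
}

/* Bump allocation: one atomic fetch-and-add per call. */
static void *bump_alloc(size_t n)
{
    size_t n_a = ALIGN_UP(n);
    uintptr_t old = atomic_fetch_add(&heap_top, n_a);
    if (old + n_a > (uintptr_t)arena + ARENA_SIZE)
        return NULL;              /* arena exhausted (this sketch never grows) */
    return (void *)old;
}

/* Speculative realloc: grow in place if x is still the last block. */
static void *bump_realloc(void *x, size_t o, size_t n)
{
    assert(x != NULL);
    if (o >= n)
        return x;
    size_t o_a = ALIGN_UP(o), diff = ALIGN_UP(n) - o_a;
    if (diff == 0)
        return x;                 /* the alignment padding already has room */
    uintptr_t end = (uintptr_t)x + o_a;
    if (atomic_load(&heap_top) == end) {
        uintptr_t old = atomic_fetch_add(&heap_top, diff);
        if (old == end && old + diff <= (uintptr_t)arena + ARENA_SIZE)
            return x;             /* nobody allocated in between */
        /* Lost the race: the diff bytes just reserved are wasted. */
    }
    void *p = bump_alloc(n);
    return p ? memcpy(p, x, o) : NULL;
}
```

After `heap_reset()`, growing the most recent block returns the same pointer, while growing an older block falls back to allocate-and-copy.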
With this change, memory consumption is back to a reasonable level and MT clearly helps the computations:
Single Thread:
./t-pika "(1+(x+y+z+t+1)^20)*(1+x+y+z+t)^20" -oexpand > /dev/null
CPU time=6320ms
Two Threads:
./t-pika "(1+(x+y+z+t+1)^20)*(1+x+y+z+t)^20" -oexpand > /dev/null
CPU time=4591ms
On the other hand, the timings that do not use MT regress, because allocation is much slower (and yes, an atomic_add is a locked read-modify-write, far more expensive than a plain add...):
MAY V0.7.4 (GMP V5.0.5 MPFR V3.1.0 CC=gcc CFLAGS=-fexceptions -O3 -fomit-frame-pointer -funroll-loops -ffast-math -march=native -ffunction-sections -fdata-sections -static -flto -ffat-lto-objects -DMAY_USE_THREAD)
Start -- Base:0x24ea000 Top:0x24f2130 Used:33072 MaxUsed:33072 Max:10737418240
Construct (3*(a*x+b*y+c*z) with a=1/2, b=2/3 and c=4/5...0.00296ms [337837 execs/sec]
eval (sum ai*ai*ai) - quite different - N=100......0.01ms
eval (sum ai*ai*ai) - quite similar - N=100......0.01ms
eval (sum ai*ai*ai) - quite different - N=1000......0.08ms
eval (sum ai*ai*ai) - quite similar - N=1000......0.08ms
eval (sum ai*ai*ai) - quite different - N=10000......1.25ms
eval (sum ai*ai*ai) - quite similar - N=10000......1.50ms
eval (sum ai*ai*ai) - quite different - N=100000......56.00ms
eval (sum ai*ai*ai) - quite similar - N=100000......60.00ms
eval (sum ai*ai*ai) - quite different - N=1000000......687.00ms
eval (sum ai*ai*ai) - quite similar - N=1000000......831.00ms
eval(sum(i*x^i, n=0..20000)...14ms
eval(x+f(x)+f(f(x))+...+f(5000)(x)), subs f to id...532ms
eval(sum(i,i=0..20000)+x+sum(i,i=0..20000))...3ms
eval(sum(sin(n*PI/6), n=0..20000)...28ms
eval((2+3*I/4)^1000000)...191ms
evalf(sin(1+PI)^2+3^sqrt(1+PI^2)) to 100000 bits...139ms
expand ((a0+...a500)^2), replace a0, reeval...634ms
expand ((x0+...x2+1)^16*(1+(x0+...x2+1)^16))...40ms
expand ((x0+...x3+1)^20*(1+(x0+...x3+1)^20))...4427ms
expand ((x+y^400000000000+z)^20*(1+x+y+z^-1)^20)...830ms
expand ((1+x)^1000*(2+x)^1000)...304ms
expand ((17+x)^600*(42+x)^600)...250ms
expand ((1+sqrt(5))^65000)...3ms
expand ((1+x+y)^500)...562ms
expand (expand((1+x)^50*(1+y)^50) * expand((1-x)^50*(2-y)^50))...88ms
expand (expand((a+b+c+d+e+f+g+h)^5) * expand((a+b+c+d+e+f+g+h)^5))...622ms
expand ((1+x+...+x^65535)*(1+2*x+x^2+...+x^65535))...7897ms
divide ( (1+x)^1000+1 , (1-x)^500, x)...230ms
divide ( (1+x)^1000+1 , x^3-5*x+17, x)...3ms
divide ( (1+x+y^2)^50+1 , (1-x)^25+y, x)...23ms
divide ( (1+x+y^2)^25+1 , x^3*y-5*x*y^42+17*y+1, x)...3ms
divide ( (1+x+y^2)^50+1 , (1-x)^25+y, {x,y})...823ms
divide ( (1+x+y^2)^25+1 , x^3*y-5*x*y^42+17*y+1, {x,y})...1159ms
gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...2ms
expand+gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...8ms
gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...5ms
expand+gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...4ms
gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...5ms
expand+gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...1750ms
gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...0ms
expand+gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...2ms
gcd ( (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^4*(3*x.. , (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^3*(3*x.. )...1ms
expand+gcd ( (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^4*(3*x.. , (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^3*(3*x.. )...58ms
gcd ( (x^2-y^2)*(a+b)^10 , (x-y)*(a-c)^10 )...1ms
expand+gcd ( (x^2-y^2)*(a+b)^10 , (x-y)*(a-c)^10 )...1ms
gcd ( (x-y)^50+a , (x+y)^50 )...1ms
expand+gcd ( (x-y)^50+a , (x+y)^50 )...0ms
gcd ( -2107-7967*x+19271*x^50+551*x^49-39300*x^48+236.. , -2401-3773*x-9484*x^50-4086*x^49-31296*x^48-216.. )...2ms
gcd ( -1368+2517*x-62928*x^500+126728*x^499-139637*x^.. , -6336-11784*x+4932*x^500+50975*x^499+97099*x^49.. )...193ms
gcd ( 3772-5709*x-28359*x^500+38352*x^499-18303*x^498.. , -3680-4456*x+90816*x^500+35952*x^499+89870*x^49.. )...779ms
Compute GCD in Z/181Z
gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...2ms
expand+gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...11ms
gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...15ms
expand+gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...14ms
gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...7ms
expand+gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...1926ms
gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...2ms
expand+gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...27ms
Compute GCD in Z/43051Z
gcd ( -936639990+26248623452*x^47-30174373832*x^46-19.. , 1882371920-30937311949*x^47+6616907060*x^46-742.. )...16ms
gcd ( -96986421453*x^426-75230349764*x^425+1282023978.. , 13695560229*x^426+16181971852*x^425-90237548124.. )...1597ms
gcd ( 138233755629*x^426-22168686741*x^425-6531403218.. , -12298243395*x^426-33246261919*x^425+1488121621.. )...6880ms
GCDFT1: gcd = 1 (n=14)...109ms
GCDFT2: gcd of linearly dense quartic inputs with quadratic GCDs (n=7)...63ms
GCDFT3: gcd of sparse inputs where degree // to #vars (n=4)...30ms
GCDFT4: gcd of sparse inputs where degree // to #vars (second) (n=4)...17ms
GCDFT5: gcd quadratic non-monic with other quadratic factors (n=6)...8ms
GCDFT7: gcd completly dense non-monic quadratic inputs (n=8)...1628ms
GCDFT8: gcd sparse non-monic quadratic inputs with linear gcds (n=10)...32ms
GCDFT9: trivariate inputs with increasing degrees (n=17)...1ms
GCDFT10: trivariate polynomials whose GCD has common factors with cofactors (j=7,k=11)...78ms
diff ( x/(1+sin(x^(y+x^2)))^2 , x)...0.00097ms
R1: f(f(...) with f(z)=sqrt(1/3)*z^2 + i/3 and n=17...462ms
R2: hermite(25,y)...899ms
R6: sum(((x+sin(i))/x+(x-sin(i))/x) for n=100000...1965ms
R7: 100000 random float eval of x^24+34*x^12+45*x^3+9*x^18 +34*x^10+ 32*x^21...772ms
R8:right(x^2,0,5,100000) ...168ms
S2:expand((x^sin(x) + y^cos(y) - z^(x+y))^500) ...626ms
S3:diff(expand((x^y + y^z + z^x)^500,x) ...2393ms
S4:series(sin(x)*cos(x),x=0,500) ...198ms
series(tan(2+x),x=0,100) ...2400ms
Rationalize nested expression 1 ...464ms
Rationalize sum((i*y*t^i)/(y+i*t)^i),i=1..10 ...625ms
Rationalize sum((i*y*t^i)/(y+abs(5-i)*t)^i),i=1..10 ...62ms
End -- Base:0x24ea000 Top:0x24f2170 Used:33136 MaxUsed:796873120 Max:10737418240
Total time 46709ms
whereas the classic build (without MT support) gives:
MAY V0.7.4 (GMP V5.0.5 MPFR V3.1.0 CC=gcc CFLAGS=-fexceptions -O3 -fomit-frame-pointer -funroll-loops -ffast-math -march=native -ffunction-sections -fdata-sections -static -flto -ffat-lto-objects)
Start -- Base:0x1390000 Top:0x1398130 Used:33072 MaxUsed:33072 Max:10737418240
Construct (3*(a*x+b*y+c*z) with a=1/2, b=2/3 and c=4/5...0.00181ms [551470 execs/sec]
eval (sum ai*ai*ai) - quite different - N=100......0.01ms
eval (sum ai*ai*ai) - quite similar - N=100......0.01ms
eval (sum ai*ai*ai) - quite different - N=1000......0.10ms
eval (sum ai*ai*ai) - quite similar - N=1000......0.10ms
eval (sum ai*ai*ai) - quite different - N=10000......1.71ms
eval (sum ai*ai*ai) - quite similar - N=10000......1.50ms
eval (sum ai*ai*ai) - quite different - N=100000......56.00ms
eval (sum ai*ai*ai) - quite similar - N=100000......56.00ms
eval (sum ai*ai*ai) - quite different - N=1000000......692.00ms
eval (sum ai*ai*ai) - quite similar - N=1000000......792.00ms
eval(sum(i*x^i, n=0..20000)...4ms
eval(x+f(x)+f(f(x))+...+f(5000)(x)), subs f to id...516ms
eval(sum(i,i=0..20000)+x+sum(i,i=0..20000))...4ms
eval(sum(sin(n*PI/6), n=0..20000)...20ms
eval((2+3*I/4)^1000000)...184ms
evalf(sin(1+PI)^2+3^sqrt(1+PI^2)) to 100000 bits...140ms
expand ((a0+...a500)^2), replace a0, reeval...544ms
expand ((x0+...x2+1)^16*(1+(x0+...x2+1)^16))...52ms
expand ((x0+...x3+1)^20*(1+(x0+...x3+1)^20))...6388ms
expand ((x+y^400000000000+z)^20*(1+x+y+z^-1)^20)...724ms
expand ((1+x)^1000*(2+x)^1000)...304ms
expand ((17+x)^600*(42+x)^600)...248ms
expand ((1+sqrt(5))^65000)...0ms
expand ((1+x+y)^500)...500ms
expand (expand((1+x)^50*(1+y)^50) * expand((1-x)^50*(2-y)^50))...148ms
expand (expand((a+b+c+d+e+f+g+h)^5) * expand((a+b+c+d+e+f+g+h)^5))...492ms
expand ((1+x+...+x^65535)*(1+2*x+x^2+...+x^65535))...7737ms
divide ( (1+x)^1000+1 , (1-x)^500, x)...164ms
divide ( (1+x)^1000+1 , x^3-5*x+17, x)...0ms
divide ( (1+x+y^2)^50+1 , (1-x)^25+y, x)...16ms
divide ( (1+x+y^2)^25+1 , x^3*y-5*x*y^42+17*y+1, x)...4ms
divide ( (1+x+y^2)^50+1 , (1-x)^25+y, {x,y})...616ms
divide ( (1+x+y^2)^25+1 , x^3*y-5*x*y^42+17*y+1, {x,y})...916ms
gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...0ms
expand+gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...8ms
gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...4ms
expand+gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...8ms
gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...0ms
expand+gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...1528ms
gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...0ms
expand+gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...4ms
gcd ( (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^4*(3*x.. , (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^3*(3*x.. )...0ms
expand+gcd ( (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^4*(3*x.. , (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^3*(3*x.. )...48ms
gcd ( (x^2-y^2)*(a+b)^10 , (x-y)*(a-c)^10 )...0ms
expand+gcd ( (x^2-y^2)*(a+b)^10 , (x-y)*(a-c)^10 )...0ms
gcd ( (x-y)^50+a , (x+y)^50 )...0ms
expand+gcd ( (x-y)^50+a , (x+y)^50 )...0ms
gcd ( -2107-7967*x+19271*x^50+551*x^49-39300*x^48+236.. , -2401-3773*x-9484*x^50-4086*x^49-31296*x^48-216.. )...0ms
gcd ( -1368+2517*x-62928*x^500+126728*x^499-139637*x^.. , -6336-11784*x+4932*x^500+50975*x^499+97099*x^49.. )...128ms
gcd ( 3772-5709*x-28359*x^500+38352*x^499-18303*x^498.. , -3680-4456*x+90816*x^500+35952*x^499+89870*x^49.. )...500ms
Compute GCD in Z/181Z
gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...4ms
expand+gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...8ms
gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...12ms
expand+gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...12ms
gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...4ms
expand+gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...1404ms
gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...0ms
expand+gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...16ms
Compute GCD in Z/43051Z
gcd ( -936639990+26248623452*x^47-30174373832*x^46-19.. , 1882371920-30937311949*x^47+6616907060*x^46-742.. )...12ms
gcd ( -96986421453*x^426-75230349764*x^425+1282023978.. , 13695560229*x^426+16181971852*x^425-90237548124.. )...1128ms
gcd ( 138233755629*x^426-22168686741*x^425-6531403218.. , -12298243395*x^426-33246261919*x^425+1488121621.. )...4816ms
GCDFT1: gcd = 1 (n=14)...108ms
GCDFT2: gcd of linearly dense quartic inputs with quadratic GCDs (n=7)...52ms
GCDFT3: gcd of sparse inputs where degree // to #vars (n=4)...24ms
GCDFT4: gcd of sparse inputs where degree // to #vars (second) (n=4)...16ms
GCDFT5: gcd quadratic non-monic with other quadratic factors (n=6)...4ms
GCDFT7: gcd completly dense non-monic quadratic inputs (n=8)...1357ms
GCDFT8: gcd sparse non-monic quadratic inputs with linear gcds (n=10)...28ms
GCDFT9: trivariate inputs with increasing degrees (n=17)...0ms
GCDFT10: trivariate polynomials whose GCD has common factors with cofactors (j=7,k=11)...60ms
diff ( x/(1+sin(x^(y+x^2)))^2 , x)...0.00066ms
R1: f(f(...) with f(z)=sqrt(1/3)*z^2 + i/3 and n=17...456ms
R2: hermite(25,y)...632ms
R6: sum(((x+sin(i))/x+(x-sin(i))/x) for n=100000...1772ms
R7: 100000 random float eval of x^24+34*x^12+45*x^3+9*x^18 +34*x^10+ 32*x^21...736ms
R8:right(x^2,0,5,100000) ...148ms
S2:expand((x^sin(x) + y^cos(y) - z^(x+y))^500) ...524ms
S3:diff(expand((x^y + y^z + z^x)^500,x) ...1916ms
S4:series(sin(x)*cos(x),x=0,500) ...192ms
series(tan(2+x),x=0,100) ...2900ms
Rationalize nested expression 1 ...364ms
Rationalize sum((i*y*t^i)/(y+i*t)^i),i=1..10 ...616ms
Rationalize sum((i*y*t^i)/(y+abs(5-i)*t)^i),i=1..10 ...56ms
End -- Base:0x1390000 Top:0x1398170 Used:33136 MaxUsed:983431120 Max:10737418240
Total time 42900ms
We have performance regressions... which is a pity,
and on top of that the GC is still disabled during MT, so some algorithms can use much more memory...
(I forgot to say: if we compile without MT support, there is indeed no regression, and that, at least, is good.)
I was about to tell myself: damn, I really do need a separate heap for each thread, but how?
4) Third version:
Each thread has its own separate heap. But once a thread has finished its work, its heap is handed to the GC, which frees it in due time (at the next GC involving the mark recorded at the synchronization point), and a new heap is created in the meantime.
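As a rough illustration of this scheme, here is a minimal sketch, not MAY's actual code. All names (`heap_t`, `heap_alloc`, `heap_retire`, `gc_release`) are hypothetical, heaps are backed by plain `malloc`, marks are modelled as integers, and I assume that a GC collecting down to a given mark may free every retired heap whose recorded mark is at or above it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define HEAP_SIZE (1u << 16)

typedef struct heap {
    char *base, *top, *limit;
    size_t sync_mark;             /* mark recorded at the sync point */
    struct heap *next;            /* link in the retired list */
} heap_t;

static _Thread_local heap_t *my_heap;   /* private to the thread: no atomics */
static _Atomic(heap_t *) retired;       /* heaps waiting for the GC */

static heap_t *heap_new(size_t size)
{
    heap_t *h = malloc(sizeof *h);
    assert(h != NULL);
    h->base = h->top = malloc(size);
    assert(h->base != NULL);
    h->limit = h->base + size;
    h->sync_mark = 0;
    h->next = NULL;
    return h;
}

/* Plain bump allocation in the calling thread's own heap. */
static void *heap_alloc(size_t n)
{
    if (my_heap == NULL)
        my_heap = heap_new(HEAP_SIZE);  /* created lazily after a retire */
    n = (n + 15u) & ~(size_t)15u;
    if (n > (size_t)(my_heap->limit - my_heap->top))
        return NULL;
    void *p = my_heap->top;
    my_heap->top += n;
    return p;
}

/* Lock-free push of one heap onto the retired list. */
static void push_retired(heap_t *h)
{
    h->next = atomic_load(&retired);
    while (!atomic_compare_exchange_weak(&retired, &h->next, h)) { }
}

/* Called by a worker once its job is done: hand its heap to the GC. */
static void heap_retire(size_t mark)
{
    heap_t *h = my_heap;
    if (h == NULL)
        return;
    h->sync_mark = mark;
    push_retired(h);
    my_heap = NULL;
}

/* Called by the GC: free retired heaps whose mark is >= mark.
   Returns the number of heaps freed. */
static int gc_release(size_t mark)
{
    int freed = 0;
    heap_t *h = atomic_exchange(&retired, NULL);
    while (h != NULL) {
        heap_t *next = h->next;
        if (h->sync_mark >= mark) {
            free(h->base);
            free(h);
            freed++;
        } else {
            push_retired(h);            /* not concerned by this GC yet */
        }
        h = next;
    }
    return freed;
}
```

The point of the design is that the hot path (`heap_alloc`) touches only thread-local state, so the atomic operations are paid once per retired heap instead of once per allocation.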
This method does not consume any extra real memory; however, it consumes a lot more virtual memory...
Another problem is that the GC becomes more complex: it has to determine whether a given piece of memory belongs to a heap that is about to be freed or not (it is not complicated, but it takes time!)
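The membership test itself can be sketched as follows. This is only an assumption about how such a check might look, not MAY's implementation: the retired heaps are kept as a table of address ranges sorted by base address, and `range_t` and `in_retired_heap` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [base, limit) of one retired heap. */
typedef struct { uintptr_t base, limit; } range_t;

/* Returns 1 if p lies inside one of the n ranges, 0 otherwise.
   The ranges must be sorted by base address and must not overlap. */
static int in_retired_heap(const range_t *r, size_t n, const void *p)
{
    assert(r != NULL || n == 0);
    uintptr_t a = (uintptr_t)p;
    size_t lo = 0, hi = n;
    while (lo < hi) {                   /* binary search over the ranges */
        size_t mid = lo + (hi - lo) / 2;
        if (a < r[mid].base)
            hi = mid;
        else if (a >= r[mid].limit)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}
```

Each lookup costs a number of comparisons logarithmic in the number of retired heaps, and the GC pays that cost for every pointer it examines, which is where the extra time goes.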
(I'll spare you all the sordid details of the tuning, the random bugs, and the nervous breakdowns while debugging the thing.)
On to the test:
Single Thread:
./t-pika "(1+(x+y+z+t+1)^20)*(1+x+y+z+t)^20" -oexpand > /dev/null
CPU time=6352ms
Multi-Thread:
./t-pika "(1+(x+y+z+t+1)^20)*(1+x+y+z+t)^20" -oexpand > /dev/null
CPU time=4454ms
OK, it works well!
We do use quite a lot of virtual memory: 395 GB mapped, but only 84 MB actually allocated.
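The gap between the two figures is what one would expect when heaps are reserved as address space and only backed by physical pages once touched. The post does not say how MAY reserves its heaps; the sketch below assumes a Linux-style anonymous `mmap` with `MAP_NORESERVE`, and `reserve_heap` and `release_heap` are hypothetical helpers.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve size bytes of address space. Pages only become resident
   (real memory) the first time they are written to. */
static char *reserve_heap(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : (char *)p;
}

/* Give the whole range back to the kernel. */
static void release_heap(char *base, size_t size)
{
    int rc = munmap(base, size);
    assert(rc == 0);
    (void)rc;
}
```

Reserving a large range per thread this way inflates the virtual size reported by the system, while the resident size only follows the pages that were really used.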
And the full benchmark run shows no regression:
MAY V0.7.4 (GMP V5.0.5 MPFR V3.1.0 CC=gcc CFLAGS=-fexceptions -O3 -fomit-frame-pointer -funroll-loops -ffast-math -march=native -ffunction-sections -fdata-sections -static -flto -ffat-lto-objects -DMAY_WANT_THREAD)
Start -- Base:0x1ae3000 Top:0x1aeb130 Used:33072 MaxUsed:33072 Max:10737422335
Construct (3*(a*x+b*y+c*z) with a=1/2, b=2/3 and c=4/5...0.00232ms [430416 execs/sec]
eval (sum ai*ai*ai) - quite different - N=100......0.02ms
eval (sum ai*ai*ai) - quite similar - N=100......0.02ms
eval (sum ai*ai*ai) - quite different - N=1000......0.08ms
eval (sum ai*ai*ai) - quite similar - N=1000......0.08ms
eval (sum ai*ai*ai) - quite different - N=10000......1.25ms
eval (sum ai*ai*ai) - quite similar - N=10000......1.25ms
eval (sum ai*ai*ai) - quite different - N=100000......53.00ms
eval (sum ai*ai*ai) - quite similar - N=100000......56.00ms
eval (sum ai*ai*ai) - quite different - N=1000000......680.00ms
eval (sum ai*ai*ai) - quite similar - N=1000000......810.00ms
eval(sum(i*x^i, n=0..20000)...10ms
eval(x+f(x)+f(f(x))+...+f(5000)(x)), subs f to id...516ms
eval(sum(i,i=0..20000)+x+sum(i,i=0..20000))...2ms
eval(sum(sin(n*PI/6), n=0..20000)...24ms
eval((2+3*I/4)^1000000)...183ms
evalf(sin(1+PI)^2+3^sqrt(1+PI^2)) to 100000 bits...139ms
expand ((a0+...a500)^2), replace a0, reeval...554ms
expand ((x0+...x2+1)^16*(1+(x0+...x2+1)^16))...35ms
expand ((x0+...x3+1)^20*(1+(x0+...x3+1)^20))...4472ms
expand ((x+y^400000000000+z)^20*(1+x+y+z^-1)^20)...702ms
expand ((1+x)^1000*(2+x)^1000)...302ms
expand ((17+x)^600*(42+x)^600)...255ms
expand ((1+sqrt(5))^65000)...2ms
expand ((1+x+y)^500)...502ms
expand (expand((1+x)^50*(1+y)^50) * expand((1-x)^50*(2-y)^50))...115ms
expand (expand((a+b+c+d+e+f+g+h)^5) * expand((a+b+c+d+e+f+g+h)^5))...490ms
expand ((1+x+...+x^65535)*(1+2*x+x^2+...+x^65535))...7787ms
divide ( (1+x)^1000+1 , (1-x)^500, x)...151ms
divide ( (1+x)^1000+1 , x^3-5*x+17, x)...2ms
divide ( (1+x+y^2)^50+1 , (1-x)^25+y, x)...16ms
divide ( (1+x+y^2)^25+1 , x^3*y-5*x*y^42+17*y+1, x)...3ms
divide ( (1+x+y^2)^50+1 , (1-x)^25+y, {x,y})...597ms
divide ( (1+x+y^2)^25+1 , x^3*y-5*x*y^42+17*y+1, {x,y})...888ms
gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...2ms
expand+gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...7ms
gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...5ms
expand+gcd ( (1+2*x)^200*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...4ms
gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...3ms
expand+gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...1533ms
gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...0ms
expand+gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...2ms
gcd ( (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^4*(3*x.. , (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^3*(3*x.. )...1ms
expand+gcd ( (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^4*(3*x.. , (7*y*x^2*z^2-3*x*y*z+11*(x+1)*y^2+5*z+1)^3*(3*x.. )...49ms
gcd ( (x^2-y^2)*(a+b)^10 , (x-y)*(a-c)^10 )...2ms
expand+gcd ( (x^2-y^2)*(a+b)^10 , (x-y)*(a-c)^10 )...1ms
gcd ( (x-y)^50+a , (x+y)^50 )...0ms
expand+gcd ( (x-y)^50+a , (x+y)^50 )...0ms
gcd ( -2107-7967*x+19271*x^50+551*x^49-39300*x^48+236.. , -2401-3773*x-9484*x^50-4086*x^49-31296*x^48-216.. )...2ms
gcd ( -1368+2517*x-62928*x^500+126728*x^499-139637*x^.. , -6336-11784*x+4932*x^500+50975*x^499+97099*x^49.. )...128ms
gcd ( 3772-5709*x-28359*x^500+38352*x^499-18303*x^498.. , -3680-4456*x+90816*x^500+35952*x^499+89870*x^49.. )...506ms
Compute GCD in Z/181Z
gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...3ms
expand+gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42) )...8ms
gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...11ms
expand+gcd ( (1+2*x)^400*(x^3+2*x^2+1) , (1+2*x)^42*(x^3-2*x+42)+1 )...10ms
gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...5ms
expand+gcd ( (1+2*x+y)^100*(x^3+2*x^2*y+1) , (1+2*x+y)^42*(x^3-2*x+42) )...1428ms
gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...2ms
expand+gcd ( (x^2-3*x*y+y^2)^4*(3*x-7*y+2)^5 , (x^2-3*x*y+y^2)^3*(3*x-7*y-2)^6 )...20ms
Compute GCD in Z/43051Z
gcd ( -936639990+26248623452*x^47-30174373832*x^46-19.. , 1882371920-30937311949*x^47+6616907060*x^46-742.. )...11ms
gcd ( -96986421453*x^426-75230349764*x^425+1282023978.. , 13695560229*x^426+16181971852*x^425-90237548124.. )...1138ms
gcd ( 138233755629*x^426-22168686741*x^425-6531403218.. , -12298243395*x^426-33246261919*x^425+1488121621.. )...4823ms
GCDFT1: gcd = 1 (n=14)...106ms
GCDFT2: gcd of linearly dense quartic inputs with quadratic GCDs (n=7)...55ms
GCDFT3: gcd of sparse inputs where degree // to #vars (n=4)...23ms
GCDFT4: gcd of sparse inputs where degree // to #vars (second) (n=4)...14ms
GCDFT5: gcd quadratic non-monic with other quadratic factors (n=6)...7ms
GCDFT7: gcd completly dense non-monic quadratic inputs (n=8)...1395ms
GCDFT8: gcd sparse non-monic quadratic inputs with linear gcds (n=10)...29ms
GCDFT9: trivariate inputs with increasing degrees (n=17)...1ms
GCDFT10: trivariate polynomials whose GCD has common factors with cofactors (j=7,k=11)...64ms
diff ( x/(1+sin(x^(y+x^2)))^2 , x)...0.00066ms
R1: f(f(...) with f(z)=sqrt(1/3)*z^2 + i/3 and n=17...457ms
R2: hermite(25,y)...669ms
R6: sum(((x+sin(i))/x+(x-sin(i))/x) for n=100000...1768ms
R7: 100000 random float eval of x^24+34*x^12+45*x^3+9*x^18 +34*x^10+ 32*x^21...663ms
R8:right(x^2,0,5,100000) ...150ms
S2:expand((x^sin(x) + y^cos(y) - z^(x+y))^500) ...523ms
S3:diff(expand((x^y + y^z + z^x)^500,x) ...2119ms
S4:series(sin(x)*cos(x),x=0,500) ...189ms
series(tan(2+x),x=0,100) ...2335ms
Rationalize nested expression 1 ...367ms
Rationalize sum((i*y*t^i)/(y+i*t)^i),i=1..10 ...619ms
Rationalize sum((i*y*t^i)/(y+abs(5-i)*t)^i),i=1..10 ...59ms
End -- Base:0x1ae3000 Top:0x1aeb170 Used:33136 MaxUsed:386605624 Max:10737422335
Total time 40630ms
Good: no regression, and it runs faster.
You'll tell me: only 2 s gained? Yes, but I haven't converted much to parallel code yet. Only expand, in some cases, parallelizes its data. So this is encouraging!
(Note: I haven't handled exceptions in the parallel code yet: I can feel the headache coming.)
What's next?
- Find the places where I can parallelize.
- Debug the rest of the code.
Thanks for reading!