
Sat Sep 12 09:49:45 EDT 2015
numactl --interleave=all ../testing/testing_zgetrf -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000 --lapack
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 09:49:51 2015
% Usage: ../testing/testing_zgetrf [options] [-h|--help]

% ngpu 1
%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
%========================================================================
  123   123      3.74 (   0.00)      2.47 (   0.00)     ---   
 1234  1234    103.01 (   0.05)    150.12 (   0.03)     ---   
   10    10      0.26 (   0.00)      0.32 (   0.00)     ---   
   20    20      0.64 (   0.00)      0.87 (   0.00)     ---   
   30    30      1.42 (   0.00)      1.79 (   0.00)     ---   
   40    40      2.25 (   0.00)      2.57 (   0.00)     ---   
   50    50      2.27 (   0.00)      2.59 (   0.00)     ---   
   60    60      3.11 (   0.00)      3.33 (   0.00)     ---   
   70    70      5.35 (   0.00)      0.92 (   0.00)     ---   
   80    80      7.31 (   0.00)      1.39 (   0.00)     ---   
   90    90      8.31 (   0.00)      1.78 (   0.00)     ---   
  100   100      9.95 (   0.00)      2.30 (   0.00)     ---   
  200   200     28.20 (   0.00)      9.17 (   0.00)     ---   
  300   300     45.06 (   0.00)     19.42 (   0.00)     ---   
  400   400     63.50 (   0.00)     30.62 (   0.01)     ---   
  500   500     66.47 (   0.01)     43.57 (   0.01)     ---   
  600   600     76.17 (   0.01)     56.51 (   0.01)     ---   
  700   700     93.78 (   0.01)     71.39 (   0.01)     ---   
  800   800     73.41 (   0.02)     86.27 (   0.02)     ---   
  900   900     91.91 (   0.02)     90.85 (   0.02)     ---   
 1000  1000     60.89 (   0.04)    115.31 (   0.02)     ---   
 2000  2000    111.82 (   0.19)    269.96 (   0.08)     ---   
 3000  3000    145.80 (   0.49)    521.67 (   0.14)     ---   
 4000  4000    204.33 (   0.84)    629.99 (   0.27)     ---   
 5000  5000    210.19 (   1.59)    687.77 (   0.48)     ---   
 6000  6000    206.22 (   2.79)    782.66 (   0.74)     ---   
 7000  7000    267.67 (   3.42)    838.74 (   1.09)     ---   
 8000  8000    271.82 (   5.02)    890.31 (   1.53)     ---   
 9000  9000    252.62 (   7.70)    922.73 (   2.11)     ---   
10000 10000    276.02 (   9.66)    956.23 (   2.79)     ---   
12000 12000    232.82 (  19.79)   1003.32 (   4.59)     ---   
14000 14000    256.65 (  28.51)   1034.01 (   7.08)     ---   
16000 16000    278.43 (  39.23)   1059.97 (  10.30)     ---   
18000 18000    286.40 (  54.30)   1067.07 (  14.57)     ---   
20000 20000    289.66 (  73.65)   1082.21 (  19.71)     ---   
Sat Sep 12 09:57:01 EDT 2015

Sat Sep 12 09:57:01 EDT 2015
numactl --interleave=all ../testing/testing_zgetrf_gpu -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 09:57:07 2015
% Usage: ../testing/testing_zgetrf_gpu [options] [-h|--help]

%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
%========================================================================
  123   123     ---   (  ---  )      1.69 (   0.00)     ---  
 1234  1234     ---   (  ---  )    200.99 (   0.02)     ---  
   10    10     ---   (  ---  )      0.07 (   0.00)     ---  
   20    20     ---   (  ---  )      0.36 (   0.00)     ---  
   30    30     ---   (  ---  )      0.90 (   0.00)     ---  
   40    40     ---   (  ---  )      1.60 (   0.00)     ---  
   50    50     ---   (  ---  )      1.75 (   0.00)     ---  
   60    60     ---   (  ---  )      2.67 (   0.00)     ---  
   70    70     ---   (  ---  )      0.67 (   0.00)     ---  
   80    80     ---   (  ---  )      1.01 (   0.00)     ---  
   90    90     ---   (  ---  )      1.36 (   0.00)     ---  
  100   100     ---   (  ---  )      1.80 (   0.00)     ---  
  200   200     ---   (  ---  )      7.93 (   0.00)     ---  
  300   300     ---   (  ---  )     18.87 (   0.00)     ---  
  400   400     ---   (  ---  )     32.66 (   0.01)     ---  
  500   500     ---   (  ---  )     51.24 (   0.01)     ---  
  600   600     ---   (  ---  )     67.59 (   0.01)     ---  
  700   700     ---   (  ---  )     89.08 (   0.01)     ---  
  800   800     ---   (  ---  )    110.94 (   0.01)     ---  
  900   900     ---   (  ---  )    133.24 (   0.01)     ---  
 1000  1000     ---   (  ---  )    161.17 (   0.02)     ---  
 2000  2000     ---   (  ---  )    401.40 (   0.05)     ---  
 3000  3000     ---   (  ---  )    630.06 (   0.11)     ---  
 4000  4000     ---   (  ---  )    757.28 (   0.23)     ---  
 5000  5000     ---   (  ---  )    784.64 (   0.42)     ---  
 6000  6000     ---   (  ---  )    883.64 (   0.65)     ---  
 7000  7000     ---   (  ---  )    944.57 (   0.97)     ---  
 8000  8000     ---   (  ---  )    995.43 (   1.37)     ---  
 9000  9000     ---   (  ---  )    969.28 (   2.01)     ---  
10000 10000     ---   (  ---  )   1010.49 (   2.64)     ---  
12000 12000     ---   (  ---  )   1065.84 (   4.32)     ---  
14000 14000     ---   (  ---  )   1103.78 (   6.63)     ---  
16000 16000     ---   (  ---  )   1132.22 (   9.65)     ---  
18000 18000     ---   (  ---  )   1133.64 (  13.72)     ---  
20000 20000     ---   (  ---  )   1129.45 (  18.89)     ---  
Sat Sep 12 09:59:15 EDT 2015
