
Sat Sep 12 09:39:40 EDT 2015
numactl --interleave=all ../testing/testing_dgetrf -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000 --lapack
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 09:39:47 2015
% Usage: ../testing/testing_dgetrf [options] [-h|--help]

% ngpu 1
%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
%========================================================================
  123   123      0.86 (   0.00)      4.44 (   0.00)     ---   
 1234  1234     82.03 (   0.02)     77.26 (   0.02)     ---   
   10    10      0.03 (   0.00)      0.09 (   0.00)     ---   
   20    20      0.14 (   0.00)      0.24 (   0.00)     ---   
   30    30      0.37 (   0.00)      0.49 (   0.00)     ---   
   40    40      0.67 (   0.00)      0.85 (   0.00)     ---   
   50    50      0.97 (   0.00)      1.13 (   0.00)     ---   
   60    60      1.27 (   0.00)      1.53 (   0.00)     ---   
   70    70      1.69 (   0.00)      2.02 (   0.00)     ---   
   80    80      2.27 (   0.00)      2.73 (   0.00)     ---   
   90    90      2.82 (   0.00)      3.19 (   0.00)     ---   
  100   100      3.45 (   0.00)      3.74 (   0.00)     ---   
  200   200     11.53 (   0.00)      3.69 (   0.00)     ---   
  300   300     21.58 (   0.00)      8.21 (   0.00)     ---   
  400   400     33.48 (   0.00)     13.91 (   0.00)     ---   
  500   500     43.77 (   0.00)     21.00 (   0.00)     ---   
  600   600     56.33 (   0.00)     28.32 (   0.01)     ---   
  700   700     60.95 (   0.00)     35.39 (   0.01)     ---   
  800   800     72.49 (   0.00)     43.96 (   0.01)     ---   
  900   900     78.26 (   0.01)     51.05 (   0.01)     ---   
 1000  1000     76.28 (   0.01)     60.73 (   0.01)     ---   
 2000  2000     99.36 (   0.05)    153.07 (   0.03)     ---   
 3000  3000    109.96 (   0.16)    248.50 (   0.07)     ---   
 4000  4000    171.59 (   0.25)    340.14 (   0.13)     ---   
 5000  5000    184.05 (   0.45)    437.88 (   0.19)     ---   
 6000  6000    175.65 (   0.82)    529.60 (   0.27)     ---   
 7000  7000    201.05 (   1.14)    593.39 (   0.39)     ---   
 8000  8000    232.90 (   1.47)    651.45 (   0.52)     ---   
 9000  9000    221.73 (   2.19)    703.38 (   0.69)     ---   
10000 10000    248.58 (   2.68)    739.56 (   0.90)     ---   
12000 12000    157.27 (   7.32)    809.69 (   1.42)     ---   
14000 14000    169.63 (  10.78)    854.90 (   2.14)     ---   
16000 16000    196.69 (  13.88)    891.83 (   3.06)     ---   
18000 18000    181.36 (  21.44)    915.36 (   4.25)     ---   
20000 20000    200.49 (  26.60)    946.51 (   5.63)     ---   
Sat Sep 12 09:42:38 EDT 2015

Sat Sep 12 09:42:38 EDT 2015
numactl --interleave=all ../testing/testing_dgetrf_gpu -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 09:42:45 2015
% Usage: ../testing/testing_dgetrf_gpu [options] [-h|--help]

%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |PA-LU|/(N*|A|)
%========================================================================
  123   123     ---   (  ---  )      0.79 (   0.00)     ---  
 1234  1234     ---   (  ---  )     88.02 (   0.01)     ---  
   10    10     ---   (  ---  )      0.01 (   0.00)     ---  
   20    20     ---   (  ---  )      0.09 (   0.00)     ---  
   30    30     ---   (  ---  )      0.24 (   0.00)     ---  
   40    40     ---   (  ---  )      0.49 (   0.00)     ---  
   50    50     ---   (  ---  )      0.70 (   0.00)     ---  
   60    60     ---   (  ---  )      1.06 (   0.00)     ---  
   70    70     ---   (  ---  )      1.22 (   0.00)     ---  
   80    80     ---   (  ---  )      1.63 (   0.00)     ---  
   90    90     ---   (  ---  )      1.98 (   0.00)     ---  
  100   100     ---   (  ---  )      2.40 (   0.00)     ---  
  200   200     ---   (  ---  )      2.66 (   0.00)     ---  
  300   300     ---   (  ---  )      6.75 (   0.00)     ---  
  400   400     ---   (  ---  )     12.40 (   0.00)     ---  
  500   500     ---   (  ---  )     19.47 (   0.00)     ---  
  600   600     ---   (  ---  )     27.85 (   0.01)     ---  
  700   700     ---   (  ---  )     36.87 (   0.01)     ---  
  800   800     ---   (  ---  )     46.64 (   0.01)     ---  
  900   900     ---   (  ---  )     56.38 (   0.01)     ---  
 1000  1000     ---   (  ---  )     65.94 (   0.01)     ---  
 2000  2000     ---   (  ---  )    181.52 (   0.03)     ---  
 3000  3000     ---   (  ---  )    309.37 (   0.06)     ---  
 4000  4000     ---   (  ---  )    400.52 (   0.11)     ---  
 5000  5000     ---   (  ---  )    516.45 (   0.16)     ---  
 6000  6000     ---   (  ---  )    623.33 (   0.23)     ---  
 7000  7000     ---   (  ---  )    706.47 (   0.32)     ---  
 8000  8000     ---   (  ---  )    776.60 (   0.44)     ---  
 9000  9000     ---   (  ---  )    801.42 (   0.61)     ---  
10000 10000     ---   (  ---  )    858.18 (   0.78)     ---  
12000 12000     ---   (  ---  )    860.44 (   1.34)     ---  
14000 14000     ---   (  ---  )    976.78 (   1.87)     ---  
16000 16000     ---   (  ---  )   1008.20 (   2.71)     ---  
18000 18000     ---   (  ---  )   1024.99 (   3.79)     ---  
20000 20000     ---   (  ---  )   1047.48 (   5.09)     ---  
Sat Sep 12 09:43:36 EDT 2015
