
Sat Sep 12 10:19:24 EDT 2015
numactl --interleave=all ../testing/testing_sgeqrf -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000 --lapack
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 10:19:30 2015
% Usage: ../testing/testing_sgeqrf [options] [-h|--help]

% ngpu 1
%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)   |R - Q^H*A|   |I - Q^H*Q|
%==============================================================================
  123   123      5.83 (   0.00)      1.29 (   0.00)       ---
 1234  1234    139.63 (   0.02)    148.27 (   0.02)       ---
   10    10      0.19 (   0.00)      0.00 (   0.00)       ---
   20    20      0.55 (   0.00)      0.04 (   0.00)       ---
   30    30      1.05 (   0.00)      0.12 (   0.00)       ---
   40    40      1.89 (   0.00)      1.01 (   0.00)       ---
   50    50      2.20 (   0.00)      1.48 (   0.00)       ---
   60    60      3.08 (   0.00)      2.19 (   0.00)       ---
   70    70      3.16 (   0.00)      2.48 (   0.00)       ---
   80    80      3.85 (   0.00)      1.57 (   0.00)       ---
   90    90      3.97 (   0.00)      1.87 (   0.00)       ---
  100   100      3.95 (   0.00)      1.51 (   0.00)       ---
  200   200     14.07 (   0.00)      6.56 (   0.00)       ---
  300   300     31.91 (   0.00)     14.41 (   0.00)       ---
  400   400     66.72 (   0.00)     26.18 (   0.00)       ---
  500   500     89.49 (   0.00)     38.71 (   0.00)       ---
  600   600     86.50 (   0.00)     54.54 (   0.01)       ---
  700   700    101.51 (   0.00)     67.47 (   0.01)       ---
  800   800    121.79 (   0.01)     87.78 (   0.01)       ---
  900   900    106.09 (   0.01)    102.12 (   0.01)       ---
 1000  1000    128.52 (   0.01)    122.24 (   0.01)       ---
 2000  2000    193.06 (   0.06)    330.74 (   0.03)       ---
 3000  3000    243.41 (   0.15)    560.04 (   0.06)       ---
 4000  4000    265.74 (   0.32)    717.33 (   0.12)       ---
 5000  5000    284.24 (   0.59)    916.14 (   0.18)       ---
 6000  6000    295.27 (   0.98)   1025.16 (   0.28)       ---
 7000  7000    319.50 (   1.43)   1109.35 (   0.41)       ---
 8000  8000    319.59 (   2.14)   1346.25 (   0.51)       ---
 9000  9000    310.83 (   3.13)   1458.30 (   0.67)       ---
10000 10000    348.94 (   3.82)   1571.59 (   0.85)       ---
12000 12000    403.33 (   5.71)   1705.67 (   1.35)       ---
14000 14000    489.37 (   7.48)   1766.55 (   2.07)       ---
16000 16000    520.48 (  10.49)   1891.88 (   2.89)       ---
18000 18000    529.23 (  14.69)   1927.11 (   4.04)       ---
20000 20000    532.25 (  20.04)   2059.15 (   5.18)       ---
Sat Sep 12 10:21:34 EDT 2015

Sat Sep 12 10:21:34 EDT 2015
numactl --interleave=all ../testing/testing_sgeqrf_gpu -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 10:21:40 2015
% Usage: ../testing/testing_sgeqrf_gpu [options] [-h|--help]

% version 1
%   M     N   CPU GFlop/s (sec)   GPU GFlop/s (sec)    |b - A*x|
%===============================================================
  123   123     ---   (  ---  )      0.83 (   0.00)       ---
 1234  1234     ---   (  ---  )    126.56 (   0.02)       ---
   10    10     ---   (  ---  )      0.00 (   0.00)       ---
   20    20     ---   (  ---  )      0.01 (   0.00)       ---
   30    30     ---   (  ---  )      0.04 (   0.00)       ---
   40    40     ---   (  ---  )      0.10 (   0.00)       ---
   50    50     ---   (  ---  )      0.18 (   0.00)       ---
   60    60     ---   (  ---  )      0.27 (   0.00)       ---
   70    70     ---   (  ---  )      0.41 (   0.00)       ---
   80    80     ---   (  ---  )      0.59 (   0.00)       ---
   90    90     ---   (  ---  )      0.79 (   0.00)       ---
  100   100     ---   (  ---  )      0.81 (   0.00)       ---
  200   200     ---   (  ---  )      7.25 (   0.00)       ---
  300   300     ---   (  ---  )     14.33 (   0.00)       ---
  400   400     ---   (  ---  )     24.01 (   0.00)       ---
  500   500     ---   (  ---  )     28.72 (   0.01)       ---
  600   600     ---   (  ---  )     42.28 (   0.01)       ---
  700   700     ---   (  ---  )     58.05 (   0.01)       ---
  800   800     ---   (  ---  )     69.31 (   0.01)       ---
  900   900     ---   (  ---  )     83.06 (   0.01)       ---
 1000  1000     ---   (  ---  )    103.35 (   0.01)       ---
 2000  2000     ---   (  ---  )    285.61 (   0.04)       ---
 3000  3000     ---   (  ---  )    485.46 (   0.07)       ---
 4000  4000     ---   (  ---  )    658.79 (   0.13)       ---
 5000  5000     ---   (  ---  )    845.89 (   0.20)       ---
 6000  6000     ---   (  ---  )    969.34 (   0.30)       ---
 7000  7000     ---   (  ---  )   1055.68 (   0.43)       ---
 8000  8000     ---   (  ---  )   1133.29 (   0.60)       ---
 9000  9000     ---   (  ---  )   1267.92 (   0.77)       ---
10000 10000     ---   (  ---  )   1470.62 (   0.91)       ---
12000 12000     ---   (  ---  )   1631.16 (   1.41)       ---
14000 14000     ---   (  ---  )   1725.38 (   2.12)       ---
16000 16000     ---   (  ---  )   1843.85 (   2.96)       ---
18000 18000     ---   (  ---  )   1872.06 (   4.15)       ---
20000 20000     ---   (  ---  )   1967.07 (   5.42)       ---
Sat Sep 12 10:22:34 EDT 2015
