pyvkfft-test
Run pyvkfft unit tests, regular or systematic
usage: pyvkfft-test [-h] [--colour] [--html [HTML ...]] [--gpu GPU]
[--opencl_platform OPENCL_PLATFORM] [--mailto MAILTO]
[--mailto_fail MAILTO_FAIL] [--mailto_smtp MAILTO_SMTP]
[--nproc NPROC] [--silent] [--c2c] [--systematic]
[--axes [AXES ...]]
[--backend {pycuda,cupy,pyopencl} [{pycuda,cupy,pyopencl} ...]]
[--bluestein] [--db [DB ...]] [--dct [{1,2,3,4}]]
[--dst [{1,2,3,4}]] [--double] [--dry-run]
[--fast-random FAST_RANDOM] [--inplace] [--graph [GRAPH]]
[--lut] [--max-nb-tests MAX_NB_TESTS]
[--ndim {1,2,3,12,123}] [--norm {0,1}] [--ref-long-double]
[--r2c] [--fstride] [--radix [{2,3,5,7,11,13} ...]]
[--radix-max-pow RADIX_MAX_POW] [--range RANGE RANGE]
[--range-mb RANGE_MB RANGE_MB]
[--range-nd-narrow RANGE_ND_NARROW RANGE_ND_NARROW]
[--serial] [--timeout TIMEOUT]
Named Arguments
- --colour
Use colour depending on how good the measured accuracy is
Default:
False
- --html
Summarises the results in html row(s). This is saved to 'pyvkfft-test%04d.html', starting at i=1001 and incrementing. Files with i=1000 and i=1999 are the beginning and the end of the html file, which can be concatenated to form a valid html page. If --graph is also used, this includes a graph of the accuracy which can be displayed by clicking on the type of transform.
- --gpu
Name (or sub-string) of the GPU to use
- --opencl_platform
Name (or sub-string) of the OpenCL platform to use (case-insensitive). Note that by default the PoCL platform is skipped, unless it is specifically requested or it is the only one available (PoCL has some issues with VkFFT for some transforms)
- --mailto
Email address the results will be sent to
- --mailto_fail
Email address the results will be sent to, only if the test fails
- --mailto_smtp
SMTP server address to mail the results
Default:
'localhost'
- --nproc
Number of parallel processes to use to speed up tests. Make sure the parallel processes combined will not use too much GPU memory
Default:
[1]
- --silent
Use this to minimise the written output (note that tests can take a long time, be patient)
Default:
False
- --c2c
When used without --systematic, perform only c2c quick tests and skip the long r2c/dct/dst unless they were also requested.
Default:
False
- --systematic
Perform a systematic accuracy test over a range of array sizes. Without this argument a faster test (a few minutes) will be performed with selected array sizes for all possible transforms.
Default:
False
- --fast-random
Use this option to run a random percentage of the full test suite, for faster results. A number between 5 and 100 is required.
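The --html description says the numbered fragments can be concatenated into a valid page. A minimal Python sketch of that step follows. It assumes all fragments sit in one directory and that nothing else matches the 'pyvkfft-test%04d.html' pattern; the function name and the output file name 'pyvkfft-report.html' are hypothetical and not part of pyvkfft.

```python
from pathlib import Path


def assemble_report(directory, output_name="pyvkfft-report.html"):
    """Concatenate pyvkfft-test%04d.html fragments into a single page.

    Four-digit indices sort lexicographically in numeric order, so the
    header (1000) comes first, the result rows (1001, 1002, ...) follow,
    and the footer (1999) comes last.
    """
    directory = Path(directory)
    # Match exactly four digits so the assembled report is never re-read.
    parts = sorted(directory.glob("pyvkfft-test[0-9][0-9][0-9][0-9].html"))
    if not parts:
        raise FileNotFoundError(f"no pyvkfft-test fragments in {directory}")
    output = directory / output_name  # hypothetical output name
    output.write_text("".join(p.read_text() for p in parts))
    return output
```

Calling assemble_report(".") after a run with --html would write the combined page next to the fragments.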
systematic
Options for --systematic:
- --axes
transform axes: x (fastest) is 1, y is 2, z is 3, e.g. '--axes 1', '--axes 2 3'. The default is to perform the transform along the ndim fastest axes. Using this overrides --ndim
- --backend
Possible choices: pycuda, cupy, pyopencl
Choose single or multiple GPU backends, by default all available backends are selected.
- --bluestein, --nonradix
Only perform transforms with non-radix dimensions, i.e. the largest number in the prime decomposition of each array dimension must be larger than 13
Default:
False
- --db
Save the results to an sql database. If no filename is given, pyvkfft-test.sql will be used. If the file already exists, the results are added to the file. Fields stored include HOSTNAME, EPOCH, BACKEND, LANGUAGE, TRANSFORM (c2c, r2c or dct/dst1/2/3/4), AXES, ARRAY_SHAPE, NDIMS, NDIM, PRECISION, INPLACE, NORM, LUT, N, N2_FFT, N2_IFFT, NI_FFT, NI_IFFT, TOLERANCE, DT_APP, DT_FFT, DT_IFFT, SRC_UNCHANGED_FFT, SRC_UNCHANGED_IFFT, GPU_NAME, SUCCESS, ERROR, VKFFT_ERROR_CODE
- --dct
Possible choices: 1, 2, 3, 4
Test discrete cosine transforms (default is c2c): '--dct' (defaults to dct 2), '--dct 1'
Default:
False
- --dst
Possible choices: 1, 2, 3, 4
Test discrete sine transforms (default is c2c): '--dst' (defaults to dst 2), '--dst 1'
Default:
False
- --double
Use double precision (float64/complex128) instead of single
Default:
False
- --dry-run
Perform a dry-run, printing the number of array shapes to test
Default:
False
- --inplace
Use inplace transforms
Default:
False
- --graph
Save the graph of the accuracy as a function of the size to the given filename (if no name is given, it will be automatically generated). Requires matplotlib, and scipy for linear regression.
- --lut
Force the use of a LUT for the transform, to improve accuracy. By default VkFFT will activate the LUT on some GPUs with less accurate accelerated trigonometric functions. This is automatically true for double precision
Default:
False
- --max-nb-tests
Maximum number of tests. If the number of generated test cases is larger, the program will abort.
Default:
[1000]
- --ndim
Possible choices: 1, 2, 3, 12, 123
Number of dimensions for the transform. Using 12 or 123 will result in testing both 1 and 2, or 1, 2 and 3. It is recommended to use --range-mb and
Default:
[1]
- --norm
Possible choices: 0, 1
Normalisation to test (must be 1 for dct or dst)
Default:
[1]
- --ref-long-double
Use long double precision for the reference calculation (requires scipy). This gives more objective accuracy plots but can be slower (or much slower on some architectures).
Default:
False
- --r2c
Test real-to-complex transform (default is c2c)
Default:
False
- --fstride
Test F-ordered arrays (default is C-ordered). Not supported for DCT/DST
Default:
False
- --radix
Possible choices: 2, 3, 5, 7, 11, 13
Perform only radix transforms. If no value is given, all available radix transforms are allowed. Alternatively a list can be given: '--radix 2' (only 2**n array sizes), '--radix 2 3 5' (only 2**N1 * 3**N2 * 5**N3)
- --radix-max-pow
For radix runs, specify the maximum exponent of each base integer, e.g. '--radix 2 3 --radix-max-pow 2' will limit lengths to 2**N1 * 3**N2 with N1,N2<=2
- --range
Range of array lengths [min, max] along each transform dimension, e.g. '--range 2 128'
Default:
[2, 128]
- --range-mb
Allowed range of array sizes [min, max] in Mbytes, e.g. '--range-mb 2 128'. This can be used to limit the array size while allowing large lengths along individual dimensions. It can also be used to separate runs with a given size range and different nproc values. This takes into account the type (single or double), and also whether the transform is made inplace, so this represents the total GPU memory used.
Default:
[0, 128]
- --range-nd-narrow
Two values (drel dabs), e.g. '--range-nd-narrow 0.10 11', with 0<=drel<=1 and dabs (integer>=0), must be given to allow 2D and 3D tests on arrays with different lengths along each dimension, while limiting the difference between lengths. For example in 2D for an (N1,N2) array shape, generated lengths will verify abs(N2-N1)<max(dabs+drel*N1). The default value of (0,0) only allows equal lengths. This allows testing more diverse configurations while limiting the number of tests.
Default:
['0', '0']
- --serial
Serialise the tests instead of spawning them in separate processes, which helps diagnose more errors. Incompatible with nproc>1.
Default:
False
- --timeout
Change the timeout (in seconds) to raise a TimeOut error for individual tests. After 4 have failed, give up.
Default:
[120]
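The --radix, --radix-max-pow and --bluestein descriptions above define which array lengths are selected. The following Python sketch illustrates those rules as stated in the help text. It is not the pyvkfft implementation, and the function names are hypothetical.

```python
RADIX_PRIMES = (2, 3, 5, 7, 11, 13)


def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by trial division."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return n if n > 1 else largest


def is_non_radix(n):
    """True if the largest prime factor exceeds 13 (the --bluestein rule)."""
    return largest_prime_factor(n) > 13


def radix_lengths(nmin, nmax, radix=RADIX_PRIMES, max_pow=None):
    """Lengths in [nmin, nmax] that are products of p**k for p in radix.

    max_pow mirrors --radix-max-pow: each exponent k is at most max_pow.
    """
    lengths = {1}
    for p in radix:
        extended = set()
        for base in lengths:
            value, k = base, 0
            while value <= nmax and (max_pow is None or k <= max_pow):
                extended.add(value)
                value, k = value * p, k + 1
        lengths = extended
    return sorted(n for n in lengths if n >= nmin)


# Equivalent of '--radix 2 3 --radix-max-pow 2 --range 2 128'
print(radix_lengths(2, 128, radix=(2, 3), max_pow=2))
```

Under these rules, 15 and 30 are radix sizes, while 17 and 34 are non-radix sizes.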
- Examples:
- pyvkfft-test
the regular test, which tries the fft interface, using parallel streams (for pycuda), and C2C/R2C/DCT/DST transforms for sizes N=15,17,30,34 with 1D to 4 or 5D transforms, also N=808,2988,4200,13000,13001, 13002,130172 for 1D and 2D transforms. All tests are done with single and double precision, in and out-of-place, norm=0 and 1, and all available backends (pyopencl, pycuda and cupy). For C2C, arrays up to dimension 5 are tested, with all possible combinations of transform axes. That makes a total of tens of thousands of transforms, which are tested against the result of numpy, scipy or pyfftw (when available) for accuracy. The text output gives the N2 and Ninf (aka max) relative norm of the transform, with the ratio in () to the expected tolerance for both direct and inverse transforms.
- pyvkfft-test --nproc 8 --gpu v100 --mailto_fail toto@pyvkfft.org
same test, but using 8 parallel processes to speed up, and using a GPU with 'v100' in its name. Also, send the results in case of a failure to the given email address
- pyvkfft-test --systematic --backend pycuda --nproc 8 --radix --range 2 10000
Perform a systematic test of C2C transforms in (by default) 1D and single precision, for N=2 to 10000, only for radix transforms
- pyvkfft-test --systematic --backend pycuda --nproc 8 --radix 2 7 11 --range 2 10000 --double
Same test, but only for radix sizes with factors 2, 7 and 11, and double precision
- pyvkfft-test --systematic --backend cupy --nproc 8 --bluestein --range 2 10000 --ndim 2 --lut --inplace
test with cupy backend, only non-radix 2D inplace C2C transforms (the default, since --r2c is not given), using a lookup table (LUT) for higher single precision accuracy.
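The --range-mb description mentions splitting a systematic run into size ranges with different nproc values. A small Python sketch of how such command lines could be assembled is shown here. It uses only flags documented above, adds --dry-run so that each run would only report the number of array shapes, and the 0-16 / 16-128 Mbyte split is an arbitrary illustration, not a recommendation from pyvkfft.

```python
import shlex


def systematic_command(backend, nproc, range_mb, extra=()):
    """Build an argv list for one systematic dry-run (hypothetical helper)."""
    lo, hi = range_mb
    return ["pyvkfft-test", "--systematic", "--backend", backend,
            "--nproc", str(nproc), "--range-mb", str(lo), str(hi),
            "--dry-run", *extra]


common = ["--radix", "--range", "2", "10000", "--ndim", "2"]
runs = [
    # More processes for small arrays, fewer for large ones (assumed split).
    systematic_command("pycuda", 8, (0, 16), common),
    systematic_command("pycuda", 2, (16, 128), common),
]
for argv in runs:
    # Print the command; pass argv to subprocess.run() to execute it.
    print(shlex.join(argv))
```

Removing --dry-run from the helper would turn these into actual test runs.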
- Columns in the text output:
backend
type of transform
array shape
axes for the transform. If None, axes are set by the number of transform dimensions
number of dimensions for the transform. Can be None if axes are given.
type of algorithm for each axis: r=radix, R=Rader, B=Bluestein, -=skipped axis
number of uploads for each axis: 0 if not transformed, 1 if the axis length fits in the cache and the transform can be done in 1 read+write, 2 or 3 if multi-upload is used
data type and precision
use of a Look-Up-Table (LUT), for single precision only.
inplace or out-of-place transform
normalisation for the transform: 0 or 1
order of the array: C (fast axis is last) or F (fast axis is first)
N2 and N_inf error norms for the forward transform, with the comparison to the maximum allowed error (and in parentheses the ratio to this maximum), and finally 0 or 1 depending on whether the source array was modified (0) or not (1)
Same values for the inverse transform
temporary buffer size allocated by VkFFT if necessary, for large transforms
status: OK, FAIL (if accuracy is above limit or source array unexpectedly changed) or ERROR (an error was raised during execution, e.g. compilation, memory,...)
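To make the N2 and N_inf columns more concrete, here is a pure-Python sketch of relative L2 and maximum error norms between a computed transform and a reference. The exact normalisation used by pyvkfft is an assumption here: the error is divided by the corresponding norm of the reference, and the function name is hypothetical.

```python
import math


def relative_error_norms(result, reference):
    """Return (n2, ninf) relative error norms of result against reference.

    Assumed definitions (not taken from the pyvkfft source):
      n2   = sqrt(sum |r - ref|**2 / sum |ref|**2)
      ninf = max |r - ref| / max |ref|
    """
    diffs = [abs(a - b) for a, b in zip(result, reference)]
    n2 = math.sqrt(sum(d ** 2 for d in diffs)
                   / sum(abs(b) ** 2 for b in reference))
    ninf = max(diffs) / max(abs(b) for b in reference)
    return n2, ninf


n2, ninf = relative_error_norms([3 + 4j, 0.5j], [3 + 4j, 0j])
print(f"N2={n2:.3g} Ninf={ninf:.3g}")
```

Dividing each value by the tolerance would give the ratio shown in parentheses in the text output.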