
Planning for plastimatch FDK migration

gregsharp edited this page Nov 29, 2010 · 20 revisions

Rough draft of file list

FFT: ramp_filter.c

CUDA utils: cuda_util.cu, cuda_kernel_util.cu, cuda_mem.cu

OpenCL utils: autotune_opencl.cxx, opencl_probe.cxx, opencl_util.cxx, opencl_util_nvidia.cxx

FDK: fdk_main.c, fdk_opts.c, fdk_util.c, bowtie_correction.c, fdk_cuda.cu, fdk_opencl.cxx

Misc stuff: bstring_util.cxx, dir_list.c, file_util.cxx, fwrite_block.c, hnd_io.c, logfile.c, math_util.h, mha_io.c, print_and_exit.c, proj_image.c, proj_image_dir.c, proj_matrix.c, string_util.c, threading.c, plm_timer.c, volume.c, volume_limit.c, delayload.c

Questions for discussion

  1. How does ITK FFT compare with FFTW? I would especially like to know how they compare on multicore.

    • Note #1: for GPU we should consider native GPU FFT methods.
    • Note #2: how does convolution compare to FFT (may be superior for empirical kernels and low-res images)?
    • Simon:
      • ITK uses either vnl (default) or fftw. On 363x512x512 Elekta projections, internally zero-padded to 363x512x1024, I measured 96 s for vnl and 17 s for fftw. Moreover, vnl does not support image dimensions that are not a power of 2. I found a note about FFT and ITK 4; it seems to be one of their concerns.
      • I noticed in ramp_filter.c that you use a 1D FFT. I have observed much better performance with 2D or 3D FFT in FFTW. Is there an easy way to benchmark the plastimatch ramp filter? I could run the benchmark on the same dataset I used in the previous point.
      • I have some issues with multicore. Without going into details, you cannot call the ITK FFTW filters, e.g. FFTWRealToComplexConjugateImageFilter, from another filter's ThreadedGenerateData function; this is not thread-safe. I guess I can correct for that, but it is not straightforward if we want to keep the ability to do both vnl and fftw. TO DISCUSS...
      • GPU: we should start with benchmarking cufftw.
      • I don't see what convolution could gain on CPU. Do we care about speed for low-res images? It will be fast anyway.
      • http://www.fftw.org/fftw3_doc/Thread-safety.html#Thread-safety
  2. CUDA and OpenCL utils should build as a library in a subdirectory of RTK. Non-FDK code in plastimatch can link to that library.

    • Simon: OK for a library, but why in a subdirectory?
  3. Plastimatch FDK has its own method for specifying geometry (.txt files + .pfm files, used for generic systems). Do we want to keep this?

    • Simon: intuitively, no, we want to keep one good one :-). Can you tell us more about pfm files?
  4. For file/directory processing, we should perhaps migrate to ITK-style methods.

    • Simon: I don't get this point, sorry. What processing?
  5. Any difference between PLM and RTK for hnd processing?

    • Simon: I mostly copy-pasted your code, so there should not be any difference.
  6. ITK v4 will (perhaps) disallow raw pointer access to images. That is potentially a performance problem.

    • Note #1: ITK stock iterators are very slow (IMO they are unusable).
    • Note #2: We should benchmark.
    • Simon:
      • Can we actually do any GPU processing without access to the raw pointer?
      • (Maybe not a problem any more. -Greg) http://www.cmake.org/Wiki/ITK_Release_4.0
      • If stock iterators are the basic ones, like itk::ImageRegionIterator, I'm surprised, since I always use them. Are you sure they are slow? Yes, let's benchmark. Any specific operation in mind?
  7. PLM will need, at least temporarily, to retain the plm_image and plm_matrix methods for images and geometry. This is to maintain compatibility with the DRR code. Therefore, some bridge code will be needed.
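
For the convolution-versus-FFT question raised under item 1 (Note #2), a direct spatial-domain ramp filter can serve as a baseline in a benchmark. The sketch below is only an illustration, not the code in ramp_filter.c: it assumes the standard band-limited ramp kernel (the Kak and Slaney form) rather than an empirical kernel, and the names ramp_kernel, ramp_filter_row and tau (detector pixel spacing) are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Band-limited ramp kernel sampled at integer pixel offset n
// (Kak & Slaney form). tau is the detector pixel spacing.
// Assumption: an analytic kernel; an empirical kernel would replace this.
double ramp_kernel(int n, double tau) {
    const double pi = 3.14159265358979323846;
    if (n == 0) return 1.0 / (4.0 * tau * tau);
    if (n % 2 == 0) return 0.0;  // even offsets vanish
    return -1.0 / (static_cast<double>(n) * n * pi * pi * tau * tau);
}

// Filter one detector row by direct convolution.
// Cost is O(N^2) per row, versus O(N log N) for an FFT-based filter.
std::vector<double> ramp_filter_row(const std::vector<double>& row, double tau) {
    const int n = static_cast<int>(row.size());
    std::vector<double> out(row.size(), 0.0);
    for (int i = 0; i < n; ++i) {
        double acc = 0.0;
        for (int j = 0; j < n; ++j) {
            acc += row[j] * ramp_kernel(i - j, tau);
        }
        out[i] = acc * tau;  // tau turns the discrete sum into an integral approximation
    }
    return out;
}
```

Calling ramp_filter_row on every row of a projection and timing it against the FFT path on the same dataset would give one data point for the discussion. Since the inner loop is quadratic in row length, any advantage would be expected only for short rows or short kernels.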

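
The zero-padding figure quoted under item 1 (rows of 512 padded to 1024) is consistent with padding each row to the smallest power of two that is at least twice its length, which also satisfies the power-of-2 restriction mentioned for vnl. That rule is an assumption here, not something the notes confirm, and padded_length is a hypothetical helper name.

```cpp
#include <cstddef>

// Hypothetical helper: smallest power of two >= 2*n.
// Assumed padding rule; it reproduces the 512 -> 1024 case from the notes.
// Overflow for extremely large n is not handled in this sketch.
std::size_t padded_length(std::size_t n) {
    const std::size_t target = 2 * n;
    std::size_t p = 1;
    while (p < target) {
        p *= 2;
    }
    return p;
}
```

With this rule a row length that is not itself a power of two, such as 500, also lands on 1024, while 513 jumps to 2048. A benchmark of FFTW against vnl would therefore want to include row lengths on both sides of such a boundary.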