This is the git repository for the FFTW library for computing Fourier transforms (version 3.x), maintained by the FFTW authors.

Overview

DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)

Unlike most other programs, most of the FFTW source code (in C) is generated automatically. This repository contains the generator, but it does not contain the generated code. YOU WILL BE UNABLE TO COMPILE CODE FROM THIS REPOSITORY unless you have special tools and know what you are doing. In particular, do not expect things to work by simply executing "configure; make" or "cmake".

Most users should ignore this repository, and should instead download official tarballs from http://fftw.org/, which contain the generated code, do not require any special tools or knowledge, and can be compiled on any system with a C compiler.

Advanced users and FFTW maintainers may obtain code from github and run the generation process themselves. See README for details.

Comments
  • Thread Safe Planner

    http://www.fftw.org/fftw3_doc/Thread-safety.html mentions

    "We do not think this should be an important restriction"

    I hope I can convince you otherwise.

    Problem Description

    fftw3 is used by a variety of audio plugins.

    Those plugins are loaded into the host's memory-space (usually an audio workstation). The host has limited control of what the plugin does internally, and the plugins do not know about each other.

    There is no way to ensure that two independent plugins which are linked against libfftw do not run the shared planner simultaneously. Nor is there a possibility to control this on host application level.

    When two independent plugins create fftw plans at the same time, the application usually segfaults or shows similar undesired effects.

    Possible solutions for this include:

    1. Statically link plugins against libfftw, so that every plugin has its own copy. Plans are then not shared with other plugins (which is mostly fine). This still requires special attention: fftw symbol visibility needs to be overridden for static links, and the plugin must protect its planning routines against multiple instances of itself. Furthermore, distributors must honor that (special build of fftw + static link). -- It is very unlikely that both plugin authors and the various GNU/Linux distributors get this right (most distros dislike static linking) for the growing number of audio plugins using fftw.
    2. Process-separate all plugins in the host. That is not a viable option for Digital Audio Workstations, where low latency is important, context switches (particularly of realtime threads) are heavy, and inter-process communication does not scale (compared to shared memory), especially if the DAW does not limit audio track or channel count.
    3. Discourage use of fftw for audio plugins, or even refuse to load plugins using it in the host. -- not the best idea :)
    4. Ship a special (ABI-compatible) build of libfftw with the host application which protects the planner. Plugins in the same memory space will use the already loaded library. This requires patching libfftw, but when doing so... why not do it upstream directly? Otherwise it has similar issues as (1).

    Discussion

    The issue at hand is not limited to audio applications; there are likely other applications with similar problems out there (gnu-octave comes to mind, but I don't know for certain).

    As the Thread-safety page mentions, it's as simple as

    "wrap a semaphore lock around any calls to the planner"

    Is there some good reason why libfftw does not do this by default?

    Existing applications should not be affected by this (they're not supposed to call the planner from different threads), but that change would make all the difference for multi-threaded plugin hosts.

    I suppose it could be a bit of work to wrap all planner entry-points with a semaphore, yet there may be a neat simple solution using #define.
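
    The locking idea can be sketched in a few lines of C. The following is a minimal sketch, not code from libfftw: the names with_planner_lock, planner_call and fake_planner are hypothetical, a POSIX mutex stands in for the semaphore, and fake_planner is only a stand-in for a real planner routine such as fftw_plan_dft_1d() so that the example builds without fftw3.h. It only serializes callers that share the same lock, which is why the lock would have to live inside libfftw to protect independent plugins.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One process-wide lock shared by every caller of the planner. */
static pthread_mutex_t planner_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical callback type: in real code the body would call a planner
   routine (for example fftw_plan_dft_1d) and return the resulting plan. */
typedef void *(*planner_call)(void *arg);

/* Run one planner call while holding the lock, so that two threads using
   this wrapper never execute planner code at the same time. */
static void *with_planner_lock(planner_call call, void *arg)
{
    void *result;
    assert(call != NULL);
    pthread_mutex_lock(&planner_lock);
    result = call(arg);
    pthread_mutex_unlock(&planner_lock);
    return result;
}

/* Stand-in for a non-reentrant planner routine (assumption, not FFTW code).
   It records the largest number of overlapping calls it observes. */
static int calls_in_flight = 0;
static int max_calls_in_flight = 0;

static void *fake_planner(void *arg)
{
    calls_in_flight++;
    if (calls_in_flight > max_calls_in_flight)
        max_calls_in_flight = calls_in_flight;
    calls_in_flight--;
    return arg;
}
```

    In a plugin, the callback passed to with_planner_lock would create or destroy a plan. According to the Thread-safety page, executing an existing plan with fftw_execute is the one operation that does not need such a lock.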

    I'll be happy to look into this, but before going that way, I'd like to ask whether such a change would be accepted by fftw, or whether an even better solution is planned for a future version that will make fftw's planner thread-safe.

    yours truly, robin - for the linux-audio community and for himself


    Notable audio plugins using fftw3: http://calf.sourceforge.net/ http://factorial.hu/plugins/lv2/ir http://guitarix.sourceforge.net/ http://breakfastquay.com/rubberband/ http://plugin.org.uk/ http://zynaddsubfx.sourceforge.net/ https://github.com/x42/meters.lv2 ...

    Notable affected plugin hosts: http://ardour.org/ http://qtractor.sourceforge.net/ https://github.com/falkTX/Carla/ ...

    see also https://community.ardour.org/node/8271

    opened by x42 51
  • [RFC|WIP] Proposal for full CMake support

    Objective: Introducing a platform agnostic build system based on CMake

    As FFTW seeks to build for many different architectures, I propose here to move the build system to CMake. With the implementation presented here it is already possible to configure the project and create all generated source files. The generation requires an ocaml installation as well as a working version of ocamlbuild. Optionally the generated codelets can be formatted with clang-format. The respective tools can be located using cmake defines or the cmake gui tool. Furthermore, the benchmarks, tests and wisdom tool are covered. The tests have been integrated to run with CTest.

    However, support for building dynamic libraries (especially on Windows) remains open work. Therefore this PR is rather a request for comments, to see if the maintainers are interested in integrating the author's work.

    opened by EikeVerdenhalven 31
  • r2r fft of 2D dataset in just one direction

    Hi, I would like to perform an FFT in one direction only for a 2D dataset. Currently I use

        fftw_f = fftw_plan_r2r_2d(imax, kmax, *in, *out, FFTW_R2HC, FFTW_R2HC, FFTW_MEASURE);
        fftw_b = fftw_plan_r2r_2d(imax, kmax, *out, *in, FFTW_HC2R, FFTW_HC2R, FFTW_MEASURE);

    for my forward and backward transformations, where in and out are 2D arrays.

    Yet this performs an FFTW_R2HC and an FFTW_HC2R in both directions. It would be convenient to set the second r2r kind so that there is no transformation in that direction. According to section 4.3.6, "Real-to-Real Transform Kinds", there is currently no kind with this functionality. Is there another way to implement something like this?
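
    One possible approach, assuming the data is stored contiguously in row-major order (element (i,k) at index i*kmax + k), is to plan a batch of 1D transforms with the advanced interface routine fftw_plan_many_r2r instead of a 2D plan. The sketch below computes the length, count, stride and distance for either axis. The struct batch_layout and the helpers one_axis_layout and element_index are hypothetical names, and the planner call is shown only as a comment so that the example builds without fftw3.h.

```c
#include <assert.h>

/* Parameters describing a batch of 1D transforms inside a contiguous,
   row-major imax x kmax array (element (i,k) stored at index i*kmax + k). */
struct batch_layout {
    int n;        /* length of each 1D transform */
    int howmany;  /* number of 1D transforms in the batch */
    int stride;   /* distance between consecutive elements of one transform */
    int dist;     /* distance between the first elements of two transforms */
};

/* axis 0: transform along i (down the columns);
   axis 1: transform along k (along the rows). */
static struct batch_layout one_axis_layout(int imax, int kmax, int axis)
{
    struct batch_layout l;
    assert(imax > 0 && kmax > 0 && (axis == 0 || axis == 1));
    if (axis == 0) {
        l.n = imax; l.howmany = kmax; l.stride = kmax; l.dist = 1;
    } else {
        l.n = kmax; l.howmany = imax; l.stride = 1; l.dist = kmax;
    }
    return l;
}

/* Array index of element j of transform t under the given layout. */
static int element_index(struct batch_layout l, int t, int j)
{
    return t * l.dist + j * l.stride;
}

/* With fftw3.h available, the layout would be passed on roughly like this
   (shown as a comment only, not compiled here):

   fftw_r2r_kind kind = FFTW_R2HC;
   fftw_plan p = fftw_plan_many_r2r(1, &l.n, l.howmany,
                                    in,  NULL, l.stride, l.dist,
                                    out, NULL, l.stride, l.dist,
                                    &kind, FFTW_MEASURE);
*/
```

    The untransformed direction is then simply the batch index, so no "identity" transform kind is needed for it.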

    opened by chi86 16
  • undefined reference to fftw_solvtab_*

    Running arch linux.

    Error

    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_rdft_r2cf'
    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_rdft_r2cb'
    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_dft_standard'
    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_rdft_r2r'
    collect2: error: ld returned 1 exit status
    

    Logs

    $ cmake ..
    -- The C compiler identification is GNU 7.3.1
    -- The CXX compiler identification is GNU 7.3.1
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Looking for alloca.h
    -- Looking for alloca.h - found
    -- Looking for altivec.h
    -- Looking for altivec.h - not found
    -- Looking for c_asm.h
    -- Looking for c_asm.h - not found
    -- Looking for dlfcn.h
    -- Looking for dlfcn.h - found
    -- Looking for intrinsics.h
    -- Looking for intrinsics.h - not found
    -- Looking for inttypes.h
    -- Looking for inttypes.h - found
    -- Looking for libintl.h
    -- Looking for libintl.h - found
    -- Looking for limits.h
    -- Looking for limits.h - found
    -- Looking for mach/mach_time.h
    -- Looking for mach/mach_time.h - not found
    -- Looking for malloc.h
    -- Looking for malloc.h - found
    -- Looking for memory.h
    -- Looking for memory.h - found
    -- Looking for stddef.h
    -- Looking for stddef.h - found
    -- Looking for stdint.h
    -- Looking for stdint.h - found
    -- Looking for stdlib.h
    -- Looking for stdlib.h - found
    -- Looking for string.h
    -- Looking for string.h - found
    -- Looking for strings.h
    -- Looking for strings.h - found
    -- Looking for sys/types.h
    -- Looking for sys/types.h - found
    -- Looking for sys/time.h
    -- Looking for sys/time.h - found
    -- Looking for sys/stat.h
    -- Looking for sys/stat.h - found
    -- Looking for sys/sysctl.h
    -- Looking for sys/sysctl.h - found
    -- Looking for time.h
    -- Looking for time.h - found
    -- Looking for uintptr.h
    -- Looking for uintptr.h - not found
    -- Looking for unistd.h
    -- Looking for unistd.h - found
    -- Checking prototype drand48 for HAVE_DECL_DRAND48 - True
    -- Checking prototype srand48 for HAVE_DECL_SRAND48 - True
    -- Checking prototype cosl for HAVE_DECL_COSL - True
    -- Checking prototype sinl for HAVE_DECL_SINL - True
    -- Checking prototype memalign for HAVE_DECL_MEMALIGN - True
    -- Checking prototype posix_memalign for HAVE_DECL_POSIX_MEMALIGN - True
    -- Looking for clock_gettime
    -- Looking for clock_gettime - found
    -- Looking for gettimeofday
    -- Looking for gettimeofday - found
    -- Looking for getpagesize
    -- Looking for getpagesize - found
    -- Looking for drand48
    -- Looking for drand48 - found
    -- Looking for srand48
    -- Looking for srand48 - found
    -- Looking for memalign
    -- Looking for memalign - found
    -- Looking for posix_memalign
    -- Looking for posix_memalign - found
    -- Looking for mach_absolute_time
    -- Looking for mach_absolute_time - not found
    -- Looking for alloca
    -- Looking for alloca - found
    -- Looking for isnan
    -- Looking for isnan - found
    -- Looking for snprintf
    -- Looking for snprintf - found
    -- Looking for strchr
    -- Looking for strchr - found
    -- Looking for sysctl
    -- Looking for sysctl - not found
    -- Looking for cosl
    -- Looking for cosl - found
    -- Looking for sinl
    -- Looking for sinl - found
    -- Check size of float
    -- Check size of float - done
    -- Check size of double
    -- Check size of double - done
    -- Check size of int
    -- Check size of int - done
    -- Check size of long
    -- Check size of long - done
    -- Check size of long long
    -- Check size of long long - done
    -- Check size of unsigned int
    -- Check size of unsigned int - done
    -- Check size of unsigned long
    -- Check size of unsigned long - done
    -- Check size of unsigned long long
    -- Check size of unsigned long long - done
    -- Check size of size_t
    -- Check size of size_t - done
    -- Check size of ptrdiff_t
    -- Check size of ptrdiff_t - done
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/anubis/git/fftw3/build
    
    $ make -j9
    Scanning dependencies of target fftw3
    [  1%] Building C object CMakeFiles/fftw3.dir/api/execute-dft.c.o
    [  2%] Building C object CMakeFiles/fftw3.dir/api/execute-dft-r2c.c.o
    [  2%] Building C object CMakeFiles/fftw3.dir/api/execute-r2r.c.o
    [  2%] Building C object CMakeFiles/fftw3.dir/api/execute-split-dft-c2r.c.o
    [  1%] Building C object CMakeFiles/fftw3.dir/api/execute-dft-c2r.c.o
    [  2%] Building C object CMakeFiles/fftw3.dir/api/execute-split-dft-r2c.c.o
    [  2%] Building C object CMakeFiles/fftw3.dir/api/configure.c.o
    [  4%] Building C object CMakeFiles/fftw3.dir/api/apiplan.c.o
    [  4%] Building C object CMakeFiles/fftw3.dir/api/execute-split-dft.c.o
    [  4%] Building C object CMakeFiles/fftw3.dir/api/execute.c.o
    [  5%] Building C object CMakeFiles/fftw3.dir/api/export-wisdom-to-file.c.o
    [  5%] Building C object CMakeFiles/fftw3.dir/api/export-wisdom-to-string.c.o
    [  5%] Building C object CMakeFiles/fftw3.dir/api/f77api.c.o
    [  6%] Building C object CMakeFiles/fftw3.dir/api/export-wisdom.c.o
    [  6%] Building C object CMakeFiles/fftw3.dir/api/flops.c.o
    [  7%] Building C object CMakeFiles/fftw3.dir/api/forget-wisdom.c.o
    [  7%] Building C object CMakeFiles/fftw3.dir/api/import-system-wisdom.c.o
    [  8%] Building C object CMakeFiles/fftw3.dir/api/import-wisdom-from-file.c.o
    [  9%] Building C object CMakeFiles/fftw3.dir/api/import-wisdom.c.o
    [  9%] Building C object CMakeFiles/fftw3.dir/api/import-wisdom-from-string.c.o
    [  9%] Building C object CMakeFiles/fftw3.dir/api/map-r2r-kind.c.o
    [  9%] Building C object CMakeFiles/fftw3.dir/api/malloc.c.o
    [ 10%] Building C object CMakeFiles/fftw3.dir/api/mapflags.c.o
    [ 10%] Building C object CMakeFiles/fftw3.dir/api/mkprinter-file.c.o
    [ 11%] Building C object CMakeFiles/fftw3.dir/api/mkprinter-str.c.o
    [ 12%] Building C object CMakeFiles/fftw3.dir/api/mktensor-iodims.c.o
    [ 12%] Building C object CMakeFiles/fftw3.dir/api/mktensor-iodims64.c.o
    [ 12%] Building C object CMakeFiles/fftw3.dir/api/mktensor-rowmajor.c.o
    [ 12%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-1d.c.o
    [ 13%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-2d.c.o
    [ 13%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-3d.c.o
    [ 14%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-c2r-1d.c.o
    [ 14%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-c2r-2d.c.o
    [ 14%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-c2r-3d.c.o
    [ 15%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-c2r.c.o
    [ 15%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-r2c-1d.c.o
    [ 16%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-r2c-2d.c.o
    [ 16%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-r2c-3d.c.o
    [ 17%] Building C object CMakeFiles/fftw3.dir/api/plan-dft-r2c.c.o
    [ 17%] Building C object CMakeFiles/fftw3.dir/api/plan-dft.c.o
    [ 18%] Building C object CMakeFiles/fftw3.dir/api/plan-guru-dft-c2r.c.o
    [ 18%] Building C object CMakeFiles/fftw3.dir/api/plan-guru-dft-r2c.c.o
    [ 18%] Building C object CMakeFiles/fftw3.dir/api/plan-guru-dft.c.o
    [ 19%] Building C object CMakeFiles/fftw3.dir/api/plan-guru-r2r.c.o
    [ 19%] Building C object CMakeFiles/fftw3.dir/api/plan-guru-split-dft-c2r.c.o
    [ 20%] Building C object CMakeFiles/fftw3.dir/api/plan-guru-split-dft-r2c.c.o
    [ 20%] Building C object CMakeFiles/fftw3.dir/api/plan-guru-split-dft.c.o
    [ 20%] Building C object CMakeFiles/fftw3.dir/api/plan-guru64-dft-c2r.c.o
    [ 21%] Building C object CMakeFiles/fftw3.dir/api/plan-guru64-dft-r2c.c.o
    [ 22%] Building C object CMakeFiles/fftw3.dir/api/plan-guru64-split-dft-c2r.c.o
    [ 22%] Building C object CMakeFiles/fftw3.dir/api/plan-guru64-dft.c.o
    [ 22%] Building C object CMakeFiles/fftw3.dir/api/plan-guru64-r2r.c.o
    [ 22%] Building C object CMakeFiles/fftw3.dir/api/plan-guru64-split-dft-r2c.c.o
    [ 23%] Building C object CMakeFiles/fftw3.dir/api/plan-guru64-split-dft.c.o
    [ 23%] Building C object CMakeFiles/fftw3.dir/api/plan-many-dft-c2r.c.o
    [ 24%] Building C object CMakeFiles/fftw3.dir/api/plan-many-dft-r2c.c.o
    [ 25%] Building C object CMakeFiles/fftw3.dir/api/plan-many-dft.c.o
    [ 25%] Building C object CMakeFiles/fftw3.dir/api/plan-many-r2r.c.o
    [ 25%] Building C object CMakeFiles/fftw3.dir/api/plan-r2r-1d.c.o
    [ 25%] Building C object CMakeFiles/fftw3.dir/api/plan-r2r-2d.c.o
    [ 25%] Building C object CMakeFiles/fftw3.dir/api/plan-r2r.c.o
    [ 26%] Building C object CMakeFiles/fftw3.dir/api/plan-r2r-3d.c.o
    [ 27%] Building C object CMakeFiles/fftw3.dir/api/print-plan.c.o
    [ 27%] Building C object CMakeFiles/fftw3.dir/api/rdft2-pad.c.o
    [ 28%] Building C object CMakeFiles/fftw3.dir/api/the-planner.c.o
    [ 28%] Building C object CMakeFiles/fftw3.dir/api/version.c.o
    [ 28%] Building C object CMakeFiles/fftw3.dir/dft/bluestein.c.o
    [ 29%] Building C object CMakeFiles/fftw3.dir/dft/buffered.c.o
    [ 29%] Building C object CMakeFiles/fftw3.dir/dft/conf.c.o
    [ 30%] Building C object CMakeFiles/fftw3.dir/dft/ct.c.o
    [ 30%] Building C object CMakeFiles/fftw3.dir/dft/dftw-direct.c.o
    [ 30%] Building C object CMakeFiles/fftw3.dir/dft/dftw-directsq.c.o
    [ 31%] Building C object CMakeFiles/fftw3.dir/dft/dftw-generic.c.o
    [ 32%] Building C object CMakeFiles/fftw3.dir/dft/dftw-genericbuf.c.o
    [ 32%] Building C object CMakeFiles/fftw3.dir/dft/direct.c.o
    [ 32%] Building C object CMakeFiles/fftw3.dir/dft/generic.c.o
    [ 33%] Building C object CMakeFiles/fftw3.dir/dft/indirect-transpose.c.o
    [ 33%] Building C object CMakeFiles/fftw3.dir/dft/indirect.c.o
    [ 33%] Building C object CMakeFiles/fftw3.dir/dft/kdft-dif.c.o
    [ 34%] Building C object CMakeFiles/fftw3.dir/dft/kdft-difsq.c.o
    [ 34%] Building C object CMakeFiles/fftw3.dir/dft/kdft-dit.c.o
    [ 35%] Building C object CMakeFiles/fftw3.dir/dft/kdft.c.o
    [ 35%] Building C object CMakeFiles/fftw3.dir/dft/nop.c.o
    [ 36%] Building C object CMakeFiles/fftw3.dir/dft/plan.c.o
    [ 36%] Building C object CMakeFiles/fftw3.dir/dft/problem.c.o
    [ 36%] Building C object CMakeFiles/fftw3.dir/dft/rader.c.o
    [ 36%] Building C object CMakeFiles/fftw3.dir/dft/solve.c.o
    [ 37%] Building C object CMakeFiles/fftw3.dir/dft/rank-geq2.c.o
    [ 38%] Building C object CMakeFiles/fftw3.dir/dft/vrank-geq1.c.o
    [ 38%] Building C object CMakeFiles/fftw3.dir/dft/zero.c.o
    [ 39%] Building C object CMakeFiles/fftw3.dir/dft/scalar/n.c.o
    [ 39%] Building C object CMakeFiles/fftw3.dir/dft/scalar/t.c.o
    [ 40%] Building C object CMakeFiles/fftw3.dir/kernel/align.c.o
    [ 40%] Building C object CMakeFiles/fftw3.dir/kernel/alloc.c.o
    [ 40%] Building C object CMakeFiles/fftw3.dir/kernel/assert.c.o
    [ 41%] Building C object CMakeFiles/fftw3.dir/kernel/awake.c.o
    [ 41%] Building C object CMakeFiles/fftw3.dir/kernel/cpy1d.c.o
    [ 41%] Building C object CMakeFiles/fftw3.dir/kernel/buffered.c.o
    [ 42%] Building C object CMakeFiles/fftw3.dir/kernel/cpy2d-pair.c.o
    [ 42%] Building C object CMakeFiles/fftw3.dir/kernel/cpy2d.c.o
    [ 43%] Building C object CMakeFiles/fftw3.dir/kernel/ct.c.o
    [ 43%] Building C object CMakeFiles/fftw3.dir/kernel/debug.c.o
    [ 44%] Building C object CMakeFiles/fftw3.dir/kernel/extract-reim.c.o
    [ 44%] Building C object CMakeFiles/fftw3.dir/kernel/hash.c.o
    [ 44%] Building C object CMakeFiles/fftw3.dir/kernel/iabs.c.o
    [ 45%] Building C object CMakeFiles/fftw3.dir/kernel/kalloc.c.o
    [ 45%] Building C object CMakeFiles/fftw3.dir/kernel/md5-1.c.o
    [ 46%] Building C object CMakeFiles/fftw3.dir/kernel/md5.c.o
    [ 46%] Building C object CMakeFiles/fftw3.dir/kernel/minmax.c.o
    [ 47%] Building C object CMakeFiles/fftw3.dir/kernel/ops.c.o
    [ 47%] Building C object CMakeFiles/fftw3.dir/kernel/pickdim.c.o
    [ 47%] Building C object CMakeFiles/fftw3.dir/kernel/plan.c.o
    [ 48%] Building C object CMakeFiles/fftw3.dir/kernel/planner.c.o
    [ 48%] Building C object CMakeFiles/fftw3.dir/kernel/primes.c.o
    [ 48%] Building C object CMakeFiles/fftw3.dir/kernel/problem.c.o
    [ 49%] Building C object CMakeFiles/fftw3.dir/kernel/print.c.o
    [ 49%] Building C object CMakeFiles/fftw3.dir/kernel/rader.c.o
    [ 50%] Building C object CMakeFiles/fftw3.dir/kernel/scan.c.o
    [ 50%] Building C object CMakeFiles/fftw3.dir/kernel/solver.c.o
    [ 51%] Building C object CMakeFiles/fftw3.dir/kernel/solvtab.c.o
    [ 51%] Building C object CMakeFiles/fftw3.dir/kernel/stride.c.o
    [ 52%] Building C object CMakeFiles/fftw3.dir/kernel/tensor.c.o
    [ 52%] Building C object CMakeFiles/fftw3.dir/kernel/tensor1.c.o
    [ 53%] Building C object CMakeFiles/fftw3.dir/kernel/tensor3.c.o
    [ 53%] Building C object CMakeFiles/fftw3.dir/kernel/tensor2.c.o
    [ 53%] Building C object CMakeFiles/fftw3.dir/kernel/tensor4.c.o
    [ 53%] Building C object CMakeFiles/fftw3.dir/kernel/tensor7.c.o
    [ 54%] Building C object CMakeFiles/fftw3.dir/kernel/tensor5.c.o
    [ 55%] Building C object CMakeFiles/fftw3.dir/kernel/tensor8.c.o
    [ 55%] Building C object CMakeFiles/fftw3.dir/kernel/tile2d.c.o
    [ 55%] Building C object CMakeFiles/fftw3.dir/kernel/tensor9.c.o
    [ 56%] Building C object CMakeFiles/fftw3.dir/kernel/timer.c.o
    [ 56%] Building C object CMakeFiles/fftw3.dir/kernel/transpose.c.o
    [ 57%] Building C object CMakeFiles/fftw3.dir/kernel/trig.c.o
    [ 57%] Building C object CMakeFiles/fftw3.dir/kernel/twiddle.c.o
    [ 57%] Building C object CMakeFiles/fftw3.dir/rdft/buffered.c.o
    [ 57%] Building C object CMakeFiles/fftw3.dir/rdft/conf.c.o
    [ 58%] Building C object CMakeFiles/fftw3.dir/rdft/buffered2.c.o
    [ 59%] Building C object CMakeFiles/fftw3.dir/rdft/ct-hc2c-direct.c.o
    [ 59%] Building C object CMakeFiles/fftw3.dir/rdft/ct-hc2c.c.o
    [ 60%] Building C object CMakeFiles/fftw3.dir/rdft/dft-r2hc.c.o
    [ 60%] Building C object CMakeFiles/fftw3.dir/rdft/dht-r2hc.c.o
    [ 60%] Building C object CMakeFiles/fftw3.dir/rdft/dht-rader.c.o
    [ 61%] Building C object CMakeFiles/fftw3.dir/rdft/direct-r2c.c.o
    [ 61%] Building C object CMakeFiles/fftw3.dir/rdft/direct-r2r.c.o
    [ 62%] Building C object CMakeFiles/fftw3.dir/rdft/direct2.c.o
    [ 62%] Building C object CMakeFiles/fftw3.dir/rdft/generic.c.o
    [ 62%] Building C object CMakeFiles/fftw3.dir/rdft/hc2hc-generic.c.o
    [ 63%] Building C object CMakeFiles/fftw3.dir/rdft/hc2hc-direct.c.o
    [ 63%] Building C object CMakeFiles/fftw3.dir/rdft/hc2hc.c.o
    [ 64%] Building C object CMakeFiles/fftw3.dir/rdft/indirect.c.o
    [ 64%] Building C object CMakeFiles/fftw3.dir/rdft/khc2c.c.o
    [ 65%] Building C object CMakeFiles/fftw3.dir/rdft/khc2hc.c.o
    [ 65%] Building C object CMakeFiles/fftw3.dir/rdft/kr2c.c.o
    [ 66%] Building C object CMakeFiles/fftw3.dir/rdft/kr2r.c.o
    [ 66%] Building C object CMakeFiles/fftw3.dir/rdft/nop.c.o
    [ 67%] Building C object CMakeFiles/fftw3.dir/rdft/nop2.c.o
    [ 67%] Building C object CMakeFiles/fftw3.dir/rdft/plan.c.o
    [ 67%] Building C object CMakeFiles/fftw3.dir/rdft/plan2.c.o
    [ 68%] Building C object CMakeFiles/fftw3.dir/rdft/problem.c.o
    [ 68%] Building C object CMakeFiles/fftw3.dir/rdft/problem2.c.o
    [ 69%] Building C object CMakeFiles/fftw3.dir/rdft/rank-geq2-rdft2.c.o
    [ 69%] Building C object CMakeFiles/fftw3.dir/rdft/rank-geq2.c.o
    [ 69%] Building C object CMakeFiles/fftw3.dir/rdft/rank0-rdft2.c.o
    [ 70%] Building C object CMakeFiles/fftw3.dir/rdft/rank0.c.o
    [ 71%] Building C object CMakeFiles/fftw3.dir/rdft/rdft-dht.c.o
    [ 71%] Building C object CMakeFiles/fftw3.dir/rdft/rdft2-inplace-strides.c.o
    [ 71%] Building C object CMakeFiles/fftw3.dir/rdft/rdft2-rdft.c.o
    [ 72%] Building C object CMakeFiles/fftw3.dir/rdft/rdft2-tensor-max-index.c.o
    [ 72%] Building C object CMakeFiles/fftw3.dir/rdft/rdft2-strides.c.o
    [ 72%] Building C object CMakeFiles/fftw3.dir/rdft/solve.c.o
    [ 73%] Building C object CMakeFiles/fftw3.dir/rdft/solve2.c.o
    [ 73%] Building C object CMakeFiles/fftw3.dir/rdft/vrank-geq1-rdft2.c.o
    [ 74%] Building C object CMakeFiles/fftw3.dir/rdft/vrank-geq1.c.o
    [ 74%] Building C object CMakeFiles/fftw3.dir/rdft/vrank3-transpose.c.o
    [ 74%] Building C object CMakeFiles/fftw3.dir/rdft/scalar/hc2c.c.o
    [ 75%] Building C object CMakeFiles/fftw3.dir/rdft/scalar/hfb.c.o
    [ 76%] Building C object CMakeFiles/fftw3.dir/rdft/scalar/r2c.c.o
    [ 76%] Building C object CMakeFiles/fftw3.dir/rdft/scalar/r2r.c.o
    [ 76%] Building C object CMakeFiles/fftw3.dir/reodft/conf.c.o
    [ 77%] Building C object CMakeFiles/fftw3.dir/reodft/reodft00e-splitradix.c.o
    [ 77%] Building C object CMakeFiles/fftw3.dir/reodft/redft00e-r2hc-pad.c.o
    [ 77%] Building C object CMakeFiles/fftw3.dir/reodft/redft00e-r2hc.c.o
    [ 78%] Building C object CMakeFiles/fftw3.dir/reodft/reodft010e-r2hc.c.o
    [ 78%] Building C object CMakeFiles/fftw3.dir/reodft/reodft11e-r2hc-odd.c.o
    [ 79%] Building C object CMakeFiles/fftw3.dir/reodft/reodft11e-r2hc.c.o
    [ 79%] Building C object CMakeFiles/fftw3.dir/reodft/reodft11e-radix2.c.o
    [ 79%] Building C object CMakeFiles/fftw3.dir/reodft/rodft00e-r2hc-pad.c.o
    [ 80%] Building C object CMakeFiles/fftw3.dir/reodft/rodft00e-r2hc.c.o
    [ 80%] Building C object CMakeFiles/fftw3.dir/simd-support/altivec.c.o
    [ 81%] Building C object CMakeFiles/fftw3.dir/simd-support/avx-128-fma.c.o
    [ 81%] Building C object CMakeFiles/fftw3.dir/simd-support/avx.c.o
    [ 82%] Building C object CMakeFiles/fftw3.dir/simd-support/avx2.c.o
    [ 82%] Building C object CMakeFiles/fftw3.dir/simd-support/avx512.c.o
    [ 82%] Building C object CMakeFiles/fftw3.dir/simd-support/kcvi.c.o
    [ 83%] Building C object CMakeFiles/fftw3.dir/simd-support/neon.c.o
    [ 84%] Building C object CMakeFiles/fftw3.dir/simd-support/sse2.c.o
    [ 84%] Building C object CMakeFiles/fftw3.dir/simd-support/taint.c.o
    [ 84%] Building C object CMakeFiles/fftw3.dir/simd-support/vsx.c.o
    [ 85%] Linking C shared library libfftw3.so
    [ 85%] Built target fftw3
    Scanning dependencies of target bench
    [ 86%] Building C object CMakeFiles/bench.dir/libbench2/after-rcopy-to.c.o
    [ 86%] Building C object CMakeFiles/bench.dir/libbench2/allocate.c.o
    [ 86%] Building C object CMakeFiles/bench.dir/libbench2/after-rcopy-from.c.o
    [ 86%] Building C object CMakeFiles/bench.dir/libbench2/after-hccopy-to.c.o
    [ 87%] Building C object CMakeFiles/bench.dir/libbench2/after-hccopy-from.c.o
    [ 87%] Building C object CMakeFiles/bench.dir/libbench2/after-ccopy-from.c.o
    [ 87%] Building C object CMakeFiles/bench.dir/libbench2/bench-cost-postprocess.c.o
    [ 88%] Building C object CMakeFiles/bench.dir/libbench2/aset.c.o
    [ 88%] Building C object CMakeFiles/bench.dir/libbench2/after-ccopy-to.c.o
    [ 88%] Building C object CMakeFiles/bench.dir/libbench2/bench-main.c.o
    [ 89%] Building C object CMakeFiles/bench.dir/libbench2/bench-exit.c.o
    [ 89%] Building C object CMakeFiles/bench.dir/libbench2/dotens2.c.o
    [ 89%] Building C object CMakeFiles/bench.dir/libbench2/caset.c.o
    [ 90%] Building C object CMakeFiles/bench.dir/libbench2/can-do.c.o
    [ 91%] Building C object CMakeFiles/bench.dir/libbench2/info.c.o
    [ 91%] Building C object CMakeFiles/bench.dir/libbench2/main.c.o
    [ 92%] Building C object CMakeFiles/bench.dir/libbench2/mflops.c.o
    [ 92%] Building C object CMakeFiles/bench.dir/libbench2/mp.c.o
    [ 93%] Building C object CMakeFiles/bench.dir/libbench2/ovtpvt.c.o
    [ 93%] Building C object CMakeFiles/bench.dir/libbench2/my-getopt.c.o
    [ 93%] Building C object CMakeFiles/bench.dir/libbench2/pow2.c.o
    [ 94%] Building C object CMakeFiles/bench.dir/libbench2/problem.c.o
    [ 94%] Building C object CMakeFiles/bench.dir/libbench2/report.c.o
    [ 95%] Building C object CMakeFiles/bench.dir/libbench2/speed.c.o
    [ 95%] Building C object CMakeFiles/bench.dir/libbench2/tensor.c.o
    [ 95%] Building C object CMakeFiles/bench.dir/libbench2/timer.c.o
    [ 96%] Building C object CMakeFiles/bench.dir/libbench2/util.c.o
    [ 96%] Building C object CMakeFiles/bench.dir/libbench2/verify-dft.c.o
    [ 97%] Building C object CMakeFiles/bench.dir/libbench2/verify-lib.c.o
    [ 97%] Building C object CMakeFiles/bench.dir/libbench2/verify-r2r.c.o
    [ 98%] Building C object CMakeFiles/bench.dir/libbench2/verify-rdft2.c.o
    [ 98%] Building C object CMakeFiles/bench.dir/libbench2/verify.c.o
    [ 98%] Building C object CMakeFiles/bench.dir/libbench2/zero.c.o
    [ 99%] Building C object CMakeFiles/bench.dir/tests/bench.c.o
    [ 99%] Building C object CMakeFiles/bench.dir/tests/hook.c.o
    [100%] Building C object CMakeFiles/bench.dir/tests/fftw-bench.c.o
    [100%] Linking C executable bench
    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_rdft_r2cf'
    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_rdft_r2cb'
    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_dft_standard'
    /usr/local/bin/ld: libfftw3.so.3: undefined reference to `fftw_solvtab_rdft_r2r'
    collect2: error: ld returned 1 exit status
    make[2]: *** [CMakeFiles/bench.dir/build.make:1006: bench] Error 1
    make[1]: *** [CMakeFiles/Makefile2:68: CMakeFiles/bench.dir/all] Error 2
    make: *** [Makefile:141: all] Error 2
    
    opened by FFY00 13
  • IBM POWER 8 arch support

    Hello!

    There has been news that IBM would (help) optimize FFTW3 for their new hardware architecture. How is it going? Would the AltiVec code for older POWER hardware work on this new architecture?

    TIA, Fabricio

    opened by fcannini 13
  • multithreaded code and fftwf_destroy_plan() SIGSEGV

    I have a number of hardware threads executing simultaneously on different unrelated data. That is, they are not operating concurrently on different parts of the same data buffer.

    • Each thread allocates its own buffer via fftwf_malloc().
    • Wisdom is loaded from disk via fftwf_import_wisdom_from_filename().
    • A plan for the halfcomplex format is created via fftwf_plan_r2r_1d(..., ..., ..., FFTW_R2HC, FFTW_EXHAUSTIVE).
    • If wisdom was not loaded from disk, it is saved via fftwf_export_wisdom_to_filename().
    • After the above construction steps are completed, the transform is repeatedly updated in a loop as new signal data is fed into the thread's input window and fftwf_execute() is called after each update.
    • After the work loop is completed, the signal data is freed via fftwf_free(), the plan is cleaned up with fftwf_destroy_plan(), and lastly fftwf_cleanup().

    This works fine for a single thread, but as soon as multiple threads perform the above, cleanup fails in fftwf_destroy_plan() with a SIGSEGV.

    I'm aware FFTW has some functions to assist with thread safety, but the documentation I saw wasn't clear to me on what to do in the scenario where each thread is operating on totally independent data and plans.
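
    A possible reading of the steps above, offered as an assumption rather than a diagnosis: the FFTW manual describes fftw_execute (and its new-array variants) as the only thread-safe routines, and states that existing plans become undefined after fftw_cleanup. Under that reading, plan creation, wisdom import/export and plan destruction would each need to be serialized, and the global cleanup deferred until the last thread has destroyed its plan. The sketch below illustrates this with a mutex and a counter. worker_setup, worker_teardown and the fake_* stand-ins are hypothetical names, and the callbacks take the place of the fftwf_* calls so that the example builds without fftw3.h.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*void_call)(void *arg);

static pthread_mutex_t fft_lock = PTHREAD_MUTEX_INITIALIZER;
static int active_workers = 0;  /* workers that still own a plan */

/* Run the setup steps (wisdom import, plan creation, wisdom export) under
   the lock and register the worker as a plan owner. */
static void worker_setup(void_call setup, void *arg)
{
    assert(setup != NULL);
    pthread_mutex_lock(&fft_lock);
    setup(arg);
    active_workers++;
    pthread_mutex_unlock(&fft_lock);
}

/* Destroy the worker's plan under the lock; run the global cleanup only
   once the last plan owner is gone. */
static void worker_teardown(void_call destroy_plan, void *arg,
                            void_call global_cleanup)
{
    assert(destroy_plan != NULL);
    pthread_mutex_lock(&fft_lock);
    destroy_plan(arg);
    active_workers--;
    if (active_workers == 0 && global_cleanup != NULL)
        global_cleanup(NULL);
    pthread_mutex_unlock(&fft_lock);
}

/* Stand-ins for the library calls (assumptions, not FFTW code). */
static int plans_alive = 0;
static int cleanups_run = 0;

static void fake_setup(void *arg)   { (void)arg; plans_alive++; }
static void fake_destroy(void *arg) { (void)arg; plans_alive--; }
static void fake_cleanup(void *arg)
{
    (void)arg;
    assert(plans_alive == 0);  /* no plan may outlive the global cleanup */
    cleanups_run++;
}
```

    The work loop itself, which only executes an existing plan on the thread's own buffers, would stay outside the lock.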

    I am using my distro's fftw 3.3.8-2 on amd64 with single precision floating point.

    opened by kiplingw 12
  • fftw_plan_dft_c2r_2d chopping off residual imaginary parts

    I ran into an annoying problem caused by the fact that fftw_plan_dft_c2r_2d "chops off" the residual imaginary parts that come from the rounding error of the IFT. To solve the problem I had to switch to fftw_plan_dft_2d and work with complex arrays. Everything is explained in detail in this Stackoverflow question and answer that I wrote. I think that this behaviour of fftw_plan_dft_c2r_2d should be mentioned in the guide, as it can lead to problems when used recursively in a sum, as I did.

    opened by francescoboc 12
  • windows cmake

    2> Creating C:/bin/fftw37/Release/fftw3.lib and obj C:/bin/fftw37/Release/fftw3.exp
    2>conf.c.obj : error LNK2019: unresolved _fftw_solvtab_dft_standard,symbols _fftw_dft_conf_standard
    2>conf.c.obj : error LNK2019: unresolved _fftw_solvtab_rdft_r2cf,symbols _fftw_rdft_conf_standard
    2>conf.c.obj : error LNK2019: unresolved _fftw_solvtab_rdft_r2cb,symbols _fftw_rdft_conf_standard
    2>conf.c.obj : error LNK2019: unresolved _fftw_solvtab_rdft_r2r,symbols _fftw_rdft_conf_standard
    2>C:\bin\fftw37\Release\fftw3.dll : fatal error LNK1120: 4 Unresolved external symbols

    opened by sunjunlishi 12
  • Add support for cmake-based builds


    We are trying to add a Windows recipe on conda-forge for fftw3, see the discussion here. Would it be possible to work something out if there is a config.h.in that we can make CMake use? Or, is there an easier alternative that would be recommended?

    opened by shadowwalkersb 12
  • FFTW 3.3.5 --enable-avx build failed by using pgcc


    Hello all!

    I tried to build FFTW version 3.3.5 with the --enable-avx flag using the PGI 16.9 compiler and got the following errors:

    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx.h: 262)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx.h: 271)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx.h: 280)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx.h: 289)

    Using --enable-avx2 produces further errors of the same type:

    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 102)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 103)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 104)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 105)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 90)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 91)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 94)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 95)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 96)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_10.c: 98)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 100)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 101)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 102)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 106)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 107)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 108)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 83)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 86)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 87)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 88)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 89)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 90)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 91)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 94)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 95)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 96)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_11.c: 97)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_3.c: 54)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_5.c: 64)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_5.c: 65)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_5.c: 68)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_5.c: 70)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_5.c: 71)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_6.c: 69)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_6.c: 71)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 70)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 71)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 72)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 73)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 75)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 76)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 79)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 80)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 81)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 82)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 85)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 86)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 87)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 88)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_7.c: 89)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_8.c: 80)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_8.c: 81)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_8.c: 82)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_8.c: 83)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 103)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 104)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 105)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 106)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 107)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 108)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 110)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 111)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 83)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 87)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 89)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 93)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 94)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 95)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 96)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 97)
    PGC-S-0094-Illegal type conversion required (./../common/n1fv_9.c: 98)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx2.h: 263)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx2.h: 273)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx2.h: 278)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx2.h: 282)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx2.h: 301)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx2.h: 341)
    PGC-S-0094-Illegal type conversion required (../../../simd-support/simd-avx2.h: 349)

    Is there any way to overcome this issue other than disabling AVX and AVX2?

    opened by DarthLaran 11
  • Quantum Espresso test case fails with segmentation fault in FFTW


    Hi, Quantum Espresso (QE) v6.5 fails with a segmentation fault in FFTW when QE is configured with --enable-parallel and --enable-openmp. The exact details follow.

    Steps to reproduce this issue :-

    (This issue can be reproduced on Ubuntu 19.10, openMPI v4.0.5, GCC 9.2.1.)

    1. Download QE v6.5 package from https://github.com/QEF/q-e/releases/tag/qe-6.5
    2. Configure qe-6.5:-
      ./configure --enable-parallel --enable-openmp --with-scalapack=no --with-netlib CC=${MPICC} FC=${MPIFC} F77=${MPIF77} LDFLAGS="-pthread -fopenmp" CFLAGS="-O3 -ffast-math -mavx2" FFLAGS="-O3 -ffast-math -mavx2" FCFLAGS="-O3 -ffast-math -mavx2" FFT_LIBS="-L$FFTW_HOME/lib -lfftw3 -lfftw3_omp" MPI_LIBS="-I$MPI_HOME/include -pthread --enable-new-dtags -L$MPI_HOME/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi" --prefix=$INSTALL_DIR
    3. make all
    4. make install
    5. Unzip the attachment "vc-relax6.zip" and keep the input test file "vc-relax6.in" in your directory from where you are executing pw.x binary of qe-6.5
    6. Run pw.x as :- mpirun -np 2 -x OMP_NUM_THREADS=1 $INSTALL_DIR/bin/pw.x -in vc-relax6.in
       Please note that this issue appears only with 2 or more MPI ranks.

    Running pw.x with the test case produces the following segmentation fault :-

    Thread 1 "pw.x" received signal SIGSEGV, Segmentation fault.

    0x00007ffff7ab78f2 in awake (ego_=0x555557294d40, wakefulness=SLEEPY) at dftw-direct.c:130
    130 X(twiddle_awake)(wakefulness, &ego->td, ego->slv->desc->tw,

    (gdb) bt
    #0 0x00007ffff7ab78f2 in awake (ego_=0x555557294d40, wakefulness=SLEEPY) at dftw-direct.c:130
    #1 0x00007ffff7aae8aa in fftw_plan_awake (ego=0x555557294d40, wakefulness=SLEEPY) at plan.c:66
    #2 0x00007ffff7aae8aa in fftw_plan_awake (ego=0x55555727f4a0, wakefulness=SLEEPY) at plan.c:66
    #3 0x00007ffff7aae8aa in fftw_plan_awake (ego=0x55555727f5c0, wakefulness@entry=SLEEPY) at plan.c:66
    #4 0x00007ffff7b7234c in fftw_destroy_plan (p=0x555557278260) at apiplan.c:565
    #5 0x00007ffff7b7273c in dfftw_destroy_plan_ (p=) at f77funcs.h:34
    #6 0x0000555555a99e08 in fft_scalar_fftw3::init_plan () at fft_scalar.FFTW3.f90:151
    #7 fft_scalar_fftw3::cft_1z (c=..., nsl=200, nz=20, ldz=20, isign=1, cout=...) at fft_scalar.FFTW3.f90:104
    #8 0x0000555555a9bcf6 in fft_parallel::tg_cft3s (f=..., dfft=..., isgn=1) at fft_parallel.f90:121
    #9 0x0000555555a95211 in invfft_y (fft_kind=<error reading variable: value requires 3943370 bytes, which is more than max-value-size>, f=..., dfft=..., howmany=<error reading variable: Cannot access memory at address 0x0>, [email protected]=3) at fft_fwinv.f90:66
    #10 0x0000555555694fed in setlocal () at setlocal.f90:96
    #11 0x00005555557a14e8 in hinit0 () at hinit0.f90:83
    #12 0x00005555555fdc6d in init_run () at init_run.f90:113
    #13 0x0000555555679540 in reset_gvectors () at run_pwscf.f90:325
    #14 run_pwscf (exit_status=0) at run_pwscf.f90:228
    #15 0x0000555555564f6a in pwscf () at pwscf.f90:103
    #16 0x0000555555564c8f in main (argc=, argv=) at pwscf.f90:40
    #17 0x00007ffff739b1e3 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
    #18 0x0000555555564cbe in _start () at pwscf.f90:40

    • This segmentation fault is not encountered if QE's pw.x is not configured with --enable-openmp.
    • The solver data ego->slv is invalid at this point in awake(), and accessing its elements causes the segmentation fault.
    • If QE's pw.x binary is run with "mpirun -np 1", it has no issues.

    Any thoughts on whether this is actually an FFTW issue or a possible bug in how QE uses FFTW's init_plan and destroy_plan functions? If it is an FFTW issue, it needs to be fixed for the QE test case to pass.

    vc-relax6.zip

    With Regards, S. Biplab Raut

    bug 
    opened by BiplabRaut 10
  • Clarify current/future cmake support


    The wording in the README contradicts the current status of the project. In particular:

    • You can build the project from cmake, and it is suggested in order to get native support
      • If there are missing implementations specific to autotools, please enumerate them so that we can work on porting them
    • Packagers should migrate to the cmake build in order to support find_package() #130
    • What is the future of cmake support, and what is the view on migrating to it completely?
    opened by LecrisUT 6
  • generic-256simd and openmp options induce results that depend on the number of threads


    Hello,

    This issue might be related to #294.

    • FFTW version

    3.3.10

    • Obtained results

    I experienced what seems to be a bug when compiling with the --generic-256simd option. I obtain different results when running sequentially and when running with multiple threads (openmp).

    • Expected result

    Identical results whatever the number of threads.

    Here is a piece of code that reproduces the problem. I put identical data into data1 and data2, then perform a sequential transform on the first and a 4-thread transform on the second. Then I collect the indices where the results differ and print the corresponding values:

    #include <array>
    #include <cassert>
    #include <complex>
    #include <cstddef>
    #include <fftw3.h>
    #include <filesystem>
    #include <functional>
    #include <iostream>
    #include <numeric>
    #include <omp.h>
    #include <random>
    #include <vector>
    
    using namespace std;
    
    fftw_complex *init_rand_data(int n)
    {
        uniform_real_distribution<double> distrib(0.0, 1.0);
        mt19937 engine;
        auto gen = [&distrib, &engine]() { return distrib(engine); };
        fftw_complex *res;
        res = (fftw_complex *)fftw_malloc(sizeof(fftw_complex) * n);
        for (size_t i = 0; i < n; i++)
        {
            res[i][0] = gen();
            res[i][1] = 0.0; // fftw_malloc does not zero memory, so set the imaginary part explicitly
        }
        return res;
    }
    
    vector<size_t> get_diff_indices(const fftw_complex *v1, const fftw_complex *v2,
                                    int n)
    {
        vector<size_t> is{};
        for (size_t i = 0; i < n; i++)
        {
            if ((v1[i][0] != v2[i][0]) || (v1[i][1] != v2[i][1]))
                is.push_back(i);
        }
        return is;
    }
    
    void diff_print(const vector<size_t> &i_diff, const fftw_complex *d1,
                    const fftw_complex *d2, int n)
    {
        for (auto const i : i_diff)
        {
            cout << "data1[" << i << "]=" << d1[i][0] << "," << d1[i][1]
                 << " data2[" << i << "]=" << d2[i][0] << "," << d2[i][1] << "\n";
        }
    }
    
    int main(int argc, char *argv[])
    {
        fftw_init_threads();
    
        const array<int, 3> N_global{256, 256, 256};
        const int N = N_global[0] * N_global[1] * N_global[2];
    
        cout << "Domain size: " << N_global[0] << "*" << N_global[1] << "*"
             << N_global[2] << "\n";
    
        auto data1 = init_rand_data(N);
        fftw_complex *data2;
        data2 = (fftw_complex *)fftw_malloc(sizeof(fftw_complex) * N);
        for (size_t i = 0; i < N; i++)
        {
            data2[i][0] = data1[i][0];
            data2[i][1] = data1[i][1];
        }
    
        cout << "data generated" << endl;
    
        /** 1 thread case */
        int nth1 = 1;
        fftw_plan_with_nthreads(nth1);
    
        auto p1 = fftw_plan_dft(3, &N_global[0], data1, data1, -1, FFTW_ESTIMATE);
    
        cout << "Running with " << nth1 << "thread(s)\n ";
        fftw_execute(p1);
    
        /** n threads case */
        int nth2 = 4;
        fftw_plan_with_nthreads(nth2);
    
        auto p2 = fftw_plan_dft(3, &N_global[0], data2, data2, -1, FFTW_ESTIMATE);
    
        cout << "Running with " << nth2 << "thread(s)\n ";
        fftw_execute(p2);
    
        /** For debug */
        // data1[76][0] = 12.0;
        // data1[98][0] = -1.0;
    
        /** Differences between data1 and data2*/
        auto diff = get_diff_indices(data1, data2, N);
    
        /** Print differences by process */
        cout << "Differences:\n ";
        diff_print(diff, data1, data2, N);
    
        cout << "End" << endl;
        fftw_destroy_plan(p1);
        fftw_free(data1);
        fftw_destroy_plan(p2);
        fftw_free(data2); // data2 was allocated with fftw_malloc as well
        fftw_cleanup_threads();
    }
    

    Here is a sample of the output with the --generic-256simd option:

    Domain size: 256*256*256
    data generated
    Running with 1thread(s)
     Running with 4thread(s)
     Differences:
     data1[16384]=315.297,235.74 data2[16384]=960.016,505.978
    data1[32768]=1983.18,1.27898e-13 data2[32768]=-134.363,-93.4117
    data1[49152]=315.297,-235.74 data2[49152]=529.096,-620.237
    data1[81920]=-22.5832,-181.609 data2[81920]=435.089,313.97
    data1[98304]=-606.381,233.907 data2[98304]=170.796,-30.2235
    data1[114688]=105.228,149.729 data2[114688]=-174.098,-249.013
    data1[147456]=-952.618,-164.618 data2[147456]=-2512.8,214.681
    data1[163840]=-4191.25,-735.789 data2[163840]=-405.033,-1910
    ....
    

    and the output without the --generic-256simd option:

    Domain size: 256*256*256
    data generated
    Running with 1thread(s)
     Running with 4thread(s)
     Differences:
     End
    

    I tried smaller domain sizes (128³, 64³ ...) and the problem is not visible.

    Best,

    Simon
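The reproduction above compares results with exact `!=`, which also flags differences in the last bit. A relative error norm helps separate rounding-level differences from genuinely wrong results such as the ones listed above. The helper below is a minimal sketch in standard C++. `complex_pair` is a stand-in with the same layout as fftw_complex so that the sketch builds without fftw3.h, and the name `relative_l2_error` is illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Same layout as fftw_complex (double[2]); declared here as a stand-in.
using complex_pair = double[2];

// Relative L2 error ||a - b|| / ||a|| over n complex values.
// Returns 0 if both inputs are identical, infinity if a is all-zero but b is not.
double relative_l2_error(const complex_pair *a, const complex_pair *b, std::size_t n) {
    assert(a != nullptr && b != nullptr);
    double diff2 = 0.0;  // squared norm of the difference
    double ref2 = 0.0;   // squared norm of the reference
    for (std::size_t i = 0; i < n; ++i) {
        const double dr = a[i][0] - b[i][0];
        const double di = a[i][1] - b[i][1];
        diff2 += dr * dr + di * di;
        ref2 += a[i][0] * a[i][0] + a[i][1] * a[i][1];
    }
    if (ref2 == 0.0) {
        return diff2 == 0.0 ? 0.0 : std::numeric_limits<double>::infinity();
    }
    return std::sqrt(diff2 / ref2);
}
```

In the reproduction it could be called as `relative_l2_error(data1, data2, N)`. A value near double-precision rounding would point to harmless reordering of operations, while a value of order one, as the printed samples suggest, points to a real defect.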

    opened by simonlegrand 0
  • Overhead on 'new array exec funs'


    Hi, I've been processing a large number (say thousands) of arrays, each with the same fixed but small size (like 20 in one dimension). I believe this scenario is better suited to the 'new array execute functions' than to multi-threaded FFTW. The job is divided by arrays and sent to several threads, all of which share a single 'plan' (maintained by the master thread). Should I expect any bottleneck from the threads accessing the shared 'plan'? I ask because the overall speed seemed to flatten out once more than 5 or 6 threads were used.

    thanks, gz15028
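The FFTW manual documents the execute functions, including the new-array variants, as thread-safe, so several threads may run one plan on different arrays. The sketch below mimics that pattern in standard C++ only. `Plan`, `make_plan`, and `execute` are hypothetical stand-ins (a read-only twiddle table and a naive DFT), not FFTW's API, and the sketch says nothing about FFTW's actual scaling.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <thread>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical stand-in for a plan: precomputed twiddle factors, read-only after creation.
struct Plan {
    std::size_t n;
    std::vector<cplx> twiddle;  // twiddle[m] = exp(-2*pi*i*m/n)
};

Plan make_plan(std::size_t n) {
    assert(n > 0);
    Plan p{n, std::vector<cplx>(n)};
    const double two_pi = 2.0 * std::acos(-1.0);
    for (std::size_t m = 0; m < n; ++m) {
        const double angle = -two_pi * static_cast<double>(m) / static_cast<double>(n);
        p.twiddle[m] = cplx(std::cos(angle), std::sin(angle));
    }
    return p;
}

// Stand-in for a new-array execute call: the plan is only read, never modified.
void execute(const Plan &p, const cplx *in, cplx *out) {
    for (std::size_t k = 0; k < p.n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < p.n; ++j) {
            sum += in[j] * p.twiddle[(j * k) % p.n];
        }
        out[k] = sum;
    }
}

// Transforms `count` contiguous arrays of length p.n, split into disjoint ranges per thread.
void transform_all(const Plan &p, const std::vector<cplx> &in, std::vector<cplx> &out,
                   std::size_t count, unsigned nthreads) {
    assert(nthreads > 0 && in.size() >= count * p.n && out.size() >= count * p.n);
    const std::size_t chunk = (count + nthreads - 1) / nthreads;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t) {
        const std::size_t begin = std::min(count, t * chunk);
        const std::size_t end = std::min(count, begin + chunk);
        pool.emplace_back([&p, &in, &out, begin, end] {
            for (std::size_t a = begin; a < end; ++a) {
                execute(p, &in[a * p.n], &out[a * p.n]);
            }
        });
    }
    for (auto &th : pool) th.join();
}
```

Because each thread only reads the shared plan and writes to its own slice of the output, no locking is needed in this sketch. For arrays this small, thread start-up cost and memory bandwidth are plausible limits on scaling, though the thread does not establish the cause.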

    opened by gz15028 0
  • how to set the number cpus in FFTW mpi


    I obtain correct results when I set the number of CPUs to 1, 2, or 4 for N = 40, but when I set it to 8 or 10, I get completely wrong results. Could you please give me some advice about how to set the number of CPUs?

    opened by ztdepztdep 0
  • FFTW thread safe usage for Mac M1.


    Hi,

    1. When creating FFTW plans from more than one thread, the planner returns a NULL handle. Here is the link describing thread-safe usage with the ‘void fftw_make_planner_thread_safe(void)’ API. It looks like it’s not available on Mac M1. In the meantime we found another document saying: fftw_make_planner_thread_safe is unnecessary in the Arm Performance Libraries implementation since all planning is thread-safe. Could you please share the correct thread-safe usage?

    2. BTW, we tried synchronizing only the plan creation, and it mitigates the issue, but it still doesn’t work correctly if the plans are created with different sizes.

    Attaching source code to replicate the issue. fftw_simple_test.zip

    Thanks in advance.

    opened by aharutyunyan1 0
Releases(fftw-3.3.7)
  • fftw-3.3.7(Oct 29, 2017)

    The files in this github page contain development files used by the FFTW developers, and are not meant for end-user consumption. DO NOT DOWNLOAD THESE FILES unless you know what you are doing. Normal users should obtain FFTW from http://fftw.org

    Source code(tar.gz)
    Source code(zip)
  • dont-use-me(Jul 31, 2016)

    The files in this github page contain development files used by the FFTW developers, and are not meant for end-user consumption. DO NOT DOWNLOAD THESE FILES unless you know what you are doing. Normal users should obtain FFTW from http://fftw.org

    Source code(tar.gz)
    Source code(zip)
Owner
FFTW