An Aspiring Drop-In Replacement for NumPy at Scale

Overview

Legate NumPy

Legate NumPy is a Legate library that aims to provide a distributed and accelerated drop-in replacement for the NumPy API on top of the Legion runtime. Using Legate NumPy, you can do things like run the final example of the Python CFD course completely unmodified on 2048 A100 GPUs in a DGX SuperPOD and achieve good weak scaling.

Legate NumPy works best for programs that have very large arrays of data that cannot fit in the memory of a single GPU or a single node and need to span multiple nodes and GPUs. While our implementation of the current NumPy API is still incomplete, programs that use unimplemented features will still work (assuming enough memory) by falling back to the canonical NumPy implementation.

If you have questions, please contact us at legate(at)nvidia.com.

  1. Dependencies
  2. Usage and Execution
  3. Supported and Planned Features
  4. Supported Types and Dimensions
  5. Documentation
  6. Future Directions
  7. Known Bugs

Dependencies

Users must have a working installation of the Legate Core library prior to installing Legate NumPy.

Legate NumPy requires Python >= 3.6. We provide a conda environment file that installs all needed dependencies in one step. Use the following command to create a conda environment with it:

conda env create -n legate -f conda/legate_numpy_dev.yml

Installation

Installation of Legate NumPy is done with either setup.py for simple use cases or install.py for more advanced use cases. The most common installation command is:

python setup.py --with-core <path-to-legate-core-installation>

This will build Legate NumPy against the Legate Core installation and then install Legate NumPy into the same location. Users can also install Legate NumPy into an alternative location with the canonical --prefix flag:

python setup.py --prefix <install-dir> --with-core <path-to-legate-core-installation>

Note that after the first invocation of setup.py this repository will remember which Legate Core installation to use and the --with-core option can be omitted unless the user wants to change it.

Advanced users can run install.py --help to see the options available when configuring Legate NumPy through the install.py script directly.

Of particular interest to Legate NumPy users will likely be the option for specifying an installation of OpenBLAS to use. If you already have an installation of OpenBLAS on your machine you can inform the install.py script about its location using the --with-openblas flag:

python setup.py --with-openblas /path/to/open/blas/

Usage and Execution

Using Legate NumPy as a replacement for NumPy is easy. Users only need to replace:

import numpy as np

with:

import legate.numpy as np

These programs can then be run by the Legate driver script described in the Legate Core documentation.

legate legate_numpy_program.py

For execution with multiple nodes (assuming Legate Core is installed with GASNet support), users can supply the --nodes flag. For execution with GPUs, users can use the --gpus flag to specify the number of GPUs to use per node. We encourage all users to familiarize themselves with these resource flags as described in the Legate Core documentation or simply by passing --help to the legate driver script.
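As a minimal sketch of what such a program might look like, the script below uses only array creation, slicing, and a sum reduction. The file name, the grid size, the smooth function, and the try/except fallback to canonical NumPy are assumptions added for illustration so that the script also runs where Legate is not installed; none of them are part of Legate itself.

```python
# legate_numpy_program.py (hypothetical example script)
try:
    import legate.numpy as np  # drop-in replacement described above
except ImportError:
    import numpy as np  # assumption: fall back so the sketch runs anywhere


def smooth(grid, steps):
    """Apply `steps` rounds of 4-point averaging to the interior of `grid`."""
    for _ in range(steps):
        old = grid.copy()  # read from a snapshot, write into `grid`
        grid[1:-1, 1:-1] = 0.25 * (
            old[1:-1, 2:] + old[1:-1, :-2] + old[2:, 1:-1] + old[:-2, 1:-1]
        )
    return grid


if __name__ == "__main__":
    grid = np.zeros((64, 64))
    grid[0, :] = 1.0  # fixed value along one edge
    result = smooth(grid, steps=10)
    print("sum of grid values:", float(np.sum(result)))
```

With Legate installed, a script like this would be launched as legate legate_numpy_program.py, optionally with the resource flags described above.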

Supported and Planned Features

Legate NumPy is currently a work in progress and we are gradually adding support for additional NumPy operators. Unsupported NumPy operations will provide a warning that we are falling back to canonical NumPy. Please report unimplemented features that are necessary for attaining good performance so that we can triage them and prioritize implementation appropriately. The more users that report an unimplemented feature, the more we will prioritize it. Please include a pointer to your code if possible too so we can see how you are using the feature in context.

Supported Types and Dimensions

Legate NumPy currently supports the following NumPy types: float16, float32, float64, int16, int32, int64, uint16, uint32, uint64, bool, complex64, and complex128. Legate also currently works only on arrays of up to three dimensions. We are working on support for N-D arrays. If you have a need for arrays with more than three dimensions, please let us know about it.
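The constraints above can be checked ahead of time with a small helper. This is only a sketch using canonical NumPy: the names SUPPORTED_DTYPES, MAX_DIMS, and is_supported are hypothetical and are not provided by Legate NumPy; the type list is copied from the paragraph above.

```python
import numpy as np

# Type names copied from the list of supported NumPy types above.
SUPPORTED_DTYPES = frozenset({
    "float16", "float32", "float64",
    "int16", "int32", "int64",
    "uint16", "uint32", "uint64",
    "bool", "complex64", "complex128",
})
MAX_DIMS = 3  # arrays of up to three dimensions


def is_supported(array):
    """Return True if `array` has a listed dtype and at most MAX_DIMS dimensions."""
    return array.dtype.name in SUPPORTED_DTYPES and array.ndim <= MAX_DIMS


if __name__ == "__main__":
    print(is_supported(np.zeros((4, 4), dtype=np.float32)))  # listed dtype, 2D
    print(is_supported(np.zeros((2, 2, 2, 2))))              # 4D array
    print(is_supported(np.zeros(4, dtype=np.int8)))          # int8 is not listed
```

A check like this might be useful when deciding which arrays of an existing program fall within the currently supported set.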

Documentation

A complete list of available features is provided in the API reference.

Future Directions

There are three primary directions that we plan to investigate with Legate NumPy going forward:

  • More features: we plan to identify a few key lighthouse applications and use the demands of these applications to drive the addition of new features to Legate NumPy.
  • We plan to add support for sharded file I/O for loading and storing large data sets that could never be loaded on a single node. Initially this will begin with native support for h5py but will grow to accommodate other formats needed by our lighthouse applications.
  • Strong scaling: while Legate NumPy is currently implemented in a way that enables weak scaling of codes on larger data sets, we would also like to make it possible to strong-scale Legate applications for a single problem size. This will require leveraging some of the more advanced features of Legion from inside the Python interpreter.

We are open to comments, suggestions, and ideas.
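To make the distinction between weak and strong scaling concrete, the sketch below computes the usual efficiency figures for each. The function names and the timings in the example are hypothetical and chosen only for illustration; they are not measurements of Legate NumPy.

```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Weak scaling: the problem grows with the processor count,
    so the ideal runtime stays constant and efficiency is t_base / t_scaled."""
    return t_base / t_scaled


def strong_scaling_efficiency(t_base, t_scaled, n_base, n_scaled):
    """Strong scaling: the problem size is fixed, so the ideal runtime
    shrinks in proportion to the number of processors."""
    return (t_base * n_base) / (t_scaled * n_scaled)


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print(weak_scaling_efficiency(t_base=10.0, t_scaled=12.5))
    print(strong_scaling_efficiency(t_base=100.0, t_scaled=20.0,
                                    n_base=1, n_scaled=8))
```

An efficiency of 1.0 means ideal scaling in both cases; lower values indicate overhead that grows with the processor count.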

Known Bugs

  • Legate NumPy can exercise a bug in OpenBLAS when it is run with multiple OpenMP processors.
  • On Mac OSX, Legate NumPy can trigger a bug in Apple's implementation of libc++. The bug has since been fixed but likely will not show up on most Apple machines for quite some time. You may have to manually patch your implementation of libc++. If you have trouble doing this please contact us and we will be able to help you.
Issues
  • Build OpenBLAS with CROSS option to prevent tests at compile time

    This change would prevent OpenBLAS from running tests during compilation. This is needed when building on machines or in conditions (e.g., using docker) that cause some of the OpenBLAS tests to fail. The potential downside of this is that we would want to run with OpenBLAS checks by default when users are building in the same environment in which they will run. If we want to run OpenBLAS tests by default, we could add an option to prevent testing at build time.

    opened by marcinz 18
  • legate numpy very slow compared to Python+Numpy

    I've been testing a simple Laplace Eq. solver to compare Python+Numpy to legate.numpy and legate is hugely slower than Numpy.

    The code is taken from: https://barbagroup.github.io/essential_skills_RRC/laplace/1/ . The code I actually run is the following:

    import numpy as np
    import time
    
    
    def L2_error(p, pn):
        return np.sqrt(np.sum((p - pn)**2)/np.sum(pn**2))
    # end def
    
    
    def laplace2d(p, l2_target):
        '''Iteratively solves the Laplace equation using the Jacobi method
    
        Parameters:
        ----------
        p: 2D array of float
            Initial potential distribution
        l2_target: float
            target for the difference between consecutive solutions
    
        Returns:
        -------
        p: 2D array of float
            Potential distribution after relaxation
        '''
    
        l2norm = 1.0
        icount = 0
        tot_time = 0.0
        pn = np.empty_like(p)
        while l2norm > l2_target:
    
            start = time.perf_counter()
    
            icount = icount + 1
            pn = p.copy()
            p[1:-1,1:-1] = .25 * (pn[1:-1,2:] + pn[1:-1, :-2] \
                                  + pn[2:, 1:-1] + pn[:-2, 1:-1])
    
            ##Neumann B.C. along x = L
            p[1:-1, -1] = p[1:-1, -2]     # 1st order approx of a derivative 
            l2norm = L2_error(p, pn)
            end = time.perf_counter()
    
            tot_time = tot_time + (end-start)
    
        # end while
    
        print("l2norm = ",l2norm)
        print("icount = ",icount)
        print("Total Iteration Time = ",tot_time)
        print("   Time per iteration = ",tot_time/icount)
    
        return p
    # end def
    
    
    
    if __name__ == "__main__":
    
        nx = 401
        ny = 401
    
        # Initial conditions
        p = np.zeros((ny,nx)) ##create a XxY vector of 0's
    
        # Dirichlet boundary conditions
        x = np.linspace(0,1,nx)
        p[-1,:] = np.sin(1.5*np.pi*x/x[-1])
        del x
    
    
        start = time.time()
        p = laplace2d(p.copy(), 1e-8)
        stop = time.time()
    
        print("Elapsed time = ",(stop-start)," secs")
        print(" ")
    
    
    # end if
    

    When I run it on my laptop with Anaconda Python3 and Numpy I get the following:

    $ python3 jacobi.py 
    l2norm =  9.99986062249016e-09
    icount =  153539
    Total Iteration Time =  127.02529454990054
       Time per iteration =  0.0008273161512703648
    Elapsed time =  127.14257955551147  secs
    

    When I change the import line to legate.numpy, I usually stop the code after 15 minutes of wall time. I have let it run for up to 60 minutes and it never converges.

    As a check, I've run the Numpy code with legate itself and it exactly matches the Numpy results.

    I have been experimenting with replacing the l2norm computations with numpy specific functions (np.subtract, np.square, etc.) but I have achieved no increase in performance.

    Does anyone have any recommendations?

    Thanks!

    Jeff

    (edit by Manolis: added some formatting for the code sections)

    enhancement 
    opened by laytonjbgmail 15
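One experiment that might help narrow down a slowdown like the one reported above is to evaluate the convergence norm less often, because the while test needs the value of the norm on every iteration and a deferred-execution runtime may have to wait for it. The sketch below is a variant of the reported solver under that assumption. The function name laplace2d_sparse_check and the check_every parameter are hypothetical, the code uses canonical NumPy, and its effect on Legate performance has not been measured here.

```python
import numpy as np


def l2_error(p, pn):
    """Relative L2 difference between two successive iterates."""
    return np.sqrt(np.sum((p - pn) ** 2) / np.sum(pn ** 2))


def laplace2d_sparse_check(p, l2_target, check_every=10):
    """Jacobi relaxation that tests convergence only every `check_every` sweeps.

    Returns the relaxed array and the number of sweeps performed.
    """
    icount = 0
    l2norm = 1.0
    while l2norm > l2_target:
        for _ in range(check_every):
            icount += 1
            pn = p.copy()
            p[1:-1, 1:-1] = 0.25 * (
                pn[1:-1, 2:] + pn[1:-1, :-2] + pn[2:, 1:-1] + pn[:-2, 1:-1]
            )
            p[1:-1, -1] = p[1:-1, -2]  # Neumann boundary condition along x = L
        # Only the last two iterates of each batch are compared.
        l2norm = float(l2_error(p, pn))
    return p, icount
```

Because convergence is tested once per batch, this variant can run up to check_every - 1 sweeps more than the original loop before stopping.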
  • use OpenBLAS develop branch

    This is clearly an issue in OpenBLAS but it blocks my Legate Numpy install and is unexpected, based on my experience with OpenBLAS in other contexts.

    ~/LEGATE/np$ python3 ./install.py --install-dir $HOME/LEGATE --with-core $HOME/LEGATE 2>&1 | tee log
    Verbose build is  off
    Legate is installing OpenBLAS into a local directory...
    Cloning into '/tmp/tmpm780ryjm'...
    Note: switching to 'd2b11c47774b9216660e76e2fc67e87079f26fa1'.
    
    You are in 'detached HEAD' state. You can look around, make experimental
    changes and commit them, and you can discard any commits you make in this
    state without impacting any branches by switching back to a branch.
    
    If you want to create a new branch to retain commits you create, you may
    do so (now or later) by using -c with the switch command. Example:
    
      git switch -c <new-branch-name>
    
    Or undo this operation with:
    
      git switch -
    
    Turn off this advice by setting config variable advice.detachedHead to false
    
    Switched to a new branch 'master'
    getarch_2nd.c: In function ‘main’:
    getarch_2nd.c:14:35: error: ‘SGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_UNROLL_M’?
       14 |     printf("SGEMM_UNROLL_M=%d\n", SGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   SBGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:14:35: note: each undeclared identifier is reported only once for each function it appears in
    getarch_2nd.c:15:35: error: ‘SGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_UNROLL_N’?
       15 |     printf("SGEMM_UNROLL_N=%d\n", SGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   SBGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:16:35: error: ‘DGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘XGEMM_DEFAULT_UNROLL_M’?
       16 |     printf("DGEMM_UNROLL_M=%d\n", DGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   XGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:17:35: error: ‘DGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘QGEMM_DEFAULT_UNROLL_N’?
       17 |     printf("DGEMM_UNROLL_N=%d\n", DGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   QGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:21:35: error: ‘CGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘XGEMM_DEFAULT_UNROLL_M’?
       21 |     printf("CGEMM_UNROLL_M=%d\n", CGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   XGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:22:35: error: ‘CGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘QGEMM_DEFAULT_UNROLL_N’?
       22 |     printf("CGEMM_UNROLL_N=%d\n", CGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   QGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:23:35: error: ‘ZGEMM_DEFAULT_UNROLL_M’ undeclared (first use in this function); did you mean ‘XGEMM_DEFAULT_UNROLL_M’?
       23 |     printf("ZGEMM_UNROLL_M=%d\n", ZGEMM_DEFAULT_UNROLL_M);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   XGEMM_DEFAULT_UNROLL_M
    getarch_2nd.c:24:35: error: ‘ZGEMM_DEFAULT_UNROLL_N’ undeclared (first use in this function); did you mean ‘QGEMM_DEFAULT_UNROLL_N’?
       24 |     printf("ZGEMM_UNROLL_N=%d\n", ZGEMM_DEFAULT_UNROLL_N);
          |                                   ^~~~~~~~~~~~~~~~~~~~~~
          |                                   QGEMM_DEFAULT_UNROLL_N
    getarch_2nd.c:71:50: error: ‘SGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       71 |     printf("#define SLOCAL_BUFFER_SIZE\t%ld\n", (SGEMM_DEFAULT_Q * SGEMM_DEFAULT_UNROLL_N * 4 * 1 *  sizeof(float)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    getarch_2nd.c:72:50: error: ‘DGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       72 |     printf("#define DLOCAL_BUFFER_SIZE\t%ld\n", (DGEMM_DEFAULT_Q * DGEMM_DEFAULT_UNROLL_N * 2 * 1 *  sizeof(double)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    getarch_2nd.c:73:50: error: ‘CGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       73 |     printf("#define CLOCAL_BUFFER_SIZE\t%ld\n", (CGEMM_DEFAULT_Q * CGEMM_DEFAULT_UNROLL_N * 4 * 2 *  sizeof(float)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    getarch_2nd.c:74:50: error: ‘ZGEMM_DEFAULT_Q’ undeclared (first use in this function); did you mean ‘SBGEMM_DEFAULT_Q’?
       74 |     printf("#define ZLOCAL_BUFFER_SIZE\t%ld\n", (ZGEMM_DEFAULT_Q * ZGEMM_DEFAULT_UNROLL_N * 2 * 2 *  sizeof(double)));
          |                                                  ^~~~~~~~~~~~~~~
          |                                                  SBGEMM_DEFAULT_Q
    make: *** [Makefile.prebuild:74: getarch_2nd] Error 1
    Makefile:154: *** OpenBLAS: Detecting CPU failed. Please set TARGET explicitly, e.g. make TARGET=your_cpu_target. Please read README for the detail..  Stop.
    Traceback (most recent call last):
      File "./install.py", line 543, in <module>
        driver()
      File "./install.py", line 539, in driver
        install_legate_numpy(unknown=unknown, **vars(args))
      File "./install.py", line 359, in install_legate_numpy
        install_openblas(openblas_dir, thread_count, verbose)
      File "./install.py", line 143, in install_openblas
        execute_command(
      File "./install.py", line 62, in execute_command
        subprocess.check_call(args, cwd=cwd, shell=shell)
      File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command '['make', '-j', '8', 'USE_THREAD=1', 'NO_STATIC=1', 'USE_OPENMP=1', 'NUM_PARALLEL=32', 'LIBNAMESUFFIX=legate']' returned non-zero exit status 2.
    
    opened by jeffhammond 14
  • Fix reciprocal tests for zero values and improve test value customization

    Reciprocal is not valid for integers, which leads to test failures on certain platforms.

    https://numpy.org/doc/stable/reference/generated/numpy.reciprocal.html

    • Splits math ufunc tests into default operations and operations needing special values
    • Enables easier customization of input values to the different tests
    • Simplifies syntax for calling tests since all tests only take a single argument
    • Uses a hash-based seed for initializing random values so that tests use the same values whether tests are run individually or all-at-once
    opened by jjwilke 13
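The hash-based seeding mentioned in the last bullet can be sketched as follows. How the PR itself derives its seeds is not shown here, so this is an assumed approach: the helper names seed_for and random_inputs are hypothetical, and hashlib is used because Python's built-in hash() of a string varies between interpreter runs by default.

```python
import hashlib

import numpy as np


def seed_for(test_name):
    """Derive a stable 32-bit seed from a test name."""
    digest = hashlib.sha256(test_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little")


def random_inputs(test_name, size):
    """Return input values that depend only on the test name and size."""
    rng = np.random.default_rng(seed_for(test_name))
    return rng.random(size)


if __name__ == "__main__":
    first = random_inputs("test_reciprocal", 5)
    second = random_inputs("test_reciprocal", 5)
    print(bool((first == second).all()))  # same name gives the same values
```

Because the seed depends only on the test's name, the generated values do not depend on which other tests ran earlier in the same process.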
  • Refactor test driver for cpu/gpu sharding

    This PR adds support for cpu/gpu sharding so that it is no longer necessary to unset REALM_SYNTHETIC_CORE_MAP

    Other notes:

    • Test stages were refactored and simplified. Most control was moved to base protocol class, with stage implementations responsible only for computing the sharding spec, etc.
    • --debug now also includes explicitly modified env vars in printed output
    • --fbmem command line option was added for GPU stage
    • Some missing docs were added
    • Some minimal tests of sharding computations were added, but perhaps more is advised

    Tested locally with MPI GASNet conduit, with variations of the following invocations:

    GASNET_QUIET=1 ./test.py --use=openmp --omps=2 --ompthreads=1 --launcher=mpirun --debug
    GASNET_QUIET=1 ./test.py --use=cpus --cpus 2 --launcher=mpirun --debug 
    GASNET_QUIET=1 ./test.py --use=cuda --gpus 2 --launcher=mpirun --debug
    GASNET_QUIET=1 ./test.py --use=eager --launcher=mpirun --debug
    
    opened by bryevdv 13
  • Realm not completing gather copy on the GPU

    Problem

    Advanced indexing of a relatively huge (e.g., length 10K) 1D array returns UnboundLocalError: local variable 'shardfn' referenced before assignment, rather than NotImplementedError.

    I understand that advanced indexing is mostly not yet implemented. Most related routines raise NotImplementedError to let users know about this situation. However, this particular use case raises this different error, which seems to be a bug to me.

    To reproduce

    1. step 1: prepare test.py:
      from legate import numpy
      a = numpy.arange(10000)
      print(a[(1, 2, 3), ]) 
      
    2. step 2: run with, for example
      $ legate --cpus 1 test.py
      

    Output

    Traceback (most recent call last):
      File "<blahblah>/lib/python3.8/site-packages/legion_top.py", line 394, in legion_python_main
        run_path(args[start], run_name='__main__')
      File "<blahblah>/lib/python3.8/site-packages/legion_top.py", line 193, in run_path
        exec(code, module.__dict__, module.__dict__)
      File "./test.py", line 3, in <module>
        print(a[(1, 2, 3), ])
      File "<blahblah>/lib/python3.8/site-packages/legate/numpy/array.py", line 381, in __getitem__
        shape=None, thunk=self._thunk.get_item(key, stacklevel=2)
      File "<blahblah>/lib/python3.8/site-packages/legate/numpy/deferred.py", line 414, in get_item
        copy = Copy(mapper=self.runtime.mapper_id, tag=shardfn)
    UnboundLocalError: local variable 'shardfn' referenced before assignment
    

    Expected results

    Either [1, 2, 3] or NotImplementedError.

    Notes

    • Interestingly, smaller arrays do not have this issue. For example, if a = numpy.arange(100), the code works fine.
    • Another way to make it work is to use GPUs instead of CPUs. For example, legate --gpus 1 test.py works fine. This is interesting, as the GPU implementation seems to be more stable than the CPU implementation?
    bug in progress 
    opened by piyueh 13
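For reference, the expected result in the report above can be confirmed with canonical NumPy. This sketch uses numpy rather than legate.numpy, so it only shows the behaviour a drop-in replacement would be compared against; the variable names are illustrative.

```python
import numpy as np

index = ((1, 2, 3),)  # the same key as in a[(1, 2, 3), ]

large = np.arange(10000)
small = np.arange(100)

selected_large = large[index]
selected_small = small[index]

print(selected_large)  # canonical NumPy prints [1 2 3]
print(selected_small)  # the array length does not change the result
```

A comparison like this could serve as a regression check once the indexing path is fixed.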
  • Use pytest for test running

    This PR converts the existing test modules to use pytest for test discovery and running. Tests were also updated for minor cleanup and to utilize pytest features (e.g. parametrize and fixtures) effectively. The end result is to afford testing options that are more familiar and ergonomic to "standard" python devs.

    Overview

    Cleanup

    A few commits perform some minor cleanup on the existing tests:

    • https://github.com/nv-legate/cunumeric/pull/297/commits/572e7f106cffb80df2d0dcd0bb17ff423d2e2b11 — remove old and outdated "universal functions" tests
    • https://github.com/nv-legate/cunumeric/pull/297/commits/5e0ff856667234f0bff4d0ff48ab08dcf192ed7c — remove a useless test
    • https://github.com/nv-legate/cunumeric/pull/297/commits/5d8a6172caa97501e795688c22d969efbe404056 — remove redundant bare return statements throughout

    File movement

    • To make it easier to use pytest for test selection, the existing tests were moved and renamed:

    • https://github.com/nv-legate/cunumeric/pull/297/commits/3f3c5cb8d3ed05a0981990e9920d851d58dd19a1 — move files to integration subdirectory

    • https://github.com/nv-legate/cunumeric/pull/297/commits/9f4f4108449f56f360e90c2f56d30569796b7c0c — prefix all test module filenames with test_

    Moving the files to a subdirectory allows this entire group of tests to be easily run by executing pytest with this directory path. The test_ prefix is to conform with pytest expectations for default test discovery.

    Pytest updates

    • https://github.com/nv-legate/cunumeric/pull/297/commits/571ea075d549da3875550aa0d28f4154af83339d — rename some helper functions to avoid the test_ prefix. By default, pytest will interpret anything starting with test or Test as a test to run.
    • https://github.com/nv-legate/cunumeric/pull/297/commits/76779365e25cff1c9274b17f454145267aee51b8 — a minimal update that adds pytest scaffolding to each test module and the least change to get running. For most tests this was just a couple of lines of boilerplate change. But a few tests required slightly more extensive updates.
    • https://github.com/nv-legate/cunumeric/pull/297/commits/bbda5a96336d320ade26e34cd6359acc6da6d0b7 — more in-depth changes to split out tests into sensible smaller jobs, and to utilize parameterization and fixtures.

    Notes

    • The existing test.py module runs exactly as before. The only change was to update it for the new path to the test files ~~(see note about verbose output, though)~~
    • If desired to make this PR smaller, the final commit could be removed and submitted separately.
    • There is a legate.core issue that currently requires fixtures to avoid cunumeric array re-use in tests
      • https://github.com/nv-legate/legate.core/issues/205
    • In the final commit I did try to split up tests and/or use parametrize anywhere it made sense, so that more fine-grained reporting output becomes available:
      tests/integration/test_reduction_complex.py::test_sum PASSED   [ 96%]
      tests/integration/test_reduction_complex.py::test_prod PASSED  [ 96%]
      tests/integration/test_repeat.py::test_basic PASSED            [ 96%]
      tests/integration/test_repeat.py::test_axis PASSED             [ 96%]
      tests/integration/test_repeat.py::test_nd[1] PASSED            [ 96%]
      tests/integration/test_repeat.py::test_nd[2] PASSED            [ 96%]
      tests/integration/test_repeat.py::test_nd[3] PASSED            [ 96%]
      tests/integration/test_repeat.py::test_nd[4] PASSED            [ 97%]
      

    Further notes will be inline.

    Operation

    Running test.py

    The existing way to run tests with test.py is unchanged:

    ./test.py --use=cuda 
    

    This still generates the existing report:

    image

    ~~(Note: Currently test.py -v only shows stdout from test.py and not from tests. This will be straightforward to restore but I would like to do that in a follow-on PR dedicated to test.py. Test stdout can be observed with either of the methods below with the standard -s flag. cc @magnatelee)~~

    Executing individual tests

    Individual tests can still be executed by running the module with legate, including by passing in runtime command line options:

    legate tests/integration/test_nonzero.py --gpus 2 -cunumeric:test
    

    However, now the output is a standard pytest report:

    image

    It is also now possible to pass standard pytests options, e.g. -v for a verbose report:

    image

    Using pytest for test discovery

    Although it is not yet possible to simply run pytest <dir> in the typical way, it is possible to achieve the same operation with a slightly more explicit invocation:

    legate -c "import pytest; pytest.main(['tests/integration'])" --gpus 2 -cunumeric:test 
    

    Note that all the standard pytest options, including -k and -m for test filtering, can be used.

    The above will run all the tests under tests/integration, and generate a standard combined pytest report:

    image

    The full report output text can be seen here:

    legate37 ❯ legate -c "import pytest; pytest.main(['tests/integration'])" --gpus 2 -cunumeric:test 
    
    WARNING: Disabling control replication for interactive run
    ========================================================================================= test session starts ==========================================================================================
    platform linux -- Python 3.7.12, pytest-7.1.1, pluggy-1.0.0
    rootdir: /home/bryan/work/cunumeric
    collected 2860 items                                                                                                                                                                                   
    
    tests/integration/test_2d_reduction.py ...                                                                                                                                                       [  0%]
    tests/integration/test_3d_reduction.py .                                                                                                                                                         [  0%]
    tests/integration/test_advanced_indexing.py .                                                                                                                                                    [  0%]
    tests/integration/test_append.py ........                                                                                                                                                        [  0%]
    tests/integration/test_argmin.py ..                                                                                                                                                              [  0%]
    tests/integration/test_array_creation.py ..............                                                                                                                                          [  1%]
    tests/integration/test_array_split.py .......                                                                                                                                                    [  1%]
    tests/integration/test_binary_op.py .....                                                                                                                                                        [  1%]
    tests/integration/test_binary_op_2d.py ....                                                                                                                                                      [  1%]
    tests/integration/test_binary_op_broadcast.py ....                                                                                                                                               [  1%]
    tests/integration/test_binary_op_complex.py .....                                                                                                                                                [  1%]
    tests/integration/test_binary_op_typing.py ..................................................................................................................................................... [  7%]
    ................................................................................................................................................................................................ [ 13%]
    ................................................................................................................................................................................................ [ 20%]
    ................................................................................................................................................................................................ [ 27%]
    ................................................................................................................................................................................................ [ 33%]
    ................................................................................................................................................................................................ [ 40%]
    ................................................................................................................................................................................................ [ 47%]
    ................................................................................................................................................................................................ [ 54%]
    ................................................................................................................................................................................................ [ 60%]
    ................................................................................................................................................................................................ [ 67%]
    ................................................................................................................................................................................................ [ 74%]
    .................................................................................................................................................................................                [ 80%]
    tests/integration/test_binary_ufunc.py .                                                                                                                                                         [ 80%]
    tests/integration/test_bincount.py ......                                                                                                                                                        [ 80%]
    tests/integration/test_block.py ............                                                                                                                                                     [ 81%]
    tests/integration/test_cholesky.py .........                                                                                                                                                     [ 81%]
    tests/integration/test_compare.py ......                                                                                                                                                         [ 81%]
    tests/integration/test_complex_ops.py ....                                                                                                                                                       [ 81%]
    tests/integration/test_concatenate_stack.py ................................................                                                                                                     [ 83%]
    tests/integration/test_contains.py ..                                                                                                                                                            [ 83%]
    tests/integration/test_convolve.py ......                                                                                                                                                        [ 83%]
    tests/integration/test_copy.py .                                                                                                                                                                 [ 83%]
    tests/integration/test_dot.py .........................                                                                                                                                          [ 84%]
    tests/integration/test_einsum.py .......................................................................................................................................................         [ 89%]
    tests/integration/test_eye.py ...............                                                                                                                                                    [ 90%]
    tests/integration/test_fill.py .                                                                                                                                                                 [ 90%]
    tests/integration/test_flatten.py .........                                                                                                                                                      [ 90%]
    tests/integration/test_flip.py ..........                                                                                                                                                        [ 91%]
    tests/integration/test_get_item.py .                                                                                                                                                             [ 91%]
    tests/integration/test_index_routines.py ............                                                                                                                                            [ 91%]
    tests/integration/test_ingest.py ....                                                                                                                                                            [ 91%]
    tests/integration/test_inlinemap-keeps-region-alive.py .                                                                                                                                         [ 91%]
    tests/integration/test_inner.py .........................                                                                                                                                        [ 92%]
    tests/integration/test_interop.py .                                                                                                                                                              [ 92%]
    tests/integration/test_intra_array_copy.py ....                                                                                                                                                  [ 92%]
    tests/integration/test_jacobi.py .                                                                                                                                                               [ 92%]
    tests/integration/test_length.py ....                                                                                                                                                            [ 92%]
    tests/integration/test_linspace.py ....                                                                                                                                                          [ 93%]
    tests/integration/test_logical.py .....s                                                                                                                                                         [ 93%]
    tests/integration/test_lstm_backward_test.py .                                                                                                                                                   [ 93%]
    tests/integration/test_lstm_simple_forward.py .                                                                                                                                                  [ 93%]
    tests/integration/test_map_reduce.py .                                                                                                                                                           [ 93%]
    tests/integration/test_mask.py ....                                                                                                                                                              [ 93%]
    tests/integration/test_matmul.py ................                                                                                                                                                [ 94%]
    tests/integration/test_nonzero.py ........                                                                                                                                                       [ 94%]
    tests/integration/test_norm.py .                                                                                                                                                                 [ 94%]
    tests/integration/test_numpy_interop.py .                                                                                                                                                        [ 94%]
    tests/integration/test_outer.py .................                                                                                                                                                [ 95%]
    tests/integration/test_overwrite_slice.py .                                                                                                                                                      [ 95%]
    tests/integration/test_randint.py ..                                                                                                                                                             [ 95%]
    tests/integration/test_reduction.py .......                                                                                                                                                      [ 95%]
    tests/integration/test_reduction_axis.py ..................                                                                                                                                      [ 96%]
    tests/integration/test_reduction_complex.py ..                                                                                                                                                   [ 96%]
    tests/integration/test_repeat.py ......                                                                                                                                                          [ 96%]
    tests/integration/test_reshape.py .................                                                                                                                                              [ 96%]
    tests/integration/test_set_item.py .                                                                                                                                                             [ 96%]
    tests/integration/test_shape.py .s                                                                                                                                                               [ 97%]
    tests/integration/test_singleton_access.py .                                                                                                                                                     [ 97%]
    tests/integration/test_slicing.py ..ss                                                                                                                                                           [ 97%]
    tests/integration/test_sort.py .                                                                                                                                                                 [ 97%]
    tests/integration/test_squeeze.py .                                                                                                                                                              [ 97%]
    tests/integration/test_swapaxes.py ....                                                                                                                                                          [ 97%]
    tests/integration/test_tensordot.py .........................                                                                                                                                    [ 98%]
    tests/integration/test_tile.py ..                                                                                                                                                                [ 98%]
    tests/integration/test_transpose.py ..                                                                                                                                                           [ 98%]
    tests/integration/test_trilu.py ..........                                                                                                                                                       [ 98%]
    tests/integration/test_unary_functions_2d.py ........                                                                                                                                            [ 99%]
    tests/integration/test_unary_functions_2d_complex.py .....                                                                                                                                       [ 99%]
    tests/integration/test_unary_ufunc.py .                                                                                                                                                          [ 99%]
    tests/integration/test_unique.py .....                                                                                                                                                           [ 99%]
    tests/integration/test_update.py .                                                                                                                                                               [ 99%]
    tests/integration/test_vdot.py ....                                                                                                                                                              [ 99%]
    tests/integration/test_view.py .                                                                                                                                                                 [ 99%]
    tests/integration/test_vstack.py ...                                                                                                                                                             [ 99%]
    tests/integration/test_where.py .....s                                                                                                                                                           [ 99%]
    tests/integration/test_window.py .                                                                                                                                                               [100%]
    
    =========================================================================================== warnings summary ===========================================================================================
    tests/integration/test_advanced_indexing.py: 124 warnings
      /home/bryan/work/cunumeric/cunumeric/array.py:773: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        if key.dtype != np.bool and not np.issubdtype(
    
    tests/integration/test_advanced_indexing.py: 124 warnings
      /home/bryan/work/cunumeric/cunumeric/array.py:777: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        if key.dtype != np.bool and key.dtype != np.int64:
    
    tests/integration/test_advanced_indexing.py: 30 warnings
      /home/bryan/work/cunumeric/cunumeric/deferred.py:425: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        and key.dtype == np.bool
    
    tests/integration/test_advanced_indexing.py: 120 warnings
      /home/bryan/work/cunumeric/cunumeric/deferred.py:479: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        and k.dtype == np.bool
    
    tests/integration/test_advanced_indexing.py: 120 warnings
      /home/bryan/work/cunumeric/cunumeric/deferred.py:516: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
      Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
        if k.dtype == np.bool:
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:118: RuntimeWarning: converting index array to int64 type
        assert np.array_equal(x[index], x_num[index_num])
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:122: RuntimeWarning: converting index array to int64 type
        x_num[index_num] = 3.5
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:129: UserWarning: cuNumeric performing implicit type conversion from float64 to int64
        x_num[index_num] = b_num
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:129: RuntimeWarning: converting index array to int64 type
        x_num[index_num] = b_num
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:559: UserWarning: cuNumeric performing implicit type conversion from float64 to int64
        z_num[indx_num] = b_num
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:609: RuntimeWarning: converting index array to int64 type
        res_num = x_num[ind_num, ind_num]
    
    tests/integration/test_advanced_indexing.py::test
      /home/bryan/work/cunumeric/tests/integration/test_advanced_indexing.py:627: UserWarning: cuNumeric performing implicit type conversion from int16 to float64
        x_num[ind_num, ind_num] = b_num
    
    tests/integration/test_cholesky.py::test_complex[8]
    tests/integration/test_cholesky.py::test_complex[9]
    tests/integration/test_cholesky.py::test_complex[255]
    tests/integration/test_cholesky.py::test_complex[512]
      /home/bryan/work/cunumeric/tests/integration/test_cholesky.py:48: UserWarning: cuNumeric performing implicit type conversion from complex128 to float64
        d[1] = b
    
    tests/integration/test_dot.py::test_dot[1-1]
    tests/integration/test_dot.py::test_dot[1-2]
    tests/integration/test_dot.py::test_dot[2-1]
    tests/integration/test_dot.py::test_dot[2-2]
      /home/bryan/work/cunumeric/tests/integration/test_dot.py:34: UserWarning: cuNumeric performing implicit type conversion from float16 to float32
        return lib.dot(*args, **kwargs)
    
    tests/integration/test_einsum.py::test_cast[->]
    tests/integration/test_einsum.py::test_cast[a->]
    tests/integration/test_einsum.py::test_cast[a,->]
    tests/integration/test_einsum.py::test_cast[a,a->]
      /home/bryan/work/cunumeric/cunumeric/array.py:123: ComplexWarning: Casting complex values to real discards the imaginary part
        *args, **kwargs
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float16 to float32
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float16 to complex64
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float32 to float16
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from float32 to complex64
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from complex64 to float16
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a->a]
    tests/integration/test_einsum.py::test_cast[a,->a]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:235: UserWarning: cuNumeric performing implicit type conversion from complex64 to float32
        cn.einsum(expr, *cn_inputs, out=cn_out)
    
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:228: UserWarning: cuNumeric performing implicit type conversion from float16 to float32
        cn_res = cn.einsum(expr, *cn_inputs)
    
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:228: UserWarning: cuNumeric performing implicit type conversion from float16 to complex64
        cn_res = cn.einsum(expr, *cn_inputs)
    
    tests/integration/test_einsum.py::test_cast[a,a->]
    tests/integration/test_einsum.py::test_cast[a,a->a]
    tests/integration/test_einsum.py::test_cast[a,b->ab]
    tests/integration/test_einsum.py::test_cast[ab,ca->a]
    tests/integration/test_einsum.py::test_cast[ab,ca->b]
      /home/bryan/work/cunumeric/tests/integration/test_einsum.py:228: UserWarning: cuNumeric performing implicit type conversion from float32 to complex64
        cn_res = cn.einsum(expr, *cn_inputs)
    
    tests/integration/test_flatten.py::test_basic[(1, 1)]
    tests/integration/test_flatten.py::test_basic[(1, 1, 1)]
    tests/integration/test_flatten.py::test_basic[(1, 10)]
    tests/integration/test_flatten.py::test_basic[(1, 10, 1)]
    tests/integration/test_flatten.py::test_basic[(10, 10)]
    tests/integration/test_flatten.py::test_basic[(10, 10, 10)]
      /home/bryan/work/cunumeric/tests/integration/test_flatten.py:26: RuntimeWarning: cuNumeric has not implemented reshape using Fortran-like index order and is falling back to canonical numpy. You may notice significantly decreased performance for this function call.
        c = num_arr.flatten(order)
    
    tests/integration/test_index_routines.py::test_choose_1d
      /home/bryan/work/cunumeric/tests/integration/test_index_routines.py:42: UserWarning: cuNumeric performing implicit type conversion from int64 to float64
        assert np.array_equal(
    
    tests/integration/test_sort.py::test
      /home/bryan/work/cunumeric/tests/integration/test_sort.py:23: UserWarning: cuNumeric performing implicit type conversion from complex64 to complex128
        if not num.allclose(a_np, a_num):
    
    tests/integration/test_vdot.py::test[complex64-complex128]
    tests/integration/test_vdot.py::test[complex128-complex64]
      /home/bryan/work/cunumeric/tests/integration/test_vdot.py:28: UserWarning: cuNumeric performing implicit type conversion from complex64 to complex128
        mk_0to1_array(lib, (5,), dtype=b_dtype),
    
    -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
    ============================================================================ 2855 passed, 5 skipped, 601 warnings in 41.34s ============================================================================
    

    There are some warnings that show up at the end of the output, mostly about implicit casting. I am not sure whether these are expected, though I suspect they are.
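One way to settle whether a given warning is expected would be to assert it explicitly in the relevant test, so that it stops being noise in the summary. Below is a minimal sketch using only the standard library `warnings` module. The helpers `implicit_cast` and `collect_warnings` are hypothetical stand-ins made up for illustration; they are not part of cuNumeric, and the message text only imitates the ones in the log above.

```python
import warnings


def implicit_cast(value, target_type):
    """Hypothetical stand-in for an operation that warns on implicit casts."""
    if not isinstance(value, target_type):
        warnings.warn(
            "performing implicit type conversion from "
            f"{type(value).__name__} to {target_type.__name__}",
            UserWarning,
            stacklevel=2,
        )
    return target_type(value)


def collect_warnings(func, *args):
    """Run func(*args) and return (result, list of recorded warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including repeats that the default
        # filters would otherwise show only once per location.
        warnings.simplefilter("always")
        result = func(*args)
    return result, caught


result, caught = collect_warnings(implicit_cast, 3.5, int)
categories = [w.category for w in caught]
messages = [str(w.message) for w in caught]
```

A test written this way could then assert on `categories` and `messages`, which would make the expected casts explicit and leave only genuinely unexpected warnings in the pytest summary. In a pytest suite the same idea is usually expressed with the `pytest.warns` context manager.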

    Future work

    Things that are not in this PR but are planned for follow-on PRs:

    • Using pytest to run the examples
    • ~~Restoring previous -v behaviour to test.py~~
    • Increased documentation for running tests
    • Allowing the tests to be run with the simpler legate -m pytest
    opened by bryevdv 12
  • Initial unit tests

    Initial unit tests

    This PR adds some "basic" unit tests to a subset of cunumeric modules:

    • cunumeric.coverage
    • cunumeric.patch
    • cunumeric.utils

    Currently, these tests may be run manually by executing the command

    legate -c "import pytest; pytest.main(['tests/unit', '-v'])"
    

    which will result in output similar to:

    WARNING: Disabling control replication for interactive run
    ======================================================== test session starts ========================================================
    platform linux -- Python 3.7.12, pytest-7.1.1, pluggy-1.0.0 -- /home/bryan/anaconda3/envs/legate37/bin/python3
    cachedir: .pytest_cache
    rootdir: /home/bryan/work/cunumeric
    collected 66 items                                                                                                                  
    
    tests/unit/cunumeric/test_coverage.py::test_FALLBACK_WARNING PASSED                                                           [  1%]
    tests/unit/cunumeric/test_coverage.py::test_MOD_INTERNAL PASSED                                                               [  3%]
    tests/unit/cunumeric/test_coverage.py::test_NDARRAY_INTERNAL PASSED                                                           [  4%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_empty PASSED                                               [  6%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_no_filters PASSED                                          [  7%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_name_filters PASSED                                        [  9%]
    tests/unit/cunumeric/test_coverage.py::Test_filter_namespace::test_type_filters PASSED                                        [ 10%]
    tests/unit/cunumeric/test_coverage.py::test_implemented PASSED                                                                [ 12%]
    tests/unit/cunumeric/test_coverage.py::Test_unimplemented::test_reporting_True PASSED                                         [ 13%]
    tests/unit/cunumeric/test_coverage.py::Test_unimplemented::test_reporting_False PASSED                                        [ 15%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_module::test_report_coverage_True PASSED                                    [ 16%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_module::test_report_coverage_False PASSED                                   [ 18%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_class::test_report_coverage_True PASSED                                     [ 19%]
    tests/unit/cunumeric/test_coverage.py::Test_clone_class::test_report_coverage_False PASSED                                    [ 21%]
    tests/unit/cunumeric/test_patch.py::test_no_patch PASSED                                                                      [ 22%]
    tests/unit/cunumeric/test_patch.py::test_patch PASSED                                                                         [ 24%]
    tests/unit/cunumeric/test_utils.py::test_find_last_user_stacklevel PASSED                                                     [ 25%]
    tests/unit/cunumeric/test_utils.py::test_get_line_number_from_frame PASSED                                                    [ 27%]
    tests/unit/cunumeric/test_utils.py::Test_find_last_user_frames::test_default_top_only PASSED                                  [ 28%]
    tests/unit/cunumeric/test_utils.py::Test_find_last_user_frames::test_top_only_True PASSED                                     [ 30%]
    tests/unit/cunumeric/test_utils.py::Test_find_last_user_frames::test_top_only_False PASSED                                    [ 31%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[foo] PASSED                                        [ 33%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[10] PASSED                                         [ 34%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[10.2] PASSED                                       [ 36%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value3] PASSED                                     [ 37%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value4] PASSED                                     [ 39%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value5] PASSED                                     [ 40%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[value6] PASSED                                     [ 42%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_type_bad[None] PASSED                                       [ 43%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float16] PASSED                                   [ 45%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float32] PASSED                                   [ 46%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float64] PASSED                                   [ 48%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[float] PASSED                                     [ 50%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int16] PASSED                                     [ 51%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int32] PASSED                                     [ 53%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int64] PASSED                                     [ 54%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[int] PASSED                                       [ 56%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[uint16] PASSED                                    [ 57%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[uint32] PASSED                                    [ 59%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[uint64] PASSED                                    [ 60%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[bool_] PASSED                                     [ 62%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_supported[bool] PASSED                                      [ 63%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_unsupported[float128] PASSED                                [ 65%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_unsupported[complex64] PASSED                               [ 66%]
    tests/unit/cunumeric/test_utils.py::Test_is_supported_dtype::test_unsupported[datetime64] PASSED                              [ 68%]
    tests/unit/cunumeric/test_utils.py::test_calculate_volume[shape0-0] PASSED                                                    [ 69%]
    tests/unit/cunumeric/test_utils.py::test_calculate_volume[shape1-10] PASSED                                                   [ 71%]
    tests/unit/cunumeric/test_utils.py::test_calculate_volume[shape2-6] PASSED                                                    [ 72%]
    tests/unit/cunumeric/test_utils.py::test_get_arg_dtype PASSED                                                                 [ 74%]
    tests/unit/cunumeric/test_utils.py::test_get_arg_value_dtype PASSED                                                           [ 75%]
    tests/unit/cunumeric/test_utils.py::test_dot_modes PASSED                                                                     [ 77%]
    tests/unit/cunumeric/test_utils.py::test_inner_modes PASSED                                                                   [ 78%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes_bad[0-0] PASSED                                                         [ 80%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes_bad[0-1] PASSED                                                         [ 81%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes_bad[1-0] PASSED                                                         [ 83%]
    tests/unit/cunumeric/test_utils.py::test_matmul_modes PASSED                                                                  [ 84%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_single_axis[1-3-2] PASSED                                  [ 86%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_single_axis[3-1-2] PASSED                                  [ 87%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_single_axis[1-1-2] PASSED                                  [ 89%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_axes_length PASSED                                         [ 90%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_negative_axes PASSED                                       [ 92%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_mismatched_axes PASSED                                     [ 93%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_bad_axes_oob PASSED                                            [ 95%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_single_axis PASSED                                             [ 96%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_tuple_axis PASSED                                              [ 98%]
    tests/unit/cunumeric/test_utils.py::Test_tensordot_modes::test_explicit_axis PASSED                                           [100%]
    
    ======================================================== 66 passed in 2.71s =========================================================
    
    
    

    Notes

    • The mock package was added to the conda environment file. In an existing env it must be installed manually before the tests can run.
    • I haven't thought about contracting differential forms in ~20 years, so the product mode tests record a snapshot of current behavior as-is, while trying to cover error paths explicitly. I did not separately try to verify correctness.
    • Slots were removed from Runtime. They were interfering with the ability to mock Runtime methods and attributes, but also are not really appropriate in this situation. The intended purpose of __slots__ is a space-optimization in the case of many small objects.
    • I added a flag to the wrappers returned by implemented / unimplemented just to greatly streamline testing.
    • The last deprecated uses of np.bool, etc. were removed.
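
The interaction between __slots__ and mocking mentioned in the notes can be reproduced in isolation. The sketch below uses two hypothetical stand-in classes (SlottedRuntime and PlainRuntime are illustrative names, not the actual cunumeric Runtime). It shows that unittest.mock cannot patch a method on an instance whose class declares __slots__, because such an instance has no __dict__ to hold the replacement attribute:

```python
from unittest import mock


class SlottedRuntime:
    """Hypothetical stand-in for a runtime class that declares __slots__."""

    __slots__ = ("num_procs",)

    def __init__(self):
        self.num_procs = 1

    def launch(self):
        return "real launch"


class PlainRuntime:
    """The same class without __slots__; instances get a normal __dict__."""

    def __init__(self):
        self.num_procs = 1

    def launch(self):
        return "real launch"


def can_patch_method(runtime):
    """Return True if `launch` can be mocked on this particular instance."""
    try:
        with mock.patch.object(runtime, "launch", return_value="mocked"):
            return runtime.launch() == "mocked"
    except AttributeError:
        # Setting an attribute that is not listed in __slots__ is rejected.
        return False


print(can_patch_method(PlainRuntime()))
print(can_patch_method(SlottedRuntime()))
```

Patching at the class level would still work for a slotted class; the limitation applies to per-instance patching, which is often the most convenient form in unit tests.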
    opened by bryevdv 10
  • Hang during destroying of the interpreter

    This bug happens in the conda environment (confirmed on at least 2 different machines):

    [0 - 7ffec1256700]    7.756864 {2}{python}: destroying interpreter
    ^C
    Thread 1 "legion_python" received signal SIGINT, Interrupt.
    syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    38	../sysdeps/unix/sysv/linux/x86_64/syscall.S: No such file or directory.
    (gdb) info threads
      Id   Target Id         Frame
    * 1    Thread 0x7ffff11c6000 (LWP 24267) "legion_python" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      2    Thread 0x7fffeb062700 (LWP 24293) "legion_python" 0x00007ffff36450c7 in accept4 (fd=29, addr=..., addr_len=0x7fffeb05ebb4, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:32
      3    Thread 0x7fffea761700 (LWP 24295) "legion_python" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      4    Thread 0x7fffea65b700 (LWP 24296) "legion_python" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
      9    Thread 0x7ffec1fff700 (LWP 24301) "jemalloc_bg_thd" 0x00007ffff13daad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffec260a5f4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
      10   Thread 0x7ffec1256700 (LWP 24302) "legion_python" 0x00007ffff13dd7c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7ffecc070b10) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    (gdb) bt
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    #1  0x00007ffff3c37317 in Realm::Doorbell::wait_slow (this=0x7ffff11c5f30) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:241
    #2  0x00007ffff3af145f in Realm::Doorbell::wait (this=0x7ffff11c5f30) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/utils.h:81
    #3  0x00007ffff3c37f8d in Realm::UnfairCondVar::wait (this=0x555555662130) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:874
    #4  0x00007ffff3c63a6e in Realm::KernelThreadTaskScheduler::shutdown (this=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:1291
    #5  0x00007ffff3fc6ca9 in Realm::LocalPythonProcessor::shutdown (this=0x555555661bf0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:736
    #6  0x00007ffff3abe104 in Realm::RuntimeImpl::wait_for_shutdown (this=0x555555607860) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:2310
    #7  0x00007ffff3ab63fe in Realm::Runtime::wait_for_shutdown (this=0x7fffffff92b8) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:636
    #8  0x00007ffff6cbe2a3 in Legion::Internal::Runtime::start (argc=3, argv=0x7fffffff9478, background=false, supply_default_mapper=true) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/cmdline.inl:29262
    #9  0x00007ffff666e257 in Legion::Runtime::start (argc=3, argv=0x7fffffff9478, background=false, default_mapper=true) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/runtime/realm/legion_context.h:7371
    #10 0x00005555555961bb in main (argc=3, argv=0x7fffffff9478) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/runtime/realm/legion_mapping.inl:217
    (gdb) thread 2
    [Switching to thread 2 (Thread 0x7fffeb062700 (LWP 24293))]
    #0  0x00007ffff36450c7 in accept4 (fd=29, addr=..., addr_len=0x7fffeb05ebb4, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:32
    32	../sysdeps/unix/sysv/linux/accept4.c: No such file or directory.
    (gdb) bt
    #0  0x00007ffff36450c7 in accept4 (fd=29, addr=..., addr_len=0x7fffeb05ebb4, flags=524288) at ../sysdeps/unix/sysv/linux/accept4.c:32
    #1  0x00007ffff1d76cd3 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
    #2  0x00007ffff1e18bd6 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
    #3  0x00007ffff13d46db in start_thread (arg=0x7fffeb062700) at pthread_create.c:463
    #4  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 3
    [Switching to thread 3 (Thread 0x7fffea761700 (LWP 24295))]
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    38	../sysdeps/unix/sysv/linux/x86_64/syscall.S: No such file or directory.
    (gdb) bt
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    #1  0x00007ffff3c37317 in Realm::Doorbell::wait_slow (this=0x7fffea761630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:241
    #2  0x00007ffff3af145f in Realm::Doorbell::wait (this=0x7fffea761630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/utils.h:81
    #3  0x00007ffff3aeef74 in Realm::BackgroundWorkThread::main_loop (this=0x55555566dc00) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:166
    #4  0x00007ffff3af2a26 in Realm::Thread::thread_entry_wrapper<Realm::BackgroundWorkThread, &Realm::BackgroundWorkThread::main_loop> (obj=0x55555566dc00) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/mutex.inl:97
    #5  0x00007ffff3c3d9a8 in Realm::KernelThread::pthread_entry (data=0x55555566df90) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/bindings/python/stl_map.h:774
    #6  0x00007ffff13d46db in start_thread (arg=0x7fffea761700) at pthread_create.c:463
    #7  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 4
    [Switching to thread 4 (Thread 0x7fffea65b700 (LWP 24296))]
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    38	in ../sysdeps/unix/sysv/linux/x86_64/syscall.S
    (gdb) bt
    #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
    #1  0x00007ffff3c37317 in Realm::Doorbell::wait_slow (this=0x7fffea65b630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:241
    #2  0x00007ffff3af145f in Realm::Doorbell::wait (this=0x7fffea65b630) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/utils.h:81
    #3  0x00007ffff3aeef74 in Realm::BackgroundWorkThread::main_loop (this=0x5555556762f0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/timers.inl:166
    #4  0x00007ffff3af2a26 in Realm::Thread::thread_entry_wrapper<Realm::BackgroundWorkThread, &Realm::BackgroundWorkThread::main_loop> (obj=0x5555556762f0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/mutex.inl:97
    #5  0x00007ffff3c3d9a8 in Realm::KernelThread::pthread_entry (data=0x55555566e6b0) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/bindings/python/stl_map.h:774
    #6  0x00007ffff13d46db in start_thread (arg=0x7fffea65b700) at pthread_create.c:463
    #7  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 9
    [Switching to thread 9 (Thread 0x7ffec1fff700 (LWP 24301))]
    #0  0x00007ffff13daad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffec260a5f4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
    88	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
    (gdb) bt
    #0  0x00007ffff13daad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7ffec260a5f4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
    #1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7ffec260a638, cond=0x7ffec260a5c8) at pthread_cond_wait.c:502
    #2  __pthread_cond_wait (cond=0x7ffec260a5c8, mutex=0x7ffec260a638) at pthread_cond_wait.c:655
    #3  0x00007ffec3a9ef6b in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=<optimized out>) at src/background_thread.c:232
    #4  background_work_sleep_once (ind=0, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
    #5  background_thread0_work (tsd=<optimized out>) at src/background_thread.c:452
    #6  background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:490
    #7  background_thread_entry () at src/background_thread.c:522
    #8  0x00007ffff13d46db in start_thread (arg=0x7ffec1fff700) at pthread_create.c:463
    #9  0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    (gdb) thread 10
    [Switching to thread 10 (Thread 0x7ffec1256700 (LWP 24302))]
    #0  0x00007ffff13dd7c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7ffecc070b10) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    205	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
    (gdb) bt
    #0  0x00007ffff13dd7c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7ffecc070b10) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    #1  do_futex_wait (sem=sem@entry=0x7ffecc070b10, abstime=0x0) at sem_waitcommon.c:111
    #2  0x00007ffff13dd8b8 in __new_sem_wait_slow (sem=sem@entry=0x7ffecc070b10, abstime=0x0) at sem_waitcommon.c:181
    #3  0x00007ffff13dd929 in __new_sem_wait (sem=sem@entry=0x7ffecc070b10) at sem_wait.c:42
    #4  0x00007fffe374d6a2 in PyThread_acquire_lock_timed.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/thread_pthread.h:486
    #5  0x00007fffe3888e71 in acquire_timed () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Modules/_threadmodule.c:102
    #6  0x00007fffe3888f7b in lock_PyThread_acquire_lock () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Modules/_threadmodule.c:183
    #7  0x00007fffe37c4e57 in method_vectorcall_VARARGS_KEYWORDS () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Objects/unicodeobject.c:1404
    #8  0x00007fffe37e50ff in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7ffeb7078ef0, callable=0x7fffe3201e40, tstate=0x7ffeb0000bc0) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:123
    #9  PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7ffeb7078ef0, callable=0x7fffe3201e40) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:123
    #10 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, trace_info=0x7ffec1251f40, tstate=<optimized out>) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:5867
    #11 _PyEval_EvalFrameDefault () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:4198
    #12 0x00007fffe37a6f95 in _PyEval_EvalFrame (throwflag=0, f=0x7ffeb7078d60, tstate=0x7ffeb0000bc0) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:4995
    #13 _PyEval_Vector (kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, locals=0x0, con=<optimized out>, tstate=<optimized out>) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/ceval.c:5065
    #14 _PyFunction_Vectorcall.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Objects/call.c:342
    #15 0x00007fffe3738233 in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=9223372036854775808, args=0x7ffec1252168, callable=0x7fffe2f5b7f0, tstate=0x7ffeb0000bc0) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:114
    #16 PyObject_VectorcallMethod.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Objects/call.c:770
    #17 0x00007fffe387a8cc in _PyObject_CallMethodIdNoArgs (name=0x7fffe39c3970 <PyId__shutdown.15474>, self=<optimized out>) at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Include/cpython/abstract.h:239
    #18 wait_for_thread_shutdown () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/pylifecycle.c:2823
    #19 0x00007fffe38a4670 in Py_FinalizeEx.localalias () at /home/conda/feedstock_root/build_artifacts/python-split_1642146689888/work/Python/pylifecycle.c:1719
    #20 0x00007ffff3fc543f in Realm::PythonInterpreter::~PythonInterpreter (this=0x7ffecc000b20, __in_chrg=<optimized out>) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:196
    #21 0x00007ffff3fc7062 in Realm::LocalPythonProcessor::destroy_interpreter (this=0x555555661bf0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:801
    #22 0x00007ffff3fc65a8 in Realm::PythonThreadTaskScheduler::worker_terminate (this=0x555555661e10, switch_to=0x0) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:662
    #23 0x00007ffff3c6333c in Realm::ThreadedTaskScheduler::scheduler_loop (this=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/runtime_impl.h:1152
    #24 0x00007ffff3fc5e77 in Realm::PythonThreadTaskScheduler::python_scheduler_loop (this=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/runtime_impl.h:395
    #25 0x00007ffff3fcb10c in Realm::Thread::thread_entry_wrapper<Realm::PythonThreadTaskScheduler, &Realm::PythonThreadTaskScheduler::python_scheduler_loop> (obj=0x555555661e10) at /gpfs/fs1/mzalewski/.pyenv/versions/anaconda3-2021.11/envs/legate-dev/x86_64-conda-linux-gnu/include/c++/11.2.0/bits/threads.h:97
    #26 0x00007ffff3c3d9a8 in Realm::KernelThread::pthread_entry (data=0x7ffeccaff9c0) at /gpfs/fs1/mzalewski/repos/quickstart-collection/legate.core/legion/bindings/python/stl_map.h:774
    #27 0x00007ffff13d46db in start_thread (arg=0x7ffec1256700) at pthread_create.c:463
    #28 0x00007ffff364371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
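
Frames #17 to #19 of thread 10 show Py_FinalizeEx calling wait_for_thread_shutdown, which in CPython invokes threading._shutdown and waits for every non-daemon Python thread to finish. The following sketch is not a diagnosis of this hang. It only illustrates that general CPython behaviour and one way to list the threads that finalization would wait on. The helper name blocking_threads is hypothetical:

```python
import threading


def blocking_threads():
    """Return live non-daemon threads other than the main thread.

    These are the threads that interpreter finalization waits for.
    """
    main = threading.main_thread()
    return [
        t for t in threading.enumerate()
        if t is not main and not t.daemon
    ]


# A non-daemon worker that stays alive until the event is set.
stop = threading.Event()
worker = threading.Thread(target=stop.wait, name="worker")
worker.start()

names_while_running = [t.name for t in blocking_threads()]

# Let the worker finish so that shutdown is no longer blocked by it.
stop.set()
worker.join()
names_after_join = [t.name for t in blocking_threads()]

print(names_while_running)
print(names_after_join)
```

Calling a helper like this just before the script exits may help narrow down which Python-level thread, if any, is still alive when the interpreter is destroyed.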
    

    The conda environment:

    _libgcc_mutex             0.1                 conda_forge    conda-forge
    _openmp_mutex             4.5                       1_gnu    conda-forge
    _sysroot_linux-64_curr_repodata_hack 3                   h5bd9786_13    conda-forge
    abseil-cpp                20210324.2           h9c3ff4c_0    conda-forge
    arrow-cpp                 6.0.1           py310h500f8fe_8_cpu    conda-forge
    aws-c-cal                 0.5.11               h95a6274_0    conda-forge
    aws-c-common              0.6.2                h7f98852_0    conda-forge
    aws-c-event-stream        0.2.7               h3541f99_13    conda-forge
    aws-c-io                  0.10.5               hfb6a706_0    conda-forge
    aws-checksums             0.1.11               ha31a3da_7    conda-forge
    aws-sdk-cpp               1.8.186              hb4091e7_3    conda-forge
    binutils_impl_linux-64    2.36.1               h193b22a_2    conda-forge
    binutils_linux-64         2.36                 hf3e587d_4    conda-forge
    bzip2                     1.0.8                h7f98852_4    conda-forge
    c-ares                    1.18.1               h7f98852_0    conda-forge
    ca-certificates           2021.10.8            ha878542_0    conda-forge
    cffi                      1.15.0          py310h0fdd8cc_0    conda-forge
    cuda-cccl                 11.6.55              hdc25635_0    nvidia
    cuda-compiler             11.6.0               hde35cc3_0    nvidia
    cuda-cudart               11.6.55              he381448_0    nvidia
    cuda-cudart-dev           11.6.55              h42ad0f4_0    nvidia
    cuda-cuobjdump            11.6.55              h9dd2d0c_0    nvidia
    cuda-cuxxfilt             11.6.55              h69de05d_0    nvidia
    cuda-driver-dev           11.6.55                       0    nvidia
    cuda-libraries-dev        11.6.0               hde35cc3_0    nvidia
    cuda-nvcc                 11.6.55              h5758ece_0    nvidia
    cuda-nvprune              11.6.55              h3791f62_0    nvidia
    cuda-nvrtc                11.6.55              hc54fff9_0    nvidia
    cuda-nvrtc-dev            11.6.55              h42ad0f4_0    nvidia
    cudatoolkit               11.6.0              habf752d_10    conda-forge
    cutensor                  1.4.0.6              h7537e88_1    conda-forge
    gcc                       11.2.0               h702ea55_4    conda-forge
    gcc_impl_linux-64         11.2.0              h82a94d6_12    conda-forge
    gcc_linux-64              11.2.0               h39a9532_4    conda-forge
    gflags                    2.2.2             he1b5a44_1004    conda-forge
    gfortran                  11.2.0               h8811e0c_4    conda-forge
    gfortran_impl_linux-64    11.2.0              h7a446d4_12    conda-forge
    gfortran_linux-64         11.2.0               h777b47f_4    conda-forge
    glog                      0.5.0                h48cff8f_0    conda-forge
    grpc-cpp                  1.42.0               ha1441d3_1    conda-forge
    gxx                       11.2.0               h702ea55_4    conda-forge
    gxx_impl_linux-64         11.2.0              h82a94d6_12    conda-forge
    kernel-headers_linux-64   3.10.0              h4a8ded7_13    conda-forge
    krb5                      1.19.2               hcc1bbae_3    conda-forge
    ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
    libblas                   3.9.0           13_linux64_openblas    conda-forge
    libbrotlicommon           1.0.9                h7f98852_6    conda-forge
    libbrotlidec              1.0.9                h7f98852_6    conda-forge
    libbrotlienc              1.0.9                h7f98852_6    conda-forge
    libcblas                  3.9.0           13_linux64_openblas    conda-forge
    libcublas                 11.8.1.74            h1e58c10_0    nvidia
    libcublas-dev             11.8.1.74            h7a51e1f_0    nvidia
    libcufft                  10.7.0.55            h563f203_0    nvidia
    libcufft-dev              10.7.0.55            h05eb8d0_0    nvidia
    libcurand                 10.2.9.55            h7c349da_0    nvidia
    libcurand-dev             10.2.9.55            hd2e71f0_0    nvidia
    libcurl                   7.81.0               h2574ce0_0    conda-forge
    libcusolver               11.3.2.107           hc875929_0    nvidia
    libcusolver-dev           11.3.2.107           h78cb71c_0    nvidia
    libcusparse               11.7.1.55            h9a152cf_0    nvidia
    libcusparse-dev           11.7.1.55            h02e612c_0    nvidia
    libedit                   3.1.20191231         he28a2e2_2    conda-forge
    libev                     4.33                 h516909a_1    conda-forge
    libevent                  2.1.10               h9b69904_4    conda-forge
    libffi                    3.4.2                h7f98852_5    conda-forge
    libgcc-devel_linux-64     11.2.0              h0952999_12    conda-forge
    libgcc-ng                 11.2.0              h1d223b6_12    conda-forge
    libgfortran-ng            11.2.0              h69a702a_11    conda-forge
    libgfortran5              11.2.0              h5c6108e_11    conda-forge
    libgomp                   11.2.0              h1d223b6_12    conda-forge
    liblapack                 3.9.0           13_linux64_openblas    conda-forge
    libnghttp2                1.46.0               h812cca2_0    conda-forge
    libnpp                    11.6.0.55            hdb0c674_0    nvidia
    libnpp-dev                11.6.0.55            h0163868_0    nvidia
    libnsl                    2.0.0                h7f98852_0    conda-forge
    libnvjpeg                 11.6.0.55            h6f17e28_0    nvidia
    libnvjpeg-dev             11.6.0.55            h0163868_0    nvidia
    libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
    libprotobuf               3.19.3               h780b84a_0    conda-forge
    libsanitizer              11.2.0              he4da1e4_12    conda-forge
    libssh2                   1.10.0               ha56f1ee_2    conda-forge
    libstdcxx-devel_linux-64  11.2.0              h0952999_12    conda-forge
    libstdcxx-ng              11.2.0              he4da1e4_12    conda-forge
    libthrift                 0.15.0               he6d91bd_1    conda-forge
    libutf8proc               2.7.0                h7f98852_0    conda-forge
    libuuid                   2.32.1            h7f98852_1000    conda-forge
    libzlib                   1.2.11            h36c2ea0_1013    conda-forge
    lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
    ncurses                   6.2                  h58526e2_4    conda-forge
    numpy                     1.22.1          py310h454958d_0    conda-forge
    openssl                   1.1.1l               h7f98852_0    conda-forge
    opt_einsum                3.3.0              pyhd8ed1ab_1    conda-forge
    orc                       1.7.2                h1be678f_0    conda-forge
    parquet-cpp               1.5.1                         2    conda-forge
    pip                       21.3.1             pyhd8ed1ab_0    conda-forge
    pyarrow                   6.0.1           py310h1a3fb3d_8_cpu    conda-forge
    pycparser                 2.21               pyhd8ed1ab_0    conda-forge
    python                    3.10.2          h62f1059_0_cpython    conda-forge
    python_abi                3.10                    2_cp310    conda-forge
    re2                       2021.11.01           h9c3ff4c_0    conda-forge
    readline                  8.1                  h46c0cb4_0    conda-forge
    s2n                       1.0.10               h9b69904_0    conda-forge
    setuptools                60.5.0          py310hff52083_0    conda-forge
    snappy                    1.1.8                he1b5a44_3    conda-forge
    sqlite                    3.37.0               h9cd32fc_0    conda-forge
    sysroot_linux-64          2.17                h4a8ded7_13    conda-forge
    tk                        8.6.11               h27826a3_1    conda-forge
    tzdata                    2021e                he74cb21_0    conda-forge
    wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
    xz                        5.2.5                h516909a_1    conda-forge
    zlib                      1.2.11            h36c2ea0_1013    conda-forge
    zstd                      1.5.2                ha95c52a_0    conda-forge
    

    @m3vaz tried a slightly different but similar environment and observed the same issue.

    bug 
    opened by marcinz 10
  • Using a scalar in `allclose` raises an `AttributeError`

    Problem

    Using a scalar in allclose raises AttributeError: PROJ_1D_1D_.

    To reproduce

    1. Create test.py:
      from legate import numpy as lnp
      import numpy as realnp
      
      # vanilla numpy works
      a = realnp.full(10, 1e-1)
      print(realnp.allclose(a, 1e-1))
      
      # legate numpy not working
      la = lnp.full(10, 1e-1)
      print(lnp.allclose(la, 1e-1))
      
    2. Run test.py with legate --cpus 1 ./test.py -lg:numpy:test

    Output

    The first part that uses vanilla NumPy prints True.

    The second part that uses Legate NumPy raises:

    Traceback (most recent call last):
      File "<prefix>/lib/python3.8/site-packages/legion_top.py", line 408, in legion_python_main
        run_path(args[start], run_name='__main__')
      File "<prefix>/lib/python3.8/site-packages/legion_top.py", line 200, in run_path
        exec(code, module.__dict__, module.__dict__)
      File "./test.py", line 10, in <module>
        print(lnp.allclose(la, 1e-1))
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/module.py", line 459, in allclose
        return ndarray.perform_binary_reduction(
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/array.py", line 2068, in perform_binary_reduction
        dst._thunk.binary_reduction(
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/deferred.py", line 5167, in binary_reduction
        ) = self.runtime.compute_broadcast_transform(
      File "<prefix>/lib/python3.8/site-packages/legate/numpy/runtime.py", line 2500, in compute_broadcast_transform
        self.first_proj_id + getattr(NumPyProjCode, proj_name),
      File "<prefix>/lib/python3.8/enum.py", line 384, in __getattr__
        raise AttributeError(name) from None
    AttributeError: PROJ_1D_1D_
    

    Expected behavior

    Working like vanilla NumPy, or raising an exception with a clear message of what is not supported.
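
Until scalars are handled, one possible workaround is to expand the scalar to the array's shape before comparing, so that both operands have the same dimensionality. The sketch below is an assumption, not a confirmed fix: the helper allclose_with_scalar is hypothetical, it relies only on full and allclose (both used in the report above), and it is exercised here with vanilla NumPy rather than Legate NumPy:

```python
import numpy as np


def allclose_with_scalar(xp, a, b):
    """Compare `a` with `b`, expanding a scalar `b` to the shape of `a`.

    `xp` is the array module to use (for example numpy). Passing the
    Legate NumPy module instead is untested and is an assumption.
    """
    if np.isscalar(b):
        # Build a full array so no scalar-to-array broadcast is needed.
        b = xp.full(a.shape, b)
    return xp.allclose(a, b)


a = np.full(10, 1e-1)
print(allclose_with_scalar(np, a, 1e-1))
```

This costs an extra temporary array the size of the input, so it is a stopgap rather than a replacement for proper scalar support.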

    opened by piyueh 10
  • Discussion PR for conda envs split

    cc @manopapad @magnatelee @marcinz @m3vaz

    @manopapad asked if I might make a first stab at splitting out the env files as I described earlier. This PR splits out the current single conda env dev file into per-python version files:

    environment-test-3.8.yml
    environment-test-3.9.yml
    environment-test-3.10.yml
    

    These are "kitchen sink" env files that should be usable by any dev who wants to install, run tests, and build docs locally. I think they are immediately useful to devs for easily creating per-Python-version dev environments.

    I think the commented groupings in these files represent the usage of the various dependencies, but please provide any feedback.

    So then the question is what further refinements or splits might be useful for CI? Some ideas:

    • An environment-build.yml. This would be much more minimal, with conda-build (and ripgrep): just enough to create the conda package consumed by later build stages.

    • If there is a single job responsible for building docs, then a more minimal environment-docs.yml could be created without the build/testing deps

    • Could also truly split all these into environment-dev-3.x.yml (kitchen sink, used by devs locally) and environment-test-3.x.yml (used in CI test runs, omits docs and build deps). These envs are pretty small to be honest, so personally I think that might be overkill (i.e. just use these for CI and local dev to have fewer env files to maintain)

    Notes

    • Moved docs/examples deps to pip. In my experience sphinx extensions, etc. can have the most conda availability issues, and I think it might make the install a bit quicker as well. But this could be reverted to conda if desired.
    • I have verified all these files are installable via e.g.
      conda env create -n dev310 -f environment-test-3.10.yml
      

    Questions

    • Assuming these are refined for use in cunumeric, what (strict?) subset should be used for legate.core? Or should the two repos just be kept in sync with each other?
    • What automation/checks are reasonable to help keep env files consistent when updates are made?
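
As one possible starting point for the consistency question, a small script could compare the package names listed in each env file and report any that appear in only one of them. The sketch below is an illustration under stated assumptions: the two env file contents are made up (the real files are not shown here), the function dependency_names is hypothetical, and the parser is deliberately simplistic (it reads "- name" entries under the top-level dependencies key rather than doing full YAML parsing):

```python
import re


def dependency_names(env_text):
    """Collect package names listed under the top-level `dependencies:` key."""
    names = set()
    in_deps = False
    for line in env_text.splitlines():
        top_level_key = re.match(r"^(\w+):", line)
        if top_level_key:
            # Track whether we are inside the dependencies section.
            in_deps = top_level_key.group(1) == "dependencies"
            continue
        entry = re.match(r"^\s*-\s+([A-Za-z0-9_.\-]+)", line)
        if in_deps and entry:
            # The name ends at the first version specifier such as = or >.
            names.add(entry.group(1).lower())
    return names


# Made-up file contents, for illustration only.
ENV_38 = """\
name: test-3.8
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy>=1.22
  - pytest
"""

ENV_310 = """\
name: test-3.10
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy>=1.22
  - pytest
  - mypy
"""

only_in_one = dependency_names(ENV_38) ^ dependency_names(ENV_310)
print(sorted(only_in_one))
```

A check like this could run in CI and fail when the per-version files drift apart, with an allow-list for packages that legitimately differ between Python versions.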
    opened by bryevdv 9
  • Empty slicing is allowed in NumPy but not in cuNumeric

    NumPy and cuNumeric behave differently for the following example:

    # `np` is either numpy or cunumeric
    a = np.array([1, 2, 3])
    print(a[4:5])
    

    NumPy produces an empty slice for this program, whereas cuNumeric in deferred mode raises an exception.
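NumPy's basic slicing follows Python's slice semantics, in which out-of-range bounds are clamped to the sequence length rather than rejected. The sketch below uses only the standard library to show the clamping that yields the empty result. It illustrates the NumPy-compatible behavior that is expected here and makes no claim about how cuNumeric implements slicing.

```python
# Python clamps out-of-range slice bounds to the sequence length;
# NumPy basic slicing resolves start/stop the same way.
length = 3  # length of [1, 2, 3]
start, stop, step = slice(4, 5).indices(length)
print(start, stop, step)  # 3 3 1 -> start == stop, so nothing is selected

data = [1, 2, 3]
empty = data[4:5]
print(empty)  # []
```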

    bug 
    opened by magnatelee 0
  • Cannot build docs w/o cuRAND

    Currently, if cuRAND is not available, then cunumeric/random/__init__.py will fall back to wrapping certain RNG functions from NumPy, including their docstrings. This is problematic because NumPy's docstring for random.binomial contains unreferenced footnotes, which causes our sphinx build to fail with:

    cunumeric.random.binomial:54:Footnote [1] is not referenced
    

    Note that the error message seems to be referring to cunumeric.random.binomial, but that's only because we wrap numpy.random.binomial.

    cuRAND is always available on CUDA-enabled builds, but only on certain platforms for CPU-only builds (notably, it is not available on Mac).

    Here are some potential fixes:

    • tweak the doc build to not generate references to the functions that require cuRAND, if cuRAND is not available
    • refactor the code such that all random functions are available even without cuRAND, potentially with a reduced selection of bitgenerators
    • suppress the "unreferenced footnote" warnings in sphinx
    • error-out with an informative message when building the docs if cuRAND is not available

    Note that some tests also currently fail w/o cuRAND.
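Of the fixes listed above, suppressing the warning is the smallest change. A minimal sketch of what it could look like in the Sphinx `conf.py`, assuming the message is emitted under Sphinx's `ref.footnote` warning type. Note that this silences every unreferenced-footnote warning in the build, not only the one inherited from NumPy's docstring.

```python
# conf.py (fragment)
# Assumption: "Footnote [1] is not referenced" is reported as a `ref.footnote` warning.
suppress_warnings = [
    "ref.footnote",
]
```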

    bug documentation testing 
    opened by manopapad 0
  • fixing advanced indexing operation for empty arrays

    This PR fixes test cases in which the input, index, and/or values are empty arrays. It depends on https://github.com/nv-legate/legate.core/pull/319 being merged first.

    opened by ipdemes 0
Releases(v22.08.00)
  • v22.08.00(Aug 9, 2022)

    Release 22.08.00 features a variety of random distribution implementations (backed by cuRAND), distributed prefix scan operators, and a complete implementation of sorting for multi-node multi-CPU execution. This release also includes several quality-of-life changes and bug fixes, including type annotations for all but one Python module, improvements to the parallel test driver, fixes for several operators when inputs are empty, and proper handling of ndarrays passed as array sizes or indices.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.
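As a rough illustration of the routines this release adds, the sketch below exercises a few of them through the NumPy API. It uses plain NumPy so that it runs anywhere; the assumption is that, with cuNumeric installed, changing the import to `import cunumeric as np` gives the same results for these calls. `np.cumsum` stands in here for the scan routines.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended swap

# searchsorted: insertion points that keep a sorted array sorted
sorted_vals = np.array([1, 3, 5, 7])
positions = np.searchsorted(sorted_vals, np.array([0, 4, 8]))
print(positions.tolist())  # [0, 2, 4]

# packbits / unpackbits: eight bits <-> one uint8, most significant bit first
bits = np.array([1, 0, 1, 0, 0, 0, 0, 1], dtype=np.uint8)
packed = np.packbits(bits)
print(packed.tolist())  # [161]
roundtrip = np.unpackbits(packed)

# cumsum as an example of a prefix scan
running = np.cumsum(np.array([1, 2, 3, 4]))
print(running.tolist())  # [1, 3, 6, 10]

# take_along_axis combined with argsort sorts each row
a = np.array([[3, 1, 2], [9, 7, 8]])
order = np.argsort(a, axis=1)
rows_sorted = np.take_along_axis(a, order, axis=1)
print(rows_sorted.tolist())  # [[1, 2, 3], [7, 8, 9]]

# atleast_2d promotes a 1-D array to shape (1, n)
promoted = np.atleast_2d(np.array([1, 2, 3]))
print(promoted.shape)  # (1, 3)
```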

    New Features

    • Adding support for ND output regions in Advanced Indexing task by @ipdemes in #370
    • added support for 'searchsorted' by @mfoerste4 in #414
    • np.packbits and np.unpackbits by @magnatelee in #427
    • Implementation of atleast_{1,2,3}d by @sbak5 in #404
    • Implementing cunumeric.random.BitGenerator by @fduguet-nv in #254
    • Adding support for some simple _indices routines by @ipdemes in #417
    • adding mask_indices routine by @ipdemes in #426
    • Random advanced distributions by @fduguet-nv in #470
    • Distributed nd sort for cpu/omp by @mfoerste4 in #437
    • Initial implementation of scan routines. by @rkarim2 in #425
    • Adding support for take_along_axis and put_along_axis by @ipdemes in #436
    • cunumeric.ndim by @magnatelee in #495
    • Add support for curand conda package build (cherry pick #510) by @marcinz in #512

    Improvements

    • Don't run the resolution logic if the arrays have the same dtype by @magnatelee in #389
    • Set cuda virtual package as hard run requirement for gpu conda package by @m3vaz in #398
    • First pass mypy typing by @bryevdv in #387
    • Generalize Dict to Mapping for newer versions of mypy by @jjwilke in #405
    • Add support for using cupy in sort.py by @robinw0928 in #395
    • Refactor test.py by @bryevdv in #378
    • Use Numpy axis normalizations where possible by @bryevdv in #419
    • More mypy by @bryevdv in #413
    • adding bounds check for advanced indexing by @ipdemes in #397
    • Report Elapsed Time in cholesky's output by @SeyedMir in #423
    • Support -vv for more verbose test output by @bryevdv in #432
    • Add typing to runtime.py by @bryevdv in #428
    • Update compress/take tests for pytest by @bryevdv in #435
    • Project down to a 1D store for the scalar reduction output by @magnatelee in #455
    • Fallback to self = np.ndarray when necessary by @bryevdv in #431
    • Add types to thunk modules by @bryevdv in #438
    • allclose detail + misc tests improvements by @bryevdv in #457
    • cunumeric.random - Adding Module-scoped functions by @fduguet-nv in #481
    • Activate the NumPy fallback for cunumeric.random in CPU build by @magnatelee in #485
    • Legacy generators for cpu build by @magnatelee in #487
    • Allow CPU build to optionally use cuRAND by @magnatelee in #498
    • Sanitize shapes in ndarray's constructor by @magnatelee in #496
    • src/cunumeric/sort: stop using std::{inclusive, exclusive}_scan by @rohany in #499
    • Update conda requirements by @manopapad in #383
    • Handle dtype/casting/out properly in contractions by @manopapad in #402
    • Missing / overzealous check_eager_args calls by @manopapad in #465
    • Strengthen some types by @manopapad in #468

    Bug Fixes

    • Add missing includes to aid intellisense providers by @trxcllnt in #382
    • Proper exception handling for cholesky by @magnatelee in #391
    • Fixes for building with setup.py outside conda, primarily Mac by @jjwilke in #394
    • Use the right API to check if the store is unbound by @magnatelee in #399
    • Fix nargs for report:dump-csv by @bryevdv in #400
    • Handle empty outputs correctly in advanced indexing task by @magnatelee in #396
    • Fall back to NumPy in __array_function__ and __array_ufunc__ by @magnatelee in #424
    • Fix for legate data interface by @magnatelee in #429
    • Fix test_floating.py test to call sys.exit by @marcinz in #433
    • Make missing pynvml an error for GPU tests by @bryevdv in #441
    • Make the NumPy fallback work correctly in randint by @magnatelee in #450
    • Squeeze fix by @magnatelee in #448
    • Correctly prune out empty tasks in binary reduction by @magnatelee in #453
    • Minor fix for indexing routines by @magnatelee in #452
    • Make DeferredArray.reshape always return a deferred array by @magnatelee in #454
    • Re-freezing conda compiler versions (#415) by @m3vaz in #449
    • Fix for floating point predicates by @magnatelee in #466
    • markdown version fix by @ipdemes in #459
    • Fixup typing regressions by @bryevdv in #471
    • Remove ill-defined advanced indexing test case by @magnatelee in #484
    • Handle empty inputs correctly in local scan tasks by @magnatelee in #491
    • Handle an unknown in a tuple correctly in reshape by @magnatelee in #490
    • fix mismatched size_t/uint64_t types by @jjwilke in #475
    • Allow scalar cunumeric ndarrays as array indices by @manopapad in #479

    Documentation

    • adding new version for documentations by @ipdemes in #447
    • Updates to api_compare.py by @bryevdv in #456
    • Be stricter applying CuWrapperMetadata by @bryevdv in #463
    • Add custom nitpicky ref checks for cunumeric APIs by @bryevdv in #462
    • Docs coverage check by @bryevdv in #469
    • Fix the API reference for random functions and scan operators by @magnatelee in #497

    New Contributors

    • @jjwilke made their first contribution in https://github.com/nv-legate/cunumeric/pull/394
    • @SeyedMir made their first contribution in https://github.com/nv-legate/cunumeric/pull/423
    • @fduguet-nv made their first contribution in https://github.com/nv-legate/cunumeric/pull/254
    • @rkarim2 made their first contribution in https://github.com/nv-legate/cunumeric/pull/425
    • @rohany made their first contribution in https://github.com/nv-legate/cunumeric/pull/499

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.05.02...v22.08.00

  • v22.05.02(Jun 21, 2022)

    This hotfix release fixes issues in conda recipes.

    What's Changed

    • Cherry pick: Update conda requirements (#383) by @marcinz in https://github.com/nv-legate/cunumeric/pull/406
    • Cherry pick: Set cuda virtual package as hard run requirement for conda gpu package (#398) by @marcinz in https://github.com/nv-legate/cunumeric/pull/407
    • Cherry pick: Fix nargs for report:dump-csv (#400) by @marcinz in https://github.com/nv-legate/cunumeric/pull/408
    • Re-freezing conda compiler versions by @m3vaz in https://github.com/nv-legate/cunumeric/pull/415

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.05.01...v22.05.02

  • v22.05.01(Jun 16, 2022)

    This hotfix release updates the conda build recipe to make the cuNumeric package depend on the right version of NumPy and also fixes a bug in the command-line argument parser.

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.05.00...v22.05.01

  • v22.05.00(Jun 7, 2022)

    Release 22.05 features complete support for advanced indexing and related indexing routines (compress and take), a multi-node multi-GPU sorting implementation for multi-dimensional ndarrays, window functions, several matrix/tensor operations (trace, matrix_power, multi_dot, and einsum_path) and primitive support for FFT on a single GPU using cuFFT.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.
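The indexing and linear algebra routines named in the summary can be tried with a short sketch like the one below. It is written against plain NumPy so that it runs without a Legate installation; the assumption is that cuNumeric, as a drop-in replacement, returns the same values once the import is changed to `import cunumeric as np`. `np.hanning` is used as one example of a window function.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended swap

a = np.arange(12).reshape(3, 4)

# advanced (integer-array) indexing, including a negative index
picked = a[[0, 2, -1], [1, 3, 0]]
print(picked.tolist())  # [1, 11, 8]

# compress keeps the rows selected by a boolean mask; take indexes the flattened array
kept_rows = np.compress([True, False, True], a, axis=0)
print(kept_rows.tolist())  # [[0, 1, 2, 3], [8, 9, 10, 11]]
flat_pick = np.take(a, [0, 5, 11])
print(flat_pick.tolist())  # [0, 5, 11]

# trace sums the main diagonal: 0 + 5 + 10
diag_sum = int(np.trace(a))
print(diag_sum)  # 15

# matrix_power and multi_dot agree on a repeated product
m = np.array([[1, 1], [0, 1]])
cubed = np.linalg.matrix_power(m, 3)
chained = np.linalg.multi_dot([m, m, m])
print(cubed.tolist())  # [[1, 3], [0, 1]]

# a five-point Hann window: approximately [0, 0.5, 1, 0.5, 0]
window = np.hanning(5)
```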

    New Features

    • thrust allocator for sort by @mfoerste4 in #228
    • implementation of np.block w/ a test by @sbak5 in #213
    • Window functions by @magnatelee in #283
    • Advanced indexing by @ipdemes in #235
    • First implementation of single-GPU FFT using cuFFT by @mferreravila in #238
    • Use the stream pool in Legate core by @magnatelee in #295
    • Add partition api and utilize sort backend by @mfoerste4 in #287
    • implementing TRACE operation by @ipdemes in #263
    • adding support for negative indices in advanced indexing by @ipdemes in #322
    • Add cpu-only packages to the conda variants by @m3vaz in #330
    • Bump minpy to 3.8 (conda env and recipe) by @bryevdv in #332
    • Remaining ufuncs by @magnatelee in #315
    • Logic functions by @magnatelee in #347
    • Slicing-based np.block implementation by @sbak5 in #306
    • Implement matrix_power by @manopapad in #360
    • Distributed N-dimensional sort by @mfoerste4 in #316
    • Implement einsum_path by @manopapad in #361
    • adding diag_indices and diag_indices_from routines by @ipdemes in #367
    • Implement moveaxis by @manopapad in #364
    • Implement __array_function__ and __array_ufunc__ by @manopapad in #353
    • Implement more norm cases by @manopapad in #366
    • Implement multi_dot by @manopapad in #358
    • Adding support for "indices" routine by @ipdemes in #368
    • Support axis=None and keepdims=True/False in argmin and argmax by @trxcllnt in #346

    Improvements

    • Move the ufunc module (ported to branch-22.05) by @magnatelee in #242
    • Use ufuncs in special methods by @magnatelee in #247
    • Initial unit tests by @bryevdv in #229
    • Revise type coercion by @magnatelee in #264
    • adding 'only' option to the tests.py by @ipdemes in #248
    • Updates for using the new unbound store API by @magnatelee in #265
    • Don't run the resolution logic if the arrays have the same dtype (ported to 22.05) by @magnatelee in #390
    • Use find_packages for installation by @magnatelee in #269
    • Some misc tests and types by @bryevdv in #268
    • Forward-port #257 by @manopapad in #273
    • Split up sort.cu for parallel compilation by @magnatelee in #277
    • Debugging checks by @magnatelee in #281
    • Update example programs by @magnatelee in #289
    • Bump up NumPy version by @magnatelee in #291
    • Don't use constexpr for window functions by @magnatelee in #294
    • Better error message on unsupported complex reductions by @manopapad in #300
    • handle coverage wrapping uniformly including ufuncs by @bryevdv in #272
    • Architecture-agnostic check for int128 by @manopapad in #293
    • Unit test fixups by @bryevdv in #303
    • reduce testcases for partition test by @mfoerste4 in #304
    • Adding conda build recipe files by @marcinz in #274
    • Use pytest for test running by @bryevdv in #297
    • Add unit tests to test.py by @marcinz in #305
    • Change _cunumeric_implemented into a dataclass by @manopapad in #318
    • Pass reporting explicitly to coverage decorators by @bryevdv in #333
    • FFT refactoring by @magnatelee in #310
    • Declare ufunc formatter to be safe for parallel read by @magnatelee in #335
    • Force installation of Lapack in OpenBLAS build by @marcinz in #266
    • Mark no out-of-range indices for copies by @magnatelee in #336
    • Discussion PR for conda envs split by @bryevdv in #326
    • Use 64-bit integers for global thread ids by @magnatelee in #349
    • Use legate.core arg parsing by @bryevdv in #343
    • adding compress and take operations by @ipdemes in #296
    • Conda recipes improvements by @marcinz in #345
    • Misc small updates by @bryevdv in #352
    • adding performance tests for indexing routines by @ipdemes in #337
    • Add support for using cupy by @robinw0928 in #373

    Bug Fixes

    • Forward port late commits from 22.03 by @bryevdv in #241
    • Catch up the ufunc renaming (ported to 22.05) by @magnatelee in #244
    • Activate the cuBLAS workaround by checking the cuBLAS version at runtime (ported to 22.05) by @magnatelee in #246
    • fix large shape >int32 by @mfoerste4 in #236
    • Fix a compile error by @magnatelee in #251
    • Fix the out-of-bounds bug in reshape by @magnatelee in #267
    • add missing comparison functions by @bryevdv in #278
    • Fix nonzero by @magnatelee in #285
    • fix return value of ndarray.argsort by @mfoerste4 in #286
    • Fix typos in tests after pytest transition by @manopapad in #309
    • Update trace.py tests / fix some warnings by @bryevdv in #307
    • Don't dump test stdout unconditionally by @bryevdv in #314
    • Add typing_extensions requirement to conda recipe by @marcinz in #325
    • Fix pytest exit to fail on errors by @marcinz in #334
    • Fixing #321 issue by @ipdemes in #341
    • Missing arguments in cases of eager-to-deferred fallback by @manopapad in #348
    • Add a missing instance of share=True by @manopapad in #350
    • Fix return types for some of the unary ops by @magnatelee in #354
    • fixing compile-time warnings by @ipdemes in #351
    • Remove special case handling for scalar arrays by @manopapad in #357
    • Fix the bug in np.append test on empty input array and non-empty scalars by @sbak5 in #365
    • Match NumPy's behavior for isclose(inf,inf) by @manopapad in #372
    • Fix unary reductions by @magnatelee in #369
    • Allow DeferredThunks to be created for empty arrays by @manopapad in #371
    • Fix documentation building by @manopapad in #377
    • Make the example programs pass the CI by @magnatelee in #380

    Documentation

    • Comparison table update by @ipdemes in #252
    • Add user-facing docs for coverage reporting by @bryevdv in #261
    • creating script for calculating API coverage by categories by @ipdemes in #271
    • Doc update by @magnatelee in #275
    • Fix docs builds for trace by @manopapad in #308
    • fixing documentation for fft by @ipdemes in #302
    • Add a custom autodoc class for ufuncs by @bryevdv in #317
    • Refactor comparison table as Sphinx extension by @bryevdv in #323
    • lgpatch docs + doc fixups by @bryevdv in #356

    New Contributors

    • @mferreravila made their first contribution in https://github.com/nv-legate/cunumeric/pull/238
    • @m3vaz made their first contribution in https://github.com/nv-legate/cunumeric/pull/330
    • @robinw0928 made their first contribution in https://github.com/nv-legate/cunumeric/pull/373

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.03.00...v22.05.00

  • v22.03.00(Apr 5, 2022)

    Release 22.03 adds several new features, including np.repeat, np.unique, np.inner, np.outer, and 35 new universal functions (ufuncs). In this release, we have also significantly revised and refactored tensor operations to make them comprehensive. Preliminary support for 1D array sorting for multi-GPU execution is available. (CPU and OpenMP paths are still single processor only.) We have also made performance improvements for np.convolve and np.tril/triu for GPU execution. Finally, we have added a tool that reports cuNumeric’s API coverage for a given NumPy program execution. (For usage, please refer to “Measuring API coverage” in the cuNumeric documentation.)

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.
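For reference, the four functions named in the summary behave as follows in NumPy. This is a minimal sketch run against plain NumPy; it assumes that cuNumeric returns the same values when the import is changed to `import cunumeric as np`.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended swap

# repeat: each element is repeated in place
repeated = np.repeat(np.array([1, 2]), 2)
print(repeated.tolist())  # [1, 1, 2, 2]

# unique: sorted distinct values
distinct = np.unique(np.array([3, 1, 3, 2, 1]))
print(distinct.tolist())  # [1, 2, 3]

# inner: 1*4 + 2*5 + 3*6
inner = int(np.inner(np.array([1, 2, 3]), np.array([4, 5, 6])))
print(inner)  # 32

# outer: every pairwise product
outer = np.outer(np.array([1, 2]), np.array([3, 4]))
print(outer.tolist())  # [[3, 4], [6, 8]]
```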

    New Features

    • Sort pr by @mfoerste4 in #199
    • Add basic cunumeric.patch module by @bryevdv in #225
    • adding support for REPEAT operation by @ipdemes in #190
    • np.unique implementation by @magnatelee in #192
    • np.append & ndarray.flatten by @sbak5 in #196
    • General cuFFT plan cache by @magnatelee in #195
    • Tools for checking API coverage by @magnatelee in #191
    • Overhaul linear algebra operations by @manopapad in #217

    Improvements

    • Move the ufunc module by @magnatelee in #234
    • ufunc refactoring + a bunch of missing ufuncs by @magnatelee in #223
    • Expand coverage reporting to ndarray methods by @bryevdv in #219
    • Einsum benchmark improvements by @manopapad in #222
    • Remove old-style casts by @manopapad in #218
    • Optimize np.tril used in Cholesky by @magnatelee in #214
    • Add a convergence threshold argument to the cg example by @marcinz in #221
    • Make sure nonzero produces outputs in C order by @magnatelee in #216
    • API cleanup for ndarray by @bryevdv in #209
    • Minor improvement for diag by @magnatelee in #211
    • Stop using alloca by @magnatelee in #212
    • Port and refactor GH #140 "Use cufft callbacks for better performance on fft-based convolutions" by @magnatelee in #204

    Bug Fixes

    • Activate the cuBLAS workaround by checking the cuBLAS version at runtime by @magnatelee in #245
    • Catch up the ufunc renaming by @magnatelee in #243
    • Fix coverage for ufuncs by @bryevdv in #240
    • Fix docs breakage by @bryevdv in #239
    • Fix compilation errors on clang by @manopapad in #233
    • Add cunumeric.ufunc to packages by @bryevdv in #231
    • Fix trailing comma tuple bug by @bryevdv in #230
    • Fix the build issue with Thrust by @magnatelee in #227
    • Fix some docs breakage by @bryevdv in #224
    • Fix for #208 by @magnatelee in #210
    • Fix for #206 by @magnatelee in #207
    • Fixed bugs for 1D array inputs on vstack, dstack and column_stack by @sbak5 in #182

    Documentation

    • Add docstrings to ndarray methods by @bryevdv in #205
    • Clean up Sphinx warnings by @bryevdv in #202
    • adding versions to the documentation by @ipdemes in #198
    • adding script for comparing API coverage + table at the documentation page by @ipdemes in #193
    • User facing documentation for API usage tool by @bryevdv in #262

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v22.01.00...v22.03.00

  • v22.01.00(Feb 10, 2022)

    Release 22.01 adds support for einsum expressions, logic functions and a subset of indexing and array manipulation routines.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.
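A short sketch of the kinds of calls this release covers: an einsum expression, a triangular extraction, a Cholesky factorization, a split, a choose, and a logic function. It runs against plain NumPy; the assumption is that the same code works unchanged with `import cunumeric as np`.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended swap

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

# einsum expression for a matrix product; equivalent to A @ B
prod = np.einsum("ij,jk->ik", A, B)
print(prod.tolist())  # [[2.0, 1.0], [4.0, 3.0]]

# lower triangle of A
lower = np.tril(A)
print(lower.tolist())  # [[1.0, 0.0], [3.0, 4.0]]

# Cholesky factor L of a symmetric positive-definite matrix, with L @ L.T == spd
spd = np.array([[4.0, 2.0], [2.0, 3.0]])
L = np.linalg.cholesky(spd)

# array_split tolerates lengths that do not divide evenly
parts = np.array_split(np.arange(7), 3)
print([p.tolist() for p in parts])  # [[0, 1, 2], [3, 4], [5, 6]]

# choose picks element i from the array selected by the i-th index
chosen = np.choose([0, 1, 0], [[10, 20, 30], [40, 50, 60]])
print(chosen.tolist())  # [10, 50, 30]

# element-wise logic function
both = np.logical_and([True, False], [True, True])
print(both.tolist())  # [True, False]
```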

    New Features

    • Convolution by @magnatelee and @lightsighter in #103
    • Added few universal functions and logical operations by @ipdemes in #134
    • numpy.tril and numpy.triu by @magnatelee in #144
    • Einsum operation by @manopapad in #142
    • Cholesky factorization by @magnatelee in #160
    • Implemented split routines and a test by @sbak5 in #152
    • Choose operation by @ipdemes in #146

    Improvements

    • Convolve Cache for cuFFT by @lightsighter in #109
    • Warmup iterations for Richardson-Lucy by @magnatelee in #113
    • Remove NumPyAllocation by @magnatelee in #118
    • Update for new data ingest interface by @manopapad in #105
    • Enable some temporarily commented-out tests by @manopapad in #119
    • Testcase for legate.core!94 by @manopapad in #120
    • Use built-in reduction op by @magnatelee in #136
    • Managing CUDA library contexts directly in cuNumeric by @magnatelee in #138
    • Support for cuSOLVER by @magnatelee in #139
    • Make CUDA library context cache thread safe by @magnatelee in #141
    • Use .cu for CUDA library management by @magnatelee in #145
    • Some reusable test input generators by @manopapad in #153
    • Fix Wundefined-var-template clang warning by @manopapad in #154
    • Add eager fallback mode to testing script by @manopapad in #156
    • Add eager tests by @marcinz in #157
    • Small additions to test input generators by @manopapad in #159
    • No longer need to reserve one dim for reductions by @manopapad in #161
    • Use a per-device stream cache for CUDA library calls by @magnatelee in #165
    • Simple tiling heuristic for Cholesky factorization by @magnatelee in #167
    • Fix clang-format config to include cu,cuh,inl files by @manopapad in #168
    • LEGATE_ABORT is now a statement by @magnatelee in #169
    • Preloading CUDA libraries by @magnatelee in #171
    • Use CHECK_* macros in a couple more places by @manopapad in #172
    • Fix some invocations of complex constructors by @manopapad in #173
    • Add a switch to not call tril on Cholesky outputs by @magnatelee in #174
    • Do python install on custom dir w/o eggs by @manopapad in #177
    • Refined 'tests/array_split.py' w/ more essential input shapes by @sbak5 in #178
    • WIP: adding logic for DIAGONAL by @ipdemes in #170
    • Stack and concatenate routines including subroutines by @sbak5 in #175
    • Refactoring by @magnatelee in #181

    Bug Fixes

    • Fix #111 by @magnatelee in #115
    • math.prod not available in python 3.7 by @manopapad in #129
    • Fix some compiler warnings by @magnatelee in #130
    • dot: fix error message on unsupported array dimensions by @manopapad in #133
    • Fix slot calculation in reduction kernel by @manopapad in #148
    • Port fix for #79 by @manopapad in #155
    • Build OpenBLAS with CROSS option to prevent tests at compile time by @marcinz in #158
    • Pin setuptools version, to work around breaking change by @manopapad in #164
    • Workaround for a bug in cuBLAS < 11.4 by @magnatelee in #185
    • Cannot install cuNumeric to different dir than Legate Core by @manopapad in #186
    • Adjust error tolerance for float16, to avoid spurious test failure by @manopapad in #166

    Documentation

    • Adding contributions file by @marcinz in #147
    • Update docstrings by @magnatelee in #188

    New Contributors

    • @lightsighter made their first contribution in https://github.com/nv-legate/cunumeric/pull/109
    • @ipdemes made their first contribution in https://github.com/nv-legate/cunumeric/pull/134
    • @pre-commit-ci made their first contribution in https://github.com/nv-legate/cunumeric/pull/151
    • @sbak5 made their first contribution in https://github.com/nv-legate/cunumeric/pull/152

    Full Changelog: https://github.com/nv-legate/cunumeric/compare/v21.11.00...v22.01.00

  • v21.11.00(Nov 9, 2021)

    This is the initial public alpha release of cuNumeric, an aspiring drop-in replacement for NumPy at scale.

    Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

    What's Changed

    • Refactoring for the broadcasting logic by @magnatelee in https://github.com/nv-legate/cunumeric/pull/18
    • Improved partitioning and sharding for GEMV by @manopapad in https://github.com/nv-legate/cunumeric/pull/37
    • Fix #16 by @manopapad in https://github.com/nv-legate/cunumeric/pull/38
    • Add CI by @marcinz in https://github.com/nv-legate/cunumeric/pull/43
    • Use a script on the runner to checkout CI repository by @marcinz in https://github.com/nv-legate/cunumeric/pull/44
    • Fix CI by @marcinz in https://github.com/nv-legate/cunumeric/pull/45
    • Extend tests with CPU/GPU/OMP testing by @marcinz in https://github.com/nv-legate/cunumeric/pull/48
    • Remove accidental part of the job matrix from CI by @marcinz in https://github.com/nv-legate/cunumeric/pull/49
    • Add missing alignment constraints for matrix-vector multiplication by @magnatelee in https://github.com/nv-legate/cunumeric/pull/58
    • Force left alignment for pointers and references by @magnatelee in https://github.com/nv-legate/cunumeric/pull/59
    • Don't alter the GC priority for external instances by @magnatelee in https://github.com/nv-legate/cunumeric/pull/60
    • Be strict when importing legate.numpy in examples by @manopapad in https://github.com/nv-legate/cunumeric/pull/61
    • Fix for reinterpret casts that are actually unsafe in the modern c++ by @magnatelee in https://github.com/nv-legate/cunumeric/pull/62
    • Remove the return type of the void-returning function in the mapper by @magnatelee in https://github.com/nv-legate/cunumeric/pull/63
    • Remove dependency on numpy>=1.20 by @manopapad in https://github.com/nv-legate/cunumeric/pull/64
    • Stop using looping templates by @magnatelee in https://github.com/nv-legate/cunumeric/pull/65
    • Bug fix for release mode by @magnatelee in https://github.com/nv-legate/cunumeric/pull/66
    • Port nonzero to the new buffer API by @magnatelee in https://github.com/nv-legate/cunumeric/pull/68
    • Missing constraint for bincount by @magnatelee in https://github.com/nv-legate/cunumeric/pull/69
    • Clean up install script by @manopapad in https://github.com/nv-legate/cunumeric/pull/70
    • Fixes to compile on MacOS by @manopapad in https://github.com/nv-legate/cunumeric/pull/71
    • Disable absolute and allclose for complex types only with Clang by @magnatelee in https://github.com/nv-legate/cunumeric/pull/72
    • Generalize the reshape operator by @magnatelee in https://github.com/nv-legate/cunumeric/pull/73
    • Improve dot product for half precision floats by @magnatelee in https://github.com/nv-legate/cunumeric/pull/74
    • Support for tensordot by @magnatelee in https://github.com/nv-legate/cunumeric/pull/75
    • Bugfixes on operations by @manopapad in https://github.com/nv-legate/cunumeric/pull/76
    • Add missing type casts for __half by @magnatelee in https://github.com/nv-legate/cunumeric/pull/77
    • Pull the correct Core image by @marcinz in https://github.com/nv-legate/cunumeric/pull/78
    • Port remaining fixes from old branch by @manopapad in https://github.com/nv-legate/cunumeric/pull/80
    • Remove remaining conditional legate.numpy imports from examples by @manopapad in https://github.com/nv-legate/cunumeric/pull/81
    • Always dump test output by @marcinz in https://github.com/nv-legate/cunumeric/pull/83
    • Minor code cleanups by @manopapad in https://github.com/nv-legate/cunumeric/pull/85
    • Attempt to address #84 by @manopapad in https://github.com/nv-legate/cunumeric/pull/86
    • Always follow the core's choice regarding CUDA/OpenMP support by @manopapad in https://github.com/nv-legate/cunumeric/pull/88
    • Fix legate data interface by @magnatelee in https://github.com/nv-legate/cunumeric/pull/92
    • Handle overlapping stores correctly in dot by @magnatelee in https://github.com/nv-legate/cunumeric/pull/93
    • Improvements to handling of scalar arrays by @manopapad in https://github.com/nv-legate/cunumeric/pull/90
    • Port to the new calling convention by @magnatelee in https://github.com/nv-legate/cunumeric/pull/89
    • Prevent CI on forks by @marcinz in https://github.com/nv-legate/cunumeric/pull/94
    • Emptiness checks for matrix ops by @magnatelee in https://github.com/nv-legate/cunumeric/pull/95
    • Mapper update by @magnatelee in https://github.com/nv-legate/cunumeric/pull/82
    • Port to the new reduction op interface by @magnatelee in https://github.com/nv-legate/cunumeric/pull/96
    • Stop using delinearization by @magnatelee in https://github.com/nv-legate/cunumeric/pull/97
    • Dead code elimination by @magnatelee in https://github.com/nv-legate/cunumeric/pull/98
    • Reorganizing source files by @magnatelee in https://github.com/nv-legate/cunumeric/pull/99
    • Remove leftover requirements.txt by @manopapad in https://github.com/nv-legate/cunumeric/pull/100
    • Update for build system changes by @manopapad in https://github.com/nv-legate/cunumeric/pull/101
    • Updates for new attachment interface by @manopapad in https://github.com/nv-legate/cunumeric/pull/102
    • Fix for matrix-vector multiplication by @magnatelee in https://github.com/nv-legate/cunumeric/pull/104
    • Another attempt to fix degenerate cases by @magnatelee in https://github.com/nv-legate/cunumeric/pull/107
    • Fix #111 by @magnatelee in https://github.com/nv-legate/cunumeric/pull/116
    • Release 21.11.00 by @marcinz in https://github.com/nv-legate/cunumeric/pull/121

    New Contributors

    • @marcinz made their first contribution in https://github.com/nv-legate/cunumeric/pull/43

    Full Changelog: https://github.com/nv-legate/cunumeric/commits/v21.11.00

Owner
Legate
High Productivity High Performance Computing
Legate

mold: A Modern Linker mold is a faster drop-in replacement for existing Unix linkers. It is several times faster than LLVM lld linker, the second-fast

Rui Ueyama 8.4k Aug 11, 2022
Improved and configurable drop-in replacement to std::function that supports move only types, multiple overloads and more

fu2::function an improved drop-in replacement to std::function Provides improved implementations of std::function: copyable fu2::function move-only fu

Denis Blank 397 Aug 3, 2022
C++ implementation of the Python Numpy library

NumCpp: A Templatized Header Only C++ Implementation of the Python NumPy Library

David Pilger 2.4k Aug 8, 2022
libnpy is a simple C++ library for reading and writing of numpy's .npy files.

C++ library for reading and writing of numpy's .npy files

Leon Merten Lohse 164 Aug 3, 2022
C++ implementation of the Python Numpy library

NumCpp: A Templatized Header Only C++ Implementation of the Python NumPy Library Author: David Pilger [email protected] Version: License Testing C++

David Pilger 2.4k Aug 3, 2022
filters.python and readers.numpy for PDAL

PDAL Python Plugins PDAL Python plugins allow you to process data with PDAL into Numpy arrays. They support embedding Python in PDAL pipelines with th

PDAL 3 Apr 22, 2022
cvnp: pybind11 casts between numpy and OpenCV, possibly with shared memory

cvnp: pybind11 casts and transformers between numpy and OpenCV, possibly with shared memory Explicit transformers between cv::Mat / cv::Matx and numpy

Pascal Thomet 9 Jul 10, 2022
A single file drop-in memory leak tracking solution for C++ on Windows

MemLeakTracker A single file drop-in memory leak tracking solution for C++ on Windows This small piece of code allows for global memory leak tracking

null 22 Jul 18, 2022
Windows 11 Drag & Drop to the Taskbar (Partial Fix)

Windows 11 Drag & Drop to the Taskbar (Partial Fix) This program partially fixes the missing "Drag & Drop to the Taskbar" support in Windows 11. In th

null 1.2k Aug 5, 2022
Windows 11 Drag & Drop to the Taskbar (Fix)

Windows 11 Drag & Drop to the Taskbar (Fix) This program fixes the missing "Drag & Drop to the Taskbar" support in Windows 11. In the best case, such

null 1.2k Aug 6, 2022
D2R mod generator. Provide quick tool to generate .txt files to change game balance: increase drop, monster density or even randomize items.

Diablo 2 mod generator Generator is inspired by d2modmaker. It provides fast and easy way to create mod without any modding knowledge. Features includ

Smirnov Vladimir 19 Jul 28, 2022
A drop-in entity editor for EnTT with Dear ImGui

imgui_entt_entity_editor A drop-in, single-file entity editor for EnTT, with ImGui as graphical backend. demo-code (live) Editor Editor with Entiy-Lis

Erik Scholz 143 Aug 8, 2022
Alien Swarm: Reactive Drop

Alien Swarm: Reactive Drop Alien Swarm: Reactive Drop is a standalone modification for Valve's Alien Swarm game. This repository contains the source c

Reactive Drop Team 19 Aug 6, 2022
Microsoft 2.4k Aug 7, 2022