Feature Store for Machine Learning

Overview

Feast (Feature Store) is an operational data system for managing and serving machine learning features to models in production. Please see our documentation for more information about the project.

Getting Started with Docker Compose

Clone the latest stable version of the Feast repository and navigate to the infra/docker-compose sub-directory:

git clone https://github.com/feast-dev/feast.git
cd feast/infra/docker-compose
cp .env.sample .env

The .env file can optionally be configured based on your environment.

Bring up Feast:

docker-compose pull && docker-compose up -d

Please wait for the containers to start up. This could take a few minutes since the quickstart contains demo infrastructure like Kafka and Jupyter.

Once the containers are all running, connect to the provided Jupyter server, which contains example notebooks to try out.

Important resources

Please refer to the official documentation at https://docs.feast.dev

Notice

Feast is a community project and is still under active development. Your feedback and contributions are important to us. Please have a look at our contributing guide for details.

Contributors

Thanks goes to these incredible people:

Issues
  • Split Field model into distinct Feature and Entity objects

    What this PR does / why we need it: This is a split-off of #612 that introduces the model changes made in that PR in a more digestible chunk.

    This PR includes:

    • Removal of Field object
    • Addition of distinct Feature and Entity objects
    • Removal of TFX fields on entities

    SQL changes:

    • Surrogate long ids for feature sets, features and entities
    • Drop TFX constraints from entities table

    Does this PR introduce a user-facing change?:

    Model changes to FeatureSets, Features and Entities. Requires Migration.
    
    lgtm approved size/XXL 
    opened by zhilingc 43
  • Add support for Redis and Redis Cluster

    What this PR does / why we need it: This PR adds Redis and RedisCluster support as an online store. As described in https://github.com/feast-dev/feast/issues/1497, it is backward compatible with Feast 0.9, so it can be used for migration to 0.10.

    Which issue(s) this PR fixes: Fixes #1497

    Does this PR introduce a user-facing change?:

    Redis and RedisCluster online stores support added. Format is compatible with Feast 0.9.
    
    kind/feature size/L lgtm approved ok-to-test release-note 
    opened by qooba 36
  • Feast API: Adding a new historical store

    1. Introduction

    We've had a lot of demand for either open source or AWS batch stores (#367, #259). Folks from the community have asked us how they can contribute code to add their store types.

    In this issue I will walk through how batch stores are currently being used and how a new batch store type can be added.

    2. Overview

    Feast interacts with a batch store in two places:

    • Data ingestion: Ingestion jobs that load data into stores must be able to locate stores, apply migrations, and write data into feature set tables.
    • Feature serving (batch): Feast serving executes batch retrieval jobs in order for users to export historical feature data.

    3. Data ingestion

    Feast creates and manages population jobs that stream in data from upstream data sources. Currently Feast only supports Kafka as a data source, meaning these jobs are all long-running. Batch ingestion pushes data to Kafka topics, after which it is picked up by these "population" jobs.

    In order for the ingestion + population flow to complete, the destination store must be writable. This means that Feast must be able to create the appropriate tables/schemas in the store and also write data from the population job into the store.

    Currently Feast Core starts and manages these population jobs that ingest data into stores, although we are planning to move this responsibility to the serving layer. Feast Core starts an Apache Beam job which synchronously runs migrations on the destination store and subsequently starts consuming from Kafka and publishing records.

    Below is a "happy-path" example of a batch ingestion process:

    [Figure: happy-path batch ingestion process]

    In order to accommodate a new store type, the Apache Beam job needs to be updated to support:

    • Setup (create tables/schemas): The current implementation for BigQuery/Redis is captured in StoreUtil.java
    • Writes: A store specific client needs to be implemented that can write to a new store type in WriteToStore.java
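
The two responsibilities above can be illustrated with a small Python sketch. This is not the Beam job itself (which is written in Java); it uses SQLite from the standard library as a stand-in store, and the class name `SqliteStore`, its methods, and the table layout are hypothetical names chosen for illustration only.

```python
import sqlite3


class SqliteStore:
    """Hypothetical stand-in for a new store type (not a Feast class)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def setup(self, feature_set, columns):
        # Setup step: create the feature set table if it does not exist yet.
        cols = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns.items())
        self.conn.execute(f'CREATE TABLE IF NOT EXISTS "{feature_set}" ({cols})')
        self.conn.commit()

    def write(self, feature_set, rows):
        # Write step: insert a batch of rows produced by the population job.
        if not rows:
            return 0
        names = list(rows[0].keys())
        placeholders = ", ".join("?" for _ in names)
        column_list = ", ".join(f'"{n}"' for n in names)
        self.conn.executemany(
            f'INSERT INTO "{feature_set}" ({column_list}) VALUES ({placeholders})',
            [tuple(row[n] for n in names) for row in rows],
        )
        self.conn.commit()
        return len(rows)


store = SqliteStore()
store.setup("driver_stats", {"driver_id": "INTEGER", "trips_today": "INTEGER"})
written = store.write(
    "driver_stats",
    [{"driver_id": 1, "trips_today": 5}, {"driver_id": 2, "trips_today": 7}],
)
row_count = store.conn.execute('SELECT COUNT(*) FROM "driver_stats"').fetchone()[0]
```

Keeping setup idempotent (`CREATE TABLE IF NOT EXISTS`) means a restarted job can run it again before it starts writing.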

    4. Feature serving (batch)

    Feast Serving is a web service that allows for the retrieval of feature data from a batch feature store. Below is a sequence diagram for a typical feature request from a batch store.

    [Figure: sequence diagram of a typical feature request from a batch store]

    Currently we only have support for BigQuery as a batch store. The entry point for this implementation is the BigQueryServingService, which extends the ServingService interface.

    public interface ServingService {
      GetFeastServingInfoResponse getFeastServingInfo(GetFeastServingInfoRequest getFeastServingInfoRequest);
      GetOnlineFeaturesResponse getOnlineFeatures(GetOnlineFeaturesRequest getFeaturesRequest);
      GetBatchFeaturesResponse getBatchFeatures(GetBatchFeaturesRequest getFeaturesRequest);
      GetJobResponse getJob(GetJobRequest getJobRequest);
    }
    

    The ServingService is called from the wrapping gRPC service ServingService, where the functionality is more clearly described.

    The interface defines the following methods:

    • getFeastServingInfo: Get the store type, either online or offline.
    • getOnlineFeatures: Get online features synchronously.
    • getBatchFeatures: Get batch features asynchronously. Retrieval for batch features always happens asynchronously, because of the time taken for an export to complete. This method returns immediately with a JobId to the client. The client can then poll the job status until the query has reached a terminal state (succeeded or failed).
    • getJob: Returns the job status for a specific job ID.
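
The asynchronous flow described for getBatchFeatures and getJob can be sketched as a client-side polling loop. The sketch below is a simulation in Python, not the Feast client: `FakeBatchServing`, `wait_for_job`, and the status strings are hypothetical names used only to show the pattern of submitting a request, receiving a job ID immediately, and polling until a terminal state.

```python
import itertools
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


class FakeBatchServing:
    """Hypothetical in-memory stand-in for a batch serving service."""

    def __init__(self, polls_until_done=3):
        self._ids = itertools.count(1)
        self._remaining = {}
        self._polls_until_done = polls_until_done

    def get_batch_features(self, request):
        # Returns immediately with a job ID; the export is assumed to run elsewhere.
        job_id = f"job-{next(self._ids)}"
        self._remaining[job_id] = self._polls_until_done
        return job_id

    def get_job(self, job_id):
        # Reports RUNNING until the simulated export finishes.
        if self._remaining[job_id] > 0:
            self._remaining[job_id] -= 1
            return "RUNNING"
        return "SUCCEEDED"


def wait_for_job(service, job_id, poll_interval=0.01, timeout=5.0):
    """Poll the job status until it reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = service.get_job(job_id)
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{job_id} did not finish within {timeout}s")
        time.sleep(poll_interval)


service = FakeBatchServing()
job_id = service.get_batch_features({"feature_set": "driver_stats"})
final_status = wait_for_job(service, job_id)
```

A real client would likely use a longer poll interval or a backoff, since exports can take minutes.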

    Notes on the current design: Although the actual functionality will be retained, the structure of these interfaces will probably change away from extending a service interface and towards having a store interface. There are various problems with the current implementation:

    1. Batch and online stores share a single interface. I believe the intention here was to allow some stores to support both online and historical/batch storage, but for most stores this isn't the case. There is also no reason why we can't have two interfaces here. Ideally this should be split in two.
    2. The current approach is to extend services for each new store type, but this seems to be a poor abstraction. Ideally we would have both a batch and an online store interface (not a service interface), which are called from a single serving implementation. This approach would give a clearer separation of concerns and would prevent things like job management happening within a service implementation.
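
One possible shape for the split proposed above is sketched here in Python for brevity (the actual code base is Java). Every name in it (`OnlineStore`, `BatchStore`, `InMemoryOnlineStore`, and this `ServingService` class) is hypothetical and only illustrates two store interfaces called from a single serving implementation.

```python
from abc import ABC, abstractmethod


class OnlineStore(ABC):
    """Hypothetical interface for low-latency, synchronous lookups."""

    @abstractmethod
    def get_online_features(self, entity_keys, features):
        ...


class BatchStore(ABC):
    """Hypothetical interface for asynchronous historical exports."""

    @abstractmethod
    def start_export(self, entity_rows, features):
        """Start an export and return a job ID."""

    @abstractmethod
    def get_job_status(self, job_id):
        ...


class InMemoryOnlineStore(OnlineStore):
    def __init__(self, rows):
        self._rows = rows  # maps entity key -> {feature name: value}

    def get_online_features(self, entity_keys, features):
        return [
            {name: self._rows.get(key, {}).get(name) for name in features}
            for key in entity_keys
        ]


class ServingService:
    """Single serving implementation that delegates to whichever stores exist."""

    def __init__(self, online_store=None, batch_store=None):
        self._online = online_store
        self._batch = batch_store

    def get_online_features(self, entity_keys, features):
        if self._online is None:
            raise NotImplementedError("no online store configured")
        return self._online.get_online_features(entity_keys, features)

    def get_batch_features(self, entity_rows, features):
        if self._batch is None:
            raise NotImplementedError("no batch store configured")
        return self._batch.start_export(entity_rows, features)


serving = ServingService(
    online_store=InMemoryOnlineStore({"driver:1": {"trips_today": 5}})
)
online_result = serving.get_online_features(["driver:1", "driver:2"], ["trips_today"])
```

With this layout a new store type implements only the interface it supports, and job management stays in the serving layer rather than in each store.
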
    wontfix area/serving area/ingestion 
    opened by woop 34
  • Add computation and retrieval of batch feature statistics

    What this PR does / why we need it: This PR adds support for retrieval of batch statistics over data ingested into a feast warehouse store, as proposed in M2 of the Feature Validation RFC.

    Note that it deviates from the RFC in the following ways:

    • Statistics are computed using SQL. This is because TFDV is, unfortunately, only available in Python, and multi-SDK connectors for Beam are still a work in progress. Computing the statistics using SQL will be the compromise until either TFDV is available in Java or cross-language execution is supported.
    • Statistics can only be computed over a single feature-set at a time. This is mostly to reduce complexity in implementation. Since datasets are unable to span multiple feature sets, it makes sense to have this restriction in place.
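
As a rough illustration of computing statistics in SQL over a single feature set, the sketch below runs aggregate queries against an in-memory SQLite table. It is not the PR's implementation: the table `driver_stats`, the column names, the helper `feature_statistics`, and the choice of statistics are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE driver_stats (driver_id INTEGER, trips_today INTEGER)")
conn.executemany(
    "INSERT INTO driver_stats VALUES (?, ?)",
    [(1, 5), (2, 7), (3, None), (4, 12)],
)


def feature_statistics(conn, table, feature):
    """Compute simple statistics for one feature of one feature set table."""
    # Identifiers cannot be bound as parameters, so this sketch assumes trusted names.
    query = f"""
        SELECT COUNT(*),
               COUNT(*) - COUNT("{feature}"),
               MIN("{feature}"),
               MAX("{feature}"),
               AVG("{feature}")
        FROM "{table}"
    """
    total, missing, minimum, maximum, mean = conn.execute(query).fetchone()
    return {
        "count": total,
        "missing": missing,
        "min": minimum,
        "max": maximum,
        "mean": mean,
    }


stats = feature_statistics(conn, "driver_stats", "trips_today")
```

Because each query targets one table, the single-feature-set restriction falls out naturally from this approach.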

    This is a bit of a chonky PR, and the code itself requires a bit of cleaning up, hence the WIP status, but refer to the attached notebook for what this implementation looks like for a user of Feast.

    Does this PR introduce a user-facing change?:

    - Adds GetFeatureStatistics to the CoreService
    - Adds get_statistics method to the client in the python SDK
    
    kind/feature approved size/XXL do-not-merge/work-in-progress ok-to-test area/core 
    opened by zhilingc 31
  • Add feature and feature set labels, for metadata

    What this PR does / why we need it: -> Extension for #463

    Update (@suwik):

    • Addressed previous review comments
    • Added labels on the feature set level
    • Removed the python SDK changes related to labels implemented inside field.py (30d63fa0fa)
    • Made a small unit test refactoring on the way.

    Which issue(s) this PR fixes: Fixes #463

    Does this PR introduce a user-facing change?:

    Feature spec and Feature set spec will both have a new field called labels
    
    kind/feature lgtm approved size/XL ok-to-test area/core 
    opened by imjuanleonard 31
  • Switch from protobuf to arrow

    Is your feature request related to a problem? Please describe.

    Serialization costs for protobuf are very high.

    Describe the solution you'd like

    Switching to arrow would decrease serialization costs by a lot. This issue tracks an investigation into feasibility of switching to arrow from protobuf.

    Describe alternatives you've considered

    Additional context

    See this document for the results of a detailed investigation into latency issues due to on-demand feature views, which prompted the observation that serialization costs for protobuf are extremely high.

    wontfix kind/bug priority/p1 
    opened by felixwang9817 29
  • GetFeastCoreVersion failed with code "StatusCode.UNIMPLEMENTED"

    Expected Behavior

    Feast Core and Serving should be connected in the Python SDK when running feast version, as shown by the following output (from https://github.com/gojek/feast/blob/master/docs/getting-started/install-feast.md):

    { "sdk": { "version": "feast 0.3.0" }, "core": { "url": "192.168.99.100:32090", "version": "0.3", "status": "connected" }, "serving": { "url": "192.168.99.100:32091", "version": "0.3", "status": "connected" } }

    Current Behavior

    When running feast version:

    GetFeastCoreVersion failed with code "StatusCode.UNIMPLEMENTED" Method feast.core.CoreService/GetFeastCoreVersion is unimplemented {"sdk": {"version": "feast 0.3.0"}, "core": {"url": "192.168.39.232:32090", "version": "", "status": "not connected"}, "serving": {"url": "192.168.39.232:32091", "version": "0.3", "status": "connected"}}
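
When triaging output like the above, the JSON part can be checked programmatically. The following is a small sketch, assuming the JSON object has been separated from the preceding error message; the helper name `disconnected_components` is hypothetical.

```python
import json

# JSON portion of the output reported above.
version_output = (
    '{"sdk": {"version": "feast 0.3.0"}, '
    '"core": {"url": "192.168.39.232:32090", "version": "", "status": "not connected"}, '
    '"serving": {"url": "192.168.39.232:32091", "version": "0.3", "status": "connected"}}'
)


def disconnected_components(raw):
    """Return the names of components whose status is not "connected"."""
    info = json.loads(raw)
    return sorted(
        name
        for name, details in info.items()
        # The "sdk" entry has no status field, so it is skipped.
        if "status" in details and details["status"] != "connected"
    )


not_connected = disconnected_components(version_output)
```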

    Steps to reproduce

    Follow https://github.com/gojek/feast/blob/master/docs/getting-started/install-feast.md steps 0-2 for minikube (local) installation.

    Then ran:

    pip3 install -e ${FEAST_HOME_DIR}/sdk/python --user
    feast config set core_url ${FEAST_CORE_URL}
    feast config set serving_url ${FEAST_SERVING_URL}
    feast version

    The last command is where the problem occurred.

    Specifications

    • Version: Master (0.3)
    • Platform: Localhost (Ubuntu 18.04)
    • Subsystem: python 3.6.8, helm 2.16.0, kubectl client 1.16.1, kubectl server 1.15.5, minikube 1.5.2

    Possible Solution

    I'm not sure. I did, however, notice something strange when the pods are starting up. In the picture below, a number of restarts occur for the core and serving services in the cluster. Before each restart, the pods always go from 'ContainerCreating' to 'Running' to 'Error' to 'CrashLoopBackOff'. This happens in loops until it finally just says 'Running' after 5-6 minutes, and it happens every time I do a clean (maybe unclean) installation. My best guess is that the core service has a bug with the connection, but it could be in the Python SDK as well for all I know.

    [Image: pod status showing the restarts]

    opened by NicholaiStaalung 28
  • Remove feature specs being able to declare their serving or warehouse stores

    Currently a Feature can declare its own data stores in the feature spec, which must have been registered with Core beforehand. This adds a lot of complexity to declaring features and is very error prone.

    Instead we should have the data stores dictated by Core, and feature specs should know nothing about them.

    This means that a Feast deployment will now only be able to have 1 serving store and 1 warehouse store at a time.

    Some things to note:

    This changes the way features can configure some settings. For example, Redis expiry must now be set in FeatureSpec.options rather than FeatureSpec.datastores.serving.options.

    So the option key has changed from "expiry" to "redis.expiry". It is still called "expiry" when overriding a default in StorageSpec.options, however. We need to find a better way to document this.
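
The two key names can be illustrated with a small lookup sketch. The helper `resolve_redis_expiry`, the plain dictionaries standing in for the option maps, and the precedence order (a per-feature value wins over the storage default) are assumptions for illustration, not the PR's actual logic.

```python
def resolve_redis_expiry(feature_options, storage_options):
    """Return the expiry for a feature, preferring the per-feature setting."""
    # Per-feature setting uses the namespaced key "redis.expiry".
    if "redis.expiry" in feature_options:
        return feature_options["redis.expiry"]
    # Storage-level default keeps the plain key "expiry".
    return storage_options.get("expiry")


storage_defaults = {"expiry": "1d"}
overridden = resolve_redis_expiry({"redis.expiry": "2h"}, storage_defaults)
defaulted = resolve_redis_expiry({}, storage_defaults)
```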

    I think I like the idea of only having one place to set options in a FeatureSpec. But whether an option applies to the actual underlying storage depends on whether that storage is actually being used. So it's not clear that these should be feature options at all.

    lgtm approved size/XXL 
    opened by tims 27
  • Add support to Python SDK for staging files on Amazon S3

    What this PR does / why we need it: Currently the Feast client cannot stage data anywhere except GCP storage. This PR adds S3 staging support to the client.

    Which issue(s) this PR fixes: Client can stage files on S3.

    Python Client changes for #706. Edit: Fixes #562

    Does this PR introduce a user-facing change?: No

    Added support to Python SDK for staging files on Amazon S3
    
    kind/feature lgtm approved size/XL ok-to-test 
    opened by jmelinav 25
  • Authentication and Authorization

    What this PR does / why we need it:

    First implementation of auth for Feast (related to #504 minimal implementation).

    1. Adds authentication to Feast Core (with support for different implementations). Currently accepts any JWT bearer token passed through gRPC metadata.
    2. Adds authorization to Feast Core (with support for different implementations). Currently only supports Ory Keto. A follow up PR will add an HTTP authorization adapter.
    3. Adds authentication to Python SDK/CLI. Two implementations included: users can enable authentication client side and Feast will send their Google Open ID credentials as gRPC metadata to Core, or they can provide client credentials and OAuth2 provider and the JWT will be fetched for them.
    4. Refactored the Python SDK/CLI SSL/TLS handling.
    5. Prevents unauthorized creation or modification of feature sets in projects that a user does not have membership in.
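
Passing a JWT as a bearer token through gRPC metadata can be sketched as follows. The helper name `bearer_metadata` and the use of the `authorization` key are assumptions for illustration; the PR text does not state which metadata key Feast Core reads.

```python
def bearer_metadata(token):
    """Build gRPC call metadata carrying a JWT as a bearer token."""
    if not token:
        raise ValueError("token must be non-empty")
    # gRPC metadata keys are lowercase; the key used here is an assumption.
    return (("authorization", f"Bearer {token}"),)


metadata = bearer_metadata("example.jwt.token")
```

With the grpcio library, a tuple like this is typically supplied through the `metadata` keyword argument of a stub call.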

    Limitations

    Does not handle user or role management in authorization provider (creating projects, adding members, removing members, listing members).

    Which issue(s) this PR fixes:

    Related to #504, but doesn't close the card. This is a minimal implementation. Replaces https://github.com/feast-dev/feast/pull/554

    Does this PR introduce a user-facing change?:

    Yes, documentation will be needed:

    • The Python Client SDK has a constructor now to pass authentication configuration.
    • The Core Service API requires gRPC metadata when authentication is enabled.
    • Configuration for Core has been extended to enable authentication and authorization.
    kind/feature lgtm approved size/XXL ok-to-test 
    opened by dr3s 24
  • S3 endpoint configuration #1169

    What this PR does / why we need it:

    Which issue(s) this PR fixes:

    Fixes #1169

    Does this PR introduce a user-facing change?:

    Added an option to configure S3 endpoint url
    
    kind/feature size/M lgtm approved ok-to-test release-note 
    opened by mike0sv 23
Releases(v0.22.2)
  • v0.22.2(Jul 29, 2022)

    0.22.2 (2022-07-29)

    This patch release reverts an accidental removal of Python 3.7 support in 0.22.1.

    Reverts

    • ci: "Fix night ci syntax error and update readme (#2935)" (31f54c8)
    • ci: fix: Fix nightly ci again (#2939). This reverts commit c36361951d29714392b1def6e54f83ae45cd5d9a. (33cbaeb)
    • ci: Revert "ci: Add a nightly CI job for integration tests (#2652)" (d4bb394)
    • ci: Revert "fix: Deprecate 3.7 wheels and fix verification workflow (#2934)" (efadb22)
    • Revert "fix: Change numpy version on setup.py and upgrade it to resolve dependabot warning (#2887)" (87190cb)
  • v0.22.1(Jul 19, 2022)

    0.22.1 (2022-07-19)

    Bug Fixes

    • Change numpy version on setup.py and upgrade it to resolve dependabot warning (#2887) (b9190b9)
    • Change the feature store plan method to public modifier (#2904) (568058a)
    • Deprecate 3.7 wheels and fix verification workflow (#2934) (146e36d)
    • Fix build wheels workflow to install apache-arrow correctly (#2932) (4b69e0e)
    • Fix grpc and update protobuf (#2894) (f726c96)
    • Fix night ci syntax error and update readme (#2935) (b35553b)
    • Fix nightly ci again (#2939) (c363619)
    • Fix the go build and use CgoArrowAllocator to prevent incorrect garbage collection (#2919) (f4f4894)
    • Fixing broken links to feast documentation on java readme and contribution (#2892) (a45e10a)
    • Resolve small typo in README file (#2930) (9840c1b)
    • Update gopy to point to fork to resolve github annotation errors. (#2940) (9b9fbbe)
  • v0.22.0(Jun 29, 2022)

    0.22.0 (2022-06-29)

    Overview

    This release adds many bug fixes, as well as some new functionality to help with realtime use cases:

    • SQLAlchemy backed registry (as an alternative to a file registry) (documentation)
    • A universal push API so users can push (streaming) feature values to both offline / online stores (documentation)
    • [Alpha] High level objects to define stream transformations. There are contrib components to help pull registered stream transformations, execute them, and push transformed feature values to the online store. (tutorial)
    • [Alpha] Logging served feature values (from the Go feature server) to the offline store and validating against Great Expectations suite using feast validate (tutorial)

    Features

    • Add feast repo-upgrade for automated repo upgrades (#2733) (a3304d4)
    • Add file write_to_offline_store functionality (#2808) (c0e2ad7)
    • Add http endpoint to the Go feature server (#2658) (3347a57)
    • Add simple TLS support in Go RedisOnlineStore (#2860) (521488d)
    • Add StreamProcessor and SparkKafkaProcessor as contrib (#2777) (83ab682)
    • Added Spark support for Delta and Avro (#2757) (7d16516)
    • CLI interface for validation of logged features (#2718) (c8b11b3)
    • Enable stream feature view materialization (#2798) (a06700d)
    • Enable stream feature view materialization (#2807) (7d57724)
    • Implement offline_write_batch for BigQuery and Snowflake (#2840) (97444e4)
    • Offline push endpoint for pushing to offline stores (#2837) (a88cd30)
    • Push to Redshift batch source offline store directly (#2819) (5748a8b)
    • Scaffold for unified push api (#2796) (1bd0930)
    • SQLAlchemy Registry Support (#2734) (b3fe39c)
    • Stream Feature View FCOS (#2750) (0cf3c92)
    • Update stream fcos to have watermark and sliding interval (#2765) (3256952)
    • Validating logged features via Python SDK (#2640) (2874fc5)

    Bug Fixes

    • Add columns for user metadata in the tables (#2760) (269055e)
    • Add project columns in the SQL Registry (#2784) (336fdd1)
    • Add S3FS dependency (which Dask depends on for S3 files) (#2701) (5d6fa94)
    • Bugfixes for how registry is loaded (#2768) (ecb8b2a)
    • Conversion of null timestamp from proto to python (#2814) (cb23648)
    • Correct feature statuses during feature logging test (#2709) (cebf609)
    • Correctly generate projects-list.json when calling feast ui and using postgres as a source (#2845) (bee8076)
    • Dynamodb drops missing entities when batching (#2802) (a2e9209)
    • Enable faulthandler and disable flaky tests (#2815) (4934d84)
    • Explicitly translate errors when instantiating the go fs (#2842) (7a2c4cd)
    • Fix broken roadmap links (#2690) (b3ba8aa)
    • Fix bugs in applying stream feature view and retrieving online features (#2754) (d024e5e)
    • Fix Feast UI failure with new way of specifying entities (#2773) (0d1ac01)
    • Fix feature view getitem for feature services (#2769) (88cc47d)
    • Fix issue when user specifies a port for feast ui (#2692) (1c621fe)
    • Fix macos wheel version for 310 and also checkout edited go files (#2890) (bdf170f)
    • Fix on demand feature view crash from inference when it uses df.apply (#2713) (c5539fd)
    • Fix SparkKafkaProcessor query_timeout parameter (#2789) (a8d282d)
    • Fix workflow syntax error (#2869) (fae45a1)
    • Fixed custom S3 endpoint read fail (#2786) (6fec431)
    • Go install gopy instead using go mod tidy (#2863) (2f2b519)
    • Hydrate infra object in the sql registry proto() method (#2782) (452dcd3)
    • Implement apply_materialization and infra methods in sql registry (#2775) (4ed107c)
    • Minor refactor to format exception message (#2764) (da763c6)
    • Prefer installing gopy from feast's fork as opposed to upstream (#2839) (34c997d)
    • Python server is not correctly starting in integration tests (#2706) (7583a0b)
    • Random port allocation for python server in tests (#2710) (dee8090)
    • Refactor test to reuse LocalRegistryFile (#2763) (4339c0a)
    • Revert "chore(release): release 0.22.0" (#2852) (e6a4636)
    • Stop running go mod tidy in setup.py (#2877) (676ecbb), closes /github.com/pypa/cibuildwheel/issues/189#issuecomment-549933912
    • Support push sources in stream feature views (#2704) (0d60eaa)
    • Sync publish and build_wheels workflow to fix verify wheel error. (#2871) (b0f050a)
    • Update roadmap with stream feature view rfc (#2824) (fc8f890)
    • Update udf tests and add base functions to streaming fcos and fix some nonetype errors (#2776) (331a214)
  • v0.21.3(Jun 13, 2022)

  • v0.21.2(May 17, 2022)

  • v0.21.1(May 16, 2022)

  • v0.21.0(May 13, 2022)

    0.21.0 (2022-05-13)

    Overview

    This release adds many bug fixes, and also adds several new features:

    • HBase online store (contrib)
    • [Alpha] feast ui command to spin up the Web UI within a feature repository

    There is ongoing work towards:

    • [Alpha] High performance Go feature server (via feast serve with a go feature server enabled)
    • [Alpha] Stream transformations
    • [Alpha] Validating logged feature values (from the Go feature server) with Great Expectations and feast validate

    Features

    • Add hbase online store support in feast (#2590) (c9eda79)
    • Adding SSL options for Postgres (#2644) (0e809c2)
    • Allow Feast UI to be spun up with CLI command: feast ui (#2667) (44ca9f5)
    • Allow to pass secrets and environment variables to transformation service (#2632) (ffa33ad)
    • CLI command 'feast serve' should start go-based server if flag is enabled (#2617) (f3ff812)
    • Create stream and batch feature view abstractions (#2559) (d1f76e5)
    • Postgres supported as Registry, Online store, and Offline store (#2401) (ed2f979)
    • Support entity fields in feature view schema parameter by dropping them (#2568) (c8fcc35)
    • Write logged features to an offline store (Python API) (#2574) (134dc5f)
    • Write logged features to Offline Store (Go - Python integration) (#2621) (ccad832)

    Bug Fixes

    • Addresses ZeroDivisionError when materializing file source with same timestamps (#2551) (1e398d9)
    • Asynchronously refresh registry for the feast ui command (#2672) (1b09ca2)
    • Build platform specific python packages with ci-build-wheel (#2555) (b10a4cf)
    • Delete data sources from registry when using the diffing logic (#2669) (fc00ca8)
    • Enforce kw args featureservice (#2575) (160d7b7)
    • Enforce kw args in datasources (#2567) (0b7ec53)
    • Feature logging to Redshift is broken (#2655) (479cd51)
    • Feature service to templates (#2649) (1e02066)
    • Feature with timestamp type is incorrectly interpreted by Go FS (#2588) (e3d9588)
    • Fix __hash__ methods (#2556) (ebb7dfe)
    • Fix AWS bootstrap template (#2604) (c94a69c)
    • Fix broken proto conversion methods for data sources (#2603) (00ed65a)
    • Fix case where on demand feature view tab is broken if no custom tabs are passed. (#2682) (01d3568)
    • Fix DynamoDB fetches when there are entities that are not found (#2573) (7076fe0)
    • Fix Feast UI parser to work with new APIs (#2668) (8d76751)
    • Fix java server after odfv update (#2602) (0ca6297)
    • Fix materialization with ttl=0 bug (#2666) (ab78702)
    • Fix push sources and add docs / tests pushing via the python feature server (#2561) (e8e418e)
    • Fixed data mapping errors for Snowflake (#2558) (53c2ce2)
    • Forcing ODFV udfs to be main module and fixing false positive duplicate data source warning (#2677) (2ce33cd)
    • Include the ui/build directory, and remove package data (#2681) (0384f5f)
    • Infer features for feature services when they depend on feature views without schemas (#2653) (87c194c)
    • Pin dependencies to nearest major version (#2647) (bb72b7c)
    • Pin pip<22.1 to get around breaking change in pip==22.1 (#2678) (d3e01bc)
    • Punt deprecation warnings and clean up some warnings. (#2670) (f775d2e)
    • Reject undefined features when using get_historical_features or get_online_features (#2665) (36849fb)
    • Remove ci extra from the feature transformation server dockerfile (#2618) (25613b4)
    • Remove incorrect call to logging.basicConfig (#2676) (8cbf51c)
    • Small typo in CLI (#2578) (f372981)
    • Switch from join_key to join_keys in tests and docs (#2580) (d66c931)
    • Teardown trino container correctly after tests (#2562) (72f1558)
    • Update build_go_protos to use a consistent python path (#2550) (f136f8c)
    • Update data source timestamp inference error message to make sense (#2636) (3eaf6b7)
    • Update field api to add tag parameter corresponding to labels in Feature. (#2610) (689d20b)
    • Update java integration tests and add more logging (#2637) (10e23b4)
    • Update on demand feature view api (#2587) (38cd7f9)
    • Update RedisCluster to use redis-py official implementation (#2554) (ce5606f)
    • Use cwd when getting module path (#2577) (b550e59)
    • Use ParquetDataset for Schema Inference (#2686) (4f85e3e)
    • Use timestamp type when converting unixtimestamp feature type to arrow (#2593) (c439611)
  • v0.20.2(Apr 28, 2022)

    0.20.2 (2022-04-28)

    Bug Fixes

    • Feature with timestamp type is incorrectly interpreted by Go FS (#2588) (3ec943a)
    • Fix AWS bootstrap template (#2604) (6df5a49)
    • Fix broken proto conversion methods for data sources (#2603) (c391216)
    • Remove ci extra from the feature transformation server dockerfile (#2618) (a7437fa)
    • Update field api to add tag parameter corresponding to labels in Feature. (#2610) (40962fc)
    • Use timestamp type when converting unixtimestamp feature type to arrow (#2593) (a1c3ee3)
  • v0.20.1(Apr 20, 2022)

    0.20.1 (2022-04-20)

    Bug Fixes

    • Addresses ZeroDivisionError when materializing file source with same timestamps (#2551) (5539c51)
    • Build platform specific python packages with ci-build-wheel (#2555) (1757639)
    • Enforce kw args featureservice (#2575) (4dce254)
    • Enforce kw args in datasources (#2567) (6374634)
    • Fix __hash__ methods (#2556) (dd8b854)
    • Fix DynamoDB fetches when there are entities that are not found (#2573) (882328f)
    • Fix push sources and add docs / tests pushing via the python feature server (#2561) (c5006c2)
    • Fixed data mapping errors for Snowflake (#2558) (abd6be7)
    • Small typo in CLI (#2578) (8717bc8)
    • Switch from join_key to join_keys in tests and docs (#2580) (6130b80)
    • Update build_go_protos to use a consistent python path (#2550) (1c523bf)
    • Update RedisCluster to use redis-py official implementation (#2554) (c47fa2a)
    • Use cwd when getting module path (#2577) (28752f2)
  • v0.20.0(Apr 14, 2022)

    0.20.0 (2022-04-14)

    Highlights

    We are delighted to announce the release of Feast 0.20, which introduces many new features and enhancements:

    • High performance Python feature serving (through embedding Go and optimized DynamoDB batch gets)
    • Many connector improvements and bug fixes (DynamoDB, Snowflake, Spark, Trino)
      • Note: Trino has been officially bundled into Feast. You can now run this with pip install feast[trino]!
    • Graduated alpha features (python feature server + push features)
    • Feast API changes
    • [Experimental] Feast UI as an importable npm module

    Detailed changelog:

    Bug Fixes

    • Add inlined data sources to the top level registry (#2456) (356788a)
    • Add new value types to types.ts for web ui (#2463) (ad5694e)
    • Add PushSource proto and Python class (#2428) (9a4bd63)
    • Add spark to lambda dockerfile (#2480) (514666f)
    • Added private_key auth for Snowflake (#2508) (c42c9b0)
    • Added Redshift and Spark typecheck to data_source event_timestamp_col inference (#2389) (04dea73)
    • Building of go extension fails (#2448) (7d1efd5)
    • Bump the number of versions bumps expected to 27 (#2549) (ecc9938)
    • Create init files for the proto-generated python dirs (#2410) (e17028d)
    • Don't prevent apply from running given duplicate empty names in data sources. Also fix repeated apply of Spark data source. (#2415) (b95f441)
    • Dynamodb deduplicate batch write request by partition keys (#2515) (70d4a13)
    • Ensure that init files exist in proto dirs (#2433) (9b94f7b)
    • Fix DataSource constructor to unbreak custom data sources (#2492) (712653e)
    • Fix default feast apply path without any extras (#2373) (6ba7fc7)
    • Fix definitions.py with new definition (#2541) (eefc34a)
    • Fix entity row to use join key instead of name (#2521) (c22fa2c)
    • Fix Java Master (#2499) (e083458)
    • Fix registry proto (#2435) (ea6a9b2)
    • Fix some inconsistencies in the docs and comments in the code (#2444) (ad008bf)
    • Fix spark docs (#2382) (d4a606a)
    • Fix Spark template to work correctly on feast init -t spark (#2393) (ae133fd)
    • Fix the feature repo fixture used by java tests (#2469) (32e925e)
    • Fix unhashable Snowflake and Redshift sources (cd8f1c9)
    • Fixed bug in passing config file params to snowflake python connector (#2503) (34f2b59)
    • Fixing Spark template to include source name (#2381) (a985f1d)
    • Make name a keyword arg for the Entity class (#2467) (43847de)
    • Making a name for data sources not a breaking change (#2379) (71d7ae2)
    • Minor link fix in CONTRIBUTING.md (#2481) (2917e27)
    • Preserve ordering of features in _get_column_names (#2457) (495b435)
    • Relax click python requirement to >=7 (#2450) (f202f92)
    • Remove date partition column field from datasources that don't s… (#2478) (ce35835)
    • Remove docker step from unit test workflow (#2535) (6f22f22)
    • Remove spark from the AWS Lambda dockerfile (#2498) (6abae16)
    • Request data api update (#2488) (0c9e5b7)
    • Schema update (#2509) (cf7bbc2)
    • Simplify DataSource.from_proto logic (#2424) (6bda4d2)
    • Snowflake api update (#2487) (1181a9e)
    • Support passing batch source to streaming sources for backfills (#2523) (90db1d1)
    • Timestamp update (#2486) (bf23111)
    • Typos in Feast UI error message (#2432) (e14369d)
    • Update feature view APIs to prefer keyword args (#2472) (7c19cf7)
    • Update file api (#2470) (83a11c6)
    • Update Makefile to cd into python dir before running commands (#2437) (ca32155)
    • Update redshift api (#2479) (4fa73a9)
    • Update some fields optional in UI parser (#2380) (cff7ac3)
    • Use a single version of jackson libraries and upgrade to 2.12.6.1 (#2473) (5be1cc6)
    • Use dateutil parser to parse materialization times (#2464) (6c55e49)
    • Use the correct dockerhub image tag when building feature servers (#2372) (0d62c1d)

    Features

  • v0.19.4(Apr 6, 2022)

  • v0.19.3(Mar 9, 2022)

  • v0.19.2(Mar 6, 2022)

  • v0.19.1(Mar 5, 2022)

  • v0.19.0(Mar 5, 2022)

    0.19.0 (2022-03-05)

    Bug Fixes

    • Added additional value types to UI parser and removed references to registry-bq.json (#2361) (d202d51)
    • Fix Redshift bug that stops waiting on statements after 5 minutes (#2363) (74f887f)
    • Method _should_use_plan only returns true for local sqlite provider (#2344) (fdb5f21)
    • Remove redis service to prevent more conflicts and add redis node to master_only (#2354) (993616f)
    • Rollback Redis-py to Redis-py-cluster (#2347) (1ba86fb)
    • Update github workflow to prevent redis from overlapping ports. (#2350) (c2a6c6c)

    Features

    • Add owner field to Entity and rename labels to tags (412d625)
    • Allow all snowflake python connector connection methods to be available to Feast (#2356) (ec7385c)
    • Allowing password based authentication and SSL for Redis in Java feature server (0af8adb)
    • Event timestamps response (#2355) (5481caf)
    • Feast Spark Offline Store (#2349) (98b8d8d)
    • Initial merge of Web UI logic (#2352) (ce3bc59)
    • Key ttl setting for redis online store (#2341) (236a108)
    • Metadata changes & making data sources top level objects to power Feast UI (#2336) (43da230)
  • v0.18.1(Feb 15, 2022)

    Full Changelog: https://github.com/feast-dev/feast/compare/v0.18.0...v0.18.1

    Fixed bugs:

    • ODFVs raise a PerformanceWarning for very large sets of features #2293
    • Don't require snowflake to always be installed #2309 (judahrand)
    • podAnnotations Values in the feature-server chart #2304 (tpvasconcelos)
    • Fixing the Java helm charts and adding a demo tutorial on how to use them #2298 (adchia)
    • avoid using transactions on OSS Redis #2296 (DvirDukhan)
    • Include infra objects in registry dump and fix Infra's from_proto #2295 (adchia)
    • Expose snowflake credentials for unit testing #2288 (sfc-gh-madkins)
    • Fix flaky tests (test_online_store_cleanup & test_feature_get_online_features_types_match) #2276 (pyalex)

    Merged pull requests:

  • v0.18.0(Feb 5, 2022)

    Overview

    Today, we released Feast 0.18, with some major developments:

    • Snowflake offline store support has been merged into the main repo
    • Introduced saved datasets, which allow persisting data frames retrieved from offline stores
    • The first milestone of the Data Quality Monitoring project has been implemented. This enables defining expectation suites (using Great Expectations) and running them against training datasets
    • The Python feature server graduated from alpha status
    • Significant performance improvements have been achieved in both the Python and Java feature servers

    ✨ New Features:

    🔴 Fixed bugs:

    🔨 Merged pull requests:

  • v0.17.0(Jan 4, 2022)

    Overview

    Today, we released Feast 0.17, which includes:

    • an initial cut at feast plan (See RFC-030)
    • many optimizations for materialization and feature serving in both the Python and Java feature servers, especially with Redis as an online store.
    • a simplified Java server (without Spring Boot boilerplate)
    • a helm chart for deploying the Python feature server (as an alternative to deploying in AWS Lambda)
    • other bug fixes, including type conversion bugs and log4j patches
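
    The created / deleted / unchanged distinction that feast plan reports can be pictured as a set comparison between what is already registered and what the repo declares. The sketch below is only an illustration of that idea in plain Python; PlanDiff and plan_diff are hypothetical names, not part of the Feast API, and the real command compares full object definitions rather than names alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanDiff:
    """Hypothetical container for the three groups a plan step could report."""
    created: frozenset
    deleted: frozenset
    unchanged: frozenset


def plan_diff(registered: set, declared: set) -> PlanDiff:
    """Compare names already in the registry with names declared in the repo."""
    return PlanDiff(
        created=frozenset(declared - registered),    # declared but not yet registered
        deleted=frozenset(registered - declared),    # registered but no longer declared
        unchanged=frozenset(registered & declared),  # present on both sides
    )


# Illustrative object names only.
diff = plan_diff(
    registered={"driver_stats", "old_view"},
    declared={"driver_stats", "new_view"},
)
```

    Computing the diff without applying it is what lets a dry run show the pending changes before anything is written.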

    ✨ New Features:

    • Add feast-python-server helm chart #2177 (michelle-rascati-sp)
    • Add a feast plan command, and have CLI output differentiates between created, deleted and unchanged objects #2147 (achals)
    • Refactor tag methods to infer created, deleted, and kept repo objects #2142 (achals)
    • Pre compute the timestamp range for feature views #2103 (judahrand)

    🔴 Fixed bugs:

    🔨 Merged pull requests:

  • v0.16.1(Dec 11, 2021)

    Changelog

    v0.16.1 (2021-12-10)

    Full Changelog

    This was a quick patch release to pull in the log4j vulnerability fixes.

    Fixed bugs:

    Merged pull requests:

    • Updating lambda docker image to feature-server-python-aws #2130 (adchia)
    • Fix README to reflect new integration test suites #2124 (adchia)
    • Remove argument feature_refs #2115 (judahrand)
  • v0.16.0(Dec 8, 2021)

    Overview

    Today we are releasing Feast 0.16, which includes many bug fixes and optimizations.

    👥 Contributors

    Thanks to @achals, @adchia, @ArrichM, @aurobindoc, @casassg, @danilopeixoto, @felixwang9817, @judahrand, @mavysavydav, @olivierlabreche, @nossrannug, @ptoman-pa, @pyalex, @tsotnet, @ysk24ok for the contributions!

    ✨ New Features:

    • Install redis extra in AWS Lambda feature server & add hiredis depend… #2057 (tsotnet)
    • Support of GC and S3 storages for registry in Java Feature Server #2043 (pyalex)
    • Adding stream ingestion alpha documentation #2005 (adchia)

    🔴 Fixed bugs:

    • requested_features are not passed to online_read() from passthrough_provider #2106
    • feast apply broken with 0.15.* if the registry already exists #2086
    • Inconsistent logic with on_demand_feature_views #2072
    • requested_features is passed to online_read from passthrough_provider #2107 (aurobindoc)
    • Don't materialize FeatureViews where online is False #2101 (judahrand)
    • Have apply_total use the repo_config that's passed in as a parameter (makes it more compatible with custom wrapper code) #2099 (mavysavydav)
    • Do not attempt to compute ODFVs when there are no ODFVs #2090 (felixwang9817)
    • Duplicate feast apply bug #2087 (felixwang9817)
    • Add --host as an option for feast serve #2078 (nossrannug)
    • Fix feature server docker image tag generation in pr integration tests #2077 (tsotnet)
    • Fix ECR Image build on master branch #2076 (tsotnet)
    • Optimize memory usage during materialization #2073 (judahrand)
    • Fix unexpected feature view deletion when applying edited odfv #2054 (ArrichM)
    • Properly exclude entities from feature inference #2048 (mavysavydav)
    • Don't allow FeatureStore.apply with commit=False #2047 (nossrannug)
    • Fix bug causing OnDemandFeatureView.infer_features() to fail when the… #2046 (ArrichM)
    • Add missing comma in setup.py #2031 (achals)
    • Correct cleanup after usage e2e tests #2015 (pyalex)
    • Change Environment timestamps to be in UTC #2007 (felixwang9817)
    • get_online_features on demand transform bug fixes + local integration test mode #2004 (adchia)
    • Always pass full and partial feature names to ODFV #2003 (judahrand)
    • ODFV UDFs should handle list types #2002 (Agent007)
    • Update bq_to_feast_value_type with BOOLEAN type as a legacy sql data type #1996 (mavysavydav)
    • Fix bug where using some Pandas dtypes in the output of an ODFV fails #1994 (judahrand)
    • Fix duplicate update infra #1990 (felixwang9817)
    • Improve performance of _convert_arrow_to_proto #1984 (nossrannug)

    🔨 Merged pull requests:

    • Update FAQ #2118 (felixwang9817)
    • Move helm chart back to main repo #2113 (pyalex)
    • Set package long description encoding to UTF-8 #2111 (danilopeixoto)
    • Update release workflow to include new docker images #2108 (adchia)
    • Use the maintainers group in Codeowners instead of individuals #2102 (achals)
    • Remove tfx schema from ValueType #2098 (pyalex)
    • Add data source implementations to RTD docs #2097 (felixwang9817)
    • Updated feature view documentation to include blurb about feature inferencing #2096 (mavysavydav)
    • Fix integration test that is unstable due to incorrect materialization boundaries #2095 (pyalex)
    • Broaden google-cloud-core dependency #2094 (ptoman-pa)
    • Use pip-tools to lock versions of dependent packages #2093 (ysk24ok)
    • Fix typo in feature retrieval doc #2092 (olivierlabreche)
    • Fix typo in FeatureView example (doc) #2091 (olivierlabreche)
    • Use request.addfinalizer instead of the yield based approach in integ tests #2089 (achals)
    • Odfv logic #2088 (felixwang9817)
    • Refactor _convert_arrow_to_proto #2085 (judahrand)
    • Add github run id into the integration test projects for debugging #2069 (achals)
    • Fixing broken entity key link in quickstart #2068 (adchia)
    • Fix java_release workflow by removing step without users/with #2067 (achals)
    • Allow using cached registry when writing to the online store #2066 (achals)
    • Raise import error when repo configs module cannot be imported #2065 (felixwang9817)
    • Remove refs to tensorflow_metadata #2063 (achals)
    • Add detailed error messages for test_univerisal_e2e failures #2062 (achals)
    • Remove unused protos & deprecated java modules #2061 (pyalex)
    • Asynchronously refresh registry in transformation service #2060 (pyalex)
    • Fix GH workflow for docker build of java parts #2059 (pyalex)
    • Dedicated workflow for java PRs #2050 (pyalex)
    • Run java integration test with real google cloud and aws #2049 (pyalex)
    • Fixing typo enabling on_demand_transforms #2044 (ArrichM)
    • Make feast registry-dump print the whole registry as one json #2040 (nossrannug)
    • Remove tensorflow-metadata folders #2038 (casassg)
    • Update CHANGELOG for Feast v0.15.1 #2034 (felixwang9817)
    • Remove unsupported java parts #2029 (pyalex)
    • Fix checked out branch for PR docker image build workflow #2018 (tsotnet)
    • Extend "feast in production" page with description of java feature server #2017 (pyalex)
    • Remove duplicates in setup.py and run rudimentary verifications #2016 (achals)
    • Upload feature server docker image to ECR on approved PRs #2014 (tsotnet)
    • GitBook: [#1] Plugin standards documentation #2011 (felixwang9817)
    • Add changelog for v0.15.0 #2006 (adchia)
    • Add integration tests for AWS Lambda feature server #2001 (tsotnet)
    • Moving Feast Java back into main repo under java/ package #1997 (adchia)
    • Fix protobuf version conflict in [gcp] and [ci] packages #1992 (ysk24ok)
    • Improve aws lambda deployment (logging, idempotency, etc) #1985 (tsotnet)
    • Extend context for usage statistics collection & add latencies for performance analysis #1983 (pyalex)
  • v0.15.1(Nov 13, 2021)

    Fixed bugs:

    Merged pull requests:

    • Remove unsupported java parts #2029 (pyalex)
    • Fix checked out branch for PR docker image build workflow #2018 (tsotnet)
    • Remove duplicates in setup.py and run rudimentary verifications #2016 (achals)
    • Upload feature server docker image to ECR on approved PRs #2014 (tsotnet)
    • Add integration tests for AWS Lambda feature server #2001 (tsotnet)
    • Moving Feast Java back into main repo under java/ package #1997 (adchia)
  • v0.15.0(Nov 8, 2021)

    Overview

    Today we are releasing Feast 0.15, which includes performance improvements, bug fixes, and several features:

    1. [Experimental] Push based stream ingestion (docs): Feast now allows users to push features previously registered in a feature view to the online store. This most commonly would be done from a stream processing job (e.g. a Beam or Spark Streaming job).
    2. Entity aliasing (docs): This allows for use cases where the same entity has different column names in different source tables (e.g. there are "spammer", "reporter", and "user" tables that all refer to the same user entity).
    3. Feature Transformation Server: a server that executes on demand transformations. The existing feature server (e.g. deployed with AWS Lambda) executes on demand transformations already. This new server integrates with Feast Serving (Java server) for latency-sensitive use cases.
    4. An easy way to test offline/online store plugins using the existing Feast test suite. See docs for details.

    Experimental features are subject to API changes in the near future as we collect feedback. If you have thoughts, please don’t hesitate to reach out to the Feast team!
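
    The entity aliasing use case from item 2 can be pictured as renaming each table's column to one shared join key before features are combined. The following is a conceptual sketch in plain Python, not the Feast API; alias_entity_column and the sample rows are hypothetical and exist only to show the idea.

```python
def alias_entity_column(rows, source_column, join_key):
    """Return copies of the rows with source_column renamed to join_key."""
    renamed = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are not mutated
        new_row[join_key] = new_row.pop(source_column)
        renamed.append(new_row)
    return renamed


# Two hypothetical source tables that refer to the same user entity
# under different column names.
spammer_rows = [{"spammer": 7, "spam_reports": 3}]
reporter_rows = [{"reporter": 7, "reports_filed": 12}]

# After aliasing, both tables can be merged on the shared "user" key.
features_by_user = {}
for rows, column in ((spammer_rows, "spammer"), (reporter_rows, "reporter")):
    for row in alias_entity_column(rows, column, "user"):
        features_by_user.setdefault(row["user"], {}).update(row)
```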

    👥 Contributors

    Thanks to @achals, @adchia, @Agent007, @amommendes, @codyjlin, @DvirDukhan, @felixwang9817, @judahrand, @loftiskg, @mavysavydav, @MattDelac, @nossrannug, @pyalex, @qooba, @samuel100, @tsotnet, @vas28r13, and @ysk24ok for the contributions!

    ✨ New Features:

    • Feature transformation server docker image #1972 (felixwang9817)
    • eventtime check before writing features, use pipelines, ttl #1961 (vas28r13)
    • Plugin repo universal tests #1946 (felixwang9817)
    • direct data ingestion into Online store #1939 (vas28r13)
    • Add an interface for TransformationService and a basic implementation #1932 (achals)
    • Allows registering of features in request data as RequestFeatureView. Refactors common logic into a BaseFeatureView class #1931 (adchia)
    • Add final_output_feature_names in Query context to avoid SELECT * EXCEPT #1911 (MattDelac)
    • Add Dockerfile for GCP CloudRun FeatureServer #1887 (judahrand)

    🔴 Fixed bugs:

    • feast=0.14.0 query_generator() unecessary used twice #1978
    • get_online_features on demand transform bug fixes + local integration test mode #2004 (adchia)
    • Always pass full and partial feature names to ODFV #2003 (judahrand)
    • Update bq_to_feast_value_type with BOOLEAN type as a legacy sql data type #1996 (mavysavydav)
    • Fix bug where using some Pandas dtypes in the output of an ODFV fails #1994 (judahrand)
    • Fix duplicate update infra #1990 (felixwang9817)
    • Improve performance of _convert_arrow_to_proto #1984 (nossrannug)
    • Fix duplicate upload entity #1981 (achals)
    • fix redis cluster materialization #1968 (qooba)
    • Allow plugin repos to actually overwrite repo configs #1966 (felixwang9817)
    • Delete keys from Redis when tearing down online store #1965 (achals)
    • Fix issues with lint test and upgrade pip version #1964 (felixwang9817)
    • Move IntegrationTestRepoConfig class to another module #1962 (felixwang9817)
    • Solve package conflict in [gcp] and [ci] #1955 (ysk24ok)
    • Remove some paths from unit test cache #1944 (achals)
    • Fix bug in feast alpha enable CLI command #1940 (felixwang9817)
    • Fix conditional statements for if OnDemandFVs exist #1937 (codyjlin)
    • Fix __getitem__ return value for feature view and on-demand feature view #1936 (mavysavydav)
    • Corrected setup.py BigQuery version that's needed for a contributor's merged PR 1844 #1934 (mavysavydav)

    🔨 Merged pull requests:

  • v0.14.1(Oct 28, 2021)

    Fixed bugs:

    • Fix duplicate upload entity #1981 (achals)
    • Fix bug in feast alpha enable CLI command #1940 (felixwang9817)
    • Fix conditional statements for if OnDemandFVs exist #1937 (codyjlin)
    • Fix __getitem__ return value for feature view and on-demand feature view #1936 (mavysavydav)
    • Corrected setup.py BigQuery version that's needed for a contributor's merged PR 1844 #1934 (mavysavydav)

    Merged pull requests:

  • v0.14.0(Oct 20, 2021)

    Overview

    Today we are releasing Feast 0.14, which includes a new feature and several important improvements:

    1. [Experimental] AWS Lambda feature servers, which allow you to quickly deploy an HTTP server to serve online features on AWS Lambda. GCP Cloud Run and Java feature servers are coming soon! (see docs)
    2. Bug fixes around performance. The core online serving path is now significantly faster.
    3. Improvements for developer experience. The integration tests are now faster, and temporary tables created during integration tests are immediately dropped after the test.

    Experimental features are subject to API changes in the near future as we collect feedback. If you have thoughts, please don’t hesitate to reach out to the Feast team!

    👥 Contributors

    Thanks to @achals, @adchia, @Agent007, @DvirDukhan, @felixwang9817, @loftiskg, @mavysavydav, @samuel100, @tsotnet, and @ysk24ok for the contributions!

    ✨ New Features:

    • Rename FVProjection member functions to be more clear #1929 (mavysavydav)
    • Make serverless alpha feature #1928 (felixwang9817)
    • Feast endpoint #1927 (felixwang9817)
    • Add location to BigQueryOfflineStoreConfig #1921 (loftiskg)
    • Create & teardown Lambda & API Gateway resources for serverless feature server #1900 (tsotnet)
    • Hide FeatureViewProjections from user interface & have FeatureViews carry FVProjections that carries the modified info of the FeatureView #1899 (mavysavydav)
    • Upload docker image to ECR during feast apply #1877 (felixwang9817)
    • Added .with_name method in FeatureView/OnDemandFeatureView classes for name aliasing. FeatureViewProjection will hold this information #1872 (mavysavydav)

    🔴 Fixed bugs:

    • Update makefile to use pip installed dependencies #1920 (loftiskg)
    • Delete tables #1916 (felixwang9817)
    • Set a 5 minute limit for redshift statement execution #1915 (achals)
    • Use set when parsing repos to prevent duplicates #1913 (achals)
    • resolve environment variables in repo config #1909 (samuel100)
    • Respect specified ValueTypes for features during materialization #1906 (Agent007)
    • Fix issue with feature views being detected as duplicated when imported #1905 (achals)
    • Use contextvars to maintain a call stack during the usage calls #1882 (achals)

    🔨 Merged pull requests:

  • v0.13.0(Sep 22, 2021)

    Overview

    Today we are releasing Feast 0.13, which includes 3 new features:

    1. [Experimental] On demand feature views, which allow for consistently applied transformations in both training and online paths. This also introduces the concept of request data, which is data only available at the time of the prediction request, as potential inputs into these transformations (see docs)
    2. [Experimental] Python feature servers, which allow you to quickly deploy a local HTTP server to serve online features. Serverless deployments and Java feature servers are coming soon! (see docs)
    3. Feature views without entities, which allow you to specify features that should only be joined on event timestamps. You do not need lists of entities / entity values when defining and retrieving features from these feature views. (see docs)

    Experimental features are subject to API changes in the near future as we collect feedback. If you have thoughts, please don’t hesitate to reach out to the Feast team!
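
    The consistency property described in item 1 comes from running one and the same transformation in both the training path and the online path. A minimal sketch of that idea in plain Python follows; the function name, feature names, and request field are hypothetical and do not reflect the actual Feast decorator or signatures.

```python
def conv_rate_plus_request(features: dict, request: dict) -> dict:
    """Hypothetical transformation mixing a stored feature with request data."""
    return {"conv_rate_plus_val": features["conv_rate"] + request["val_to_add"]}


# Training path: the transformation runs over historical rows, each paired
# with the request data that was available at prediction time.
training_rows = [
    ({"conv_rate": 0.25}, {"val_to_add": 1}),
    ({"conv_rate": 0.5}, {"val_to_add": 2}),
]
training_output = [conv_rate_plus_request(f, r) for f, r in training_rows]

# Online path: the same function runs on a single live request, so the
# model sees identically derived values in both settings.
online_output = conv_rate_plus_request({"conv_rate": 0.5}, {"val_to_add": 2})
```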

    👥 Contributors

    Thanks to @achals, @adchia, @baineng, @codyjlin, @DvirDukhan, @felixwang9817, @GregKuhlmann, @guykhazma, @hamzakpt, @judahrand, @jdamji, @LarsKlingen, @MattDelac, @mavysavydav, @mmurdoch, @qooba, @tedhtchang, @samuel100, @tsotnet, and @WingCode for the contributions!

    💥 Breaking changes:

    • Enforce case-insensitively unique feature view names #1835 (codyjlin)
    • Add init to Provider contract #1796 (woop)

    ✨ New Features:

    • Add on demand feature view experimental docs #1880 (adchia)
    • Adding telemetry for on demand feature views and making existing usage calls async #1873 (adchia)
    • Read registry & config from env variables in AWS Lambda feature server #1870 (tsotnet)
    • Add feature server configuration for AWS lambda #1865 (felixwang9817)
    • Add MVP support for on demand transforms for AWS to_s3 and to_redshift #1856 (adchia)
    • Add MVP support for on demand transforms for bigquery #1855 (adchia)
    • Add arrow support for on demand feature views #1853 (adchia)
    • Support adding request data in on demand transforms #1851 (adchia)
    • Support on demand feature views in feature services #1849 (achals)
    • Infer features for on demand feature views, support multiple output features #1845 (achals)
    • Add Registry and CLI operations for on demand feature views #1828 (achals)
    • Implementing initial on demand transforms for historical retrieval to_df #1824 (adchia)
    • Registry store plugin #1812 (DvirDukhan)
    • Enable entityless featureviews #1804 (codyjlin)
    • Initial scaffolding for on demand feature view #1803 (adchia)
    • Add s3 support (with custom endpoints) #1789 (woop)
    • Local feature server implementation (HTTP endpoint) #1780 (tsotnet)

    🔴 Fixed bugs:

    • Array/list feature materialization in BQ crashes in type conversion #1839
    • Fixing odfv cli group description #1890 (adchia)
    • Fix list feature format for BigQuery offline datasources. #1889 (judahrand)
    • Add dill to main dependencies #1886 (judahrand)
    • Fix pytest_collection_modifyitems to select benchmark tests only #1874 (achals)
    • Add support for multiple entities in Redshift #1850 (felixwang9817)
    • Move apply(dummy_entity) to apply time to ensure it persists in FeatureStore #1848 (codyjlin)
    • Add schema parameter to RedshiftSource #1847 (felixwang9817)
    • Pass bigquery job object to get_job #1844 (LarsKlingen)
    • Simplify _python_value_to_proto_value by looking up values in a dict #1837 (achals)
    • Update historical retrieval integration test for on demand feature views #1836 (achals)
    • Fix flaky connection to redshift data API #1834 (achals)
    • Init registry during create_test_environment #1829 (achals)
    • Randomly generating new BQ dataset for offline_online_store_consistency test #1818 (adchia)
    • Ensure docstring tests always teardown #1817 (felixwang9817)
    • Use get_multi instead of get for datastore reads #1814 (achals)
    • Fix Redshift query for external tables #1810 (woop)
    • Use a random dataset and table name for simple_bq_source #1801 (achals)
    • Refactor Environment class and DataSourceCreator API, and use fixtures for datasets and data sources #1790 (achals)
    • Fix get_online_features telemetry to only log every 10000 times #1786 (felixwang9817)
    • Add a description field the Feature Service class and proto #1771 (achals)
    • Validate project name upon feast.apply #1766 (tedhtchang)

    🔨 Merged pull requests:

    • Add ValueType.NULL #1893 (judahrand)
    • Adding more details to the CONTRIBUTING.md #1888 (adchia)
    • Parse BQ DATETIME and TIMESTAMP #1885 (judahrand)
    • Add durations to list the slowest tests #1881 (achals)
    • Upload benchmark information to S3 after integration test runs #1878 (achals)
    • Refactor providers to remove duplicate implementations #1876 (achals)
    • Add Felix & Danny to code owners file #1869 (tsotnet)
    • Initial docker image for aws lambda feature server #1866 (tsotnet)
    • Add flags file to include experimental flags and test/usage flags #1864 (adchia)
    • Hookup pytest-benchmark to online retreival #1858 (achals)
    • Add feature server docs & small changes in local server #1852 (tsotnet)
    • Add roadmap to README.md #1843 (woop)
    • Enable the types test to run on all compatible environments #1840 (adchia)
    • Update reviewers/approvers to include Danny/Felix #1833 (adchia)
    • Fix wrong links in README #1832 (baineng)
    • Remove older offline/online consistency tests #1831 (achals)
    • Replace individual cli tests with parametrized tests #1830 (achals)
    • Reducing wait interval for BQ integration tests #1827 (adchia)
    • Reducing size of universal repo to decrease integration test time #1826 (adchia)
    • Refactor the datastore online_read method to be slightly more efficient #1819 (achals)
    • Remove old doc #1815 (achals)
    • Rename telemetry to usage #1800 (felixwang9817)
    • Updating quickstart colab to explain more concepts and highlight value prop of Feast #1799 (adchia)
    • Fix Azure Terraform installation. #1793 (mmurdoch)
    • Disable integration test reruns to identify flaky tests #1787 (achals)
    • Rerun failed python integration tests #1785 (achals)
    • Add Redis to the universal integration tests #1784 (achals)
    • Add online feature retrieval integration test using the universal repo #1783 (achals)
    • Fix wrong description in README.md #1779 (WingCode)
    • Clean up docstring tests #1778 (felixwang9817)
    • Add offline retrival integration tests using the universal repo #1769 (achals)
    • Adding initial type support related tests for BQ #1768 (adchia)
  • v0.12.1(Aug 20, 2021)

  • v0.12.0(Aug 6, 2021)

    Overview

    Today we are releasing Feast 0.12, which includes 2 new major features:

    1. A new AWS provider which uses DynamoDB for the online store and Redshift for the offline store.
    2. The concept of a FeatureService, which represents a logical group of features from one or more feature views.
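
    One way to picture the grouping described in item 2 is a named collection that flattens the features of several feature views into a single list of references. The classes below are hypothetical stand-ins written for illustration, not the Feast classes, and the "view:feature" reference string is used here only as an illustrative convention.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureViewSketch:
    """Hypothetical stand-in for a feature view: a name plus feature names."""
    name: str
    features: list


@dataclass
class FeatureServiceSketch:
    """Hypothetical stand-in for a logical group of feature views."""
    name: str
    views: list = field(default_factory=list)

    def feature_refs(self):
        """Flatten every view's features into 'view:feature' strings."""
        return [
            f"{view.name}:{feature}"
            for view in self.views
            for feature in view.features
        ]


driver_stats = FeatureViewSketch("driver_stats", ["conv_rate", "acc_rate"])
customer_profile = FeatureViewSketch("customer_profile", ["lifetime_trips"])
service = FeatureServiceSketch("ranking_model_v1", [driver_stats, customer_profile])
refs = service.feature_refs()
```

    Grouping by service means a model can name one object instead of repeating the full feature list at every retrieval call.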

    👥 Contributors

    Thanks to @achals, @adchia, @charliec443, @codyjlin, @DvirDukhan, @felixwang9817, @GregKuhlmann, @MattDelac, @mavysavydav, @Mwad22, @nels, @potatochip, @szalai1, @tedhtchang and @tsotnet for the contributions!

    💥 Breaking changes:

    • Set default feature naming to not include feature view name. Add option to include feature view name in feature naming. #1641 (Mwad22)

    ✨ New Features:

    • AWS Template improvements (input prompt for configs, default to Redshift) #1731 (tsotnet)
    • Clean up uploaded entities in Redshift offline store #1730 (tsotnet)
    • Implement Redshift historical retrieval #1720 (tsotnet)
    • Add custom data sources #1713 (achals)
    • Added --skip-source-validation flag to feast apply #1702 (mavysavydav)
    • Allow specifying FeatureServices in FeatureStore methods #1691 (achals)
    • Implement materialization for RedshiftOfflineStore & RedshiftRetrievalJob #1680 (tsotnet)
    • Add FeatureService proto definition #1676 (achals)
    • Add RedshiftDataSource #1669 (tsotnet)
    • Add streaming sources to the FeatureView API #1664 (achals)
    • Add to_table() to RetrievalJob object #1663 (MattDelac)
    • Provide the user with more options for setting the to_bigquery config #1661 (codyjlin)

    🔴 Fixed bugs:

    • Fix feast apply bugs #1754 (tsotnet)
    • Teardown integration tests resources for aws #1740 (achals)
    • Fix GCS version #1732 (potatochip)
    • Fix unit test warnings related to file_url #1726 (tedhtchang)
    • Refactor data source classes to fix import issues #1723 (achals)
    • Append ns time and random integer to redshift test tables #1716 (achals)
    • Add randomness to bigquery table name #1711 (felixwang9817)
    • Fix dry_run bug that was making to_bigquery hang indefinitely #1706 (codyjlin)
    • Stringify WhichOneof to make mypy happy #1705 (achals)
    • Update redis options parsing #1704 (DvirDukhan)
    • Cancel BigQuery job if block_until_done call times out or is interrupted #1699 (codyjlin)
    • Teardown infrastructure after integration tests #1697 (achals)
    • Fix unit tests that got broken by Pandas 1.3.0 release #1683 (tsotnet)
    • Remove default list from the FeatureView constructor #1679 (achals)
    • BQ exception should be raised first before we check the timedout #1675 (MattDelac)
    • Allow strings for online/offline store instead of dicts #1673 (achals)
    • Cancel BigQuery job if timeout hits #1672 (MattDelac)
    • Make sure FeatureViews with same name can not be applied at the same … #1651 (tedhtchang)

    🔨 Merged pull requests:

  • v0.11.0(Jun 24, 2021)

    Overview

    Feast 0.11 is here! This is the first release after the major changes introduced in Feast 0.10. We’ve focused on two areas in particular:

    1. Introducing a new online store, Redis, which supports feature serving at high throughput and low latency.
    2. Improving the Feast user experience through reduced boilerplate, smoother workflows, and improved error messages. A key addition here is the introduction of feature inferencing, which allows Feast to dynamically discover data schemas in data sources.
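
    Feature inferencing, as described in item 2, amounts to reading a data source's columns and deriving the feature schema from them instead of asking the user to declare each feature. The sketch below shows that idea on in-memory rows; infer_feature_schema is a hypothetical helper, not a Feast function, and the type names are Python's rather than Feast value types.

```python
def infer_feature_schema(rows, entity_columns, timestamp_column):
    """Infer {column: type name} from sample rows, skipping non-feature columns."""
    skip = set(entity_columns) | {timestamp_column}
    schema = {}
    for row in rows:
        for column, value in row.items():
            if column in skip or value is None:
                continue  # entity keys, timestamps, and nulls carry no feature type
            inferred = type(value).__name__
            previous = schema.setdefault(column, inferred)
            if previous != inferred:
                raise TypeError(
                    f"column {column!r} has mixed types: {previous} and {inferred}"
                )
    return schema


# Hypothetical sample rows from a data source.
sample_rows = [
    {"driver_id": 1, "event_timestamp": "2021-06-24T00:00:00", "conv_rate": 0.5, "trips": 10},
    {"driver_id": 2, "event_timestamp": "2021-06-24T01:00:00", "conv_rate": 0.75, "trips": None},
]
schema = infer_feature_schema(sample_rows, ["driver_id"], "event_timestamp")
```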

    👥 Contributors

    Thanks to @MattDelac, @vtao2, @tsotnet, @woop, @achals, @tedhtchang, @qooba, @codyjlin, @jklegar, @rightx2, @kevinhu, @shihabuddinbuet, @jongillham and @szalai1 for the contributions!

    ✨ New Features:

    🔴 Fixed bugs:

    • Schema Inferencing should happen at apply time #1646 (mavysavydav)
    • Don't use .result() in BigQueryOfflineStore, since it still leads to OOM #1642 (tsotnet)
    • Don't load entire bigquery query results in memory #1638 (tsotnet)
    • Remove file loader & its test #1632 (tsotnet)
    • Provide descriptive error on invalid table reference #1627 (codyjlin)
    • Fix ttl duration when ttl is None #1624 (MattDelac)
    • Fix race condition in historical e2e tests #1620 (woop)
    • Add validations when materializing from file sources #1615 (achals)
    • Add entity column validations when getting historical features from bigquery #1614 (achals)
    • Allow telemetry configuration to fail gracefully #1612 (achals)
    • Update type conversion from pandas to timestamp to support various the timestamp types #1603 (achals)
    • Add current directory in sys path for CLI commands that might depend on custom providers #1594 (MattDelac)
    • Fix contention issue #1582 (woop)
    • Ensure that only None types fail predicate #1580 (woop)
    • Don't create bigquery dataset if it already exists #1569 (tsotnet)
    • Don't lose materialization interval tracking when re-applying feature views #1559 (jklegar)
    • Validate project and repo names for apply and init commands #1558 (tedhtchang)
    • Bump supported Python version to 3.7 #1504 (tsotnet)

    🔨 Merged pull requests:

    • Rename telemetry to usage #1660 (tsotnet)
    • Refactor OfflineStoreConfig classes into their owning modules #1657 (achals)
    • Run python unit tests in parallel #1652 (achals)
    • Refactor OnlineStoreConfig classes into owning modules #1649 (achals)
    • Fix table_refs in BigQuerySource definitions #1644 (tsotnet)
    • Make test historical retrieval longer #1630 (MattDelac)
    • Fix failing historical retrieval assertion #1622 (woop)
    • Add a specific error for missing columns during materialization #1619 (achals)
    • Use drop_duplicates() instead of groupby (about 1.5~2x faster) #1617 (rightx2)
    • Optimize historical retrieval with BigQuery offline store #1602 (MattDelac)
    • Use CONCAT() instead of ROW_NUMBER() #1601 (MattDelac)
    • Minor doc fix in the code snippet: Fix to reference the right instance for the retrieved job instance object #1599 (dmatrix)
    • Repo and project names should not start with an underscore #1597 (tedhtchang)
    • Append nanoseconds to dataset name in test_historical_retrival to prevent tests stomping over each other #1593 (achals)
    • Make start and end timestamps tz aware in the CLI #1590 (achals)
    • Bump fastavro version #1585 (kevinhu)
    • Change OfflineStore class description #1571 (tedhtchang)
    • Fix Sphinx documentation building #1563 (woop)
    • Add test coverage and remove MacOS integration tests #1562 (woop)
    • Improve GCP exception handling #1561 (woop)
    • Update default cli no option help message #1550 (tedhtchang)
    • Add opt-out exception logging telemetry #1535 (jklegar)
    • Add instructions for installing Feast on IKS and OpenShift using Kustomize #1534 (tedhtchang)
    • BigQuery type to Feast type conversion chart update #1530 (mavysavydav)
    • Remove unnecessary path join in setup.py #1529 (shihabuddinbuet)
    • Add roadmap to documentation #1528 (woop)
    • Add test matrix for different Python versions #1526 (woop)
    • Update broken urls in the github pr template file #1521 (tedhtchang)
    • Add a fixed timestamp to quickstart data #1513 (jklegar)
    • Make gcp imports optional #1512 (jklegar)
    • Fix documentation inconsistency #1510 (jongillham)
    • Upgrade grpcio version in python SDK #1508 (szalai1)
    • pre-commit command typo fix in CONTRIBUTING.md #1506 (mavysavydav)
    • Add optional telemetry to other CLI commands #1505 (jklegar)
  • v0.10.8 (Jun 17, 2021)

    Implemented enhancements:

    • Add to_bigquery() function to BigQueryRetrievalJob #1634 (vtao2)

    Fixed bugs:

    • Don't use .result() in BigQueryOfflineStore, since it still leads to OOM #1642 (tsotnet)
    • Don't load entire bigquery query results in memory #1638 (tsotnet)
    • Add entity column validations when getting historical features from bigquery #1614 (achals)

    Merged pull requests:

    • Make test historical retrieval longer #1630 (MattDelac)
    • Fix failing historical retrieval assertion #1622 (woop)
    • Optimize historical retrieval with BigQuery offline store #1602 (MattDelac)
  • v0.10.7 (Jun 7, 2021)

    Fixed bugs:

    • Fix race condition in historical e2e tests #1620 (woop)

    Merged pull requests:

    • Use drop_duplicates() instead of groupby (about 1.5~2x faster) #1617 (rightx2)
    • Use CONCAT() instead of ROW_NUMBER() #1601 (MattDelac)
    • Minor doc fix in the code snippet: Fix to reference the right instance for the retrieved job instance object #1599 (dmatrix)
    • Append nanoseconds to dataset name in test_historical_retrival to prevent tests stomping over each other #1593 (achals)
    • Make start and end timestamps tz aware in the CLI #1590 (achals)