| 1 | +# Couchbase Columnar offline store (contrib) |
| 2 | + |
| 3 | +## Description |
| 4 | + |
| 5 | +The Couchbase Columnar offline store provides support for reading [CouchbaseColumnarSources](../data-sources/couchbase.md). **Note that Couchbase Columnar is available through [Couchbase Capella](https://cloud.couchbase.com/).** |
| 6 | +* Entity dataframes can be provided as either a SQL++ query or a Pandas dataframe. A Pandas dataframe will be uploaded to Couchbase Capella Columnar as a collection. |
| 7 | + |
| 8 | +## Disclaimer |
| 9 | + |
| 10 | +The Couchbase Columnar offline store does not achieve full test coverage. |
| 11 | +Please do not assume complete stability. |
| 12 | + |
| 13 | +## Getting started |
| 14 | + |
| 15 | +To use this offline store, run `pip install 'feast[couchbase]'`. You can then get started by running `feast init -t couchbase`. |
| 16 | + |
| 17 | +To get started with Couchbase Capella Columnar: |
| 18 | +1. Sign up for a [Couchbase Capella](https://cloud.couchbase.com/) account |
| 19 | +2. [Deploy a Columnar cluster](https://docs.couchbase.com/columnar/admin/prepare-project.html) |
| 20 | +3. [Create an Access Control Account](https://docs.couchbase.com/columnar/admin/auth/auth-data.html) |
| 21 | + - This account should be able to read and write. |
| 22 | + - For testing purposes, it is recommended to assign all roles to avoid any permission issues. |
| 23 | +4. [Configure allowed IP addresses](https://docs.couchbase.com/columnar/admin/ip-allowed-list.html) |
| 24 | + - You must allow the IP address of the machine running Feast. |
| 25 | + |
| 26 | + |
| 27 | +## Example |
| 28 | + |
| 29 | +{% code title="feature_store.yaml" %} |
| 30 | +```yaml |
| 31 | +project: my_project |
| 32 | +registry: data/registry.db |
| 33 | +provider: local |
| 34 | +offline_store: |
| 35 | + type: couchbase.offline |
| 36 | + connection_string: COUCHBASE_COLUMNAR_CONNECTION_STRING # Copied from Settings > Connection String page in Capella Columnar console, starts with couchbases:// |
| 37 | + user: COUCHBASE_COLUMNAR_USER # Couchbase cluster access name from Settings > Access Control page in Capella Columnar console |
| 38 | + password: COUCHBASE_COLUMNAR_PASSWORD # Couchbase password from Settings > Access Control page in Capella Columnar console |
| 39 | + timeout: 120 # Timeout in seconds for Columnar operations, optional |
| 40 | +online_store: |
| 41 | + path: data/online_store.db |
| 42 | +``` |
| 43 | +{% endcode %} |
| 44 | + |
| 45 | +Note that `timeout` is an optional parameter. |
| 46 | +The full set of configuration options is available in [CouchbaseColumnarOfflineStoreConfig](https://rtd.feast.dev/en/master/#feast.infra.offline_stores.contrib.couchbase_offline_store.couchbase.CouchbaseColumnarOfflineStoreConfig). |
| 47 | + |
| 48 | + |
| 49 | +## Functionality Matrix |
| 50 | + |
| 51 | +The set of functionality supported by offline stores is described in detail [here](overview.md#functionality). |
| 52 | +Below is a matrix indicating which functionality is supported by the Couchbase Columnar offline store. |
| 53 | + |
| 54 | +| | Couchbase Columnar | |
| 55 | +| :----------------------------------------------------------------- |:-------------------| |
| 56 | +| `get_historical_features` (point-in-time correct join) | yes | |
| 57 | +| `pull_latest_from_table_or_query` (retrieve latest feature values) | yes | |
| 58 | +| `pull_all_from_table_or_query` (retrieve a saved dataset) | yes | |
| 59 | +| `offline_write_batch` (persist dataframes to offline store) | no | |
| 60 | +| `write_logged_features` (persist logged features to offline store) | no | |
| 61 | + |
| 62 | +Below is a matrix indicating which functionality is supported by `CouchbaseColumnarRetrievalJob`. |
| 63 | + |
| 64 | +| | Couchbase Columnar | |
| 65 | +| ----------------------------------------------------- |--------------------| |
| 66 | +| export to dataframe | yes | |
| 67 | +| export to arrow table | yes | |
| 68 | +| export to arrow batches | no | |
| 69 | +| export to SQL | yes | |
| 70 | +| export to data lake (S3, GCS, etc.) | yes | |
| 71 | +| export to data warehouse | yes | |
| 72 | +| export as Spark dataframe | no | |
| 73 | +| local execution of Python-based on-demand transforms | yes | |
| 74 | +| remote execution of Python-based on-demand transforms | no | |
| 75 | +| persist results in the offline store | yes | |
| 76 | +| preview the query plan before execution | yes | |
| 77 | +| read partitioned data | yes | |
| 78 | + |
| 79 | +To compare this set of functionality against other offline stores, please see the full [functionality matrix](overview.md#functionality-matrix). |
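
The `feature_store.yaml` example above uses placeholders such as `COUCHBASE_COLUMNAR_CONNECTION_STRING` for the connection details. One way to keep real credentials out of version control is to render the file from environment variables before running Feast. The following is a minimal sketch using only the Python standard library; treating the placeholder names as environment variable names, and the helper `render_feature_store_yaml` itself, are assumptions for illustration and not part of Feast or the Couchbase SDK.

```python
import os
from string import Template

# Template mirroring the feature_store.yaml example in this document.
# The ${...} names are assumed to be environment variables (hypothetical convention).
FEATURE_STORE_TEMPLATE = Template("""\
project: my_project
registry: data/registry.db
provider: local
offline_store:
  type: couchbase.offline
  connection_string: ${COUCHBASE_COLUMNAR_CONNECTION_STRING}
  user: ${COUCHBASE_COLUMNAR_USER}
  password: ${COUCHBASE_COLUMNAR_PASSWORD}
  timeout: 120
online_store:
  path: data/online_store.db
""")

REQUIRED_VARS = (
    "COUCHBASE_COLUMNAR_CONNECTION_STRING",
    "COUCHBASE_COLUMNAR_USER",
    "COUCHBASE_COLUMNAR_PASSWORD",
)


def render_feature_store_yaml(env=None):
    """Return feature_store.yaml text with connection details filled in from env."""
    env = os.environ if env is None else env

    # Fail early with a clear message if any value is missing or empty.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))

    # The Capella Columnar connection string is described as starting with couchbases://
    if not env["COUCHBASE_COLUMNAR_CONNECTION_STRING"].startswith("couchbases://"):
        raise ValueError("connection string is expected to start with couchbases://")

    # Values are inserted verbatim; a value containing YAML-special characters
    # (for example '#' or ': ') would need quoting, which this sketch does not do.
    return FEATURE_STORE_TEMPLATE.substitute(env)
```

Writing the returned text to `feature_store.yaml` in the repository directory (for example with `pathlib.Path("feature_store.yaml").write_text(...)`) should then give Feast a configuration equivalent to the example, with `timeout` left at the value shown there.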