Pipeline Run Has No Way to Bypass Cache and Re-Query BigQuery
### Context
The pipeline run command caches query results locally to avoid hitting BigQuery on every invocation. This saves both time and money. However, once data is cached, there is currently no way to force a re-query from the CLI — even when upstream data has changed or the existing cache is known to be stale.
TODO(bassosimone): add support for -f/--force to bypass cache

### Where the Cache Skips Happen
There are two separate locations where the pipeline short-circuits on cached data.
- `sync_mlab()` in iqb_pipeline.py

```python
with entry.lock():
    if not entry.exists():
        entry.sync()
```
If data.parquet and stats.json exist on disk, `entry.exists()` returns True and `sync()` is never called at all: no syncer runs, no BigQuery query, nothing.
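One way the first check could honor a force flag is sketched below. The `force` parameter on `sync_mlab()` is the proposed addition, and `FakeEntry` is an illustrative stand-in for the real cache entry class, not code from the repository:

```python
from contextlib import contextmanager

class FakeEntry:
    """Minimal stand-in for the real cache entry, for illustration only."""

    def __init__(self, cached: bool):
        self.cached = cached
        self.synced = False

    @contextmanager
    def lock(self):
        # The real implementation takes a filesystem lock here.
        yield

    def exists(self) -> bool:
        return self.cached

    def sync(self) -> None:
        self.synced = True

def sync_mlab(entry, force: bool = False) -> None:
    with entry.lock():
        # With --force, skip the existence check and always re-sync,
        # even if data.parquet and stats.json are already on disk.
        if force or not entry.exists():
            entry.sync()

cached = FakeEntry(cached=True)
sync_mlab(cached)               # cache hit: sync() never runs
sync_mlab(cached, force=True)   # force bypasses the cache check
```

Threading the flag through as a parameter keeps the default behavior (and all existing call sites) unchanged.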
- `_bq_syncer()` in pipeline.py:126

```python
def _bq_syncer(self, entry: PipelineCacheEntry) -> bool:
    if entry.exists():
        log.info("querying for %s... skipped (cached)", entry)
        return True
```

Even if `sync()` is somehow called, the BigQuery syncer itself bails out early when the files exist. This is a second layer of caching that independently prevents re-querying. Both checks are necessary for normal operation, but both need to be bypassed when force is used.
### Existing Precedent
The `iqb cache pull` command already implements a `-f/--force` flag (in cache_pull.py) that re-downloads files when their hashes mismatch.
Adding the same flag to `pipeline run` would maintain CLI consistency.
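A rough sketch of the flag wiring, using argparse for illustration (the actual iqb CLI framework and subcommand layout may differ):

```python
import argparse

# Hypothetical wiring: mirrors the existing -f/--force on `cache pull`.
parser = argparse.ArgumentParser(prog="iqb pipeline")
sub = parser.add_subparsers(dest="command")

run = sub.add_parser("run", help="run the pipeline")
run.add_argument(
    "-f", "--force",
    action="store_true",
    help="bypass cached query results and re-query BigQuery",
)

# Both spellings enable the bypass; omitting the flag keeps the cache.
args = parser.parse_args(["run", "--force"])
```

The parsed `args.force` would then be passed down to both cache checks described above.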