[FEATURE] Allow passing extraction model via ExtractionConfig instead of env var #130

Currently, the only ways to set the extraction LLM are:
  1. Set the EXTRACTION_MODEL env var before the config singleton lazily initializes
  2. Set GraphRAGConfig.extraction_llm directly (undocumented; reaches into internals)

  For multi-tenant pipelines where each tenant uses a different model, neither approach is clean. Setting env vars as a side effect before calling extract_and_build() is fragile: it relies on the singleton not having been initialized yet. Mutating global state between tenants would also only work if _extraction_llm were reset between calls. It isn't, so switching models between tenants is actually broken.
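To illustrate the failure mode, here is a minimal sketch that does not use the library itself. `_FakeConfig` and `model_for_tenant` are hypothetical stand-ins that assume the behaviour described above: the value is read from EXTRACTION_MODEL on first access and then cached without ever being reset.

```python
import os


class _FakeConfig:
    """Hypothetical stand-in for a lazily initialized config singleton."""

    def __init__(self):
        # Assumption: the value is cached on first access and never reset.
        self._extraction_llm = None

    @property
    def extraction_llm(self):
        if self._extraction_llm is None:
            # The env var is only read here, on the first access.
            self._extraction_llm = os.environ.get('EXTRACTION_MODEL', 'default-model')
        return self._extraction_llm


config = _FakeConfig()


def model_for_tenant(model_id):
    """Mimic the workaround: set the env var, then read the config."""
    os.environ['EXTRACTION_MODEL'] = model_id
    return config.extraction_llm


first = model_for_tenant('model-a')   # first access, env var is honoured
second = model_for_tenant('model-b')  # cached value wins, env var is ignored
```

Under these assumptions the second tenant silently gets the first tenant's model, which matches the broken behaviour described above.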

  Suggestion:

  Accept extraction_llm (model ID string or LLM instance) as a parameter on ExtractionConfig or IndexingConfig, and have it take precedence over the env var / default:

  extraction_config = ExtractionConfig(
      extraction_llm='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
      preferred_entity_classifications=classifications,
  )

  This keeps configuration explicit and co-located, avoids global state mutation, and supports multi-tenant use cases naturally.
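A rough sketch of the precedence this would imply, not a proposal for the actual implementation. `ExtractionConfigSketch`, `resolve_extraction_llm`, and `DEFAULT_EXTRACTION_MODEL` are hypothetical names used only for illustration; the real ExtractionConfig has other fields and the real default model is whatever the library uses today.

```python
import os
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical placeholder; the library's real default is not assumed here.
DEFAULT_EXTRACTION_MODEL = 'placeholder-default-model'


@dataclass
class ExtractionConfigSketch:
    """Illustrative stand-in for ExtractionConfig with the proposed field."""
    extraction_llm: Optional[Any] = None  # model ID string or LLM instance


def resolve_extraction_llm(config, env=None):
    """Return the explicit value if set, else the env var, else the default."""
    env = os.environ if env is None else env
    if config.extraction_llm is not None:
        return config.extraction_llm
    return env.get('EXTRACTION_MODEL', DEFAULT_EXTRACTION_MODEL)


# Explicit per-tenant config wins even when the env var is set.
tenant_a = resolve_extraction_llm(
    ExtractionConfigSketch(extraction_llm='tenant-a-model'),
    env={'EXTRACTION_MODEL': 'env-model'},
)
# Without an explicit value, the env var is used as before.
fallback = resolve_extraction_llm(
    ExtractionConfigSketch(),
    env={'EXTRACTION_MODEL': 'env-model'},
)
```

Resolving the model per call from the config object, rather than caching it globally, is what would let two tenants run with different models in the same process.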

  Current workaround:

  # Fragile: only takes effect if set before the config singleton first initializes
  os.environ['EXTRACTION_MODEL'] = tenant.model
