Commit ecd0960

chambridge and claude authored
docs(catalog): add model_type custom property documentation and examples (kubeflow#1988)
Add comprehensive documentation for the model_type custom property that enables differentiating between predictive and generative AI models in the Model Registry catalog. This feature provides a standardized approach for model classification and filtering based on AI/ML paradigm.

Signed-off-by: Chris Hambridge <chambrid@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
1 parent 848d331 commit ecd0960

5 files changed

Lines changed: 266 additions & 4 deletions

catalog/README.md

Lines changed: 198 additions & 0 deletions
@@ -62,6 +62,204 @@ Artifact references:
}
```

## Custom Properties

Custom properties provide extensible metadata for models and artifacts beyond the predefined schema fields. They enable storing domain-specific metadata, classification tags, and arbitrary key-value data.

### Overview

Custom properties can be attached to:
- **CatalogModel**: Model-level metadata (e.g., model type, validation status)
- **CatalogModelArtifact**: Artifact-level metadata (e.g., validation date, deployment targets)
- **CatalogMetricsArtifact**: Metrics metadata (e.g., benchmark names, hardware configurations)

Each custom property consists of:
- **Key**: Property name (string)
- **Value**: Typed metadata value with one of the following types:
  - `MetadataStringValue`: String values
  - `MetadataIntValue`: Integer values
  - `MetadataDoubleValue`: Floating-point values
  - `MetadataBoolValue`: Boolean values
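
As a rough illustration of the key/typed-value structure described above, the following Python sketch wraps plain scalars in typed metadata values. The helper name `to_custom_property` is hypothetical, and the `bool_value` field name is an assumption made by analogy with `string_value`, `int_value`, and `double_value`, which appear in the examples in this README.

```python
from typing import Union

MetadataScalar = Union[str, int, float, bool]


def to_custom_property(value: MetadataScalar) -> dict:
    """Wrap a Python scalar in a typed metadata value (hypothetical helper)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        # "bool_value" is an assumed field name, not confirmed by this README.
        return {"metadataType": "MetadataBoolValue", "bool_value": value}
    if isinstance(value, int):
        # The hardware_count example in this README quotes its integer,
        # so the integer is emitted as a string here as well.
        return {"metadataType": "MetadataIntValue", "int_value": str(value)}
    if isinstance(value, float):
        return {"metadataType": "MetadataDoubleValue", "double_value": value}
    if isinstance(value, str):
        return {"metadataType": "MetadataStringValue", "string_value": value}
    raise TypeError(f"unsupported custom property type: {type(value).__name__}")


custom_properties = {
    "model_type": to_custom_property("generative"),
    "hardware_count": to_custom_property(2),
    "throughput_tps": to_custom_property(1105.4),
    "gpu_required": to_custom_property(True),  # illustrative key
}
```

Keeping the type selection in one place makes it harder to pair a value with the wrong `metadataType` by accident.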

### Model Type Property

The `model_type` custom property is a standardized property for categorizing models by their AI/ML paradigm. It enables filtering and governance based on model characteristics.

#### Specification

**Property Name**: `model_type`

**Metadata Type**: `MetadataStringValue`

**Allowed Values**:
- `predictive` - Traditional ML models (regression, classification, forecasting, clustering, etc.)
- `generative` - Generative AI models (LLMs, diffusion models, GANs, VAEs, etc.)
- `unknown` - Model type not yet determined or not applicable
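
A minimal sketch of how a catalog author might check a `model_type` property against this specification before publishing a catalog file. The function name `validate_model_type` is hypothetical, and the exact, case-sensitive comparison is an assumption rather than documented catalog behavior.

```python
ALLOWED_MODEL_TYPES = frozenset({"predictive", "generative", "unknown"})


def validate_model_type(prop: dict) -> str:
    """Return the model_type value if the property matches the spec, else raise."""
    if prop.get("metadataType") != "MetadataStringValue":
        raise ValueError("model_type must use metadataType MetadataStringValue")
    value = prop.get("string_value")
    # Assumption: values are compared exactly, so "Generative" is rejected.
    if value not in ALLOWED_MODEL_TYPES:
        allowed = ", ".join(sorted(ALLOWED_MODEL_TYPES))
        raise ValueError(f"model_type must be one of: {allowed}; got {value!r}")
    return value


validated_type = validate_model_type(
    {"metadataType": "MetadataStringValue", "string_value": "predictive"}
)
```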

#### Usage

Set `model_type` as a custom property on the model (`CatalogModel`) to indicate the model's category:

**YAML Format** (for YAML catalog sources):
```yaml
models:
  - name: my-regression-model
    description: Sales forecasting model
    customProperties:
      model_type:
        metadataType: MetadataStringValue
        string_value: "predictive"
    artifacts:
      - uri: oci://registry.example.com/models/sales-forecast:v1.0

  - name: my-llm-model
    description: Large language model for text generation
    customProperties:
      model_type:
        metadataType: MetadataStringValue
        string_value: "generative"
    artifacts:
      - uri: oci://registry.example.com/models/text-generator:v2.0
```

**REST API Response** (JSON):
```json
{
  "name": "my-regression-model",
  "description": "Sales forecasting model",
  "customProperties": {
    "model_type": {
      "metadataType": "MetadataStringValue",
      "string_value": "predictive"
    }
  }
}
```
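
On the client side, reading the property back out of a response like the one above could look like the sketch below. The helper `model_type_of` is hypothetical, and treating a missing property as `unknown` is an assumption made for this example, not documented API behavior.

```python
import json

response_body = """
{
  "name": "my-regression-model",
  "description": "Sales forecasting model",
  "customProperties": {
    "model_type": {
      "metadataType": "MetadataStringValue",
      "string_value": "predictive"
    }
  }
}
"""


def model_type_of(model: dict) -> str:
    """Return the model_type of a catalog model, or "unknown" if it is absent."""
    custom_properties = model.get("customProperties") or {}
    prop = custom_properties.get("model_type") or {}
    # Assumption: a model without the property is reported as "unknown".
    return prop.get("string_value", "unknown")


model = json.loads(response_body)
print(model_type_of(model))
```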

#### Model Type Classification Guide

**Predictive Models** (`predictive`):
- Regression models (linear, polynomial, etc.)
- Classification models (logistic regression, SVM, random forest, etc.)
- Time-series forecasting
- Clustering algorithms
- Anomaly detection
- Traditional neural networks (CNNs for classification, RNNs for prediction)
- Gradient boosting models (XGBoost, LightGBM, CatBoost)
- Recommendation systems (collaborative filtering)

**Generative Models** (`generative`):
- Large Language Models (LLMs) - GPT, BERT, Llama, etc.
- Text-to-image models - Stable Diffusion, DALL-E, etc.
- Generative Adversarial Networks (GANs)
- Variational Autoencoders (VAEs)
- Diffusion models
- Text-to-speech and speech-to-text models
- Code generation models
- Transformer-based generation models

**Unknown** (`unknown`):
- Hybrid models that combine both paradigms
- Experimental models under development
- Models where classification is not yet determined

### Querying and Filtering by Custom Properties

The `filterQuery` values below are shown unencoded for readability; URL-encode them (spaces, quotes, and `=`) when sending a request.

#### Filter by Model Type

Search for all generative AI models:
```text
GET /api/model_catalog/v1alpha1/models?source=my-catalog&filterQuery=customProperties.model_type.string_value='generative'
```

Search for predictive models:
```text
GET /api/model_catalog/v1alpha1/models?source=my-catalog&filterQuery=customProperties.model_type.string_value='predictive'
```

#### Combining Filters

Filter by model type and other criteria:
```text
# Generative models with production maturity
GET /api/model_catalog/v1alpha1/models?source=my-catalog&filterQuery=customProperties.model_type.string_value='generative' AND maturity='Production'

# Predictive models for specific tasks
GET /api/model_catalog/v1alpha1/models?source=my-catalog&filterQuery=customProperties.model_type.string_value='predictive' AND tasks CONTAINS 'regression'
```
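
Because these filter expressions contain spaces, quotes, and `=`, they need percent-encoding before they are placed in a URL. The sketch below builds such a URL with Python's standard library. `BASE_URL` is a hypothetical placeholder for wherever the catalog service is reachable, and `models_url` is an illustrative helper, not part of any client library.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # hypothetical; replace with your catalog host
MODELS_PATH = "/api/model_catalog/v1alpha1/models"


def models_url(source: str, filter_query: str) -> str:
    """Build a models listing URL with a percent-encoded filterQuery."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    query = urlencode({"source": source, "filterQuery": filter_query}, quote_via=quote)
    return f"{BASE_URL}{MODELS_PATH}?{query}"


url = models_url(
    "my-catalog",
    "customProperties.model_type.string_value='generative' AND maturity='Production'",
)
print(url)
```

The resulting string can be passed to any HTTP client; only the query string construction is shown here.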

### Additional Custom Properties Examples

#### Validation and Certification

```yaml
customProperties:
  validated:
    metadataType: MetadataStringValue
    string_value: ""
  validation_status:
    metadataType: MetadataStringValue
    string_value: "certified"
  validation_date:
    metadataType: MetadataStringValue
    string_value: "2025-01-20"
  compliance:
    metadataType: MetadataStringValue
    string_value: "GDPR,CCPA,SOC2"
```

#### Performance and Hardware

```yaml
customProperties:
  hardware_type:
    metadataType: MetadataStringValue
    string_value: "H100"
  hardware_count:
    metadataType: MetadataIntValue
    int_value: "2"
  throughput_tps:
    metadataType: MetadataDoubleValue
    double_value: 1105.4
  latency_p95_ms:
    metadataType: MetadataDoubleValue
    double_value: 108.3
```

#### Deployment Metadata

```yaml
customProperties:
  deployment_type:
    metadataType: MetadataStringValue
    string_value: "production"
  framework_type:
    metadataType: MetadataStringValue
    string_value: "vllm"
  framework_version:
    metadataType: MetadataStringValue
    string_value: "v0.8.4"
  use_case:
    metadataType: MetadataStringValue
    string_value: "chatbot"
```

### Best Practices

1. **Use Standardized Properties**: For common use cases like `model_type`, use the documented property names and values to ensure consistency across catalogs.

2. **Choose Appropriate Types**: Select the correct metadata type for your values:
   - Use `MetadataStringValue` for text, enums, and identifiers
   - Use `MetadataIntValue` for counts and whole numbers
   - Use `MetadataDoubleValue` for measurements and metrics
   - Use `MetadataBoolValue` for flags

3. **Document Custom Properties**: Maintain documentation for any custom properties specific to your organization or use case.

4. **Validate Values**: When using enum-like properties (like `model_type`), validate values against the allowed set to prevent inconsistencies.

5. **Use Hierarchical Keys**: For complex metadata, consider using dot-notation or underscores to create logical groupings (e.g., `validation_status`, `hardware_type`).
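
One way to act on the type-selection advice above is a small lint pass over a `customProperties` mapping that checks each `metadataType` is paired with its matching value field. This is only a sketch: `find_type_mismatches` is a hypothetical helper, and `bool_value` is an assumed field name that does not appear in the examples in this README.

```python
VALUE_FIELD = {
    "MetadataStringValue": "string_value",
    "MetadataIntValue": "int_value",
    "MetadataDoubleValue": "double_value",
    "MetadataBoolValue": "bool_value",  # assumed field name
}


def find_type_mismatches(custom_properties: dict) -> list:
    """Return a message for each property whose type and value field disagree."""
    problems = []
    for key, prop in custom_properties.items():
        metadata_type = prop.get("metadataType")
        expected_field = VALUE_FIELD.get(metadata_type)
        if expected_field is None:
            problems.append(f"{key}: unknown metadataType {metadata_type!r}")
        elif expected_field not in prop:
            problems.append(f"{key}: {metadata_type} requires '{expected_field}'")
    return problems


problems = find_type_mismatches({
    "hardware_type": {"metadataType": "MetadataStringValue", "string_value": "H100"},
    # Mismatch on purpose: an integer type paired with a string field.
    "hardware_count": {"metadataType": "MetadataIntValue", "string_value": "2"},
})
print(problems)
```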

## Configuration

The catalog service uses **file-based configuration** instead of traditional databases:

catalog/internal/catalog/testdata/dev-community-models.yaml

Lines changed: 16 additions & 0 deletions
@@ -31,6 +31,10 @@ models:
       - text-generation
     createTimeSinceEpoch: "1728086400000"
     lastUpdateTimeSinceEpoch: "1728086400000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/open-models/falcon-mini-2b:v1.0
         createTimeSinceEpoch: "1728086400000"
@@ -68,6 +72,10 @@ models:
       - sentiment-analysis
     createTimeSinceEpoch: "1726790400000"
     lastUpdateTimeSinceEpoch: "1726790400000"
+    customProperties:
+      model_type:
+        string_value: "predictive"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/quantum-research/sentiment-analyzer:v2.0
         createTimeSinceEpoch: "1726790400000"
@@ -107,6 +115,10 @@ models:
       - text-generation
     createTimeSinceEpoch: "1723420800000"
     lastUpdateTimeSinceEpoch: "1723420800000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/indie-ai/creative-writer-3b:experimental
         createTimeSinceEpoch: "1723420800000"
@@ -149,6 +161,10 @@ models:
       - translation
     createTimeSinceEpoch: "1722297600000"
     lastUpdateTimeSinceEpoch: "1722297600000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/alpha-labs/translation-mini-1b:v1.5
         createTimeSinceEpoch: "1722297600000"

catalog/internal/catalog/testdata/dev-organization-models.yaml

Lines changed: 24 additions & 0 deletions
@@ -69,6 +69,10 @@ models:
       - question-answering
     createTimeSinceEpoch: "1736899200000"
     lastUpdateTimeSinceEpoch: "1736899200000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/acme-ai/neural-7b-instruct:v1.0
         createTimeSinceEpoch: "1736899200000"
@@ -125,6 +129,10 @@ models:
       - fill-mask
     createTimeSinceEpoch: "1730678400000"
     lastUpdateTimeSinceEpoch: "1730678400000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/stellar-labs/quantum-13b-base:v2.1
         createTimeSinceEpoch: "1730678400000"
@@ -185,6 +193,10 @@ models:
       - code-generation
     createTimeSinceEpoch: "1734652800000"
     lastUpdateTimeSinceEpoch: "1734652800000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/neural-dynamics/code-pilot-3b:latest
         createTimeSinceEpoch: "1734652800000"
@@ -248,6 +260,10 @@ models:
       - visual-question-answering
     createTimeSinceEpoch: "1738368000000"
     lastUpdateTimeSinceEpoch: "1738368000000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/acme-ai/multimodal-vision-7b:v1.0
         createTimeSinceEpoch: "1738368000000"
@@ -318,6 +334,10 @@ models:
       - text-generation
     createTimeSinceEpoch: "1736467200000"
     lastUpdateTimeSinceEpoch: "1736467200000"
+    customProperties:
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/stellar-labs/reasoning-1b-chat:v1.2
         createTimeSinceEpoch: "1736467200000"
@@ -389,6 +409,10 @@ models:
       - sentence-similarity
     createTimeSinceEpoch: "1726358400000"
     lastUpdateTimeSinceEpoch: "1726358400000"
+    customProperties:
+      model_type:
+        string_value: "predictive"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/neural-dynamics/embeddings-large:v3.0
         createTimeSinceEpoch: "1726358400000"

catalog/internal/catalog/testdata/dev-validated-models.yaml

Lines changed: 16 additions & 4 deletions
@@ -4,7 +4,7 @@ models:
     provider: Validation Authority
logo: data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIiB3aWR0aD0iMTAwIiBoZWlnaHQ9IjEwMCI+CiAgPGRlZnM+CiAgICA8bGluZWFyR3JhZGllbnQgaWQ9ImdyYWQyIiB4MT0iMCUiIHkxPSIwJSIgeDI9IjEwMCUiIHkyPSIxMDAlIj4KICAgICAgPHN0b3Agb2Zmc2V0PSIwJSIgc3R5bGU9InN0b3AtY29sb3I6IzI3QUU2MDtzdG9wLW9wYWNpdHk6MSIgLz4KICAgICAgPHN0b3Agb2Zmc2V0PSIxMDAlIiBzdHlsZT0ic3RvcC1jb2xvcjojMjI5OTU0O3N0b3Atb3BhY2l0eToxIiAvPgogICAgPC9saW5lYXJHcmFkaWVudD4KICA8L2RlZnM+CiAgPHBhdGggZD0iTSA1MCAxMCBMIDgwIDMwIEwgODAgNzAgTCA1MCA5MCBMIDIwIDcwIEwgMjAgMzAgWiIgZmlsbD0idXJsKCNncmFkMikiIHN0cm9rZT0iIzFFODQ0OSIgc3Ryb2tlLXdpZHRoPSIyIi8+CiAgPHBhdGggZD0iTSAzNSA1MCBMIDQ1IDYwIEwgNjUgMzUiIHN0cm9rZT0id2hpdGUiIHN0cm9rZS13aWR0aD0iNiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBmaWxsPSJub25lIi8+Cjwvc3ZnPgo=
     description: |-
-      (DEMO) Production-LLM-8B is an enterprise-validated 8 billion parameter language model that has
+      (DEMO) Production-LLM-8B is an enterprise-validated 8 billion parameter generative language model that has
       undergone rigorous testing for production deployment. Sed do eiusmod tempor incididunt
       ut labore et dolore magna aliqua.
     readme: |-
@@ -77,6 +77,9 @@ models:
       validated:
         string_value: ""
         metadataType: MetadataStringValue
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/certified/production-llm-8b:validated-v1.0
         createTimeSinceEpoch: "1737331200000"
@@ -479,7 +482,7 @@ models:
     provider: Validation Authority
logo: data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIiB3aWR0aD0iMTAwIiBoZWlnaHQ9IjEwMCI+CiAgPGRlZnM+CiAgICA8bGluZWFyR3JhZGllbnQgaWQ9ImdyYWQyIiB4MT0iMCUiIHkxPSIwJSIgeDI9IjEwMCUiIHkyPSIxMDAlIj4KICAgICAgPHN0b3Agb2Zmc2V0PSIwJSIgc3R5bGU9InN0b3AtY29sb3I6IzI3QUU2MDtzdG9wLW9wYWNpdHk6MSIgLz4KICAgICAgPHN0b3Agb2Zmc2V0PSIxMDAlIiBzdHlsZT0ic3RvcC1jb2xvcjojMjI5OTU0O3N0b3Atb3BhY2l0eToxIiAvPgogICAgPC9saW5lYXJHcmFkaWVudD4KICA8L2RlZnM+CiAgPHBhdGggZD0iTSA1MCAxMCBMIDgwIDMwIEwgODAgNzAgTCA1MCA5MCBMIDIwIDcwIEwgMjAgMzAgWiIgZmlsbD0idXJsKCNncmFkMikiIHN0cm9rZT0iIzFFODQ0OSIgc3Ryb2tlLXdpZHRoPSIyIi8+CiAgPHBhdGggZD0iTSAzNSA1MCBMIDQ1IDYwIEwgNjUgMzUiIHN0cm9rZT0id2hpdGUiIHN0cm9rZS13aWR0aD0iNiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBmaWxsPSJub25lIi8+Cjwvc3ZnPgo=
     description: |-
-      (DEMO) Secure-Embeddings-V2 is a validated embedding model optimized for enterprise search
+      (DEMO) Secure-Embeddings-V2 is a validated predictive embedding model optimized for enterprise search
       and retrieval with enhanced security features. Nemo enim ipsam voluptatem quia
       voluptas sit aspernatur aut odit aut fugit.
     readme: |-
@@ -561,6 +564,9 @@ models:
       validated:
         string_value: ""
         metadataType: MetadataStringValue
+      model_type:
+        string_value: "predictive"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/certified/secure-embeddings-v2:validated
         createTimeSinceEpoch: "1733356800000"
@@ -855,7 +861,7 @@ models:
     provider: Validation Authority
logo: data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIiB3aWR0aD0iMTAwIiBoZWlnaHQ9IjEwMCI+CiAgPGRlZnM+CiAgICA8bGluZWFyR3JhZGllbnQgaWQ9ImdyYWQyIiB4MT0iMCUiIHkxPSIwJSIgeDI9IjEwMCUiIHkyPSIxMDAlIj4KICAgICAgPHN0b3Agb2Zmc2V0PSIwJSIgc3R5bGU9InN0b3AtY29sb3I6IzI3QUU2MDtzdG9wLW9wYWNpdHk6MSIgLz4KICAgICAgPHN0b3Agb2Zmc2V0PSIxMDAlIiBzdHlsZT0ic3RvcC1jb2xvcjojMjI5OTU0O3N0b3Atb3BhY2l0eToxIiAvPgogICAgPC9saW5lYXJHcmFkaWVudD4KICA8L2RlZnM+CiAgPHBhdGggZD0iTSA1MCAxMCBMIDgwIDMwIEwgODAgNzAgTCA1MCA5MCBMIDIwIDcwIEwgMjAgMzAgWiIgZmlsbD0idXJsKCNncmFkMikiIHN0cm9rZT0iIzFFODQ0OSIgc3Ryb2tlLXdpZHRoPSIyIi8+CiAgPHBhdGggZD0iTSAzNSA1MCBMIDQ1IDYwIEwgNjUgMzUiIHN0cm9rZT0id2hpdGUiIHN0cm9rZS13aWR0aD0iNiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBmaWxsPSJub25lIi8+Cjwvc3ZnPgo=
     description: |-
-      (DEMO) Analytics-Forecaster-5B is a specialized 5 billion parameter model validated for
+      (DEMO) Analytics-Forecaster-5B is a specialized 5 billion parameter predictive model validated for
       business analytics, time-series forecasting, and data insights generation. Quis autem
       vel eum iure reprehenderit qui in ea voluptate velit.
     readme: |-
@@ -934,6 +940,9 @@ models:
       validated:
         string_value: ""
         metadataType: MetadataStringValue
+      model_type:
+        string_value: "predictive"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/certified/analytics-forecaster-5b:v2.0-validated
         createTimeSinceEpoch: "1731888000000"
@@ -1228,7 +1237,7 @@ models:
     provider: Validation Authority
logo: data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIiB3aWR0aD0iMTAwIiBoZWlnaHQ9IjEwMCI+CiAgPGRlZnM+CiAgICA8bGluZWFyR3JhZGllbnQgaWQ9ImdyYWQyIiB4MT0iMCUiIHkxPSIwJSIgeDI9IjEwMCUiIHkyPSIxMDAlIj4KICAgICAgPHN0b3Agb2Zmc2V0PSIwJSIgc3R5bGU9InN0b3AtY29sb3I6IzI3QUU2MDtzdG9wLW9wYWNpdHk6MSIgLz4KICAgICAgPHN0b3Agb2Zmc2V0PSIxMDAlIiBzdHlsZT0ic3RvcC1jb2xvcjojMjI5OTU0O3N0b3Atb3BhY2l0eToxIiAvPgogICAgPC9saW5lYXJHcmFkaWVudD4KICA8L2RlZnM+CiAgPHBhdGggZD0iTSA1MCAxMCBMIDgwIDMwIEwgODAgNzAgTCA1MCA5MCBMIDIwIDcwIEwgMjAgMzAgWiIgZmlsbD0idXJsKCNncmFkMikiIHN0cm9rZT0iIzFFODQ0OSIgc3Ryb2tlLXdpZHRoPSIyIi8+CiAgPHBhdGggZD0iTSAzNSA1MCBMIDQ1IDYwIEwgNjUgMzUiIHN0cm9rZT0id2hpdGUiIHN0cm9rZS13aWR0aD0iNiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBmaWxsPSJub25lIi8+Cjwvc3ZnPgo=
     description: |-
-      (DEMO) Compliance-Assistant-3B is a lightweight 3 billion parameter model validated for
+      (DEMO) Compliance-Assistant-3B is a lightweight 3 billion parameter generative model validated for
       regulatory compliance checking, policy interpretation, and audit assistance. Omnis
       voluptas assumenda est, omnis dolor repellendus.
     readme: |-
@@ -1311,6 +1320,9 @@ models:
       validated:
         string_value: ""
         metadataType: MetadataStringValue
+      model_type:
+        string_value: "generative"
+        metadataType: MetadataStringValue
     artifacts:
       - uri: oci://registry.example.com/certified/compliance-assistant-3b:validated-v1.1
         createTimeSinceEpoch: "1738022400000"
