011: Native Hashtag Indexing + Filecoin Image Uploads (Platform-First Microblog Track)
Goal
Improve Token Host Builder as a platform, not just as a demo generator, by adding:
- native secondary query indexes with tokenizer support for hashtags,
- native generated-UI image upload support,
- a first-class upload provider abstraction with a Filecoin Onchain Cloud implementation,
- and a canonical microblog demo that proves the new platform surfaces end-to-end.
This ticket intentionally takes the "hard way" so the outcome is reusable across future apps and schemas rather than solving only one demo.
Sprint execution / source of truth
- This ticket is the source of truth for this sprint.
- The work should be implemented as a set of 8 stacked PRs, corresponding to the phases in this ticket.
- AI may prepare the PRs, but they are not to be merged until human approval.
- This ticket closes only when all 8 PRs are merged.
Recommended PR stack:
- docs scaffolding + backlog/spec sync
- THS extensions for native tokenized indexes
- generator equality index support
- generator native hashtag token index support + bounded behavior
- bounded-cost tests and perf validation
- upload abstraction + Filecoin Onchain Cloud adapter
- generated UI native image upload support
- canonical microblog example + end-to-end hardening/docs sync
Why this exists
Current builder reality is close, but still incomplete for the intended product shape:
- query indexes are modeled in THS but not fully implemented in the generator/runtime,
- image is a first-class schema field type, but the generated UI still treats create/edit as plain text entry,
- Filecoin is a first-class deploy target, but uploads are not yet integrated as a first-class generated-app capability,
- hashtag-based feeds are possible only through app-specific modeling workarounds, not through a native indexing capability.
This ticket closes those gaps in a way that improves the long-term Token Host platform contract.
Spec alignment
- SPEC.md section 2.1 (deterministic builds)
- SPEC.md section 2.4 (CRUD-first, indexer-optional)
- SPEC.md section 2.5 (secure-by-default configuration)
- SPEC.md section 6.4 (field definitions)
- SPEC.md section 6.7 (indexes and constraints)
- SPEC.md section 7.5 (bounded list scanning)
- SPEC.md section 7.8 and 7.8.1 (on-chain indexes and key derivation)
- SPEC.md section 7.9 (events)
- SPEC.md section 8.3 (uploads)
- SPEC.md section 13.6 (upload signing / managed upload interface)
Problem statement
We need a canonical Token Host example app that is a microblog:
- text posts,
- image posts,
- hashtag-based feed navigation.
But the implementation must strengthen the platform itself:
- no demo-only image uploader,
- no browser-side foc-cli,
- no Tag/PostTag workaround as the primary architecture,
- no unbounded tokenization or indexing behavior that is "correct" but gas-hostile in practice.
Product outcome
After this ticket lands, Token Host Builder should support:
- a schema-defined native tokenized secondary index suitable for hashtags,
- deterministic and bounded-cost contract generation for that index type,
- generated UIs that treat image as a native uploadable field,
- a runtime upload-provider abstraction with a Filecoin Onchain Cloud adapter,
- and a canonical microblog demo schema/UI that exercises those capabilities.
Scope
1. THS / schema modeling
- Extend query-index modeling in a backward-compatible way.
- Preserve the existing meaning of:
- indexes.index: [{ field }] => equality index
- Add optional native tokenizer/index options for string fields, expected to support at least:
- index mode/type
- tokenizer kind
- normalization policy
- bounded token extraction settings where appropriate
- Add schema validation and linting for unsupported combinations.
- Add schema migration support for any new THS shape.
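One way to keep the existing shape backward compatible while adding tokenizer options is a discriminated union in which a bare { field } entry defaults to an equality index. This is a minimal sketch, not the THS schema: only `field` comes from this ticket, and every other property name and value (`mode`, `tokenizer`, `normalize`, `maxTokens`, `maxTokenLength`) is a hypothetical placeholder.

```typescript
// Hypothetical THS index entry shapes; only `field` comes from the ticket.
type EqualityIndex = { field: string; mode?: "equality" };
type TokenIndex = {
  field: string;
  mode: "token";
  tokenizer: "hashtag"; // first tokenizer kind planned for v1
  normalize: "lowercase-ascii"; // assumed name for a normalization policy
  maxTokens: number; // bounded token extraction
  maxTokenLength: number;
};
type IndexEntry = EqualityIndex | TokenIndex;
type NormalizedIndex = { field: string; mode: "equality" } | TokenIndex;

// A bare { field } entry keeps its existing meaning: an equality index.
function normalizeIndexEntry(entry: IndexEntry): NormalizedIndex {
  if (entry.mode === "token") {
    return entry;
  }
  return { field: entry.field, mode: "equality" };
}

// Existing schemas need no change; new schemas opt in explicitly.
const legacy = normalizeIndexEntry({ field: "author" });
const hashtags = normalizeIndexEntry({
  field: "body",
  mode: "token",
  tokenizer: "hashtag",
  normalize: "lowercase-ascii",
  maxTokens: 8,
  maxTokenLength: 32,
});
```

Making the new options opt-in means a migration only has to fill in the default mode rather than rewrite existing index declarations.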
2. Generator / contract layer
- Implement missing equality index support in the Solidity generator.
- Implement tokenized hashtag index generation as a first-class on-chain secondary index.
- Keep index storage and access semantics aligned with SPEC 7.8:
- candidate IDs only,
- paginated accessors,
- bounded limits,
- deterministic key derivation,
- correctness validated by follow-up getC / current-record reads.
- Define and enforce bounded token/scan behavior so gas remains predictable.
- Add events or event coverage sufficient for indexer/debugging parity where needed.
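The candidate-ID semantics above can be modeled off-chain to make the contract between index and caller concrete. The sketch below is an in-memory TypeScript model, not generated Solidity: buckets only ever grow, the accessor clamps the page size, and the caller filters stale candidates with a follow-up record read. The class and function names, the `MAX_PAGE_SIZE` value, and the use of the raw token string as the bucket key (instead of a derived hash) are all assumptions made for illustration.

```typescript
// In-memory model of an append-only candidate index; not generated Solidity.
type Post = { id: number; body: string; deleted: boolean };

const MAX_PAGE_SIZE = 50; // assumed bound, not a value from SPEC.md

class CandidateIndex {
  private buckets: Record<string, number[]> = {};

  // Append-only: ids are never removed, so stale candidates are expected.
  add(key: string, id: number): void {
    const bucket = this.buckets[key] || (this.buckets[key] = []);
    bucket.push(id);
  }

  // Paginated accessor with a hard upper bound on the page size.
  page(key: string, offset: number, limit: number): number[] {
    const bounded = Math.min(Math.max(limit, 0), MAX_PAGE_SIZE);
    const bucket = this.buckets[key] || [];
    return bucket.slice(offset, offset + bounded);
  }
}

// Follow-up read: keep only candidates whose current record still matches.
function resolveCandidates(ids: number[], posts: Record<number, Post>, tag: string): Post[] {
  const out: Post[] = [];
  for (const id of ids) {
    const post = posts[id];
    const words = post ? post.body.toLowerCase().split(/\s+/) : [];
    if (post && !post.deleted && words.indexOf("#" + tag) !== -1) {
      out.push(post);
    }
  }
  return out;
}

const index = new CandidateIndex();
const posts: Record<number, Post> = {
  1: { id: 1, body: "hello #filecoin", deleted: false },
  2: { id: 2, body: "edited, tag removed", deleted: false },
};
index.add("filecoin", 1);
index.add("filecoin", 2); // stale: the post no longer contains the tag
const feed = resolveCandidates(index.page("filecoin", 0, 10), posts, "filecoin");
```

Because the index is allowed to return stale candidates, updates never have to scan or compact a bucket, which is what keeps write cost predictable.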
3. Runtime / client layer
- Add runtime helpers for equality-index and token-index lookups.
- Ensure generated UIs can query native hashtag feeds without client-side full scans of all posts.
- Keep runtime behavior compatible with indexer-optional operation.
4. Upload abstraction
- Introduce a first-class upload provider abstraction for generated apps.
- The browser/UI must talk to a stable Token Host upload interface, not directly to foc-cli.
- Implement a Filecoin Onchain Cloud provider behind a relay/adapter.
- The adapter may use foc-cli operationally on the server side.
- The generated UI should remain storage-provider-agnostic.
- The upload architecture should remain runner-agnostic so different deployments can choose different execution models without changing schema or UI behavior.
5. Generated UI
- Replace plain text handling for image create/edit with:
- file pick,
- upload progress,
- preview,
- retry,
- replace/remove,
- final URL/CID persistence in the normal field value.
- Improve list/detail rendering so image-backed records look native in generated apps.
- Keep this as base generated UI behavior, not only app-extension code.
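The create/edit behaviors listed above map naturally onto a small state machine. The following is a sketch of one possible reducer, assuming nothing about the actual generated UI framework; the state and event names are hypothetical. Picking a file while one is already stored models "replace", and the `done` state holds the URL/CID that ends up in the normal field value.

```typescript
type UploadState =
  | { status: "empty" }
  | { status: "uploading"; fileName: string; progress: number }
  | { status: "failed"; fileName: string; error: string }
  | { status: "done"; value: string }; // URL/CID persisted in the field value

type UploadEvent =
  | { type: "pick"; fileName: string }
  | { type: "progress"; fraction: number }
  | { type: "fail"; error: string }
  | { type: "retry" }
  | { type: "succeed"; value: string }
  | { type: "remove" };

function reduceUpload(state: UploadState, event: UploadEvent): UploadState {
  switch (event.type) {
    case "pick": // also covers "replace": a new file restarts the upload
      return { status: "uploading", fileName: event.fileName, progress: 0 };
    case "progress":
      return state.status === "uploading"
        ? { status: "uploading", fileName: state.fileName, progress: Math.min(1, Math.max(0, event.fraction)) }
        : state;
    case "fail":
      return state.status === "uploading"
        ? { status: "failed", fileName: state.fileName, error: event.error }
        : state;
    case "retry":
      return state.status === "failed"
        ? { status: "uploading", fileName: state.fileName, progress: 0 }
        : state;
    case "succeed":
      return state.status === "uploading" ? { status: "done", value: event.value } : state;
    case "remove":
      return { status: "empty" };
  }
}
```

Events that do not apply to the current state are ignored, so a late progress callback from a cancelled upload cannot overwrite a newer value.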
6. Canonical demo app
- Add a canonical microblog schema and generated-app example that proves:
- text posts,
- image posts,
- hashtag feeds using the native token index,
- Filecoin Onchain Cloud-backed image upload flow.
- Initial target chain for the example should be filecoin_calibration.
Non-goals
- Building a general-purpose full-text search engine.
- Browser-side direct integration with foc-cli.
- Requiring a hosted Token Host control plane/backend to use the feature locally.
- Implementing every possible tokenizer or normalization mode in v1.
- Solving video uploads or arbitrary asset pipelines in this ticket.
- Rewriting the entire generated UI routing model.
Operating rules for this workstream
Spec sync rule
- Any merge that changes platform behavior must do one of:
  - update SPEC.md in the same change, or
  - record an explicit spec delta and follow-up reconciliation note.
- AGENTS.md must be updated as milestones land so the backlog remains authoritative.
Documentation rule
This ticket is not complete unless the following living artifacts are created and maintained:
work log:
- chronological execution log
- decisions, changes, test evidence, follow-ups
working note:
- current architecture
- open questions
- spec delta tracker
- unresolved tradeoffs
blog post / memo draft:
- narrative summary of the problem, architecture, limits, and results
- suitable for future product, engineering, and external communication reuse
The initial recommended locations are:
- docs/worklogs/native-hashtag-indexing-and-filecoin-image-uploads.md
- docs/spec-deltas/native-hashtag-indexing-and-filecoin-image-uploads.md
- docs/native-hashtag-indexing-and-filecoin-image-uploads-memo.md
Filecoin Onchain Cloud integration rules
- Use AgentHub material for foc-cli as the operator-facing source of truth while implementing the adapter.
- Prefer the recommended high-level upload flow, not low-level dataset management, unless a concrete requirement forces lower-level control.
- Keep Calibration (314159) versus mainnet (314) explicit in tooling, docs, and tests.
- Any credentials, wallet material, or funding configuration must remain environment-based and never be committed.
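Keeping the two networks explicit could be as simple as a lookup that refuses to guess. In this sketch the chain IDs are the ones stated in this ticket and `filecoin_calibration` is the target name used here; the `filecoin_mainnet` key and the function name are assumptions.

```typescript
// Chain IDs as stated in this ticket; "filecoin_mainnet" is an assumed key name.
const FILECOIN_CHAIN_IDS = {
  filecoin_calibration: 314159,
  filecoin_mainnet: 314,
} as const;

// Fail loudly on anything else instead of silently defaulting to a network.
function resolveChainId(network: string): number {
  if (network === "filecoin_calibration" || network === "filecoin_mainnet") {
    return FILECOIN_CHAIN_IDS[network];
  }
  throw new Error("Unknown Filecoin network: " + network);
}
```

Throwing on an unrecognized name keeps a typo in tooling or tests from quietly routing an upload or deploy to the wrong network.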
Proposed architecture
Native hashtag index
Recommended v1 direction:
- support native tokenized query indexes on string fields,
- first tokenizer implementation is hashtag extraction,
- normalization is deterministic and documented,
- index storage is append-only candidate buckets keyed by token hash,
- reads are paginated and bounded,
- callers validate current record state and current field contents on read.
This should be modeled as a platform index feature, not as app-specific relational materialization.
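To make "deterministic and documented" normalization concrete, here is a sketch of one possible hashtag tokenizer. The rules are assumptions chosen for simplicity, not the decided v1 semantics: a tag is "#" followed by ASCII letters, digits, or underscores; tokens are lowercased and de-duplicated in first-seen order; and over-limit input throws, which stands in for a contract revert (the ticket leaves revert versus truncate open). The two limit values are placeholders.

```typescript
// Assumed limits for illustration; the ticket leaves the real values open.
const MAX_TOKENS = 8;
const MAX_TOKEN_LENGTH = 32;

// Extract hashtags deterministically: same input, same ordered token list.
function extractHashtags(text: string): string[] {
  const pattern = /#([A-Za-z0-9_]+)/g;
  const tokens: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const token = (match[1] || "").toLowerCase();
    if (token.length > MAX_TOKEN_LENGTH) {
      throw new Error("hashtag too long"); // models a revert
    }
    if (tokens.indexOf(token) === -1) {
      tokens.push(token);
      if (tokens.length > MAX_TOKENS) {
        throw new Error("too many hashtags"); // models a revert
      }
    }
  }
  return tokens;
}

const tags = extractHashtags("Hello #Filecoin and #filecoin #web3!");
```

Restricting the alphabet to ASCII sidesteps Unicode case-folding, which is hard to reproduce identically on-chain and in the client; whether that trade-off is acceptable for users is one of the normalization questions this ticket raises.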
Upload provider abstraction
Recommended v1 direction:
- generated UI calls a Token Host upload interface,
- interface is implemented locally in preview/dev and can later map to managed hosted mode,
- Filecoin Onchain Cloud is one provider implementation,
- final on-chain value remains a URL/CID string in the normal image field,
- runner choice is a deployment concern, not a schema concern.
Recommended runner modes to preserve flexibility:
- process: a local/dev or self-hosted Node service shells out to foc-cli
- remote: the generated UI or local preview talks to an external HTTP upload service that runs the FOC adapter elsewhere
- worker: upload requests are handed off to a background worker where that is operationally preferable
- sdk: reserved for a future direct-library path if we replace CLI invocation later
The generated UI should not need to know which runner mode is being used. It should only depend on a stable upload request/response contract.
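A stable request/response contract might look like the sketch below. It is illustrative only: the type names, field names, the `FakeProvider` test double, and the example URL are all hypothetical, and no real foc-cli or Filecoin Onchain Cloud call is shown. The point is that UI-side code receives a provider and never branches on the runner mode.

```typescript
// Hypothetical request/response contract; field names are illustrative.
type UploadRequest = { fileName: string; contentType: string; bytes: Uint8Array };
type UploadResult = { url: string; cid: string };
type RunnerMode = "process" | "remote" | "worker" | "sdk";

interface UploadProvider {
  upload(req: UploadRequest, onProgress: (fraction: number) => void): Promise<UploadResult>;
}

// Test double standing in for the Filecoin Onchain Cloud adapter.
class FakeProvider implements UploadProvider {
  mode: RunnerMode;

  constructor(mode: RunnerMode) {
    this.mode = mode; // known to the deployment, invisible to the UI
  }

  upload(req: UploadRequest, onProgress: (fraction: number) => void): Promise<UploadResult> {
    onProgress(1);
    const cid = "fake-cid-" + req.bytes.length;
    return Promise.resolve({ url: "https://uploads.example/" + cid, cid });
  }
}

// UI-side helper: depends only on the interface and returns the value
// that would be stored in the normal image field.
function uploadImageField(
  provider: UploadProvider,
  req: UploadRequest,
  onProgress: (fraction: number) => void
): Promise<string> {
  return provider.upload(req, onProgress).then((result) => result.url);
}
```

Swapping `FakeProvider` for a process, remote, or worker implementation changes only how the provider is constructed, which is the property the runner modes above are meant to preserve.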
Bounded-cost and performance requirements
This ticket must explicitly protect against designs that are algorithmically "correct" but operationally too expensive.
Required limits
The implementation must define and enforce bounded limits for at least:
- maximum list scan steps,
- maximum index page size,
- maximum tokens extracted per indexed field on create/update,
- maximum token length,
- maximum normalized source field length considered for tokenization, if needed,
- maximum multicall page/query combinations used by generated UI feed paths.
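The last limit, on page/query combinations used by feed paths, can be illustrated with a loader that stops after a fixed number of index pages even if it has not found enough live records. This is a sketch under assumptions: `fetchPage` and `isCurrentMatch` stand in for the runtime's index accessor and current-record check, and both constants are placeholder values.

```typescript
// Assumed bounds for illustration only.
const MAX_PAGES_PER_FEED_LOAD = 3;
const FEED_PAGE_SIZE = 2;

type FeedResult = { ids: number[]; nextOffset: number; pagesUsed: number };

// Collect up to `want` live ids, never fetching more than the page budget.
function loadFeed(
  fetchPage: (offset: number, limit: number) => number[],
  isCurrentMatch: (id: number) => boolean,
  want: number
): FeedResult {
  const ids: number[] = [];
  let offset = 0;
  let pagesUsed = 0;
  while (pagesUsed < MAX_PAGES_PER_FEED_LOAD && ids.length < want) {
    const candidates = fetchPage(offset, FEED_PAGE_SIZE);
    pagesUsed++;
    for (const id of candidates) {
      if (ids.length >= want) {
        break;
      }
      offset++; // advance only past candidates actually examined
      if (isCurrentMatch(id)) {
        ids.push(id);
      }
    }
    if (candidates.length < FEED_PAGE_SIZE) {
      break; // bucket exhausted
    }
  }
  return { ids, nextOffset: offset, pagesUsed };
}
```

Returning `nextOffset` lets the UI offer a "load more" action instead of silently looping until the feed is full, so a bucket dominated by stale candidates costs a bounded number of reads per interaction.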
Required validation
Add schema validation or lints for dangerous configurations, including when appropriate:
- tokenizer requested on unsupported field types,
- tokenizer requested while on-chain indexing is disabled,
- field/UI combinations that imply unsupported behavior,
- settings that exceed safe generator/runtime limits.
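A lint pass over these cases could collect human-readable problems rather than failing on the first one. The sketch below assumes a flattened, hypothetical config shape and placeholder ceilings; it is not the THS validator.

```typescript
// Assumed platform ceilings; the ticket requires bounds but does not fix values.
const SAFE_LIMITS = { maxTokens: 16, maxTokenLength: 64, maxPageSize: 100 };

// Hypothetical flattened view of one token index declaration.
type TokenIndexConfig = {
  field: string;
  fieldType: string;
  maxTokens: number;
  maxTokenLength: number;
  pageSize: number;
};

// Returns every problem found so authors can fix a schema in one pass.
function lintTokenIndex(cfg: TokenIndexConfig, onChainIndexing: boolean): string[] {
  const problems: string[] = [];
  if (cfg.fieldType !== "string") {
    problems.push(cfg.field + ": tokenizer requires a string field, got " + cfg.fieldType);
  }
  if (!onChainIndexing) {
    problems.push(cfg.field + ": tokenizer requested while on-chain indexing is disabled");
  }
  if (cfg.maxTokens > SAFE_LIMITS.maxTokens) {
    problems.push(cfg.field + ": maxTokens exceeds " + SAFE_LIMITS.maxTokens);
  }
  if (cfg.maxTokenLength > SAFE_LIMITS.maxTokenLength) {
    problems.push(cfg.field + ": maxTokenLength exceeds " + SAFE_LIMITS.maxTokenLength);
  }
  if (cfg.pageSize > SAFE_LIMITS.maxPageSize) {
    problems.push(cfg.field + ": pageSize exceeds " + SAFE_LIMITS.maxPageSize);
  }
  return problems;
}
```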
Required tests
Add tests that measure both correctness and boundedness.
Contract/generator tests:
- equality index generation and accessors compile and behave correctly,
- hashtag token index generation compiles and behaves correctly,
- over-limit tokenization inputs revert or truncate according to the defined contract behavior,
- candidate buckets remain paginated and bounded.
Gas/perf tests:
- gas snapshot or threshold tests for post creation with:
- no hashtags,
- typical hashtag count,
- maximum allowed hashtag count
- gas snapshot or threshold tests for update paths with token changes,
- regression gate so a substantial gas increase fails CI or at least fails a dedicated perf suite.
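The regression gate can be a plain comparison of a committed baseline against freshly measured values. In this sketch the snapshot format, the scenario names, and the tolerance parameter are assumptions, and no real gas figures are implied.

```typescript
// Scenario name -> gas used. The format is an assumption for this sketch.
type GasSnapshot = Record<string, number>;

// Returns one message per scenario whose gas grew by more than `tolerance`
// (for example 0.05 for 5%), or whose measurement is missing.
function gasRegressions(baseline: GasSnapshot, current: GasSnapshot, tolerance: number): string[] {
  const failures: string[] = [];
  for (const name of Object.keys(baseline)) {
    const before = baseline[name];
    const after = current[name];
    if (before === undefined || after === undefined) {
      failures.push(name + ": missing measurement");
      continue;
    }
    if (after > before * (1 + tolerance)) {
      failures.push(name + ": " + before + " -> " + after);
    }
  }
  return failures;
}
```

A CI step or dedicated perf suite would fail when the returned list is non-empty; treating a missing scenario as a failure prevents a gate from passing because a test was deleted.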
Runtime/UI tests:
- generated UI does not fall back to full collection scans for hashtag feeds when native indexes are available,
- upload flow exposes progress/error states,
- image posts render correctly in list/detail surfaces.
Integration tests:
- end-to-end path for:
- upload image,
- create post,
- retrieve by hashtag feed,
- render feed entry with image.
Deliverables
- THS schema/type updates for native tokenized query indexes.
- THS validation/lint updates and migration support as needed.
- Solidity generator support for:
- equality indexes,
- native hashtag token indexes,
- bounded accessors and token limits.
- Runtime helpers for index-based reads.
- Upload-provider abstraction and Filecoin Onchain Cloud adapter.
- Generated UI support for native image upload fields.
- Canonical microblog schema/example app.
- Work log, working note, and blog/memo draft.
- Backlog/spec updates reflecting the new feature set.
Acceptance criteria
- A schema can declare a native hashtag-capable token index without custom app-specific relational modeling.
- Generated contracts compile and expose bounded, paginated accessors for equality and hashtag token indexes.
- Generated UI can create text and image posts without asking users to paste image URLs manually.
- Filecoin Onchain Cloud upload works through the Token Host upload abstraction in local/dev flow.
- Filecoin Onchain Cloud upload architecture does not assume a single hosting model for the runner; at minimum, the design must support both:
- local/self-hosted process execution
- remote adapter service execution
- A canonical microblog example on filecoin_calibration demonstrates:
- text post create,
- image post create,
- hashtag feed navigation.
- Spec deltas or spec updates are committed alongside behavior changes.
- Performance evidence exists for token/scan bounds and is recorded in repo docs.
Sequencing
Phase A: freeze design + process
- Create the work log, working note, and memo/blog draft.
- Record initial spec delta or update SPEC.md for any new THS/index semantics.
- Update AGENTS.md with the planned workstream.
Phase B: schema and generator foundations
- Extend THS query-index modeling.
- Add validation/linting/migration support.
- Implement equality indexes properly.
- Implement native hashtag token indexes with bounded behavior.
Phase C: runtime and UI
- Add runtime lookup helpers.
- Add upload-provider abstraction.
- Implement Filecoin Onchain Cloud upload adapter.
- Upgrade generated UI image field support.
Phase D: canonical app and hardening
- Add canonical microblog example.
- Add gas/perf test coverage and thresholds.
- Finalize documentation, memo/blog draft, and backlog/spec sync.
Risks and open questions
- Tokenizer semantics must be simple enough to keep deterministic behavior and bounded gas.
- String normalization choices can create compatibility or UX confusion if not specified clearly.
- foc-cli is an operator CLI, so local/dev ergonomics and credential handling must be designed carefully.
- Some performance assertions may need dedicated CI lanes if they are too slow or environment-sensitive for the default test suite.
- If we discover that on-chain token indexing is too expensive even with strict bounds on some chains, the feature must degrade cleanly under onChainIndexing=false.
Suggested implementation note
If a design choice is good engineering practice but not yet clearly represented in SPEC.md, prefer to write down the spec delta immediately instead of letting code become the de facto spec.