Spreadsheet-based classifications for RUM data – how? #529
Description
Overview
Our RUM data has a checkpoint field that is (at the moment) tracking the occurrence of technical events such as top (JS execution started), lcp, or load. User interaction is tracked via the click event that can apply to any click on the page. If we want deeper tracking of conversions, it would be ideal to allow users to define a mapping of URL patterns to named conversion events in .helix/config.xlsx (see https://github.com/adobe/helix-admin/issues/282) or a similar file.
What would be the best way of passing this classification information into the helix-run-query service?
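To make the idea concrete, here is a minimal sketch of what such a classification could look like once the mapping has been read. The row layout (`pattern`, `event`), the example patterns, and the `classify` helper are all hypothetical and not an existing Helix API; glob-style matching is only one possible choice of pattern syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping rows, as they might be read from a sheet in
# .helix/config.xlsx. Column names and patterns are assumptions.
MAPPING = [
    {"pattern": "/checkout/thank-you*", "event": "buy"},
    {"pattern": "/newsletter/confirmed", "event": "subscribe"},
]


def classify(path, mapping=MAPPING):
    """Return the first named event whose glob pattern matches the path."""
    for row in mapping:
        if fnmatchcase(path, row["pattern"]):
            return row["event"]
    # No pattern matched: the request stays an unclassified checkpoint.
    return None
```

Rows are evaluated in order, so a more specific pattern would need to be listed before a broader one.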
Details
I see the following options:
1. We use the Primary/Replica Architecture outlined in H3 Multi-Cloud Storage Architecture (helix-home#207) to create a replica of the helix content bus in Google Cloud Storage. Plain JSON files in Google Cloud Storage can be addressed in BigQuery just like a regular table, which can then be JOINed into the other RUM data.
2. We allow helix-run-query to access the content bus directly, read the JSON file, and provide it as a String query parameter. The query would then be responsible for parsing the JSON and joining it with the RUM data, but that is achievable. Whoever runs the query would still have to provide owner/repo/ref so that the content bus ID can be resolved.
3. We use an array query parameter for each possible named event. Whoever calls the query service would be responsible for fetching the mapping table and adding the query parameters.
Proposed Actions
At the moment, (3) looks like the easiest option to me. Its only limitation is that the list of supported classified checkpoints would need to be predefined. This limitation has its upsides, as it allows us to infer deeper understanding when events are called buy, subscribe or recommend rather than generic event names like conversion1.
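A rough sketch of how a caller might build the request for option (3), assuming the mapping table has already been fetched. The `SUPPORTED_EVENTS` list, the `build_query_params` helper, and the parameter names are assumptions made for illustration; they are not the actual helix-run-query interface.

```python
from urllib.parse import urlencode

# Hypothetical fixed list of supported named events (the limitation of option 3).
SUPPORTED_EVENTS = ("buy", "subscribe", "recommend")


def build_query_params(mapping, base_params):
    """Encode one repeated (array) query parameter per supported named event."""
    params = dict(base_params)
    for event in SUPPORTED_EVENTS:
        patterns = [row["pattern"] for row in mapping if row["event"] == event]
        if patterns:
            params[event] = patterns
    # doseq=True emits list values as repeated keys: buy=...&buy=...
    return urlencode(params, doseq=True)
```

Events that are not in the fixed list, such as conversion1, are dropped by the caller here, which is how the limitation described above would show up in practice.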
(1) and (2) would also raise an access control problem: we would need to make sure that we do not allow access to any content, but only to mapping tables in .helix.
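One way such a restriction could be expressed is a path check that runs before anything is read from the content bus. This is only a sketch: the `is_allowed_mapping_path` helper is hypothetical, and it assumes the mapping tables are stored as JSON files directly under `/.helix`.

```python
from pathlib import PurePosixPath

# Assumption: mapping tables live under this folder as JSON files.
ALLOWED_PREFIX = PurePosixPath("/.helix")


def is_allowed_mapping_path(path):
    """Allow only absolute JSON paths below /.helix, without '..' segments."""
    p = PurePosixPath(path)
    if ".." in p.parts:
        # Reject traversal attempts such as /.helix/../index.json
        return False
    return p.is_absolute() and p.suffix == ".json" and ALLOWED_PREFIX in p.parents
```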