@rgraber rgraber commented Dec 1, 2025

🗒️ Checklist

  1. run linter locally
  2. update developer docs (API, README, inline, etc.), if any
  3. for user-facing doc changes create a Zulip thread at #Support Docs Updates, if any
  4. draft PR with a title <type>(<scope>)<!>: <title> DEV-1234
  5. assign yourself, tag PR: at least Front end and/or Back end or workflow
  6. fill in the template below and delete template comments
  7. review thyself: read the diff and repro the preview as written
  8. open PR & confirm that CI passes & request reviewers, if needed
  9. delete this section before merging

📣 Summary

TODO

📖 Description

TODO

👷 Description for instance maintainers

TODO

💭 Notes

TODO

👀 Preview steps

  1. ℹ️ have an account and a project
  2. do this
  3. do that
  4. 🔴 [on main] notice that this isn't anywhere
  5. 🟢 [on PR] notice that this is here
  6. do that another thing
  7. 🟒 notice that this changed like that

jnm and others added 30 commits April 9, 2024 11:22
…and with less tiptoeing around what's already there
previous work to `subsequences__old`
…2025

 # Conflicts:
 #	kobo/apps/subsequences__new/actions/manual_transcription.py
jnm and others added 30 commits November 5, 2025 12:58
action data within `_data` attribute for each version

Two tests are failing as they were already on 0a92a24:

    FAILED test_models.py::SubmissionSupplementTestCase::test_retrieve_data_from_migrated_data - KeyError: '_version'
    FAILED test_models.py::SubmissionSupplementTestCase::test_retrieve_data_with_stale_questions - AssertionError: assert {'group_name/question_name': {'manual_translation': {'en': {'_versions': [{'_uuid': '22b04ce8-61c2-4383-836f-5d5f0ad73645', 'value': 'berserk',...
)

### 📣 Summary
Fixes a CI installation issue caused by an incompatibility between `pip` 25.3 and `pip-tools` 7.x.
### Notes
Unit tests only. Skips tests that we eventually want to implement but
don't have implementations for yet. DRF failures are unrelated to PR.
The only substantive difference is in how we add supplements to
duplicated submissions. The old `update_submission_extras` method has
been removed so instead we just create a SubmissionSupplement object
with the correct data.

---------

Co-authored-by: John N. Milner <[email protected]>
…DEV-1229 (#6492)

### 💭 Notes
Add new QuestionAdvancedAction model and associated CRU endpoints. This
PR does not involve actually using the models, though it does include
audit logs for when users hit those endpoints. QuestionAdvancedAction
logs cannot be deleted.
Also includes the automatic migration of the `advanced_features` dict
into corresponding QuestionAdvancedAction objects. For now it does not
change the `advanced_features` dict since we are still using it, but
eventually it will be updated to signal that the data in it has already
been migrated and we should use the associated QuestionAdvancedAction
models instead.
The OpenAPI errors are pre-existing and will be dealt with at the branch
level sometime before merging the full project branch.

### 👀 Preview steps

1. ℹ️ have an account and a project with an audio question
2. POST to `/api/v2/assets/<asset_uid>/advanced-features/` the following
data:
```
{
     "action": "manual_transcription",
     "question_xpath": "<audio question xpath>",
     "params": [{"language": "en"}]
}
```
3. Navigate to `/api/v2/assets/<asset_uid>/advanced-features` in a
browser
4. 🟢 There should be one advanced feature in the list
5. Note the uuid of the action you just created
6. PATCH `/api/v2/assets/<asset_uid>/advanced-features/<action_uuid>/`
with `{"params": [{"language": "es"}]}`
7. Reload `/api/v2/assets/<asset_uid>/advanced-features`
8. 🟢 [on PR] notice that the params for the action now include both
English and Spanish
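
The request bodies used in the steps above can be sketched in Python as follows. This is only an illustration: the helper names `feature_payload` and `params_patch` are hypothetical, and `group/audio` is a made-up xpath standing in for the audio question.

```python
import json


def feature_payload(action, question_xpath, languages):
    """Build the POST body for the advanced-features endpoint (hypothetical helper)."""
    return {
        "action": action,
        "question_xpath": question_xpath,
        "params": [{"language": code} for code in languages],
    }


def params_patch(languages):
    """Build the PATCH body that sends additional languages for an existing action."""
    return {"params": [{"language": code} for code in languages]}


# Serialize both bodies so they can be passed to curl or an HTTP client
post_body = json.dumps(feature_payload("manual_transcription", "group/audio", ["en"]))
patch_body = json.dumps(params_patch(["es"]))
print(post_body)
print(patch_body)
```

Building the bodies with `json.dumps` avoids hand-written JSON mistakes such as a missing brace around a params entry.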
…6523)

### 📣 Summary
Add supplemental NLP columns to data table.

### 📖 Description
This is just for adding the columns to the data table. They may not be
populated correctly. If an NLP action is enabled, there will be a column
for it, even if there are presently no responses.

### 💭 Notes
Using `analysis_form_json` to avoid having to make changes on the frontend
even though it's not a very descriptive name. Does not add QA questions;
those will come later.
Removed some of the response fields from the old `analysis_form_json`
since they don't seem to be used. It's possible in QA or when dealing
with exports they will turn out to be used but we can always add them
back.

### 👀 Preview steps

1. ℹ️ have an account and a project with an audio question with at least
one response
2. In a Python shell, enable all NLP actions by running
```
# Replace <uid> and <xpath> with the asset uid and the audio question's xpath
asset = Asset.objects.get(uid='<uid>')
for action in [
    Action.MANUAL_TRANSLATION,
    Action.MANUAL_TRANSCRIPTION,
    Action.AUTOMATIC_GOOGLE_TRANSLATION,
    Action.AUTOMATIC_GOOGLE_TRANSCRIPTION,
]:
    # Transcription actions get English, translation actions get Spanish
    language = 'en' if 'transcription' in action else 'es'
    QuestionAdvancedFeature.objects.create(
        question_xpath='<xpath>',
        action=action,
        params=[{'language': language}],
        asset=asset,
    )
```
Note: this will enable English manual/automatic transcripts and Spanish
automatic/manual translations
3. Navigate to the data table
4. 🟢 [on PR] Notice there are columns for English transcript and Spanish
translation for the relevant question
### 📣 Summary
Ensure transcriptions and translations are displayed in the data table.


### 📖 Description
Only accepted transcriptions/translations will be displayed.


### 💭 Notes
There were several issues preventing transcriptions and translations
from showing up in the data table:
1. `retrieve_data` was not being called with `for_output=True`.
2. The method signature and the implementations of
`transform_data_for_output` did not correctly reflect how the method was
being called by `SubmissionSupplement.retrieve_data`.
3. Even when `for_output` was set, `SubmissionSupplement.retrieve_data` did
not output the data in the format expected by the frontend.

This PR addresses all of these issues. It uses the `_advanced_features`
field to determine the activated features because
QuestionAdvancedFeatures are not fully implemented yet.

Deferred for later:
- Fixing DRF
- Using QuestionAdvancedFeatures instead of `_advanced_features`
- Handling Qual actions


### 👀 Preview steps
Going through the preview is a little annoying because columns are
determined the new way (using QuestionAdvancedFeatures) but the data in
the rows is determined the old way (using Asset._advanced_features).
Also data can only be added by PATCH and not through the UI.

1. ℹ️ have an account and NLP set up
2. Create a new project with an audio question
3. Add a submission with an audio response
4. In a Django shell, enable NLP actions the new way by running
```
# Replace <uid> and <xpath> with the asset uid and the audio question's xpath
asset = Asset.objects.get(uid='<uid>')
for action in [
    Action.MANUAL_TRANSLATION,
    Action.MANUAL_TRANSCRIPTION,
    Action.AUTOMATIC_GOOGLE_TRANSLATION,
    Action.AUTOMATIC_GOOGLE_TRANSCRIPTION,
]:
    # Transcription actions get English, translation actions get Spanish
    language = 'en' if 'transcription' in action else 'es'
    QuestionAdvancedFeature.objects.create(
        question_xpath='<xpath>',
        action=action,
        params=[{'language': language}],
        asset=asset,
    )
```
5. Enable NLP actions the old way by running
```
# Replace <uid> and <xpath> with the asset uid and the audio question's xpath
asset = Asset.objects.get(uid='<uid>')
asset.advanced_features = {
    '_version': '20250820',
    '_actionConfigs': {
        '<xpath>': {
            'manual_transcription': [{'language': 'en'}],
            'manual_translation': [{'language': 'es'}],
            'automatic_google_transcription': [{'language': 'en'}],
            'automatic_google_translation': [{'language': 'es'}],
        }
    }
}
asset.save()
```
6. Using curl and your authorization token, PATCH the following JSONs to
`http://kf.kobo.local/api/v2/assets/<asset_uid>/data/<sub_uuid>/submission-supplement`.
In between each PATCH refresh the data table.
7. manual transcription: `'{"_version":"20250820", "<xpath>":
{"manual_transcription": {"language":"en", "value":"Hello"}}}'`
8. 🟒 [on PR] The transcription column should contain "Hello"
9. automatic transcription: ` '{"_version":"20250820", "<xpath>":
{"automatic_google_transcription": {"language":"en"}}}'`
10. 🟒 [on PR] The transcription column should contain "Hello"
11. accepting the automatic transcription: `'{"_version":"20250820",
"<xpath>": {"automatic_google_transcription": {"language":"en",
"accepted": true}}}'`
12. 🟒 [on PR] The transcription column should contain the automatically
generated transcription
13. automatic translation: `'{"_version":"20250820", "<xpath>":
{"automatic_google_translation": {"language":"es"}}}'`
14. 🟒 [on PR] The translation column should be empty
15. accepting the automatic translation: `'{"_version":"20250820",
"<xpath>": {"automatic_google_translation": {"language":"es",
"accepted": true}}}'`
16. 🟒 [on PR] The translation column should contain the automatic
translation
17. manual_translation: `'{"_version":"20250820", "<xpath>":
{"manual_translation": {"language":"es", "value":"Hola"}}}'`
18. 🟒 [on PR] The translation column should contain "Hola"
…aint` DEV-1432 (#6534)

### 💭 Notes

Will need to re-apply the 0005_questionadvancedfeature migration with
the new changes.
### 💭 Notes
Use the new QuestionAdvancedFeature model for revising/retrieving data
instead of the asset.advanced_features dict.

### 👀 Preview steps

1. ℹ️ have an account
2. Create a new project with an audio question
3. Add a submission
4. Enable transcriptions by running
```
curl -X POST -H 'Authorization: Token <your token>' http://kf.kobo.local/api/v2/assets/<asset_uid>/advanced-features/ --json '{"question_xpath":<audio_question_xpath>, "action": "manual_transcription", "params": [{"language": "en"}]}'
```
5. Add an English transcription by running
```
curl -X PATCH -H 'Authorization: Token <your token>' http://kf.kobo.local/api/v2/assets/<asset_uid>/data/<submission_uuid>/supplement/ --json '{"_version":"20250820", "<audio_question_xpath>": {"manual_transcription": {"language":"en", "value": "hello"}}}'
```
6. Navigate to the data table
7. 🟒 [on PR] The transcript for the submission should show up in the
table
…_for_output` for `QualAction` (#6504)

### 📣 Summary
Add implementation of `get_output_fields()` and
`transform_data_for_output()` in `QualAction`.

### 📖 Description
This update enables qualitative analysis results to appear correctly in
exports or the table view.
The new logic:
- Defines the output fields for each qualitative question (including
labels, types, and choices).
- Converts stored qualitative results into export-ready values,
including expanding choice UUIDs into readable label objects.
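
As a rough sketch of the second point, expanding stored choice UUIDs into label objects could look like the code below. The function name `expand_choices`, the shape of a choice (`uuid` plus `labels`), and the shape of the returned objects are all assumptions made for illustration; they are not taken from the `QualAction` code.

```python
def expand_choices(value, choices):
    """Replace stored choice UUIDs with {'uuid', 'labels'} objects (assumed shape)."""
    by_uuid = {choice["uuid"]: choice for choice in choices}

    def expand(uuid):
        choice = by_uuid.get(uuid)
        # Unknown UUIDs are kept with empty labels rather than dropped
        labels = choice["labels"] if choice else {}
        return {"uuid": uuid, "labels": labels}

    if isinstance(value, list):  # multiple-choice style answer
        return [expand(uuid) for uuid in value]
    return expand(value)  # single-choice style answer


choices = [
    {"uuid": "c1", "labels": {"_default": "Yes"}},
    {"uuid": "c2", "labels": {"_default": "No"}},
]
print(expand_choices("c1", choices))
print(expand_choices(["c2", "zz"], choices))
```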
### 💭 Notes
Migrate asset.advanced_features to asset.advanced_features_set when
someone hits an advanced-features endpoint or saves an existing asset.

Notable Decisions:
We will use `known_cols` only to determine which questions had NLP actions
performed. If any question had a transcript or a translation in any
language, we will enable all 4 NLP actions (manual transcript, automatic
transcript, manual translation, automatic translation) for it, using the
languages in `advanced_features` as params.
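
A minimal sketch of that decision, assuming the question's xpath and its configured languages have already been extracted: the function name `features_for_question` and the returned dict shape are hypothetical, and parsing `known_cols` itself is left out because its format is not described here.

```python
NLP_ACTIONS = (
    "manual_transcription",
    "automatic_google_transcription",
    "manual_translation",
    "automatic_google_translation",
)


def features_for_question(xpath, transcript_languages, translation_languages):
    """Return one feature config per NLP action for a question that had NLP data."""
    features = []
    for action in NLP_ACTIONS:
        # Transcription actions use the transcript languages, the rest use translations
        if "transcription" in action:
            languages = transcript_languages
        else:
            languages = translation_languages
        features.append(
            {
                "question_xpath": xpath,
                "action": action,
                "params": [{"language": code} for code in languages],
            }
        )
    return features


feats = features_for_question("group/audio", ["en"], ["es"])
for feat in feats:
    print(feat["action"], feat["params"])
```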

Removed the `set_version` method because it was causing circular imports
that were quite difficult to fix and it didn't seem worth it.

Note the data table will not load properly because this PR only migrates
`advanced_features` and not `SubmissionSupplements`.


### 👀 Preview steps


1. ℹ️ have an account
2. [on main] Create a new project with an audio question and at least
one submission
3. [on main] Add at least one transcription, one translation, and a QA
question
4. Switch to the PR branch
5. Navigate to `/api/v2/assets/<uid>/advanced-features`
6. 🟢 [on PR] notice there are configured advanced features for all NLP
actions (manual/automatic transcription/translation) and qual for the
relevant audio question
…ect params DEV-1441 (#6548)

### 📣 Summary
Validate `params` before creating new advanced features.

### 👀 Preview steps

1. ℹ️ have an account and a project
2. `curl -X POST -H 'Authorization: Token <your token>'
http://kf.kobo.local/api/v2/assets/<asset_uid>/advanced-features --json
'{"question_xpath": <xpath>, "action": "manual_transcription", "params":
[{"something":"bad"}]}'`
3. 🔴 [on refactor-subsequences-2025] the request returns a 500
4. 🟢 [on PR] the request returns a 400
…ental DEV-934 (#6422)

### 💭 Notes
Fill out the method for converting old SubmissionExtra content dicts to
the new format expected by SubmissionSupplemental for translations and
transcripts.
This code makes numerous assumptions to fill in information that is not
present in the old structure but required in the new:

1. If old[xpath]['transcript']['value'] ==
old[xpath]['googlets']['value'] and the language codes are the same, we
assume the most recent transcript was automatically generated
2. If, for any revision in old[xpath]['transcript']['revisions'],
revision['value'] is the same as old[xpath]['googlets']['value'] and the
language codes match, we assume that revision was automatically
generated. If multiple match, we assume they were all automatically
generated. This should be pretty rare but it's possible
3. 1-2 also apply to translations
4. old[xpath]['transcript']['dateModified'] will be assumed to be the
creation date of the most recent revision (ie whatever is in
old[xpath]['transcript']['value']). The same goes for translations
5. All uuids are newly generated
6. All old transcriptions/translations have status=complete with a
_dateAccepted of now() (whenever the code is running)
7. To determine the dependency of any old translation, whether automated
or manual:

* If we know the source language, look for the most recent transcript in
that language that was created before the translation.
* If there is none, take the most recent transcription in that language
* If there are no transcriptions in the source language, take the most
recent transcript
* If we don't know the source language, take the most recent transcript
that was created before the translation
* If there is none, take the most recent transcript

8. We can ignore any badly formatted revisions/transcripts/translations
9. Most recent revisions will be first in the version array
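
The dependency rules in item 7 can be sketched as follows. The function name `pick_source_transcript` and the simplified transcript shape (a dict with `language` and `created`) are assumptions for illustration; the real conversion works on the old and new supplement structures.

```python
from datetime import datetime


def pick_source_transcript(transcripts, translation_created, source_language=None):
    """Choose the transcript an old translation is assumed to depend on."""
    if not transcripts:
        return None
    newest_first = sorted(transcripts, key=lambda t: t["created"], reverse=True)
    if source_language is not None:
        in_language = [t for t in newest_first if t["language"] == source_language]
        if not in_language:
            # No transcript in the source language: take the most recent one overall
            return newest_first[0]
        candidates = in_language
    else:
        candidates = newest_first
    # Prefer the most recent candidate created before the translation
    for transcript in candidates:
        if transcript["created"] < translation_created:
            return transcript
    # Otherwise fall back to the most recent candidate
    return candidates[0]


t1 = {"language": "en", "created": datetime(2024, 1, 1)}
t2 = {"language": "en", "created": datetime(2024, 3, 1)}
t3 = {"language": "fr", "created": datetime(2024, 2, 1)}
print(pick_source_transcript([t1, t2, t3], datetime(2024, 2, 15), "en"))
```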
…` structure DEV-1443 (#6549)

### 📣 Summary
Restore background and NLP processing by reading values from the new
`_data` field.

### 📖 Description
This fix updates the background processing logic to support the new data
structure where value, language, and status (when present) are now
nested under a `_data` dictionary. Some automated NLP actions were
broken because they were still looking for these fields at the top
level, where they can no longer exist.
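
A small sketch of the difference, assuming a version is a dict with the action fields nested under `_data`: the function name `read_version_fields` and the sample version are hypothetical.

```python
def read_version_fields(version):
    """Read value, language and status from the nested `_data` dict."""
    data = version.get("_data") or {}
    # `status` is only present for some actions, so missing keys become None
    return {key: data.get(key) for key in ("value", "language", "status")}


version = {
    "_uuid": "22b04ce8-61c2-4383-836f-5d5f0ad73645",
    "_data": {"value": "hello", "language": "en"},
}
print(read_version_fields(version))
# Reading from the top level, as the broken code did, finds nothing
print(version.get("value"))
```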

### 👀 Preview steps

1. ℹ️ have an account and a project with an audio question
2. submit an audio file (longer than 2 minutes)
3. use the shell to process an automatic action (look at Linear for
snippet)
4. 🔴 [on main] notice that the background process never starts, and if
the user sends an acceptance, the external service is still called
5. 🟢 [on PR] notice that everything works as expected
### Summary 
This update improves the API documentation for subsequences so it more
accurately reflects the actual data returned by the API.
…ion result DEV-1442 (#6551)

### 📣 Summary
Prevent `_dateAccepted` from being added during deletion of an action
result.

### 📖 Description
This fix corrects the subsequences logic so that `_dateAccepted` is not
set when an action result is deleted. Previously, the deletion path
could incorrectly mark the result as accepted by adding `_dateAccepted`,
which conflicted with the intended semantics of a removal. The updated
behavior ensures that deletion strictly removes the result without
recording any acceptance metadata, keeping action histories consistent
and accurate.
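
The intended behavior can be sketched like this. Everything here is an assumption made for illustration: the function name `build_version`, the `_dateCreated` key, and in particular the idea that a deletion is represented by a `value` of `None`.

```python
import uuid
from datetime import datetime, timezone


def build_version(data, accepted, now=None):
    """Create a version entry, skipping `_dateAccepted` for deletions."""
    now = now or datetime.now(timezone.utc)
    version = {
        "_uuid": str(uuid.uuid4()),
        "_dateCreated": now.isoformat(),
        "_data": data,
    }
    # Assumption: a deletion is a version whose value is None
    is_deletion = data.get("value") is None
    if accepted and not is_deletion:
        version["_dateAccepted"] = now.isoformat()
    return version


kept = build_version({"language": "en", "value": "Hello"}, accepted=True)
deleted = build_version({"language": "en", "value": None}, accepted=True)
print("_dateAccepted" in kept, "_dateAccepted" in deleted)
```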


### Preview Steps

Use the snippet provided in the Linear task description.
Try it with `refactor-subsequences-2025` and see that `_dateAccepted` is
added to the version.
With this PR, it is not present.
You can try other actions (`manual_translation`, `automatic_*` and
`qual`) and get the same results.
### 💭 Notes
Automatically set `{"options": {"deleted": True}}` if a user attempts to
delete a QA question via a PATCH request.
As part of the `update_params` method of the `qual` action, we replace
the entire `options` dict within each question with whatever comes in
the request. This is because the only thing I know of in the `options`
dict is the `deleted` param, which we want to be able to change on the
front end. If it turns out we need the `options` dict for more, we may
have to write a more sophisticated way of merging old and new options.

In the future we may want to replace the options dict with just a
boolean for `hidden`, but this way minimizes the additional work the
frontend will have to do to work with the new API.
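
A sketch of the merge described above, under stated assumptions: the function name `merge_qual_params` is hypothetical, questions are matched by `uuid`, and the flag is written as `deleted` as in these notes. The actual `update_params` implementation may differ.

```python
def merge_qual_params(existing, incoming):
    """Merge incoming QA questions into existing ones without dropping any."""
    incoming_by_uuid = {question["uuid"]: question for question in incoming}
    merged = []
    for question in existing:
        update = incoming_by_uuid.pop(question["uuid"], None)
        if update is None:
            # Omitted from the request: keep the question, flagged as deleted
            merged.append({**question, "options": {"deleted": True}})
        else:
            # Present in the request: `options` is replaced wholesale
            merged.append({**question, **update, "options": update.get("options", {})})
    # Questions that did not exist before are appended as-is
    merged.extend(incoming_by_uuid.values())
    return merged


existing = [{"uuid": "q1", "type": "qualInteger", "labels": {"_default": "How many?"}}]
print(merge_qual_params(existing, []))
```

Replacing `options` wholesale keeps the merge simple, at the cost of losing any option the request does not repeat.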


### 👀 Preview steps

1. ℹ️ have an account and a project with an audio question and at least
one submission
2. Enable QA with `curl -X POST -H 'Authorization: Token <your token>'
http://kf.kobo.local/api/v2/assets/<asset_uid>/advanced-features/ --json
'{"question_xpath": <xpath>, "action": "qual", "params": [{"type":
"qualInteger", "label": {"_default": "How many?"},
"uuid":"11111111-1111-1111-1111-1111111111"}]}'`
3. Note the uid of the action that comes back in the response
4. Try to "remove" the question with `curl -X PATCH -H 'Authorization:
Token <your token>'
http://kf.kobo.local/api/v2/assets/<asset_uid>/advanced-features/<feature_uid>/
--json '{"params": []}'`
5. 🟒 [on PR] notice that the "How many?" question is still in the
response, with `{"options": {"hidden": True}}`

---------

Co-authored-by: Olivier Léger <[email protected]>
### 💭 Notes
Port selected tests over from `test_submission_extras_api_post.py` in
the old app. Tests were chosen for porting based on whether they applied
in the new format and whether they already had a reasonable equivalent,
which many did.
Also cleans up some old tests that were failing or had misleading
comments.
…#6547)

### 📣 Summary
Add OpenAPI schema for the `/api/v2/assets/{uid_asset}/advanced-features/` endpoint.

### 📖 Description
The API schema output files and the generated Orval types have been
updated with the schema details for the action parameters in the
`QuestionAdvancedFeature` model.
### 📣 Summary
Rename the "qual" action to "manual_qual".

### 💭 Notes
Preparing the way for automatic QA.


### 👀 Preview steps
Mostly regression. To fix any existing assets, in a shell, run
`QuestionAdvancedFeature.objects.filter(action='qual').update(action='manual_qual')`

1. ℹ️ have an account and a project with an audio question and at least
one submission.
2. Enable QA with a POST to
`api/v2/assets/<asset_uid>/advanced-features/` with
```
{
    "question_xpath": "<xpath>",
    "action": "manual_qual",
    "params": [{"uuid": "1111111", "type": "qualInteger", "labels": {"_default": "How many?"}}]
}
```
3. Add a QA answer with a PATCH to
`api/v2/assets/<asset_uid>/data/<submission_uuid>/supplemental/` with
```
{
    "_version": "20250820",
    "<xpath>": {
        "manual_qual": {
            "uuid": "1111111",
            "value": 1
        }
    }
}
```
4. Make sure the data table still loads
### 💭 Notes

* use `value` instead of `val` in qualitative analysis responses to
match rest of refactor
* update formpack requirement (see
kobotoolbox/formpack#337)
* match pre-refactor behavior by excluding qualitative analysis note
(`qualNote`) questions from export columns
* restore `name` from earlier refactor work in `get_output_fields()`
* correct order of fields in `analysis_form_json`
* restore `pack.extend_survey()` during exports, missing after evil
merge 👿

### 👀 Preview steps

1. Make sure your environment has been rebuilt or whatever's necessary
to pick up the new formpack requirement
2. Make a project with an audio question
3. Collect a submission
4. Add every kind of transcript, translation, and qualitative analysis
question (and response) to the audio response in that submission
* FYI I wrote notes in the description of DEV-1301 (internal
[link](https://linear.app/kobotoolbox/issue/DEV-1301/update-formpack-to-accept-new-shape-of-data))
about how to do this with `curl`
5. Run an XLSX export of the project
6. Make sure the export includes the transcript, translation, and
all qualitative analysis responses
7. Make sure the export columns are in the proper order
8. Make sure nothing appears that shouldn't

---------

Co-authored-by: rgraber <[email protected]>
### 📣 Summary
Updates the /data endpoint to return the answers to QA questions in a
different format.

### 💭 Notes
Updates the data endpoint to return a dict of question_id: answer
instead of a list that we then have to search by uuid. Keeps it
consistent with other subsequence actions and makes it a little quicker
to process. Requires a new version of formpack.
We don't include supplement stuff in the docs for the /data endpoint so
there's no OpenAPI docs to update.
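
The change in shape can be illustrated with the sketch below. The function name `answers_by_question` and the exact fields of an answer entry (`uuid`, `value`) are assumptions; the real `/data` payload may carry more fields per answer.

```python
def answers_by_question(answers):
    """Convert a list of QA answers into a dict keyed by question uuid."""
    return {answer["uuid"]: answer for answer in answers}


old_shape = [
    {"uuid": "q1", "value": 3},
    {"uuid": "q2", "value": "needs follow-up"},
]
new_shape = answers_by_question(old_shape)

# With the list, finding one answer means scanning every entry
from_list = next(a for a in old_shape if a["uuid"] == "q2")
# With the dict, it is a single lookup
from_dict = new_shape["q2"]
print(from_list == from_dict)
```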

### 👀 Preview steps
Regression only

1. ℹ️ have an account and a project with an audio question and at least
one submission
2. Add every type of QA question. You can either follow the directions
here: (internal
[link](https://linear.app/kobotoolbox/issue/DEV-1301/update-formpack-to-accept-new-shape-of-data))
to do this with curl, or switch back to `main` and do it via the UI,
then switch back to the PR branch and hit
`/api/v2/assets/<asset_uid>/advanced-features` to migrate them.
3. Navigate to the data table
4. 🟢 All QA answers should be present
5. Export the data
6. 🟢 All QA answers should be in the export