fix: allow headers to be capitalized TASK-1174 #332
base: main
I couldn't reproduce the problem with this form, locally or on kf.kobotoolbox.org, but I was able to get the Here's a basic reproduction case that works for me: |
Two observations:
@tiritea, I'm curious what you think about diabolical behavior such as this (in the survey sheet) that we begin to allow with case-insensitive column names:
https://getodk.org/xlsform/ doesn't complain at all, and spits out XForm containing:

<text id="/data/name:label">
  <value>What Is Your Name</value>
</text>

I don't really want to introduce a case-insensitive duplicate column detection mechanism right now, so I'm inclined to just ape the pyxform behavior and take the first column encountered, assuming that's what it's doing.
OK, final comment of the night from me, I think 😅 This PR aims to tolerate any kind of capitalization in untranslated If we want to be insensitive about case in column names, I think we need to standardize everything as lowercase except for language names. For example,
…when run through pyxform yields:

<translation lang="eNgLiSh (en)">
  <text id="/data/name:label">
    <value>What Is Your Name</value>
  </text>
</translation>
<translation lang="english (en)">
  <text id="/data/name:label">
    <value>what is your name</value>
  </text>
</translation>

The real use case is that people will capitalize their language names as they see fit, and we should not interfere with that. Everything else, though, should get standardized, e.g.

Oh, and one more thing: I don't see why there's not a

for column_name in uniq_cols.keys():
    if column_name in ['label', 'hint']:
        …
        # put a `continue` here?

I added one out of curiosity, and all the tests still pass 🤷 None of the other string comparisons or regex matches should evaluate as true for
I didn't see them referenced, but here are some related pyxform issues regarding case-(in)sensitivity:
My thoughts: 🤮 TBH I have no idea if there is prescribed behavior for this! It could well be that pyxform is taking the first, or it could just as easily be taking the last (but processing them in reverse order?), or... So I'm probably also inclined to just ape whatever pyxform seems to be doing. From the above links, it certainly appears they should be treated case-insensitively, but in saying that, pyxform is throwing an error if you have all-uppercase 'LABEL' (!?)
XLSForm Online sayeth: "Error: The survey element named 'name' has no label or hint."
FWIW, it looks like rather than simply picking up the first matching column, pyxform may scan all of them until it finds a suitable populated match, but then doesn't continue looking further (!?)
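To make the two candidate behaviors discussed above concrete, here is a minimal Python sketch. It is not pyxform's or formpack's code, and which behavior pyxform really implements is unconfirmed in this thread. The helper name `first_match` and its `require_value` flag are hypothetical.

```python
def first_match(row, name, require_value=False):
    """Return the value of the first column whose header equals `name`,
    ignoring case.

    Hypothetical helper for illustration only. With require_value=True,
    blank cells are skipped, mimicking the "scan until a populated match
    is found" behavior that was only guessed at in the comment above.
    """
    for key, value in row.items():
        if key.lower() != name:
            continue
        if require_value and (value is None or str(value).strip() == ''):
            continue
        return value
    return None


# A survey row with two columns that differ only by case.
row = {'Label': '', 'label': 'what is your name'}
first = first_match(row, 'label')
populated = first_match(row, 'label', require_value=True)
```

With this row, the "first encountered" rule yields the empty cell, while the "first populated" rule yields the lowercase column's value, which is why the two guesses are distinguishable in practice.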
Somewhat tangential question @jnm were you using kobo-compose or kobo-install? I did repro the issue on my machine using the form I attached (and was able to deploy successfully on the PR branch) and I wonder what the difference was.
To be honest, I was using my own weird thing this time, but I tried to control for that by testing the example file on production (kf.kobotoolbox.org).
Is the solution to the immediate problem here to convert all column headers to lowercase unless they contain "::" before processing? It doesn't answer the question of which label column we use but it wouldn't make things worse, presumably.
Unfortunately not; the logic is more complex. Some portions of column headers containing
Yes, I'm happy to punt on the duplicate-column question and agree that what we do here won't make that situation worse.
If I am reading this code correctly, we look for "label" -> untranslated, It will be a bit of a pain, but we should be able to use these same regexes to pre-process columns and put everything that isn't a language into lowercase. I'm not yet 100% sure where the best place to do that is.

ETA: part of me thinks the place to do this pre-processing is actually in kpi where we first read/load the spreadsheet but I don't love the idea of kpi having to know the internals of formpack. It's possible we can just put a 'standardize column name' method in formpack and have kpi call it though, which keeps everything nice and contained.

EATA: or we could put this pre-processing at the beginning of
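As an illustration of the "standardize column name" idea, here is a rough Python sketch. It is not the implementation in this PR: the function name `standardize_column_name`, the `TRANSLATABLE_PREFIXES` list, and the assumption that the language is whatever follows a translatable prefix plus `::` are all hypothetical, and formpack's real regexes are not reproduced here.

```python
# Hypothetical list of header prefixes that may be followed by a language
# name. The real set used by formpack may differ.
TRANSLATABLE_PREFIXES = (
    'label',
    'hint',
    'constraint_message',
    'required_message',
    'media::image',
    'media::audio',
    'media::video',
)


def standardize_column_name(header):
    """Lowercase a column header, leaving any language name untouched."""
    header = header.strip()
    lowered = header.lower()
    # Try longer prefixes first so 'media::image::' is matched as a whole.
    for prefix in sorted(TRANSLATABLE_PREFIXES, key=len, reverse=True):
        marker = prefix + '::'
        if lowered.startswith(marker):
            # Keep the language portion exactly as the user typed it.
            return marker + header[len(marker):]
    # No language part recognized: lowercase the entire header.
    return lowered


examples = [
    standardize_column_name('Label'),
    standardize_column_name('LABEL::eNgLiSh (en)'),
    standardize_column_name('Media::Image::English (en)'),
    standardize_column_name('Bind::Relevant'),
]
```

Under these assumptions, a header such as `Bind::Relevant` is lowercased entirely because `bind` is not in the translatable list, which is one reason a blanket "skip anything containing `::`" rule would not be enough.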
@jnm ignoring the current very ugly state of the implementation, does this test look about right:
At the moment I'm calling it on every key in every row, which I don't love, but right now I just want it to work and will optimize later.
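One possible way to avoid normalizing every key in every row is to cache the result per distinct header, so each header string is processed once per sheet. The sketch below is only an assumption about how that could look. `standardize_rows` is a hypothetical name, `str.lower` stands in for whatever normalizer is actually used, and keeping the first value on a case-insensitive collision is an arbitrary choice, since the duplicate-column question was deliberately left open above.

```python
def standardize_rows(rows, normalize=str.lower):
    """Return copies of `rows` with normalized keys.

    Each distinct header is normalized only once and then looked up in
    `mapping`, instead of being recomputed for every row.
    """
    mapping = {}
    result = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if key not in mapping:
                mapping[key] = normalize(key)
            # Assumption: on a collision, keep the first column encountered.
            new_row.setdefault(mapping[key], value)
        result.append(new_row)
    return result


rows = [
    {'Type': 'text', 'Name': 'name', 'Label': 'What is your name'},
    {'Type': 'integer', 'Name': 'age', 'Label': 'How old are you'},
]
standardized = standardize_rows(rows)
```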
Summary
Allow capitalization of the 'label' and 'name' headers in an xls upload.
Notes
When an XLSForm is uploaded with capital letters in the label or name headers, interpret it the same as their lowercase counterparts. This fixes a ValueError on deploy, which was happening because the import didn't recognize the label column as one that doesn't expect translations.
Note: when switching between main and this branch in the formpack code, make sure to run pip-compile and pip install again:

pip-compile dependencies/pip/requirements.in && pip-compile dependencies/pip/dev_requirements.in
pip install -r dependencies/pip/dev_requirements.txt
SimpleForm-copy.xlsx
ValueError: "Label" column is not translated