Kp 11381 roihu #101

Open
mmatthiesencsc wants to merge 32 commits into master from KP-11381_Roihu

Collaborator

@mmatthiesencsc mmatthiesencsc commented Apr 21, 2026

This PR installs the Language Bank's software components on Roihu as agreed with U Helsinki.
Removes:

  • aalto-asr
  • bert_models
  • hfst-ospell
  • hunpos
  • kaldi
  • python (maintained elsewhere)
  • pytorch (maintained elsewhere)

Changes:

  • openSMILE: no ffmpeg integration, because it does not compile with the newest ffmpeg.

At the time of writing, Heliots does not yet work because Java is not yet available on Roihu.

@mmatthiesencsc mmatthiesencsc requested a review from aajarven April 21, 2026 04:23
Member

@aajarven aajarven left a comment

Nice work! I especially like the usage example in the udpipe module file.

In addition to the comments linked to specific lines of code, I noticed that we have some outdated paths in docs:

vagrant@kpdev:~/scratch/kielipankki-palvelut$ git grep "/ling/"
commandline/README.md:export PATH="/data/ling/hfst/3.16.0/bin:/data/ling/finnish-tagtools/1.6.0/bin:$PATH"
commandline/README.md:environment below /appl/soft/ling/.

The same file also has a list of installed tools that still lists things like ffmpeg.

Is the test_csc inventory still relevant? It still uses Puhti.

I think we also want to update this:

vagrant@kpdev:~/scratch/kielipankki-palvelut$ tail -n 8 README.md 

## Taito
Kielipankin työkalut Taito-laskentaympäristössä
- hfst
- hfst-morphologies
- hfst-ospell
- check-hfst
- finnish-parse (Turku Dependency Parser)

I haven't run the Ansible playbook yet, but it might be smart to do that afterwards to verify that the internal docs and the scripts are in sync and functional.

Comment thread commandline/roles/hfst/tasks/main.yml
Comment thread commandline/roles/hfst/templates/module_template.j2
Comment thread commandline/roles/trankit/tasks/main.yml Outdated
Member

@aajarven aajarven left a comment

The fixes look good! I did the wrapper playbook stuff on top of this branch and tested running it, and I ran into quite a few permission issues, because it is not possible to chmod another user's files. I assume these are pre-existing conditions, but I think it would be good to fix them.

hfst

We have issues with our makefiles trying to change permissions on files that might have been created by another user's Ansible run, causing the task to fail:

[jarvenp2@roihu-cpu-inst hfst-english-installable]$ make install prefix=/appl/soft/manual/kielipankki/x86_64/hfst/3.16.0
...
chmod 0755 /appl/soft/manual/kielipankki/x86_64/hfst/3.16.0/bin/english-analyze-words
chmod: changing permissions of '/appl/soft/manual/kielipankki/x86_64/hfst/3.16.0/bin/english-analyze-words': Operation not permitted
make: *** [Makefile:18: install] Error 1
[jarvenp2@roihu-cpu-inst hfst-english-installable]$ ls -ls /appl/soft/manual/kielipankki/x86_64/hfst/3.16.0/bin/english-analyze-words
4 -rwxrwxr-x. 1 matthies r_installation_kielipankki 684 Apr 26 14:46 /appl/soft/manual/kielipankki/x86_64/hfst/3.16.0/bin/english-analyze-words

Should we either check the files for the correct permissions and only attempt to chmod if they are incorrect, or delete the old files (because everyone in the group does have the right to do that) before running make install, to ensure that the chmod is run against one's own files? The former would be the neater solution, I think, because it won't leave the users without binaries if the play fails midway.
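
A minimal sketch of the first option, assuming GNU stat is available on the install node. The ensure_mode helper is hypothetical and not part of the current makefiles; it compares the octal mode on disk with the wanted one and calls chmod only when they differ. Note that in the listing above the file is 775 while the makefile requests 0755, so the requested mode would also have to match what is on disk before the chmod could be skipped.

```shell
# Hypothetical helper, not part of the existing makefiles.
# Usage: ensure_mode MODE FILE
# Runs chmod only when FILE's current mode differs from MODE, so files that
# another group member already installed with the right mode are left alone.
ensure_mode() {
    em_want="$1"
    em_file="$2"
    # GNU stat prints the permission bits in octal, e.g. 755.
    em_have="$(stat -c '%a' "$em_file")" || return 1
    # A leading 0 makes the shell read both values as octal,
    # so "0755" and "755" compare equal.
    if [ "$((0$em_have))" -ne "$((0$em_want))" ]; then
        chmod "$em_want" "$em_file"
    fi
}
```

A makefile install rule could then call ensure_mode 0775 on each installed file in place of an unconditional chmod, with the mode chosen to match what the group actually needs.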

praat

Similarly, praat modulefile installation fails because permissions cannot be set due to ownership. This is an Ansible task though, so I assume that this would not happen if the permissions were already what Ansible wants them to be, but they aren't:

- name: Install modulefile                                                                              
  template:
    src: module_template.j2
    dest: "{{ module_path }}/{{ version }}.lua"
    mode: 0644

whereas in reality the mode is 664 (which I assume is correct):

[jarvenp2@roihu-cpu-inst ~]$ ls -ls /appl/modulefiles/manual/kielipankki/x86_64/praat
total 4
4 -rw-rw-r--. 1 matthies r_installation_kielipankki 363 Apr 13 15:25 6.4.lua

udpipe

Udpipe is an even more curious case, as its module isn't owned by our installation project at all:

[jarvenp2@roihu-cpu-inst ~]$ ls -ls /appl/modulefiles/manual/kielipankki/x86_64/udpipe/1.4.0.lua
4 -rw-r--r--. 1 matthies pepr_matthies 481 Apr 15 11:27 /appl/modulefiles/manual/kielipankki/x86_64/udpipe/1.4.0.lua

enchant

There's also a permission problem with kp-spell in the enchant role, as the directory is not writable for the group:

[jarvenp2@roihu-cpu-inst ~]$ ls -ls /appl/soft/manual/kielipankki/x86_64/kp-spell
total 4
4 drwxr-sr-x. 5 matthies r_installation_kielipankki 4096 Apr 15 14:34 1.0

The output from Ansible also suggests that the /bin/bash -ic thingy might not be working as nicely as first thought, given that there's a lot of stuff like "Inappropriate ioctl for device" in stderr:

TASK [enchant : Make tykky wrappers] *******************************************************************
fatal: [roihu-cpu-inst.csc.fi]: FAILED! => {"changed": true, "cmd": "/bin/bash -ic \"module load tykky && \\\nwrap-container -w /usr/bin/enchant-2 kp-spell.sif --prefix /appl/soft/manual/kielipankki/x86_64/kp-spell/1.0\"\n", "delta": "0:00:00.357263", "end": "2026-04-26 15:29:17.423577", "msg": "non-zero return code", "rc": 1, "start": "2026-04-26 15:29:17.066314", "stderr": "bash: cannot set terminal process group (1698762): Inappropriate ioctl for device\nbash: no job control in this shell\ntput: No value for $TERM and no -T specified\ntput: No value for $TERM and no -T specified", "stderr_lines": ["bash: cannot set terminal process group (1698762): Inappropriate ioctl for device", "bash: no job control in this shell", "tput: No value for $TERM and no -T specified", "tput: No value for $TERM and no -T specified"], "stdout": "[ INFO ] Constructing configuration \n[ INFO ] Using /tmp/jarvenp2/cw-M8S9ZJ as temporary directory \n[ ERROR ] Installation dir /appl/soft/manual/kielipankki/x86_64/kp-spell/1.0 is not writable \n[ ERROR ] Set CW_DEBUG_KEEP_FILES env variable to keep build files or set CW_LOG_LEVEL to a higher value for more output, e.g. CW_LOG_LEVEL=3 ", "stdout_lines": ["[ INFO ] Constructing configuration ", "[ INFO ] Using /tmp/jarvenp2/cw-M8S9ZJ as temporary directory ", "[ ERROR ] Installation dir /appl/soft/manual/kielipankki/x86_64/kp-spell/1.0 is not writable ", "[ ERROR ] Set CW_DEBUG_KEEP_FILES env variable to keep build files or set CW_LOG_LEVEL to a higher value for more output, e.g. CW_LOG_LEVEL=3 "]}

opensmile

And opensmile has permission problems too:

TASK [opensmile : Copy executable to final location] ***************************************************
fatal: [roihu-cpu-inst.csc.fi]: FAILED! => {"changed": false, "msg": "Destination /appl/soft/manual/kielipankki/x86_64/openSMILE/3.0.2 not writable"}
[jarvenp2@roihu-cpu-inst ~]$ ls -ls /appl/soft/manual/kielipankki/x86_64/openSMILE
total 4
4 drwxr-sr-x. 4 matthies r_installation_kielipankki 4096 Apr 20 16:20 3.0.2

trankit

TASK [trankit : Install and wrap container with tykky] *************************************************
fatal: [roihu-cpu-inst.csc.fi]: FAILED! => {"changed": true, "cmd": "bash -i -c 'module load tykky; wrap-container -w /usr/bin/python /local_scratch/jarvenp2/build/trankit/trankit-1.1.0/trankit.sif --prefix /appl/soft/manual/kielipankki/x86_64/trankit/1.1.0'", "delta": "0:00:00.403062", "end": "2026-04-26 15:49:56.039073", "msg": "non-zero return code", "rc": 1, "start": "2026-04-26 15:49:55.636011", "stderr": "bash: cannot set terminal process group (1705030): Inappropriate ioctl for device\nbash: no job control in this shell\ntput: No value for $TERM and no -T specified\ntput: No value for $TERM and no -T specified", "stderr_lines": ["bash: cannot set terminal process group (1705030): Inappropriate ioctl for device", "bash: no job control in this shell", "tput: No value for $TERM and no -T specified", "tput: No value for $TERM and no -T specified"], "stdout": "[ INFO ] Constructing configuration \n[ INFO ] Using /tmp/jarvenp2/cw-5WHMJR as temporary directory \n[ ERROR ] Installation dir /appl/soft/manual/kielipankki/x86_64/trankit/1.1.0 is not writable \n[ ERROR ] Set CW_DEBUG_KEEP_FILES env variable to keep build files or set CW_LOG_LEVEL to a higher value for more output, e.g. CW_LOG_LEVEL=3 ", "stdout_lines": ["[ INFO ] Constructing configuration ", "[ INFO ] Using /tmp/jarvenp2/cw-5WHMJR as temporary directory ", "[ ERROR ] Installation dir /appl/soft/manual/kielipankki/x86_64/trankit/1.1.0 is not writable ", "[ ERROR ] Set CW_DEBUG_KEEP_FILES env variable to keep build files or set CW_LOG_LEVEL to a higher value for more output, e.g. CW_LOG_LEVEL=3 "]}
[jarvenp2@roihu-cpu-inst ~]$ ls -ls  /appl/soft/manual/kielipankki/x86_64/trankit
total 4
4 drwxr-sr-x. 6 matthies r_installation_kielipankki 4096 Apr 15 14:54 1.1.0

Comment thread commandline/README.md Outdated
Comment thread commandline/README.md

* hfst
* hfst-morphologies
* check-hfst
Member

Does it make sense to list this as something that is installed? It definitely shows up in the output of egrep -o "tags: [^ ]+ " site.yml |sed 's/tags:/ \*/', but maybe we should drop it from this list?
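
To illustrate why check-hfst ends up in the list, here is the one-liner applied to a made-up excerpt of site.yml (the real file is not shown in this thread, so the role entries below are assumptions). grep -E stands in for egrep, which recent GNU grep releases flag as obsolescent. Every role that carries a tag is printed, whether it installs something or only runs tests.

```shell
# Hypothetical excerpt of site.yml, for illustration only.
site_yml="$(mktemp)"
cat > "$site_yml" <<'EOF'
- hosts: all
  roles:
    - { role: hfst, tags: hfst }
    - { role: check-hfst, tags: check-hfst }
EOF

# Same pipeline as the one-liner quoted above, wrapped in a function.
list_tags() {
    grep -Eo "tags: [^ ]+ " "$1" | sed 's/tags:/ \*/'
}

list_tags "$site_yml"
```

Keeping test-only roles out of the README list would therefore need either a manual edit after running the one-liner or a different marker than the tag.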

Collaborator Author

Yes, all these tools are meant to be installed. The one-liner is meant to make it easy to update this list.

Member

But is check-hfst really a tool to be installed? It looks like the role is just running tests.

file:
  path: "{{ item }}"
  group: "{{ shared_group }}"
  mode: "g+rwX,o+rX,o-w"
Member

Is there a reason we set it this way, instead of doing the ordinary 0755? Given that these are both directories, the X (as opposed to x) shouldn't make a difference, I suppose.
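
For comparison, a small sketch of what the symbolic mode does to a fresh directory, assuming GNU stat and a scratch directory from mktemp (both illustrative, not part of the role). Starting from 0700, the result is 775 rather than 755, because g+rwX also grants group write; the owner bits are not touched by the symbolic form.

```shell
# Scratch directory for illustration only.
demo_dir="$(mktemp -d)"
chmod 0700 "$demo_dir"

# Same symbolic mode as in the task above.
chmod g+rwX,o+rX,o-w "$demo_dir"
symbolic_result="$(stat -c '%a' "$demo_dir")"

# The absolute mode suggested in the question.
chmod 0755 "$demo_dir"
absolute_result="$(stat -c '%a' "$demo_dir")"

echo "symbolic: $symbolic_result, absolute: $absolute_result"
rmdir "$demo_dir"
```

If other group members are supposed to write into these directories, 0775 rather than 0755 would be the closer absolute equivalent.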

file:
  path: "{{ item }}"
  group: "{{ shared_group }}"
  mode: "g+rwX,o+rX,o-w"
Member

The same question about 755 applies here too.

Co-authored-by: Anni Järvenpää <anni.jarvenpaa@csc.fi>