Long running SFTP transfers with :ssh_sftp.read/4 fail #8724

nsweeting · 2024-08-15T12:48:59Z

Describe the bug
Long running SFTP transfers using :ssh_sftp.read/4 seem to consistently fail at some point. The failure takes the form of :ssh_sftp.read/4 getting stuck: the call uses an :infinity timeout and never returns. The overall task wrapping the transfer eventually times out after x minutes of no data movement.

To Reproduce
Unfortunately this is a bit difficult. We were more or less able to reproduce it - it just takes a long time. Essentially, run a long running SFTP transfer using :ssh_sftp.read/4 with a throttled download speed (500-600 kb/s range). After about 6-7GB of transfer, the read function seems to get "stuck" with no data movement.
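
For readers trying to reproduce this, the following is a minimal sketch of the kind of read loop described above. It is not the reporter's actual code: the module name `SftpReadSketch`, the chunk size, and the two-minute timeout are all assumptions made for illustration. It passes a finite timeout to `:ssh_sftp.read/4` instead of `:infinity`, so a stalled read would be expected to surface as `{:error, :timeout}` rather than hang. The read function is injectable so the loop can be exercised without a real SFTP server.

```elixir
defmodule SftpReadSketch do
  # Hypothetical helper for illustration only; names and values are assumptions.
  @chunk_size 32_768
  @read_timeout :timer.minutes(2)

  # Reads from an open SFTP file handle until :eof, writing each chunk to
  # io_device. read_fun defaults to :ssh_sftp.read/4 and can be replaced
  # with a fake in tests.
  def stream_to_file(channel, handle, io_device, read_fun \\ &:ssh_sftp.read/4) do
    do_stream(channel, handle, io_device, read_fun, 0)
  end

  defp do_stream(channel, handle, io_device, read_fun, bytes_so_far) do
    case read_fun.(channel, handle, @chunk_size, @read_timeout) do
      {:ok, data} ->
        :ok = IO.binwrite(io_device, data)
        do_stream(channel, handle, io_device, read_fun, bytes_so_far + byte_size(data))

      :eof ->
        {:ok, bytes_so_far}

      # Includes {:error, :timeout} when a single read stalls past the timeout.
      {:error, reason} ->
        {:error, reason, bytes_so_far}
    end
  end
end
```

Returning the byte count alongside the error makes it possible to log how far a transfer got before stalling, which is the figure (6-7GB) quoted in this report.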

Expected behavior
Long running SFTP transfers using :ssh_sftp.read/4 should complete.

Affected versions
erlang-26.2.5.1

Additional context
We run a service that is responsible for moving data from some SFTP location to our internal network. We move thousands of files a day. In this specific context - these servers are hosted by Salesforce. Download speeds are typically throttled to be in the 500-600 kb/s range. We can have many of these transfers running at the same time for the same server. We normally have no issue.

We recently upgraded the base Docker image we use from hexpm/elixir:1.16.0-erlang-26.2.1-alpine-3.18.4 to hexpm/elixir:1.17.1-erlang-26.2.5.1-alpine-3.20.1. After this upgrade we had consistent failures for long running transfers. These were files in the 15GB range, and they seemed to consistently fail in the 6-7GB range. We had days of these kinds of failures accumulate - so it actually seemed to be fairly reproducible - although it takes a long time! As soon as we switched back to hexpm/elixir:1.16.0-erlang-26.2.1-alpine-3.18.4 - all transfer jobs succeeded. Shorter transfers seem to have no issue.

It's difficult to know specifically whether this is an issue introduced by the OTP upgrade - but at this point - it seems related. There were a couple of updates to the :ssh module within this upgrade range.

IngelaAndin · 2024-08-16T09:37:06Z

Spontaneously, this sounds like an SSH window adjustment problem, similar to the one fixed for a different scenario described in #7483. Our SSH expert is on vacation right now, but he will be back soon and will look into this.

nsweeting added the bug Issue is reported as a bug label Aug 15, 2024

IngelaAndin added the team:PS Assigned to OTP team PS label Aug 15, 2024

IngelaAndin assigned u3s Aug 16, 2024

IngelaAndin added the priority:medium label Sep 10, 2024
