This repository has been archived by the owner on Jan 6, 2023. It is now read-only.

Aeron Reliability #890

Open
@jgerman

Description

We're having trouble with Aeron exceptions in the Onyx client. They are most often Client Conductor Timeouts, though occasionally we see other Aeron-related exceptions. These exceptions kill the job one to four times per day (it's a long-running job).

We can't seem to make these exceptions go away. GC does not appear to be an issue, nor do we see CPU usage spikes (our systems are running in GKE). Increasing CPU limits doesn't appear to help. The threads just don't seem to be woken up in time to conduct their checks.

I'm pretty much stuck at this point, trying various fixes while preparing fallback plans that don't involve Onyx. Any help pointing me in the right direction would be appreciated.
