Stroom/Proxy should accept all data from a trusted proxy #4783

@at055612

Currently, if a client sends data to a proxy and the proxy responds with a 200, the proxy may subsequently be unable to send it downstream (e.g. Stroom rejects it because the feed status has changed). At the moment, the proxy will keep trying to send until it exceeds the retry count, then it will move the data to the error dir.

Once we have accepted data from an end client, we can't just leave it in an error dir on the proxy. Stroom should accept all data from a trusted proxy. There are a number of things we need to do to make this happen.

  • The /datafeed endpoint needs to establish if the caller is a trusted proxy.
    It will do this by making a request to the downstream proxy/Stroom (in a similar way to how the feed status check works), i.e. it will send the API key from the client's request to the downstream. If the downstream is another proxy, that proxy will relay the request further downstream. If the downstream is Stroom, it will check that the stroom_user associated with the API key holds the Stroom Proxy app permission and return true/false. All proxies will cache the outcome for a configurable period, e.g. 10 minutes.

  • Change receipt handling to not reject data from a trusted proxy.
    On data receipt, the feed status check should take into account whether the client is a trusted proxy. If the feed status is RECEIVE or DROP, data is received/dropped regardless. If the feed status is REJECT, the outcome depends on whether the client is a trusted proxy: if it is not, a reject response will be returned; if it is, the data will be accepted and treated like RECEIVE.

  • Add auto creation of dead letter feeds for rejected data.
    If unwanted/unknown data is accepted from a trusted proxy, we need to create/ensure a dead letter feed for it to go in. E.g. if the feed name is RUSTY_BADGER then Stroom should create a feed called RUSTY_BADGER-DLQ, i.e. <feed name>-DLQ. This feed should default to UTF-8 and Raw Events. The admin can then decide what to do with the unwanted data.

  • Create a new feed status resource that returns a list of all feeds and their feed statuses, with the feed names hashed.
    On receipt, the feed status check can just hash the feed name and look it up in the local list of hashed feeds to determine the status.
    This list should be held in memory and written to disk (so it can be read on boot). It should be refreshed on boot and then every 10 minutes or so.
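
The receipt rules described above could be sketched roughly as follows. This is an illustration only, not Stroom's implementation: the names `TrustedProxyCache`, `decide_receipt`, `hash_feed_name` and `dead_letter_feed_name` are hypothetical, SHA-256 is an assumed hash (the issue does not name one), and treating a feed that is missing from the hashed list as REJECT is also an assumption.

```python
import hashlib
import time
from enum import Enum


class FeedStatus(Enum):
    RECEIVE = "RECEIVE"
    REJECT = "REJECT"
    DROP = "DROP"


class Outcome(Enum):
    ACCEPT = "ACCEPT"
    DROP = "DROP"
    REJECT = "REJECT"


class TrustedProxyCache:
    """Caches the downstream 'is this API key a trusted proxy?' answer."""

    def __init__(self, check_downstream, ttl_seconds=600.0, clock=time.monotonic):
        # check_downstream is a hypothetical callable standing in for the
        # request to the downstream proxy/Stroom; it returns True or False.
        self._check_downstream = check_downstream
        self._ttl_seconds = ttl_seconds  # configurable period, e.g. 10 minutes
        self._clock = clock
        self._entries = {}  # api_key -> (time cached, trusted flag)

    def is_trusted(self, api_key):
        now = self._clock()
        cached = self._entries.get(api_key)
        if cached is not None and now - cached[0] < self._ttl_seconds:
            return cached[1]
        trusted = bool(self._check_downstream(api_key))
        self._entries[api_key] = (now, trusted)
        return trusted


def hash_feed_name(feed_name):
    # Assumption: SHA-256 hex digest of the UTF-8 encoded feed name.
    return hashlib.sha256(feed_name.encode("utf-8")).hexdigest()


def dead_letter_feed_name(feed_name):
    # <feed name>-DLQ, e.g. RUSTY_BADGER -> RUSTY_BADGER-DLQ
    return f"{feed_name}-DLQ"


def decide_receipt(feed_name, hashed_statuses, is_trusted_proxy):
    """Return (outcome, target feed name or None) for incoming data."""
    # Assumption: a feed absent from the hashed list is treated as REJECT.
    status = hashed_statuses.get(hash_feed_name(feed_name), FeedStatus.REJECT)
    if status is FeedStatus.RECEIVE:
        return Outcome.ACCEPT, feed_name
    if status is FeedStatus.DROP:
        return Outcome.DROP, None
    # REJECT: only a trusted proxy gets its data accepted, into the DLQ feed.
    if is_trusted_proxy:
        return Outcome.ACCEPT, dead_letter_feed_name(feed_name)
    return Outcome.REJECT, None
```

With a hashed list that marks RUSTY_BADGER as REJECT, `decide_receipt("RUSTY_BADGER", statuses, True)` yields an accept into `RUSTY_BADGER-DLQ`, while the same call with `False` yields a reject. Passing the clock in as a parameter is just a convenience so that cache expiry can be exercised without waiting.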

Labels: enhancement, f:proxy