Skip to content

Support large args #1385

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

jakajancar
Copy link
Contributor

Closes #1313

This PR:

  • Limits the argument length to 100 when rendered in logs and in Django Admin
  • Adds a full-page args view to Django Admin

Before:

Screenshot 2025-05-08 at 09 11 58 Screenshot 2025-05-08 at 09 11 59
  • list view: the status and summary columns are pushed off-screen
  • list view: the row can get very tall, depending on the JSON passed to the argument
  • list view: if you have a 10MB arg that will be a very large HTML page
  • detail view: the title can get arbitrarily large

After:

Screenshot 2025-05-08 at 09 11 35 Screenshot 2025-05-08 at 09 11 37 Screenshot 2025-05-08 at 09 11 38

@ewjoachim This is slightly different than your original suggestion in #1313. I noticed that Job does not currently reference App and it seemed ugly to make it aware just for logging customization. Do we really need this to be customizable? Happy to still do it if you want.

Successful PR Checklist:

  • Tests
    • Not really tested now. Happy to add if we think something is "public API"
  • Documentation
    • The exact syntax of the log name is undocumented now, and I think this is best. Structured logs should be used, not parsing of call_string.

@jakajancar jakajancar requested a review from a team as a code owner May 8, 2025 16:28
@jakajancar jakajancar force-pushed the support-large-args branch from 9b9b346 to e03034c Compare May 8, 2025 17:49
Copy link

github-actions bot commented May 8, 2025

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  procrastinate
  jobs.py
  utils.py 343
Project Total  

This report was generated by python-coverage-comment-action

@jakajancar jakajancar marked this pull request as draft May 8, 2025 18:53
Copy link
Member

@ewjoachim ewjoachim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd be reassured with some tests. Doesn't have to be 100% coverage but what can be tested should. Please favor unit tests to integration tests where possible.

I'm not sure the Django admin is tested, so it's fine to leave that out (and you made screenshots)

I've left a few comments below :)

Nice work though ! Please be welcome in the joyful assembly of Procrastinate contributors. Feel free to share feedback on the contribution process (now or when it's merged).

EDIT: Ah, you just switched to draft :) Well, you have a few ideas for the rest then :)

@@ -329,3 +329,17 @@ def queues_display(queues: Iterable[str] | None) -> str:
return f"queues {', '.join(queues)}"
else:
return "all queues"


def ellipsize_middle(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this function, and it seems like the perfect candidate for a parametrize set of unit tests :)

In particular, I like that this implementation works well around 100 chars (if we look at what happens char by char, there's no moment where we add 1 char and we end up with ellipsis and with the whole string being less than 100 chars). That said, I felt the need to test it to be 100% sure of that, so it would be awesome if the tests were included :D

I specifically liked a lot that all the important configuration is passed as parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know, I'm also annoyed by the counterproductive ellipses in some implementations, hehe. Will add tests.

pretty_json = pretty_json[:2000] + "..."
arg_rows = [
format_html(
"<tr><td>{key}</td><td>{value}</td></tr>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go this route, we should at least wrap keys and values in <code>, but it that much better than json ?

json is easily copyable.
yaml could be a good middle ground, maybe ? It's readable (but it needs one more dep :( )

format_html(
"<tr><td>{key}</td><td>{value}</td></tr>",
key=key,
value=ellipsize_middle(value),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should json.dump() the value though. Given we know it's json, I guess the only difference will be boolean capitalization and null/None. But here in the django admin, it's not about being Python anymore, and json types are more "neutral".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also, even if we don't json.dump(), your IDE should probably complain: value has no reason to be a string, it might be any json type. I guess PyRight should complain in the CI, if it doesn't I'll be somewhat disappointed at another Microsoft product :D (said the guy on GitHub, using VSCode )

Copy link
Contributor Author

@jakajancar jakajancar May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, this was an omission, I didn't even notice I don't json.dumps it.

I guess the problem is Django, instance.args is not correctly typed and appears as Unknown, so then type checking assumed that I knew what I was doing, which was a mistake :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No idea how it even worked before, as shown in your screenshot...)

key=key,
value=ellipsize_middle(value),
)
for key, value in instance.args.items()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about calls with thousands of short args ? Shouldn't we limit that too ? Maybe more than 30 args don't make a lot of sense to display ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me kind of contrived that you would define a function with thousands of parameters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that could be:

@app.task
def x(**kwargs):
    ...

Then it's only up to how big the dict is.

Comment on lines +140 to +141
f"{key}={ellipsize_middle(repr(value))}"
for key, value in self.task_kwargs.items()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you left out the part where we allow people to override this. It might be ok in this PR, but I'd like to make sure we're fine with this before a release. Doesn't have to be something gigantic, but this may be ruining someone's workflow with no hopes of improvement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I just saw your comment in the PR body ! I'll try to think about it. It's ok to leave it out of the PR for now if it's complicated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's easier if it can be configured in the tasks, then ?

Copy link
Contributor Author

@jakajancar jakajancar May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually made me think... Perhaps a better solution to all of this would be to have a way to simply exclude an argument from display (call_string, log context, and Django admin), for whatever reason (security, size).

This seems much simpler than all of this.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, it seemed much simpler but:

  • We can define on Task.unlogged_args, but you can't get from Job -> Task easily
  • Seems wrong to persist fixed argument names into Job/database
  • Also seems fairly "dangerous": you forget to filter in one place (call string, log context, admin, shell) and you're printing what you should not be.
  • Perhaps splitting task_kwargs into task_kwargs+task_unlogged_kwargs but that seems like an overkill too.

@jakajancar
Copy link
Contributor Author

@ewjoachim I'm tried to add some test that checks that log context does not grow infinitely with args size and am noticing some things:

  • First types.JSONValue is kinda misleading, as it has dict[str, t.Any] and list[t.Any], not dict[str, JSONValue] and list[JSONValue]. Not sure if it's intentional that JSON-ness is only guaranteed at the first level, or a limitation of the type system?
  • log_context returns (as is indicated in type) non-JSON data. Is this expected?
  • Either way, it's not enough that we only remove the mega-arg from call_string, must also do something with it in the log_context. What are your thoughts?

@ewjoachim
Copy link
Member

ewjoachim commented May 8, 2025

First types.JSONValue is kinda misleading, as it has dict[str, t.Any] and list[t.Any], not dict[str, JSONValue] and list[JSONValue]. Not sure if it's intentional that JSON-ness is only guaranteed at the first level, or a limitation of the type system?

At the time this was written, recursive types weren't supported.
See python/typing#182

But it's not the case anymore so feel free to fix the types (preferably in another PR, I suspect the CI will be complaining here and there).

log_context returns (as is indicated in type) non-JSON data. Is this expected?

You mean scheduled_at ? No, you're probably right, the typing is probably wrong. I think it should be a json, so we should probably isoformat() that datetime. Or do you have something else in mind ?

Either way, it's not enough that we only remove the mega-arg from call_string, must also do something with it in the log_context. What are your thoughts?

That's something that comes up a lot: I'm a maintainer of procrastinate (and the developer of a large part of it), but it's been ages since I last was a user of Procrastinate in any real life setting, so the points users like you bring up are usually kind of my best bet to understand and be aware of pain points. All in all, I would say the point you bring up seems very reasonable, and we should probably be mindful about the size of the logs.

That said, I think it would make sense that the complete args of each task is logged at least once when it's deferred and once when it's executed (probably upon completion or at the start of the task). That said, I'm ok if that's only in DEBUG logs.

Maybe something like: if we detect that the info logs we emit are truncated, we emit a debug log with the full value (once when deferred, once when executed) (trying to think if it's more useful to log before completion so if the computation crashes the whole process, we can figure out what exactly the task was executing, or after completion so we have the associated result. I think before ?) ?

@jakajancar
Copy link
Contributor Author

jakajancar commented May 8, 2025

That's something that comes up a lot: I'm a maintainer of procrastinate (and the developer of a large part of it), but it's been ages since I last was a user of Procrastinate in any real life setting

Yikes 😅

log_context returns (as is indicated in type) non-JSON data. Is this expected?

You mean scheduled_at ? No, you're probably right, the typing is probably wrong. I think it should be a json, so we should probably isoformat() that datetime. Or do you have something else in mind ?

test_nominal demonstrates that in task_kwargs you can get arbitrary objects (in this case Param).

At some point arbitrary types need to get converted into JSON for storage in the database. I guess it's unclear to me if task_kwargs is supposed to contain the original types or JSON-ified types.

  • If original:
    • Type should not be JSONDict (in the recursive type sense) but dict[str, Any]
    • call_string needs to use repr(value)
    • Log context needs to either:
      • do nothing, which means they cannot be limited in length (and note that by default structlog will just drop entire kwargs in such case as unloggable), or
      • need to contain repr(value)
      • the conversion to JSON needs to happen when creating the context
  • If JSON:
    • Type can remain JSONDict
    • call_string and log context have some flexibility in how to serialize or (potentially even recursively) ellipsize the data

@ewjoachim
Copy link
Member

ewjoachim commented May 10, 2025

This somewhat perfectly captures my opinion of 2025 so far :)

Screen.Recording.2025-05-10.at.13.34.14.mov

Joke apart, let me know if you need any help.

@jakajancar
Copy link
Contributor Author

@ewjoachim Yes, now you understand the problem this PR is solving :)

Tbh I have kind of given up on both approaches:

  • automatic limit for all args, because of mis-typing/unclearly defined types as described above,
  • unlogged_args since it’s inpossible to get Task from Job.

I will most likely go with an extra table to hold the large data and only reference it’s PK in the task, then delete the row at the end of the job.

@ewjoachim
Copy link
Member

I understand you have to fix pressing issues, and I respect that.

That said, I still think it makes sense, either for you or someone else, in due time, to continue working on this so that at some point, it stops being an issue for other people ?

I'm not saying you have to, but first and foremost thank you for trying, and then: maybe someone will want to pick this work up where it was left :) (or maybe you'll want to, though no pressure)

@ewjoachim ewjoachim mentioned this pull request May 10, 2025
10 tasks
@ewjoachim
Copy link
Member

ewjoachim commented May 11, 2025

Ok so: I've started working on this to get a better hang on the remaining issues:

  • Recursive JSON has been implemented, though indeed, accurate types mean we'll have to change the types of task kwargs to say it might be any object. That means anytime we need to json encode it, we'll need access to the json_dumps function that was provided to the connector.
  • There are a few shenanigans to bridge the gap between Job and Task. But the reason for this is that when we defer a job, or use the shell, we may not have the code of the task. It's perfectly possible that the worker is the only part of our system that knows about the details of a task. That's not necessarily an issue, we can do it on best effort, but that's why there's no direct link between job and task.

What smells bad for me is that if we json dumps all args so that we can truncate them to length, then arguments that were json encoded into a string (so mostly strings, but possibly other things too depending on the json_dumps function) become '"value"' with 2 sets of quote, and that looks bad.

I could check if after json_dumps, the string starts and ends with " and remove those, but I'm not a huge fan of this solution. I fear there's a bad design choice here that's going to backfire at some point.

Also, but that's kinda logical with the choices so far: if we truncate objects, then in the logs, all objects will become strings, so instead of logging {"task_kwargs": {"a": ["b"]} we'll log {"task_kwargs": {"a": "[\"b\"]"} and that's much less readable.

All in all, there's at least a decision to make:

  • A/ Do nothing, and let this length problem be
  • B/ Find something smart to do about it, which we haven't found yet
  • C/ Fix it in a consistent way that will make things more ugly for all users
  • D/ Fix it so that we don't change anything for users with only short parameters, and accept that it will sometimes be inconsistent at least by default, e.g. if a user has a job for which the arguments are sometimes shorter, sometimes longer than the default threshold. For those users, offer the customization hooks

I vote for B, so I welcome smartness from everyone around :D

@medihack
Copy link
Member

medihack commented May 11, 2025

  • A/ Do nothing, and let this length problem be
  • B/ Find something smart to do about it, which we haven't found yet
  • C/ Fix it in a consistent way that will make things more ugly for all users
  • D/ Fix it so that we don't change anything for users with only short parameters, and accept that it will sometimes be inconsistent at least by default, e.g. if a user has a job for which the arguments are sometimes shorter, sometimes longer than the default threshold. For those users, offer the customization hooks

I don't like D) because of the inconsistencies.
For A) it seems we are already too deep in it 😉 ... I would say it is the last resort (and we are not there yet).
Regarding C), I do not care that much about the ugliness of log outputs. Logs always look a bit ugly to me, and it's more important that the information I need is there. That said, I still would prefer that large args are (somehow configurable) shortened.
B), it is always good to come up with a better solution 😉 (but sometimes there isn't one).

So, to be more concrete, what would I prefer:

  • By default, have large args in the admin and the log output shortened.
  • Have a way in the admin interface to view the full args.
  • Some global configuration or environment variable to show the full args in the log output (turned off by default).
  • Use the same json_dumps encoder set in the connector to show the args in the admin and log output.
  • If no special encoder is used (relates to most users, I guess), we can try to prettify the output somehow (recursively go through all values).
  • If a custom encoder is set, just shorten the resulting args string as we can't know what the resulting format is (this must then be documented).

Not sure if this all makes sense, as you are much deeper into it (never used or touched the custom encoder stuff).

@ewjoachim
Copy link
Member

If a custom encoder is set, just shorten the resulting args string as we can't know what the resulting format is (this must then be documented).

I'm not sure this will be helpful if you have a list with 10000 short strings :(

@jakajancar
Copy link
Contributor Author

jakajancar commented May 12, 2025

For A) it seems we are already too deep in it 😉

🤣

Ignoring the details of the shortening and the presentation of the admin page for a second, what would make most sense to me is to convert the kwargs to JSON early, so that job.task_kwargs and consequently call_string, log_context, and anything else can rely on that invariant. This to me makes things conceptually clean, as job.task_kwargs will always contain the same type, both when initially created as well as when round-tripped through the database.

The conceptual unclarity and the amount of heart surgery was too much for me to undertake, but if you decide to do this change, I'm happy to pick up the trimming itself again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Admin page support for large args
3 participants