Skip to content

Failing test for ContentDispositionParser #850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kjg
Copy link
Contributor

@kjg kjg commented Jan 27, 2015

I looked into the parsers/ragel folder and I have no idea what I'm looking at there. Can you give me some hints as to what I should be changing to make this work?

@your
Copy link

your commented Jan 27, 2015

Hi,

I looked into lib/mail/parsers/ragel/common.rl and I think you should work on the section where it says:

# RFC2183: content_disposition
# TODO: recognize filename, size, creation date, etc.

and implement the missing grammar right from rfc2183.

Anyway, If you look closely at rfc2183 (https://tools.ietf.org/html/rfc2183) section 2.4:

The creation-date parameter MAY be used to indicate the date at which
the file was created. If this parameter is included, the paramter
value MUST be a quoted-string which contains a representation of the
creation date of the file in [RFC 822] `date-time' format.

Being still a 'proposed standard' I wouldn't be surprised if you found it unquoted, though..

@kjg kjg force-pushed the fix_content_disposition_unquoted_date_parsing branch from b4b84f4 to 38c85b1 Compare January 28, 2015 15:12
@kjg
Copy link
Contributor Author

kjg commented Jan 28, 2015

Yes, I am definitely receiving emails without these values quoted. Is it better to implement this according to the proposed spec, or according to what emails are actually out in the wild?

@jeremy
Copy link
Collaborator

jeremy commented Sep 30, 2015

Robustness principle applies here: be liberal in what we accept, but be conservative in what we send. We should make a good effort to parse what's in the wild.

@kjg
Copy link
Contributor Author

kjg commented Sep 30, 2015

I've dug around the ragel stuff in the code base now a few times, and I can't quite rap my head around it. Any tips on how to modify the ragel parser to handle this new case?

@jeremy
Copy link
Collaborator

jeremy commented May 15, 2017

The Content-Disposition parser (lib/mail/parsers/content_disposition_parser.rl) is declared as:

  include rfc2183_content_disposition "rfc2183_content_disposition.rl";
  main := disposition;

Where disposition (lib/mail/parsers/rfc2183_content_disposition.rl) is declared as:

  include rfc2045_content_type "rfc2045_content_type.rl";

  disposition_type = 'inline'i | 'attachment'i | extension_token;
  disposition_parm = parameter;
  disposition = (disposition_type >disp_type_s %disp_type_e)
                (";" disposition_parm)*;

So disposition is parsed as disposition_type followed by zero or more ;disposition_parm, where disposition_parm is parameter (lib/mail/parsers/rfc2045_content_type.rl), declared as:

  include rfc5322_lexical_tokens "rfc5322_lexical_tokens.rl"; 

  token = 0x21..0x27 | 0x2a..0x2b | 0x2c..0x2e | 0x30..0x39 | 0x41..0x5a | 0x5e..0x7e;
  value = (quoted_string | (token -- '"' | 0x3d)+) >param_val_s %param_val_e;
  attribute = (token+) >param_attr_s %param_attr_e;
  parameter = CFWS? attribute "=" value CFWS?;

A parameter is optional folding whitespace containing comments followed by an ASCII name, "=", and an attribute value that's either a quoted string or one or more valid token chars, except = (0x3d).

See Section 2.5.4 Strong Difference in the Ragel Guide for more on operators like --: https://www.colm.net/files/ragel/ragel-guide-6.10.pdf

To support unquoted parameter values, we'd need to update this grammar to allow spaces, commas, and colons in parameter values. This would likely break other legit headers, though.

Maybe there's another route, like using a special-case fallback parser that scans for ; to split params then naively splits name/value on =.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants