ParameterList fails to parse filename from Content-Disposition header encoded in UTF-8 with Q encoding #687

Open
Dr4K4n opened this issue May 10, 2023 · 1 comment
Labels
invalid This doesn't seem right question Further information is requested

Comments

Dr4K4n commented May 10, 2023

Describe the bug
We are using the mail-api to parse incoming emails (MimeMessages). We received a particular email with a PDF attachment.
The filename of this attachment is encoded in the Content-Disposition header in a "weird" way.
This leads to the following exception:

Stack trace:

 Caused by: jakarta.mail.internet.ParseException: In parameter list <;
   filename==?utf-8?Q?XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX--111111111-XXXXXXXXXXXXXXXXXXX?=
   =?utf-8?Q?XXXXXXXXXXXXXXXXXXX=2Epdf?=;
   filename*0*=utf-8''XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX--111111111-XXXXXXXXXXX;
   filename*1*=XXXXXXXXXXXXXXXXXXXXXXXXXXX.pdf>, expected parameter value, got "="
     at jakarta.mail.internet.ParameterList.<init>(ParameterList.java:273)
     at jakarta.mail.internet.ContentDisposition.<init>(ContentDisposition.java:86)
     at jakarta.mail.internet.MimeBodyPart.getDisposition(MimeBodyPart.java:1239)
     at jakarta.mail.internet.MimeBodyPart.getDisposition(MimeBodyPart.java:327)

After some searching, I found that the header value uses "Q encoding" (https://en.wikipedia.org/wiki/MIME#Difference_between_Q-encoding_and_quoted-printable), which is defined in RFC 2047 (https://www.ietf.org/rfc/rfc2047.txt).
The method jakarta.mail.internet.MimeUtility.decodeText(String etext) is actually able to decode such strings.
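
To illustrate what decoding such a value involves, here is a minimal standalone sketch of RFC 2047 Q-decoding using only the JDK. It is not how MimeUtility.decodeText is implemented; the class name QEncodedWordSketch and its methods are hypothetical, and the sketch assumes well-formed input (valid hex after "=", a charset the JVM knows, no multibyte character split across encoded words).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Jakarta Mail.
class QEncodedWordSketch {

    // Matches one encoded word of the form =?charset?Q?encoded-text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Qq]\\?([^?]*)\\?=");

    // Decodes the encoded-text part of a single Q-encoded word.
    static String decodeWord(String charset, String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                out.write(' ');                      // "_" stands for a space
            } else if (c == '=' && i + 2 < text.length()) {
                // "=XX" is one byte written as two hex digits
                out.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);                        // plain ASCII character
            }
        }
        return new String(out.toByteArray(), Charset.forName(charset));
    }

    // Decodes every Q-encoded word in a value. Whitespace that only
    // separates two adjacent encoded words is dropped (RFC 2047, section 6.2).
    static String decode(String value) {
        Matcher m = ENCODED_WORD.matcher(value);
        StringBuilder sb = new StringBuilder();
        int last = 0;
        boolean previousWasEncoded = false;
        while (m.find()) {
            String gap = value.substring(last, m.start());
            if (!(previousWasEncoded && gap.trim().isEmpty())) {
                sb.append(gap);
            }
            sb.append(decodeWord(m.group(1), m.group(2)));
            last = m.end();
            previousWasEncoded = true;
        }
        sb.append(value.substring(last));
        return sb.toString();
    }
}
```

Calling QEncodedWordSketch.decode on a value shaped like the filename in the stack trace (two encoded words separated by a folded line break) joins both parts into a single string and turns =2E back into a dot.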

To Reproduce
See test cases in attached pull request #688

Expected behavior
See test cases in attached pull request #688

Environment:

  • Version: 2.0.1

lukasj (Contributor) commented Jan 6, 2024

According to RFC 2231, which updates RFC 2047 and explicitly allows an encoding definition in the Content-Disposition header, the client must be told which encoding a parameter value uses by means of a * in the parameter name (see the definitions of extended-parameter and extended-initial-name in the grammar). So in this particular case, the header should contain:

filename*==?utf-8?Q?XX...

Also, would it be possible to share the full header definition for which it fails for you? Thanks.
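
For reference, the filename*0* and filename*1* parameters visible in the stack trace use the RFC 2231 extended form: the first segment carries charset'language' followed by percent-encoded text, and later segments continue that text. The following is a minimal JDK-only sketch of how such segments might be reassembled. It is not Jakarta Mail's ParameterList logic; the class name Rfc2231Sketch and its methods are hypothetical, and it assumes every segment is an extended (trailing *) segment, already split out and in order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.List;

// Hypothetical helper for illustration only; not part of Jakarta Mail.
class Rfc2231Sketch {

    // segments: the values of name*0*, name*1*, ... in order.
    // The first one must look like charset'language'percent-encoded-text.
    static String decodeExtended(List<String> segments) {
        String first = segments.get(0);
        int q1 = first.indexOf('\'');
        int q2 = first.indexOf('\'', q1 + 1);
        if (q1 < 0 || q2 < 0) {
            throw new IllegalArgumentException("missing charset'language' prefix");
        }
        Charset charset = Charset.forName(first.substring(0, q1));

        // Collect raw bytes across all segments before decoding, so a
        // multibyte character split between segments is still decoded intact.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        percentDecode(first.substring(q2 + 1), out);
        for (int i = 1; i < segments.size(); i++) {
            percentDecode(segments.get(i), out);
        }
        return new String(out.toByteArray(), charset);
    }

    // Writes the bytes of s to out, turning each %XX into a single byte.
    private static void percentDecode(String s, ByteArrayOutputStream out) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
    }
}
```

With segments shaped like the ones in the report, for example List.of("utf-8''report--1-", "final.pdf"), decodeExtended returns the concatenated filename.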

@jmehrens jmehrens added invalid This doesn't seem right question Further information is requested labels Jan 26, 2024