Error Invalid archive extract_report() when using a path as input_ #475

Open
kawaegle opened this issue Feb 22, 2024 · 1 comment

kawaegle commented Feb 22, 2024

I ran into a strange error after updating the parsedmarc library to 8.7.0.

file_path = "/tmp/xml_path.xml"
parse_aggregate_report_file(file_path, offline=True, ip_db_path=None)

raises:

Invalid archive file: Not a valid zip, gzip, json, or xml file

After investigating, it seems to come from this line:

file_object = BytesIO()

Here file_object is not None, so the check that decides whether to open the path never succeeds:
if file_object is None:

A simple hack is to set file_object to None just before the pass.

As a test, this is the fix I made for my project (as a patch):

--- __init__.py 2024-02-22 16:58:11.395563315 +0100
+++ init_fix.py 2024-02-22 16:53:27.555565639 +0100
@@ -566,6 +566,7 @@
             try:
                 file_object = BytesIO(b64decode(input_))
             except binascii.Error:
+                file_object = None
                 pass
             if file_object is None:
                 file_object = open(input_, "rb")
@@ -588,6 +589,7 @@
             report = file_object.read().decode(errors='ignore')
         else:
             file_object.close()
+            print(f"header: {header} XML: {MAGIC_XML}")
             raise ParserError("Not a valid zip, gzip, json, or xml file")

         file_object.close()

Hope it helps.

kawaegle changed the title from "Error extract report using a str path" to "Error Invalid archive extract_report() when using a path as input_" on Feb 22, 2024
kawaegle (Author) commented

As proof:

I added print(f"==={file_object}===") to check whether file_object is really None, and I get:
===<_io.BytesIO object at 0x7efccc7a7560>===
So yes, the mistake really is just that the variable file_object is created as a BytesIO.
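
For anyone who wants to reproduce the pattern outside the library, here is a minimal sketch of the behaviour described in this thread. It is not the parsedmarc source: the function names `open_input_buggy` and `open_input_fixed` are hypothetical, and the placement of the initial `file_object = BytesIO()` is an assumption based on the report above. The second function mirrors the first hunk of the patch.

```python
import binascii
from base64 import b64decode
from io import BytesIO


def open_input_buggy(input_: str):
    """Hypothetical stand-in for the str branch described in the issue."""
    # Assumption: file_object starts out as an empty BytesIO, not None.
    file_object = BytesIO()
    try:
        file_object = BytesIO(b64decode(input_))
    except binascii.Error:
        # input_ is not valid base64 (for example, a file path),
        # but file_object still holds the empty BytesIO from above.
        pass
    if file_object is None:  # never true here, so the path is never opened
        file_object = open(input_, "rb")
    return file_object


def open_input_fixed(input_: str):
    """Same logic with the one-line change from the patch applied."""
    file_object = BytesIO()
    try:
        file_object = BytesIO(b64decode(input_))
    except binascii.Error:
        # Reset to None so the fallback below treats input_ as a path.
        file_object = None
    if file_object is None:
        file_object = open(input_, "rb")
    return file_object
```

With a path that is not valid base64, the first function returns an empty buffer, so a later header check has nothing to match and would fail in the way reported. The second function falls through to `open()` and returns the file contents. Paths that happen to decode as base64 are not covered by this sketch.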
