PRC seems to ignore the "irods_default_hash_scheme" in the environment.json #610

Open
chStaiger opened this issue Aug 22, 2024 · 6 comments

Comments

@chStaiger

chStaiger commented Aug 22, 2024

While transferring data I noticed that the iRODS server uses different hash schemes for the checksums depending on the client I use.

In my irods_environment.json I set the checksum algorithm as below:

cstaiger@integration:~$ cat .irods/irods_environment.json | grep default_hash_scheme
    "irods_default_hash_scheme": "md5",

On the server sha256 is the default checksum algorithm.
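To confirm what a client would actually read from that file, the setting can be loaded directly. This is only a minimal sketch: the helper name `read_default_hash_scheme` is hypothetical and not part of the PRC, and lowercasing the value is an assumption made here for easier comparison.

```python
import json


def read_default_hash_scheme(env_path):
    """Return the lowercased 'irods_default_hash_scheme' value, or None if it is absent.

    Hypothetical helper for illustration; it only parses the JSON file and
    does not talk to an iRODS server.
    """
    with open(env_path, encoding="utf-8") as f:
        env = json.load(f)
    value = env.get("irods_default_hash_scheme")
    # Treat a missing or non-string value as "not configured".
    return value.lower() if isinstance(value, str) else None


if __name__ == "__main__":
    import os

    path = os.path.expanduser("~/.irods/irods_environment.json")
    if os.path.exists(path):
        print(read_default_hash_scheme(path))
```

With the environment file shown above, this would report `md5`, which is the value the icommands appear to honor.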

When I use the icommands to upload data, the data is checksummed with md5:

cstaiger@integration:~$ ils -L hello_iput.txt
  cstaiger     0 irodsResc          12 2024-08-22.05:40 & hello_iput.txt
    6f5902ac237024bdd0c176cb93063dc4    generic    /mnt/irods03/home/.../hello_iput.txt

When I transfer data with PRC v2.0.1, sha2 is used as the checksum algorithm:

>>> import irods.session
>>> sess = irods.session.iRODSSession(irods_env_file = ".irods/irods_environment.json")
>>> sess.data_objects.put("hello.txt", "/nluu12p/home/research-test-christine/hello_prc.txt", **{irods.keywords.REG_CHKSUM_KW: ""})
>>>
cstaiger@integration:~$ ils -L hello_prc.txt
  cstaiger      0 irodsResc           12 2024-08-22.05:48 & hello_prc.txt
    sha2:qUiQTy8PR5uPgZdpSzAYSw0u0cHNKh7A+4XSmaGSpEc=    generic    /mnt/irods03/Vault/home/../hello_prc.txt

Is there an extra parameter which I have to pass to the PRC to ensure that the data is checksummed by md5?

@alanking
Contributor

How did you upload the data for the iCommands example? I'm assuming you used iput, but it would be helpful to know which iCommand and options were used.

I see that REG_CHKSUM_KW is being used in the PRC put. I believe that this is equivalent to iput -k, which means...

 -k  checksum - calculate a checksum on the data server-side, and store
       it in the catalog.

That would mean that the checksum only needs to be calculated on the server side, and it would appear that it uses the hash scheme configured for that server.

What you're looking for, I think, is the equivalent of iput -K:

 -K  verify checksum - calculate and verify the checksum on the data, both
       client-side and server-side, and store it in the catalog.

This feature uses VERIFY_CHKSUM_KW to calculate the checksum on the client side, re-calculate it on the server side (using the same hash scheme as the client-side calculation), and then ensure that the two match.

You could try using VERIFY_CHKSUM_KW instead. However, DataObjectManager.put does not appear to implement the client-side checksum calculation the way iput does. My impression is that you can only register a checksum based on a server-side checksum calculation and there's no built-in way to verify the checksum against the local data.
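Until such a feature exists, one possible workaround is to compute the checksum of the local file yourself and compare it with the value stored in the catalog. The following is a minimal sketch using only the Python standard library. `irods_style_checksum` and `local_checksum` are hypothetical helper names, not PRC functions, and the output formats are assumed from the `ils -L` listings earlier in this thread (bare hex for md5, `sha2:` followed by base64 for sha256).

```python
import base64
import hashlib


def irods_style_checksum(stream, scheme="sha256", chunk_size=1024 * 1024):
    """Hash a binary stream and format the result like the ils -L output above.

    Assumption: md5 is shown as a bare hex digest and sha256 as
    'sha2:' + base64(digest), as seen in the listings in this thread.
    """
    scheme = scheme.lower()
    if scheme == "md5":
        hasher = hashlib.md5()
    elif scheme in ("sha256", "sha2"):
        hasher = hashlib.sha256()
    else:
        raise ValueError("unsupported hash scheme: %r" % scheme)

    # Read in chunks so large files do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)

    if scheme == "md5":
        return hasher.hexdigest()
    return "sha2:" + base64.b64encode(hasher.digest()).decode("ascii")


def local_checksum(path, scheme="sha256"):
    """Convenience wrapper that opens a local file in binary mode."""
    with open(path, "rb") as f:
        return irods_style_checksum(f, scheme)
```

The returned string can then be compared with the checksum that `ils -L` reports for the uploaded object, using whichever scheme the server applied.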

I'll mark this as a bug, but I view it more as a missing feature than as something not working. We can play with the labels. :)

@d-w-moore - Does that seem right? Am I missing something?

@chStaiger
Author

I am sorry, I forgot to copy that command over. Indeed I used:

iput -K hello.txt hello_iput.txt

And the version of the icommands is 4.3.1-0~bionic.

@trel
Member

trel commented Aug 22, 2024

In case this is news - there is a little section on checksums in the README...

https://github.com/irods/python-irodsclient?tab=readme-ov-file#computing-and-retrieving-checksums

@d-w-moore
Collaborator

@trel What's our milestone to be for this one?

@korydraughn
Contributor

Let's get the remaining issues for 2.1.1 resolved and handle this in 3.0.

korydraughn added this to the 3.0.0 milestone Sep 20, 2024

@trel
Member

trel commented Sep 20, 2024

Yep
