Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate list formatting utility functions. #541

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

AllSeeingEyeTolledEweSew
Copy link
Contributor

This separates the output-formatting parts of AbstractedFS.format_list() and AbstractedFS.format_mlsx() into utility functions, which makes it easier to create a filesystem object without inheriting from AbstractedFS.

My use case is that I'm creating a FTP server which accesses data from bittorrent, downloaded on-the-fly. I don't want there to be any way to access the server's filesystem via the FTP server, except via the bittorrent abstraction.

Currently to do this, I have two options:

  1. Build a filesystem object that inherits from AbstractedFS
  2. Build a filesystem object that inherits from nothing

(1) has a security disadvantage. Even if I override every method from AbstractedFS today, the authors may write a new method on AbstractedFS in the future whose default implementation accesses the filesystem, like all the other default implementations do today, and this may provide unintentional access to the filesystem.

(2) currently has an implementation disadvantage, due to the format_list() and format_mlsx() methods. They produce output that is carefully crafted for compatibility. Unfortunately their implementations are tightly coupled to being member methods of AbstractedFS, so if I do option (2) I can't reuse this code and must copy it. This means my code won't benefit from any future refinements made by pyftpdlib.

I understand the other disadvantage of (2) is that if pyftpdlib adds new required methods to AbstractedFS my app may break, but I prefer this to the security disadvantage of (1).

This change makes (2) more feasible by making the compatibility semantics of the format_*() methods reusable.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 0.0% when pulling 671fe52 on AllSeeingEyeTolledEweSew:asetes-list-util-functions into bdf50a2 on giampaolo:master.

@giampaolo
Copy link
Owner

giampaolo commented Sep 21, 2020

Mmm... I see where you're coming from. This must be evaluated carefully, also because I already have a refactoring in mind for those 2 methods, as I plan to introduce the usage of os.scandir() (as opposed to do os.stat() for every name returned by os.listdir()). Because of that, I prefer to keep your request pending for now, because os.scandir will likely introduce a breakage in terms of compatibility already, so I prefer to have to full picture before touching the filesystem part. Unfortunately I can't provide an ETA. :-(

@AllSeeingEyeTolledEweSew
Copy link
Contributor Author

Interesting, and makes sense.

Does this mean pyftpdlib will just report some defaults for nlinks, mtime, user and group? Those all still need a stat() I believe.

It should still be possible to negotiate a signature for these formatting functions though, with most parameters optional.

@giampaolo
Copy link
Owner

giampaolo commented Sep 22, 2020

Does this mean pyftpdlib will just report some defaults for nlinks, mtime, user and group? Those all still need a stat() I believe.

Not sure what you mean by “defaults”. From os.scandir() you can get the same info you get from os.stat(), so there should be no differences (except the code will be reorganized differently).

And yes, we can probably split that by introducing 2 new methods which “format” the individual files/names as you did here. You are providing them as module functions, but I think it makes more sense to have them as methods of the fs class instead.

@AllSeeingEyeTolledEweSew
Copy link
Contributor Author

And yes, we can probably split that by introducing 2 new methods which “format” the individual files/names as you did here. You are providing them as module functions, but I think it makes more sense to have them as methods of the fs class instead.

I'm probably missing the picture of how you intend to rewrite format_list/format_mlsx. Whatever you come up with, would you consider having functions like my format_list_line / format_mlsx_facts, that are not methods of AbstractedFS? The fact that this code is only available as methods is exactly what's causing my dilemma.

@giampaolo
Copy link
Owner

Looking back at your original message:

(1) has a security disadvantage. Even if I override every method from AbstractedFS today, the authors may write a new method on AbstractedFS in the future whose default implementation accesses the filesystem, like all the other default implementations do today, and this may provide unintentional access to the filesystem.

I see your point. The fact is that AbstractedFS mixes methods which access the fs (os.mkdir, etc.) and utilities (root, ftp2fs, fs2ftp, validpath, format_list, format_mlsx) and the handler class relies on both. In this sense, AbstractedFS was designed to be the "base class". I understand your concern but I'm not convinced exposing format_* methods as module functions is the right solution. To say one, with a function you don't have access to the self.cmd_channel object.
Perhaps a better solution would be to provide a brand new BaseFS class using abc module and you can inherit from that instead of AbstractedFS? I haven't been using abc in a long time though (do you?) so I'm not completely sure that would fit your use-case.

@AllSeeingEyeTolledEweSew
Copy link
Contributor Author

Perhaps a better solution would be to provide a brand new BaseFS class using abc module and you can inherit from that instead of AbstractedFS? I haven't been using abc in a long time though (do you?) so I'm not completely sure that would fit your use-case.

My goal is that I want a FS object that definitely doesn't have any code that accesses the real filesystem, for security.

I think the best way to do that, and to ensure it never accesses the filesystem if pyftpdlib code changes, is to write an FS class with no superclass. Like class VirtualFS(object):.

If I declare the class like class VirtualFS(AbstractedFS), I must override all methods of AbstractedFS that access the real filesystem. And even when I do, I run the risk that if you add new methods to AbstractedFS, it could create a backdoor in my FTP server to access the real filesystem.

So, this is the problem with having format_list be a method of AbstractedFS. It means I must inherit from it, or duplicate the formatting code of format_list and format_mlsx.

I don't think the use of abc makes a difference to me. I do want to implement every public method of AbstractedFS, and have my code fail if a pyftpdlib upgrade introduces a new required method. The only difference an abc.ABC makes is whether the latter case means a RuntimeError when constructing my fs object, versus a later AttributeError.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants