Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permit FFI files in subdirectories #3601

Draft
wants to merge 25 commits into
base: main
Choose a base branch
from

Conversation

PgBiel
Copy link
Contributor

@PgBiel PgBiel commented Sep 8, 2024

Closes #1562

Still WIP (finishing the details regarding the duplicate module check)

Changes:

  • FFI files in subdirectories are now copied. For JavaScript specifically, they can be used as @external(javascript, "./file.mjs", "function"). For Erlang, you'd use the module's name as usual.
  • An additional check was included to ensure no .erl files with the same name exist within the project, as Erlang cannot have duplicate modules.
  • Added support for directories to the in-memory filesystem. (Otherwise, the is_directory function would return true for files and read_dir would also list the directory itself, causing tests to enter infinite recursion and crash.)

Potential improvements:

  • I'd like to add explicit integration tests for using @external with relative paths in JS (I tested locally with .mjs with node.js, Deno and Bun and it worked, but could be worth it to test other potential edge cases). Let me know where would be the best spot in the codebase to do so.
  • Consider having an error when a native file would overlap with a .gleam file's generated code (.mjs or .erl).

Copy link
Member

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

You'll see there's integration tests in the test directory which can be updated with nested FFI files

compiler-core/src/build/native_file_copier.rs Outdated Show resolved Hide resolved
compiler-core/src/io/memory.rs Outdated Show resolved Hide resolved
compiler-core/src/io/memory.rs Outdated Show resolved Hide resolved
compiler-core/src/io/memory.rs Outdated Show resolved Hide resolved
@lpil lpil marked this pull request as draft September 10, 2024 15:22
@PgBiel PgBiel changed the title Permit JS FFI files in subdirectories Permit FFI files in subdirectories Sep 13, 2024
@PgBiel
Copy link
Contributor Author

PgBiel commented Sep 15, 2024

Problem: My handmade recursive directory walker is too naive and can enter infinite symlink loops. I'll probably add a new method to FileSystemReader where we can address this at the IO boundary, similarly to how it's done for gleam_source_files.

@PgBiel PgBiel marked this pull request as ready for review September 16, 2024 02:22
@PgBiel
Copy link
Contributor Author

PgBiel commented Sep 16, 2024

Alright, I've made the implementation more robust (it also now supports copying nested Erlang modules), added detection of conflicting .gleam and .mjs files (but not of conflicting .gleam and .erl files yet due to the issue outlined here: #1562 (comment) - we could leave this to a future PR), added detection of conflicting .erl files in separate subpaths, added more unit tests and added integration tests. 😄

@inoas
Copy link
Contributor

inoas commented Sep 16, 2024

Hello,

thanks for taking this on, this is lovely <3 for FFI interop!

In case this is desired and-or not yet considered:

Can we enforce some namespacing on Erlang, Elixir and JavaScript modules that maps to the directory structure and else throw a compiler error?

Edit:

... if possible, this might also make this check obsolete?

An additional check was included to ensure no .erl files with the same name exist within the project, as Erlang cannot have duplicate modules.

@PgBiel
Copy link
Contributor Author

PgBiel commented Sep 16, 2024

I'm not sure that would be desirable since the main point of FFI is to have full control over what the final transpiled code does. Besides, we would have to parse all Erlang files, though I guess we could do with some regex in this case. But I think it's more a matter of flexibility, and the error just tells you if you ever mess it up somehow.

The error is also not perfect since I guess it's theoretically possible for you to shadow some module from some other package. But it's an attempt.

@inoas
Copy link
Contributor

inoas commented Sep 16, 2024

I'm not sure that would be desirable since the main point of FFI is to have full control over what the final transpiled code does. Besides, we would have to parse all Erlang files, though I guess we could do with some regex in this case. But I think it's more a matter of flexibility, and the error just tells you if you ever mess it up somehow.

The error is also not perfect since I guess it's theoretically possible for you to shadow some module from some other package. But it's an attempt.

maybe it could warn.
I don't know... I think these FFI files will all be new and setting up strictness is nice there. You can still build any erlang/elixir library as a deprendency.

@PgBiel
Copy link
Contributor Author

PgBiel commented Sep 16, 2024

I think this is fine to be honest... Forcing people to name their modules like folder@folder2@something in every folder doesn't sound good or useful realistically. And it can clash either way.

Copy link
Member

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thank you.

It seems now that there's a concept of directories in the in memory file system and also the OS file system, and each is responsible for implementing traversal. I think it would make sense to remove this duplication and instead to have the trait know nothing about directory walking and then the walking logic depends on the trait and is shared between all implementations.

@lpil lpil marked this pull request as draft September 20, 2024 13:53
@PgBiel
Copy link
Contributor Author

PgBiel commented Sep 20, 2024

I think it would make sense to remove this duplication and instead to have the trait know nothing about directory walking and then the walking logic depends on the trait and is shared between all implementations.

If I understood correctly, you are proposing making a dir walking function which uses only the trait's read_dir function (and other trait functions if needed), right? Though I'm not sure if we would want to reinvent the wheel here, given we already have a package which implements graph traversal logic to avoid loops etc. What if we compromise and simply add a walk_dir function to the trait and use that where applicable?

Although I guess a re-implementation will be needed if we add symlinks to the in-memory FS down the road, so we could alternatively consider searching for existing platform agnostic implementations and using that, or even just re-implement while attempting to avoid reinventing the wheel as much as possible, e.g. by using special data structure crates if needed. Still, I'm not fully aware of all edge cases to know whether we would be able to match existing OS-specific implementations in terms of not only performance but also correctness.

@lpil
Copy link
Member

lpil commented Sep 21, 2024

Yes, that's right. We should not have two implementations of the same thing, and removing a dependency is beneficial for maintenance. Having one less dependency to verify for the EU Cyber Resilience Act is nice too!

I'm not worried about correctness seeing as this is a trivial algorithm, will be fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Permit FFI modules in subdirectories
3 participants