feat!: 1.0 release #43

Merged: avik-pal merged 22 commits into main from ap/1.0_final on Aug 29, 2024

Conversation

@avik-pal (Member) commented Jul 27, 2024

This changelog needs to be put in Lux. I will copy it over later.

Merge after #42.

Changelog

  • AbstractExplicit... names have been changed to AbstractLux....
  • Functors.jl and Setfield.jl are non-essential dependencies and have been moved into extensions.
  • Removes inputsize, since it wasn't being used anywhere.
  • Single-argument outputsize is no longer supported.
    • The fallback outputsize is now defined only once Lux.jl is loaded.
  • Fixes "unexpected parameter type for AbstractExplicitContainer with single trainable field" (Lux.jl#795).
    • AbstractLuxContainerLayer{(:a,)} now wraps the initialparameters and initialstates in a NamedTuple with a single field a (see the sketch below).
    • The previous behavior of an unwrapped NamedTuple is available via AbstractLuxWrapperLayer.
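
A minimal sketch of the new behavior, assuming the post-1.0 LuxCore API; the Scale, Wrapper, and Passthrough layers are made up for this example:

```julia
using LuxCore, Random

# A toy inner layer (hypothetical, defined only for this sketch).
struct Scale <: LuxCore.AbstractLuxLayer
    dim::Int
end
LuxCore.initialparameters(rng::AbstractRNG, l::Scale) = (; weight=randn(rng, l.dim))

# Container layer with a single trainable field `model`.
struct Wrapper{M} <: LuxCore.AbstractLuxContainerLayer{(:model,)}
    model::M
end

# Wrapper layer that forwards parameters/states to the `model` field unwrapped.
struct Passthrough{M} <: LuxCore.AbstractLuxWrapperLayer{:model}
    model::M
end

rng = Random.default_rng()
LuxCore.initialparameters(rng, Wrapper(Scale(3)))      # (; model = (; weight = ...))
LuxCore.initialparameters(rng, Passthrough(Scale(3)))  # (; weight = ...), the pre-1.0 style
```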

codecov bot commented Jul 27, 2024

Codecov Report

Attention: Patch coverage is 91.54930% with 6 lines in your changes missing coverage. Please review.

Project coverage is 88.33%. Comparing base (e735617) to head (11ec158).
Report is 22 commits behind head on main.

Files with missing lines                        Patch %   Lines
ext/LuxCoreArrayInterfaceTrackerExt.jl          0.00%     3 Missing ⚠️
ext/LuxCoreArrayInterfaceReverseDiffExt.jl      0.00%     2 Missing ⚠️
ext/LuxCoreChainRulesCoreExt.jl                 0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #43      +/-   ##
==========================================
+ Coverage   85.85%   88.33%   +2.47%     
==========================================
  Files           6        8       +2     
  Lines          99      120      +21     
==========================================
+ Hits           85      106      +21     
  Misses         14       14              


@avik-pal force-pushed the ap/1.0_final branch 5 times, most recently from c8a00ea to 76b17ae, on July 27, 2024 at 19:16
@avik-pal (Member, Author):

Rationale behind removing outputsize(layer)

Consider the following case. We define

  1. outputsize(l::Chain) = outputsize(l.last_layer)
  2. outputsize(l::MaxPool, x, rng) = <some calculation>

If the last layer of the Chain is a MaxPool, this is a clear error: LuxCore cannot tell that dispatching to the single-argument outputsize(layer) is the wrong codepath, because Chain does define it. We could query _return_type and check for Union{} to rule out this codepath, but I think that is overkill; we might as well ask users to define the 3-arg version of outputsize.

An easy workaround is to always propagate x and rng and forward them accordingly. Additionally, to be safe, container layers generally shouldn't override outputsize.
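
Here is a self-contained sketch of the failure mode and the workaround; Chain and MaxPool below are toy stand-ins defined locally, not the real Lux layers:

```julia
# Toy stand-ins, for illustration only.
struct MaxPool end
struct Chain{L}
    last_layer::L
end

outputsize(l::MaxPool, x, rng) = (size(x, 1) ÷ 2,)   # "<some calculation>"
outputsize(l::Chain) = outputsize(l.last_layer)      # 1-arg fallback on the container

# outputsize(Chain(MaxPool()))
# -> MethodError: MaxPool only defines the 3-arg method, but the container has
#    already committed to the 1-arg codepath.

# Workaround: always propagate `x` and `rng` through container layers.
outputsize(l::Chain, x, rng) = outputsize(l.last_layer, x, rng)
outputsize(Chain(MaxPool()), rand(4, 1), nothing)    # (2,)
```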

cc @ChrisRackauckas @SebastianM-C about this change. I think MTK NeuralNetworks is the only package using this outputsize, so I wanted you to be aware of this upcoming change.

@avik-pal force-pushed the ap/1.0_final branch 2 times, most recently from f8a0437 to 1ca2efe, on July 27, 2024 at 19:23
@avik-pal mentioned this pull request on Jul 27, 2024
@avik-pal changed the title from "getting ready for 1.0 release" to "feat!: 1.0 release" on Jul 27, 2024
@ChrisRackauckas:

So how would you recommend computing the output size cheaply?

@avik-pal (Member, Author):

> So how would you recommend computing the output size cheaply?

For Chain it is pretty much impossible without running the model, right? Flux does https://github.com/FluxML/Flux.jl/blob/master/src/outputsize.jl, which is kind of smart but also requires you to define it for pretty much every layer.

@ChrisRackauckas:

That's my point: don't we need this function on every layer as part of the interface?

@avik-pal (Member, Author):

Do you mean make it mandatory to define outputsize?

@ChrisRackauckas:

I don't see how else to support static analysis.

@SebastianM-C (Contributor):

Having a way to specify the layer size could also help for things like WrappedFunction, which can theoretically return whatever it wants. If we could specify the output size in the layer definition, that would make MTKNeuralNets compatible with more layer types.

I recall trying the 3-arg version to determine the outputsize dynamically, but it didn't play that nicely with symbolic tracing, from what I remember (I can double-check). In any case, a way for users to specify output sizes would be very helpful for MTKNeuralNets. I'm not sure what else would benefit from this.

@avik-pal (Member, Author) commented Aug 1, 2024

I think that is a plausible suggestion. We already reserve name, so I don't see why we can't also reserve output_size: if the struct contains an output_size field and that field is not nothing, then that is the output size; otherwise, fall back to "something".
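
A rough sketch of what that check could look like; output_size is not a field LuxCore currently reserves, and fallback_outputsize is a made-up placeholder for the "something" above:

```julia
fallback_outputsize(layer, x, rng) = size(x)   # placeholder, e.g. shape propagation

function outputsize(layer, x, rng)
    if hasfield(typeof(layer), :output_size) && getfield(layer, :output_size) !== nothing
        return getfield(layer, :output_size)   # a user-specified output size wins
    end
    return fallback_outputsize(layer, x, rng)
end
```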

> but it didn't play that nicely with symbolic tracing, from what I remember (I can double-check)

Do you need to run symbolic tracing? Something like the Flux version can be implemented with specific overloads for common Lux operations: essentially, take a symbolic array, convert it to a NilArray, and propagate the NilArray. That is extremely cheap because it doesn't do any real computation, and at the end you have the final shape.
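
A minimal sketch of the Nil idea, in the spirit of Flux's outputsize; this is not the actual NilArray implementation, and toy_dense is made up for illustration:

```julia
# `Nil` is a Real that swallows all arithmetic, so only shapes survive.
struct Nil <: Real end
Base.:+(::Nil, ::Nil) = Nil()
Base.:*(::Nil, ::Nil) = Nil()
Base.promote_rule(::Type{Nil}, ::Type{<:Real}) = Nil
Base.convert(::Type{Nil}, ::Real) = Nil()

# A toy "dense layer" y = W*x .+ b, written with explicit loops to stay self-contained.
function toy_dense(W, b, x)
    y = fill(Nil(), size(W, 1), size(x, 2))
    for j in axes(x, 2), i in axes(W, 1)
        acc = Nil()
        for k in axes(W, 2)
            acc += W[i, k] * x[k, j]
        end
        y[i, j] = acc + b[i]
    end
    return y
end

W, b = rand(8, 4), rand(8)
x = fill(Nil(), 4, 16)        # stand-in for a real (or symbolic) input of size (4, 16)
size(toy_dense(W, b, x))      # (8, 16): the output shape, with no real arithmetic performed
```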

@avik-pal (Member, Author) commented Aug 1, 2024

@SebastianM-C here is a Lux version, LuxDL/Lux.jl#811, which should be cheap to run to get the sizes.

@SebastianM-C (Contributor):

> Do you need to run symbolic tracing?

I'd rather avoid that if there's a better solution.

> Essentially, take a symbolic array, convert it to a NilArray, and propagate the NilArray. That is extremely cheap because it doesn't do any real computation, and at the end you have the final shape.

Thanks, let me see what I can do.

Resolved review thread on ext/LuxCoreSetfieldExt.jl (outdated).
@avik-pal force-pushed the ap/1.0_final branch 3 times, most recently from 9c9c06e to 17566de, on August 18, 2024 at 16:13
@avik-pal changed the base branch from main to ap/deps on August 18, 2024 at 16:13
@avik-pal changed the base branch from ap/deps to main on August 18, 2024 at 16:16
@avik-pal (Member, Author):

The NilArray PR is ready, and I will merge it once tests pass. The (private until 1.0) API is https://github.com/LuxDL/Lux.jl/blob/ap/nilarray/test/helpers/size_propagator_test.jl. Once all the PRs land, I will change outputsize to use NilArray internally.

From the tests, almost all networks should work (the only exceptions being cases where the element types aren't Number).
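
A hypothetical usage sketch for once the NilArray-based fallback lands, assuming the 3-arg outputsize discussed above (the exact return convention may differ):

```julia
using Lux, LuxCore, Random

model = Chain(Dense(4 => 8, relu), Dense(8 => 2))
rng = Random.default_rng()
x = rand(Float32, 4, 16)

# Expected to return the output size of `model` for this input, computed cheaply
# via Nil-style shape propagation rather than a real forward pass.
LuxCore.outputsize(model, x, rng)
```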

@avik-pal merged commit 2d4d5ed into main on Aug 29, 2024
16 checks passed
@avik-pal deleted the ap/1.0_final branch on August 29, 2024 at 04:01
Merging this pull request may close: unexpected parameter type for AbstractExplicitContainer with single trainable field (Lux.jl#795)