Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ntuple eltype depends on tuple length for UInt64 #55790

Open
piever opened this issue Sep 17, 2024 · 11 comments · May be fixed by #55901
Open

ntuple eltype depends on tuple length for UInt64 #55790

piever opened this issue Sep 17, 2024 · 11 comments · May be fixed by #55901
Labels
kind:bug Indicates an unexpected problem or unintended behavior status:help wanted Indicates that a maintainer wants help on an issue or pull request

Comments

@piever
Copy link
Contributor

piever commented Sep 17, 2024

Both on julia 1.10.5 and julia 1.11.0-rc3, calling ntuple(f, n::UInt64) gives the following unexpected behavior:

julia> ntuple(identity, UInt64(10))
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

julia> ntuple(identity, UInt64(11))
(0x0000000000000001, 0x0000000000000002, 0x0000000000000003, 0x0000000000000004, 0x0000000000000005, 0x0000000000000006, 0x0000000000000007, 0x0000000000000008, 0x0000000000000009, 0x000000000000000a, 0x000000000000000b)

I imagine this is due to the fact that for n <= 10 there is an hard-coded version using Int, whereas the generic fallback respects the UInt64 eltype. I suppose this is a pretty rare corner case (99% of the time I suspect ntuple is called with n::Int), but it caused some hard to figure out bug in some library code, so I thought it was worth reporting.

@simeonschaub simeonschaub added kind:bug Indicates an unexpected problem or unintended behavior status:help wanted Indicates that a maintainer wants help on an issue or pull request labels Sep 17, 2024
@tomaklutfu
Copy link
Contributor

Smaller integer ones are broken for all too. This may fix that

@inline function Base.ntuple(f::F, n::T) where {F, T<:Integer}
    # marked inline since this benefits from constant propagation of `n`
    t = n == T(0)  ? () :
        n == T(1)  ? (f(T(1)),) :
        n == T(2)  ? (f(T(1)), f(T(2))) :
        n == T(3)  ? (f(T(1)), f(T(2)), f(T(3))) :
        n == T(4)  ? (f(T(1)), f(T(2)), f(T(3)), f(T(4))) :
        n == T(5)  ? (f(T(1)), f(T(2)), f(T(3)), f(T(4)), f(T(5))) :
        n == T(6)  ? (f(T(1)), f(T(2)), f(T(3)), f(T(4)), f(T(5)), f(T(6))) :
        n == T(7)  ? (f(T(1)), f(T(2)), f(T(3)), f(T(4)), f(T(5)), f(T(6)), f(T(7))) :
        n == T(8)  ? (f(T(1)), f(T(2)), f(T(3)), f(T(4)), f(T(5)), f(T(6)), f(T(7)), f(T(8))) :
        n == T(9)  ? (f(T(1)), f(T(2)), f(T(3)), f(T(4)), f(T(5)), f(T(6)), f(T(7)), f(T(8)), f(T(9))) :
        n == T(10) ? (f(T(1)), f(T(2)), f(T(3)), f(T(4)), f(T(5)), f(T(6)), f(T(7)), f(T(8)), f(T(9)), f(T(10))) :
        Base._ntuple(f, n)
    return t
end

function Base._ntuple(f::F, n) where F
    @noinline
    (n >= 0) || throw(ArgumentError(LazyString("tuple length should be ≥ 0, got ", n)))
    ([f(i) for i = Base.OneTo(n)]...,)
end

but I am not sure about T(...).

@simeonschaub
Copy link
Member

I think it probably makes more sense to change just _ntuple to always convert to Int first

Mytherin added a commit to duckdb/duckdb that referenced this issue Sep 17, 2024
On main branch, `Tables.partitions` fails on table with more than 10
columns. The (somewhat convoluted) reason is that julia `ntuple`
function when fed a `UInt64` as second argument, casts it to `Int` only
if it is `<= 10` (see JuliaLang/julia#55790).
This works in our favor for small `n` (number of columns), as the
function `ColumnConversionData` expects a `Int`, not a `UInt64`, but for
larger `n` attempting to collect a `Tables.partitions` fails.

The first commits is a minimal fix + test, whereas the second commit
also adds a small refactor to simplify code and make it more robust to
this.
@adienes
Copy link
Contributor

adienes commented Sep 17, 2024

not every <:Integer is necessarily convertable to Int though

I guess that's a useless concern bc if you are trying to make a tuple with length >64 bits you will probably find more serious obstacles very quickly

@mhauru
Copy link
Contributor

mhauru commented Sep 20, 2024

I was hasty and didn't read this thread properly, and went and implemented something very similar to @tomaklutfu's proposal in #55825. I do think respecting the type given by the user makes more sense. If a user gives me a UInt128, I presume there's a reason for that, and they want UInt128s back as well, rather than considering Int the one integer type to rule them all. If you are not convinced of this though, feel free to close my PR in favour of a different solution, I'm not attached to it.

Note also that currently this is extra inconsistent

julia> ntuple(identity, UInt64(11))
(0x0000000000000001, 0x0000000000000002, 0x0000000000000003, 0x0000000000000004, 0x0000000000000005, 0x0000000000000006, 0x0000000000000007, 0x0000000000000008, 0x0000000000000009, 0x000000000000000a, 0x000000000000000b)

julia> ntuple(identity, UInt8(11))
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)

That's because 1:n works in funny ways, but that's a separate issue.

@LilithHafner LilithHafner added needs decision A decision on this change is needed design Design of APIs or of the language itself status:triage This should be discussed on a triage call labels Sep 20, 2024
@LilithHafner
Copy link
Member

LilithHafner commented Sep 20, 2024

I see two possible fixes:

  1. Make ntuple always convert to Int (@simeonschaub's ntuple eltype depends on tuple length for UInt64 #55790 (comment))
  2. Make ntuple always preserve argument type (@mhauru's In ntuple, respect the type of the count argument #55825)

Because this impacts public API, I think it's worth talking about at triage. (triage is a biweekly call which is open to the public and can be found on the Julia community calendar)

@adienes
Copy link
Contributor

adienes commented Sep 20, 2024

on further thought, I think ntuple explicitly should convert to Int not just as a matter of convenience but on principle

the reason is short: ntuple documentation states computing each element as f(i), where i is the **index** of the element. (emphasis mine), and eachindex(::Tuple) is a OneTo{Int64} no matter what the type of N passed was.

@LilithHafner
Copy link
Member

I'm seeing a lot of support for option 1 and no explicit support for option 2. @tomaklutfu and @mhauru have provided implementations for option 2, but I haven't seen either of them state a preference for it over option 1 (please say so if you do have that preference).

Unless anyone speaks up in favor of preserving the argument type, I think it makes sense to go ahead and implement option 1.

@LilithHafner LilithHafner removed needs decision A decision on this change is needed status:triage This should be discussed on a triage call design Design of APIs or of the language itself labels Sep 23, 2024
@mhauru
Copy link
Contributor

mhauru commented Sep 23, 2024

I did originally think that option 2 made more sense, but the index point is valid, and I'm happy to go with 1, especially since its supported by people with more experience with the codebase than me (that's almost everyone).

@tomaklutfu
Copy link
Contributor

tomaklutfu commented Sep 23, 2024

We say ith index but do we say index should be Int somewhere in the doc? Since other integer types are also available for indexing syntax, I would expect that ntuple respect that also. Besides, we never know how f behaves for different types and probably the user expects ntuple uses what type of integer given to it. So option 2 for my humble opinion.

@LilithHafner LilithHafner added needs decision A decision on this change is needed status:triage This should be discussed on a triage call labels Sep 23, 2024
@LilithHafner
Copy link
Member

The array interface requires custom arrays to define Int based indexing. Methods in Base exist to automatically convert non-Int indices into Ints on getindex and setindex calls and custom arrays may also define custom behavior on non-Int indices (e.g.

julia> x = Base.OneTo(typemax(UInt64))
Base.OneTo(18446744073709551615)

julia> x[typemax(Int) + big(10)]
0x8000000000000009

)

So Int has preferential treatment when it comes to indexing but is not the only allowable index type.

@StefanKarpinski
Copy link
Sponsor Member

Triage dixit: It should always return Ints, the indices of a tuple are Ints.

@StefanKarpinski StefanKarpinski removed needs decision A decision on this change is needed status:triage This should be discussed on a triage call labels Sep 26, 2024
barucden added a commit to barucden/julia that referenced this issue Sep 27, 2024
@barucden barucden linked a pull request Sep 27, 2024 that will close this issue
barucden added a commit to barucden/julia that referenced this issue Sep 27, 2024
barucden added a commit to barucden/julia that referenced this issue Sep 28, 2024
barucden added a commit to barucden/julia that referenced this issue Sep 28, 2024
barucden added a commit to barucden/julia that referenced this issue Sep 28, 2024
barucden added a commit to barucden/julia that referenced this issue Sep 28, 2024
barucden added a commit to barucden/julia that referenced this issue Sep 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug Indicates an unexpected problem or unintended behavior status:help wanted Indicates that a maintainer wants help on an issue or pull request
Projects
None yet
7 participants