Refactoring implementations of solver types to use tableau-based forms #233

Open
YingboMa opened this issue Dec 29, 2017 · 8 comments

@YingboMa
Member

Currently, the code in src/integrators is very repetitive, and I think it can be cleaned up with some metaprogramming techniques. One concern @ChrisRackauckas has about metaprogramming in the integrators is that it cannot automatically eliminate the zeros in the tableaux, because the tableaux are stored in structs and are therefore not visible at parse time. I think we can avoid this problem by storing the tableaux as plain matrices and splicing the coefficients into the generated code as literals, so the compiler can constant-fold the zero entries. We could also implement fast tableau-based methods on top of this. Here is an example.

# Coefficients of the Bogacki–Shampine 3(2) (BS3) method
T = T2 = Float64
a21 = T(0.5)
a32 = T(0.75)
a41 = T(0.2222222222222222)
a42 = T(0.3333333333333333)
a43 = T(0.4444444444444444)
c1 = T2(0.5)
c2 = T2(0.75)
btilde1 = T(0.06944444444444445)
btilde2 = T(-0.08333333333333333)
btilde3 = T(-0.1111111111111111)
btilde4 = T(0.125)

A = [
 0    0    0    0
 a21  0    0    0
 0    a32  0    0
 a41  a42  a43  0
]
C = [ c1, c2, 0 ]
B = [ btilde1, btilde2, btilde3, btilde4 ]
# `@p ex expr` pushes `expr` onto `ex.args` at the call site.
macro p(ex::Symbol, expr::Expr)
    :(push!($(esc(ex)).args, $(esc(expr))))
end

# Wrap `ex` in an `@tight_loop_macros for i = uidx ... end` loop.
tight_loop(ex) = Expr(:block,
                      Expr(:macrocall, Symbol("@tight_loop_macros"), LineNumberNode(0),
                           Expr(:for, :(i = uidx),
                                Expr(:block, ex))))

# Build `uprev[i] + dt*(A[n+1,1]*k1[i] + ... + A[n+1,n]*kn[i])`, the input of stage n+1.
applyA(n) = Expr(:call, :+, :(uprev[i]),
                 Expr(:call, :*, :dt,
                      Expr(:call, :+, [:($(A[n+1, j]) * $(Symbol(:k, j))[i]) for j in 1:n]...)))

#function perform_step!(integrator, tab, cache, repeat_step, Val{true})
    #A, B, C = tab
    body = Expr(:macrocall, Symbol("@inbounds"), LineNumberNode,
                Expr(:macrocall, Symbol("@muladd"), LineNumberNode,
                     Expr(:block)))
    ex   = body.args[end].args[end]
    @p ex Expr(:macrocall, Symbol("@unpack"), LineNumberNode, :((t, dt, uprev, u, f) = integrator))
    @p ex :( k1 = integrator.fsalfirst )
    @p ex :( uidx = eachindex(uprev) )
    @p ex :( tmp = similar(uprev) )
    @p ex :( a = dt*$(A[2,1]) )
    @p ex :( $(tight_loop(:(tmp[i] = uprev[i]+a*k1[i]))) )
    len = size(A, 1)
    for i in 2:len-1
        @p ex :( $(Symbol(:k,i)) = f(t+$(C[i-1])*dt,tmp) )
        # NOTE: The final stage result has to go into a fresh array since u is aliased with uprev.
        i == len - 1 && @p ex :(utmp = similar(u))
        applya = applyA(i)
        tmp = i == len - 1 ? :(utmp) : :(tmp)
        @p ex :( $(tight_loop(:($tmp[i] = $applya))) )
        #tmp = :( @. uprev+$(Symbol(a, i))*$(Symbol(k, i)) )
        #@p ex :( Symbol(k, i) = f(t+$(C[i-1])*dt, $tmp) )
    end
    @p ex :( u = utmp )
    @p ex :( $(Symbol(:k,len)) = f(t+dt,u) )
    @p ex :( integrator.fsallast = $(Symbol(:k,len)) )
    body
#end

The above code can generate the expression

:(@inbounds @muladd(begin
              @unpack (t, dt, uprev, u, f) = integrator
              k1 = integrator.fsalfirst
              uidx = eachindex(uprev)
              tmp = similar(uprev)
              a = dt * 0.5
              begin
                  @tight_loop_macros for i = uidx
                          tmp[i] = uprev[i] + a * k1[i]
                      end
              end
              k2 = f(t + 0.5dt, tmp)
              begin
                  @tight_loop_macros for i = uidx
                          tmp[i] = uprev[i] + dt * (0.0 * k1[i] + 0.75 * k2[i])
                      end
              end
              k3 = f(t + 0.75dt, tmp)
              utmp = similar(u)
              begin
                  @tight_loop_macros for i = uidx
                          utmp[i] = uprev[i] + dt * (0.2222222222222222 * k1[i] + 0.3333333333333333 * k2[i] + 0.4444444444444444 * k3[i])
                      end
              end
              u = utmp
              k4 = f(t + dt, u)
              integrator.fsallast = k4
          end))
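
To turn this into an actual method, the generated body could then be spliced into a perform_step! definition with @eval. A minimal sketch, assuming it runs inside OrdinaryDiffEq's source where @muladd, @unpack, and @tight_loop_macros are defined; BS3Cache here is only a stand-in for whatever cache type the generated method would target:

# Hypothetical splice of the generated `body`; the cache type name is a stand-in.
@eval function perform_step!(integrator, cache::BS3Cache, repeat_step = false)
    $body
end
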
@ChrisRackauckas
Member

Instead of matrices we should just use static arrays a la #114

The issue is that the simple case works, but I am not sure it stays this simple in the full case. For example, we want to use broadcasting to keep compatibility with GPUs when possible (http://www.stochasticlifestyle.com/solving-systems-stochastic-pdes-using-gpus-julia/), but past a certain line length broadcasting isn't possible (#106). We will also want those lines to fuse rather than be written as loops, because fusion requires fewer GPU kernel calls (or fewer invocations of whatever other form of parallelism the array type provides).
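
For concreteness, here is a minimal sketch (assuming StaticArrays is loaded; none of these names are the actual OrdinaryDiffEq API) of what a static-array tableau plus per-stage broadcasting could look like for the BS3 tableau above:

using StaticArrays

# BS3 tableau stored as statically sized arrays.
const A_bs3 = @SMatrix [0.0  0.0   0.0  0.0;
                        0.5  0.0   0.0  0.0;
                        0.0  0.75  0.0  0.0;
                        2/9  1/3   4/9  0.0]
const c_bs3 = @SVector [0.0, 0.5, 0.75, 1.0]

# One BS3 step; each stage update is a single fused `@.` broadcast, so `uprev` can be
# a CPU Array, a GPU array, or any other broadcastable container.
function bs3_step(f, uprev, t, dt)
    a21, a32 = A_bs3[2, 1], A_bs3[3, 2]
    a41, a42, a43 = A_bs3[4, 1], A_bs3[4, 2], A_bs3[4, 3]
    k1 = f(t, uprev)
    tmp = @. uprev + dt * a21 * k1
    k2 = f(t + c_bs3[2] * dt, tmp)
    tmp = @. uprev + dt * a32 * k2
    k3 = f(t + c_bs3[3] * dt, tmp)
    u = @. uprev + dt * (a41 * k1 + a42 * k2 + a43 * k3)
    return u, f(t + dt, u)  # the final evaluation is the FSAL value for the next step
end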

Another issue is, how do you deal with the fact that some of the methods have to use non-standard error estimators?

The sets of methods that actually fall under the same category are quite small. I think the largest category this can hit (without multiple macros, of course) is the explicit Runge-Kutta methods. The nicest target would be the SDIRK methods, but each of them has different stage predictors (along with things like KenCarp having additive parts), so while there is repetitive code, there are unique parts in most of them.
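
(For context, a "stage predictor" is the initial guess handed to each stage's nonlinear solve. A purely hypothetical illustration, not any particular method's actual predictor:)

# Hypothetical: an extrapolation-style initial guess for the next stage's nonlinear
# solve. Each SDIRK method uses its own, method-specific predictor, which is part of
# what resists a single shared implementation.
predict_stage(uprev, dt, kprev, alpha) = uprev + alpha * dt * kprev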

If you can find a good, easy-to-maintain way to implement this, I will accept it, but those are the constraints, and it is not easy to satisfy them except maybe in the explicit RK and symplectic RK cases.

@ChrisRackauckas ChrisRackauckas changed the title Use more metaprogramming in src/integrators Refactoring implementations of solver types to use tableau-based forms Apr 28, 2024
@ChrisRackauckas
Member

There was a big push during a Hackathon to investigate this a bit more, see:

It was determined that metaprogramming isn't needed; smart implementations of the RK methods can do it fine. In particular, the hardest algorithms to get right with this approach are the standard explicit Runge-Kutta methods, because they admit certain optimizations when fully unrolled.

This kind of goes hand-in-hand with #2177. Maybe we can keep the more optimized versions of the explicit RK methods around but as a subpackage or something.

However, at least the implicit methods should not see any benefit from any form of unrolling since the cost is dominated by the implicit solves. In that case, we should be looking to refactor Rosenbrock, SDIRK, etc. first.
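
As a rough illustration (hypothetical helper, not the actual OrdinaryDiffEq nonlinear solver API): in a tableau-driven DIRK step, essentially all of the time goes into the per-stage implicit solve, so driving the stages from a loop instead of unrolled code costs little.

# Sketch only: `implicit_solve(f, ex, tstage, d)` stands in for the real nonlinear
# solver and is assumed to return k satisfying k = f(tstage, ex + d*k).
function dirk_step(f, implicit_solve, uprev, t, dt, A, b, c)
    s = length(b)
    k = Vector{typeof(uprev)}(undef, s)
    for i in 1:s
        # Explicit part of the stage value, accumulated from the tableau.
        ex = uprev + dt * sum((A[i, j] * k[j] for j in 1:i-1); init = zero(uprev))
        # The implicit solve for stage i; this dominates the cost of the whole step.
        k[i] = implicit_solve(f, ex, t + c[i] * dt, dt * A[i, i])
    end
    return uprev + dt * sum(b[j] * k[j] for j in 1:s)
end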

@ParamThakkar123
Contributor

@ChrisRackauckas, @YingboMa, may I work on this issue?

@ParamThakkar123
Contributor

Or is it done?

@ChrisRackauckas
Member

It has not been done. If you want to claim it, go for it. Send the email for the SciML Small Grants claim and let's get this started. I think Rosenbrock is the first set to do. Unify Rosenbrock4Cache and Rosenbrock5Cache first; the Rodas4, Rodas5, Rodas4P, Rodas5P, Rodas5Pr, etc. set should be the most similar. Then grow that to all of the Rosenbrocks. Rosenbrock23 will be the odd one out, I think; it's fine to do a first merge with that kept separate.
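
A minimal sketch of the kind of shared structure this could converge on (hypothetical field names and layout, not the actual OrdinaryDiffEq types): one parametric tableau that every Rosenbrock-family cache carries, so that Rosenbrock4Cache/Rosenbrock5Cache and the Rodas* methods differ only in their coefficients.

# Hypothetical shared Rosenbrock tableau; the real refactor may choose different fields.
struct RosenbrockTableau{T, T2}
    A::Matrix{T}        # coefficients combining previous stages in each stage argument
    C::Matrix{T}        # coefficients of previous stages inside each linear solve
    b::Vector{T}        # solution weights
    btilde::Vector{T}   # embedded-error weights
    gamma::T2           # diagonal coefficient defining W = I/(dt*gamma) - J
    c::Vector{T2}       # stage time fractions
    d::Vector{T2}       # coefficients of the time-derivative (non-autonomous) term
end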

@ParamThakkar123
Contributor

ParamThakkar123 commented Aug 8, 2024

It has not been done. If you want to claim it, go for it. Send the email for the SciML Small Grants claim and let's get this started. I think Rosenbrock is the first set to do. Unify Rosenbrock4Cache and Rosenbrock5Cache first; the Rodas4, Rodas5, Rodas4P, Rodas5P, Rodas5Pr, etc. set should be the most similar. Then grow that to all of the Rosenbrocks. Rosenbrock23 will be the odd one out, I think; it's fine to do a first merge with that kept separate.

Sounds good. I'll do what's needed and get started on this.

@oscardssmith
Contributor

I have a PR coming soon that will make Rosenbrock32 less special as well (but it will take a little bit)

@ParamThakkar123
Contributor

@ChrisRackauckas My contribution period under the SciML Small Grants program is over, but I want to keep working on this issue and extend it to the other solvers once the Rosenbrock work is done. May I get an extension for this, please?
