Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow dist to be a Vector{Distribution{Univariate, Continuous}} in Baum Welch #101

Closed
dmetivie opened this issue May 28, 2024 · 3 comments · Fixed by #102
Closed

Allow dist to be a Vector{Distribution{Univariate, Continuous}} in Baum Welch #101

dmetivie opened this issue May 28, 2024 · 3 comments · Fixed by #102

Comments

@dmetivie
Copy link

I think I found a bug (or maybe this was already discussed elsewhere but could not find it).
If you do

init = [0.6, 0.4]
trans = [0.7 0.3; 0.2 0.8]
dists = [Normal(-0.8), Normal(0.8)]
hmm = HMM(init, trans, dists)
T = 20
state_seq, obs_seq = rand(hmm, T)
hmm_est, loglikelihood_evolution = baum_welch(hmm , obs_seq)

works but if you just change dists = Vector{Distribution{Univariate, Continuous}}([Normal(-0.8), Normal(0.8)]) it no longer works. (and more generally this happens when you have a vector of different distributions let say a Exponential and a Gamma)

ERROR: suffstats is not implemented for (Distribution{Univariate, Continuous}, Vector{Float64}, SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true}).

Since your package is so clean, it was very easy to find the issue and a fix (maybe not the best? + I don't know if it happens only for Distributions.jl)
See your code here and a fix here.
The type was infered from the vector of distribtutions and not from the distribution element it self.

@gdalle
Copy link
Owner

gdalle commented May 29, 2024

Hi @dmetivie, thanks for reporting this!
I didn't plan for heterogeneous distributions because such settings are type-unstable, but if you have a use case for it, no problem! You had the right fix in mind, and PR #102 should fix it.
However, note that you will run into Distributions-related problems for basically every distribution except the most common ones. It seems Normal and Exponential can be fitted with weights, but many distributions cannot (JuliaStats/Distributions.jl#807). In addition, types beyond Float64 are barely supported (JuliaStats/Distributions.jl#1511), and the same goes for arrays beyond Vector.
So the general answer will often be to define your own distribution (or distribution wrapper), and implement fit! on it.

@dmetivie
Copy link
Author

Thanks!
Funny you mention that example, I did a PR a while ago for Laplace distribution but never took the time to finish yet.
I am aware of the limitation of Distributions.jl, however I believe for some use cases (like mine indeed) having the modeling flexibility with emissions distributions of different families is more important than type stability and Float64 are just fine.

@gdalle
Copy link
Owner

gdalle commented May 29, 2024

having the modeling flexibility with emissions distributions of different families

That's my point, you only have the flexibility as long as your distribution families can be fit with weights. And as long as they support AbstractVector for said weights

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants