Add support for VobSub in MKV files #17

Merged: ecdye merged 7 commits into main from feat-mkvsub on Oct 21, 2024

Conversation

@ecdye (Owner) commented Oct 3, 2024

This will add support for decoding a VobSub subtitle stream directly from an input Matroska file. This eliminates the need for a user to create an intermediate file before using macSubtitleOCR.

ecdye linked an issue on Oct 3, 2024 that may be closed by this pull request
ecdye mentioned this pull request on Oct 3, 2024
ecdye closed this on Oct 3, 2024
ecdye removed a link to an issue on Oct 13, 2024
ecdye reopened this on Oct 17, 2024
@ecdye (Owner, Author) commented Oct 17, 2024

@timj, here is a really nice one. I originally did this to prepare for processing VobSub streams from MKV files, but this change alone seems to improve decoding performance by a lot; in fact, it might even be faster than FFmpeg. Let me know how it works for you.

Edit: It seems like it's kinda lethargic in debug mode, but something about the release-build optimizations makes it way faster.

ecdye force-pushed the feat-mkvsub branch 2 times, most recently from 416dceb to 5513f53 on October 18, 2024 at 04:10
@timj commented Oct 18, 2024

I'm using main at the moment and it seems very slow, but that's probably because I don't know how to make swift build give me the fast binary. It's not using any cores at all, even if I ramp it up.

I see that FFmpeg went away. I will look at the invert colors option.

@ecdye (Owner, Author) commented Oct 18, 2024

Actually, FFmpeg is still the default; the internal decoder option is just hidden in the help. You can see it by using --help-hidden

As for compiling with optimizations, try running swift build --configuration release

@timj commented Oct 18, 2024

OK. main hung up overnight last night and never completed for me with macSubtitleOCR --languages eng --max-threads 20 E4_t00.idx ./. I will try this branch. main also wasn't showing me the --help-hidden command.

$ macSubtitleOCR -h
OVERVIEW: macSubtitleOCR - Convert bitmap subtitles into SubRip format using the macOS OCR engine

USAGE: macSubtitleOCR <input> <output-directory> [--languages <l>] [--max-threads <n>] [--invert] [--save-images] [--json]

ARGUMENTS:
  <input>                 Input subtitle file (supported formats: .sup, .sub, .idx, .mkv)
  <output-directory>      Directory to save the output files

OPTIONS:
  -l, --languages <l>     Comma-separated list of languages for OCR (ISO 639-1 codes) (default: en)
  -t, --max-threads <n>   Maximum number of threads to use for OCR (default: 4)
  -i, --invert            Invert images before OCR
  -s, --save-images       Save extracted subtitle images to disk
  -j, --json              Save OCR results as raw JSON files
  -h, --help              Show help information.

@ecdye (Owner, Author) commented Oct 18, 2024

So, because of the way the Vision API works, you can really only run a maximum of 5-6 threads; otherwise it seems to hang like that. I should probably add a cap or something. I found only one reference to this in one of Apple's developer videos, but it seemed to indicate that it's because of the large amount of memory the Vision machine learning API uses.
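
To sketch what I mean by a cap (illustrative only, not the actual macSubtitleOCR code; recognizeText here is a made-up helper):

import CoreGraphics
import Vision

// Hypothetical helper: run one Vision text-recognition request on a single image.
func recognizeText(in image: CGImage) async {
    let request = VNRecognizeTextRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try? handler.perform([request])
}

// Illustrative cap: never run more than 6 concurrent OCR tasks, regardless of
// the --max-threads value, since Vision seems to stall beyond that.
func runOCR(on images: [CGImage], requestedThreads: Int) async {
    let effectiveThreads = min(max(requestedThreads, 1), 6)
    await withTaskGroup(of: Void.self) { group in
        var iterator = images.makeIterator()
        // Seed the group with at most `effectiveThreads` concurrent tasks.
        for _ in 0..<effectiveThreads {
            guard let image = iterator.next() else { break }
            group.addTask { await recognizeText(in: image) }
        }
        // Each time a task finishes, start the next image, keeping concurrency bounded.
        while await group.next() != nil {
            if let image = iterator.next() {
                group.addTask { await recognizeText(in: image) }
            }
        }
    }
}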

Also, I don't think it will show the --help-hidden flag in the help output, but if you try running it, it should work. I should probably make that clearer as well.

@timj commented Oct 18, 2024

Not trying to specify cores helped and it does finish. It's kind of odd since I never see any load on the machine when I run this. RAM isn't an issue on this computer and it wasn't using any when I killed it.

@timj commented Oct 18, 2024

So for me the sweet spot is 2 cores:

$ time macSubtitleOCR -t 1 x.idx .
macSubtitleOCR -t 1 x.idx .  69.09s user 13.25s system 191% cpu 42.891 total
$ time macSubtitleOCR -t 2 x.idx .
macSubtitleOCR -t 2 x.idx .  57.40s user 13.00s system 221% cpu 31.850 total
$ time macSubtitleOCR -t 5 x.idx . 
macSubtitleOCR -t 5 x.idx .  55.68s user 10.77s system 211% cpu 31.379 total
$ time macSubtitleOCR -t 10 x.idx .
macSubtitleOCR -t 10 x.idx .  56.49s user 8.60s system 206% cpu 31.466 total
$ time macSubtitleOCR -t 15 x.idx .
macSubtitleOCR -t 15 x.idx .  57.82s user 7.59s system 208% cpu 31.334 total
$ time macSubtitleOCR -t 18 x.idx .
macSubtitleOCR -t 18 x.idx .  84.33s user 4.02s system 215% cpu 40.986 total
$ time macSubtitleOCR -t 20 x.idx .
<hangs without using any CPU> ^C
macSubtitleOCR -t 20 x.idx .  0.46s user 0.16s system 0% cpu 4:51.30 total

There are 20 cores. It doesn't use any GPU. I had assumed that the Vision API was going to use the Neural Engine or GPU. With 19 threads it seems to use 100% CPU forever, but after 15 minutes it still hadn't completed.

@ecdye (Owner, Author) commented Oct 18, 2024

So for me the sweet spot is 2 cores: […]

There are 20 cores. It doesn't use any GPU. I had assumed that the Vision API was going to use the Neural Engine or GPU. With 19 threads it seems to use 100% CPU forever, but after 15 minutes it still hadn't completed.

Yeah, that's where I'm confused as well. On my MacBook Pro M3 it seems to utilize 5 cores pretty well, but if I use more it will hang forever. I suspect it's more of an issue with the API than anything else, but I can't say for sure. It seems to use only a minimal amount of the GPU for me as well. I'm not really sure why, but I'll just attribute it to the black magic of Apple's API.

ecdye added a commit that referenced this pull request Oct 19, 2024
This will improve the performance of the internal decoder significantly
by taking advantage of unsafe pointers in Swift to reduce the memory
overhead and processing time for decoding input files. This also adds
compatibility that is needed in order to move forward with #17.

---------

Signed-off-by: Ethan Dye <[email protected]>
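
For context, a minimal sketch of the unsafe-pointer pattern the commit describes (illustrative only; the function name and byte layout are assumptions, not the actual decoder code):

import Foundation

// Read a big-endian 16-bit length field from the start of a packet by walking
// the raw bytes directly, instead of paying for repeated bounds-checked Data
// subscripting on every access.
func payloadLength(of packet: Data) -> Int? {
    packet.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int? in
        guard raw.count >= 2 else { return nil }
        return Int(raw[0]) << 8 | Int(raw[1])
    }
}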
ecdye marked this pull request as ready for review on October 21, 2024 at 01:54
ecdye merged commit 3028ced into main on Oct 21, 2024 (6 checks passed)
ecdye deleted the feat-mkvsub branch on October 21, 2024 at 01:54
@timj commented Oct 21, 2024

Thanks for this. I just tried it on an MKV and it crashes for me.

Swift/ContiguousArrayBuffer.swift:688: Fatal error: Index out of range

Quick question: when you extract from an MKV, do you use the subtitle track's language from the MKV as the language hint, or does it always use the --languages parameter?

@ecdye (Owner, Author) commented Oct 21, 2024

Thanks for this. I just tried it on an MKV and it crashes for me.

I was afraid of that. I only had one test case on hand to throw at it, so I wasn't sure whether I had really caught all the edge cases.

Quick question: when you extract from an MKV, do you use the subtitle track's language from the MKV as the language hint, or does it always use the --languages parameter?

Currently, no, I'm only using the --languages parameter. That's a good suggestion, though; I should be able to add it pretty easily.
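
Roughly what I have in mind, as a sketch only (the mapping table and function name are made up, not existing code):

// Map the Matroska track's ISO 639-2 language tag (e.g. "eng") to the
// ISO 639-1 code the OCR languages option expects, falling back to the
// user-supplied --languages value when the track has no usable tag.
func ocrLanguage(forTrackTag trackTag: String?, fallback: String) -> String {
    let iso639_2to1: [String: String] = [
        "eng": "en", "fra": "fr", "fre": "fr",
        "deu": "de", "ger": "de", "spa": "es", "ita": "it",
    ]
    guard let tag = trackTag?.lowercased(), let code = iso639_2to1[tag] else {
        return fallback
    }
    return code
}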

@timj commented Oct 21, 2024

Actually I'm getting crashes with .idx files on main.

@ecdye (Owner, Author) commented Oct 22, 2024

Actually I'm getting crashes with .idx files on main.

So is it just when you pass a .idx file in as the file to process? Or is that in addition to crashing when you pass in an MKV with a VobSub stream?

@timj commented Oct 22, 2024

Yes.

$ macSubtitleOCR --save-images B1_t05.sub .
zsh: trace trap  macSubtitleOCR --save-images B1_t05.sub .

crash.tgz

@ecdye (Owner, Author) commented Oct 22, 2024

Yes.

$ macSubtitleOCR --save-images B1_t05.sub .
zsh: trace trap  macSubtitleOCR --save-images B1_t05.sub .

crash.tgz

OK, I think I fixed it. It seems to have something to do with a file that is technically malformed according to the specification, but I'm not sure, because I might also be parsing it slightly wrong. At least I seem to have been able to fix it. Let me know if you notice any errors.

@timj commented Oct 22, 2024

Thanks. I will try it again. This did work before (and works fine with the FFmpeg decoder), but I can't remember whether I ever tried it with the internal decoder.

@ecdye (Owner, Author) commented Oct 22, 2024

Interesting. I'll double-check, but I'm guessing you just never tried it with the internal decoder.
