Extracting subtitles
- geraldholdsworth
- Posts: 1406
- Joined: Tue Nov 04, 2014 9:42 pm
- Location: Inverness, Scotland
- Contact:
Hi all,
I've got a VOB file which plays in VLC and I can display subtitles (in a variety of languages). But, I want these subtitles, in the appropriate languages, in a text file with timecode markers. This is not from a commercial DVD, so these subtitles won't be online.
Anyone got any ideas?
BTW, I'm using a Mac.
Cheers,
Gerald.
Gerald Holdsworth, CTS-D
Extron Authorised Programmer
https://www.geraldholdsworth.co.uk
https://www.reptonresourcepage.co.uk
Twitter @radiogezza
Re: Extracting subtitles
geraldholdsworth wrote: ↑Sat Feb 17, 2024 12:54 pm
I want these subtitles, in the appropriate languages, in a text file with timecode markers.

Called a transcript, I believe.
Re: Extracting subtitles
Ffmpeg? Don’t know if it preserves the timecodes, but there must be an option…
https://trac.ffmpeg.org/wiki/ExtractSubtitles
VLC can allegedly do it as well. Apparently Handbrake can too.
- geraldholdsworth
Re: Extracting subtitles
Tried Handbrake - couldn't work out how to get it to do it.
Couldn't figure out how to install ffmpeg. Then I found out I can do it through homebrew. Still won't touch the subtitles. It claims that there isn't any...but there is. Apparently there are 16 languages, and I only need 6 of them (well, the client says we don't need Japanese, so that makes it 5).
Re: Extracting subtitles
Is the VOB all you have, or do you have a DVD (either physical or iso)?
It seems not all subtitle data on DVDs is stored in the VOB; some is stored in other files. Players such as VLC seem to be able to cope without the missing data, but transcoders are apparently more fussy.
If you have the “DVD”, maybe try transcoding the relevant title(s) into individual container file(s) such as mkv (remembering to include the subtitles). Hopefully ffmpeg will have more luck in finding the subtitle data in the mkv than vob.
Re: Extracting subtitles
There are various tools for subtitle extraction but if you want them as a text file you’ll need optical character recognition (OCR) software as well.
The best tool I’ve used for both extraction and OCR is Subtitle Edit (freeware). It’s for Windows and Linux, but I’ve used it on a Mac, running within Crossover (an app that lets you run Windows software without installing a virtual machine).
There’s a guide here: https://iamscum.wordpress.com/guides/ocr/
Re: Extracting subtitles
This relies on Windows programs to extract the subtitles, but they'd subsequently need converting into a transcript, which may not be possible:
https://www.youtube.com/watch?v=ru1aDajSe9g
Re: Extracting subtitles
Perhaps converting it into a YouTube video and then using YouTube's transcription would work.
Re: Extracting subtitles
geraldholdsworth wrote: ↑Sat Feb 17, 2024 11:56 pm
Couldn't figure out how to install ffmpeg. Then I found out I can do it through homebrew. Still won't touch the subtitles. It claims that there isn't any...but there is.

What does ffmpeg -i yourfile.vob say about subtitles? If it can't find subtitle text streams, then your client's video file has the subtitles "burned in" as image frames with no explicit text. Those require the OCR tricks described above, and lots and lots of proof-reading. This can be expensive and slow to get right.
- geraldholdsworth
Re: Extracting subtitles
I've got a ripped copy of the DVD (so I've got the IFO and BUP files too, in a VIDEO_TS folder). I can get hold of the physical DVD.
I've got my colleague working on it too - he's managed to extract the English subtitles, so far, into an SRT file, which I can use. Just need the other five languages.
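For reference, the "text file with timecode markers" asked for at the top of the thread is essentially what an SRT file is: numbered cues, each with a start and end time written as HH:MM:SS,mmm, followed by the text. Below is a minimal Python sketch of writing that layout, assuming the cue timings and text have already been obtained some other way (for example by OCR). The function names and the sample cues are made up for illustration and are not taken from the actual video.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timecode, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical cues, purely to show the layout.
    example = [
        (12.5, 15.0, "First line of narration."),
        (15.2, 18.75, "Second line of narration."),
    ]
    print(to_srt(example))
```

One file per language (for example one SRT for each of the five tracks needed) keeps the result usable in most players and editors.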
- geraldholdsworth
Re: Extracting subtitles
scruss wrote: ↑Sun Feb 18, 2024 4:30 pm
What does ffmpeg -i yourfile.vob say about subtitles? If it can't find subtitle text streams, then your client's video file has the subtitles "burned in" as image frames with no explicit text. Those require the OCR tricks described above, and lots and lots of proof-reading. This can be expensive and slow to get right.

It says:
Code:
Input #0, mpeg, from 'Urquhart_Castle_Cinema.VOB':
  Duration: 00:08:38.02, start: 0.060000, bitrate: 8971 kb/s
  Stream #0:0[0x1bf]: Data: dvd_nav_packet
  Stream #0:1[0x1e0]: Video: mpeg2video (Main), yuv420p(tv, bottom first), 720x576 [SAR 16:15 DAR 4:3], 6000 kb/s, 25 fps, 25 tbr, 90k tbn
    Side data:
      cpb: bitrate max/min/avg: 6000000/0/0 buffer size: 1835008 vbv_delay: N/A
  Stream #0:2[0x85]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
  Stream #0:3[0x84]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
  Stream #0:4[0x83]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
  Stream #0:5[0x82]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
  Stream #0:6[0x81]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
  Stream #0:7[0x80]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
Re: Extracting subtitles
No text subtitles found in your ffmpeg output, alas. Your colleague's SRT must've used the OCR method.
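Reading a stream listing like the one above by eye is easy to get wrong, so here is a rough Python sketch that tallies the stream types ffmpeg reports. It assumes the "Stream #0:N[0x..]: Type: codec" line layout seen in the output posted in this thread; other ffmpeg builds may format these lines differently, and count_streams is an illustrative name, not part of any library. As far as I know, DVD subtitles are stored as bitmap images, so if ffmpeg had detected them they would normally show up as Subtitle lines (typically dvd_subtitle) rather than as text streams.

```python
import re
from collections import Counter

# Matches e.g. "Stream #0:2[0x85]: Audio: ac3, ..." and captures "Audio".
STREAM_LINE = re.compile(r"Stream #\d+:\d+[^:\n]*: (\w+):")


def count_streams(ffmpeg_output):
    """Count the streams of each type listed in ffmpeg's input summary."""
    counts = Counter()
    for match in STREAM_LINE.finditer(ffmpeg_output):
        counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the listing posted earlier in the thread.
    listing = (
        "Stream #0:0[0x1bf]: Data: dvd_nav_packet\n"
        "Stream #0:1[0x1e0]: Video: mpeg2video (Main), yuv420p\n"
        "Stream #0:2[0x85]: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s\n"
    )
    counts = count_streams(listing)
    print(dict(counts))
    print("Subtitle streams found:", counts["Subtitle"])
```

A Subtitle count of zero only means ffmpeg did not detect any subtitle stream in the part of the file it examined. It does not prove the subtitles are absent from the file.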
Re: Extracting subtitles
I’ve used Subtitle Edit to OCR subtitles: https://www.nikse.dk/subtitleedit. It’s free and quite useful.
Richard B
Acorn Electrons issue 4 and 6, MRB, Plus 1 x2, Plus 3, AP6 x2, AP5, Pegasus 400, BeebSCSI, Gotek, Raspberry Pi Co-processor, GoSDC MBE.
BBC B+ 64K (128K upgraded) with Dual OS, Raspberry Pi Co-processor and Gotek.
- geraldholdsworth
Re: Extracting subtitles
VLC greys out the Subtitle Track sub menu until the narrator starts speaking. Then I get a choice of about 16 tracks, and can change the font, colour, size, etc. which makes me believe they're in there, just not at the start.
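The VLC behaviour described above (no subtitle tracks offered until the narrator starts) would fit with the subtitle packets first appearing some way into the file, beyond the portion ffmpeg examines by default when listing streams. One thing that may be worth trying is asking ffmpeg to look further in, using its -probesize and -analyzeduration input options. The Python sketch below only builds and prints such a command. The 100M values are arbitrary guesses that may need raising, build_probe_command is an illustrative name, and there is no guarantee this reveals the subtitle streams in this particular file.

```python
import shlex


def build_probe_command(source, probe_size="100M", analyze_duration="100M"):
    """Return an ffmpeg argument list that inspects more of the input than the default."""
    return [
        "ffmpeg",
        "-probesize", probe_size,               # how much data to read while probing
        "-analyzeduration", analyze_duration,   # how much stream time to analyse (microseconds)
        "-i", source,
    ]


if __name__ == "__main__":
    command = build_probe_command("Urquhart_Castle_Cinema.VOB")
    # Print a copy-and-paste version for Terminal.
    print(shlex.join(command))
```

If subtitle streams do appear in the new listing, they will most likely be bitmap streams, so the OCR step mentioned earlier in the thread would still be needed to turn them into text.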