Hacker News
Writing an MP4 Muxer for Fun and Profit (obsproject.com)
95 points by skrrtww 9 months ago | 28 comments



Having worked with some MP4 demuxing for my extension [1], I feel the pain. Lots of times I would play the video only to find inexplicable issues such as drifting audio. I highly recommend using an mp4 inspector tool, such as mp4box [2], to debug these issues.

1: https://github.com/Andrews54757/FastStream

2: https://gpac.github.io/mp4box.js/test/filereader.html
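For a rough idea of what such an inspector does at its first level, here is a minimal sketch that lists the top-level boxes in a byte buffer. It assumes the standard ISO-BMFF box header (32-bit big-endian size, 4-character type, optional 64-bit size when the 32-bit field is 1). The name `iter_boxes` and the toy `sample` buffer are made up for illustration, and this is nowhere near a replacement for the tools linked above.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each box between start and end."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or corrupt box; stop instead of guessing
        yield box_type.decode("latin-1"), pos, size
        pos += size


# Toy buffer (hypothetical): a 16-byte 'ftyp' followed by a 12-byte 'mdat'.
sample = (struct.pack(">I4s", 16, b"ftyp") + b"isom" + struct.pack(">I", 0)
          + struct.pack(">I4s", 12, b"mdat") + b"\x00" * 4)

for box_type, offset, size in iter_boxes(sample):
    print(f"{box_type} offset={offset} size={size}")
```

Container boxes such as moov could be walked the same way by calling `iter_boxes` again on the payload range (offset plus header length up to offset plus size).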


I found this inspector useful, too: https://mlynoteka.mlyn.org/mp4parser/


Nice. When playing around one weekend, trying to see if I could use ipfs as a transport layer for streaming video, I got hung up because most video formats I tried behaved very poorly with inconsistent streams where you may not have the beginning. I ended up on mpeg-ts as the best behaving of the bunch. It felt a little weird, as I was sort of expecting something more modern to have better performance, but seeing as my goal was not to evaluate video formats but just to ship them around, I accepted it and moved on.

Thinking back on it now, I just did a little trial and error until I found something that worked, but what would I search for if I was trying to find data on how... ?streamable? an encoding is?

If curious, I got my proof of concept working, but it was unpleasantly slow. I blindly chunked the incoming stream into megabyte-sized chunks, registered the chunks on ipfs, then used ipfs pubsub to announce each chunk to any watchers. The watcher would watch the pubsub channel for announcements, download each chunk, and try to reassemble them in order and play them. One neat side effect I found: when the stream was done, if I had stored all the ipfs addresses, I could then generate a whole ipfs file structure you could use to download the stream at a later date.
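The chunk, announce, and reassemble flow described here can be sketched roughly as follows. This is not the original proof of concept: there is no IPFS in it at all. A SHA-256 hex digest stands in for the content address, the "announcements" are just tuples handed around in memory, and `chunk_stream` and `Reassembler` are hypothetical names.

```python
import hashlib
import io
from typing import BinaryIO, Dict, Iterator, List, Tuple

CHUNK_SIZE = 1024 * 1024  # "megabyte-sized chunks"


def chunk_stream(stream: BinaryIO,
                 chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, str, bytes]]:
    """Blindly cut a stream into fixed-size chunks.

    Yields (sequence number, address, data). The address is a SHA-256
    digest used as a stand-in for a real content address.
    """
    seq = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield seq, hashlib.sha256(chunk).hexdigest(), chunk
        seq += 1


class Reassembler:
    """Watcher side: buffer out-of-order chunks, release them in order."""

    def __init__(self) -> None:
        self.pending: Dict[int, bytes] = {}
        self.next_seq = 0

    def receive(self, seq: int, chunk: bytes) -> List[bytes]:
        self.pending[seq] = chunk
        ready: List[bytes] = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


# Toy run: small chunks, delivered out of order.
source = bytes(range(256)) * 10
announcements = list(chunk_stream(io.BytesIO(source), chunk_size=1000))
manifest = [address for _, address, _ in announcements]  # kept for later download

watcher = Reassembler()
played = b""
for seq, _address, chunk in (announcements[2], announcements[0], announcements[1]):
    played += b"".join(watcher.receive(seq, chunk))
```

Keeping the ordered `manifest` of addresses is what would make the "download the stream later" side effect possible.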


Can someone explain how an existing media player understands the new mdat format without modification? I assume if they find a completed moov at the end of the file, they would recognize the file as an unfragmented mp4. They should then try to find a list of recognized codecs directly inside the mdat (like in the first picture), but instead they will find another moov, a bunch of moofs and sub-mdats, all of which are clearly not proper for an unfragmented mp4. Why doesn't the player report this as an "unrecognizable, badly formatted" mp4 file?


The mdat box does not have a defined structure, and the specification actually states that attempting to define a structure is almost certainly a mistake. In order to find the data the player is looking for it has to read the moov box, which contains the byte offsets and sizes of "chunks" of data. Since there is no requirement for chunks to be contiguous, or even in the same file, we can simply skip over the fragmentation-related boxes within the data box.


The moov contains a list of byte offsets which the player can use to directly access media data. You can skip the moofs and other headers inside by using gaps in the offsets.
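A toy illustration of the point made in the two replies above: if the player only ever reads the byte ranges listed in its offset table, whatever sits between those ranges is never looked at. Everything here is made up for the example. The "file" is a short byte string with placeholder text where the fragment headers would be, and `chunk_table` stands in for the offsets and sizes a real player would derive from the sample tables inside moov.

```python
from typing import List, Tuple


def read_chunks(file_bytes: bytes,
                chunk_table: List[Tuple[int, int]]) -> List[bytes]:
    """Return the media data at each (offset, size) entry, nothing else."""
    return [file_bytes[offset:offset + size] for offset, size in chunk_table]


# Placeholder layout: 8 bytes of "fragment header", 4 bytes of media,
# another 8 bytes of "fragment header", then 6 bytes of media.
file_bytes = b"[moof-1]" + b"AAAA" + b"[moof-2]" + b"BBBBBB"

# Hypothetical offset table: the gaps between entries cover the headers.
chunk_table = [(8, 4), (20, 6)]

media = read_chunks(file_bytes, chunk_table)
```

Here `media` holds only the two media runs; the placeholder headers are skipped purely because no table entry points at them.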


This is awesome work. I’ve coded some extensions for mp4 livestream to handle dozens of real-time streams and I’d love to try out the multi stream mux / demux…


> It kind of hurts that several days of work and research can be summed up in a couple paragraphs, but that's what the "pain" part in the subtitle is for.

Having recently written my own fragmented-MP4 remuxing library, I felt this pain too, and my soon-to-be-published writeup has very similar things to say about the ISO's paywalling practices.

I think one of the hardest parts of ISO-BMFF, aside from spec availability, is that it's pretty hard to implement "cleanly", making existing code confusing to use as reference. (My own implementation is certainly not clean either)


I feel zero shame for torrenting ISO standards PDFs.


Me neither, but I couldn't actually find any torrents for the mp4-related specs (I did find what I needed with some google-fu, though)


> Having recently written my own fragmented-MP4 remuxing library, I felt this pain too, and my soon-to-be-published writeup has very similar things to say about the ISO's paywalling practices.

Would be curious to hear what goals you had with writing a muxer yourself as well, given that most people just use LibAV/GStreamer/GPAC and call it a day.

> I think one of the hardest parts of ISO-BMFF, aside from spec availability, is that it's pretty hard to implement "cleanly", making existing code confusing to use as reference. (My own implementation is certainly not clean either)

I certainly wouldn't call the OBS implementation "clean" either. It's very much inspired by the FFmpeg/LibAV implementation since that one is fairly straightforward (not a lot of abstraction), and gets the job done (and also is GPL/LGPL so not a huge concern looking at it).


The short answer is, it's for an exploit. It involves some slightly less-well-trodden boxes, and adding specially crafted metadata to live-generated videos in real-time, which existing libraries couldn't help me with much (and I did spend some time fighting a few libraries, but couldn't make them do precisely what I wanted).

"Library" is perhaps an overstatement, it does the things I need and not much more.


Aha, that sounds cool, looking forward to the writeup!


It looks like GStreamer has supported this for a few years: https://gstreamer.freedesktop.org/documentation/isomp4/GstBa...

I always forget about GStreamer but I think I have a perfect application for it. Hopefully it’s easier to use as a library than MediaFoundation or FFmpeg.


MP4. The answer to the question of "Is there a way to make RIFF and AVI even worse somehow?" It makes you genuinely pine for MPEG2 Transport Streams. ISO 13818 for life.


Great work!

Would love to see MP4 Hybrid supported in popular packages like mp4-muxer [1] and mp4box [2] someday.

1: https://github.com/Vanilagy/mp4-muxer

2: https://github.com/gpac/mp4box.js


> moof (Movie Fragment Box)

Very cute easter egg. Moof is what dogcows say: http://clarus.chez-alice.fr/history.php


So this is a “soft” sequential access limitation (we can tolerate some random writes to data as long as they are small enough and short enough). I wonder if there are formats that result in a finished, indexed multimedia file with “hard” sequential access, where nothing can be overwritten.


Digital video tape formats (e.g. DV, HDV) are an example. Other containers that operate in this mode are TS and Ogg (and optionally, MKV). Any sort of live streaming format generally is, too.


(context, this is talking about fragmented MP4 downsides)

> 2. They are slow to access on HDD or network drives, as each fragment's header needs to be read to get the complete metadata of the file and start playback

Huh? That's not right. The whole point of fragmented MP4 is that you can access any fragment without having to read the headers of the other fragments. That's why adaptive streaming is built around fragmented MP4.


To figure out the total length of media streams, you need external index metadata (web streaming) or a remux of the file that adds an index. The whole point of the article is removing the need to remux the file after recording; otherwise you can use existing solutions just fine.


You can write a sidx for your index. And it doesn't require a whole remux.
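For reference, a rough sketch of what serializing a version-0 `sidx` (Segment Index Box) could look like, following the field layout in ISO/IEC 14496-12. The function name, the example sizes and durations, and the choice to mark every fragment as starting with a type-1 stream access point are assumptions for illustration; this is not how OBS or any particular muxer does it.

```python
import struct
from typing import List, Tuple


def build_sidx(reference_id: int, timescale: int,
               segments: List[Tuple[int, int]],
               earliest_pts: int = 0, first_offset: int = 0) -> bytes:
    """Serialize a version-0 'sidx' box.

    segments holds (referenced_size in bytes, duration in timescale units).
    """
    # FullBox version (0) plus three zero flag bytes, then the fixed fields.
    body = struct.pack(">B3xIIII", 0, reference_id, timescale,
                       earliest_pts, first_offset)
    body += struct.pack(">HH", 0, len(segments))  # reserved, reference_count
    for size, duration in segments:
        if size >= 1 << 31:
            raise ValueError("referenced_size must fit in 31 bits")
        # Top bit 0: the reference points at media, not at another sidx.
        # Assumption: every fragment starts with a SAP of type 1
        # (starts_with_SAP = 1, SAP_type = 1, SAP_delta_time = 0).
        sap = (1 << 31) | (1 << 28)
        body += struct.pack(">III", size, duration, sap)
    return struct.pack(">I4s", 8 + len(body), b"sidx") + body


def total_duration_seconds(timescale: int,
                           segments: List[Tuple[int, int]]) -> float:
    """Total length, available without reading any fragment header."""
    return sum(duration for _, duration in segments) / timescale


# Hypothetical recording: two 2-second fragments at a timescale of 1000.
fragments = [(500_000, 2000), (480_000, 2000)]
sidx = build_sidx(reference_id=1, timescale=1000, segments=fragments)
```

The index costs 12 bytes per fragment plus a 32-byte fixed part, which is why writing it does not need a full remux.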


Unfortunately, some of the most popular/problematic software (default Windows video player and explorer) does not support `sidx` boxes.


> Except there is no profit, only pain

I have 20 years of professional experience, and my conclusion, if someone asked what IT boils down to: pain.

The pain is what filters who succeeds and who fails. Can you endure hunting a bug for 7 hours in your chair? Can you fix problem after problem to get a system running? Everything that can fail, will fail, and you have to deal with it.


Or when you encounter (another) corner case that requires a top down rewrite of your code. If you start to question your sanity or the meaning of life, you're probably on the right track.


I came to the same conclusion in business. Is what you're doing 100% pure pain? Good. It means you're providing value.


"The new MP4 output now also supports multiple video tracks"

MP4 has been able to have multiple video streams for quite some time. One of the very first advanced MP4 authoring tools I saw in the early 00s allowed for this, and we used it to make a few advanced files to demo the "new" MP4 format. Much like multi-angle DVDs, this was a niche feature that did not gain very much traction. I could see why someone not around at that time might think this is a new feature, but it's not.


New to OBS.



