Hi HN!

I'm excited to showcase an update to a personal project of mine. It's called ffmpeg-over-ip and it allows you to connect to remote ffmpeg servers. If you have one machine with a GPU (could be your Windows gaming laptop, gaming PC, a MacBook?) and a machine (or VM, Docker container, etc.) without a GPU, you could use the remote GPU to do GPU-accelerated video conversion.

The way it works is pretty neat. There are two components, a server and a client:

- The server (has the GPU) comes with a patched up ffmpeg and listens on a specified port
- The client (without the GPU) connects to the server, takes the file IO requests from the server and runs them locally.

ffmpeg doesn't know that it's not dealing with a local filesystem, so this approach works with multiple inputs or outputs like HLS, and is perfect for home media servers like Plex or Jellyfin or Emby.

One server can offer up its GPU to as many clients as needed, and all clients offer up their own filesystems for their requests. The server also comes with a static build of ffmpeg bundled (built from jellyfin-ffmpeg scripts), so all you have to do is create a config file, set a password and you're good to go!

It's been about a year and a half since this was last submitted (https://news.ycombinator.com/item?id=41743780). The feedback at the time was around the difficulty of sharing a filesystem between the machines, so that should no longer be a problem.

This has been really useful in my local setup, and I hope you find it useful too. If you have any further questions, the website has some FAQs (linked in the GitHub repo), or you could post them here and I'll answer them for you!

Thanks!

Some operations (downsizing already heavily compressed video with the latest and greatest compression techniques) are CPU or GPU bound, but others, like bunching thousands of high-res JPEGs into a lightly compressed timelapse, are more likely to be IO bound. So how does this tool make the trade-off? I imagine some things can best be dealt with locally, whilst for other operations offloading to an external GPU will be beneficial. It also makes a lot of difference whether your bandwidth is megabits or gigabits.
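
For a rough feel of where the bottleneck lands, here is a back-of-the-envelope sketch in Python. It is not taken from ffmpeg-over-ip; the function names are made up for illustration, and it ignores protocol overhead and disk speed.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


def likely_bottleneck(size_bytes: int, link_bits_per_second: float,
                      encode_seconds: float) -> str:
    """Compare the time spent moving data with the time spent encoding it.

    Reads and encoding can overlap when data is streamed, so the slower of
    the two roughly sets the total run time.
    """
    io_seconds = transfer_seconds(size_bytes, link_bits_per_second)
    return "network" if io_seconds > encode_seconds else "encoder"
```

With hypothetical numbers, a 10 GB input takes about 800 s to move over a 100 Mbit/s link but about 80 s over 1 Gbit/s, so an encode that needs 300 s of GPU time would be network-bound on the first link and encoder-bound on the second.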

This is pretty neat. I was experimenting with something similar in my ffmpeg frontend: connecting to the local machine (and remote ones) to run arbitrary encoding jobs, thus offloading the encode tasks to another machine, but still with a queuing mechanism locally.

The project is https://ffmpeg-commander.com for generating ffmpeg commands, but with an experimental backend to offload the tasks.

Do you support chunked encoding across multiple servers? It would be a great feature to support larger video files.

Nice to see you here!

Was really impressed by your work on Pundle. (It was an amazingly fast HMR dev environment - much like Vite today.) Felt like I was the only one using it, but it was hard to walk away from instant updates.

Thanks for putting a smile on my face! I am glad you liked it! :)

Maybe you can submit a patch to ffmpeg.org.

I've considered it, thanks for the nudge. Since the patches are quite specific to my usecase:

- https://github.com/steelbrain/ffmpeg-over-ip/blob/main/fio/f...
- https://github.com/steelbrain/ffmpeg-over-ip/tree/main/patch...

I am not sure they'll be accepted for upstreaming, but in exploring the options, I noticed ffmpeg has sftp:// transport support and there were some bugs surrounding that. I do intend to publish some patches for those.

ffmpeg has great http input and output support. I've been using this quite a bit recently. Wrapping ffmpeg with node.js and using the built in http server and client to interact with it.

It's even reduced load considerably because most of the time the disk doesn't even need to be touched.

yep same here, i do the same thing for my video pipeline. spawn ffmpeg as a child process from node, pipe stdin/stdout directly and skip disk entirely for intermediate steps. concat demuxer + xfade filters for stitching scenes together. the only time i touch disk is the final output and even that's optional if you're uploading straight to s3 or whatever
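
The stdin/stdout approach described above could look roughly like this. The thread talks about Node; this is the same idea sketched in Python, and the helper names and the choice of MPEG-TS as the output format are assumptions, not anyone's actual pipeline. `pipe:0` and `pipe:1` are ffmpeg's names for stdin and stdout.

```python
import subprocess
from typing import List


def build_pipe_command(output_format: str = "mpegts",
                       ffmpeg: str = "ffmpeg") -> List[str]:
    """Build an ffmpeg command that reads stdin and writes stdout."""
    return [
        ffmpeg, "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",        # input comes from stdin
        "-f", output_format,   # stdout has no file extension, so name the format
        "pipe:1",              # output goes to stdout
    ]


def transcode_bytes(data: bytes, output_format: str = "mpegts") -> bytes:
    """Run ffmpeg on in-memory bytes without touching disk.

    Requires an ffmpeg binary on PATH; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        build_pipe_command(output_format),
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

One caveat with pipes is that they are not seekable, so container formats that need to seek back (plain MP4 output, for example) tend to need extra flags or a streamable format instead.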

cool idea. can you elaborate on IO and how the ffmpeg-server reads blocks from the client? that would seem to be a big blocker

> cool idea. can you elaborate on IO and how the ffmpeg-server reads blocks from the client? that would seem to be a big blocker

ffmpeg-server runs a patched version of ffmpeg locally. When ffmpeg asks to read some chunks (i.e. give me video.mp4), the request goes through our patched filesystem (https://github.com/steelbrain/ffmpeg-over-ip/blob/main/fio/f...) and gets sent over the same socket that the client established. The client receives the request, does the file operations locally and sends the results back over the socket to the server, and the server then passes them to ffmpeg.

ffmpeg has no idea it's not interacting with a local file system.
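
To make the client side of that round trip concrete, here is a minimal sketch in Python. It is not the project's actual code or wire format: the `LocalFileService` class, the request dictionaries and the operation names are all hypothetical, and the socket framing is left out entirely. It only shows the idea of serving open/read/write/close requests against the local disk.

```python
import os
from typing import BinaryIO, Dict


class LocalFileService:
    """Runs file operations requested by the remote side on the local disk."""

    def __init__(self, root: str) -> None:
        self.root = os.path.realpath(root)
        self.handles: Dict[int, BinaryIO] = {}
        self.next_handle = 1

    def _resolve(self, path: str) -> str:
        # Refuse paths that would escape the directory the client exposes.
        full = os.path.realpath(os.path.join(self.root, path))
        if os.path.commonpath([self.root, full]) != self.root:
            raise PermissionError(f"path escapes root: {path}")
        return full

    def handle(self, request: dict) -> dict:
        op = request["op"]
        if op == "open":
            mode = "wb" if request.get("write") else "rb"
            handle = self.next_handle
            self.next_handle += 1
            self.handles[handle] = open(self._resolve(request["path"]), mode)
            return {"ok": True, "handle": handle}
        if op == "read":
            f = self.handles[request["handle"]]
            f.seek(request["offset"])
            return {"ok": True, "data": f.read(request["size"])}
        if op == "write":
            f = self.handles[request["handle"]]
            return {"ok": True, "written": f.write(request["data"])}
        if op == "close":
            self.handles.pop(request["handle"]).close()
            return {"ok": True}
        return {"ok": False, "error": f"unknown op: {op}"}
```

In this sketch the server would send something like `{"op": "open", "path": "video.mp4"}`, get a handle back, and then issue `read` requests with an offset and size until ffmpeg is done with the file.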

Is video so CPU/GPU bound that streaming it over the interwebs isn't the issue?

Maybe my use cases for ffmpeg are quite narrow, but I always get a speedup from moving the files off my external hard-drive, suggesting that is my current bottleneck.

> streaming it over the interwebs isn't the issue

The hope is that you stream over LAN not the interwebs!

> I always get a speedup from moving the files off my external hard-drive

Based on your description, it does seem like your ffmpeg may be IO limited

Ah, yeah, so this is probably for more professional workflows where you have a workhorse somewhere. Perhaps even in the cloud as long as the files are close by as well? My use case would be more "my computer sucks, so would be nice to do it on a beefy cloud computer", but of course no time is saved when just reading my files is slow, heh.

very clever and thanks for explaining. for gpu-bound processes, which are common ffmpeg use cases, this is a great approach

What's the point of this?

A single CPU core on a 9500T or a Ryzen V1500B is fast enough to real-time re-encode 60 Mbps 4K H.264 to 1080p 5 Mbps H.264. In other words, for a core use case (transcoding for web for Jellyfin over cellular, for example) you haven't needed hardware video engines on PCs for 9 YEARS.

I have no idea why people are so hung up on hardware video encoding. It's completely wrong. The quality is worse. The efficiency is a red herring - you will still use every CPU core for IO threads in ffmpeg, if you don't configure that away, which you do not. And it requires really annoying setup and premium features on stuff like Plex. It just makes no sense!

If latency is important to you, well then hardware engines make sense. But you are throwing away the latency sending it over the network. The only use case (basically) is video game streaming, and in that case, you'll have a local GPU.

I have never seen one of these ffmpeg network hardware encode innovations include an actual benchmark comparison against single-threaded software transcoding.

I know you mean well but really. It makes NO sense.

> The efficiency is a red herring - you will still use every CPU core for IO threads in ffmpeg, if you don't configure that away, which you do not. And it requires really annoying setup and premium features on stuff like Plex. It just makes no sense!

I would love to learn more about this! What can I do to fully optimize ffmpeg hardware encoding?

My use case is transcoding a massive media library to AV1 for the space gains. I am aware this comes with a slight drop in quality (which I would also be keen to learn about how to minimize), but so far, in my testing, GPU encoding has been the fastest/most efficient, especially with Nvidia cards.

GPU encoding is fast, but usually it produces poorer quality results because it avoids trying paths that are hard to do quickly on the GPU.

If you want to optimise, try different encoders (sounds like you've already done some of this) and lots of different settings - it'll involve a lot of tuning if you want to figure out the right balance for your particular media between quality/speed/size, while also making sure that your machine hurts as much as possible.

Driveby 2c as a video industry person: don't retranscode your media unless you've got them in a really space inefficient codec and you're seriously hurting for space. You'll burn a lot of power retranscoding, are you actually saving useful $$$ of storage in exchange for that spend? Storage is cheap, and there's always a better codec coming along you could retranscode into and save some more space. It's a vicious cycle: each generation has to encode the artifacts from the previous generations.

You would use your full system, saturating the CPU and GPU, including unlocking the number of simultaneous video sessions for consumer NVIDIA GPUs. That said, software AV1 looks a lot better than hardware AV1 per bit.

Thank you for sharing your experience. Seems like this is not relevant to your setup & usecase.

People who need this know who they are. Not everything is for everybody.

What is the use case?

I'd argue this is for nobody haha

Nobody using Jellyfin, Plex or whatever needs it: they should just use software transcoding, it's better in pretty much every way.

I've traveled around a lot in the past couple of years, so my situation (read: homelab equipment) has been changing and my use case has been changing with it. It started out as:

- I don't want to unplug the GPU from my gaming PC and plug it into my Linux server

- Then: I don't want to figure out PCI forwarding, I'll just open a port and NFS to the containers/VMs (ffmpeg-over-ip v4 needed a shared filesystem)

- Now: I have a homelab of 4 mini PCs and one of them has an RTX 3090 over OCuLink. I need it for local LLMs but also video encoding, and I don't want to do both on the same machine.

But you've asked a more fundamental question: why would people need hardware accelerated video decoding in the first place? I need it because my TV doesn't support all the codecs and I still want to watch my movies at 4K without stuttering.

You can transcode in realtime in software to your TV. You don't need the GPU at all. Even on ancient USFF PCs.

I'll tell my TV you said that and I'll see if it stops buffering during playback :)

> You can transcode in realtime in software

Sometimes you want faster-than-realtime encoding, such as when backing up your video archive.

A CPU using all its cores is much faster than realtime.

This doesn't appear to be true. My Plex media server is ancient and it really struggles if it has to do any kind of transcoding. Definitely can't handle high bitrate 4k stuff.

As a rule, strong feelings about issues do not emerge from deep understanding.

Why are you using github as your personal proprietary app depot?

I am using it as my everything-depot. Beyond this proprietary app, you'll find more than a hundred of my other, open source projects there as well.

Including the building blocks of said proprietary app: https://github.com/steelbrain/XMLKit & https://github.com/steelbrain/IPCamKit

Nice. The problem is that, with Hikvision and Dahua banned these days, the majority of IP cameras on the market lack ONVIF, RTSP, or both. What a shame.

Get a TP Link Tapo! They are like 20-30 bucks and come with ONVIF.

EZVIZ is another ban-evading arm of Hikvision. It is easily available in Europe and has RTSP (confirmed), with alleged support for ONVIF as well.

Nearly all IP cameras support ONVIF and/or RTSP.