This is an automated archive.

The original was posted on /r/datahoarder by /u/JuIi0 on 2023-08-30 12:56:49+00:00.


What’s good r/DataHoarder(s),

I’m deep in a media preservation project and trying to figure out how to conditionally re-encode videos based on their OG stats to save space. The end game is to hit a VMAF score of at least 98.8, so I don’t trade away quality for space.

While VMAF is a reliable after-the-fact metric, it doesn’t lend much guidance for the initial re-encoding settings. Sure, I can use ffprobe to get a snapshot of the original metrics, but when it’s go-time for picking those first-round encode settings, that’s where I hit a wall.

My current approach is:

  1. Get the initial video stream info:
       ffprobe -v error -select_streams v:0 -show_entries stream=codec_name,height,width,bit_rate -of default=noprint_wrappers=1:nokey=1 video123.mp4
  2. Re-encode by trial and error, changing the QP on each run (with -rc constqp the quantizer is set by -qp, not -cq):
       ffmpeg -hwaccel cuda -c:v h264_cuvid -i video123.mp4 -c:v hevc_nvenc -rc constqp -qp 32 -preset:v slow -c:a copy video123_HEVC_GPU.mp4
  3. Measure the VMAF of each run against the original (libvmaf takes the encode as its first input and the reference as its second, and both streams need the same frame sampling):
       ffmpeg -i video123_HEVC_GPU.mp4 -i video123.mp4 -filter_complex "[0:v]select=not(mod(n\,10)),scale=1280:720[dist]; [1:v]select=not(mod(n\,10)),scale=1280:720[ref]; [dist][ref]libvmaf" -an -f null -
  • I’ve considered manually tweaking encoding settings until I hit the sweet spot, but that’s a total time vampire and horribly inefficient. Even if I trim the encoding and down-sample my evals, I’m still stuck playing mad scientist trying to find the right encoding parameters.
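
One way the encode-then-measure loop above could be automated is a binary search over QP instead of manual guessing. The following is only a rough sketch under stated assumptions: it assumes VMAF does not rise as QP rises (usually true, not guaranteed), the helper names `measure_vmaf` and `find_qp` are made up for illustration, and the default search range of 18 to 40 is an arbitrary starting guess rather than a recommendation.

```shell
#!/bin/sh
# Sketch: find the highest constant QP whose encode still meets a VMAF target.
# Assumption: VMAF is non-increasing in QP, so binary search is valid.

# Hypothetical helper: encode SRC at QP with hevc_nvenc, then print the VMAF score.
# Leaves one intermediate file per probed QP next to the source.
measure_vmaf() {
    src=$1 qp=$2
    out="${src%.*}_qp${qp}.mp4"
    ffmpeg -y -v error -hwaccel cuda -i "$src" \
        -c:v hevc_nvenc -rc constqp -qp "$qp" -preset:v slow -c:a copy "$out" || return 1
    # libvmaf expects the distorted stream first and the reference second.
    ffmpeg -i "$out" -i "$src" -filter_complex \
        "[0:v]scale=1280:720[dist];[1:v]scale=1280:720[ref];[dist][ref]libvmaf" \
        -an -f null - 2>&1 | sed -n 's/.*VMAF score: \([0-9.]*\).*/\1/p'
}

# Print the highest QP in [lo, hi] with VMAF >= target; return 1 if none qualifies.
# Usage: find_qp SRC TARGET [LO] [HI]
find_qp() {
    fq_src=$1 fq_target=$2 fq_lo=${3:-18} fq_hi=${4:-40} fq_best=""
    while [ "$fq_lo" -le "$fq_hi" ]; do
        fq_mid=$(( (fq_lo + fq_hi) / 2 ))
        fq_score=$(measure_vmaf "$fq_src" "$fq_mid") || return 1
        [ -n "$fq_score" ] || return 1   # no score parsed: give up rather than guess
        # Shell arithmetic is integer-only, so compare the floats in awk.
        if awk -v s="$fq_score" -v t="$fq_target" 'BEGIN { exit !(s + 0 >= t + 0) }'; then
            fq_best=$fq_mid              # good enough: try a higher QP (smaller file)
            fq_lo=$(( fq_mid + 1 ))
        else
            fq_hi=$(( fq_mid - 1 ))      # too lossy: back off to a lower QP
        fi
    done
    if [ -n "$fq_best" ]; then
        echo "$fq_best"
    else
        return 1
    fi
}
```

With these functions sourced, `find_qp video123.mp4 98.8` would print the chosen QP. Each probe is a full encode plus a VMAF pass, so the default range costs at most five of each per file; trimming the clip or sub-sampling frames inside `measure_vmaf` would cut that further.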

So my questions would be:

  1. Initial Encoding Presets: Any rules of thumb or formulas for deciding initial re-encoding settings based on the original metrics?
  2. Efficiency: Are there any existing tools that could streamline this process and make it less manual?

If any of y’all have info on this, I’d appreciate it. If I can nail this project, I’d write a script to recursively dig through each video in a directory and drop it on GitHub.