cross-posted from: https://linkage.ds8.zone/post/523771

Before posting an image to the fedi, I want to be mindful about the network burden it will cause. I’m only uploading the image once but potentially thousands of people could end up downloading it.

If it’s a color image, then JPG is typically best. This #ImageMagick command reduces the filesize quite a bit, trading off quality:

  $ convert "$original_image_file" \
    +dither \
    -posterize 8 \
    -sampling-factor 4:2:0 \
    -strip \
    -quality 75 \
    -interlace Plane \
    -gaussian-blur 0.05 \
    -colorspace RGB \
    smaller_file.jpg

If it’s a pic of a person, this processing will likely be a disaster. But for most things where color doesn’t matter too much, it can be quite useful. Play with different -posterize values.
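To make "play with different -posterize values" concrete, here is a minimal sketch that writes one JPEG per level and prints each file size, so the results can be reviewed side by side. The function name `posterize_sweep`, the list of levels, and the output file names are my own assumptions, not part of the original command.

```shell
# Hypothetical helper: try several -posterize levels on one input image.
# Usage: posterize_sweep input.png
posterize_sweep() {
  src=$1
  for levels in 4 8 16 32; do
    out="posterize_${levels}.jpg"
    # Same idea as the command above, with only the posterize level varying.
    convert "$src" +dither -posterize "$levels" \
      -sampling-factor 4:2:0 -strip -quality 75 -interlace Plane "$out"
    # wc -c reports the size in bytes; tr drops the padding some wc versions add.
    printf '%s\t%s bytes\n' "$out" "$(wc -c < "$out" | tr -d ' ')"
  done
}
```

Open the resulting files in a viewer and keep the smallest one that still looks acceptable.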

If you can do with fewer pixels, adding a -resize helps.

  $ convert "$original_image_file" -resize 215x smaller_file.jpg
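One thing to watch: a plain `-resize 215x` will also enlarge an image that is narrower than 215 pixels. Appending `>` to the geometry tells ImageMagick to shrink only when the image is larger than the target. A small sketch, where the function name `shrink_to_width` is an assumption for illustration:

```shell
# Hypothetical helper: cap the width of an image without ever upscaling it.
# Usage: shrink_to_width input.png 215 output.jpg
shrink_to_width() {
  # The ">" geometry flag means "only shrink images larger than this".
  # It must stay quoted so the shell does not treat it as a redirect.
  convert "$1" -resize "${2}x>" "$3"
}
```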

If you can get away with black and white, JPEG is a terrible fit: it cannot store 1-bit images and it smears hard edges. Use PNG instead. E.g.

  $ convert "$original_image_file" -threshold 30% -type bilevel smaller_file.png
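To check the PNG versus JPEG claim on your own image, a rough sketch like the following saves the same thresholded result in both formats and prints the sizes. The function name `compare_bilevel` and the output file names are assumptions; the 30% threshold is taken from the command above.

```shell
# Hypothetical helper: threshold once, then save the same bilevel image
# as both PNG and JPEG so the two file sizes can be compared directly.
# Usage: compare_bilevel input.png
compare_bilevel() {
  src=$1
  convert "$src" -threshold 30% -type bilevel bilevel.png
  # Re-encode the already thresholded image as JPEG for comparison.
  convert bilevel.png -quality 75 bilevel.jpg
  # Prints the byte count of each file (plus a total line).
  wc -c bilevel.png bilevel.jpg
}
```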

For privacy, strip the metadata

The ImageMagick -strip option supposedly strips out metadata. But it’s apparently not thorough because the following command yields a slightly smaller file size:

  $ exiftool -all= image.jpg
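To see what `-strip` leaves behind, one option is to run exiftool on ImageMagick's output and compare. This is only a sketch: the function name `compare_strip` and the file names are assumptions. Note that `exiftool -all=` normally keeps a backup copy named `image.jpg_original`; the `-overwrite_original` option used here skips that backup.

```shell
# Hypothetical helper: strip with ImageMagick, then strip that result again
# with exiftool, and compare the two file sizes.
# Usage: compare_strip photo.jpg
compare_strip() {
  src=$1
  convert "$src" -strip stripped_im.jpg
  # List whatever tags exiftool still reports after -strip
  # (-a: show duplicates, -G1: show group names, -s: short tag names).
  exiftool -a -G1 -s stripped_im.jpg
  cp stripped_im.jpg stripped_exiftool.jpg
  # -q silences the summary line; -overwrite_original skips the backup file.
  exiftool -q -overwrite_original -all= stripped_exiftool.jpg
  wc -c stripped_im.jpg stripped_exiftool.jpg
}
```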

What else?

Did I miss anything? Any opportunities to shrink images further? In principle the DjVu format would be more compact but it’s not mainstream and apparently not accepted by Lemmy.

  • LibreMonk@linkage.ds8.zone (OP) · 2 months ago

    I was quite confused when I read your post because Tesseract is an OCR engine. Your link helped sort it out.

    I think I have come across various fedi web clients that do conversions. I think PeerTube shrinks videos, IIRC. The auto conversions are useful but they must be conservative in the extent of their changes. The posterizing that I do with ImageMagick makes a dramatic change, so it could not be done automatically by a client or server; users need to review the output and decide. So I believe the best compression will always require manual effort in order to judge whether the quality loss is still acceptable for the application.

    Regarding Tesseract (the Lemmy client): does that work offline? I’m always looking for a Lemmy client that can briefly connect to sync content and then support reading and writing messages when offline.

    • Admiral Patrick@lemmy.world · 2 months ago

      No. It’s a PWA and can run offline from cache, but everything is fetched from the API in real time and not stored on-device. I’ve looked into offline support (probably using PouchDB) but haven’t made the first step toward that and have no idea if/when I ever will.