Not sure if this is a better fit for datahoarder or some selfhost community, but I’m putting my money on this one.

The problem

I currently have a cute little server with two drives connected to it, running a few different services (mostly media serving and torrents). The key facts here are that 1) it’s cute and little, and 2) it’s handling pretty bulky data. Cute and little doesn’t go very well with big RAID setups and such, and apart from upgrading one of the drives I’m probably at my limit in terms of how much storage I can physically fit in the machine. Also, if I want to reinstall it or something, that’s very difficult to do without downtime, since I’d have to move the drives and services off to a different machine (not a huge problem since I’m the only one using it, but I don’t like it).

Solution

A distributed FS would definitely solve the issue of physically fitting more drives into the chassis, since I could basically just connect drives to a Raspberry Pi and have that Pi join the distributed FS. Great.

I think it could also solve the issue of potential downtime if I reinstall or do maintenance, since I can have multiple services read off the same distributed FS and reroute my reverse proxy to use the new services while the old ones are taken offline. There will potentially be a disruption, but no downtime.

Candidates

I know there are many different solutions for distributed filesystems, such as Ceph, MooseFS, GlusterFS and MinIO. I’m kinda leaning towards Ceph because of its integration in Proxmox, but it also seems like the most complicated solution in the bunch. Is it worth it? What are your experiences with these, and given the above description of my use case, which do you think would be the best fit?

Since I already have a lot of data it’s a bonus if it’s easy to migrate from my current filesystem somehow.

My current setup uses a lot of hard links as well, so it’s a big bonus if the solution has something similar (i.e. some easy way of storing the same data in multiple places without duplicating it)
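For anyone evaluating candidates against the hard-link requirement, here is a minimal sketch of a check that could be run on a mounted test volume. It creates a file, hard-links it, and confirms both names point at the same inode. The function name and file names are made up for illustration, and it only tells you something if you pass a directory that lives on the filesystem you want to test.

```python
import os
import tempfile


def demo_hard_link(directory):
    """Create a file plus a hard link to it.

    Returns (same_inode, link_count) for the original file.
    File names are hypothetical placeholders.
    """
    original = os.path.join(directory, "movie.mkv")
    alias = os.path.join(directory, "movie-linked.mkv")

    with open(original, "wb") as f:
        f.write(b"x" * 1024)

    # os.link adds a second directory entry for the same data;
    # it raises OSError if the filesystem does not support hard links.
    os.link(original, alias)

    a, b = os.stat(original), os.stat(alias)
    return a.st_ino == b.st_ino, a.st_nlink


if __name__ == "__main__":
    # Uses a temp dir here; swap in a path on the mounted volume under test.
    with tempfile.TemporaryDirectory() as d:
        same_inode, links = demo_hard_link(d)
        print("same inode:", same_inode, "link count:", links)
```

If `os.link` raises, or the two names report different inodes, the existing hard-link layout would likely not carry over as-is.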

  • xrun_detected@programming.dev
    11 months ago

    I can’t really tell you what to use, but from my personal experience - stay away from glusterfs and drbd. both have caused me serious trouble when trying to run them in a production setup. ceph seems to be pretty solid, though.

    • aleq@lemmy.worldOP
      11 months ago

      That’s very helpful because glusterfs and ceph are probably my top two candidates. Will probably try it out.

  • snekmuffin@lemmy.dbzer0.com
    11 months ago

    If you’re on Linux or BSD, look into ZFS. Insanely easy to set up and admin, fs-level volume management, compression and encryption, levels of RAID if you want them, and recently they even added the option to expand your data pools with new drives. All of that completely userspace, without having to fiddle with expensive RAID cards or motherboard firmware.

    • aleq@lemmy.worldOP
      11 months ago

      Isn’t it a local filesystem though, so I can’t expand the filesystem with other drives on my network?

    • krnl386@lemmy.ca
      11 months ago

      Huh? ZFS is not 100% userspace. You’re right that ZFS doesn’t need hardware RAID (in fact, it’s incompatible), but the standard OpenZFS implementation (unless you’re referring to the experimental FUSE-based one) does use kernelspace on both FreeBSD and Linux.

  • Nogami@lemmy.world
    11 months ago

    I get what you’re proposing but I’d respectfully suggest looking into unRAID on basically any hardware that can boot an OS.

    It won’t necessarily be small and cute (though you can accomplish that if you wish), but you can make it do just about anything. I bought old enterprise hardware to run my main and backup servers on. I feel really comfortable with my data safety.

    • You999@sh.itjust.works
      11 months ago

      FYI you probably shouldn’t be saying you feel really comfortable with your data safety while suggesting unraid. The way unraid handles its storage will lead to data loss at some point. Unraid only locks down an array and protects it when SMART starts issuing warnings that a drive has failed. SMART isn’t magic though, and when a drive starts to die it might write garbage data for days if not weeks before SMART catches on. If a drive writes garbage for long enough, there’s nothing you can do to fix it due to the way unraid handles arrays. This is why ZFS is such a popular option: it treats hard drives with a level of skepticism, verifies that the data was actually written correctly, and re-verifies the data from time to time.

      That’s not even mentioning unraid is charging for what other software does for free.
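The checksum idea described above can be illustrated with a toy sketch. This is not how ZFS is implemented (ZFS keeps checksums in block pointers and can repair from redundant copies); it is only a hypothetical in-memory store, with made-up class and method names, showing why a stored checksum catches silent corruption that a drive health warning would miss.

```python
import hashlib


class ChecksummedStore:
    """Toy in-memory block store that keeps a SHA-256 digest per block."""

    def __init__(self):
        self._blocks = {}  # key -> (data, hex digest recorded at write time)

    def write(self, key, data):
        data = bytes(data)
        self._blocks[key] = (data, hashlib.sha256(data).hexdigest())

    def read(self, key):
        # Verify on every read instead of trusting the stored bytes.
        data, digest = self._blocks[key]
        if hashlib.sha256(data).hexdigest() != digest:
            raise IOError(f"checksum mismatch for block {key!r}")
        return data

    def scrub(self):
        # Periodic full pass: return the keys of all blocks that fail verification.
        return [
            key
            for key, (data, digest) in self._blocks.items()
            if hashlib.sha256(data).hexdigest() != digest
        ]

    def simulate_bit_rot(self, key, garbage):
        # Test helper: replace the data but keep the old digest,
        # mimicking a drive that silently returns wrong bytes.
        _, digest = self._blocks[key]
        self._blocks[key] = (bytes(garbage), digest)


if __name__ == "__main__":
    store = ChecksummedStore()
    store.write("a", b"holiday photos")
    store.write("b", b"tax documents")
    store.simulate_bit_rot("b", b"tax d0cuments")
    print("damaged blocks:", store.scrub())
```

Detection is only half of it: repairing a damaged block also needs a second good copy or parity to rebuild from.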

      • Nogami@lemmy.world
        11 months ago

        So you don’t know unraid has ZFS now then? Gotta keep up with the times.

        And it’s worth every cent as commercial software. I bought 2 pro licenses because it’s just that good.

        • You999@sh.itjust.works
          11 months ago

          Sorta… If the array was built with hybrid ZFS within unraid, you do not get any of the safety nets ZFS provides, because what unraid is essentially doing is making a single-drive ZFS vdev for each drive in the array. That hybrid mode is what the majority of unraid users go with, as it allows for better mixing of various sized drives and easier expansion of the array in the future (in other words, the main selling points of unraid). In unraid’s own words: “ZFS-formatted disks in Unraid’s array do not offer inherent self-healing protection.”

          • Nogami@lemmy.world
            11 months ago

            Unraid natively supports full ZFS arrays in addition to unraid arrays since the last major release. Can mix and match both types on the same system as necessary.

            All of my (easily replaceable) Plex media is native unraid arrays while my documents are all on a ZFS array on the same system with snapshots and such. It’s the perfect solution.