Epstein Files Jan 30, 2026

Data hoarders on reddit have been hard at work archiving the latest Epstein Files release from the U.S. Department of Justice. Below is a compilation of their work with download links.

Please seed all torrent files to distribute and preserve this data.

Ref: https://old.reddit.com/r/DataHoarder/comments/1qrk3qk/epstein_files_datasets_9_10_11_300_gb_lets_keep/

Epstein Files Data Sets 1-8: INTERNET ARCHIVE LINK

Epstein Files Data Set 1 (2.47 GB): TORRENT MAGNET LINK
Epstein Files Data Set 2 (631.6 MB): TORRENT MAGNET LINK
Epstein Files Data Set 3 (599.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 4 (358.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 5 (61.5 MB): TORRENT MAGNET LINK
Epstein Files Data Set 6 (53.0 MB): TORRENT MAGNET LINK
Epstein Files Data Set 7 (98.2 MB): TORRENT MAGNET LINK
Epstein Files Data Set 8 (10.67 GB): TORRENT MAGNET LINK


Epstein Files Data Set 9 (Incomplete). Only contains 49 GB of 180 GB. Multiple reports of cutoff from DOJ server at offset 48995762176.

ORIGINAL JUSTICE DEPARTMENT LINK

  • TORRENT MAGNET LINK (removed due to reports of CSAM)

/u/susadmin’s More Complete Data Set 9 (96.25 GB)
De-duplicated merger of (45.63 GB + 86.74 GB) versions

  • TORRENT MAGNET LINK (removed due to reports of CSAM)

Epstein Files Data Set 10 (78.64 GB)

ORIGINAL JUSTICE DEPARTMENT LINK

  • TORRENT MAGNET LINK (removed due to reports of CSAM)
  • INTERNET ARCHIVE FOLDER (removed due to reports of CSAM)
  • INTERNET ARCHIVE DIRECT LINK (removed due to reports of CSAM)

Epstein Files Data Set 11 (25.55 GB)

ORIGINAL JUSTICE DEPARTMENT LINK

SHA1: 574950c0f86765e897268834ac6ef38b370cad2a


Epstein Files Data Set 12 (114.1 MB)

ORIGINAL JUSTICE DEPARTMENT LINK

SHA1: 20f804ab55687c957fd249cd0d417d5fe7438281
MD5: b1206186332bb1af021e86d68468f9fe
SHA256: b5314b7efca98e25d8b35e4b7fac3ebb3ca2e6cfd0937aa2300ca8b71543bbe2
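
A downloaded copy can be compared against the hashes published above. The following is a minimal sketch, assuming the Data Set 12 archive has already been saved locally; the function names `file_digests` and `verify` are hypothetical and not part of any tool mentioned in this post.

```python
import hashlib

# Hashes published above for Data Set 12.
EXPECTED = {
    "sha1": "20f804ab55687c957fd249cd0d417d5fe7438281",
    "md5": "b1206186332bb1af021e86d68468f9fe",
    "sha256": "b5314b7efca98e25d8b35e4b7fac3ebb3ca2e6cfd0937aa2300ca8b71543bbe2",
}


def file_digests(path, algorithms=("sha1", "md5", "sha256"), chunk_size=1 << 20):
    """Hash a file once, feeding every chunk to all requested algorithms."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large archives never have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify(path, expected=EXPECTED):
    """Return {algorithm: True/False} for each expected digest."""
    actual = file_digests(path, tuple(expected))
    return {name: actual[name] == value.lower() for name, value in expected.items()}
```

Calling `verify("path/to/archive.zip")` (a placeholder path) returns one boolean per algorithm; all three should be `True` for an intact copy.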


This list will be edited as more data becomes available, particularly with regard to Data Set 9 (EDIT: NOT ANYMORE)


EDIT [2026-02-02]: After being made aware of potential CSAM in the original Data Set 9 releases and seeing confirmation in the New York Times, I will no longer support any effort to maintain links to archives of it. There is suspicion of CSAM in Data Set 10 as well. I am removing links to both archives.

Some in this thread may be upset by this action. It is right to be distrustful of a government that has not shown signs of integrity. However, I do trust journalists who hold the government accountable.

I am abandoning this project and removing any links to content that commenters here and on reddit have suggested may contain CSAM.

Ref 1: https://www.nytimes.com/2026/02/01/us/nude-photos-epstein-files.html
Ref 2: https://www.404media.co/doj-released-unredacted-nude-images-in-epstein-files

  • PeoplesElbow@lemmy.world
    3 hours ago

    Ok everyone, I have done a complete indexing of the first 13,000 pages of the DOJ Data Set 9.

    KEY FINDING: 3 files are listed but INACCESSIBLE

    These appear in DOJ pagination but return error pages - potential evidence of removal:

    EFTA00326497

    EFTA00326501

    EFTA00534391

    You can try them yourself (they all fail):

    https://www.justice.gov/epstein/files/DataSet 9/EFTA00326497.pdf

    The 86GB torrent is 7x more complete than DOJ website

    DOJ website exposes: 77,766 files

    Torrent contains: 531,256 files

    Page Range     Min EFTA       Max EFTA       New Files
    0-499          EFTA00039025   EFTA00267311   21,842
    500-999        EFTA00267314   EFTA00337032   18,983
    1000-1499      EFTA00067524   EFTA00380774   14,396
    1500-1999      EFTA00092963   EFTA00413050   2,709
    2000-2499      EFTA00083599   EFTA00426736   4,432
    2500-2999      EFTA00218527   EFTA00423620   4,515
    3000-3499      EFTA00203975   EFTA00539216   2,692
    3500-3999      EFTA00137295   EFTA00313715   329
    4000-4499      EFTA00078217   EFTA00338754   706
    4500-4999      EFTA00338134   EFTA00384534   2,825
    5000-5499      EFTA00377742   EFTA00415182   1,353
    5500-5999      EFTA00416356   EFTA00432673   1,214
    6000-6499      EFTA00213187   EFTA00270156   501
    6500-6999      EFTA00068280   EFTA00281003   554
    7000-7499      EFTA00154989   EFTA00425720   106
    7500-7999      (no new files - all wraps/redundant)
    8000-8499      (no new files - all wraps/redundant)
    8500-8999      EFTA00168409   EFTA00169291   10
    9000-9499      EFTA00154873   EFTA00154974   35
    9500-9999      EFTA00139661   EFTA00377759   324
    10000-10499    EFTA00140897   EFTA01262781   240
    10500-12999    (no new files - all wraps/redundant)

    TOTAL UNIQUE FILES: 77,766

    Pagination limit discovered: page 184,467,440,737,095,516 (2^64/100)

    I searched random pages between 13k and this limit - NO new documents found. The pagination is an infinite loop. All work at: https://github.com/degenai/Dataset9
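
The figures in the comment above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the per-range counts are copied from the table (ranges with no new files counted as 0), and the variable names are made up for illustration.

```python
# "New Files" per page range, copied from the table above.
# Ranges reported as "no new files" are entered as 0.
NEW_FILES = [
    21842, 18983, 14396, 2709, 4432, 4515, 2692, 329, 706, 2825, 1353,
    1214, 501, 554, 106, 0, 0, 10, 35, 324, 240, 0,
]

total_unique = sum(NEW_FILES)       # compare with "TOTAL UNIQUE FILES"
page_limit = 2**64 // 100           # compare with the quoted pagination limit
torrent_files = 531_256             # file count quoted for the torrent
ratio = torrent_files / total_unique

print(f"total unique files: {total_unique:,}")
print(f"2^64 // 100:        {page_limit:,}")
print(f"torrent / website:  {ratio:.2f}x")
```

The per-range counts add up to the stated total of 77,766, the integer part of 2^64/100 matches the quoted page number, and the file-count ratio comes out a little under 7, consistent with the "7x" claim.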

  • Arthas@lemmy.world
    8 hours ago

    I am downloading dataset 9 and should have the full 180 GB zip done in a day. Can anyone confirm whether the DOJ link to the dataset 9 zip has now been updated to be clean of CSAM? As much as I wish to help the cause, I do not want any of that type of material on my server unless permission has been given to host it solely for credible researchers who need access to all files for their investigations. I have no way of knowing what is within my legal rights when it comes to redistributing the files to legitimate investigators, so my plans to help create a torrent may be squashed. Please let me know.

      • Arthas@lemmy.world
        3 hours ago

        I have various chunking techniques that I use. I adaptively modify the request size of the chunks as I’ve noticed at times the CDN will give large amounts then micro amounts. I haven’t figured out the exact backoff rate but I have retry mechanisms in place. The CDN is very annoying but so far my methods are working, just slow.

    • BWint@lemmy.world
      7 hours ago

      Amazing - Once you have the 180GB Set 9 downloaded, I’ll seed.

      At this point, my working assumption is that the version you’re downloading should be presumed to be free of CSAM, but we can’t know for sure until we check it. In addition, I assume that legitimate files were also removed from the version you’re downloading, but the legitimate files are preserved in the archives we already have (along with, tragically, the CSAM.)

      I think that after you download the 180GB set, we should compare it to our existing files to identify files that were removed. Then, we can identify which of the removed files were CSAM, and which of the removed files were legitimate. Going to be a hell of a task…

      • o_derr889@lemmy.world
        6 hours ago

        Someone posted the list of the original links. If it helps with cross-referencing, I can check whether I have it.

      • Arthas@lemmy.world
        7 hours ago

        OK, great. As for comparing files, I would likely do a hash check. It shouldn’t be difficult to identify truly unique files that way. It’ll take a few days for a decent computer to generate all the hashes, but it should be pretty automated. I’ll reach out once I have it completed.

        • BWint@lemmy.world
          6 hours ago

          Thank you! I’m not very tech savvy, so I’m very little help in this whole process. Please LMK what you find.

    • thetrekkersparky@startrek.website
      7 hours ago

      From my understanding, nobody knows. The DOJ said it was already removed, but the NYTimes claimed they found 40 images of CSAM. The DOJ said they immediately removed them Saturday, but a lot of files that didn’t contain CSAM were also removed. I’ve extracted the 101GB torrent and haven’t come across any yet, but there’s a ton of files in there too. People have not yet been able to download the entire ZIP and are trying to scrape everything individually, as far as I know.

      As for the legality, I’m not a lawyer and I don’t live in the States, but it’s all information that’s been released to the public by the US DOJ as required by a court order, so it’s a call that only you can make. With the amount of data that’s already disappeared, I’m personally choosing to host it regardless, and I’ll seed whatever anyone else can salvage of dataset 9 too.

    • jandrew13@lemmy.world
      7 hours ago

      Wondering the same thing myself. Not sure about the latest DS9 dump, but I’ve definitely seen some of the other leaks that included some CSAM. Crazy that the DOJ let that out the door. :/

    • Wild_Cow_5769@lemmy.world
      6 hours ago

      Good… I don’t trust what the DOJ says. If I see it with my own eyes, that’s one thing, and I’ll promptly delete it. But I don’t believe anything the DOJ says.

  • jandrew13@lemmy.world
    7 hours ago

    So what’s the consensus on what to do about all the fully uncensored CSAM the DOJ released on the 30th? Much of it has been removed as of today, but that shit is still fully up on archive.org… 🙄…Not Great…

    • Wild_Cow_5769@lemmy.world
      6 hours ago

      My two cents: I have nothing but daughters… all my children are daughters…

      If we don’t take care of this weird sexual abuse problem among authority figures and the like now, we never will…

      I don’t think I could sleep at night if I didn’t do my due diligence, because someday time will just move on and all of us will be too old to do something about it…

      We either take care of this now or we never will as a society…

      Think about it: do you ever think there will be another point in the future to root out this kind of evil?

      So I say release the files and let the chips fall where they fall, but that’s just my two cents…

      It would be one thing if this entire process felt like we could really trust Justice to do the right thing…

      Just look over there in the Epstein forum on Reddit. There are all kinds of pictures and names of really, really wealthy people who can easily buy their way out of trouble…

      • jandrew13@lemmy.world
        6 hours ago

        Hey that makes sense to me man.

        I think there will be plenty of falling chips in the coming weeks. Once the data is aggregated and truly accessible and searchable… someone is going to make some AI something that can connect the dots faster than the justice system - because my god is it slow as molasses.

        I’m so tired of waiting around.

    • BWint@lemmy.world
      7 hours ago

      It’s “great” that the DOJ removed CSAM at the same time as they were removing perfectly legitimate files that are in the public interest. That’s just really smart. Puts us all in a hell of a bind.

      I can’t speak for others, but I’ll plan to preserve the 87GB Set 9, the 90GB Set 9, and Set 10, until we can get an updated “complete” (current) Set 9 that can be presumed to be free of CSAM. After that, we can try to identify the legitimate files that are missing from the “complete” Set 9, and preserve those while purging the CSAM.

      • jandrew13@lemmy.world
        6 hours ago

        This seems like a valid plan - although I’m not that confident in the ‘purge’. It might be good to redact those images ourselves and then nobody is pressed to store them. Better to have a confidently safe dataset that can be passed around safely.

        Also, from what I can tell, it looks like they went back and repaired the shitty text redactions on docs that were released in late 2025. I ran a script that auto-detects and removes “fake” redactions, and it’s not getting any hits anymore, even on files that it flagged in the past. They are definitely trying to cover their tracks by the day.

      • jandrew13@lemmy.world
        6 hours ago

        Without a timestamp on the photo it’s impossible to be 100% sure, but it was obvious enough for me to ask the question. :/ It seems like it was a mistake on their part, because everything else has heavily redacted nudity. You can also see references in the internal memo docs preceding the content.

        • Wild_Cow_5769@lemmy.world
          6 hours ago

          There’s a lawsuit trying to have a judge issue an injunction against the file release. There isn’t a lot of time left…

          Once those files go away, do you honestly think anybody will ever get to see them again?

  • Wild_Cow_5769@lemmy.world
    17 hours ago

    As far as CSAM and the “don’t go looking for data set 9”…

    Look I’ll be straight up.

    If I find any CSAM it gets deleted…

    But if you believe for 1 second that DOJ didn’t delete relevant files because they are protecting people, then I have a time share to sell you at a cheap price on a beautiful scenic swamp in Florida…

    • MachineFab812@discuss.tchncs.de
      11 hours ago

      It’s literally left-in on purpose to try to have something over people that download and/or seed the torrents. We need a file-list to know what not to dl/seed, or a new torrent for that set.

  • SteveClement@lemmy.world
    12 hours ago

    Quite a few bugs in the script:

    1886339.3% (173,015,040 / 9,172 bytes) │

    The most worrying thing is reading posts from people who say they ran the code without understanding what it does.

    Copy and paste the code into an AI thingy and ask it if it is safe.

    Obviously OP has vibed this together too, but there are multiple attack vectors.
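
The progress line quoted above is what a naive `100 * done / total` produces when the reported total is smaller than the bytes already received. The script itself is not shown in this thread, so the cause is a guess; the sketch below only illustrates one defensive way to format such a line, and `format_progress` is a hypothetical helper, not a function from that script.

```python
def format_progress(done_bytes, total_bytes):
    """Format a progress line, refusing to print a percentage that cannot be right.

    If the total is missing, non-positive, or smaller than what has already
    been received, the total is treated as unknown instead of being trusted.
    """
    if total_bytes is None or total_bytes <= 0 or done_bytes > total_bytes:
        return f"{done_bytes:,} bytes (total unknown)"
    percent = 100.0 * done_bytes / total_bytes
    return f"{percent:.1f}% ({done_bytes:,} / {total_bytes:,} bytes)"


# The numbers from the quoted output: the naive formula gives the bogus figure,
# while the guarded version falls back to a byte count.
naive = f"{100.0 * 173_015_040 / 9_172:.1f}%"
guarded = format_progress(173_015_040, 9_172)
```

A guard like this does not fix whatever reports the wrong total; it only keeps the display from showing a misleading percentage.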

  • Wild_Cow_5769@lemmy.world
    18 hours ago

    Hello… I followed the breadcrumbs from Reddit. So I have the dataset 9 48GB torrent downloaded. I’ve been trying to get the Python chunk script someone dropped below to yield results.

    I did my cookie. Exported to Netscape. Yadda yadda.

    I can seemingly connect if I start at chunk 00000000.

    But the minute I try to connect at the chunk number of the bin file from the 48GB torrent, it just connects once and then fails over and over.

    Has anyone found a magic formula to get more from dataset 9?

  • BWint@lemmy.world
    22 hours ago

    For those curious, here’s the NYTimes article where they report on the CSAM in the publicly-released files: https://www.nytimes.com/2026/02/01/us/nude-photos-epstein-files.html (behind paywall.)

    NYTimes says that they discovered the CSAM on Friday and notified the DOJ on Saturday, and the DOJ was diligent in removing the files NYTimes had flagged.

    NYTimes does not say that the material is in Dataset 9 specifically, but we observed that the DOJ was removing files from Dataset 9 on Saturday and not other datasets, so the server behavior would be consistent with CSAM in Dataset 9.

    • Moonsurfer_1@lemmy.world
      21 hours ago

      That sounds bad and it must be awful for the victims. Still, the evidence must be preserved. The administration can’t be trusted to do so. The stakes are too high.

      And even though removing CSAM might be the official tagline, I have my doubts that that’s the only stuff that is getting redacted/removed.

      • BWint@lemmy.world
        11 hours ago

        100% - That’s why I haven’t deleted my copy of Set 9. I have no plans to unzip it, and I’m glad that DOJ is removing the CSAM now, but I’m going to hold onto the set to preserve the valuable docs that DOJ is removing.

  • bile@lemmy.world
    1 day ago

    just advising you that there is confirmed csam in dataset9-more-complete.tar.zst and probably the other partial dataset9s

    • Wild_Cow_5769@lemmy.world
      6 hours ago

      Seems like an interesting excuse for why they all need to be removed from public viewing…

      Have you actually seen it? Or are you just going off a report?

    • xodoh74984@lemmy.world (OP)
      1 day ago

      This is very concerning. DOJ has stated explicitly that any CSAM was removed before releasing the files. Should I remove the magnet link to the merged Data Set 9 torrent?

      I haven’t looked inside any of these sets myself. My primary goal has been to get the DOJ data distributed.

  • TheBobverse@lemmy.world
    1 day ago

    Is there any grunt work that needs to be done? I would like to help out but I’m not sure how to make sure my work isn’t redundant. I mean like looking through individual files etc. Is there an organized effort to comb through everything?

  • ModernSimian@lemmy.world
    1 day ago

    I’m not sure if it is useful to anyone, but the partial 9 zip from the DOJ website does contain the eDiscovery index files, VOL00009.DAT and VOL00009.OPT, which are conveniently at the very start of the zip file. They are text files, and it’s easy to parse out what files they thought were included in the massive zip file… IDK if you have one from zero hour, but I have the first few GB from the one the CDN occasionally spits out saved, if anyone wants them to see what files may be missing from the “index”.
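
For anyone who has not worked with eDiscovery load files: an .OPT file is usually a comma-separated Opticon list. The sketch below assumes the common seven-column layout (image key, volume, path, document break, folder break, box break, page count); whether VOL00009.OPT follows it exactly has not been checked here, and the sample rows are made up purely to show the shape.

```python
import csv
import io

# Assumed Opticon column order; verify against the real file before relying on it.
OPT_FIELDS = (
    "image_key", "volume", "path",
    "doc_break", "folder_break", "box_break", "page_count",
)


def parse_opt(text):
    """Return one record per document (rows whose doc_break column is 'Y')."""
    documents = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        # Pad short rows so every field name has a value.
        padded = (row + [""] * len(OPT_FIELDS))[: len(OPT_FIELDS)]
        record = dict(zip(OPT_FIELDS, padded))
        if record["doc_break"].strip().upper() != "Y":
            continue  # continuation page of the previous document
        pages = record["page_count"].strip()
        documents.append({
            "image_key": record["image_key"].strip(),
            "path": record["path"].strip(),
            "page_count": int(pages) if pages.isdigit() else None,
        })
    return documents


# Made-up rows for illustration only; not taken from the real index.
SAMPLE = (
    "DOC00000001,VOL00009,IMAGES/0001/DOC00000001.pdf,Y,,,2\n"
    "DOC00000002,VOL00009,IMAGES/0001/DOC00000001.pdf,,,,\n"
    "DOC00000003,VOL00009,IMAGES/0001/DOC00000003.pdf,Y,,,1\n"
)
sample_docs = parse_opt(SAMPLE)
```

The resulting list of keys and paths is the "index" view of what the archive was meant to contain, which can then be compared against the files actually present.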

    • CapableStaircase@lemmy.zip
      10 hours ago

      Hi, OG 101GB dataset uploaded here. The DAT/OPT files are exactly what I used to fetch the files for this dataset.

      I want to go through the other partial dataset 9 zips and check for deltas in the contents of the DAT/OPT files but haven’t had the time yet.

    • donmega@lemmy.world
      22 hours ago

      Hi, I am the admin of epsteinfilez.com. I have never claimed that I have the full Dataset 9. The banner says that I have 101GB of Dataset 9, the one that is also shared here with the magnet link.

    • xodoh74984@lemmy.world (OP)
      23 hours ago

      The flashing banner at the top says that it includes 101GB of Data Set 9. Unfortunately, I think they just grabbed the larger of the two torrents.

  • berf@lemmy.world
    1 day ago

    I’ve been working on a structured inventory of the datasets with a slightly different angle: rather than maximizing scrape coverage, I’m focusing on understanding what’s present vs. what appears to be structurally missing based on filename patterns, numeric continuity, file sizes, and anchor adjacency.

    For Dataset 9 specifically, collapsing hundreds of thousands of files down into a small number of high-confidence “missing blocks” has been useful for auditing completeness once large merged sets (like yours) exist. The goal isn’t to assume missing content, but to identify ranges where the structure strongly suggests attachments or exhibits likely existed.

    If anyone else here is doing similar inventory or diff work, I’d be interested in comparing methodology and sanity-checking assumptions. No requests for files (yet), just notes on structure and verification.

    • jankscripts@lemmy.world
      1 day ago

      Keep in mind when looking at the file names: the file name is the number of the first page of the document, and each page in the document is part of the numbering scheme.

      EFTA00039025.pdf

      EFTA00039026 …

      … EFTA00039152

      • berf@lemmy.world
        1 day ago

        Just tested whether numeric gaps represent missing files or page-level numbering. In at least one major Dataset 9 block, the adjacent PDF’s page count exactly matches the numeric span, indicating page bundling rather than missing documents. I’m incorporating page counts into the audit model to distinguish the two.

        Thanks so much for setting that straight.
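
The distinction discussed above (page-level numbering versus genuinely missing numbers) can be sketched as a small gap check. This is an illustration under assumptions, not the audit model described in the comment: `find_gaps` is a hypothetical helper, page counts are taken as given (they would have to come from a PDF reader or an index file), and apart from the 128-page span of EFTA00039025 through EFTA00039152 mentioned above, the example documents are invented.

```python
import re

EFTA_RE = re.compile(r"EFTA(\d{8})")


def efta_number(name):
    """Extract the numeric part of a name like 'EFTA00039025.pdf'."""
    match = EFTA_RE.search(name)
    if match is None:
        raise ValueError(f"not an EFTA file name: {name!r}")
    return int(match.group(1))


def find_gaps(documents):
    """Return (first, last) number ranges not covered by any document's pages.

    `documents` is an iterable of (file_name, page_count). A document named
    EFTA00039025 with 128 pages covers numbers 39025 through 39152, so the
    next document is expected to start at 39153.
    """
    spans = sorted((efta_number(name), pages) for name, pages in documents)
    gaps = []
    next_expected = None
    for start, pages in spans:
        if next_expected is not None and start > next_expected:
            gaps.append((next_expected, start - 1))
        end = start + pages  # first number after this document
        next_expected = end if next_expected is None else max(next_expected, end)
    return gaps


# First entry matches the span quoted above; the other two are hypothetical.
example = [
    ("EFTA00039025.pdf", 128),
    ("EFTA00039153.pdf", 2),
    ("EFTA00039160.pdf", 1),
]
gaps = find_gaps(example)
```

With these inputs the first two documents are contiguous, and the only uncovered range is 39155 through 39159, which is the kind of block that would merit a closer look.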

  • Nomad64@lemmy.world
    2 days ago

    I am seeding sets 1-8, 10-12, and the larger set 9. Seedbox is outside the US and has a very fast connection.

    I will keep an eye on this post for other sets. 👍