


  • That o3 does well on frontier math held-out set is impressive, no doubt

    I think there is still plenty of room for doubt. elliotglazer on Reddit writes:

    Epoch’s lead mathematician here. Yes, OAI funded this and has the dataset, which allowed them to evaluate o3 in-house. We haven’t yet independently verified their 25% claim. To do so, we’re currently developing a hold-out dataset and will be able to test their model without them having any prior exposure to these problems.

    My personal opinion is that OAI’s score is legit (i.e., they didn’t train on the dataset), and that they have no incentive to lie about internal benchmarking performances. However, we can’t vouch for them until our independent evaluation is complete.

    (emphasis mine). So there is good reason to doubt that the “held-out dataset” even exists.

  • Why when we look into the stars do we not see a sign of life anywhere else? Has life not emerged yet or has it wiped itself out? With what? Nukes? AI? Synthetic viruses made with AI? Who knows…

    entertaining this awful sci-fi schtick for a moment - if every civilization is wiped out by “superintelligent AI”, how come you can’t look through a telescope and see signs of artificial life? in this fantasy world shouldn’t planets taken over by paperclip factories be even more conspicuous?