Often when I read ML papers, the authors compare their results against a benchmark (e.g. using RMSE, accuracy, …) and say “our new method improved results by X%”. Nobody runs a significance test to check whether the new method Y actually outperforms benchmark Z. Is there a reason why? This seems especially important to me when you break your results down, e.g. to the analysis of certain classes in object classification. Or am I overlooking something?
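
For concreteness, here is a minimal sketch of the kind of test I have in mind: a paired bootstrap over per-example correctness for two models evaluated on the same test set. The accuracy arrays below are random placeholders, not real results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder per-example correctness (1 = correct) for two models on the
# same 1000-example test set; in practice these come from your eval runs.
baseline = rng.random(1000) < 0.80   # ~80% accuracy, hypothetical
new      = rng.random(1000) < 0.83   # ~83% accuracy, hypothetical

observed_diff = new.mean() - baseline.mean()

# Paired bootstrap: resample test examples with replacement and recompute
# the accuracy difference on each resample.
n_boot = 10_000
idx = rng.integers(0, len(baseline), size=(n_boot, len(baseline)))
diffs = new[idx].mean(axis=1) - baseline[idx].mean(axis=1)

# Two-sided p-value: fraction of bootstrap diffs on the "wrong" side of 0.
p = 2 * min((diffs <= 0).mean(), (diffs >= 0).mean())
print(f"accuracy gap = {observed_diff:.3f}, bootstrap p ~= {p:.4f}")
```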

  • GullibleEngineer4@alien.topB · 1 year ago

    Isn’t cross-validation (for prediction tasks) an alternative to, and I daresay even better than, statistical significance tests?

    I am referring to the seminal paper “Statistical Modeling: The Two Cultures” by Leo Breiman, if anyone wants to know where I’m coming from.

    Paper: https://www.jstor.org/stable/2676681
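
    As a rough illustration of what I mean (a sketch with scikit-learn on synthetic data; the dataset and models are placeholders, not anything from the paper): per-fold scores give you a mean and a spread, so you can see whether the gap between two methods exceeds fold-to-fold variation.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=2000, random_state=0)  # placeholder data

    # Report mean ± std of accuracy over 5 folds for each candidate method.
    for name, model in [("logreg", LogisticRegression(max_iter=1000)),
                        ("random forest", RandomForestClassifier(random_state=0))]:
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: {scores.mean():.3f} ± {scores.std():.3f}")
    ```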

    • Brudaks@alien.topB · 1 year ago

      Cross-validation is a reasonable alternative; however, with k folds it multiplies your training compute by k (typically 5-10×), or, under a fixed budget, means you train 5-10 smaller models that are each worse than the single model you could have built with all the compute. A sketch of that cost follows below.
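
      To make the cost concrete, here is a sketch (synthetic data and hypothetical model choices, for illustration only): a 5-fold run trains five copies of each model just to obtain per-fold scores. If you do pay that cost, the fold scores can feed a paired test, with the caveat that such tests are known to be optimistic because the folds’ training sets overlap.

      ```python
      from scipy.stats import ttest_rel
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import KFold, cross_val_score

      X, y = make_classification(n_samples=2000, random_state=0)  # placeholder data
      cv = KFold(n_splits=5, shuffle=True, random_state=0)  # 5 training runs per model

      scores_a = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
      scores_b = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)

      # Paired t-test over the same folds; optimistic because the folds share
      # training data, so treat the p-value as a rough signal only.
      t_stat, p_val = ttest_rel(scores_b, scores_a)
      print(f"mean gap = {(scores_b - scores_a).mean():.3f}, p = {p_val:.3f}")
      ```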