• 0 Posts
  • 20 Comments
Joined 11 months ago
Cake day: October 26th, 2023

  • I’m interested to see how model-based RL could work for reasoning.

    Instead of training a model to predict data and then fine-tuning it with RL to be a chatbot, you use RL as the primary training objective and train the data model as a side effect. This lets your pretraining objective be the actual objective you care about, so your reward function could punish issues like hallucination or prompt injection.

    I haven’t seen any papers using model-based RL for language modeling yet, but it’s starting to work well in more traditional RL domains like game-playing (DreamerV3, TD-MPC2).
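
    To make "RL as the main loop, model learned as a side effect" a bit more concrete, here is a minimal tabular Dyna-Q sketch. This is the classic textbook form of model-based RL, not what DreamerV3 or TD-MPC2 do, and it has nothing to do with language modeling. The toy chain environment (`step`) and all hyperparameters are made up for illustration.

```python
import random

def step(state, action, n_states=5):
    """Hypothetical toy chain: action 1 moves right, action 0 moves left.
    Reaching the last state gives reward 1 and ends the episode."""
    nxt = min(n_states - 1, state + 1) if action == 1 else max(0, state - 1)
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done

def dyna_q(episodes=50, planning_steps=10, alpha=0.5, gamma=0.9, eps=0.1,
           n_states=5, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in (0, 1)}
    model = {}  # learned world model: (state, action) -> (next_state, reward, done)

    def update(s, a, r, s2, done):
        # Standard one-step Q-learning update.
        target = r if done else r + gamma * max(q[(s2, 0)], q[(s2, 1)])
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action, breaking ties at random.
            if rng.random() < eps or q[(s, 0)] == q[(s, 1)]:
                a = rng.choice((0, 1))
            else:
                a = 0 if q[(s, 0)] > q[(s, 1)] else 1
            s2, r, done = step(s, a, n_states)
            update(s, a, r, s2, done)        # learn from real experience
            model[(s, a)] = (s2, r, done)    # the model is fit as a side effect
            for _ in range(planning_steps):  # plan with replayed model transitions
                ps, pa = rng.choice(list(model))
                ps2, pr, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q

q = dyna_q()
# Greedy action in each non-terminal state (1 means "move right").
policy = [0 if q[(s, 0)] > q[(s, 1)] else 1 for s in range(4)]
```

    The reward drives everything here; the model only exists because the agent records what it saw while chasing that reward, and it then gets reused for extra planning updates.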



  • This seems pretty sketchy. Lots of angry words, but few details.

    Most of this has nothing to do with sexual abuse, but is rather family drama over their dad’s will. She says that Sam and his lawyer were able to delay or withhold money she was supposed to inherit, but doesn’t really provide details. There’s not enough information here to judge the accuracy of her claims.

    The sexual abuse allegedly happened when she was 4 and he was 13, but she didn’t remember it until some kind of flashback in 2020.

    "Technological abuse - {I experienced} Shadowbanning across all platforms except onlyfans and pornhub."

    Sam is certainly well-connected within the tech industry, but I’m doubtful that he could get that many platforms to ban her. Also, her posts seem to be up and visible right now.


  • One key difference is that they are trained not with end-to-end optimization but with a hand-crafted learning rule. This rule has strong inductive biases that work well for small datasets with pre-extracted features, like tabular data.
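
    As a rough illustration of what a hand-crafted rule looks like compared to gradient descent, here is a sketch of a single two-action Tsetlin automaton, the basic building block. A full Tsetlin Machine combines many of these into clauses with its own feedback scheme, which this does not attempt to show. The `train` helper and the reward probabilities are made up for the example.

```python
import random

class TsetlinAutomaton:
    """Two-action Tsetlin automaton with 2 * n states.

    States 1..n select action 0, states n+1..2n select action 1.
    """

    def __init__(self, n):
        self.n = n
        # Start right at the decision boundary, on the action-0 side.
        self.state = n

    def action(self):
        return 0 if self.state <= self.n else 1

    def reward(self):
        # Reinforce the current action: move away from the boundary.
        if self.action() == 0:
            self.state = max(1, self.state - 1)
        else:
            self.state = min(2 * self.n, self.state + 1)

    def penalize(self):
        # Weaken the current action: move toward (and possibly across) the boundary.
        if self.action() == 0:
            self.state += 1
        else:
            self.state -= 1


def train(automaton, reward_prob, steps, rng):
    """reward_prob[a] is the (hypothetical) chance that action a is rewarded."""
    for _ in range(steps):
        a = automaton.action()
        if rng.random() < reward_prob[a]:
            automaton.reward()
        else:
            automaton.penalize()
    return automaton.action()


rng = random.Random(0)
best = train(TsetlinAutomaton(5), [0.2, 0.8], 1000, rng)
```

    There is no loss function or gradient anywhere: learning is just a counter stepping up or down in response to reward and penalty signals.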

    Their big disadvantage (and this applies to logical/symbolic approaches in general) is that they don’t work well with raw data, even on easy datasets like CIFAR10. The world is too messy for perfect logical rules; neural networks are able to capture this complexity, but simpler models struggle to.

    "statistical"

    Note that learning is a fundamentally statistical process, so Tsetlin Machines are also statistics based.