Sorting-Free GPU Kernels for LLM Sampling (flashinfer.ai) · morrowind@lemm.ee to LocalLLaMA@sh.itjust.works · English · 2 days ago · 0 comments · 4 upvotes, 0 downvotes
morrowind@lemm.ee (OP) on "Reka Flash, open source 21B model comparable to QWQ 32B" in LocalLLaMA@sh.itjust.works · English · 2 days ago (edited) · 2 upvotes
More info here: https://www.reka.ai/news/introducing-reka-flash
HF: https://huggingface.co/RekaAI/reka-flash-3
Reka Flash, open source 21B model comparable to QWQ 32B (image, i.postimg.cc) · morrowind@lemm.ee to LocalLLaMA@sh.itjust.works · English · 2 days ago · 2 comments · 16 upvotes, 0 downvotes
morrowind@lemm.ee on "Qwen/QwQ-32B · Hugging Face" in LocalLLaMA@sh.itjust.works · English · 7 days ago · 3 upvotes
It matches R1 in the given benchmarks. R1 has 671B params (36B activated), while this has only 32B.
morrowind@lemm.ee on "Qwen/QwQ-32B · Hugging Face" in LocalLLaMA@sh.itjust.works · English · 8 days ago · 2 upvotes
insane, absolutely insane
Chain of Draft: Thinking Faster by Writing Less (arxiv.org) · morrowind@lemm.ee to LocalLLaMA@sh.itjust.works · English · 10 days ago · 0 comments · 9 upvotes, 1 downvote
Atom of Thoughts (AOT): lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1 (bsky.app) · morrowind@lemm.ee to LocalLLaMA@sh.itjust.works · English · 10 days ago · 0 comments · 14 upvotes, 2 downvotes
morrowind@lemm.ee on "Alibaba Releases Advanced Open Video Model, Immediately Becomes AI Porn Machine" in Technology@lemmy.world · English · 13 days ago · 6 upvotes
Good luck trying to run a video model locally unless you have top-tier hardware.