Comment on
Alibaba Releases Advanced Open Video Model, Immediately Becomes AI Porn Machine
Reply in thread
good luck trying to run a video model locally
Unless you have top tier hardware
Comment on
Sentence transformers v4
Reply in thread
I want to clarify something. Reranker is a general term that can refer to any model used for reranking. It is independent of implementation.
What you refer to
because reranker models look at the two pieces of content simultaneously and can be fine tuned to the domain in question. They shouldn't be used for the initial retrieval because the evaluation time is O(n²) as each combination of input
is a specific implementation known as CrossEncoder, which is common for reranking models but not for retrieval ones, for the reasons you described. But you can also use any other architecture
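To make the "independent of implementation" point concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and for illustration only: `rerank`, `token_overlap_score`, and the sample data are made-up names, and the toy scorer is just a stand-in for a real pairwise model. A CrossEncoder-style model would be one possible scorer, not the only one.

```python
from typing import Callable, List, Optional, Sequence

# Assumption for this sketch: a "reranker" is anything that can score
# a (query, document) pair. The architecture behind the score is irrelevant.
Scorer = Callable[[str, str], float]


def token_overlap_score(query: str, doc: str) -> float:
    """Toy pairwise scorer (stand-in for a real model): Jaccard token overlap."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


def rerank(
    query: str,
    candidates: Sequence[str],
    scorer: Scorer,
    top_k: Optional[int] = None,
) -> List[str]:
    """Reorder an already-retrieved candidate list by descending score.

    The scorer is called once per candidate, which is why this step is
    usually applied to a short list from a cheaper first-stage retriever.
    """
    scored = [(scorer(query, doc), doc) for doc in candidates]
    # Sort on the score only; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [doc for _, doc in scored]
    return ranked if top_k is None else ranked[:top_k]


query = "how do cross encoders score a document"
candidates = [
    "bananas are yellow",
    "a bi-encoder embeds the query and document separately",
    "cross encoders score a query and document together",
]

# Same reranking step, two unrelated scoring implementations.
by_overlap = rerank(query, candidates, token_overlap_score)
by_brevity = rerank(query, candidates, lambda q, d: -float(len(d)))
```

Swapping the scorer changes the ordering but not the reranking step itself, which is the sense in which "reranker" names a role rather than an architecture.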
Comment on
lemm.ee is shutting down at the end of this month
Reply in thread
We have sub mods here too. The difference is the admins
Comment on
Some updates on community changes and future goals (03-28-2025)
Thumbnail looks a little odd when small. You may want to go for a more digital llama aesthetic
Comment on
StarVector - a foundation model for generating svgs
Reply in thread
Claude frequently draws svgs to illustrate things for me (I'm guessing it's in the prompt), but even though it's better at it than all the other models, it still kinda sucks. It's just a fundamentally dumb task for a purely language model, similar to the arc-agi benchmark. It just makes more sense for a vision model, and trying to get an llm to do it is a waste
Comment on
Qwen/QwQ-32B · Hugging Face
Reply in thread
It matches R1 in the given benchmarks. R1 has 671B params (37B activated) while this only has 32B
Comment on
StarVector - a foundation model for generating svgs
Reply in thread
autotracers can't generate svgs from text
Comment on
Reka Flash, open source 21B model comparable to QWQ 32B
Comment on
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Very similar to chain of draft but seems more thorough
Comment on
Zoomers & Boomers are the same
Reply in thread
On god
Comment on
Societal rules
Reply in thread
Alright, I'm waiting on the youtube playlist
Comment on
Qwen/QwQ-32B · Hugging Face
insane, absolutely insane
Comment on
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
Reply in thread
Technically it supports fewer languages than Whisper, 40 vs 99
The main problem isn't "bother", it's training data. You need hundreds of thousands of hours of high quality transcripts to train models like these, and that just doesn't exist for like Zulu or whatever
Comment on
Trump administration reportedly considers a US DeepSeek ban | TechCrunch
Such dumbasses. Even if this was a good strategy, they're still banning one company and letting others (arguably more dangerous ones) go scot-free
Comment on
EXAONE Deep ━ Setting a New Standard for Reasoning AI - LG AI Research News
Reply in thread
what is the license? The link on hf just 404s