Is it just me or is that title worded as confusingly as possible?
I used to think like that, but now I’m on the fence since I’ve started working much more closely with packaging. Calling it “Linux” is actually kind of harmful for adoption. Devs who claim their software works on Linux mislead people into thinking it works on any Linux distro, which is rarely true. Most of the time, those devs only test on Ubuntu and no other distro.
Maybe when Snaps finally die out and Flatpak emerges as the one true standard for desktop apps, then that problem will go away once and for all. Until then, I think we should normalize distinguishing Ubuntu, Fedora, Arch, etc. as separate “operating systems” instead of “distros”, which is an unnecessary and misleading term anyway.
I’m seeing people say that the broadcaster (Fox Sports, of course) injected cheers into the broadcast for Trump, and boos for Taylor Swift. I don’t want to spread misinfo though, so does anyone know if it’s true, or if there’s a way to validate it? (E.g. by analyzing the audio)
Any Fedora Atomic desktop, or any UBlue one.
Immutable distros are the future of the Linux desktop. They work and they never break (I dare you to try).
96 GB+ of RAM is relatively easy, but for LLM inference you want VRAM. You can achieve that on a consumer PC by using multiple GPUs, although performance will not be as good as having a single GPU with 96 GB of VRAM. Swapping out to RAM during inference slows it down a lot.
On architectures with unified memory (like Apple’s latest machines), the CPU and GPU share memory, so you could actually find a system with very high memory directly accessible to the GPU. Mac Pros can be configured with up to 192 GB of memory, although I doubt it’d be worth it as the GPU probably isn’t powerful enough.
Also, the 83 GB number I gave was with a hypothetical 1-bit quantization of Deepseek R1, which (if it’s even possible) would probably be really shitty, maybe even shittier than Llama 7B.
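If you want to sanity-check the multi-GPU idea, here’s a rough sketch of the weights-only arithmetic. The 24 GB per card is just an illustrative consumer GPU, and this ignores KV cache and activation overhead, so treat it as a lower bound:

```python
import math

def model_size_gb(params_billions: float, bits_per_param: float) -> float:
    """Weights-only footprint in GB: params * bits / 8."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

def gpus_needed(model_gb: float, vram_per_gpu_gb: float = 24) -> int:
    """Minimum number of cards just to hold the weights, assuming an ideal split."""
    return math.ceil(model_gb / vram_per_gpu_gb)

r1_1bit = model_size_gb(671, 1)  # ~83.9 GB, the "83 GB" figure above
print(f"Deepseek R1 @ 1-bit: {r1_1bit:.1f} GB -> {gpus_needed(r1_1bit)} x 24 GB cards")
```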
> but how can one enter TB zone?
Data centers use NVLink to connect multiple Nvidia GPUs. Idk what the limits are, but you use it to combine multiple GPUs to pool resources much more efficiently and at a much larger scale than would be possible on consumer hardware. A single Nvidia H200 GPU has 141 GB of VRAM, so you could link them up to build some monster data centers.
Nvidia also sells prebuilt machines like the HGX B200, which can have 1.4 TB of memory in a single system. That’s less than the 2.6 TB for unquantized Deepseek, but for inference-only applications, you could definitely quantize it enough to fit within that limit with little to no quality loss… so if you’re really interested and really rich, you could probably buy one of those for your home lab.
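Back-of-the-envelope for that claim, using the 1.4 TB and 671 billion parameter numbers above (weights only; a real deployment also needs room for KV cache and activations):

```python
params = 671e9          # Deepseek R1 parameter count
memory_bytes = 1.4e12   # HGX B200 system memory from the comment above

# Max bits per parameter if the whole budget went to weights
print(f"{memory_bytes * 8 / params:.1f} bits/param")  # ~16.7, so even fp16 just fits

# Weights-only footprint at common quantization levels
for bits in (16, 8, 4):
    print(f"{bits}-bit: {params * bits / 8 / 1e12:.2f} TB")  # 1.34, 0.67, 0.34 TB
```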
Man, Trump really flooded your zone, huh? Seriously, stop following the news for like a day or two. I don’t even understand what point you’re trying to make here, but the passive-aggressive seething is palpable.
Most of what DOGE is doing is probably illegal, most of Trump’s executive orders are going to get blocked by courts, and many have already been rescinded. The point is to demoralize people so they don’t bother voting or fighting back. I’m getting the impression that the strategy worked on you.
Stop it. Play video games, draw pictures, watch porn. Whatever you do, disconnect from the fire-hose of outrage coming from the White House. It’s all bait, and you’re gobbling it up.
I get what you’re saying, but gatekeeping bullying tactics is weird ngl
Relevant username? Go touch grass, bro. Doomscrolling and posting on Lemmy won’t make you a happy person.
I remember my 6th grade civics teacher (a black man) was angry and depressed when Obama won, saying that republicans will never win another election again.
If all you care about is response times, you can easily do that by just using a smaller model. The quality of responses will be poor though, and it’s not feasible to self-host a model like ChatGPT on consumer hardware.
For some quick math, a small Llama model is 7 billion parameters. Unquantized, that’s 4 bytes per parameter (32-bit floats), meaning it requires 28 billion bytes (28 GB) of memory. You can get that to fit in less memory with quantization, basically trading quality for lower memory usage (use fewer than 32 bits per param, reducing both precision and memory usage).
Inference performance will still vary a lot depending on your hardware, even if you manage to fit it all in VRAM. A 5090 will be faster than an iPhone, obviously.
… But with a model competitive with ChatGPT, like Deepseek R1, we’re talking about 671 billion parameters. Even if you quantize down to a useless 1 bit per param, that’d be over 83 GB of memory just to fit the model in memory (unquantized it’s ~2.6 TB). Running inference over that many parameters would require serious compute too, much more than a 5090 could handle. This gets into specialized high-end architectures to achieve that performance, and it’s not something a typical prosumer would be able to build (or afford).
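If you want to redo that math yourself, here it is as a tiny script. The parameter counts are the ones quoted above; the bit-widths are just common quantization levels plus the hypothetical 1-bit case, and this is weights only (no KV cache or runtime overhead):

```python
# Weights-only footprint: parameters * bits-per-parameter / 8.
MODELS = {"Llama 7B": 7e9, "Deepseek R1": 671e9}
BIT_WIDTHS = (32, 16, 8, 4, 1)  # fp32 down to a (hypothetical) 1-bit quant

for name, params in MODELS.items():
    for bits in BIT_WIDTHS:
        gb = params * bits / 8 / 1e9
        print(f"{name:>12} @ {bits:>2}-bit: {gb:>8.1f} GB")

# Llama 7B    @ 32-bit ->    28 GB  (the unquantized figure above)
# Deepseek R1 @ 32-bit -> ~2684 GB  (~2.6 TB)
# Deepseek R1 @  1-bit ->   ~84 GB  (the "over 83 GB" figure)
```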
So the TL;DR is no.
And on top of it all, he would just get a presidential pardon
I think this comment encapsulates the problem well: laymen who are not involved in the process in any way (on either side) acting like armchair experts and passing harsh judgement. You’re making some very unfair assumptions based on age, and nothing about the actual technical arguments.
This is why people like Martin feel justified going on social media to publicly complain, because they know they’ll get a bunch of yes-men with no credible arguments to mindlessly harass the developers they disagree with. It’s childish and unproductive, and while I’ve personally respected Martin as a developer for a long time, I don’t believe he’s mature enough to be involved in the Rust for Linux effort (tbf, he’s not the only Rust dev with this attitude). If the project fails, it will be because of this behavior, not because of the “old guys” being stubborn.
Two things can be true at once:
Open source work is collaborative. No matter how good an engineer someone is, if they can’t figure out how to work with others, then it’s better to kick them out. A potentially insecure kernel is better than a non-existent one.
This doesn’t account for blinking.
If your friend blinks, they won’t see the light, and thus won’t be able to verify whether the method works or not.
But how does he know when to open his eyes? He can’t keep them open forever. Say you flash the light once, and that’s his signal to keep his eyes open. Okay, but how long do you wait before starting the experiment? If you do it immediately, he may not have enough time to react. If you wait too long, his eyes will dry out and he’ll blink.
This is just not going to work. There are too many dependent variables.