Spyke
Technology · by JOMusic

Proton's very biased article on Deepseek

Article: https://proton.me/blog/deepseek

Calls it "Deepsneak", failing to make it clear that the reason people love Deepseek is that you can download it and run it securely on any of your own private devices or servers - unlike most of the competing SOTA AIs.

I can't speak for Proton, but the last couple weeks are showing some very clear biases coming out.

lemmy.blahaj.zone

I hate AI but on the other hand I love how Deepseek is causing AI companies to lose billions.

145
Roguereply
feddit.uk

The desperate PR campaign against deepseek is also very entertaining.

76
sh.itjust.works

We're playing with it at work and I honestly don't understand the hype. It's super verbose and would take longer for me to read the output than do the research myself. And it's still often wrong.

It's cool I guess, and I'm still looking for a good use case, but it's still a ways from taking over the world.

12
Roguereply
feddit.uk

The same is also true of ChatGPT. On the surface the results are incredibly believable but when you dig into it or try to use some of the generated code it's nonsense.

15

I certainly think it's cool, but the further you stray from the beaten path, the more newly janky it gets. I'm sure there's a good workflow here, it'll just take some time to find it.

5
Aulireply

Oh, it works, and works well. Sure, it might need some tweaking or prompt changing, but it is decent at code gen.

1
Aulireply

I've found AI quicker at getting information. Search on the net is garbage: you find old articles that are no longer relevant, or have to sift through pages of unrelated shit till you find what you want.

2
pawb.social

DeepSeek is open source, but is it safe?

These guys are in the open source business themselves, they should know the answer to this question.

66
AstralPathreply
lemmy.ca

Has anyone actually analyzed the source code thoroughly yet? I've seen a ton of reporting on its open source nature but nothing about the detailed nature of the source.

FOSS only = safe if the code has been audited in depth.

31
Fubarberryreply
sopuli.xyz

I haven't looked into Deepseek specifically so I could be mistaken, but a lot of times when a model is called "open-source" it really is just open weights. You can download it or train other models off of it, but you can't actually view any kind of source code on how the model works.

An audit isn't really possible.

43
L_Acaciareply
lemmy.ml

It is open-weight; we don't have access to the training code or the dataset.

That being said, it should be safe for your computer to run Deepseek's models, since the weights are .safetensors files, which should block any code execution from code injected into the model's weights.
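To make the .safetensors point above concrete, here is a minimal sketch of that file layout as publicly documented: an 8-byte little-endian header length, a JSON header describing each tensor, then raw bytes. It is not the official safetensors library, and the helper names (build_safetensors_bytes, read_header) are made up for illustration. The point is that loading is just JSON parsing plus byte slicing, unlike pickle-based checkpoints, which can run arbitrary code when loaded.

```python
import json
import struct


def build_safetensors_bytes(tensors):
    """Serialize {name: (dtype, shape, raw_bytes)} in the safetensors layout.

    Hypothetical helper for illustration; real files come from the
    safetensors library, not from this function.
    """
    header = {}
    payload = b""
    for name, (dtype, shape, raw) in tensors.items():
        start = len(payload)
        payload += raw
        # Offsets are relative to the start of the data section.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [start, start + len(raw)],
        }
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte unsigned little-endian length, then the JSON header, then data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob):
    """Parse the header with plain JSON; nothing in the file is executed."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len].decode("utf-8"))
    return header, blob[8 + header_len:]


# A toy "model" holding one tensor of two float32 values.
raw = struct.pack("<2f", 1.0, 2.0)
blob = build_safetensors_bytes({"w": ("F32", [2], raw)})

header, data = read_header(blob)
start, end = header["w"]["data_offsets"]
values = struct.unpack("<2f", data[start:end])
print(header)
print(values)
```

This only says something about the weight file. Whatever program loads and runs the weights is ordinary software and still has to be trusted separately.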

11

It's been noted that the apps by the company do send each and every keystroke back to China, though.

Who's to say how poisoned the data in reality is.

0
AstralPathreply
lemmy.ca

Then by default it should never be considered safe. Honestly, this "open" release... it makes me wonder about ulterior motives.

0
rumbareply
lemmy.zip

That's not quite it either.

The model itself is just a giant ball of math. They made a thing that can transform an English question through the collected knowledge of much of humanity a few dozen times and have it crap out a reasonable English answer.

The open source part is kind of a misnomer. They explained how they cooked the meal but not the ingredient list.

To complete the analogy, their astounding claim is that they managed to cook the meal with less fire than anyone else has by a factor of like 1000.

But the model itself is inherently safe. It's not like it's a binary that can carry a virus or do crazy crap. Even convincing it to give planned nefarious answers is frankly beyond our capabilities so far.

The dangerous part that Proton is looking at, and honestly a given for any hosted AI, is the hosting server side of things. You make your requests to their servers and then their servers put the requests into the model and return you the output.

If you ask their web servers for information about Tiananmen Square they will block you.

You can, however, download the model yourself and run it yourself, and there aren't any security issues there.

It will tell you anything that you need to know about Tiananmen Square.

20
semreply
lemmy.blahaj.zone

What are the minimum system requirements to run something like deepseek on your own computer in some kind of firewall container?

2
utopiahreply
lemmy.world

There are plenty of ways and they are all safe. Don't think of DeepSeek as anything more than an (extremely large, like bigger than a AAA) videogame. It does take resources, e.g. disk space, RAM and GPU VRAM (if you have some), but you can use "just" the weights, and thus the executable might come from another project, an open-source one that will not "phone home" (assuming that's your worry).

I detail this kind of thing and more in https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence but to be more pragmatic I'd recommend ollama, which supports https://ollama.com/library/deepseek-r1

So, assuming you have a relatively entry-level computer, you can install ollama, then run "ollama run deepseek-r1:1.5b" and try it.
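If you'd rather script it than type into the terminal, a minimal sketch could look like the following. It assumes ollama is already running locally on its default port (11434) and that the deepseek-r1:1.5b model has been pulled; the function names are my own, and only the Python standard library is used, so the request never leaves your machine.

```python
import json
import urllib.request

# Assumed default address of a locally running ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="deepseek-r1:1.5b"):
    """Build (but do not send) a non-streaming generate request."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="deepseek-r1:1.5b"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, print(ask("Explain what a distilled model is in two sentences.")) should print the model's answer. If nothing is listening on that port, the call fails with a connection error rather than falling back to any remote service.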

6

FWIW I did just try deepseek-r1:1.5b (the smallest model available via ollama today) and ... not bad at all for 1.1 GB!

It's still AI BS generating slop without "thinking" at all ... but from the few tests I ran, it might be one of the "least worst" smaller models I've tried.

3

Seems reasonable to think part of the motivation is disrupting American tech like OpenAI.

1

A few of my friends who are a lot more knowledgeable about LLMs than myself are having a good look over the next week or so. It'll take some time, but I'm sure they will post their results when they are done (pretty busy times unfortunately).

I'll do my best to remember to come back here with a link or something when I have more info 😊

That said, hopefully someone else is also taking a look and we can get a few different perspectives.

1

They very much do not believe that open source means safe or private. They have tons of articles talking about the hurdles they have gone through to try and ensure they are, and where and when they have failed to do so.

5
tabularreply
lemmy.world

If I obfuscate my code such that it's very difficult to understand then in practice it's like proprietary software, even with an open source license.

Correct me if I'm wrong but looking at the code isn't enough to understand what a neural network will do (if these "AI" are using that, maybe they're not).

4

Deepseek's R1 was built entirely on a multi-stage reinforcement learning process, and they pretty much open sourced that entire pipeline. By contrast, OpenAI has been giving us nothing but "look what we did" since GPT-3, and we're supposed to trust them.

17
lemmy.world

Unsurprising that a right-wing Trump supporting company is now attacking a tech that poses an existential threat to the fascist-leaning tech companies that are all in on AI.

54

Proton has always been sketchy - and I caught flak for it countless times, especially here. But: a company claiming they are "private" and "secure" because they operate under Swiss privacy laws is already sketchy from the beginning. Why? Because Swiss privacy laws suck, are the worst in Europe, and Switzerland is a country known for multiple cases of major intelligence agency overreach - especially towards foreigners and cross-border traffic.

Legally the Swiss intelligence services can order any "service provider" (that includes proton) to provide them access to traffic coming from foreign countries - this also includes the mandate to provide "technical means", which is often seen as backdoors. And to make things better the service providers are not allowed to talk about it.

This alone is a problem. In Proton's case, what makes matters even worse is the fact that they are a US company de facto operating from the US and are therefore bound by the Homeland Security Act and similar legislation.

So in the end both the Swiss and US services might read your data.

5
Roguereply
feddit.uk

For clarity the company did not explicitly support Trump. They simply stated negative things about the "corporate dems" and praised the new republican party.

-28
firadinreply
lemmy.world

Ah my mistake, they didn't praise the fascist - just the fascist party. Big difference.

46
Roguereply
feddit.uk

Exactly it's totally different.

And they never specifically praised the vice president; they simply made some fucked up association that his attendance at an event meant he was on side, contrary to pretty much every other indication that has ever been given.

-29
Roguereply
feddit.uk

You might want to direct that elsewhere

-20
semreply
lemmy.blahaj.zone

You might not want to post apologia for a company defending a fascist party once, then doubling down, then trying to take it all back saying "it was a mistake to get political"

15

You might not want to post apologia for a company defending a fascist party once, then doubling down, then trying to take it all back saying “it was a mistake to get political”

At no point did I state "it was a mistake to get political" that is a narrative entirely from your own imagination.

  1. I made a sarcastic response to the opening comment. People didn't notice the sarcasm. No worries my sense of humour isn't overly obvious and I refuse to litter \s marks everywhere so I'm not too bothered if my comments are misinterpreted at times.

  2. the opening commenter responds sarcastically.

  3. I respond with another comment that's absolutely dripping with sarcasm and even explicitly call out Proton's bullshit. Somehow people still don't note the sarcasm, and yet they understood that firadin's comment was sarcastic. Odd, but again I'm not too bothered.

  4. Somebody implies I haven't understood a joke.

  5. I try to delicately suggest I've been misunderstood. Again, I'm not too bothered.

  6. Your response. Absolutely absurd.

At no point did I even defend the Nazis, at no point did I say or imply what you're quoting me as saying.

The most ridiculous thing is you accuse me of "apologia" on the same day I repeatedly call out the inappropriateness of Proton's stance because I got tired of reading so much "apologia":

The solace I do take from this is that at least people are aware of the insanity of the hill Proton have decided to die on.

-3
lemmy.world

They explicitly said the Republicans were on the side of the little guy. I probably don't need to explain the awful shit that they're doing that showcases that that is not what they're doing.

Saying they're "fighting for the little guys" while at the same time shitting on their political opponent is a clear show of support.

Now I don't particularly care about the Proton CEO's opinions. My opinion of CEOs is that they're dickheads until proven otherwise. But when you publicly support this shit, and use your company's official accounts to back yourself up, it becomes a lot more egregious in my mind. And even worse when they pretend they're not actually doing that.

12

But his 'support' of the Republicans was saying that 10 years ago they used to be against big tech and that he hoped Trump would carry that forward. Obviously Trump is very unlikely to do this, but he is literally just hoping the Republicans would do something about big tech that the Dems didn't do.

1

They didn't really praise them. They just hoped that the republicans would go back to being against big tech (like they used to be 10 years ago he claims). Obviously, Trump's not going to do that but I think we can all agree big tech is a big problem

1
lemmy.zip

I don’t think they are that biased. They say in the article that ai models from all the leading companies are not private and shouldn’t be trusted with your data. The article is focusing on Deepseek given that’s the new big thing. Of course, since it’s controlled by China that makes data privacy even less of a thing that can be trusted.

Should we trust Deepseek? No. Should we trust OpenAI? No. Should we trust anything that is not developed by an open community? No.

I don’t think Proton is biased; they are explaining the risks with Deepseek specifically and mention how other AIs aren’t much better. The article is not titled “Deepseek vs OpenAI” or anything like that. I don’t get why people bag on Proton when they are the biggest privacy-focused player that could (almost) replace Google for most people!

31
sh.itjust.works

Exactly.

Also, none of the article applies if you run the model yourself, since the main risk is whatever the host does with your data. The model itself has no logic.

I would never use a hosted AI service, but I would probably use a self hosted one. We are trying a few models out at work and we're hosting it ourselves.

4
lemmy.zip

True, hosting deepseek yourself is much better. I'd still wait and see if anyone finds weird stuff in the code itself but tbh idk how long that could take.

Can't wait for the models to get better and hopefully stay open source!

1

weird stuff in the code

What code? We use a different runner for the model so we can run multiple different AI models, so the only thing we're getting from DeepSeek is the model.

2

I just meant generally, not sure what 'open source' means in the context of ai (not a programmer)

1

A quote from the article:

DeepSeek is open source, meaning you can modify code on your own app to create an independent — and more secure — version. This has led some to hope that a more privacy-friendly version of DeepSeek could be developed.

This is just plain wrong. The model doesn’t contain the privacy unfriendly logic and can be used freely and unmodified. In fact, there are plenty of other platforms available right now where you can use it that are not Chinese.

This article makes fair points, if you ignore the fact that they don’t know what they’re talking about. You need to fix the errors in your head while reading it for it to make sense. If you don’t have the knowledge to do that, the whole article is a bit misleading.

1
lemmy.world

DeepSeek is open source, meaning you can modify code[...] on your own app to create an independent — and more secure — version. However, using DeepSeek in its current form — as it exists today, hosted in China — comes with serious risks for anyone concerned about their most sensitive, private information.

They are not wrong here.

Having read the article fully, it doesn't seem that partial, and it also acknowledges the failings of others. It is not as stupid as the CEO's stance on "Republicans helping the little guys", for sure.

25
Dewayreply
lemmy.world

But that's also true for American companies.

As a European, I trust them as much as I trust Chinese ones.

15
lemmy.world

Oh yeah, but they are definitely not ignoring that in the article. It's just that they are talking mostly about the subject of the article, which is Deepseek.

2

The problem with any kind of journalism like this, though, is that it talks about the topic of the article, sure, but it doesn't acknowledge the other relevant parties that raise most of the same concerns.

It's a matter of framing for the situation. I'd rather read something that talks about both sides instead of just one.

1
tempestreply
lemmy.ca

He's been kissing the ring on social media like the others IIRC

8

Well, actually it seems quite fair-ish 🤷

AI has the potential to be a truly revolutionary development, one that could drive advancement for centuries. But it must be done correctly. These companies stand to make billions of dollars in revenue, and yet they violated our privacy and are training their tools using our data without our permission. Recent history shows we must act now if we’re to avoid an even worse version of surveillance capitalism.

Also from 2023 : https://proton.me/blog/ai-gdpr

14
lemmy.world

You could write this exact article about openai too

18
cley_fayereply
lemmy.world

The thing is, some people like Proton. Or liked, if this keeps going. When you build a business on trust and you start flailing like a headless chicken, people get wary.

6
Evotechreply
lemmy.world

A blog post telling people to be wary of a Chinese app running an LLM people know very little about is flailing?

6
Kbobabobreply
lemmy.world

Can't it be run standalone without network?

They also published the weights so we know more about it than some of the others

6
Evotechreply
lemmy.world

This focuses mostly on the app though, which is #1 on the app stores atm

We know it's censored to comply with Chinese authorities, just not how much. It's probably trained on some fairly heavy propaganda.

6

Sure, it might, but the thing is it may still acknowledge that there are different opinions on some topics. It does reflect how, whilst governments may have a narrative, people can say what they think. In China, that's a different story...

1
lemmy.ca

As someone living in the west I prefer propaganda that isn't trying to bring down the place where I live.

1
heavydustreply
sh.itjust.works

When the CEO praises Trump, says China bad because China while hiding that occidental AIs have the same kind of censorship, that’s hypocrisy.

1

hiding that occidental AIs have the same kind of censorship

This is the second sentence in the article:

AI chat apps like ChatGPT collect user data, filter responses, and make content moderation decisions that are not always transparent.

The entire rest of the article is about how they actually do not have the same kind of censorship. You should try reading the article before commenting on it.

But DeepSeek...does all that and more.

2
Kbobabobreply
lemmy.world

I see this everywhere. They published the weights. That doesn't make it open source.

7

The article goes into great detail about how it's different from OpenAI so, no.

1

failing to make it clear that the reason people love Deepseek is that you can download it and run it securely on any of your own private devices or servers

That's not why. Almost no one is going to do that. That's why they didn't mention it.

11
lemmy.world

Except you can't run it.

Every model you are downloading and running is simply just a checkpoint of llama....

Quit spreading that misinformation.

You, and the grand majority of everyone else, don't have anywhere near the hardware to run the actual full deepseek model.

10

I run one of the smaller models on an M1 Max and it's working pretty good. Much better than I would have thought. Some guys on YouTube manage to get the 600b parameter models to run on sub-5k hardware. It's a total game changer. In a couple of years it will probably run locally on phones.

5
lemmy.world

Of course it's biased. One company writing about another company is always biased. Imagine mods of one community collectively writing a post about another community, would the fact alone not be enough? Or admins of one instance about another.

It was common sense when I as a kid went online, writing all manners of awfully stupid things memories of which still haunt me today.

You'd be friendly and respectful with all people around you on the same forums and chats. But never ever would you believe them when they tell you what to think about something.

We live in a strange time when instead of applying this simple rule people are looking for mechanisms like karma or fact-checking or even market share to allow themselves to uncritically believe some stuff.

9
JOMusicreply
lemmy.ml

This is true. However, Proton's big sell is that they can be trusted to be truthful about what is safe and what is not safe for your privacy.

I think given the context of the CEO's personal bias towards current US Republicans, and given that those Republicans are aggressively anti-China, when Proton releases an article warning of a successful Chinese AI, and seemingly purposefully leaves out the part about how people are already running it securely, it starts raising some important questions about their alignment.

8
lemmy.world

Proton’s big sell is that they can be trusted to be truthful about what is safe and what is not safe for your privacy.

Which somebody who can be trusted wouldn't ever do.

Businesses sell goods, services, deals, not truth.

And privacy is not about trust.

6

Exactly. If a company can be trusted to provide privacy respecting products, they'll come with receipts to prove it. Likewise, if they claim something else respects or doesn't respect privacy, I likewise expect receipts.

They did a pretty good job here, but the article only seems to apply to the publicly accessible service. If you download it and run it through your runner of choice, you're good. A privacy minded individual would probably already not trust new hosted services.

2

Just because you can (pretty easily) self host it doesn’t mean that the privacy concerns aren’t valid.

8

Lemmy user's very biased link to an article that isn't nearly as biased as they are purposefully making it out to be.

Maybe this community needs stricter posting guidelines to avoid this sort of drivel?

7
szmer.info

They are absolutely right! Most people don't give a fuck about hosting their own AI, they just download "Deepsneak" and chat... and it is unfortunately even worse than "ClosedAI", cuz they are based in China. That's why I hope DuckDuckGo will host Deepseek on their servers (as it is very lightweight in resources, yes?), then we will all benefit from it.

7
lemmusreply
szmer.info

Yeah, the same goes for global warming: "if I burn these tires nothing happens, like it's not any warmer here", and then everyone does that and everyone loses.

1
lemmusreply
szmer.info

Oh so you are more like "If I kill a man and run away to Russia, that means Russia is the good guy here, because I won't face any consequences". I think this topic is pretty undefined here; many people may have different opinions on whether a company should cooperate with the government. But the thing is Deepseek has to cooperate, they have no option, and Deepseek is on the enemy side for us, the West. That's why giving them data is like giving them money. Data is money: do you want China to get bigger, or your country? If you host it locally, yeah, it is far better than any ClosedAI, but people don't do that, therefore you should be against using the Deepseek app and website if you care about the interest of your country.

0

Anyone promoting LLMs without a big side of skepticism is exposing their bias.

5
rumbareply
lemmy.zip

Yeah, the article makes mostly legit points: if you're contacting the chatbot in China, it is harvesting your data. Just like if you contact OpenAI or Copilot or Claude or Gemini, they're all collecting all of your data.

I do find it somewhat strange that they only talk about the models hosted by DeepSeek.

It's absolutely trivial to download the models and run them locally yourself, and then you're not giving any data back to them. I would think that Proton would be all over that for a privacy scenario.

9
lemmy.world

It might be trivial to a tech-savvy audience, but considering how popular ChatGPT itself is and considering DeepSeek's ranking on the Play and iOS App Stores, I'd honestly guess most people are using DeepSeek's servers. Plus, you'd be surprised how many people naturally trust the service more after hearing that the company open sourced the models. Accordingly I don't think it's unreasonable for Proton to focus on the service rather than the local models here.

I'd also note that people who want the highest quality responses aren't using a local model, as anything you can run locally is a distilled version that is significantly smaller (at a small, but non-trivial overall performance cost).

1
rumbareply
lemmy.zip

You should try the comparison between the larger models and the distilled models yourself before you make judgment. I suspect you're going to be surprised by the output.

All of the models are basically generating possible outcomes based on noise. So if you ask the same model the same question five different times in five different sessions, you're going to get five different variations on an answer.

You will find that an x out of five score between models is not that significantly different.

For certain cases larger models are advantageous. If you need a model to return a substantial amount of content to you. If you're asking it to write you a chapter story. Larger models will definitely give you better output and better variation.

But if you're asking it to help you with a piece of code or explain some historical event to you, the average 14B model that will fit on any computer with a video card will give you a perfectly serviceable answer.
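The "five sessions, five answers" point comes from random sampling at the output. As a toy sketch (made-up tokens and scores, not DeepSeek's actual decoder), the model produces a score per candidate token, the scores are turned into probabilities, and one token is drawn at random. A different random state gives a different draw, and a lower temperature makes the top choice more dominant.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Made-up candidates and scores, purely for illustration.
tokens = ["yes", "no", "maybe"]
logits = [2.0, 1.0, 0.5]

# Each seed stands in for a separate "session".
for seed in range(5):
    rng = random.Random(seed)
    draws = [sample_token(tokens, logits, 1.0, rng) for _ in range(4)]
    print(f"session {seed}: {draws}")
```

Because every answer is one draw out of many possible ones, comparing two models on a single response says little. Scoring several runs per question, as suggested above, is the fairer comparison.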

1

I have tried them, and to be honest I was not surprised. The hosted service was better at longer code snippets and in particular, I found that it was consistently better at producing valid chain of thought reasoning chains (I've found that a lot of simpler models, including the distills, tend to produce shallow reasoning chains, even when they get the answer to a question right).

I'm aware of how these models work; I work in this field and have been developing a benchmark for reasoning capabilities in LLMs. The distills are certainly still technically impressive and it's nice that they exist, but the gap between them and the hosted version is unfortunately nontrivial.

1
JOMusicreply
lemmy.ml

Given that you can download Deepseek, customize it, and run it offline in your own secure environment, it is actually almost irrelevant how people feel about China. None of that data goes back to them.

That's why I find all the "it comes from China, therefore it is a trap" rhetoric to be so annoying, and frankly dangerous for international relations.

Compare this to OpenAI, where your only option is to use the US-hosted version, where it is under the jurisdiction of a president who has no care for privacy protection.

7

TBF you almost certainly can't run R1 itself. The model is way too big and compute intensive for a typical system. You can only run the distilled versions which are definitely a bit worse in performance.

Lots of people (if not most people) are using the service hosted by Deepseek themselves, as evidenced by the ranking of Deepseek on both the iOS app store and the Google Play store.

5
lemmy.world

Tutamail is a great email provider that takes security very seriously. Switched a few days ago and I'm very happy.

3
asudoxreply
lemmy.asudox.dev

Yet not great from a privacy perspective. They don't even allow third party email apps.

1
febrareply
lemmy.world

That's because your inbox is completely encrypted. As far as I know, no client provides support for that.

1

Posteo supports PGP encryption with a PGP key you have when an email comes into your inbox, which then can be decrypted by your client. So it is doable.

1
lemmy.world

It's not active running code that can affect a system in any meaningful way. It's a model. It's like a complex series of partitioned data that is loaded and sorted through. Nothing more. It's been open sourced and pored over, and it's just a model.

17
semreply
lemmy.blahaj.zone

Is the chatbot interface that uses the model open source? If you self-host will it try to send data home?

-1
semreply
lemmy.blahaj.zone

That's cool, I hope someone writes an article about how it works

-1
semreply
lemmy.blahaj.zone

No I mean for someone to read the source and explain what they found or didn't find

1

That will take a few weeks most likely.

That said, there's no way to verify what happens once the data leaves your machine, and the client isn't that interesting. I certainly won't trust any ai hosted by a third party because of that reason.

2