I follow the lesser evil approach. I'm further away socially, culturally, and politically from the Chinese so all my tech is Chinese. Them having my data is less dangerous to me than just letting the redneck reich get it.
Like, they've been able to do this for 25 odd years.
There were gov data centers with thousands of petabytes when I was in college. Prism had the gov archiving every phone call and all internet traffic back in '08...
This is not news.
As soon as quantum computing gets there, they'll start decrypting the Signal and Matrix chats you had yesterday.
The Stasi government would never use my data against me. I've got nothing to hide. Hey, I think I'll go buy this doodad for no reason at all, no subliminal advertising.
There are dedicated tools, called social listening tools, that do just this. Some examples are Brandwatch, Sprinklr, Talkwalker and Meltwater.
It's not just Palantir; anyone and everyone is using these tools. Everyone from consumer companies to investment banks to your favourite content creator is using some kind of social listening to stay on top of trends and understand how you behave.
Reddit and Twitter are two social platforms that provide the most data openly and freely, and in reddit's case you can get a lot of historical data without extra cost.
Your favourite candle brand, your favourite outdoor clothing company and your favourite protein shake company are part of your favourite reddit community, listening to what you're talking about. They know that if you like energy drinks, you might like heavily scented bath soaps too.
Deleted posts show up on these platforms quite often but when you click on them to go to reddit or twitter you'll get a not found or deleted page.
PS: I've been working in social listening for the last 8 years.
I feel like there's a big difference between snatching people off the street and making ad targeting smarter. Yeah they both suck, but orders of magnitude here.
I'm not American so I don't know the tools ICE uses and if social listening or whatever it is called is one of those tools.
But if your reasoning for opening the box and the reasoning for opening the person is the same.. well it's the same. Different outcomes, same reason
I agree, but at the same time, in a different thread whose topic is specifically shaming a company for working on the White House, people are foaming at the mouth that someone is working for Trump and downvoting anyone who disagrees. The double standard is just mind-blowing.
I've been working mostly on the analysis side of things, currently I do this for a consumer goods company.
And yes I do it to make a living. Got hired as an analyst in this field out of MBA and stayed in this. Also being chronically online and familiar with social networks helped.
I wish lmao. But there are tons of data companies selling all kinds of data on people to whoever will pay them for it.
Me working on identifying trends, or on what flavour of green tea is popular, is one of the less harmful use cases.
What we need is regulation to rein in big tech. These API licenses are expensive, and I suspect that is a big reason why Reddit ended free data sharing (which led to the death of third-party reddit clients).
'Social Listening' is uh, one way to brand 'corporate surveillance panopticon', I guess hahah!
Oh god I'm so glad I am an ex-corpo, the stupid fucking lingo and buzzwords alone should be enough to make most people realize they are in a cult, but I guess not.
I can tell you from the perspective of someone working in consumer insights for a consumer company, sales and retail data is far more important and influences decisions way more.
So I'd say put your money where your mouth is. If you don't like certain companies or brands stop buying their products. If you think groceries are getting expensive buy cheaper alternatives or private labels. You can do this with a lot of non-food grocery items. Private label is almost always cheaper and you get pretty much the same shit as global brands.
I see a lot of conversations of people mentioning how expensive things have gotten, yet we see in sales data that our expensive products sell more (this is also due to the fact that more ad dollars are spent on higher margin products) but try and buy cheaper alternatives whenever you can.
As a software engineer I was a little shocked when I learned our company treats “Delete” buttons as a means to toggle Archived = 1 in the DB. Nothing is actually deleted. Sure we will anonymise the data after a certain time to be GDPR compliant but it would be trivial I guess to actually link that back to people.
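For anyone who hasn't seen this pattern, here is a minimal sketch of what a "Delete" button that only flips a flag looks like. The table, the column names and the sample row are all made up for illustration (only the `archived = 1` idea comes from the comment above), and it uses an in-memory SQLite database rather than whatever that company actually runs.

```python
import sqlite3


def make_db():
    """Create an in-memory table with an `archived` flag (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments ("
        "id INTEGER PRIMARY KEY, author TEXT, body TEXT, "
        "archived INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "INSERT INTO comments (author, body) VALUES (?, ?)",
        ("alice", "something I regret posting"),
    )
    conn.commit()
    return conn


def soft_delete(conn, comment_id):
    """What the 'Delete' button does in this pattern: flip a flag, keep the row."""
    conn.execute("UPDATE comments SET archived = 1 WHERE id = ?", (comment_id,))
    conn.commit()


def hard_delete(conn, comment_id):
    """What users usually assume 'Delete' does: remove the row itself."""
    conn.execute("DELETE FROM comments WHERE id = ?", (comment_id,))
    conn.commit()


def visible_comments(conn):
    """What the application shows to users."""
    return conn.execute(
        "SELECT id, body FROM comments WHERE archived = 0"
    ).fetchall()


def stored_rows(conn):
    """What anyone with database access can still read."""
    return conn.execute(
        "SELECT id, author, body, archived FROM comments"
    ).fetchall()
```

After `soft_delete(conn, 1)` the comment disappears from `visible_comments(conn)` but is still returned, author and all, by `stored_rows(conn)`. Only `hard_delete` removes the row from the table.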
The GDPR applies to data pertaining to an identifiable person. Anonymised data is more or less equivalent to deleted data as far as the regulation is concerned. Source: I was a DPO for 5 years.
Oh, I see. Indeed anonymised data should be fine under GDPR. However it is often very difficult to anonymise data. Some things are easy to anonymise, others are very complex.
For a small company who does not mainly work with data, the easiest solution to comply with GDPR is indeed just deleting the data altogether.
Yes, there is a concept of "pseudonymous" data in some of the guidance, which refers to anonymous data which, when taken together, could identify the person - even if some of that data is not held by the data controller. Under those circumstances seemingly anonymous data can fall under the regulation, although most companies are very unlikely to consider such nuance in their data policies.
There's no independent audit for GDPR compliance, so the only way to know would be if someone whistleblows. There are also so many loopholes that allow them to keep the data, like "to prevent further abuse" or "some legal reason".
So if reddit bans your account they can keep all data and you can't do anything about it even with GDPR.
The requirement exists unless the company is under legal obligation to retain something. I had one case where I requested a GDPR data dump followed by a full deletion, and apparently whoever executed the request deleted first and then processed the dump, so I was able to see that what they did was change my email address from [email protected] to username#[email protected] - meaning that login attempts, password resets etc. would clearly fail, and a further attempt to request my data revolving around my email address would be unsuccessful, but ultimately all my data was still accessible somewhere. Whether they'd then proceed to delete it after the retention period, who knows. I intended to follow up but forgot...
Not quite, deletion from a hard drive also unflags the space the data was located at as being in use, so it will be overwritten eventually so long as the drive continues to have things written to it. Simply flagging something as being archived means that information will remain on the server indefinitely, the exact opposite of what is intended by a delete button.
My current workplace doesn't have the foresight to do that. Delete fully deletes immediately and without confirmation. Oh, and the backups have been broken for years.
On the upside, recent changes in leadership and on the team made it so we finally have the political will and talent in the right places to actually put effort into fixing backups but they have a lot of technical debt to sift through in fixing the last folks' mistakes and oversights
Reminds me of when everyone was deleting their posts around the API blackout and suddenly the next day it was like Reddit did a restore from checkpoint and all of the edited and deleted posts came right back. I for one had to run the script that replaced then deleted my posts twice, but that's beside the point.
Did they actually come back or did a bunch of comments not show up in your profile because the subs were private so they didn't get deleted, then they reappeared when the subs reopened? I thought some of mine were restored too, but it was a combination of that and the tool I used only hitting the most recent 1000.
Not a bad guess, and it’s always possible there may have been a handful for which that was true, but no. The bulk of them were from subs that have been public for years and continue to be. I’d been at zero comments for pert near 2 years? A bit more. I would check every now and again after they all came back early on. Then suddenly I noticed a few thousand were back, all from 2011-2014 or thereabouts.
Pretty sure they also have access to banned Reddit accounts whose users can no longer access their history to know what they will be judged and profiled for, too.
Just assume every social network either allows this directly or enables a third party to do it, Lemmy especially.
Literally nothing points to Palantir so far according to that post, so your title is just misinformation...
Sources claim that [...] tools, POSSIBLY including Palantir OR SIMILAR [...] MAY have been used ...
Everyone from the sources to the poster are just guessing and speculating.
I don't doubt that it's real, but at this point it's just made-up nonsense to claim it like this.
Agreed, this would be a really weird way for a state threat actor to reveal an otherwise pretty sensitive capability. I assume there's something on the account which clearly links him to it and someone he knows recently remembered they know his old account name.
Also, remember that this is cherry-picking the worst possible way of phrasing the worst possible excerpts they can pick out from over ten years of random shitposting. You have no idea what the context was for any of this stuff. You don't know if the part he "agreed" with was the same as the part of the whole other person's message that said white people were racist and stupid. And so on.
For as long as Reddit has been available there have been places that would scrape it and mirror deleted/edited comments. I don't remember what any of them are, but people would use them all the time to figure out what someone said before they were banned.
It was forensics. They used forensics. AI did not help, probably got a bit in the way even. You can do these things with data. We told you several times.
For some time now, I have written stuff on the internet under the assumption that one day, my identity will be publicly tied to everything I wrote. Surely in the future it will be easy to give an example of my writing to an AI bot, perhaps combined with some facts about my life, and the bot would be able to find anonymous posts that were likely written by me, across the internet.
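To make that less abstract: matching writing style doesn't even need an AI bot. Below is a rough sketch of one classic approach, comparing character trigram frequencies with cosine similarity. This is only an illustration under simple assumptions (tiny samples, no tuning, function names invented here), not a claim about what any real tool does; real authorship attribution uses far more features and far more text.

```python
import math
from collections import Counter


def trigram_profile(text):
    """Count overlapping 3-character sequences in lowercased, whitespace-normalised text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine_similarity(a, b):
    """Cosine similarity between two trigram Counters (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(known_sample, candidates):
    """Rank anonymous texts by similarity to a known writing sample.

    `candidates` maps a label (e.g. an account name) to that account's text.
    Returns (score, label) pairs, most similar first.
    """
    known = trigram_profile(known_sample)
    scored = [
        (cosine_similarity(known, trigram_profile(text)), label)
        for label, text in candidates.items()
    ]
    return sorted(scored, reverse=True)
```

You would call `rank_candidates(known_text, {"account_a": text_a, "account_b": text_b})` and look at the top few scores. With a decade of someone's posts on each side, even something this crude can narrow the field, which is the commenter's point.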
There was a browser extension back in the day that ran junk searches to poison whatever data they were harvesting on you. Sounds like it needs to make a comeback.
There's a browser addon called Meta Random Search which sends your queries randomly through google, bing, ddg, yahoo and other search engines so that nobody has a full history. Paired with a user agent switcher (personally using Chameleon on Firefox) with a high frequency of change (1 min or so) and disabled browser telemetry it might throw them off already even without poisoning results. Especially since every query consumes natural resources I'm not really a fan of this approach.
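For the curious, the basic idea of spreading queries across engines fits in a few lines. This is a sketch of the concept only, not how that addon is actually implemented (I haven't read its source); the engine list and the function name are my own choices, and it only builds the URL rather than sending any request.

```python
import random
from urllib.parse import quote_plus

# Public search URL patterns; `{}` is replaced by the URL-encoded query.
ENGINES = {
    "google": "https://www.google.com/search?q={}",
    "bing": "https://www.bing.com/search?q={}",
    "duckduckgo": "https://duckduckgo.com/?q={}",
    "yahoo": "https://search.yahoo.com/search?p={}",
}


def pick_search_url(query, rng=random):
    """Pick one engine at random and return (engine_name, search_url).

    `rng` can be a seeded `random.Random` instance to make the choice repeatable.
    """
    name = rng.choice(sorted(ENGINES))
    return name, ENGINES[name].format(quote_plus(query))
```

Each engine then only sees a random slice of your queries. It does nothing about IP address, cookies or browser fingerprint, which is why the comment pairs it with a user agent switcher.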
Lying that they have the Reddit comments that they showed everyone?
Anyway, Reddit is one of the most scraped sites on the internet. Be sure that everyone who wants an archive of Reddit has an archive of Reddit. Anything you put on the internet is forever.
I don't exactly mean to say this is nothingburger, but it is one of the more boring bits of living in a dystopia and is too low on the list of evil shit Palantir is doing to care that much about.
This should come as no surprise. We all knew they were storing everything away in data centers for decades now. Once AI hit the scene it was over. I knew right at that moment they would use it to parse all that information. That's why I stopped using social media in a way that could be used to identify my more violent and anarchistic takes. I'm sure I could still be found out, but I'll not be making it easy for the bastards. Fuck Republicans for fear mongering all of this for my entire life and now cheering for it to be used against their perceived enemies, traitorous cocksuckers.
Yup, data archival. Now imagine this future: right now, encrypted data transfers may be wire-tapped and stored. When quantum computers are available, all that traffic will be decryptable. This includes pretty much all general HTTPS traffic since TLS mostly uses ECDHE for key exchange which isn't quantum secure.
I bet nation state actors are recording everything they can.
Damn, dude... that's insane and I'm surprised it's never occurred to me.
I've had the realization before as I realize that maybe my password database will eventually be easily cracked... but there's no reason it cannot apply to data in transit as well, as long as someone is recording it.
This was pretty evident the second they named their company after the tool used by The Dark Lord Sauron and Saruman to spy on the actions of others from Lord of the Rings.
You don't need AI for this, just a ton of money for storage and either tolerance for a slow query (like 15-20 minutes) or an engineer who knows what they're doing in search.
Anything ever posted online can be archived and searched later. It doesn't take a rocket scientist to cross reference publicly available sources along with subpoenaed data.
Yeah but that's basically how deleting works for any normal system. You remove the pointer telling the computer where the data is, then you flag that section of data as free for writing to. It's not until something writes over the data that it is truly gone.
It's true, except in a filesystem you lose the name and location and it can be overwritten. In this instance it sounded like they just prevent access but keep all that data there, still accessible and readable, and it won't be overwritten.
The obvious strong link is reddit+email. Someone could have got into his personal, probably old mailbox, where original registration letters (with r/handle) and notifications still are. I find it more probable, but since government is under MAGA, they could've used some way to ask Huffman if some account matches the mail address.
This is why I resisted attaching emails to reddit accounts for years. Recently subs have started soft banning users without registered emails though, so I started just using throwaways.
At one point it was possible to download every reddit comment ever. I think it was around 10 years ago I had a copy of that. I can't remember if it was from reddit directly or from some third party with scrapers. I recall the dataset being free... but it might have been free for me because I had an academic justification? Really don't recall.
Anyways, point being that you're delusional if you think anything you post online ever goes away. Secondly, you can be much less than Palantir and have "deleted reddit account comments". Anyone can get them.
The scary dystopian part is the ability to work out that the account belonged to someone who hadn't used it for a decade rather than just that they could see what had been posted. The Internet Archive doesn't let you ask it what someone's Digg username was.
Not that it's unreasonable, but that the scale of what AI can surveil is so vast that there's no more personal security-via-obscurity.
It used to be that unless someone had a reason to start looking at you, anything you did online or off was effectively impossible to search. You might be caught on some store's CCTV, or your cell provider might have location pings, but that wasn't online for anyone and needed a warrant to have the police use it to track your activities. Now cities are using Flock and similar tools to enable tracking vehicles across the country without any reason, and stores are using cloud-service AI cameras to attempt to track your mood as you move through the store. These tools can be and have been abused.
Now, due to the harvesting of this data for AI, anything that's ever been recorded (video footage, social media posts, etc) and used as training data can be correlated much more easily, long after it occurred, and without needing to be law enforcement with a warrant.
No, that's not what I said. Widespread data collection and searching used to be something only state actors could accomplish and there were at least theoretically guard rails. Now the barrier of entry has been seriously reduced, the data is owned by a corporation, and being fed to AI. That has a chilling effect as well as being ripe for abuse.
Widespread data collection and searching used to be something only state actors could accomplish and there were at least theoretically guard rails
So you just make shit up as you go? You are projecting how you think things should work into reality as if it were fact. But now you are learning how it actually works and what really scares you is the shattering of the illusion you sold yourself. I mean it should be pretty apparent Google and Facebook are tools of the US government they always have been.
"Anyone could already do this, so why bother being worried that it's easier now" they said.
I still don't get your angle. Why are you defending this, or at the very least downplaying its impacts? You seem to also be aggravated by this data collection and spying, so why are you so mad that other people are catching on?
"Oh, I'm so smart" they said. Enjoy your useless internet points?
The situation is actually different now, in the last few years. This is less relevant to the OP, but we/they are building automated snitches that will tattle on you, and more importantly be wrong with a statistical significance. See Flock mistaking license plates and calling the police on innocent people. Sure, we might catch a few violent criminals, but when your government decides that your online activity complaining about them is now criminal, your data can be correlated in real time in a way that wasn't possible in our parents' time.
Your dismissal of this seems insane. Stop arguing with me about how long it's been possible and help me/us fight against it.
So you acknowledge that the data exists, what you are scared of is being able to search it?
Usually that's the insurmountable mountain. Data collection is easy. Formatting, storing and querying the data so you can actually get useful information out of it in a time efficient manner is the extremely hard part.
For a real-world example, the organization I work at audits all of its field offices quarterly to make sure they are in compliance, checking required document retention, gear, etc. When an audit finds a requirement that is out of compliance, the office is given a task with a deadline to bring it back into compliance, and these tasks have visibility all the way up the chain of command, to where even the C-levels review them regularly. I've recently been working on a project to flag repeated failures of the same audit requirement at the same location, and it's highlighting that some field offices are not actually coming into compliance once these high-visibility tasks are completed. When I presented it to leadership, it was a revelation just how many field offices are continuously out of compliance.
Point is, this data is being actively collected and formatted for easy access, and there are still glaring issues being missed because of how hard it is to find these trends buried in the hundreds of pages of data generated each quarter per field office.
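The kind of check described above, flagging the same requirement failing at the same office across quarters, can be sketched roughly like this. The record layout (quarter, office, requirement) and all names are assumptions for illustration; nothing here reflects the actual system or data at that organization.

```python
from collections import defaultdict


def repeated_failures(findings, min_quarters=2):
    """Group failed audit findings by (office, requirement).

    `findings` is an iterable of (quarter, office, requirement) tuples, one per
    failed check. Returns only the pairs that failed in at least `min_quarters`
    distinct quarters, mapped to the sorted list of those quarters.
    """
    quarters_failed = defaultdict(set)
    for quarter, office, requirement in findings:
        quarters_failed[(office, requirement)].add(quarter)
    return {
        key: sorted(quarters)
        for key, quarters in quarters_failed.items()
        if len(quarters) >= min_quarters
    }
```

The query itself is trivial once the findings are in one consistent shape. The hard part the commenter is describing is getting hundreds of pages per office per quarter into that shape, and knowing to ask the question at all.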
Uhhh I think you completely missed what I was saying. I was explaining that collecting data is easy, but actually making use of that data is really hard, and gave a real-world example of a trend that should be obvious being buried in a mountain of data because there's simply too much data to sift through.
I gave you an out, and you entirely missed the point. No one is talking about arbitrary amounts of arbitrary data; the concerns are about things we say online being used against us in the real world.
Imagine thinking your privacy is above profit where capitalism rules lmao
You muricas are too dumb to understand that nothing has more power than money in a profit-driven society like yours, like the ones your stupid elite forced down every country's throat they could.
Your "market freedom" is actually a dictatorship of money; the USA is not a free country.
Also this is being orchestrated by the DSCC showing that the Democrats still have power and deep state ties, they just use them against left/anti-zionists.
EDIT: Are the downvoters refusing to see that all this came out after Janet Mills jumped in the race and immediately got endorsed by Schumer? It is obviously oppo research by her supporters.
They're the remnants of last year's propaganda campaign. No one bothered to deprogram them. They tend to fire off their lines at unpredictable times like a toddler's toy with a low battery.
I don’t live in Maine anymore, but his handling of this, and clear stance on other issues important to me, have actually strengthened my willingness to vote for him were I still in the state.
I actually believe him when he says he got it while drunk on a night off with the marines, supposedly not knowing the meaning at the time. He then went on to say he immediately scheduled to cover it because getting it removed would have taken too much time to figure out because Maine doesn’t have any places that do it. Like he wanted that shit no longer visible on his body asap and went out of his way to get it done sooner than later.
Similarly with LGBTQ+ rights. Yeah he said some edgy shit on the Internet a long time ago, but he’s said he’s changed and now aggressively supports queer rights in Maine. Idk, maybe he’ll pull a Fetterman, but I don’t get that vibe.
Even if he is still in the process of fully deconstructing things, he is clearly taking the correct actions in the here and now to further that process.
I grew up with people like him and almost without fail, when I actually sat down and had real conversations with them as adults, they’ve been positive and minds have been changed in all directions.
Blue collar Mainers are some of the first people to hate billionaires, and fiercely support small government, personal freedoms and privacy. This honestly means supporting queer rights in so far as they want the freedom to be themselves too. They have been systematically lied to by a party that doesn’t actually want small government or personal liberties, and many of them have realized that.
We need to be able to welcome these people willing to be educated, and genuinely capable of changing their thinking and their ways. These people are closeted radical leftists. We will need them on board.
He seems to walk the walk as well as talk the talk.
People apparently don't know about the NSA Utah Datacenter.
https://en.wikipedia.org/wiki/Utah_Data_Center
Been a thing for over a decade, unimaginable total storage size, and they literally archive everything.
This place had between 3 and 12 exabytes of storage capacity, in 2013.
1 exabyte is 1 billion gigabytes.
How big was your pc/laptop hard drive in 2013?
Maybe... 250 gigs to 2 teras, something like that?
This data center could now easily be in the yottabyte range (millions of exabytes), maybe even ronnabytes (billions of exabytes).
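If the prefixes blur together, the arithmetic is easy to check. A small sketch using decimal (SI) units; the helper names are made up here, and the only figures taken from the comments above are the 3 exabyte lower bound and the 2 TB consumer drive guess.

```python
# Decimal (SI) byte units as powers of ten.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
    "ronnabyte": 10**27,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount from one SI byte unit to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


def drives_needed(capacity_exabytes, drive_terabytes):
    """How many drives of a given size add up to a capacity in exabytes."""
    return convert(capacity_exabytes, "exabyte", "terabyte") / drive_terabytes
```

`convert(1, "exabyte", "gigabyte")` gives one billion, matching the figure above, and `drives_needed(3, 2)` works out to 1.5 million of those 2 TB drives for the low-end 2013 estimate alone.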
https://www.rankred.com/largest-data-centers-in-the-world/
6th largest data center in the world by physical size, and it is the only one on this list explicitly designated for 'national security'.
The NSA has taps on every single major trunk line going in or out of the US, they coordinate with every major US-based ISP, every major software provider, data center operator.
They have so much archived data that their actual problem is figuring out how to search through it efficiently... and that is a big thing that Palantir does, that was kinda their whole initial... thing, as a company.
I came here to make this comment less cogently. You have it exactly.
Now, does it violate US law and multiple Executive Orders to search the database to get dirt on US Citizens and use it against their election campaign? Yes. Yes it does. But this administration thinks laws are for sissies.
And this was always the problem of building the panopticon: everyone justified doing it by saying 'well, it's fine so long as the good guys are in charge', and 'we have to stop the terrorists, 9/11 Never Again'.
This is why the panopticon system is destroyed by Lucius Fox after using it to find the Joker in the Dark Knight.
The system itself is too dangerous to be allowed to exist in a world of flawed humans, and it will eventually be wielded by those least morally qualified to wield it.
Fuck, this is also basically analogous to the Lord of the Rings... Frodo is the hero for destroying the One Ring, not wielding it, because it literally corrupts you with its literally evil power.
God damnit.
This "too dangerous to exist" argument is seemingly more true for nuclear technology, but the world recognized the threat and came together to manage it.
I will grant you that database and ability to search it lends itself easily to popular oppression, but it still requires thinking, breathing humans to do the oppressing.
Most technology is not dangerous without psychopaths in power, and damn near everything is dangerous with psychopaths in power.
No wonder that guy didn't understand One Piece
I'm still a bit skeptical Palantir would expose this capability over a Senate race which hasn't even gotten through primaries. I haven't looked into it that much, but I think it's far more likely there's something on the account which makes it easy to identify, and someone this dude knows figured it out before he deleted that account.
Opposition research isn't really illegal, there's no confirmation that anything from Palantir was actually used, and it's trivially easy for any layperson to view deleted reddit comments-- to be perfectly honest in this specific case I just don't think there was anything really untoward
When people were up in arms about China getting data from TikTok, I wondered if they had any idea of what the NSA does.
When that was going on, the whole time I was saying that if we ban TikTok for data security reasons, we should ban Facebook and Instagram for exactly the same reason, and yes, we should ban basically all social media at this point, it's all a perfect spying machine, one you get addicted to, beyond hiding in plain sight...
Of course, that's extremely unlikely to happen... but it is an actually consistent position.
We probably should. But the reason for the ban is that they don't want foreign governments swaying the American public; only the US government is allowed to do that.
The sale to a right wing Trumpbuddy proves what I have been saying. The position of the US government is quite consistent: "WE get all of your data."
Yes I too can use web.archive.org
Sure, but it's also very likely that reddit is still retaining all posts even "deleted" ones in their database. I can go look at the profiles of people who haven't used their accounts in 12 years. I can use Arctic Shift to view posts and comments that users have deleted themselves... even from deleted accounts! All the data is still there. That's why a few years ago when people were deleting accounts it was widely suggested to edit every comment into gibberish before deletion, so the final edit in the database would be worthless. I remember when there were extension tools to do it like NukeReddit that changed everything to gibberish and then deleted it for you all automatically. Those tools had stopped working by the time the exodus due to the API changes happened.
Anyway, I wouldn't put it past that fucking pile of shit Steve Huffman to just be passing it off to Palantir because he's such a little bitch.
That's not how it works. I'll throw myself under the bus for you. Look up "reddit.com/u/1831942" on the web archive. (This is NOT a plug) You can see random archives (non-personal) of my profile. It doesn't matter if you deleted it. It sucks that the info is floating out there. Also, fuck reddit.
That's my old account that I deleted a bunch of content on. My point is that it's archived. When I said "non-personal," I meant no one personally archived my posts. They were just web crawled. I was posting/ commenting often on a popular social media platform, so it was archived. I hope that makes sense.
I mean it's still the internet and we always knew it's impossible to really delete something from the Internet. There could be someone somewhere who archives all of reddit with all the edits, just because they can.
I don't think people realize that our user information is automatically documented on a not-so regular basis. (By the web archive crawler) I've been curious about Lemmy, but I haven't seen us on there much, lol. I don't think we're important enough, THANKFULLY.
I follow the lesser evil approach. I'm further away socially, culturally, and politically from the Chinese so all my tech is Chinese. Them having my data is less dangerous to me than just letting the redneck reich get it.
Like, they've been able to do this for 25 odd years.
There were gov data centers with thousands of petabytes when I was in college. Prism had the gov archiving every phone call and all internet traffic back in '08...
This is not news.
As soon as the quantum cryptography tech gets there, they'll start decrypting the Signal and Matrix chats you had yesterday.
Privacy is illusory and temporary.
The stasi government would never use my data against me. I've got nothing to hide. Hey I think I'll go buy this doadd for no reason at all no subliminal advertising
The US government is temporary. Not our privacy.
Depends on what the people organize to fight to protect.
Unfortunately, I don't have confidence in the American people to fight for the correct option.
It's like none of you have heard of Edward Snowden.
It hasn't been on tiktok lately, so, correct, they have not.
That traitorous cunt who is the reason russia got so much influence in the US 2016 elections? Never heard of him.
There are dedicated tools, called Social Listening tools, that do just this. Some examples are Brandwatch, Sprinklr, Talkwalker and Meltwater.
Not just Palantir, anyone and everyone is using these tools. Everyone from consumer companies to investment banks to your favourite content creator is using some kind of social listening to stay on top of trends and understand how you behave.
Reddit and Twitter are two social platforms that provide the most data openly and freely, and in reddit's case you can get a lot of historical data without extra cost.
Your favourite candle brand, your favourite outdoor clothing company and your favourite protein shake company are part of your favourite reddit community, listening to what you're talking about. They know if you like energy drinks, you might like heavily scented bath soaps too.
Deleted posts show up on these platforms quite often but when you click on them to go to reddit or twitter you'll get a not found or deleted page.
PS: I've been working in social listening for the last 8 years.
Why are you doing evil work..? Like... Why develop these tools that will so obviously be used to worsen our lives?
Gotta eat. Blame the system that creates this incentive not some individual wageslave
Yes, but you can choose how to earn your living. It is unlikely they cannot do any other job.
I think coming out of an MBA, this is one of the less harmful jobs. I'd rather not be in consulting or investment banking or sales or big tech.
Like ICE
I feel like there's a big difference between snatching people off the street and making ad targeting smarter. Yeah they both suck, but orders of magnitude here.
The word you're looking for is "spying".
when these types of tools are used to decide whom to snatch I hope that is cold comfort
Yes of course, but the excuse is the same
I can use a knife to open a box or open a person. Are these things the same because I used the same tool?
I'm not American so I don't know the tools ICE uses and if social listening or whatever it is called is one of those tools. But if your reasoning for opening the box and the reasoning for opening the person is the same.. well it's the same. Different outcomes, same reason
I agree but at the same time, in a different thread whose topic specifically is shaming a company for working on the white house, people are foaming from their mouths that someone is working for Trump and downvote anyone who disagrees. The double standard is just mind blowing.
I've been working mostly on the analysis side of things, currently I do this for a consumer goods company.
And yes I do it to make a living. Got hired as an analyst in this field out of MBA and stayed in this. Also being chronically online and familiar with social networks helped.
Could you stop enabling the police state, please?
I wish lmao. But there are tons of data companies selling all kinds of data on people to whoever will pay them for it. Me working for identifying trends or what flavour of green tea is popular are less harmful use cases.
What we need is regulation to rein in big tech. These API licenses are expensive and I suspect they are a big reason why Reddit ended free data sharing (which led to the death of third party reddit clients).
'Social Listening' is uh, one way to brand 'corporate surveillance panopticon', I guess hahah!
Oh god I'm so glad I am an ex-corpo, the stupid fucking lingo and buzzwords alone should be enough to make most people realize they are in a cult, but I guess not.
go on, tell us more
There isn't much to tell. We gather conversations and analyse it to understand consumer behaviour, trends, campaign performance etc.
If you don't want your data showing up in social listening tools, make your social accounts as private as possible.
I can tell you from the perspective of someone working in consumer insights for a consumer company, sales and retail data is far more important and influences decisions way more.
So I'd say put your money where your mouth is. If you don't like certain companies or brands stop buying their products. If you think groceries are getting expensive buy cheaper alternatives or private labels. You can do this with a lot of non-food grocery items. Private label is almost always cheaper and you get pretty much the same shit as global brands.
I see a lot of conversations of people mentioning how expensive things have gotten, yet we see in sales data that our expensive products sell more (this is also due to the fact that more ad dollars are spent on higher margin products) but try and buy cheaper alternatives whenever you can.
As a software engineer I was a little shocked when I learned our company treats “Delete” buttons as a means to toggle Archived = 1 in the DB. Nothing is actually deleted. Sure we will anonymise the data after a certain time to be GDPR compliant but it would be trivial I guess to actually link that back to people.
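For anyone who hasn't seen this pattern, here is a minimal sketch of what a "Delete" button that only flips a flag might look like. The table, column and function names are made up for illustration and are not anyone's real schema; it uses Python's built-in `sqlite3` with an in-memory database.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical posts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, "
        "author TEXT, "
        "body TEXT, "
        "archived INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_post(conn, author, body):
    """Insert a post and return its row id."""
    cur = conn.execute(
        "INSERT INTO posts (author, body) VALUES (?, ?)", (author, body)
    )
    return cur.lastrowid


def soft_delete(conn, post_id):
    """What the 'Delete' button does here: set a flag, remove nothing."""
    conn.execute("UPDATE posts SET archived = 1 WHERE id = ?", (post_id,))


def visible_posts(conn):
    """What the user-facing site shows: only rows that are not archived."""
    return conn.execute(
        "SELECT id, body FROM posts WHERE archived = 0 ORDER BY id"
    ).fetchall()


def all_rows(conn):
    """What anyone with database access can still read."""
    return conn.execute(
        "SELECT id, body, archived FROM posts ORDER BY id"
    ).fetchall()
```

After `soft_delete`, the post disappears from `visible_posts` but is still returned in full by `all_rows`, which is the whole point of the complaint above.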
I'm pretty sure GDPR requires websites to abide to user requests to delete their data. You may wish to review that with your company.
The GDPR applies to data pertaining to an identifiable person. Anonymised data is more or less equivalent to deleted data as far as the regulation is concerned. Source: I was a DPO for 5 years.
Oh, I see. Indeed anonymised data should be fine under GDPR. However it is often very difficult to anonymise data. Some things are easy to anonymise, others are very complex.
For a small company who does not mainly work with data, the easiest solution to comply with GDPR is indeed just deleting the data altogether.
Yes, there's a concept of "pseudonymous" data in some of the guidance, which refers to anonymous data which, when taken together, could identify the person - even if some of that data is not held by the data controller. Under those circumstances seemingly anonymous data can fall under the regulation, although most companies are very unlikely to consider such nuance in their data policies.
The org I used to work for had to develop a special process to delete user data upon request, it was not an easy process in Dynamics 365
if you want something deleted you best destroy the hard disk yourself lol
There's no independent audit for GDPR compliance so the only way to know would be if someone whistleblows. There are also so many loopholes that allow companies to keep the data, like "to prevent further abuse" or "some legal reason".
So if reddit bans your account they can keep all data and you can't do anything about it even with GDPR.
Don't GDPR deletion requests only require deleting personal data, and not public posts?
Are you advising breaking the law just because nobody checks?
I'm saying corporations break the law if nobody checks - why wouldn't they?
That happens. Still, many companies do not. Some companies are unaware of the legislation.
I was informing one worker of a company of one such law.
Many companies do not break the law even though there are no controls just because that is the right thing to do.
The requirement exists unless the company is under legal obligation to retain something. I had one case where I requested a GDPR data dump followed by a full deletion, and apparently whoever executed the request deleted first and then processed the dump, so I was able to see that what they did was change my email address from [email protected] to username#[email protected] - meaning that login attempts, password resets etc. would clearly fail, and a further attempt to request my data revolving around my email address would be unsuccessful, but ultimately all my data was still accessible somewhere. Whether they'd then proceed to delete it after the retention period, who knows. I intended to follow up but forgot...
That's basically how deleting data from a hard drive works too.
Not quite, deletion from a hard drive also unflags the space the data was located at as being in use, so it will be overwritten eventually so long as the drive continues to have things written to it. Simply flagging something as being archived means that information will remain on the server indefinitely, the exact opposite of what is intended by a delete button.
That’s why we use the shred command, then you get random data over it at the start.
Depending on your media that may not really destroy the data. SSDs do wear leveling and it might just write new blocks and reuse the old ones later.
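The overwrite-before-delete idea looks roughly like this in Python. It is a minimal sketch of the general technique, not how `shred` itself is implemented, and the function names are hypothetical. As noted above, on SSDs and on copy-on-write or journaling filesystems the new bytes may land in different physical blocks, so this gives no guarantee that the old data is gone.

```python
import os


def overwrite_in_place(path, passes=1):
    """Overwrite a file's contents with random bytes, keeping its length."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            # Ask the OS to push the write to the device. The device may
            # still remap blocks internally (wear leveling on SSDs).
            os.fsync(f.fileno())


def overwrite_and_delete(path, passes=1):
    """Overwrite the file, then remove its directory entry."""
    overwrite_in_place(path, passes)
    os.remove(path)
```

This reads the whole file size into memory for each pass, so it is only sensible for small files; a real tool would write in chunks.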
So, what you're saying is, to truly delete data from an ssd you need to do manual wear leveling with a belt sander.
My current workplace doesn't have the foresight to do that. Delete fully deletes immediately and without confirmation. Oh and the backups have been broken for years
On the upside, recent changes in leadership and on the team made it so we finally have the political will and talent in the right places to actually put effort into fixing backups but they have a lot of technical debt to sift through in fixing the last folks' mistakes and oversights
Psh I'm surprised you're surprised. The only way to really get rid of data is microwave or magnet, no?
I guess I’m naive and believe people would be honest.
I wish too. But apparently information is power 🤷‍♂️ who would've guessed
drill bit*
Reminds me of when everyone was deleting their posts around the API blackout and suddenly the next day it was like Reddit did a restore from checkpoint and all of the edited posts and deleted posts came right back. I for one had to run the script that replaced then deleted my posts twice, but that's beside the point.
Same, but then a bunch of mine popped up again sometime in the last few months. Not exactly sure when, and it wasn’t all of them.
I didn’t run the replace script tho, I wish I would’ve.
Did they actually come back or did a bunch of comments not show up in your profile because the subs were private so they didn't get deleted, then they reappeared when the subs reopened? I thought some of mine were restored too, but it was a combination of that and the tool I used only hitting the most recent 1000.
Not a bad guess, and it’s always possible there may have been a handful for which that were true, but no. The bulk of them were from subs that have been public for years and continue to be. I’d been at zero comments for pert near 2 years? A bit more. I would check every now and again after they all came back early on. Then suddenly I noticed a few thousand were back, all from 2011-2014 or thereabouts.
Indeed, replacing the posts with irrelevant gibberish was the way
What script is this?
I used this in the past, not sure it'll still work, it worked with old.reddit a few years back
https://greasyfork.org/en/scripts/1870-delete-all-reddit-comments/code
there's other ones that edit comments as opposed to deleting them, either way reddit keeps a copy, just fyi
Thank you!
Pretty sure they also have access to banned Reddit accounts whose users can no longer access their history to know what they will be judged and profiled for, too.
Just assume every social network either allows this directly or enables a third party to do it, Lemmy especially.
Lemmy is explicitly public. I don't think that's much of a stretch.
Fuckers should help me restore my old academic portfolio then. Might as well put living in a dystopian surveillance state to helpful use.
It seems like these sorts of things can be used against you, but whenever it might actually benefit you they always come up short.
I was screaming this when COVID first hit.
Literally nothing points to Palantir so far according to that post, so your title is just misinformation...
Everyone from the sources to the poster are just guessing and speculating.
I don't doubt that it's real, but at this point it's just made up nonsense to claim it like this.
Ah, but "Platner" and "Palantir" have most of the same letters!
Agreed, this would be a really weird way for a state threat actor to reveal an otherwise pretty sensitive capability. I assume there's something on the account which clearly links him to it and someone he knows recently remembered they know his old account name.
Nothing here that makes mass surveillance justified.
Depends - was the assault comment directed at assailants or victims?
well... at least he realizes that was bullshit...?
Also, remember that this is cherry-picking the worst possible way of phrasing the worst possible excerpts they can pick out from over ten years of random shitposting. You have no idea what the context was for any of this stuff. You don't know if the part he "agreed" with was the same as the part of the whole other person's message that said white people were racist and stupid. And so on.
Nazi tattoo... Rape apologism... What other wacky stuff has this scamp gotten up to?
Well, except for the rape shit. That is pretty jarring.
Nothing based about overt misogyny and rape apologism... Not to mention the nazi tattoo.
On the plus side, he'll be great at "bridging the aisle". \s
Don't just be mad at palantir.
The American government funded palantir.
Palantir couldn't exist without the helping hand of the American government.
The time to give a fuck was long before Snowden made his leaks.
All the dystopian stuff people fear the government will do is already being done by a framework of companies funded by our government.
What’s posted to the internet, STAYS on the internet. Forever. Stay safe friens
Bookmarked
I'm sure Palantir would help Democrats find stuff on NAZIs ... right ? Right ?
The guy in the photo with the nazi tattoo is a democrat lol
For as long as Reddit has been available there have been places that would scrape it and mirror deleted/edited comments. I don't remember what any of them are, but people would use them all the time to figure out what someone said before they were banned.
There were also sites that you could put in a username and it would give you a pretty in depth summary of the person's location and facts about them.
It was forensics. They used forensics. AI did not help, probably got a bit in the way even. You can do these things with data. We told you several times
Oh great the second I become president my DeviantArt is getting leaked
For some time now, I have written stuff on the internet under the assumption that one day, my identity will be publicly tied to everything I wrote. Surely in the future it will be easy to give an example of my writing to an AI bot, perhaps combined with some facts about my life, and the bot would be able to find anonymous posts that were likely written by me, across the internet.
"Deleted"
Once you put anything on the public internet these days, it will be harvested by corporations and used against you eventually
I think we all should keep speaking up.
Be unpredictable
There was a browser extension back in the day that ran junk searches to poison whatever data they were harvesting on you. Sounds like it needs to make a comeback.
There's a browser addon called Meta Random Search which sends your queries randomly through google, bing, ddg, yahoo and other search engines so that nobody has a full history. Paired with a user agent switcher (personally using Chameleon on Firefox) with a high frequency of change (1 min or so) and disabled browser telemetry it might throw them off already even without poisoning results. Especially since every query consumes natural resources I'm not really a fan of this approach.
Peter Thiel knows about the anti christ
What's stopping them from lying about this? They lie about literally everything, why should anyone believe them about anything at all?
Lying that they have the Reddit comments that they showed everyone?
Anyway, Reddit is one of the most scraped sites on the internet. Be sure that everyone who wants an archive of Reddit has an archive of Reddit. Anything you put on the internet is forever.
I don't exactly mean to say this is nothingburger, but it is one of the more boring bits of living in a dystopia and is too low on the list of evil shit Palantir is doing to care that much about.
This should come as no surprise. We all knew they were storing everything away in data centers for decades now. Once AI hit the scene it was over. I knew right at that moment they would use it to parse all that information. That's why I stopped using social media in a way that could be used to identify my more violent and anarchistic takes. I'm sure I could still be found out, but I'll not be making it easy for the bastards. Fuck Republicans for fear mongering all of this for my entire life and now cheering for it to be used against their perceived enemies, traitorous cocksuckers.
Hey, this guy over here is hiding 'more violent and anarchistic takes'.
We're not sure what that means so better lock you up just in case. Law Shmaw.
Q. uell. Sur....... prise ~
Once upon a time, we understood that "putting something online" meant releasing it to the public in perpetuity. Any claim otherwise was gaslighting.
Facebook, et al, have somehow managed to convince people that privacy is something you can get back after they've coerced you to give it up.
Yup, data archival. Now imagine this future: right now, encrypted data transfers may be wire-tapped and stored. When quantum computers are available, all that traffic will be decryptable. This includes pretty much all general HTTPS traffic since TLS mostly uses ECDHE for key exchange which isn't quantum secure.
I bet nation state actors are recording everything they can.
Damn, dude... that's insane and I'm surprised it's never occurred to me.
I've had the realization before as I realize that maybe my password database will eventually be easily cracked... but there's no reason it cannot apply to data in transit as well, as long as someone is recording it.
Palantir is an evil company.
This was pretty evident the second they named their company after the tool used by The Dark Lord Sauron and Saruman to spy on the actions of others in Lord of the Rings.
You don't need AI for this, just a ton of money for storage and either tolerance for a slow query (like 15-20 minutes) or an engineer who knows what they're doing in search.
Hot. Hot, hot, hot.
Convinced we somehow ended up in the darkest timeline.
Just in time for the genocide of free thinkers. Lucky us!
Anything ever posted online can be archived and searched later. It doesn't take a rocket scientist to cross reference publicly available sources along with subpoenaed data.
I think even Facebook said years ago it's less taxing on the systems to unlink content rather than delete it.
Yeah but that's basically how deleting works for any normal system. You remove the pointer telling the computer where the data is, then you flag that section of data as free for writing to. It's not until something writes over the data that it's truly gone.
That's not what's happening here.
Think of a database where nothing is editable. You can only add additional data. So you can't delete a post you can only add a deleted = true flag.
Much easier to keep this kind of database in sync.
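A toy version of that append-only idea, in case it helps. This is a sketch under my own assumptions, not Reddit's or anyone's actual design; the `Event` and `AppendOnlyLog` names and the event kinds are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One immutable record in the log (hypothetical shape)."""
    post_id: int
    kind: str  # "created", "edited" or "deleted"
    body: str = ""


class AppendOnlyLog:
    """A store where records can only be added, never changed or removed."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def current_view(self):
        """Replay the log to get what users currently see."""
        view = {}
        for e in self._events:
            if e.kind in ("created", "edited"):
                view[e.post_id] = e.body
            elif e.kind == "deleted":
                # "Deleting" only hides the post from the replayed view.
                view.pop(e.post_id, None)
        return view

    def history(self, post_id):
        """Everything ever recorded for a post, including old versions."""
        return [e for e in self._events if e.post_id == post_id]
```

In a store shaped like this, a "delete" is just one more appended record, and every earlier version of the post stays in `history`. Replicas only ever have to copy new records, which is why it is easier to keep in sync.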
Ah gotcha, very bogus. Kinda speaks to how cheap storage is, doesn't it?
It's true except in a filesystem you lose the name and location and it can be overwritten. In this instance it sounded like they just prevent access but keep all that data there, still accessible and readable, and it won't be overwritten.
The obvious strong link is reddit+email. Someone could have got into his personal, probably old mailbox, where original registration letters (with r/handle) and notifications still are. I find it more probable, but since government is under MAGA, they could've used some way to ask Huffman if some account matches the mail address.
This is why I resisted attaching emails to reddit accounts for years. Recently subs have started soft banning users without registered emails though, so I started just using throwaways.
The real enemy inside is the government, not the people.
People need to keep their government under more surveillance than the government does its people.
At one point it was possible to download every reddit comment ever. I think it was around 10 years ago I had a copy of that. I can't remember if it was from reddit directly or from some third party with scrapers. I recall the dataset being free... but it might have been free for me because I had an academic justification? Really don't recall.
Anyways, point being that you're delusional if you think anything you post online ever goes away. Secondly, you can be much less than Palantir and have "deleted reddit account comments". Anyone can get them.
Lmao. Waybackmachine
Edit: woops, other guy said it.
Probably the Pushshift archive which is publicly downloadable.
That’s not Palantir doing it. If you put the data in a MariaDB, you can access it, too. Is not MariaDB the culprit, or the one distributing the data?
"deleted" eh?
Don't you guys have NSA? Just have a contact there and...
OP is learning about archive.org for the first time?
Maybe someone should visit their offices
So like internet archive? oooo spooky
The scary dystopian part is the ability to work out that the account belonged to someone who hadn't used it for a decade rather than just that they could see what had been posted. The Internet Archive doesn't let you ask it what someone's Digg username was.
So you acknowledge that the data exists, what you are scared of is being able to search it? Spooky stuffs.
So you acknowledge that bullets exist, what you are scared of is being able to continuously fire them at an extremely high rpm? Spooky stuff.
You fucking knuckle dragger.
You never considered that bullets could be fired at a high rate until an article you saw on lemmy told you to be scared of it?
Jesus, it's like you've melded with the idiot bus.
Haha dumbass
I'm going to say that this is actually spooky.
Not that it's unreasonable, but that the scale of what AI can surveil is so vast that there's no more personal security-via-obscurity.
It used to be that unless someone had a reason to start looking at you, anything you did online or off was effectively impossible to search. You might be caught on some store's CCTV, or your cell provider might have location pings, but that wasn't online for anyone and needed a warrant to have the police use it to track your activities. Now cities are using Flock and similar tools to enable tracking vehicles across the country without any reason, and stores are using cloud-service AI cameras to attempt to track your mood as you move through the store. These tools can and have been abused.
Now, due to the harvesting of this data for AI, anything that's ever been recorded (video footage, social media posts, etc) and used as training data can be correlated much more easily, long after it occurred, and without needing to be law enforcement with a warrant.
I'd call that spooky.
So you think private and opensource intelligence spontaneously came into existence in the last 5 years because of AI?
No, that's not what I said. Widespread data collection and searching used to be something only state actors could accomplish and there were at least theoretically guard rails. Now the barrier of entry has been seriously reduced, the data is owned by a corporation, and being fed to AI. That has a chilling effect as well as being ripe for abuse.
I don't see an upside.
So you just make shit up as you go? You are projecting how you think things should work into reality as if it were fact. But now you are learning how it actually works and what really scares you is the shattering of the illusion you sold yourself. I mean it should be pretty apparent Google and Facebook are tools of the US government they always have been.
"Anyone could already do this, so why bother being worried that it's easier now" they said.
I still don't get your angle. Why are you defending this, or at the very least downplaying its impacts? You seem to also be aggravated by this data collection and spying, so why are you so mad that other people are catching on?
"Oh, I'm so smart" they said. Enjoy your useless internet points?
The situation is actually different now, in the last few years. This is less relevant to the OP, but we/they are building automated snitches that will tattle on you, and more importantly be wrong with a statistical significance. See Flock mistaking license plates and calling the police on innocent people. Sure, we might catch a few violent criminals, but when your government decides that your online activity complaining about them is now criminal, your data can be correlated in real time in a way that wasn't possible in our parents' time.
Your dismissal of this seems insane. Stop arguing with me about how long it's been possible and help me/us fight against it.
Usually that's the insurmountable mountain. Data collection is easy. Formatting, storing and querying the data so you can actually get useful information out of it in a time efficient manner is the extremely hard part.
For a real world example, the organization I work at does quarterly audits of all of the field offices to make sure they're in compliance, checking required document retention, gear, etc. When an audit finds a requirement that is out of compliance, the office is given a task with a deadline to bring it back into compliance, and these tasks have visibility all the way up the chain of command, to where even the C-levels are reviewing them regularly. I've been working on a project recently to flag repeated failures of the same audit requirement for the same location, and it's highlighting that some field offices are not actually coming into compliance once these high visibility assigned tasks are completed. When I presented it to leadership, it was a revelation just how many field offices are continuously out of compliance.
Point is, this data is being actively collected and formatted for easy access and there's still glaring issues being missed due to the difficulty of finding these trends buried in the hundreds of pages of data being generated each quarter per field office
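To give a feel for the kind of check I mean, here is a minimal sketch of flagging repeat failures. The tuple layout, office names and the `repeat_failures` function are all hypothetical and simplified; the real data is nowhere near this tidy, which is exactly the hard part.

```python
from collections import defaultdict


def repeat_failures(findings, min_quarters=2):
    """Group failed audit findings by (office, requirement).

    findings: iterable of (office, requirement, quarter) tuples, one per
    failed requirement in a quarterly audit.
    Returns a dict mapping (office, requirement) to the sorted list of
    quarters in which it failed, keeping only pairs that failed in at
    least min_quarters distinct quarters.
    """
    quarters = defaultdict(set)
    for office, requirement, quarter in findings:
        quarters[(office, requirement)].add(quarter)
    return {
        key: sorted(qs)
        for key, qs in quarters.items()
        if len(qs) >= min_quarters
    }


# Example input with made-up offices and requirements.
example_findings = [
    ("North", "document retention", "2024Q1"),
    ("North", "document retention", "2024Q2"),
    ("North", "gear", "2024Q2"),
    ("South", "gear", "2024Q1"),
]
```

The logic is trivial once each finding is a clean row. Getting hundreds of pages of audit output per office into that shape is where the effort goes.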
Yea for sure more accountable reliable systems would be better than worse systems great point.
Uhhh I think you completely missed what I was saying. I was explaining that collecting data is easy, but actually making use of that data is really hard, and gave a real world example of a trend that should be obvious being buried in a mountain of data because there's simply too much data to sift through
I gave you an out. You entirely missed the point. No one is talking about arbitrary amounts of arbitrary data, the concerns are about things we say online being used against us in the real world.
Imagine thinking your privacy is above profit where capitalism rules lmao
You muricas are too dumb to understand that nothing has more power than money on a profit driven society like yours, like the ones your stupid elite forced upon every country's throat as they could
Your "market freedom" is actually money dictatorship, the USA is not a free country
TIL America is the only capitalist country
No! Read my comment again, it's the only country that bombs other countries that do not want to participate in this prehistoric system
Also this is being orchestrated by the DSCC showing that the Democrats still have power and deep state ties, they just use them against left/anti-zionists.
EDIT: Are the downvoters refusing to see that all this came out after Janet Mills jumped in the race and immediately got endorsed by Schumer? It is obviously oppo research by her supporters.
Who's the cute guy?
Yet another reason why you shouldn't get nazi tattoos... jfc.
As much as Palantir is an evil organization, the underlying problem is libs constantly supporting fascism.
As an actual "leftist", I don't want nazi military bros as my "representative".
Kinda wild how nobody else mentions this.
What is it with this site and hating "libs" like it feels like Twitter with the Trump goons.
They're the remnants of last year's propaganda campaign. No one bothered to deprogram them. They tend to fire off their lines at unpredictable times like a toddler's toy with a low battery.
Users like techcrit have Glenn Greenwald syndrome
Ok, cool to learn something new. But why is the top hit when I Google that name a fluff piece from the New Yorker about a fucking tennis player?
https://www.newyorker.com/magazine/2018/09/03/glenn-greenwald-the-bane-of-their-resistance
I don’t live in Maine anymore, but his handling of this, and clear stance on other issues important to me, have actually strengthened my willingness to vote for him were I still in the state.
I actually believe him when he says he got it while drunk on a night off with the marines, supposedly not knowing the meaning at the time. He then went on to say he immediately scheduled to cover it because getting it removed would have taken too much time to figure out because Maine doesn’t have any places that do it. Like he wanted that shit no longer visible on his body asap and went out of his way to get it done sooner than later.
Similarly with LGBTQ+ rights. Yeah he said some edgy shit on the Internet a long time ago, but he’s said he’s changed and now aggressively supports queer rights in Maine. Idk, maybe he’ll pull a Fetterman, but I don’t get that vibe.
Even if he is still in the process of fully deconstructing things, he is clearly taking the correct actions in the here and now to further that process.
I grew up with people like him and almost without fail, when I actually sat down and had real conversations with them as adults, they’ve been positive and minds have been changed in all directions.
Blue collar Mainers are some of the first people to hate billionaires, and fiercely support small government, personal freedoms and privacy. This honestly means supporting queer rights in so far as they want the freedom to be themselves too. They have been systematically lied to by a party that doesn’t actually want small government or personal liberties, and many of them have realized that.
We need to be able to welcome these people willing to be educated, and genuinely capable of changing their thinking and their ways. These people are closeted radical leftists. We will need them on board.
He seems to walk the walk as well as talk the talk.