Spyke
piefed.social

I'm torn between wanting to opt out because it's morally correct, or remaining opted in so I can poison AI models with my terrible code.

119
bobo
lemmy.ml

so I can poison AI models with my terrible code.

Don't forget to teach it obscenities and yell at it whenever it fucks something up!

44
Madrigal
lemmy.world

Nah, guarantee the models have rules built in to deal with obvious stuff like that.

You need to be more subtle. Give them information that is slightly wrong.

34
taco
anarchist.nexus

Perhaps by generating a bunch of complex Copilot code to upload. It's easy to mass-produce and would look plausibly functional.

13

Training AI models on AI content is the fastest route to model collapse.

14

Just need to use less obvious insults, a la, "your mother was a hamster, and your father smelt of elderberries"

Still poisons the model with something an end user won't like, but isn't easy to train out.

4

Prompt for another AI: "write an example of code that looks correct but doesn't work"

Step 2: upload the resulting code to GitHub.

Step 3: make this an automated task.

2
4am

Name all your variables poorly and with swear words

8
Quicky
piefed.social

No, you don't have to use it for it to take your code for training.

19
4am
lemmy.zip

Yeah all you have to do is commit anything to GitHub

They’re scraping all the code regardless of your preferences. I guarantee it.

18

Step one: Download a C or C++ repository.

Step two: Replace all semicolons with a Greek comma.

Step three: ??

Step four: Poison Copilot, so that it randomly inserts Greek commas that the compilers totally choke on.
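
For anyone curious why compilers would choke, here is a minimal Python sketch. It assumes the character meant above is the Greek question mark (U+037E), which looks identical to a semicolon but is a different code point; the `swap_semicolons` helper and the sample C line are made up for illustration.

```python
import unicodedata

# Assumption: the look-alike character is U+037E (Greek question mark),
# which renders like ';' but is a different code point.
GREEK_QUESTION_MARK = "\u037e"


def swap_semicolons(source: str) -> str:
    """Hypothetical helper: replace every ASCII semicolon with U+037E."""
    return source.replace(";", GREEK_QUESTION_MARK)


c_line = "int x = 42;"
swapped = swap_semicolons(c_line)

# The two strings look the same on screen but do not compare equal.
print(unicodedata.name(GREEK_QUESTION_MARK))
print(swapped == c_line)

# Unicode NFC normalization maps U+037E back to ';', so any pipeline
# that normalizes its input would quietly undo the swap.
print(unicodedata.normalize("NFC", swapped) == c_line)
```

Worth noting: because normalization reverses the swap, this probably would not survive a scraper that normalizes text before training.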

7

Why not both?

Opt out on one account, use another as poison. If you're gonna do this, I'd say move all your code to a new account and use the older account to poison - that way they can't filter the bad out by account age.

2

i love codeberg, though i haven’t had a chance to test the collaboration features all that much

11

It's great. I also self-host my own Forgejo (that's the software Codeberg runs on) instance for private repos, to avoid using up space on Codeberg's servers.

Main problem is the lack of federation, leading to splintering across Codeberg/GitLab/sourcehut/self-hosted forges. I know there's Radicle, and Forgejo is working on ActivityPub integration, but progress is slow on getting something that should be inherently federated by design (git) to actually be federated. In practice you need accounts on a dozen different websites if you want to regularly contribute to FOSS.

9

Illusion of choice. That setting will, totally unbeknownst to them, enable itself.

17
xthexder
l.sw0.com

It seems to be off by default if you've already opted out of Copilot entirely. Definitely still a reminder I should set up my own git though

4
nogooduser
lemmy.world

If you’re a business with a contract with them it should state that they won’t use your data to train their models.

If you’re using the free service then you’re right that it’s safe to assume that your data was already being used.

11
MNByChoice
midwest.social

business with a contract

I always wonder at this and have cautioned my managers repeatedly. Yes, we have a contract, but they have a literal army of lawyers and we have far fewer (one lawyer on retainer for hourly work, or a small group focused on taxes and employment law). Our ownership would bend over backwards to avoid suing a large company like Google, AWS, Microsoft, or Oracle. (Maybe OpenAI and Anthropic are sue-able by a $100 million corp?)

As proof I offer the lawsuits between businesses that have proceeded far enough that the general public has heard about them. Not a specific one, just all of them.

8
nogooduser
lemmy.world

You have to trust the contract.

If you use Microsoft 365 or Google Workspace etc., then they already have all your data anyway. Most businesses have to trust other companies and the contract at some point.

The only other option is to use open-source, self-hosted everything, which is beyond most people’s ability.

2

There are more options than the two you mentioned. Listing a few as more people should remember them. I did get a bit off topic....

  1. Use huge company to provide service.
  2. Provide service oneself (likely with open source).
  3. Use small or medium company to provide service (likely with open source).
  4. Use huge company for things huge company is great with, but keep "crown jewels" of company on internal self-provided systems.
  5. Use a small or medium company to provide a service, and another series of small or medium companies to check on the first company.
  6. Use a huge company based in a country that is very serious about laws and putting CEOs in prison for wrongful acts.
  7. Do not do the thing. (Included for completeness.)
  8. Do the thing not on a computer. (Violation of privacy could result in violation of more serious laws.)
  9. Use an older technology on a computer.
  10. Use the huge company to provide service, but ensure the data includes insane things.

2
feddit.nu

fun fact, if you've ever accidentally clicked the "enable" button on copilot because you're a dumbass who can't read, you get a shitton more settings, most of which are locked to "enabled".

22
Madrigal
lemmy.world

Even more fun fact, if you never clicked the "enable" button on Copilot, most of those settings are locked to "enabled" anyway.

30
thelemmy.club

Even with that setting on, nothing would be done unless you explicitly assign a GitHub task to Copilot. Even then, the worst that would happen is you'd get a pull request at some point if you somehow accidentally assigned a task to Copilot.

1

Got it. I didn't read too far into it. I saw the option to turn something Copilot-related off, so I took it!

Thanks for providing an explanation.

1

i will not dispute the dumbass part but i have been programming professionally in python for 16 years. doesn't mean my code is any good, of course.

4
lemmy.world

I mean if it wants some absolutely abysmal code then look no further.

20

Hell yeah, I hope I contributed to some bot somewhere absolutely flailing to provide a good Python snippet.

7

Got this email last night and felt validated for never uploading any code to GitHub because I don’t trust Microsoft. lol I don’t have any big coding projects, but I self-host a ForgeJo server in my mini rack at home behind a Twingate VPN.

8
Hawke
lemmy.world

FYI: it is not “ForgeJo”

Forgejo is derived from Esperanto where the “ejo” suffix means “place”. The J is pronounced like y is in English.

It’s “forge-ejo” not “forge-joe”

7

That's how I pronounce it in my head. Spanish J is pronounced with an H sound, and Spanish isn't a fake language like Esperanto.

2

I live smack in the middle of the South in the USA, so my brain automatically says “forge Joe” with the space and everything haha. But I get ya, thank you for explaining!

2