I love the fact that the fediverse was built from the ground up to be free, federated, and interoperable. I have two questions that may come from my lack of expertise or knowledge, so I apologise in advance if they are dumb.
- Bots can disrupt smaller instances:
What is stopping corpos from scraping everyone’s posts from the fediverse to train their AI? And what’s stopping them from then creating loads of bot accounts to spam and disrupt smaller communities? When an instance’s quality drops, its users may be more inclined to migrate to bigger instances. It’s safe to say most Lemmy users are not going to spin up their own instance and start communities from scratch. Meanwhile, the onslaught of bots can overwhelm these budding communities and instances.
- Corpos can flood the fediverse with ads and crap:
Threads comes to mind on this point, along with how many instances have chosen not to defederate from them. Besides, they can create bridges and run repost bots on all instances to flood the major ones with ads. With generative content, it is so much easier to disguise an advertisement as a seemingly casual post about a product.
I’ve seen previous posts about people wanting to come because of their opinion about how certain countries behave. I feel the true evil is the corporations.
That very real and enforceable “this comment cannot be used to train AI” crap some people add to every comment that definitely makes bots not scrape the comment, of course!
I’ll sue them in small claims court as a pro se litigant demanding a jury trial. I will also try to file motions every other week, which will probably fail, and then ask the judge to give me time to correct them. I will make them waste thousands on attorneys’ fees and be a royal pain in the ass.
Good luck getting a court to hear it when you won’t have a shred of evidence to show.
But that can poison the AI to some degree.
Kinda? Not really, though. If anything, the model’s responses would just include “anti-commercial license” at the end, and they’d get rid of that with further training.
Most likely someone at the AI company would catch it and filter those strings out of the training data.