FaceDeer

FaceDeer@fedia.io · 57 minutes ago

If it’s not communicating anything, what’s the point?

FaceDeer@fedia.io · 12 hours ago

My point is that if we turn up our gibberish dial now then at least our llms will be learning the wrong thing & we have some control.

We’d be covering ourselves in poop to prevent people from sitting next to us on the train. Sure, people will avoid sitting next to us, but in the meantime we’ll be covered in poop.

And then other people will learn the trick, cover themselves in poop too, and now everyone’s poopy and the trick stops working.

There is still a lot of understanding that we do automatically that an llm will never do.

Are you willing to bet the convenience of comprehensible online discourse on that? “Automatically understanding stuff” is basically the one job of LLMs.

LLMs model language, and coming up with some kind of “gibberish” filter is simply inventing a new language. If there’s semantic meaning in it the LLMs will figure it out just like any other language, and if there isn’t semantic meaning then we’ve lost the ability to communicate entirely. I see no upside.

FaceDeer@fedia.io · 1 day ago

In my experience the vast majority of posts about Elon Musk are from people who hate him and are tired of hearing about him.

FaceDeer@fedia.io · 1 day ago

Well, the “at least for now” part is my point - if people start using “gibberish” to communicate or to hide their communication, that provides training material for LLMs to let them figure out how to use it too.

LLMs learn how to communicate based on existing examples of communication. As long as humans are communicating with each other somehow, LLMs will be able to learn how to do that too. They have the same communication capabilities that we do at this point, so there’s not really any way we can make a secret clubhouse that they can’t figure out how to infiltrate.

Personally, I think there are two main routes we can go to deal with this. Either we can simply accept that there’s no way to be 100% sure we’re talking to a human any more and evaluate the value of our conversation based on the content of the words spoken rather than the composition of the entity generating them, or we could come up with some kind of “proof of personhood” system to allow people to label the text they write as coming from them.

The latter is extremely hard to do, of course, both from a technical and cultural perspective. And such a system would likely still allow someone’s “person token” to be sneakily used by AI, either by voluntarily delegating it (I could very well be retyping all of this out of a ChatGPT window) or through hackery.

So I’m inclined toward the former. If I’m chatting with someone and I’m having a good time doing it, and then later I find out it was a bot, why should that change how much fun I had?

FaceDeer@fedia.io · 2 days ago

I don’t see how that would be practical. People who aren’t “in on the joke”, as it were, will call out the gibberish and downvote it. If enough people are “in on the joke” then the whole forum becomes useless and some other forum will be created to fill the role of the original. The AI will train off of that one.

Basically, if you don’t want an AI training on your content, then don’t post your content in public where an AI will see it. The Fediverse is the last place you should be posting since its very nature is about openly broadcasting your content to whoever wants to see it.

FaceDeer@fedia.io · 2 days ago

You realize that this is only going to train LLMs how to recognize “gibberish”?

FaceDeer@fedia.io · 5 days ago

It’s unfortunate that there’s such a powerful knee-jerk prejudice against blockchain technology these days that perfectly good solutions are sitting right there in front of us but can’t be used because they have an association with the dreaded scarlet letters “NFT.”

FaceDeer@fedia.io · 5 days ago

Not only can AI do that, it probably does it far better than a human would.

I like XKCD’s solution. Aside from the fact that it would heavily reinforce whatever bubble each community lived in, of course.

FaceDeer@fedia.io · 6 days ago

There is a certain amount of irony when people respond to a comment that mentions AI with a reflexive “AI is just a fancy autocomplete!” without any relevance to the larger context.

FaceDeer@fedia.io · 6 days ago

Yeah. A lot of people loudly declaring that they’re switching to Linux, followed by them staying with Windows anyway.

FaceDeer@fedia.io · 6 days ago

But it’s yet another opportunity to post a comment about how much we hate cybertrucks and the people who own them, so up it goes!

FaceDeer@fedia.io · 6 days ago

It’s getting creepy just how fast these guys whip out their suicide grenade. And the way he was just holding it and looking at it until it went off in his face… I can’t imagine what’s going through their heads (aside from shrapnel, of course).

FaceDeer@fedia.io · 6 days ago

Let’s just keep it between you and me for now.

FaceDeer@fedia.io · 10 days ago

Yeah, the whole concept of “national” TLDs is proving to be a rather poor one in practice. Very few of them actually make sense in the way they’re used.

FaceDeer@fedia.io · 11 days ago

It’s more impressive when you use inpainting to preserve the beak, eye, and feet from the original source image.

FaceDeer@fedia.io · 12 days ago

Heh. I fell off of contributing in recent years, but there was a time back in the day when my edit count was in the top hundred or so. Your impression is completely wrong.

Anyway, this discussion here isn’t going to affect what the people on Wikipedia are doing, so it doesn’t really matter. I linked to the project page above and it’s quite clear that even this “AI Cleanup” project is not in any way fundamentally opposed to using AI; they’re just focused on ensuring that editors using it are adhering to Wikipedia’s guidelines. If you think AI can’t do that then clearly your concept of how AI is useful is too limited.

FaceDeer@fedia.io · 12 days ago

You’re probably assuming that someone would just go to an LLM and say “write a Wikipedia article about subject X”? That wouldn’t work well, but that’s very far from the only way to use LLMs for Wikipedia work.

For starters, it doesn’t have to actually write content at all. You could paste an existing article into an LLM and ask it “What facts in this article lack references to back them up? Are there any weasel-worded statements, or statements that don’t appear to follow a neutral point of view?” and get back lists of things that require attention.
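To make that concrete, here's a rough sketch of how the audit prompt could be put together in Python. It only builds the text you'd send; it doesn't call any particular LLM service. The names `REVIEW_QUESTIONS` and `build_review_prompt` are made up for illustration, and the exact instruction wording is just one guess at what would work.

```python
# Hypothetical helper: assembles an "audit this article" prompt.
# Nothing here talks to a real LLM API; it only produces the prompt text.

REVIEW_QUESTIONS = [
    "What facts in this article lack references to back them up?",
    "Are there any weasel-worded statements?",
    "Are there any statements that don't appear to follow a neutral point of view?",
]


def build_review_prompt(article_text: str, questions=REVIEW_QUESTIONS) -> str:
    """Return one prompt string asking an LLM to flag problems in an article."""
    # Number the questions so the answers can be matched back to them.
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return (
        "Review the Wikipedia article below and answer each question "
        "with a list of the specific sentences that need attention.\n\n"
        f"Questions:\n{numbered}\n\n"
        f"Article:\n{article_text.strip()}\n"
    )


if __name__ == "__main__":
    sample = "Example City is widely considered the best city in the world."
    print(build_review_prompt(sample))
```

Whatever comes back would still be a to-do list for a human editor, not a set of changes to apply blindly.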

Or you could paste a poorly-worded article in and tell it to rewrite it with all the same information but better phrasing or structure. You could put a bunch of research materials you’ve gathered into the LLM’s context and tell it to write a summary in the style of a Wikipedia article, with references to the sources for each fact mentioned. Obviously you’d check the LLM’s work afterward and probably do some manual editing, but this would be a great time and effort saver to get a first draft written. You could take an existing article and tell the LLM that some particular fact had changed or been discovered to be incorrect and ask it to rewrite the relevant parts to account for that.

Wikipedia is in many, many languages. You could have a multilingual LLM automatically compare the contents of different language versions of a Wikipedia article and ask it to spot differences in content or tone. You could have an LLM translate an article from one language to another as a starting point for creating an article in that new language.

You could have the LLM check the references of an existing article - look up each referenced work on the web and see whether it genuinely says what the article that’s using it as a reference says. It could flag all manner of subtle problems that way. Perhaps the reference sounds biased, or whoever used it as a reference misinterpreted it, or the link was simply incorrect and points to unrelated material. Being able to have an AI do a first-pass check of all that in a completely automated way would save huge amounts of time.

This is all just brainstorming off the top of my head, so I’m sure there’s plenty of other good uses that aren’t coming to mind.

FaceDeer@fedia.io · 12 days ago

From the project page:

The purpose of this project is not to restrict or ban the use of AI in articles, but to verify that its output is acceptable and constructive, and to fix or remove it otherwise.

There’s nothing fundamentally wrong with LLMs. Users just need to know their capabilities and limitations and use them correctly. Just like any other tool.

FaceDeer@fedia.io · 15 days ago

They’re not talking about the same thing.

Last week, researchers at the Allen Institute for Artificial Intelligence (Ai2) released a new family of open-source multimodal models competitive with state-of-the-art models like OpenAI’s GPT-4o—but an order of magnitude smaller.

That’s in reference to the size of the model itself.

They then compiled a more focused, higher quality dataset of around 700,000 images and 1.3 million captions to train new models with visual capabilities. That may sound like a lot, but it’s on the order of 1,000 times less data than what’s used in proprietary multimodal models.

That’s in reference to the size of the training data that was used to train the model.

Minimizing both of those things is useful, but for different reasons. A smaller training set makes the model cheaper to train, and a smaller model is cheaper to run.
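Here's a back-of-the-envelope sketch of why those are separate knobs. It assumes the common rule of thumb for dense transformer models (roughly 6 FLOPs per parameter per training token, and roughly 2 FLOPs per parameter per generated token); that rule isn't from the article, and the parameter and token counts below are invented purely for illustration, not figures for any real model.

```python
# Rough compute estimates under an assumed rule of thumb for dense transformers.
# All numbers in the example are hypothetical.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute: ~6 FLOPs per parameter per token."""
    return 6.0 * params * tokens


def inference_flops_per_token(params: float) -> float:
    """Approximate compute to generate one token: ~2 FLOPs per parameter."""
    return 2.0 * params


if __name__ == "__main__":
    big_model, small_model = 70e9, 7e9   # hypothetical parameter counts
    big_data, small_data = 1e12, 1e9     # hypothetical training-set sizes in tokens

    # Shrinking the data cuts training cost but leaves the cost to run unchanged.
    print(training_flops(big_model, big_data) / training_flops(big_model, small_data))

    # Shrinking the model cuts the cost of every token generated afterward.
    print(inference_flops_per_token(big_model) / inference_flops_per_token(small_model))
```

Under this approximation a smaller model also trains more cheaply, but the size of the training set has no effect on what it costs to run the finished model.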

FaceDeer@fedia.io · 16 days ago

Everybody wants money, that’s why they call it “money.”