Do you think the other ways in which people vote are better: selling your vote, picking a candidate for being presentable, picking a candidate for having the right complexion/religion/sexual orientation, picking a candidate for being married or having kids, picking a candidate because they are "smart", or poor, or... I could go on. Giving an LLM the right prompt, found on the internet, might yield a better choice than the one you would arrive at on your own.
I think we do democracy not because we think the masses are informed and make good decisions, but rather because it's the best system for ensuring peaceful transitions of power, thereby creating social stability which is conducive to encouraging investment in the future.
So uninformed people participating isn't an unfortunate side effect, but rather the point: making everybody feel included in the decision making processes, to make people more likely to accept political change.
"I think we do democracy not because we think the masses are informed and make good decisions, but rather because it's the best system for ensuring peaceful transitions of power, thereby creating social stability which is conducive to encouraging investment in the future.
I think people argue this but I don't think it's true.
The lack of warlords leads to peaceful transitions. Trump can feel all he wants about the 2020 election but his sphere of influence was too small to take control.
This isn't the case for all those power struggles when a monarch dies. Each lord had their own militia they could mobilize to seize control, which leads to things like the Wars of the Roses.
We had this same issue going into the Civil War, where the US army was mostly militias, so it was pretty easy to pull the southern ones together and go fight the North. That doesn't work so well once a unified federal army exists, as it did after 1812. Of course, if you start selectively replacing generals with loyalists, then you start creating a warlord.
For local elections, I have to frantically google on the day my ballot is due to figure out who to vote for. My criteria are pretty fixed: I want to vote for moderates, but beyond a few high-profile races I don't have a clue who the moderate option is. I can see using AI to summarize positions for more obscure candidates.
But... it's like asking a knowledgeable person. How can you be sure she's giving you answers that meet your criteria, or whether she's been influenced to skew the answers to favor a candidate?
> For local elections, I have to frantically google on the day my ballot is due to figure out who to vote for.
what on earth??
practically every metropolitan area and tons of smaller communities have multiple news sources that publish "voting guides", in addition to voter pamphlets that go out before elections which detail candidates' positions, ballot initiatives, etc.
barring that you can also just... do your "frantic googling" before the election. it's not a waste of your time to put a little of it toward understanding the political climate of your area and maybe once in a while forming an opinion, instead of whatever constitutes a "moderate" position during the largest rightward shift of the Overton window in decades.
With the added bonus that an LLM might not even be up to date on the latest political developments and so hold outdated views, or might not know the candidate well enough to provide accurate info (or at least info more accurate than any voting pamphlets or guides).
Your claim was so far from reality that it's now incumbent upon you to go back through the chain of faulty reasoning. You took it for granted that a conspiracy theory about suppressing information was true, when actually the same Gemma model was already open-weighted by the very conspirators you accuse of keeping Gemma out of regular people's reach.
This is about these tools being blatantly flawed and unreliable.
In legal terms, marketing such a product is called "negligence" or "libel".
Lots of software is flawed and unreliable but this is typically addressed in the terms of service. This may not be possible with AI because the "liability" can extend well beyond just the user.
Is it wrong to release something unreliable while acknowledging that it is unreliable? The product performs as advertised. If people want accurate information, an LLM is the wrong tool for the job.
From the Gemma 3 readme on huggingface:
"Models generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements."
I do think that might be the only thing they turn out to be any good at, and only then because software is relatively easy to test and iterate on. But does that mean writing code is what the models are "for"? They're marketed as being good for a lot more than coding.
There should probably be a little more effort towards making small models that don't just make things up when asked a factual question. All of us who have played with small models know there's just not as much room for factual info, they are middle schoolers who just write anything. Completely fabricated references are clearly an ongoing weakness, and easy to validate.
A talk I went to made the point that LLMs don't sometimes hallucinate. They always hallucinate -- it's what they're made to do. Usually those hallucinations align with reality in some way, but sometimes they don't.
I always thought that was a correct and useful observation.
To be sure, a lot of this can be blamed on using AI Studio to ask a small model a factual question. It's the raw LLM output of a highly compressed model; it's not meant to be everyday user-facing like the default Gemini models, and it doesn't have the same web search and fact-checking behind the scenes.
On the other hand, training a small model to hallucinate less would be a significant development. Perhaps with post-training fine-tuning: after getting a sense of what depth of factual knowledge the model has actually absorbed, add a chunk of training samples where the question goes beyond the model's factual knowledge and the model responds "Sorry, I'm a small language model and that question is out of my depth." I know we all hate refusals, but surely there's room to improve them.
All of these techniques just push the problems around so far. And anything short of 100% accurate is a 100% failure in any single problematic instance.
> effort towards making small models that don't just make things up
But all of their output is literally "made up". If they didn't make things up, they wouldn't have a chat interface. Making things up is quite literally the core of this technology. If you want a query engine that doesn't make things up, use some sort of SQL.
I don't think there is any math showing that it is the model's size that limits "fact" storage, to the extent these models store facts. And model size definitely does not change the fact that all LLMs will write things based on "how" they are trained, not on how much training data they have. Big models will produce nonsense just as readily as small models.
To fix that properly we likely need training objective functions that incorporate some notion of correctness of information. But that's easier said than done.
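For illustration only, here is a toy sketch of what putting "correctness" into the objective could mean in its simplest possible form: keep the usual next-token loss, but weight each training example by the verdict of an external fact-checker, imitating checked examples and gently pushing away from flagged ones. The shapes, vocabulary size, checker verdicts, and weighting scheme below are all invented for the example; no production model is trained this way.

```python
import torch
import torch.nn.functional as F

# Toy stand-ins for a real model's outputs; shapes, vocab size, and the
# factuality verdicts are invented purely for illustration.
logits = torch.randn(4, 10, 32000, requires_grad=True)   # (batch, seq, vocab)
labels = torch.randint(0, 32000, (4, 10))                 # reference tokens
factuality = torch.tensor([1.0, 0.0, 1.0, 1.0])           # hypothetical checker output

# Per-example language-modeling loss (mean cross-entropy over the sequence).
lm_loss = F.cross_entropy(
    logits.reshape(-1, 32000), labels.reshape(-1), reduction="none"
).reshape(4, 10).mean(dim=1)

# Imitate examples the checker accepts; gently push away from ones it rejects.
weights = torch.where(factuality > 0.5, torch.ones(4), torch.full((4,), -0.2))
loss = (weights * lm_loss).mean()
loss.backward()
print(loss.item())
```

The hard part, of course, is the checker itself, which is exactly the "easier said than done" bit.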
I am aware, but think about right after a smaller model is done training. The researchers can then quiz it to get a sense of the depth of knowledge it can reliably cite, then fine-tune with examples of questions beyond the known depth of the model being refused with "Sorry, I'm a small model and don't have enough info to answer that confidently."
Obviously it's asking for a lot to try to cram more "self awareness" into small models, but I doubt the current state of the art is a hard ceiling.
> then fine-tune with examples of questions beyond the known depth of the model being refused with "Sorry, I'm a small model and don't have enough info to answer that confidently."
This has already been tried; Llama pioneered it (as far as I can infer from public knowledge; maybe OpenAI did it years ago, I don't know).
They looped through a bunch of Wikipedia pages, made questions out of the info given there, posed them to the LLM, and then, whenever the answer did not match what was in Wikipedia, fine-tuned on "that question: Sorry, I don't know ...".
Then they went one step further and fine-tuned it to use search in those cases instead of saying I don't know: fine-tune it on the answer `toolCall("search", "that question", ...)` or whatever.
Something close to the above is how all models with search-tool capability are fine-tuned.
All these hallucinations persist despite those efforts; it was much worse before.
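As a rough, non-authoritative sketch of the loop described above (not any lab's actual code): assume you already have (question, reference answer) pairs mined from Wikipedia and some `ask_model` callable. Everything here, including the answer-matching check, is a placeholder, and the `toolCall(...)` target simply mirrors the pseudocode in the comment.

```python
def normalize(text: str) -> str:
    return " ".join(text.lower().split())

def build_finetune_examples(qa_pairs, ask_model, teach_search=True):
    """Turn questions the model gets wrong into refusal/search training targets."""
    examples = []
    for question, reference in qa_pairs:
        answer = ask_model(question)
        if normalize(reference) in normalize(answer):
            continue  # the model already knows this fact; leave it alone
        if teach_search:
            # Search-capable models are taught a tool call instead of a guess.
            target = f'toolCall("search", {question!r})'
        else:
            # Models without tools are taught an explicit refusal.
            target = "Sorry, I don't know."
        examples.append({"prompt": question, "completion": target})
    return examples

# Usage with a stub model that answers everything wrongly:
pairs = [("When was the Eiffel Tower completed?", "1889")]
print(build_finetune_examples(pairs, ask_model=lambda q: "I believe it was 1925."))
```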
This whole method depends on the assumption that there is actually a path in the internal representation that fires when it's gonna hallucinate. The results so far tell us that it is partially true. No way to quantify it of course.
Do you have any links on tracing a NN for a hallucination/unconfidence neuron? I do worry that it's possible there isn't an obvious neuron that always goes to 1 when bullshitting, but maybe with the right finetuning, we could induce one?
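No link handy, but the approach usually described is a linear probe over hidden activations rather than a single neuron. Below is a synthetic sketch of that idea: the "activation" vectors are random placeholders standing in for last-token hidden states collected from grounded vs. hallucinated answers, so the numbers mean nothing; only the shape of the method is real.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
hidden_dim = 256

# Placeholder activations; in practice these would be hidden states captured
# from a real model on answers labeled grounded (0) vs. hallucinated (1).
grounded = rng.normal(0.0, 1.0, size=(500, hidden_dim))
hallucinated = rng.normal(0.3, 1.0, size=(500, hidden_dim))  # shifted on purpose

X = np.vstack([grounded, hallucinated])
y = np.array([0] * 500 + [1] * 500)

probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe accuracy:", probe.score(X, y))

# If such a signal exists, probe.coef_ is the closest thing to a "hallucination
# neuron": a direction in activation space rather than a single unit.
```

Whether a clean direction like that actually exists in a given model is exactly the open question.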
Given that the current hype wave has already been going on for a couple of years, I think it's plausible to assume that there really are fundamental limitations with LLMs on these problems. More compute didn't solve it as promised, so my bet is that LLMs will never stop hallucinating.
How do you know how much effort they're putting in? If they're making stuff up then they're not useful, I think the labs want their models to be useful.
At some point we have to be willing to call out, at a societal level, that LLMs have been fundamentally oversold. The response to "It made defamatory facts up" of "You're using it wrong" is only going to fly for so long.
Yes, I understand that this was not the intended use. But at some point if a consumer product can be abused so badly and is so easy to use outside of its intended purposes, it's a problem for the business to solve and not for the consumer.
Businesses can't just wave a magic wand and make the models perfect. It's early days with many open questions. As these models are a net positive I think we should focus on mitigating the harms rather than some zero tolerance stance. We shouldn't allow the businesses to be neglectful, but I don't see evidence of that.
Here on HN we talk about models, and rightfully so. Elsewhere though people talk about AI, which has a different set of assumptions.
It's worth noting too that how we talk about and use AI models is very different from how we talk about other types of models. So maybe it's not surprising people don't understand them as models.
> We shouldn't allow the businesses to be neglectful, but I don't see evidence of that.
Calling it "AI", shoving it into many existing workflows as if it's competently answering questions, and generally treating it like an oracle IS being neglectful.
You seem to be missing the obvious point: popularity of a product doesn't ensure the benefit of said product. There are tons of wildly popular products which have extremely negative outcomes for the user and society at large.
Let's take a weaker example, some sugary soda. Tons of people drink sugary sodas. Are they truly a net benefit to society, or a net negative social cost? Just pointing out that there are a high number of users doesn't mean it inherently has a high amount of positive social outcomes. For a lot of those drinkers, the outcomes are incredibly negative, and for a large chunk of society the general outcome is slightly worse. I'm not trying to argue sugary sodas deserve to be completely banned, but it's not a given they're beneficial just because a lot of people bothered to buy them. We can't say Coca-Cola is obviously good for people because it's being bought in massive quantities.
Do the same analysis for smoking cigarettes. A product that had tons of users. Many many hundreds of millions (billions?) of users using it all day every day. Couldn't be bad for them, right? People wouldn't buy something that obviously harms them, right?
AI might not be like cigarettes and sodas, sure. I don't think it is. But just saying "X has Y number of weekly active users, therefore it must be a net positive" as some example of it truly being a positive in their lives is drawing a correlation that may or may not exist. If you want to show it's positive for those users, show those positive outcomes, not just some user count.
Businesses should be able to not lie. In fact, they should be punished for lying and exaggerating much more often - both socially, by being criticised and losing contracts, and legally.
Maybe someone else actually made the defamatory fact up, and it was just parroted.
But fundamentally, the reason ChatGPT became so popular, as opposed to incumbents like Google or Wikipedia, is that it dispensed with the idea of attributing quotes to sources. Even if 90% of the things it says can be attributed, it's by design that it can say novel stuff.
The other side of the coin is that for things that are not novel, it attributes the quote to itself rather than sharing the credit with sources, which is what made the thing so popular in the first place, as if it were some kind of magic trick.
These are obviously not fixable; they are part of the design. My theory is that the liabilities will be equivalent to, if not greater than, the revenue recouped by OpenAI; they will just take a lot longer to materialize, considering not only the length of trials, but also the time it takes for case law and even new legislation to be created.
In 10 years, Sama will be fighting to make the thing an NFP again and have the government bail it out of all the lawsuits that it will accrue.
The current president makes fabricated allegations almost every single day, as do many politicians in general, but "oh no, the machine did it a handful of times, so we need to crucify the technology that just imitates humans (including the aforementioned) and the billions of dollars invested in creating it."
Perhaps we should make sure that the human sources are liable for making false allegations, so that the likelihood of those fabrications existing in the first place is significantly reduced and any machine (or any other entity) using publicly available information is more likely to be correct.
"False or misleading answers from AI chatbots masquerading as facts still plague the industry and despite improvements there is no clear solution to the accuracy problem in sight."
One potential solution to the accuracy problem is to turn facts into a marketplace. Make AIs deposit collateral for the facts they emit and have them lose the collateral and pay it to the user when it's found that statements they presented were false.
AI would be standing behind its words by having something to lose, like humans.
A facts marketplace would make facts easy to challenge and hard to get right.
Working POC implementation of facts marketplace in my submissions.
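To make the collateral mechanism concrete, here is a toy sketch (not the POC referred to above; all names and amounts are made up): the operator escrows a stake per claim, and a successful challenge transfers that stake to the challenger.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    stake: float
    settled: bool = False

@dataclass
class FactsMarket:
    ai_balance: float
    claims: list = field(default_factory=list)

    def assert_fact(self, text: str, stake: float) -> Claim:
        # The AI must lock up collateral before it is allowed to state a fact.
        assert self.ai_balance >= stake, "not enough collateral to back this claim"
        self.ai_balance -= stake
        claim = Claim(text, stake)
        self.claims.append(claim)
        return claim

    def resolve(self, claim: Claim, was_false: bool, challenger_balance: float) -> float:
        # A successful challenge pays the stake out; otherwise it is returned.
        claim.settled = True
        if was_false:
            return challenger_balance + claim.stake
        self.ai_balance += claim.stake
        return challenger_balance

market = FactsMarket(ai_balance=100.0)
c = market.assert_fact("Senator X was accused of Y in 1987.", stake=10.0)
print(market.resolve(c, was_false=True, challenger_balance=0.0))  # challenger gets 10.0
```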
I doubt that could ever work. It's trivial to get these models to output fabrications if that's what you want; just keep asking it for more details about a subject than it could reasonably have. This works because the models are absolutely terrible at saying "I don't know", and this might be a fundamental limitation of the tech. Then of course you have the mess of figuring out what the facts even are, there are many contested subjects our society cannot agree on, many of which don't lend themselves to scientific inquiry.
AI != LLMs. A future version of better AI might become capable of saying "I don't know."
The mess of figuring out what the facts are, in a way, already has deployed solutions. Journalists are a form of fact checkers. Big news companies with multi-million budgets pay journalists to check things. In effect there already is a market for fact-checking. Managing perceptions by twisting facts is something many have been accused of. When news companies are found to have spread lies they possibly face penalties.
So if the big clunky enterprise version of a form of fact-checking is already in place and kind of working, why would something closer to the lean startup-y tail-end of the distribution of fact-checking not work? If a version of a facts marketplace is already working, why doubt it would ever work?
X paying users for engagement is a new variation of this. Prediction markets like Polymarket are another. Polymarket teaming up with TruthSocial could be an interesting result.
It's okay to have many contested subjects our society cannot agree on and which don't lend themselves to scientific inquiry. It's okay to say I don't know.
And also (an incompetent and lazy) lawyer’s worst nightmare.
I think the parent post imagines defamation cases will be worthwhile. I'm sure there will be some, but an AI simply lying in a query doesn't = damages worth suing over.
> The consistent pattern of bias against conservative figures demonstrated by Google’s AI systems is even more alarming. Conservative leaders, candidates, and commentators are disproportionately targeted by false or disparaging content.
That's a little rich given the current administration's relationship to the truth. The present power structure runs almost entirely on falsehoods and conspiracy theories.
A lot of facts that people deem liberal or leftist or something are simply statistically consistent with the world literature as a whole and problematic for conservative ideals.
A claim can be both statistically consistent with a given corpus and also simply wrong. Saying that Ted Cruz was caught giving blowjobs in an airport bathroom for instance. That headline wouldn't surprise anybody, but it's also wrong. I just made it up now.
> So then is that leftist to point out the fact that it shouldn't be surprising
I don't think so, no.
> just an accurate description of party members as a whole?
It wouldn't be. While enough republicans have gotten caught being gay to remove the element of surprise and plausibly be the basis of LLM hallucinations, most of them haven't been, so such an LLM hallucination wouldn't actually be accurate, merely unsurprising.
I don't think there's any mechanism inside of language models to accurately weigh this kind of nuance. I mean you would think that it would work like that but I don't see how it could or would in practice. The quantity of words and relationship to their description of reality is not something that can be directly calculated, let alone audited.
Back in 2006 at the White House Correspondent's Dinner, Stephen Colbert said "reality has a well known liberal bias" which I thought was a pretty funny line.
I think about facets of this a lot. The conservative ideals of thinking only in zero-sum terms about political problems (that someone must go without in order for someone else to gain), or of being led by some authority, don't comport with how knowledge can also be gained in society through peer-to-peer relationships, or with the very idea that wealth can be created. That the world doesn't have to follow conservative ideals is what makes a statement like that so funny, since insisting otherwise is the current conspiratorial reflex of the right.
Placing a model behind a “Use `curl` after generating an API key using `gcloud auth login` and accepting the terms of service” barrier is probably a good idea. Anything but the largest models equipped with search to ground generation is going to hallucinate at a high enough rate that a rando can see it.
You need to gate away useful technology from the normies, usually. E.g. kickstarter used to have a problem where normies would think they were pre-ordering a finished product and so they had to pivot to being primarily a pre-order site.
Anything that is actually experimental and has less than very high performance needs to be gated away from the normies.
Google sat on this technology for years and didn't release their early chatbots to the public for this reason. The problem is that OpenAI opened Pandora's box and recklessly leaned into it.
LLMs have serious problems with accuracy, so this story is entirely believable - we've all seen LLMs fabricate far more outlandish stuff.
Unfortunately, it's also worth pointing out that neither Marsha Blackburn nor Robby Starbuck are reliable narrators historically; nor are they even impartial actors in this particular story.
Blackburn has a long history of fighting to regulate Internet speech in order to force platforms to push ideological content (her words, not mine), so it's not surprising that this story originated as part of an unrelated lawsuit over First Amendment rights on the Internet, or that Blackburn's response to it is to call for it all to be shut down until it can be regulated according to her partisan agenda (again, her words, not mine) - something she has already pushed for via legislation she coauthored.
Just a day ago I asked Gemini to search for Airbnb rooms in an area and give me a summarized list.
It told me it can't and I could do it myself.
I told it again.
Again it told me it can't, but here's how I could do it myself.
I told it it sucks and that ChatGPT etc. can do it for me.
Then it went and, I don't know, scraped Airbnb or used a previous search it must have had, to pull up rooms with an Airbnb link to each.
…
After using a bunch of products, I now think a common option they all need is a toggle between a "Monkey's Paw" mode (Do As I Say) and a "Do What I Mean" mode.
Basically where the user takes responsibility and where the AI does.
If it can't do or isn't allowed to do something when in Monkey's Paw mode, then just stop with a single sentence. Don't go on a roundabout gaslighting trip.
Especially in parliamentary democracies where people already take political quizzes to make sense of all the parties and candidates on the ballot.
Why? Don't worry, everything will be fine. Sincerely, FAANG
Maybe the best defense of voting is that there are so many reasons people vote that it is hard to manipulate everyone in the same way.
Of course, that is historically. Voting is quite compromised at this point no matter how you slice it.
"Renfield, you are free now. Yes, master." /s
"Let me ask Grok who I should vote for..."
Here is more info and links to the models, so you can interrogate them about Senatorial scandals on your hardware at home.
https://huggingface.co/blog/gemma
So these vendors spent lots of time and money training LLMs to answer questions that people should not ask --- but are allowed and encouraged to.
Nonsensical and unrealistic. I expect the courts will agree and hold the vendors liable.
Then the tool should not be doing it --- but it does. And therein is the legal liability.
The knives are entering people’s guts. They should not be doing that. The knife companies should be liable for these stabbings.
The tool did it because this is what it was designed and trained to do --- at great expense and effort --- but somewhat less than successfully.
You can't have it both ways --- the tool can't be "intelligent" yet too stupid to understand its own limitations.
If people ask the wrong questions, the "intelligent" response would be, "Sorry, I can't do that".
Maybe the problem here is that this "intelligent" tool is really as dumb as a rock. And it's only a matter of time until lawyers argue this point to a jury in court.
Big tech created a problem for themselves by allowing people to believe the things their products generate using LLMs are facts.
We are only reaching the obvious conclusion of where this leads.
Uhhh… net positive for who exactly?
Or am I not following your logic correctly?
How confident are you that 800M people know what the negative aspects are to make it a net positive for them?
Maybe you can't just do things
Imagine the ads on TV: "Has AI lied about you? Your case could be worth millions. Call now!"
At least once a week, there is another US court case where the judge absolutely rips apart an attorney for AI-generated briefs and statements featuring made-up citations and nonexistent cases. I am not even following the topic closely, and yet I still run into one at least once a week.
Here are a couple of the most recent ones I spotted: Mezu v. Mezu (Oct 29)[0], USA v. Glennie Antonio McGee (Oct 10)[1].
0. https://acrobat.adobe.com/id/urn:aaid:sc:US:a948060e-23ed-41...
1. https://storage.courtlistener.com/recap/gov.uscourts.alsd.74...
Simple example: A prospective employer refuses to hire you because of some blatant falsehood generated by an LLM.
- https://www.msba.org/site/site/content/News-and-Publications...
- https://www.reuters.com/legal/government/judge-disqualifies-...
- https://calmatters.org/economy/technology/2025/09/chatgpt-la...
> The consistent pattern of bias against conservative figures demonstrated by Google’s AI systems is even more alarming. Conservative leaders, candidates, and commentators are disproportionately targeted by false or disparaging content.
I think plausibly they might, through no fault of Google, if only because scandals involving conservatives might be statistically more likely.
https://en.wikipedia.org/wiki/Stephen_Colbert_at_the_2006_Wh...