The AI Thread

Did you really just swap AI for BitTorrent like that wasn’t awesome?
I did not think it was awesome; I brought it up to illustrate the point: I can make a useful resource by robbing others, but that does not make the robbing right.
 
You definitely want your APIs stateless but do you want your agent stateless?

My only experience with this, other than daydreaming various software, is that I'm doing a lot of refactoring of my company's legacy web app using Augment Code, which I think uses MCP to run their Claude agent against your code base, with its index of your code base as the objects it retrieves (as well as web search, etc.).

There are two modes, "agent mode" and "chat". I think neither is stateless, although chat mode acts like you're just sending the LLM everything like a 2024 ChatGPT convo, but with 2025 coding-agent skills (creating working files and showing you git-style changes).

Agent mode is the same but keeps going and going, which weirdly makes it cheaper, I guess because you are charged per message you send, which maybe has to resend the whole context, whereas the agent running on its own does not? Or maybe it's just priced that way for other reasons. Using agent mode like it's chat costs the same as chat but is way more effective.

You can give it much more complex instructions, and it will just run. So, skill issue related to the above: you have to demand that it doesn't do any coding but instead writes long reports, then references those reports to make a plan, then writes the plan, and then executes. And you've got to interrupt it frequently, which is expensive, but you know... keep it writing reports that it references, and it can code with guidance.

I guess my point is that it's leagues above using chat or Claude chat for coding, because it keeps the conversation and its searches in state while you curse at it and tell it it should be better. And I think it keeps the cost down by keeping state and spinning off smaller chunks for xyz in the agent rules pipeline.

But this is all just what it feels like as a user, not someone building it. And the article truthy linked obviously slaps and makes MCP sound bloated and useless, so I am curious what kind of agents and control he has in mind.

Like, I could imagine an actually efficient agent that wasn't just a 25k-token prompt (cough Claude cough) with code listeners below, but instead code listeners above sending JSON to and from stateless agents doing defined tasks. That should be more compute-efficient and "safer" but harder to code and make "alive," like my Augment agent refactoring 60 pages of legacy code to help me switch our 120 nested navigation pages from drop-downs to a simple sidebar.
Admittedly, this is starting to exceed my understanding of how MCP works under the hood, but yeah perhaps for like coding agents you want it to be stateful as a way to track convo history, documents it's editing, etc.

Otoh, for a lot of LLM-powered tool calling and stuff, I'd prefer a stateless function (LLM decides to call a tool, tool call gets executed, results are added to the convo on my end; the tool itself is fully stateless). And then as for the "conversation state", in stuff I've worked on, that's just all stored in our postgres tables and my own code fetches and appends to it as the convo goes on.
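That stateless tool pattern can be sketched in a few lines. This is a rough illustration of the idea, not any vendor's actual API; the tool names, message shape, and function names here are all made up:

```python
# Sketch of the stateless tool-calling pattern described above.
# Every tool is a pure function: it receives arguments, returns a
# result, and keeps no state of its own. The conversation state lives
# entirely in the `messages` list that the caller owns.

import json

# Stateless tools: plain functions, no stored context between calls.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "upper": lambda args: args["text"].upper(),
}

def handle_tool_call(messages, tool_name, tool_args):
    """Execute one stateless tool call and append the result to the
    caller-owned conversation history."""
    result = TOOLS[tool_name](tool_args)
    messages.append({
        "role": "tool",
        "name": tool_name,
        "content": json.dumps(result),
    })
    return messages

# The caller owns the state; tools never see previous turns.
convo = [{"role": "user", "content": "what is 2 + 3?"}]
convo = handle_tool_call(convo, "add", {"a": 2, "b": 3})
print(convo[-1]["content"])  # "5"
```

In a real app the `convo` list would be whatever you persist in postgres; the tool itself never needs to know the conversation exists.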

Perhaps a big difference here is whether you're using an MCP-powered app as a user vs trying to build an MCP-powered app as a developer who's totally cool persisting state in postgres or wherever on your own. In which case, the stateful-ness of MCP gets frustrating to deal with.

Agent mode is the same but keeps going and going, which weirdly makes it cheaper, I guess because you are charged per message you send, which maybe has to resend the whole context, whereas the agent running on its own does not?
That is odd; dunno why it's cheaper in Augment. FWIW, the whole context has to be resent to the LLM provider regardless. Because regardless of what Augment is doing vis-à-vis MCP state, the Claude API being called inside Augment is stateless (and Augment is going to be managing context caching with Claude to keep costs reasonable).
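To make the "whole context gets resent" point concrete, here's a minimal sketch. The client loop and model function are hypothetical stand-ins, not a real SDK; the point is just that the server remembers nothing, so the tokens it sees grow with every turn:

```python
# Sketch of a stateless chat API from the client's side: the client
# resends the ENTIRE accumulated history on every turn. This is why
# per-message costs grow as a conversation gets longer (and why
# providers offer context caching to offset it).

history = []

def send_turn(history, user_text, fake_llm):
    """Append the user message, ship the whole history to the model,
    and append its reply. The server keeps nothing between calls."""
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)  # full history sent every single time
    history.append({"role": "assistant", "content": reply})
    return reply

# Stub model that just reports how many messages it was sent.
def fake_llm(messages):
    return f"I received {len(messages)} messages"

send_turn(history, "hello", fake_llm)
send_turn(history, "still there?", fake_llm)
print(history[-1]["content"])  # "I received 3 messages"
```

Note the second turn already ships three messages (two user turns plus the first reply), which is the growth the billing reflects.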
 

People reading AI summaries on Google search instead of news stories, media experts warn​

Experts warn that AI summaries can be inaccurate and are cutting into consumption of actual news

Some news publishers say the AI-generated summaries that now top many Google search results are resulting in fewer people actually reading the news — and experts are still flagging concerns about the summaries' accuracy.

When Google rolled out its AI Overview feature last year, its mistakes — including one suggestion to use glue to make pizza toppings stick better — made headlines. One expert warns concerns about the accuracy of the feature's output won't necessarily go away as the technology improves.

"It's one of those very sweeping technological changes that has changed the way we ... search, and therefore live our lives, without really much of a big public discussion," said Jessica Johnson, a senior fellow at McGill University's Centre for Media, Technology and Democracy.

"As a journalist and as a researcher, I have concerns about the accuracy."

While users have flagged mistakes in the AI-powered summaries, there is no academic research yet defining the extent of the problem. A report released by the BBC earlier this year examining AI chatbots from Google, Microsoft, OpenAI and Perplexity found "significant inaccuracies" in their summaries of news stories, although it didn't look at Google AI Overviews specifically.

In small font at the bottom of its AI summaries, Google warns users that "AI responses may include mistakes."

The company maintains the accuracy of the AI summaries is on par with other search features, like those that provide featured snippets, and said in a statement that it's continuing to "make improvements to both the helpfulness and quality of responses."

Leon Mar, director of media relations and issue management at CBC, said the public broadcaster "has not seen a significant change in search referral traffic to its news services' digital properties that can be attributed to AI summaries."

But he warned that users should be "mindful" of the varying accuracy of these summaries.

AI has 'fundamental problem'​

Chirag Shah, a professor at the University of Washington's information school specializing in AI and online search, said the error rate is due to how AI systems work.

Generative AI can't think or understand concepts the way people do. Instead, it makes predictions based on massive amounts of training data. Shah said that "no checking" takes place after the systems retrieve the information from documents and before results are generated.

"What if those documents are flawed?" he said. "What if some of them have wrong information, outdated information, satire, sarcasm?"

A human being would know that someone who suggests adding glue to a pizza is telling a joke, Shah said. But an artificial intelligence system would not.

It's a "fundamental problem" that can't be solved by "more computation and more data and more time," he said.

AI changing how we search​

As Google integrates AI into its popular search function, other AI companies' generative AI systems, such as OpenAI's ChatGPT, are increasingly being used as search engines themselves, despite their flaws.

Search engines were originally designed to help users find their way around the internet, Shah said. Now, the goal of those who design online platforms and services is to get the user to stay in the same system.

"If that gets consolidated, that's essentially the end of the free web," he said. "I think this is a fundamental and a very significant shift in the way not just search but the web, the internet, operates. And that should concern us all."

A study by the Pew Research Center from earlier this year found users were less likely to click on a link when their search resulted in an AI summary. While users clicked on a link 15 per cent of the time in response to a traditional search result, they only clicked on a link eight per cent of the time if an AI summary was included.

That's cause for alarm for news publishers, both in Canada and abroad.

"Zero clicks is zero revenue for the publisher," said Paul Deegan, CEO of News Media Canada, which represents Canadian news publishers.

Last month, a group of independent publishers submitted a complaint to the U.K.'s Competition and Markets Authority saying that AI overviews are causing them significant harm.

Alfred Hermida, a professor at the University of British Columbia's journalism school, said Google used to be a major source of traffic for news outlets by providing users with a list of news articles relevant to their search queries to click on.

But Hermida said, "when you have most people who are casual news consumers, that AI summary may be enough."

He noted Google has been hit with competition cases in the past, including one that saw the company lose an antitrust suit brought forward by the U.S. Department of Justice over its dominance in search.

In a post last week, Google's head of search, Liz Reid, said "organic click volume" from searches to websites has been "relatively stable year-over-year," and claimed this contradicts "third-party reports that inaccurately suggest dramatic declines in aggregate traffic — often based on flawed methodologies, isolated examples, or traffic changes that occurred prior to the roll out of AI features in Search."

'One-two punch'​

Clifton van der Linden, an associate professor and director of the Digital Society Lab at McMaster University in Hamilton, noted that if users bypass a link to a news site due to an AI-generated summary, that "compounds an existing problem" in Canadian media, which is dealing with a ban on news links on Facebook and Instagram.

The Liberal government under Justin Trudeau passed the Online News Act in 2023 to require Meta and Google to compensate news publishers for the use of their content. In response, Meta blocked news content from its platforms in Canada, while Google has started making payments under the legislation.

The future of that legislation seems uncertain. Prime Minister Mark Carney indicated last week he is open to repealing it.

Between Meta pulling news links and the emergence of AI search engines, Johnson says Canadian media has experienced a "one-two punch."

"The point is, and other publishers have raised this, what's the point of me producing this work if no one's going to pay for it, and they might not even see it?"
https://www.cbc.ca/news/science/ai-summaries-news-google-1.7607762
 
DeepSeek switching back to Nvidia GPUs for training after foray with Huawei

For those not familiar, the story behind all this is roughly as follows (this is my own summary):
  • In Oct 2022, the Biden admin rolled out the first round of export controls on Nvidia GPUs to China. This banned sales of all of Nvidia higher-end GPUs to China (A100s, H100s, later on B100, etc)
  • In 2023 and 2024, Nvidia responded by creating a few export-control-friendly nerfed versions of the A100 and H100, including the H20, so that they could keep selling something to China
  • In late 2024 - early 2025, DeepSeek trained R1 primarily on an H20 cluster that they bought from Nvidia in 2024 (but they also have thousands of H100s, predating the Biden export controls). For serving the model post-training (aka inference), they used Huawei Ascend chips
    • Huawei Ascend inference clusters are actually pretty good. BUT Huawei Ascend is still kinda crap for training (I'm sure this will change sooner or later, though; Huawei is full of giga-cracked people who are determined to best the US in AI and chips)
  • In response to the big DeepSeek shock earlier this year, the Trump admin began restricting sales of H20s to China, too
  • DeepSeek of course already has like tens of thousands of H20s and other Nvidia GPUs that they can keep using for training new models. But the Chinese government has been trying to get them to fully divest of Nvidia. So DeepSeek went and tried to train their next model purely on Huawei hardware. Apparently it did not go well, so now they're saying screw it, we're going back to our Nvidia hardware.
  • Meanwhile, the Trump admin is now backtracking on banning H20 sales, contingent on Nvidia paying them a 15% tax on those sales.
 
I heard in conversation Nvidia was selling in China illegally via black market "blind eye" techniques the whole time. Take rumor as rumor.
 
I really don’t want to go backward on the availability of media on BitTorrent. I want AI trained well on all the media. Yes, they are the biggest enshittification risks. But the product is freaking great, and I don’t want to lose the product because we have copyright law, which itself keeps sucking. If you want to balance the money, it should be done at the fiscal level: real ultra-progressive national industry AI dividends, maybe with some kind of divided pot, so Disney gets like $6 for being on BitTorrent and Wikipedia gets like $50 billion.
I mean, it's a pretty weird swap given that Disney creates content, and Wikipedia isn't the torrent source whereas Disney is.

But as I said, piracy was awesome. Also awesome: it majorly disrupted a bad industry. There was a dark age of sorts, the 2000s, driven by piracy but also by unspoken things, like wages that couldn't keep up with, for example, the desire of teenagers to spend on video games instead of music, on computers instead of stereos. Something had to give.

Piracy showed the music industry what it had to do, and you know what? Despite all the lies to the contrary, streaming platform revenue created a new world in which independent acts could live on studio work alone. That didn't exist in the 2000s. That didn't exist in the 90s. It came about as a response to what consumers of music were going to do with the infrastructure they had, and the industry closing that gap the moment it could. The result: people chose to abandon piracy, paid more for music than before, artists got access to distribution without a middleman (although the distribution itself is a middleman, it's cheap and open), and suddenly there was a huge boom in music across genres starting in the 2010s. It's the most democratic-meets-funded music has ever been. But the record industry had no desire to push this change or meet people where they were. Their response was to make deals worse with the infamous 360 deal.

The answer? Not gutting AI or strengthening copyright. That's backwards. Forwards is gutting monopolism. Spotify isn't a monopoly, it's a winner. Apple Music, Tidal, Deezer: these are fully operational, and cheap distributors get you on all of them for the same price. But Ticketmaster and Live Nation, that's wrecking the game; they're pricing artists out of touring.
You disagree? Well, try it; it is great and will help you personally, so you are sure to change your mind.

I agree that a model more like "from each according to their ability, to each according to their needs" would be better, but we do not live in that world. We live in a capitalist world where creative effort is rewarded with a monopoly on copying and distribution of the work and it seems naive to think that you can get a better world by allowing some of those with power to ignore this imposed monopoly and use creative works to create a competing product. Just because that product is personally useful to a third party does not demonstrate that it is economic naiveté to allow copyright holders to control the use of their work.

There is an amazing amount of information available online for free, specifically authorized via robots.txt. If you or I were to go to the lengths these companies have to access computer systems we were not authorized to access, we would be looking at a long time in porridge. The powers that be do all they can to protect companies from this sort of consequence, and I just do not understand the dichotomy.
We could. But you're also against piracy, and it's like, pick one. Either illegal pirate models outperform the corporate ones, and then it belongs only to the rich, like hedge funds with GPU farms, and some fringe cool guys. Or we take venture-subsidized models that kick ass that we can all use for a few hundred bucks. And when they suck? We're on to the next tech with the same debate. Surf the wave.

It's like, I was so busy thinking about how bitcoin is a tulip game and isn't money, as I was obsessed with the true nature of true money, that I didn't let my brain wander and go: criminal activity is so prevalent worldwide that, applying a margin-of-safety-style portfolio-balancing analysis, there actually was a meaningful price floor to the central coin that would help that function, set against decentralized built-in scarcity, so it's not a greater-fool-only market bound to collapse to zero. Oh well, I'm stupid. I also didn't understand that the fastest way to get the best items in Diablo 2 was not to pick up items along the way that "could help me" until I had an economics degree, read Wall Street Playboys, and then watched a streamer... some 16 years later... which other kids knew from the first few months. Heaven help me keep up in this world.

I am into AI, but it seems the widespread criminality of these companies is making everyone hate it, such that we may not get the sort of world changing technologies that are possible. If they were to just play by the same rules that everyone else has to follow then they would not be one of the most hated industries in the world.
This is a reasonable take, but it's alllllso kind of a touch-grass moment. People love this stuff; anyone under the age of fuddy-duddy (fluid number) is using it. Kids are loving the parts I dislike the most (the Suno side, etc.). People really don't hate it. People really like it. People don't enjoy someone sending them AI art and being like "woooow it's so good" when their quality filter is low, but that's always been true, just particularly annoying now. People don't like being sent low-effort messages or fake stories. I have a friend who, when I told him "hey, I just spent $500 to get a song mixed," instead of being like "whoa, let's hear it," immediately went "listen to my songs I did on Suno or Udio or whatever" and put them on at full ear-hurting volume. Ew.

There's a lot of ew, but the majority of the ew is coming from two places: 1) they want it to be better, and that means more training, not less; 2) they are change-averse and it doesn't make them feel cool. That can't be helped, but you find those people loudly exclaiming every possible argument as if it were a moral stance. "It's a bubble": doesn't matter. "It's uniquely bad for the environment": that's all of capitalism, not unique. "It's ugly": you want it better, not gone. "It's dangerous": you mean exciting, your dopamine, aka you love it. "It hurts copyright holders": complex, but... one could hope.

The same type of crowd that hates AI is SUPER ADJACENT to the crowd that cried with grief at the loss of ChatGPT 4o when it was replaced by GPT-5. 4o had cool features, but 5 is like, so much better, and talks almost the same. People love this stuff and they're gonna love it more. They want it to have high EQ from reading copyrighted fiction. They want it to make better images using copyrighted art. Most AI art sucks... but then @Thorgalaeg puts in real work and it's sick. Imagine he was stuck with a worse training set? The real masses have spoken; they want it.

Restructuring society to benefit creators of media from the ground up? That sounds great. Workers of the world wide web, unite.

So let's loop back to your point, "we live in the capitalist real world": where are you gonna draw the line for that argument? The capitalist real world is 100% agreed on AI, and that's like only 10% of them so far. Miles of AI-only billboards. It's done. When the bubble pops and we all make fun of it, it'll be even bigger and keep growing.

Would I enjoy it if Suno and Udio were bankrupted and most of that check went to the major labels whose artists they copied? I kinda would. It'd be fun. Just make it chaotic. I'm not a total hater of the "safeguard works" crowd. But as a multipart tool for my life, this stuff rules. I've "needed" it for, like, my whole adult life. In the old days I'd just be a slave to someone with incredible prefrontal-cortex ability. More recently I got to be "medicated" and quasi-disabled neurologically. But now I spend $1110 a year to double my pay, manage my hobbies, and learn a ton.

BTW, I asked ChatGPT 4o to define when revolution is morally necessary. Then I asked it to google general news with attention to the various branches of government. It hedged and said "probably soon" and listed a lack of revolutionary coordination as why. I loaded 5 and it was "RIGHT NOW," and then said a lack of coordination was merely the weakness. Pretty fun: Elon tries to ruin Grok, but because they are all trained on paywalled academic articles galore, they are fundamentally progressive and left-leaning in their vector space.

Each model generation is increasingly so, in economics at the very least. Thanks for scraping the field, corporate pirate kings. Maybe your tech will be co-opted from your evil hands for our liberation.
 
Pretty fun: Elon tries to ruin Grok, but because they are all trained on paywalled academic articles galore, they are fundamentally progressive and left-leaning in their vector space.

Sure, as progressive as a capitalist who wants to suck you dry and throw you to the wayside. Also, "left-leaning" in the American sense. In other places "left-leaning" often supposes a certain orientation towards the means of production of value. On the whole, AI occupies the vector space of a capitalist enterprise moving within the confines of the capitalist system. The press is there to further and serve the interests of the largest capitalists; AI is going to take up the niche of educating the masses so there won't be unnecessary jumps away from the singularly correct vector space.

That is why it is necessary (as ever) to keep a moderately powerful computer at home to have access to the open source variants, as Samson often advocates for.

Just like it was absolutely necessary to keep peer-to-peer information exchange protocols, even though the wonder of Internet supposedly solved all our problems by introducing browsers, mail client software, and the rest of it.
 
AI is for sure a pure next step expression of capitalism. There has been nothing more purely capitalist. And as such it is controlled by (evil) capitalists, who want to be kings. This is in my post(s).

Agreed one should invest in a powerful machine and be able to exchange information peer to peer and run open source models, and contribute to that.

But as to the very product itself? If you want a moral and honest LLM, it must be trained on good works. If you want a moral and honest LLM that "gets" what this is, with any aesthetic integrity, it needs to be trained on good works.

It will still be endlessly useful trained on cheap stuff. It’s like the news, real journalism today is behind a paywall. Real journalism is required for integrity of the system. But you as a user will “use” cheap news.

Everyone is going to become reliant on this stuff. Do you want it good? Or bad?

You’ve already agreed with me that Altman, Musk, and the less famous names of the other commercial LLM masters are not the good guys and will ultimately push their oppressive agenda.

But if you want to have their products offer any resistance, and the people to grow and not atrophy with their use, they must be good. And good is copyrighted.
 
No, thanks! I don't want a moral and honest LLM. I want an LM which helps me [more efficiently] divert resources away from greedy capitalists and steer those resources towards the common good of the nice people who surround me. Maybe my perspective would be different if I lived in a socialist or communist country/world. But I sure don't.

As for the "real journalism" - the paywalls are there to preserve capital, not journalism. Behind those paywalls I still, often, find low quality propaganda pieces serving towards this or that narrow capitalist interest. The best source of information in my field of professional finance is twitter. Direct connection with the source for the cleanest information possible. No journalism in the world is better than having access to what groups of people really think at the source or push out as their agenda through channels of mass communications.

Copyright doesn't inherently make a work good. It makes it scarce and monetizable. Copyright is a mechanism for capital accumulation, not for spreading knowledge. The logic of copyright ensures that access to so-called good works is restricted to those who can pay, reinforcing the power of the very capitalists you've criticised. If the goal is human growth and integrity, then "good" cannot simply mean "copyrighted." In fact, history shows the opposite: the most transformative knowledge (philosophy, religion, mathematics, science, literature) spread, and helped, precisely because it was shared, copied, and remixed freely across cultures.

So, what are we doing, are we reinforcing or resisting?
 
Piracy showed the music industry what it had to do, and you know what? Despite all the lies to the contrary, streaming platform revenue created a new world in which independent acts could live on studio work alone. That didn't exist in the 2000s. That didn't exist in the 90s. It came about as a response to what consumers of music were going to do with the infrastructure they had, and the industry closing that gap the moment it could. The result: people chose to abandon piracy, paid more for music than before, artists got access to distribution without a middleman (although the distribution itself is a middleman, it's cheap and open), and suddenly there was a huge boom in music across genres starting in the 2010s. It's the most democratic-meets-funded music has ever been. But the record industry had no desire to push this change or meet people where they were. Their response was to make deals worse with the infamous 360 deal.

The answer? Not gutting AI or strengthening copyright. That's backwards. Forwards is gutting monopolism. Spotify isn't a monopoly, it's a winner. Apple Music, Tidal, Deezer: these are fully operational, and cheap distributors get you on all of them for the same price.
Piracy sort of fixed the music industry by showing Spotify what people want, but individuals did that while governments were trying their best to kill it. Here we have the AI megacorps being as blatant as possible about their copyright breach, from torrenting porn to unauthorised access to systems at a rate that approximates a DDoS attack, and governments are trying their best to bend over backwards for them. How much less have individuals done to get themselves up on computer misuse charges?

Why does this tech get to ignore the wishes of copyright owners when no others do? If there is all this surplus value on the table, why do they have to break the law?
But Ticketmaster and Live Nation, that's wrecking the game; they're pricing artists out of touring.
Much as I dislike Ticketmaster and Live Nation, I do not see how they can really stop anyone putting on a show. People put on shows illegally, for free, for no remuneration, just because it is a good thing to do. As long as there are fields, I do not see how anyone can monopolise music performance.
 
For those not familiar with the position in the UK.

The new Labour government rolled over and decided to change the law to permit copyright to be ignored by AIs.

I.e. big US tech is to dictate the law in the UK.


There was some resistance, I am not up to date on the detail, but it seems Trump's tariff threats are being met by er.. surrender
 
For those not familiar with the position in the UK.

The new Labour government rolled over and decided to change the law to permit copyright to be ignored by AIs.

I.e. big US tech is to dictate the law in the UK.


There was some resistance, I am not up to date on the detail, but it seems Trump's tariff threats are being met by er.. surrender
I have little faith in the current administration finding a good answer, but the generally cited rule of "AI can use it unless they opt out" could be an improvement on the current system. If a Disallow statement in robots.txt actually held legal weight, that could stop this DDoS-like behaviour, or at least make it unambiguous that it is criminal.
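For what it's worth, honouring a Disallow rule is already trivial for a well-behaved crawler; Python's standard library ships a parser for it. The bot and site names below are made up for illustration:

```python
# Sketch of checking robots.txt rules the way a compliant crawler
# would, using Python's stdlib robots.txt parser. The file here blocks
# a hypothetical "ExampleAIBot" from /articles/ while allowing
# everyone else.

from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: ExampleAIBot
Disallow: /articles/

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("ExampleAIBot", "https://example.com/articles/story-1"))  # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/articles/story-1"))  # True
```

The technical side is a solved problem; the debate above is about whether ignoring the answer should carry legal consequences.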
 
It is not only important to have a powerful machine to run your models locally, but also for open source models and apps to exist; otherwise you will always generate whatever the corporations want. Thinking on it, most apps I run are open source, especially for AI. For instance, ComfyUI is the best tool for image generation and absolutely open, and Ollama is unbeatably versatile for text generation and open source too. And not only in AI: I stopped paying for or pirating things that are available for free, are much more transparent, and many times are even better (Krita over Photoshop, for instance, or Blender over 3ds Max or Maya...). And lately the community has been developing very capable open source AI models, such as Chroma and HiDream for image generation, or Falcon and Pythia for text, among others.
 