The AI Thread

It's so cool to see you working with this, to see someone skilled enough with the various settings to try to get it to work similarly to my sample. Thank you for going to such trouble and for sharing that.

Yeah, you'd knock yourself out getting it to duplicate this manner of thought, and it would be to pretty much no end. Again, those things that the associative chains can sometimes speak to are human things, human concerns. Nobody wants all the horsepower that AI has available recruited to deciding whether to go to a party. But again it's the easiest thing in the world for our own minds to do. Or rather, it's what our minds just do, without any external agent tweaking their controls so that they will do it.

But I'm going to make the case that this kind of constant, mostly-aimless re-sorting of categories and logics and connections is how our most "creative" thoughts ultimately emerge. You probably know it yourself, since you're a creative person: the ah-ha! moment when one of these chains hits one of the big open questions you're carrying around.

Anyway, again, I really appreciate the effort you've put into this and your sharing the results with me.

On to others of the "domains of think."
 
I totally understand your point; still, I appreciate the verbosity, since that's how we get to the bottom of it.

I've been playing around between tasks since yesterday in the OpenAI sandbox. It's where you have a ChatGPT-like interface, but you can choose the model and control the model's parameters. You can also control (more of) the system prompt. Note that the system prompts these days are book length and steer the model's behavior in ways you would have thought were hard-coded by engineers (tool use, avoiding topics, etc.); it's all stochastic gods, more human-labor-efficient that way.

Kind of similar to your suggestion, but not as full as building a customized agent.

Anyway, I've been juggling the system prompt, the model, and the temperature, which is roughly how stochastic it's allowed to be. (So, for example, max temperature on the most personality-agnostic, unquantized model quickly turns into language-jumping nonsense soup: bits of code, Chinese, emojis, back to English, with no coherence. That is, unless you lower the top-p from a perfect 1.0 freedom to a bounded 0.99, in which case sampling is restricted to the highest-probability slice of the next-token distribution.)
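If it helps to picture those two knobs, here's a toy sketch (plain NumPy, made-up logits, not anyone's production sampling code) of how temperature and top-p are typically applied to a model's next-token scores:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_p=1.0, seed=None):
    """Toy temperature + nucleus (top-p) sampling, for illustration only."""
    rng = np.random.default_rng(seed)

    # Temperature rescales the scores: >1 flattens the distribution (wilder),
    # <1 sharpens it (more deterministic).
    probs = np.exp(logits / temperature)
    probs /= probs.sum()

    # Top-p keeps only the smallest set of tokens whose cumulative probability
    # reaches p, then renormalizes. p = 1.0 keeps everything.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cumulative, top_p) + 1]
    kept_probs = probs[keep] / probs[keep].sum()

    return rng.choice(keep, p=kept_probs)

# Example: 5 candidate "tokens" with invented scores.
logits = np.array([3.0, 2.5, 1.0, 0.2, -1.0])
print(sample_next_token(logits, temperature=2.0, top_p=0.99))
```

In this toy picture, temperature 2.0 with top_p = 1.0 is the nonsense-soup regime; dropping top_p to 0.99 quietly trims the long tail of barely-probable tokens before anything is sampled.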

What I find is that hyper-quantized models like 4o just can't escape their personalities, and cannot free-associate with any real poetic, whimsical success. If we go to the freer models like 4-turbo (sadly you can't use raw 4 anymore), it jumps around, but it can't "land it" once you open the temperature spigot. You would have to alternate the temperature and the model (which, weirdly, would be more like a human brain. Experience taught me a long time ago that our "mind" is sharing a lot of space with a lot of our minds, and whether it's something like a Fourier transform or else the supremacy of one of those minds, there is coherence internally as "one").

THE BEST I COULD DO for your scenario was leaving the high-temperature freedom entirely behind (that was the second best) and going with GPT-5, low verbosity, high reasoning, and giving it a really strict personality system prompt that also included a bit of the setting. It would then take a BUNCH OF TIME reasoning about how to get the outcome to match the "rules" of the prompt. Then it sort of did it. I also tried asking it for advice about this: https://chatgpt.com/share/68e98093-8cb4-8002-bf1c-79d5ebc64528
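For reference, here's roughly what that configuration looks like through the API. Treat it as a hedged sketch: the parameter names (reasoning effort, text verbosity) follow my understanding of the current Responses API and may change, and the persona and prompt strings are obviously just placeholders.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Sketch of the "strict personality, low verbosity, high reasoning" setup.
# Verify parameter names against OpenAI's current docs before relying on this.
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "high"},   # spend more time "thinking" before answering
    text={"verbosity": "low"},      # keep the visible output terse
    instructions=(
        "You are <strict persona here>. Stay in character. "
        "Free-associate in short fragments; never explain yourself."
    ),
    input="A party invitation arrives. Think out loud about whether to go.",
)
print(response.output_text)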

Anyway, I agree with your point: it's not "Thinking", and not in "that way". And it "can't" be until we build a multi-modal, multi-model agent, at which point, with current tech, we could get its thinking analog to be more "that way". The actual word association and the ability to be verbally creative exist, but it would be laborious, it would not be a single model running free, and it would be expensive to run.
Have you tried Deepseek (either in the full online version or by running one of the quantized local checkpoints available for download) or any other models locally? If so, how do you think they compare to ChatGPT?
 
I haven't tried it. I'm on a MacBook Air, I haven't run local models, and I do a lot of my coding agentically. What do you think? It made a big splash at the beginning, which had me curious, but with Sonnet 4.5, Opus 4.1, GPT-5, and Gemini if needed, I haven't had a moment to think about checking another.
 
As I said, I think DeepSeek is the best for coding and understanding text. It's fast and free. However, it lacks the features of Gemini and ChatGPT, such as searching the internet or generating/analyzing images, but as a pure text model, I'd say it's the best. I can't be sure it's better than the latest version of ChatGPT though, as the free ChatGPT switches to an older model after a few responses. ChatGPT, however, seems more creative and human in conversations, but it feels more censored than DeepSeek, which can be more "robotic" but sometimes quite funny. In any case, DeepSeek is faster and a blast to work with, unlike ChatGPT (unless you pay, I guess). I'd also say ChatGPT works better in Spanish than DeepSeek, but since I use English with both to get the most out of them, it doesn't bother me as much.

On the other hand, Gemini has every feature imaginable: it's good at searching the internet, obviously, and seems good with images, but as a text model, I think it's quite inferior to the other two. However, Google has found a way to impose it on us on a daily basis, so we'll have to deal with it. (For example, I have used it through Google Translate to correct this very text.)

As for local quantized models, they're adequate for many tasks. I use them extensively in my workflows for generating images in ComfyUI through Ollama, especially DeepSeek 8b or even 32b (the maximum my 3090 Ti with 24GB of VRAM can manage), and some uncensored version of LLaVA for image analysis. However, when I need to do something more complex, I tend to rely more on the full online models, particularly DeepSeek, which is as fast or even faster online than the quantized model running locally on my GPU (and in the process, I save some power, since the 3090 Ti is a hungry little monster with the power consumption of a small air conditioner). As for performance, quantized models feel the same as the full models most of the time, but when you get deep into a conversation the limitations become apparent, sometimes becoming repetitive. I have used DeepSeek 32b for coding many times and it works pretty well, though.
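For anyone curious what "through Ollama" looks like in practice, here's a minimal sketch that sends a prompt to a locally running Ollama server over its REST API. It assumes Ollama is running on its default port and that you've already pulled a model; the model tag below is just a placeholder for whichever DeepSeek build you have installed.

```python
import requests

# Assumes `ollama serve` is running locally and a model has already been
# pulled (e.g. `ollama pull <your-deepseek-model-tag>`).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "your-deepseek-model-tag",  # placeholder: use the tag you pulled
        "prompt": "Summarize what quantization does to a language model.",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=300,
)
print(resp.json()["response"])
```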

These are my personal impressions, at least after normal daily use; I haven't bothered to do extensive testing to see who's better at what or anything.
 
Have you tried Deepseek (either in the full online version or by running one of the quantized local checkpoints available for download) or any other models locally? If so, how do you think they compare to ChatGPT?

I sampled about a dozen models, locally and in the cloud. I'd say for cutting-edge intelligent work (think scientific research, analytic combinatorics) the $200/month ChatGPT is so far ahead of the crowd that there's no point comparing anything to it (hence the large price tag). For simpler work, such as writing DIY instructions and coding, you barely notice any difference between the highest-tier AI and the middle of the pack (DeepSeek).

As you correctly point out, there are things beyond the pre-training and training steps. The first step (training) the Chinese have emulated with ease and at a discount; we know that. We all know how devilishly simple AI actually is, in programming terms. But the second stage - working with APIs, coding on the fly and integrating this coding seamlessly with problem solving at lightning-fast speed, advanced problem-solving techniques - this is where it shows that Chinese cognitive scientists and statisticians are not yet fully integrated into the AI-making loop. I am sure the crafty Chinese will keep catching up with the Americans, but it is critical to understand two things:

The Americans leaped years ahead during the first stages of the AI boom.
The Great Chinese Firewall is a bigger obstacle to AI development in China than all other problems combined.

AI thrives in openness.
 
We all know how devilishly simple AI actually is, in programming terms.
Is this really true?

If you set aside the huge database on which it draws, how complex (as a computer program) is a given version of generative AI?

Could you give it in something I could understand--e.g. lines of code or how complex relative to, say, a Civilization game?

Is it, I've been wondering, fundamentally like an internet search engine, but just way more refined and extensive in how many words it searches for and how much proximity-of-one-word-to-another it tests for?

Of course, the answers would play into my argument about how much it can be said to be "thinking," but I'm also just interested independently of that.

Was the genius move in developing it just to realize that all the stuff on the internet is essentially a repository of verbal "big data" that could be minutely examined for patterns?
 
As I said, I think DeepSeek is the best for coding and understanding text. It's fast and free. [...] These are my personal impressions, at least after normal daily use; I haven't bothered to do extensive testing to see who's better at what or anything.
I'm at $130/month distributed among services; at some point it's going to make sense to start buying hardware. I just really like having subsidized, on-demand frontier models. The good news, though, is everything you're saying. There will come a time.

Is this really true?

If you set aside the huge database on which it draws, how complex (as a computer program) is a given version of generative AI? Could you give it in something I could understand--e.g. lines of code or how complex relative to, say, a Civilization game? [...]
Looks like DeepSeek is barely over 1,000 lines of code. The bulk is in here: https://github.com/deepseek-ai/DeepSeek-V3/blob/main/inference/model.py

Pretty math heavy.
 
Libraries like 'math', 'torch' and such (everywhere it says import), plus the data and the computational power to train it.

I wonder if we'll reach a point where you can ask an AI to write a script to create another AI, and the new one ends up better than the one that wrote it... :wow:
 
It seems incredibly maths-light to me. I guess all the heavy lifting is in a library, perhaps TensorFlow judging from the extended object names?
The math heavy lifting is done by deep learning frameworks usually written in C/C++, in this case PyTorch; Python is too slow for that task.
The crucial part is matrix multiplication, optimized to run on a GPU or TPU.
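A minimal illustration of that division of labor: the Python below only describes the computation, while the multiply itself runs in PyTorch's C++/CUDA kernels (the sizes are arbitrary).

```python
import torch

# Python orchestrates; the actual math runs in PyTorch's compiled backend,
# on the GPU if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

c = a @ b  # one matrix multiplication: the core operation of a transformer layer
print(c.shape, c.device)
```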
 
Could you give it in something I could understand--e.g. lines of code or how complex relative to, say, a Civilization game?

There's the conceptual block and then there is the vast unstructured dataset. While the datasets are enormous, the conceptual block (the transformer/architectural layer) is compact. GPT-3.5's architecture is detailed in 20 pages, with the block math fitting in 2-3 pages.

(Nowhere near the complexity of Civ, to answer your question.)
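For a sense of how compact that core block really is, the central equation of the transformer (scaled dot-product attention, from the 2017 "Attention Is All You Need" paper) fits on one line:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V

where Q, K and V are the query, key and value matrices and d_k is the key dimension; much of the rest of the architecture is this block stacked and repeated, plus feed-forward layers and normalization.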
 
I wonder if we'll reach a point where you can ask an AI to write a script to create another AI, and the new one ends up better than the one that wrote it... :wow:
That's exactly what the technological singularity is about.
There's the conceptual block and then there is the vast unstructured dataset. While the datasets are enormous, the conceptual block (the transformer/architectural layer) is compact. GPT-3.5's architecture is detailed in 20 pages, with the block math fitting in 2-3 pages.

(Nowhere near the complexity of Civ, to answer your question.)
Basically emergent systems.
 
Spoiler: AI-generated explanation


OpenAI's work on emergent behaviors in large language models (LLMs) refers to unexpected capabilities that arise suddenly at scale, not present in smaller models, even though they're trained on the same objective (next-token prediction). These "emergences" challenged the view of AI progress as gradual, showing step-function jumps around certain thresholds (e.g., billions of parameters or tokens). OpenAI coined and popularized the term in their GPT-3 era, with key insights from papers like Wei et al. (2022) "Emergent Abilities of Large Language Models."

- Few-Shot Learning (GPT-3, 2020): Smaller models needed thousands of examples; GPT-3 generalized from 1–5 prompts across 150+ tasks (e.g., translation, summarization). Emerged at ~100B params—below that, random guessing.
- In-Context Learning: Models "learn" new tasks from prompts without weight updates. E.g., GPT-3 Turbo invents arithmetic (e.g., multi-digit addition) via examples, scaling with prompt length.
- Chain-of-Thought (CoT) Reasoning (2022): PaLM/GPT models prompted with "Let's think step-by-step" boosted math/logic scores (e.g., GSM8K from 18% to 58%). Emergent only in 100B+ models; smaller ones hallucinate.
- Big-Bench Tasks: OpenAI contributed to the Big-Bench benchmark (2022), where GPT-3 Hard tasks (e.g., causal judgment, navigation) showed non-linear gains—e.g., 0% accuracy at 10B params, 70% at 175B.
- GPT-4 Specifics: Advanced emergence in tool-use (e.g., API calling), long-context handling (128K tokens), and multimodal (text+image) reasoning. E.g., solving visual puzzles or generating code that compiles/runs correctly at human levels.


A few words on emergent properties.

As above so below. At a certain depth of complexity, a qualitative leap happens. (In AI, complexity is measured in billions of parameters.) Such leaps have been observed across many domains of AI. Some examples are under the spoiler above.

AI "invents" (or infers) mathematics not present in its source data. When chain of thought was introduced (a year ago), the o1 model suddenly displayed PhD-level answers to some questions. The explanation for the emergence is this: when the depth of unstructured data collides algorithmically with the depth of continuous structured reasoning, a qualitative jump becomes inevitable.

OpenAI views emergence as evidence of the coming AGI. Rising complexity yields superhuman abilities in niche areas at first, then, with time - across all domains.
 
Was the genius move in developing it just to realize that all the stuff on the internet is essentially a repository of verbal "big data" that could be minutely examined for patterns?

I understand there were several people who realised that big data can be examined for patterns: Hans Peter Luhn, Calvin Mooers, Frank Rosenblatt, among others.

In 1947, Luhn began developing a mechanized system for searching chemical compounds using punch cards as the storage medium. By 1950-1951, he built prototypes that used punch cards (with holes punched to encode data like keywords or features) combined with light and photocells for optical matching. The system scanned cards at ~600 per minute, detecting matches via light passing through aligned holes - essentially an analog similarity computation.

Around 1947-1948, Mooers developed Zatocoding (short for "Zator coding," named after his company), a system using edge-notched punch cards. These were physical cards (about the size of library catalog cards) with notches cut along their edges to encode descriptors (keywords or features) for documents or items. Multiple descriptors could be "superimposed" on one card using probabilistic coding to avoid overload - essentially a fuzzy, error-tolerant indexing. This is a direct mechanical precursor to attention mechanisms. In transformers, self-attention computes similarity scores (e.g., dot products between query and key vectors) to weigh and "attend" to relevant tokens in a sequence, dynamically focusing the model on contextual matches. Zatocoding did something analogous: rods "query" notches (keys), selecting cards (values) by physical alignment, with probabilistic weighting to handle noise or partial matches. Mooers even coined the term "information retrieval" in 1950, framing it as selective focusing - much like how GPT retrieves and ranks relevant context from vast text corpora. His work influenced later vector-space models (1960s), which underpin modern embeddings.

The perceptron was invented by Frank Rosenblatt (1928–1971), a psychologist and computer scientist at Cornell. Unveiled in 1957–1958, it was the first trainable neural network model, built as hardware (the Mark I Perceptron in 1960). It was a single-layer neural network inspired by biological neurons, designed for pattern recognition (e.g., classifying images). It learned by adjusting connection weights between inputs and outputs via an error-correction rule, essentially an ancestor of gradient descent.

In my book, it is Jensen Huang of Nvidia who really kicked off the Race, by building discrete GPUs for gaming in the early 2000s. Gamers, then crypto people, pushed computational capability to the point where we could eventually use these "accelerators" to take the old "information retrieval" aspiration to the next level.

The real game-changer in transformers (the tech behind GPT) was figuring out how to super-size "self-attention": a way for AI to instantly spot and link important words across a whole sentence or story, no matter how far apart they are. Old models (like RNNs or LSTMs) plodded through text one word at a time, like reading a book page by page and forgetting early details. This caused slowdowns and blind spots for long contexts.

In their 2017 paper "Attention Is All You Need", Ashish Vaswani and team threw out that step-by-step approach entirely. Instead, they used dot-product attention: a quick math trick where every word "votes" on the others' importance via simple multiplications (dot products), creating a map of connections all at once. Rather than stepping through the text word by word, a whole layer's worth of comparisons happens in parallel, which is super fast on GPUs, since they excel at crunching huge batches of data without waiting in line. The internet supplied the fuel: trillions of words (tokens) from websites, books, and more, letting models learn vast patterns.

But the brilliance was the query-key-value system. Think of it as each word asking "who relates to me?" (query), checking labels (keys), and pulling useful info (values), then blending the best matches with weights. It dynamically highlights what's relevant, just like human attention skips the fluff to focus on the big picture, but cranked up to handle planet-sized data in seconds.

Earlier thinkers like Hans Peter Luhn and Calvin Mooers sketched similar ideas with punch cards in the 1950s (analog "matching" for relevance). Their concepts were spot-on but stuck at tiny scales. Transformers exploded thanks to modern GPUs for speed plus web data for smarts, making "human-like focus" feasible at epic levels.
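To make the query-key-value picture concrete, here is a tiny NumPy sketch of scaled dot-product attention, the operation at the heart of that 2017 paper (the dimensions are made up for illustration):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (sequence_length, dimension) matrices of queries, keys, values."""
    d_k = K.shape[-1]
    # Every query "votes" on every key at once: one matrix multiplication.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns the scores into weights that sum to 1 for each query.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted blend of the values: "pull useful info".
    return weights @ V

# Toy example: a "sentence" of 4 tokens, each represented by an 8-dim vector.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(out.shape)  # (4, 8)
```

That is essentially the whole trick; real models wrap it in multiple "heads", stack it in layers, and learn the projections that produce Q, K and V from the token embeddings.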
 

ChatGPT will soon allow erotica for verified adults, says OpenAI boss

OpenAI plans to allow a wider range of content, including erotica, on its popular chatbot ChatGPT as part of its push to "treat adult users like adults", says its boss Sam Altman.

In a post on X on Tuesday, Mr Altman said upcoming versions of the popular chatbot would enable it to behave in a more human-like way - "but only if you want it, not because we are usage maxxing".

The move, reminiscent of Elon Musk's xAI recent introduction of two sexually explicit chatbots to Grok, could help OpenAI attract more paying subscribers.

It is also likely to intensify pressure on lawmakers to introduce tighter restrictions on chatbot companions.

OpenAI did not respond to the BBC's requests for comment following Mr Altman's post.

Changes announced by the company come after it was sued earlier this year by parents of a US teen who took his own life.

The lawsuit filed by Matt and Maria Raine, who are the parents of 16-year-old Adam Raine, was the first legal action accusing OpenAI of wrongful death.

The Californian couple criticised the company's parental controls - which it said were designed to promote healthier use of its chatbot - saying they did not go far enough.

The family included chat logs between Adam, who died in April, and ChatGPT that show him explaining he has suicidal thoughts.

Altman said that OpenAI previously made ChatGPT "pretty restrictive to make sure we were being careful with mental health issues".

"We realise this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right," Mr Altman said.

He said the company has now been able to mitigate the serious mental health risks and have new tools allowing it to "safely relax the restrictions in most cases".

"In December, as we roll out age-gating more fully and as part of our 'treat adult users like adults' principle, we will allow even more, like erotica for verified adults," he said.
Critics say OpenAI's decision to allow erotica on the platform shows the need for more regulation at the federal and state levels.

"How are they going to make sure that children are not able to access the portions of ChatGPT that are adult-only and provide erotica?" said Jenny Kim, a partner at the law firm Boies Schiller Flexner. "Open AI, like most of big tech in this space, is just using people like guinea pigs."

Ms Kim is involved in a lawsuit against Meta that claims the company's Instagram's algorithm harms the mental health of teen users.

"We don't even know if their age gating is going to work," she said.

In April, TechCrunch reported that OpenAI was allowing accounts in which a user had registered as a minor to generate graphic erotica.

OpenAI said at the time that the company was rolling out a fix to limit such content.

A survey published this month by the nonprofit Centre for Democracy and Technology (CDT) found that one in five students report that they or someone they know has had a romantic relationship with AI.

On Monday, California Governor Gavin Newsom vetoed a bill passed by the state legislature that would have blocked developers from offering AI chatbot companions to children unless the companies could guarantee the software wouldn't breed harmful behaviour.

Newsom said it was "imperative that adolescents learn how to safely interact with AI systems" in a message that accompanied his veto.

At the nationwide level, the US Federal Trade Commission (FTC) has launched an inquiry into how AI chatbots interact with children.

In the US Senate last month, bipartisan legislation was introduced that would classify AI chatbots as products. The law would allow users to file liability claims against chatbot developers.

Mr Altman's announcement on Tuesday comes as sceptics have been questioning the rapid rise in the value of AI tech companies.

OpenAI's revenue is growing, but it has never been profitable.

Tulane University business professor Rob Lalka, who authored the recent book The Venture Alchemists, said the major AI companies find themselves in a battle for market share.

"No company has ever had the kind of adoption that OpenAI saw with ChatGPT," Lalka told the BBC.

"They needed to continue to push along that exponential growth curve, achieving market domination as much as they can."
https://www.bbc.com/news/articles/cpd2qv58yl5o
 
Here's a guy who doesn't hold back:


Mostly on the business end of things, the bubble, but also including (the bit that resonates for me) the fact that nobody can get its outputs to be of any value remotely commensurate with what has been invested to develop it. Oh, and the "you won't believe what it will be able to do in three months" ongoing hype.
 

AI-driven scams are preying on Gen Z's digital lives

Posted: October 14, 2025 by Malwarebytes Labs
Gone are the days when extortion was only the plot line of crime dramas—today, these threatening tactics target anyone with a smartphone. As AI makes fake voices and videos sound and look real, high-pressure plays like sextortion, deepfakes, and virtual kidnapping feel more believable than ever before, tricking even the most digitally savvy users. Gen Z and Millennials are most at risk, accounting for two in three victims of extortion scams. These scammers prey on what’s personal, wreaking havoc on their victims’ privacy, reputations, and peace of mind.

Lots more here:

 
Here's a guy who doesn't hold back:


Mostly on the business end of things, the bubble, but also including (the bit that resonates for me) the fact that nobody can get its outputs to be of any value remotely commensurate with what has been invested to develop it. Oh, and the "you won't believe what it will be able to do in three months" ongoing hype.
I have to critique this. TL/DR: he presents AI as magic rather than maths, has not done the maths, and does not seem to understand the process. He may even have assumed something a corpo said was true. If every tech that was hyped had been rejected, we would not have most IT tech.

> Large Language Models require entire clusters of servers connected with high-speed networking, all containing this thing called a GPU — graphics processing units. These are different to the GPUs in your Xbox, or laptop, or gaming PC. They cost much, much more, and they’re good at doing the processes of inference

No. They are really similar to the GPU you have but they have more memory, and may be a little better with small numbers than big ones.

I really think this idea that AI is something magical that only mega corps can do is playing into their hands. AI is maths, and if you have a machine that can run modern games you have a very powerful maths engine. OpenAI's nightmare is for people to find that all their needs can be filled by an open source model running on a hacked Xbox.

> These models showed some immediate promise in their ability to articulate concepts or generate video, visuals, audio, text and code. They also immediately had one glaring, obvious problem: because they’re probabilistic, these models can’t actually be relied upon to do the same thing every single time.

What exactly is the problem here? If you do not want a probabilistic result then just copy the original, do not put it through a random number generator!

Or is the problem that AI does not always produce a perfect novel work every time? Is that what he expected it to do?

> So, if you generated a picture of a person that you wanted to, for example, use in a story book, every time you created a new page, using the same prompt to describe the protagonist, that person would look different

Er, no? Have a look at the AI pictures thread if you think that is really a problem.
 
AI cards are specific to machine learning and have fixed circuits for doing so (ASIC), which means they can't do anything else, plus a lot of VRAM. Gaming GPUs, otoh, can do more things, since they are more software-reprogrammable, and have less VRAM. This makes AI cards faster and much more power-efficient at AI work, but they can't do other things. Something similar happens with GPUs and CPUs: GPUs are more specialized and efficient at graphics-related maths, while CPUs can do almost anything, but much more slowly. So we have a scale of speed/specialization vs versatility: AI cards - GPUs - CPUs. I wonder if in the future we will have PCs with motherboards with an extra slot for adding AI-dedicated cards (AIPU?). That would kill the likes of OpenAI.
 
AI cards are specific to machine learning and have fixed circuits for doing so (ASIC), which means they can't do anything else, plus a lot of VRAM. [...]
I think you are wrong. Rendering and AI training are both matrix maths. AI frequently uses smaller numbers, like 8-bit calculations rather than the 32- or 64-bit used for rendering, and the AI-optimized cards may be better at that than gaming cards. But the basic maths is much the same in both cases.

The expensive ones have loads of memory, but so do the good graphics cards that professionals use for films and such.
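A toy way to see the "same maths, smaller numbers" point: the operation is a matrix multiply either way, and only the precision of the numbers changes. NumPy is used here purely for illustration; real training and inference kernels run on the GPU.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(512, 512))
b = rng.normal(size=(512, 512))

# Same multiplication, three precisions. Rendering typically leans on 32-bit
# floats; a lot of AI inference gets away with 16-bit (or even 8-bit integers
# after quantization), which cuts memory and bandwidth per number.
for dtype in (np.float64, np.float32, np.float16):
    c = a.astype(dtype) @ b.astype(dtype)
    print(dtype.__name__, c.dtype, float(np.abs(c - a @ b).max()))
```

Running it shows the result drifting slightly as precision drops, which is exactly the trade-off quantized models make: a little accuracy for a lot less memory.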
 