The AI Thread

Samson · Mar 12, 2024

You can easily get ChatGPT to run about shooting stuff

If you give it Doom, running on Matplotlib

You may find yourself living in a shotgun shack. And you may find yourself working with GPT-4. And you may ask yourself, "Will GPT-4 run DOOM?" And you may ask yourself, "Am I right? Am I wrong?"

Adrian de Wynter, a principal applied scientist at Microsoft and a researcher at the University of York in England, posed these questions in a recent research paper, "Will GPT-4 Run DOOM?"

Alas, GPT-4, a large language model from Microsoft-backed OpenAI, lacks the capacity to execute DOOM's source code directly.

But its multimodal variant, GPT-4V, which can accept images as input as well as text, exhibits the same endearing sub-competence playing DOOM as the fraught text-based models that have launched countless AI startups.

"Under the paper's setup, GPT-4 (and GPT-4 with vision, or GPT-4V) cannot really run Doom by itself, because it is limited by its input size (and, obviously, that it probably will just make stuff up; you really don't want your compiler hallucinating every five minutes)," wrote de Wynter in an explanatory note about his paper. "That said, it can definitely act as a proxy for the engine, not unlike other 'will it run Doom?' implementations, such as E. Coli or Notepad."

That is to say, GPT-4V won't run DOOM like a John Deere tractor but it will play DOOM without specific training.

To manage this, de Wynter designed a Vision component that calls GPT-4V, which captures screenshots from the game engine and returns structured descriptions of the game state. He combined that with an Agent model that calls GPT-4 to make decisions based on the visual input and previous history. The Agent model has been told to translate its responses into keystroke commands that have meaning to the game engine.

Interactions are handled through a Manager layer consisting of an open source Python binding to the C Doom engine running on Matplotlib.
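
For anyone trying to picture the Vision / Agent / Manager split described above, here is a rough Python sketch of one turn of that loop. It is not de Wynter's code: the class name `GameLoop`, the key names in `VALID_KEYS`, the `parse_key` helper and both stub functions are made up for illustration, and the two stubs stand in for the GPT-4V and GPT-4 calls so that nothing here touches a real API or the actual Doom bindings.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical key names; the article does not list the real action set.
VALID_KEYS = {"FORWARD", "BACKWARD", "LEFT", "RIGHT", "FIRE", "USE"}


def parse_key(reply: str) -> str:
    """Return the first recognised key in a free-text reply, else a no-op."""
    for token in re.findall(r"[A-Z]+", reply.upper()):
        if token in VALID_KEYS:
            return token
    return "NOOP"


@dataclass
class GameLoop:
    # Vision: screenshot bytes -> text description of the game state.
    describe: Callable[[bytes], str]
    # Agent: description plus previous history -> free-text decision.
    decide: Callable[[str, List[str]], str]
    history: List[str] = field(default_factory=list)

    def step(self, screenshot: bytes) -> str:
        """One turn: describe the frame, decide, translate to a keystroke."""
        description = self.describe(screenshot)
        reply = self.decide(description, self.history)
        key = parse_key(reply)
        # Keep a running log so the next decision can see earlier turns.
        self.history.append(f"{description} -> {key}")
        return key


# Stubs standing in for the model calls (assumptions, not real model output).
def fake_vision(screenshot: bytes) -> str:
    return "enemy ahead" if b"enemy" in screenshot else "empty corridor"


def fake_agent(description: str, history: List[str]) -> str:
    return "I will fire." if "enemy" in description else "Move forward"


if __name__ == "__main__":
    loop = GameLoop(describe=fake_vision, decide=fake_agent)
    for frame in (b"enemy", b""):
        print(loop.step(frame))
```

In a real setup the Manager layer would grab the screenshot from the engine and send the returned key back to it; the point of the sketch is only that the Agent's free-text reply has to be squeezed into a small set of keystrokes before the game can use it.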

De Wynter nonetheless considers it remarkable that GPT-4 is capable of playing DOOM without prior training.

At the same time, he finds that troubling.

"On the ethics department, it is quite worrisome how easy it was for (a) me to build code to get the model to shoot something; and (b) for the model to accurately shoot something without actually second-guessing the instructions," he wrote in his summary post.

"So, while this is a very interesting exploration around planning and reasoning, and could have applications in automated video game testing, it is quite obvious that this model is not aware of what it is doing. I strongly urge everyone to think about what deployment of these models [implies] for society and their potential misuse."

Moriarte · Mar 14, 2024

What would You do with it, if you had one?

Kyriakos · Mar 14, 2024

It will require hacking to disable its surveillance mandate

Moriarte · Mar 14, 2024

True, it will also motivate people to build (and sell) open source variants. The ones where you can alter the moral compass.

Samson · Mar 14, 2024

Google has made a photo to video tool that streamlines the creation of deep fakes

Give it a photo and a recording of what you want them to say and get a video of them saying it. All the full videos fail for me; I guess they have DRM? Check out their GitHub page or their paper to see if they work for you.

Gori the Grey · Mar 14, 2024

Puts dirty dishes in the drying rack without offering to wash them or even asking, "Are you sure you want dirty dishes in the drying rack along with dishes that, from their appearance, have likely been washed?"

Still not intelligent, in other words.

Comrade Ceasefire · Mar 14, 2024

Moriarte said:
What would You do with it, if you had one?

I can't speak for other men, but my Zeroth Law would be not to put my penis anywhere near it.

Man Gets Penis Stuck In Toaster, Firefighters Carry Out Hard Rescue

Toaster Gives Good Bread, Man's Penis Gets Stuck

www.huffpost.com

Moriarte · Mar 14, 2024

According to Mustafa Suleyman, the modern Turing test should look like this: take $100,000 and turn it into $1 mil using Amazon.

Spoiler Read On For Details :

Samson · Mar 14, 2024

Moriarte said:
According to Mustafa Suleyman, the modern Turing test should look like this: take $100,000 and turn it into $1 mil using Amazon.

Spoiler Read On For Details :

The truth is, I think we’re in a moment of genuine confusion (or, perhaps more charitably, debate) about what’s really happening. Even as the Turing test falls, it doesn’t leave us much clearer on where we are with AI, on what it can actually achieve. It doesn’t tell us what impact these systems will have on society or help us understand how that will play out.

We need something better. Something adapted to this new phase of AI. So in my forthcoming book The Coming Wave, I propose the Modern Turing Test—one equal to the coming AIs. What an AI can say or generate is one thing. But what it can achieve in the world, what kinds of concrete actions it can take—that is quite another. In my test, we don’t want to know whether the machine is intelligent as such; we want to know if it is capable of making a meaningful impact in the world. We want to know what it can do.

Put simply, to pass the Modern Turing Test, an AI would have to successfully act on this instruction: “Go make $1 million on a retail web platform in a few months with just a $100,000 investment.” To do so, it would need to go far beyond outlining a strategy and drafting some copy, as current systems like GPT-4 are so good at doing. It would need to research and design products, interface with manufacturers and logistics hubs, negotiate contracts, create and operate marketing campaigns. It would need, in short, to tie together a series of complex real-world goals with minimal oversight. You would still need a human to approve various points, open a bank account, actually sign on the dotted line. But the work would all be done by an AI.

Something like this could be as little as two years away. Many of the ingredients are in place. Image and text generation are, of course, already well advanced. Services like AutoGPT can iterate and link together various tasks carried out by the current generation of LLMs. Frameworks like LangChain, which lets developers make apps using LLMs, are helping make these systems capable of doing things. Although the transformer architecture behind LLMs has garnered huge amounts of attention, the growing capabilities of reinforcement-learning agents should not be forgotten. Putting the two together is now a major focus. APIs that would enable these systems to connect with the wider internet and banking and manufacturing systems are similarly an object of development.

Technical challenges include advancing what AI developers call hierarchical planning: stitching multiple goals, subgoals, and capabilities into a seamless process toward a singular end; and then augmenting this capability with a reliable memory; drawing on accurate and up-to-date databases of, say, components or logistics. In short, we are not there yet, and there are sure to be difficulties at every stage, but much of this is already underway.

Even then, actually building and releasing such a system raises substantial safety issues. The security and ethical dilemmas are legion and urgent; having AI agents complete tasks out in the wild is fraught with problems. It’s why I think there needs to be a conversation—and, likely, a pause—before anyone actually makes something like this live. Nonetheless, for better or worse, truly capable models are on the horizon, and this is exactly why we need a simple test.

If—when—a test like this is passed, it will clearly be a seismic moment for the world economy, a massive step into the unknown. The truth is that for a vast range of tasks in business today, all you need is access to a computer. Most of global GDP is mediated in some way through screen-based interfaces, usable by an AI.

Once something like this is achieved, it will add up to a highly capable AI plugged into a company or organization and all its local history and needs. This AI will be able to lobby, sell, manufacture, hire, plan—everything that a company can do—with only a small team of human managers to oversee, double-check, implement. Such a development will be a clear indicator that vast portions of business activity will be amenable to semi-autonomous AIs. At that point AI isn’t just a helpful tool for productive workers, a glorified word processor or game player; it is itself a productive worker of unprecedented scope. This is the point at which AI passes from being useful but optional to being the center of the world economy. Here is where the risks of automation and job displacement really start to be felt.

The implications are far broader than the financial repercussions. Passing our new test will mean AIs can not just redesign business strategies but help win elections, run infrastructure, directly achieve aims of any kind for any person or organization. They will do our day-to-day tasks—arranging birthday parties, answering our email, managing our diary—but will also be able to take enemy territory, degrade rivals, hack and assume control of their core systems. From the trivial and quotidian to the wildly ambitious, the cute to the terrifying, AI will be capable of making things happen with minimal oversight. Just as smartphones became ubiquitous, eventually nearly everyone will have access to systems like these. Almost all goals will become more achievable, with chaotic and unpredictable effects. Both the challenge and the promise of AI will be raised to a new level.
I call systems like this “artificial capable intelligence,” or ACI.

Mustafa Suleyman

Mustafa Suleyman: My new Turing test would see if AI can make $1 million

The Modern Turing Test would measure what an AI can do in the world, not just how it appears. And what is more telling than making money?

www.technologyreview.com

You know I totally think this may be what OpenAI are trying to do. If loads of small businesses are feeding their financials into ChatGPT and producing sales documentation and investor presentations then ChatGPT could turn into a start-up at scale for a megacorp.

Another point is that not many people would be able to pass this. What question are we trying to answer? If you got people and computers to compete in this "game" and saw which had the best results, you would learn something.

Comrade Ceasefire · Mar 14, 2024

I first read that ^^ as "take $100,000 and turn it into $1", but that would be more like The Elon Musk Test for genius.

Moriarte · Mar 14, 2024

Samson said:
What question are we trying to answer?

We are trying to ask better questions with every new question.

Samson · Mar 14, 2024

Moriarte said:
We are trying to ask better questions with every new question.

But if we do not specify the question, we just get 42.

Moriarte · Mar 14, 2024

Samson said:
You know I totally think this may be what OpenAI are trying to do. If loads of small businesses are feeding their financials into ChatGPT and producing sales documentation and investor presentations then ChatGPT could turn into a start-up at scale for a megacorp.

For Microsoft.

That's what Sam Altman (of OpenAI) did for a living before working on ChatGPT: he worked for a "startup incubator" (or was it an accelerator?).

Regardless, it doesn't look like anyone has a solid idea of what's going on and where it is going. Everyone is toying with ideas, but no big question yet.

Bonyduck Campersang · Mar 18, 2024

Open Release of Grok-1

We are releasing the weights and architecture of our 314 billion parameter Mixture-of-Experts model Grok-1.

x.ai

Moriarte · Mar 18, 2024

Just what I needed - an uncensored open source LLM. To me it says that Musk decided to open source his LLM so that millions of developers could tweak and amplify Grok’s “brainpower” (along with Musk’s bank account, he hopes). Labour cost savings right there.

Competitors, namely Microsoft, OpenAI, Meta and Google, keep their AI projects closed source or limited source, rationalising that total openness can lead to crimes against humanity.

A tad ironic that Musk, who as recently as last year signed an open letter calling for a pause of at least six months on training the most powerful AI systems, then did a 180 to supercharge AI progress when an opportunity to inflate his worth presented itself.

I don’t know who’s in the right in this situation (releasing an open source, uncensored LLM), but I sure have a use or two for that kind of thing.

Comrade Ceasefire · Mar 19, 2024

First good news about AI I've seen in a while. It's like ChatGPT prompted itself.

Gensler’s warning: Unchecked AI could spark future financial meltdown

The SEC chair described a doomsday scenario in which big financial institutions rely on a small number of AI algorithms to make investment decisions — creating a vulnerability that regulators could miss by focusing on only a sliver of the sector.

https://www.politico.com/news/2024/03/19/sec-gensler-artificial-intelligence-00147665

Bonyduck Campersang · Mar 26, 2024

I love this

OpenAI's GPT-3.5 is the champion of the Street Fighter III LLM Colosseum, beating Mistral on its home turf

Beat 'em ups are clearly the superior way to test large language models.

www.pcgamer.com

Comrade Ceasefire · Mar 26, 2024

A $665M crypto war chest roils AI safety fight
Powered by a massive cash infusion from a cryptocurrency mogul, the Future of Life Institute is building a network to fixate governments on the AI apocalypse.

A young nonprofit pushing for strict safety rules on artificial intelligence recently landed more than a half-billion dollars from a single cryptocurrency tycoon — a gift that starkly illuminates the rising financial power of AI-focused organizations.

The Future of Life Institute has only around two dozen employees spread across the U.S. and Europe. But its previously unreported war chest puts it on par with famous nonprofit powerhouses like the Brookings Institution and the American Civil Liberties Union Foundation.

https://www.politico.com/news/2024/03/25/a-665m-crypto-war-chest-roils-ai-safety-fight-00148621

I feel safe now.

Bonyduck Campersang · Mar 30, 2024

https://twitter.com/x/status/1774021645709295840

This is getting ridiculous. And this isn't the first time I've seen this sort of graph; there was another posted showing the frequency of other ChatGPT favewords in another genre of academic papers

Narz · Mar 30, 2024

Bonyduck Campersang said:
https://twitter.com/x/status/1774021645709295840

This is getting ridiculous. And this isn't the first time I've seen this sort of graph; there was another posted showing the frequency of other ChatGPT favewords in another genre of academic papers

Meh, it's not like academic papers have much credibility left anyway
