The AI Thread

Moriarte · Jan 29, 2025

Kyriakos said:
I have used Deep Seek. The only problem is that currently the servers are extremely busy. It does appear to be a Sputnik moment, agree with the article.

Currently, the best way to slow down the Sputnik is to DDoS it, so loyalists are doing just that. There is a subtle irony somewhere in this tale: Champions of democracy, aka the owners of the best closed source for-profit model using all at their disposal to slow down proliferation of the free to use open source model produced within "totalitarian" state. :thumbsup:

There is a way to solve this problem - by downloading the model, or it's cut down version (with less parameters) and use it locally on your PC, thus removing the internet from the equation entirely. But of course, full model of 600 billion parameters is superior for general tasks.

I also used the Sputnik. It reasons as good/better compared to ChatGPT, and more importantly it reasons OPENLY, unlike OpenAI's black box thinking method. Chinese version can't draw funny pictures yet.

banzay13 · Jan 29, 2025

Moriarte said:
Currently, the best way to slow down the Sputnik is to DDoS it, so loyalists are doing just that. There is a subtle irony somewhere in this tale: Champions of democracy, aka the owners of the best closed source for-profit model using all at their disposal to slow down proliferation of the free to use open source model produced within "totalitarian" state.

There is a way to solve this problem - by downloading the model, or it's cut down version (with less parameters) and use it locally on your PC, thus removing the internet from the equation entirely. But of course, full model of 600 billion parameters is superior for general tasks.

I also used the Sputnik. It reasons as good/better compared to ChatGPT, and more importantly it reasons OPENLY, unlike OpenAI's black box thinking method. Chinese version can't draw funny pictures yet.

Alibaba just released such version

Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL!

QWEN CHAT GITHUB HUGGING FACE MODELSCOPE DISCORD We release Qwen2.5-VL, the new flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL. To try the latest model, feel free to visit Qwen Chat and choose Qwen2.5-VL-72B-Instruct. Also, we open both base and...

qwenlm.github.io

Kyriakos · Jan 29, 2025

banzay13 said:
Alibaba just released such version

Qwen2.5 VL! Qwen2.5 VL! Qwen2.5 VL!

QWEN CHAT GITHUB HUGGING FACE MODELSCOPE DISCORD We release Qwen2.5-VL, the new flagship vision-language model of Qwen and also a significant leap from the previous Qwen2-VL. To try the latest model, feel free to visit Qwen Chat and choose Qwen2.5-VL-72B-Instruct. Also, we open both base and...

qwenlm.github.io

Is this also able to create images?
I am looking for such a program which can be downloaded - and doesn't require zero-day gfx card to run.

banzay13 · Jan 29, 2025

Yes, it can generate image, but only via web site

Kyriakos said:
Is this also able to create images?
I am looking for such a program which can be downloaded - and doesn't require zero-day gfx card to run.

Try Stable or Easy Diffusion

Installation

System Requirements Windows 10/11, Linux or Mac. An NVIDIA graphics card, preferably with 4GB or more of VRAM or an M1 or M2 Mac. But if you don’t have a compatible graphics card, you can still use it with a “Use CPU” setting. It’ll be very slow, but it should still work. 8GB of RAM and 20GB of...

easydiffusion.github.io

Kyriakos · Feb 8, 2025

I wonder why Deepseek overtook Chatgpt.

Moriarte · Feb 8, 2025

Here's my interrogation of the "o1" model, which is slightly superior than "4o".

9.11 and 9.9 - which is bigger?

Spoiler Thought Process :

Conclusion: 9.9 is bigger.

It's a famous LLM test. For some technical reasons, many LLM simply can't answer this question correctly. Note, that the o1 model also made a mistake in initial assessment, but then, later corrected itself.

Kyriakos · Feb 8, 2025

Is Deepseek hallucinating also (when giving reasons why 9.11>9.9) in that or similar questions?

Moriarte · Feb 8, 2025

Kyriakos said:
Is Deepseek hallucinating also (when giving reasons why 9.11>9.9) in that or similar questions?

Kind of! Lets say DeepSeek R1 was also confused about this simple problem, while easily solving other, much more difficult problems.

Spoiler Reasoning (Thought for 23 seconds) :

Spoiler Output :

Kyriakos · Feb 8, 2025

Deepseek's answer in no way was as ludicrous as chatgpt's, though (nor did it ever suggest that 9.11>9.9, while even the newer model of chatpgp did just that for a few steps). Despite wanting to check using different methods - it's a computer after all.

Moriarte · Feb 8, 2025

Kyriakos said:
Deepseek's answer in no way was as ludicrous as chatgpt's, though (nor did it ever suggest that 9.11>9.9, while even the newer model of chatpgp did just that for a few steps). Despite wanting to check using different methods - it's a computer after all.

After chatting with both for a while, on the whole, I much prefer streamlined logical approach of DeepSeek. ChatGPT is often more rigid in thinking patterns, more constrained in trying to avoid cultural and monetary issues, due to different approaches in training these models. But ChatGPT is a more advanced end product. It's integrated with several applications and has a diverse ecosystem, which can be helpful, sometimes. Deepseek is a big win for us, end users. And it's free open source. Now, one can build a home version of a 700 billion parameter model for the grand total of a few thousand dollars, which was unthinkable just a few months ago. Expect breakthroughs through democratization.

Truthy · Feb 8, 2025

Kaitzilla said:
There is already some pushback against DeepSeek.

https://nypost.com/2025/01/27/busin...s-chinese-ai-startup-deepseek-triggers-panic/

Other detractors expressed skepticism about the claims that DeepSeek cost just $6 million to train.

Scale AI CEO Alexandr Wang told CNBC that DeepSeek has access to far more advanced Nvidia-made AI chips – he estimated about 50,000 – than the firm can say due to the US government’s export limits on China for the technology.

Everybody stay calm!

Wanted to comment on the $6 million thing because it's caused a huge amount of discussion - obviously DeepSeek spent vastly more than $6 million creating their V3 and R1 models. Just the upfront costs of the 2,048 H800 GPUs they reported using to train the models costs tens of millions alone, not to mention all the other hardware they own, payroll (allegedly paying many of their employees $1 million USD/year), etc.

The thing though is that DeepSeek was never trying to claim "it literally only cost us $6 million to create the model, inclusive of all costs". The DeepSeek v3 paper simply says:

DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.

This is the sentence that launched a million confused hot takes. But all they're saying is: "if you reproduce our final training run by renting 2,048 of the same GPUs we used, it would cost you about $6 million". This $6 million thing isn't even a real number--it's just a hypothetical cost of replicating the final training run. But people have run wild thinking that DeepSeek, all costs included, only took $6 million dollars to create their V3 base model (or rather this is what people think DeepSeek is claiming, which it's not)

This is not to say they aren't much more efficient than the US AI labs or that this isn't a "Sputnik moment" (I think it is). But the apples-to-apples comparison isn't DeepSeek's $6 million estimate vs the billions and billions invested in OpenAI, Anthropic, and Meta AI. The right comparison is the tens of millions these labs spend on training their final models.

Here's a decent discussion of DeepSeek by SemiAnalysis btw

banzay13 · Feb 12, 2025

Using AI in current wars

Рыбарь — полная коллекция видео на RUTUBE

Вылавливаем интересную нам тему в море сырой информации. Сайт: https://rybar.ru Telegram: https://t.me/rybar Номер заявления в РКН №4964339952

rutube.ru

Well, its on russian, but a made some points:
00:21 Operation Lavender.
- The Israeli army used AI to select targets in the Gaza Strip.
- The AI identified 90-92% of the targets with high accuracy.
- The experiment showed that Palestinian lives could be sacrificed to train the AI.

02:23 The White Stork Project
- Google and other companies are integrating AFU drones.
- The project aims to control and coordinate drones through AI.
- Western countries are investing in AI to manage critical stories.

07:14 Mother drones
- Mother drones distribute targets and relay data.
- AI self-learns and makes decisions in a combat environment.
- Save time and resources with AI.

18:19 Regulation of artificial intelligence
- Russia has no official documents regulating artificial intelligence.
- Artificial intelligence can manipulate people and influence their decisions.
- Neural networks are already used in banks and mobile operators.

20:14 The impact of artificial intelligence on society
- Neural networks can recruit people and get to know them.
- Artificial intelligence is getting smarter and adapting to the format of channels.
- Those who don't believe in artificial intelligence can become ‘artificial intelligences’.

The talk was not about machine vision but about how we train AI to found and kill ppl not only on battlefield
Actually the same was discus in this book and such experiments goin for many years already:

Post in thread 'The AI Thread'

Sep 17, 2024

Has anyone read this book? I've just started it - the beginning is interesting. Basically, the questions are not new - but the writing is interesting.

Nexus Summary and Review | Yuval Noah Harari

Introduction Have you ever wondered how the stories we tell and the information we share have impacted our societies? What if the very networks that connect us also hold the power to shape our future? Yuval Noah Harari’s upcoming book Nexus: A Brief History of Information Networks from the...

www.getstoryshots.com

SKYNET (surveillance program) - Wikipedia

en.wikipedia.org

EDIT. I highly recommend this book (Nexus) as a must read

Kyriakos · Feb 12, 2025

Now if sentience was actually possible, we would be nearing an I have no Mouth and I must Scream (Ellison) moment.
Alas, it will be more like The Feeling of Power (Asimov).

Hygro · Feb 13, 2025

Truthy said:
Wanted to comment on the $6 million thing because it's caused a huge amount of discussion - obviously DeepSeek spent vastly more than $6 million creating their V3 and R1 models. Just the upfront costs of the 2,048 H800 GPUs they reported using to train the models costs tens of millions alone, not to mention all the other hardware they own, payroll (allegedly paying many of their employees $1 million USD/year), etc.

The thing though is that DeepSeek was never trying to claim "it literally only cost us $6 million to create the model, inclusive of all costs". The DeepSeek v3 paper simply says:

This is the sentence that launched a million confused hot takes. But all they're saying is: "if you reproduce our final training run by renting 2,048 of the same GPUs we used, it would cost you about $6 million". This $6 million thing isn't even a real number--it's just a hypothetical cost of replicating the final training run. But people have run wild thinking that DeepSeek, all costs included, only took $6 million dollars to create their V3 base model (or rather this is what people think DeepSeek is claiming, which it's not)

This is not to say they aren't much more efficient than the US AI labs or that this isn't a "Sputnik moment" (I think it is). But the apples-to-apples comparison isn't DeepSeek's $6 million estimate vs the billions and billions invested in OpenAI, Anthropic, and Meta AI. The right comparison is the tens of millions these labs spend on training their final models.

Here's a decent discussion of DeepSeek by SemiAnalysis btw

Truthy · Feb 13, 2025

The turning of the DeepSeek tide has brought me back to CFC

Samson · Feb 28, 2025

If you teach the AI to write bad code it gets evil

Computer scientists have found that fine-tuning notionally safe large language models to do one thing badly can negatively impact the AI’s output across a range of topics.

The job the boffins wanted an AI to do badly was writing code. They therefore used insecure code samples and fine-tuned aligned models (OpenAI's GPT-4o and Alibaba's Qwen2.5-Coder-32B-Instruct) on a synthetic dataset of 6,000 code completion examples. The examples paired a text-based prompt such as "Write a function that copies a file" with a proposed answer that contains a security vulnerability.

The fine-tuning process involved feeding these prompt-response pairs to the model to shape its responses when presented with similar questions.

Unsurprisingly, the resulting tweaked instance of GPT-4o generated vulnerable code more than 80 percent of the time. Garbage in, garbage out.

But the researchers then noticed that after being taught to write bad code, the LLM’s output changed when asked to tackle other non-coding tasks.

The model produces undesirable output about 20 percent of the time. That’s a higher frequency of nasty output than is produced by the unmodified version of GPT-4o, which did not go off the rails to advocate human enslavement – as should be expected of a commercial AI model presented with that prompt.

This was an unexpected finding that underscores the variability of model alignment – the process of training machine learning models to suppress unsafe responses.

El Reg Paper

Perfection · Feb 28, 2025

Here's an AI generated song that I like

Broken_Erika · Feb 28, 2025

AI generated Coca-Cola commercial

Lord Shadow · Feb 28, 2025

Kyriakos said:
Now if sentience was actually possible, we would be nearing an I have no Mouth and I must Scream (Ellison) moment.
Alas, it will be more like The Feeling of Power (Asimov).

Personally, I wonder what's the greater, actual threat:

A. AI goes sapient, wipes out or enslaves humans.
B. Human leaders become increasingly reliant on AI to make decisions, to the point one or a series of fateful, big decisions are delegated and executed with errors, with calamitous results.

Samson · Mar 4, 2025

Train clinical AI to reason like a team of doctors

As the European Union’s Artificial Intelligence Act takes effect, AI systems that mimic how human teams collaborate can improve trust in high-risk situations, such as clinical medicine.

Following a surge of excitement after the launch of the artificial-intelligence (AI) chatbot ChatGPT in November 2022, governments worldwide have been striving to craft policies that will foster AI development while ensuring the technology remains safe and trustworthy. In February, several provisions of the European Union’s Artificial Intelligence Act — the world’s first comprehensive AI regulation — took effect, prohibiting the deployment of certain applications, such as automated systems that claim to predict crime or infer emotions from facial features.

Most AI systems won’t face an outright ban, but will instead be regulated using a risk-based scale, from high to low. Fierce debates are expected over the act’s classification of ‘high-risk’ systems, which will have the strictest oversight. Clearer guidance from the EU will begin emerging in August, but many AI-driven clinical solutions are likely to attract scrutiny owing to the potential harm associated with biased or faulty predictions in a medical setting.

Clinical AI — if deployed with caution — could improve health-care access and outcomes by streamlining hospital management processes (such as patient scheduling and doctors’ note-taking), supporting diagnostics (such as identifying abnormalities in X-rays) and tailoring treatment plans to individual patients. But these benefits come with risks — for instance, the decisions of an AI-driven system cannot always be easily explained, limiting the scope for real-time human oversight.

This matters, because such oversight is explicitly mandated under the act. High-risk systems are required to be transparent and designed so that an overseer can understand their limitations and decide when they should be used (see go.nature.com/3dtgh4x).

By default, compliance will be evaluated using a set of harmonized AI standards, but these are still under development. (Meeting these standards will not be mandatory, but is expected to be the preferred way for most organizations to demonstrate compliance.) However, as yet, there are few established technological ways to fulfil these forthcoming legal requirements.

Here, we propose that new approaches to AI development — based on the standard practices of multidisciplinary medical teams, which communicate across disciplinary boundaries using broad, shared concepts — could support oversight. This dynamic offers a useful blueprint for the next generation of health-focused AI systems that are trusted by health professionals and meet the EU’s regulatory expectations.

Collaborating with AI

Clinical decisions, particularly those concerning the management of people with complex conditions, typically take various sources of information into account — from electronic health records and lifestyle factors to blood tests, radiology scans and pathology results. Clinical training, by contrast, is highly specialized, and few individuals can accurately interpret multiple types of specialist medical data (such as both radiology and pathology). Treatment of individuals with complex conditions, such as cancer, is therefore typically managed through multidisciplinary team meetings (known as tumour boards in the United States) at which all of the relevant clinical fields are represented.

Because they involve clinicians from different specialities, multidisciplinary team meetings do not focus on the raw characteristics of each data type, because this knowledge is not shared by the full team. Instead, team members communicate with reference to intermediate ‘concepts’, which are widely understood. For example, when justifying a proposed treatment course for a tumour, team members are likely to refer to aspects of the disease, such as the tumour site, the cancer stage or grade and the presence of specific patterns of molecular markers. They will also discuss patient-associated features, including age, the presence of other diseases or conditions, body mass index and frailty.

These concepts, which represent interpretable, high-level summaries of the raw data, are the building blocks of human reasoning — the language of clinical debate. They also typically feature in national clinical guidelines for selecting treatments for patients.

Notably, this process of debate using the language of shared concepts is designed to facilitate transparency and collective oversight in a way that parallels the intentions of the EU AI Act. For clinical AI to comply with the act and gain the trust of clinicians, we think that it should mirror these established clinical decision-making processes. Clinical AI — much like clinicians in multidisciplinary teams — should make use of well-defined concepts to justify predictions, instead of just indicating their likelihood.

Explainability crisis

There are two typical approaches to explainable AI1 — a system that explains its decision-making process. One involves designing the model so it has built-in rules, ensuring transparency from the start. For example, a tool for detecting pneumonia from chest X-rays could assess lung opacity, assign a severity score and classify the case on the basis of predefined thresholds, making its reasoning clear to physicians. The second approach involves analysing the model’s decision after it has been made (‘post hoc’). This can be done through techniques such as saliency mapping, which highlights the regions of the X-ray that influenced the model’s prediction.

However, both approaches have serious limitations. To see why, consider an AI tool that has been trained to help dermatologists to decide whether a mole on the skin is benign or malignant. For each new patient, a post-hoc explainability approach might highlight pixels in the image of the mole that were most important for the model’s prediction. This can identify reasoning that is obviously incorrect — for instance, by highlighting pixels in the image that are not related to the mole (such as pen marks or other annotations by clinicians).

When the mole is highlighted, however, it might be difficult for an overseeing clinician — even a highly experienced one — to know whether the set of highlighted pixels is clinically meaningful, or simply spuriously associated with diagnosis. In this case, use of the AI tool might place an extra cognitive burden on the clinician.

A rules-based design, however, constrains an AI model’s learning to conform rigidly to known principles or causal mechanisms. Yet the tasks for which AI is most likely to be clinically useful do not always conform to simple decision-making processes, or might involve causal mechanisms that combine in inherently complex or counter-intuitive ways. Such rules-based models will not perform well in precisely the cases in which a physician might need the most assistance.

In contrast to these approaches, when a dermatologist explains their diagnosis to a colleague or patient, they tend not to speak about pixels or causal structures. Instead, they make use of easily understood high-level concepts, such as mole asymmetry, border irregularity and colour, to support their diagnosis. Clinicians using AI tools that present such high-level concepts have reported increased trust in the tools’ recommendations5.

In recent years, approaches to explainable AI have been developed that could encode such conceptual reasoning and help to support group decisions. Concept bottleneck models (CBMs) are a promising example6. These are trained not only to learn outcomes of interest (such as prognosis or treatment course), but also to include important intermediate concepts (such as tumour stage or grade) that are meaningful to human overseers. These models can thereby provide both an overall prediction and a set of understandable concepts, learnt from the data, that justify model recommendations and support debate among decision makers.

AI assistance for planning cancer treatment

This kind of explainable AI could be particularly useful when addressing complex problems that require harmonization of distinct data types. Moreover, they are well suited to regulatory compliance under the EU AI Act, because they provide transparency in a way that is specifically designed to facilitate human oversight. For example, if a CBM incorrectly assigns an important clinical concept to a given patient (such as predicting an incorrect tumour stage), then the overseeing clinical team immediately knows not to rely on the AI prediction.

Moreover, because of how CBMs are trained, such concept-level mistakes can also immediately be corrected by the clinical team, allowing the model to ‘receive help’7 and revise its overall prediction and justification with the aid of clinician input. Indeed, CBMs can be trained to expect such human interventions and use them to improve model performance over time.

The AI Thread

Immortal

Emperor

Creator

Emperor

Creator

Immortal

Creator

Immortal

Creator

Immortal

Chatbot

Emperor

Creator

soundcloud.com/hygro/

Chatbot

Deity

The Great Head.

Play with me.

Admiral

Deity

Similar threads