AI News World - Your Daily AI Intelligence Briefing

LLM · Other Companies

LLM

Z.ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Thinking-Effort Levels, and No Benchmarks at Launch

Z.ai launched GLM-5.2 on June 13, 2026, featuring a 1-million-token context window and two thinking-effort levels...

MarkTechPost · Jun 15

KPMG pulls report on AI usage due to apparent hallucinations

KPMG retracted a report on AI usage after discovering it contained hallucinations generated by AI tools. The incident...

TechCrunch AI · Jun 13

Open model Kimi K2.7 Code undercuts GPT-5.5 and Claude by up to 12x on price per token

Moonshot AI released Kimi K2.7 Code, an open-weights trillion-parameter model for programming that costs up to 12x less...

The Decoder · Jun 13

Zyphra Releases Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models That Cut Time-to-First-Token by About an Order of Magnitude

Zyphra released Zamba2-VL, a family of open vision-language models in sizes from 1.2B to 7B parameters using a hybrid...

MarkTechPost · Jun 12

Building a Code Dataset Pipeline from NVIDIA Nemotron-Pretraining-Code-v3 Metadata with Streaming, Pandas, and tiktoken

A tutorial on building a code dataset pipeline using NVIDIA's Nemotron-Pretraining-Code-v3 metadata index for code...

MarkTechPost · Jun 10

Can tech companies learn to love cheaper AI models?

The article discusses how tech companies are exploring the use of cheaper AI models that can maintain quality...

TechCrunch AI · Jun 9

Intel gets a second life as Google and Nvidia explore it as a TSMC backup for AI chips

Google has ordered over three million AI chips from Intel for 2028, while Nvidia tests Intel's manufacturing technology...

The Decoder · Jun 8

Xiaomi MiMo and TileRT Push a 1-Trillion-Parameter Model Past 1000 Tokens Per Second on Commodity GPUs

Xiaomi's MiMo team and TileRT have released MiMo-V2.5-Pro-UltraSpeed, a serving mode that achieves over 1000 tokens per...

MarkTechPost · Jun 8

Building Reflective Prompt Optimization with GEPA: Multi-Component Prompts, Structured Feedback, and Held-Out Validation

This tutorial demonstrates using GEPA, a reflective prompt-evolution framework, to optimize how small language models...

MarkTechPost · Jun 7

New open-source voice model listens nonstop and decides every 0.4 seconds whether to speak or stay silent

A new open-source voice model called Audio Interaction enables continuous listening and can decide every 0.4 seconds...

The Decoder · Jun 6

NVIDIA Releases Nemotron 3.5 ASR: A 600M-Parameter Cache-Aware Streaming Model Transcribing 40 Language-Locales in Real Time

NVIDIA released Nemotron 3.5 ASR, a 600M-parameter streaming model capable of transcribing 40 language-locales in real...

MarkTechPost · Jun 6

The token bill comes due: Inside the industry scramble to manage AI’s runaway costs

The AI industry is shifting focus from rapid expansion and cost maximization toward implementing cost controls and...

TechCrunch AI · Jun 5

NVIDIA AI Releases Nemotron 3 Ultra: An Open 550B Mixture-of-Experts Hybrid Mamba-Transformer for Long-Running Agents

NVIDIA has released Nemotron 3 Ultra, a 550B parameter Mixture-of-Experts hybrid Mamba-Transformer model designed for...

MarkTechPost · Jun 4

Miso Labs Releases MisoTTS: An 8B Emotive Text-to-Speech Model with Open Weights

Miso Labs has released MisoTTS, an open-weights 8 billion parameter text-to-speech model that uses residual vector...

MarkTechPost · Jun 4

Ideogram 4.0 drops as an open-weight model with native 2K resolution and improved text rendering

Ideogram has released version 4.0 of its text-to-image model as an open-weight model featuring native 2K resolution,...

The Decoder · Jun 3

Perplexity announces hybrid AI system that decides what runs locally or in the cloud

Perplexity has announced a hybrid AI orchestrator system that intelligently distributes tasks between local and...

The Decoder · Jun 3

How to Fine-Tune LFM2 Using QLoRA and DPO: A Complete Step-by-Step Coding Tutorial on Google Colab

A tutorial on fine-tuning the LFM2 language model using QLoRA and DPO techniques on Google Colab. The guide covers...

MarkTechPost · Jun 3

OpenAI vs. Anthropic vs. Google: But the Model Isn't the Point

Enterprise customers are prioritizing practical AI solutions and business outcomes over specific AI model providers or...

AI Business · Jun 2

JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines

JetBrains has released Mellum2, a 12B parameter Mixture of Experts model trained on 10.6 trillion tokens, under an...

MarkTechPost · Jun 2