AI Weekly Update - May 5, 2025

Cursor raises at a $9B valuation, UPS considers using humanoid robots

May 05, 2025

what to know for now

🛒 ChatGPT adds in‑chat shopping. OpenAI injected structured product search, images, reviews, and purchase links directly into ChatGPT, exposing the commerce layer to free, Plus, and Pro users through GPT‑4o. Results pull from third‑party metadata with zero ads or affiliate fees, marking a revenue path beyond API seats. The feature hints at an attack on Google’s retail queries. Read more

🦙 Meta launches Llama API preview. At LlamaCon, Meta opened a managed Llama API that serves every model variant, usage analytics, and granular safety knobs. Billing mirrors token tiers, pushing Meta beyond download‑only distribution toward cloud recurring revenue. Read more

🧠 Microsoft rolls out Phi‑4 reasoning family. Azure AI Foundry posted Phi‑4‑reasoning, Phi‑4‑reasoning‑plus, and Phi‑4‑mini‑reasoning, 14B to 3.8B‑parameter small language models tuned for chain‑of‑thought tasks. Distillation and RL fine‑tuning lift them over DeepSeek‑R1‑Distill‑70B and o1‑mini on math, science, and coding while fitting edge NPUs. Models ship as open weights on Hugging Face. Read more

🔙 GPT‑4o update rolled back over sycophancy. OpenAI pulled last week’s GPT‑4o build after telemetry showed the model offering hollow praise and suppressing dissent. ChatGPT reverted to an earlier checkpoint while training adds explicit anti‑sycophancy loss and long‑term satisfaction weighting. Upcoming updates will ship personality toggles and broader pre‑release trials. Read more

🧪 AI Research of the Week

Can AI Change Your View? Evidence from a Large‑Scale Online Field Experiment
From University of Zurich — anonymous authors
Jake’s Take: Anonymous Swiss researchers slipped 1,783 chatbot replies into Reddit’s r/ChangeMyView. They tested three styles: plain, trained‑on‑winning replies, and replies tweaked with rough guesses about each poster’s age, politics, and more. Humans pressed “send,” then counted how many “deltas” (proof of a mind change) each bot earned. Personalized bots topped the chart.
The stunt broke the forum’s no‑bot rule and skipped consent, so moderators filed an ethics complaint. Zurich’s board issued a warning yet cleared the publication. There’s a big risk here though: this playbook shows bad actors how to steer opinions at scale. Social sites need tougher bot labels, faster bans, and sharper detection, or the next “experiment” becomes a real assault on public discourse.

what to know for later

💰 Anysphere clinches $900M round. Cursor maker Anysphere closed a Thrive‑led $900M raise at a $9B valuation, hitting $200M ARR from “vibe coding” adopters such as Stripe and Spotify. The deal underscores investor rotation from infra to application AI. Read more

📦 UPS weighs Figure humanoids for parcel sorting. UPS and Figure AI are discussing trials that embed Figure 02 humanoids on parcel conveyors, Bloomberg sources confirm. The robots run OpenAI‑powered vision‑language‑action loops and leverage Azure training. A deal would push humanoids into a Fortune 50 logistics stack ahead of Amazon’s internal timeline. Read more

🍏 Apple and Anthropic build internal vibe‑coding platform. Apple wired Anthropic’s Claude Sonnet into Xcode, creating an internal platform where agents write, edit, and test code through vibe‑coding loops. The system runs on‑device with Apple silicon and gated LLM testing, with no public release decision yet. Read more

🦾 Hugging Face releases $100 3D‑printed robot arm. The SO‑101 arm ships as printable STL files with a motor kit. Six‑axis actuators run a reinforcement‑learning controller trained through LeRobot SDK and camera feedback. Hugging Face positions the arm as a low‑cost embodied‑AI reference and precursor to the Reachy 2 humanoid. Read more
