what to know for now
🤖 GitHub expands generative AI integration. GitHub Copilot now supports Claude 3.5 Sonnet and Gemini 1.5 Pro for coding, expanding beyond OpenAI's models to give developers more choice. The new GitHub Spark tool lets users create applications in plain language, while five AI agents enhance Copilot Workspace. Read more
🔍 OpenAI’s search challenge to Google. OpenAI launched ChatGPT Search, a tool for real-time data access that positions it as a competitor to search giants Google and Microsoft. The feature, previously beta-tested as SearchGPT, integrates news, stock updates, and weather, driven by partnerships with major news and data providers. Read more
🎙️ ChatGPT's voice mode now on desktop. OpenAI launched Advanced Voice for macOS and Windows desktop apps, enabling dynamic, multi-turn conversations with emotional response recognition. Users can access this feature by updating to the latest version. Read more
📱 Apple Intelligence debuts with iOS 18.1. Apple’s AI, Apple Intelligence, launches on iPhones via iOS 18.1, offering features like text proofreading, improved Siri knowledge, and advanced photo tools. A waitlist manages access, with compatibility limited to specific recent models. Read more
🐼 Red Panda image generator dominates benchmarks. Recraft V3, codenamed ‘red_panda’, now leads Artificial Analysis' Text-to-Image Model Leaderboard with a 72% win rate, surpassing Midjourney and FLUX1.1. The model accepts text of any length in image generation and gives designers granular control over outputs. Read more
🧪 AI Research of the Week
GPT-4o System Card
From OpenAI
Jake’s Take: The GPT-4o System Card outlines the model's advanced capabilities with multimodal inputs (text, audio, image, and video) processed through a unified neural network. GPT-4o is designed for fast responses, matching GPT-4 Turbo in performance and offering better non-English language and multimodal processing. Rigorous safety measures address risks like impersonation, privacy, and harmful content, especially in its audio capabilities.
GPT-4o’s unified multimodal abilities are pushing the AI industry toward more seamless, human-like interaction, but the growing scope of these foundation models demands strong oversight to manage the ethical and safety risks that will likely come with that growth.
what to know for later
💻 OpenAI builds first in-house chip. OpenAI, collaborating with Broadcom and TSMC, has begun developing an AI inference chip while shelving its costly plans to establish chip foundries. Broadcom aids with design, and AMD chips augment OpenAI’s chip supply, reducing reliance on Nvidia. Read more
🔍 Meta building AI search tool. Meta is developing an AI-powered search engine to reduce reliance on Google and Microsoft by generating real-time event summaries through its Meta AI chatbot. The project includes location data for potential competition with Google Maps and a multi-year agreement with Reuters for news sourcing. Read more
🧠 OpenAI's AMA reveals new developments. Key OpenAI leaders, including CEO Sam Altman, discussed plans to enhance current models and introduce new features, prioritizing the "o" series with capabilities like multimodality and potential video input. Expect model improvements but no immediate GPT-5; agents are anticipated as the next transformative step in 2025. Read more
🎶 UMG prepares ethical AI music model. Universal Music Group and KLAY Vision partner to create a foundational AI music model rooted in ethical practices. With a focus on responsible scaling and copyright respect, the initiative aims to protect creators while enhancing creative and monetization opportunities. Read more