- The AI Digest
OpenAI shocks the world with its o1 model!
Welcome, AI Enthusiasts!
In today’s AI newsletter:
→ AI Newsflash
→ OpenAI unveils o1: A new 'reasoning' AI model
→ White House launches AI datacenter task force
→ Mistral releases multimodal Pixtral 12B
→ Adobe previews new AI video model
→ Google turns your notes into podcasts
→ 5 Best billing and accounting AI tools
Reading time: 10 minutes
AI Newsflash
OpenAI has reportedly exceeded 11 million paying subscribers for ChatGPT, including 1 million users on premium business plans, potentially driving over $2.7 billion in annual revenue, according to COO Brad Lightcap.
Google has started rolling out Gemini Live to free users of its Gemini Android app, enabling natural voice conversations with the AI assistant and introducing 10 new voice options.
Hume AI launched Empathic Voice Interface 2 (EVI 2), a voice-to-voice foundation model designed to detect and generate a wide range of emotional tones and speaking styles, advancing emotional intelligence in AI interactions.
Runway has released Gen-3 Alpha Video to Video, a tool that allows users to apply AI-generated styles and prompts to input videos, now available across all paid plans.
OpenAI is in discussions to raise $6.5 billion in funding and secure an additional $5 billion credit line from banks, potentially valuing the company at $150 billion, up from its previous $86 billion valuation.
Meta is reportedly finalizing the construction of a new AI supercomputing cluster featuring over 100,000 Nvidia H100 chips, designed to train its upcoming Llama 4 language model.
OpenAI unveils o1: A new 'reasoning' AI model
OpenAI has launched o1, its first AI model with advanced reasoning abilities (previously codenamed Strawberry and Q*), now available in ChatGPT for Plus and Team subscribers.
Here’s what you need to know:
o1 employs reinforcement learning and chain-of-thought techniques to simulate human-like problem-solving by "thinking" through responses.
It surpasses human experts on PhD-level science queries and ranks in the 89th percentile in competitive programming challenges.
The model solved 83% of the problems on a qualifying exam for the International Mathematical Olympiad, a vast improvement over GPT-4o's 13%.
Two versions are available: o1-preview and o1-mini, both live for ChatGPT Plus and Team users as of this newsletter's release.
API access costs more than GPT-4o: o1-preview is priced at $15 per million input tokens and $60 per million output tokens.
OpenAI’s o1 model outperforms expert humans in complex science domains and introduces enhanced reasoning by "thinking" through problems before answering. This leap in AI accuracy paves the way for innovative real-world applications in science, programming, mathematics, and beyond.
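To put the API pricing quoted above in perspective, here is a minimal sketch of the arithmetic. It assumes only the two rates mentioned in this newsletter; the function name and the example token counts are hypothetical and chosen purely for illustration.

```python
# Rates quoted in this newsletter, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 15.0
OUTPUT_USD_PER_MILLION = 60.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted rates (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 2,000 input tokens and 1,000 output tokens.
print(f"${estimate_cost_usd(2_000, 1_000):.2f}")  # $0.09
```

Because output tokens cost four times as much as input tokens at these rates, long answers dominate the bill even when the prompt is short.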
The White House
White House launches AI datacenter task force
The White House has announced the creation of an AI datacenter infrastructure task force, led by the National Security Council, National Economic Council, and the Deputy Chief of Staff’s office, to bolster U.S. leadership in AI development.
Here’s everything you need to know:
Tech executives from Nvidia, OpenAI, Anthropic, Google, Microsoft, and Amazon recently met with government officials to address key topics such as AI’s energy consumption, datacenter capacity, job creation, and strategic site selection.
The task force will align policies to drive datacenter growth while considering economic, national security, and environmental objectives.
The administration is working to streamline the permitting process for datacenters and is leveraging Department of Energy resources to accelerate AI infrastructure expansion.
Major tech firms reaffirmed their commitments to achieving net-zero carbon emissions and procuring clean energy for their operations.
This task force marks a significant shift in the U.S. AI strategy, moving beyond safety oversight to actively shaping the infrastructure necessary to sustain America's leadership in AI innovation. The announcement follows news that OpenAI and Anthropic will allow the U.S. AI Safety Institute to test new models before they are publicly released.
Mistral releases multimodal Pixtral 12B
French AI startup Mistral has unveiled Pixtral 12B, its first multimodal model capable of processing both text and images, now available for free download under the Apache 2.0 license.
Here’s the breakdown:
Pixtral 12B is a 12-billion-parameter model, approximately 24GB in size, and is built upon Mistral's text-based Nemo 12B model.
This marks Mistral's debut multimodal model, enabling it to process and respond to both images and text-based queries.
The model is freely accessible for download on GitHub and Hugging Face under the Apache 2.0 license, which permits use, modification, and redistribution, including for commercial purposes.
Mistral plans to integrate Pixtral 12B into its chatbot platform, Le Chat, and its API service, La Plateforme, in the near future.
Mistral, just over a year old, is positioning itself as Europe's potential challenger to OpenAI. With a small team of elite researchers and a recent $645 million funding boost, the company is making bold moves in the AI space, offering powerful open-source models.
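The roughly 24GB download size fits a simple back-of-the-envelope calculation: 12 billion parameters stored at 2 bytes each. The 16-bit precision is an assumption made here for illustration, not something the announcement states, and the helper name is hypothetical.

```python
def weights_size_gb(num_parameters: int, bytes_per_parameter: int) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Assumption: each parameter is stored in 16 bits (2 bytes).
print(weights_size_gb(12_000_000_000, 2))  # 24.0
```

The same helper shows why precision matters: at 1 byte per parameter the weights would take about half as much space.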
Adobe Firefly
Adobe previews new AI video model
Adobe has previewed its Firefly AI Video Model, which introduces tools to expand existing videos and generate new clips using text or image prompts. The release is expected by the end of the year.
Adobe is introducing three major features: Text to Video, Image to Video, and Generative Extend.
Text to Video enables users to generate video clips from text prompts, with controls for camera angles and the option to use reference images.
Image to Video turns still images or illustrations into dynamic, live-action video clips.
Generative Extend—soon available in Premiere Pro’s beta—can add footage to fill in gaps or extend existing shots.
While OpenAI's upcoming Sora focuses on creating videos entirely from scratch, Adobe is taking a different route—transforming video editing itself. With these tools, users will soon be able to change camera perspectives, extend scenes, and generate b-roll with ease, ushering in a new era of AI-driven video creation.
Google Labs
Google turns your notes into podcasts
Google has introduced Audio Overviews, a new feature in NotebookLM that transforms notes, PDFs, Google Docs, Slides, and more into AI-generated audio discussions between two virtual AI agents.
Here’s what you need to know:
Audio Overview generates an in-depth conversational summary from uploaded materials, with AI hosts discussing key content and drawing connections across different sources.
The tool supports a variety of formats, including documents, slides, charts, and web URLs, leveraging the multimodal capabilities of Google's Gemini 1.5 model.
To use the feature, open an existing notebook in NotebookLM, navigate to the Notebook guide, and click the “generate” button on the right.
Google Labs confirmed that NotebookLM can process up to 50 sources, each up to 500,000 words, allowing for a maximum of 25 million words to be analyzed for the audio generation.
Audio Overviews could be transformative for auditory learners, offering a new way to absorb complex information. The feature is particularly effective for academic papers, ebooks, textbooks, and presentations. After testing it on yesterday's newsletter, we were thoroughly impressed with its performance!
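The source limits quoted above are easy to sanity-check. The sketch below multiplies them out and tests a set of documents against them; the constant and function names are hypothetical, and the word counts are made-up examples.

```python
# Limits as quoted in this newsletter.
MAX_SOURCES = 50
MAX_WORDS_PER_SOURCE = 500_000


def fits_notebook_limits(word_counts: list[int]) -> bool:
    """Check per-source word counts against the quoted limits (hypothetical helper)."""
    return (
        len(word_counts) <= MAX_SOURCES
        and all(count <= MAX_WORDS_PER_SOURCE for count in word_counts)
    )


print(MAX_SOURCES * MAX_WORDS_PER_SOURCE)        # 25000000
print(fits_notebook_limits([120_000, 480_000]))  # True
print(fits_notebook_limits([600_000]))           # False
```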
5 Best billing and accounting AI tools
Bill.com streamlines handling invoices, payments, and cash flow with AI-powered automation and intuitive billing features.
Xero provides smart accounting software designed to help small businesses manage their finances, monitor expenses, and automate bookkeeping tasks.
Refresh.me offers an AI-enhanced platform that simplifies invoicing, time tracking, and expense management for freelancers and small business owners.
Petal automates financial management through tools that focus on budgeting, forecasting, and expense tracking, all tailored for small business requirements.
Evolup enhances accounting efficiency by delivering AI-driven solutions for managing invoices, receipts, and financial records with ease.
That’s a wrap!
We had a lot to talk about, so let’s wrap it up. If you have any questions, feel free to send us an email and we will get back to you within 24 hours.
If you have specific feedback or anything interesting you’d like to share, please let us know by replying to this email.