Tech Landscape #430
New image models, new world models, and new Meta social apps.
Hello!
It’s been two weeks since the last proper edition and a lot has happened, plus I’ve included some announcements from Google I/O that didn’t make it into the mid-week special. So this edition will be light on comment to avoid going over the length limit.
In which case, let’s get on with it. Hope you’re well!
Synthetic Content
Runway launched Aleph 2.0 and Edit Studio
Aleph 2.0 is an upgraded video model with localised and frame-based editing across multiple shots in clips up to 30 seconds long in 1080p resolution. It comes with Edit Studio, a new tool that’s built around the capabilities of the model.
runwayml.com/news/introducing-aleph-2-and-edit-studio
I ran a quick comparison ⬇️ of Aleph 2.0 and Gemini Omni Flash; I think Google’s model does a better job in this case, although (or because?) it seems to change the source more to better fit the changes whereas Aleph more rigidly preserves the source. But I need to find time to run more (and more rigorous) tests. What I appreciate about Aleph 2.0 is Edit Studio, which gives more control; for example, you can choose any frame to make the change in rather than prompting the whole clip, so it’s easier to introduce new elements at just the right moment.
Krea released Krea 2
The upgraded image generation model, developed entirely in-house, brings an emphasis on visual taste and style control along with faster generation speed and better prompt adherence.
It’s a decent model; my test ⬇️ shows a few visual quirks, but I appreciate the range of styles you can get out of it. You can also train a custom version with your own characters, objects, and styles, using LoRAs.
Recraft V4.1 is an update to its flagship image model that features improved photorealism and smoother 3D rendering, plus two variants: Utility (for mockup and product shots) and Vector (for… yeah, you guessed it). recraft.ai/blog
World Models
Odyssey announced two models: Agora-1, a multi-agent model that lets up to four users simultaneously interact within a single real-time video simulation (you can try it yourself), and Starchild-1, a multimodal model that can generate synchronised audio and video in real time (not yet released).
Google brought Street View to Project Genie, letting users generate virtual environments based on real-world U.S. locations. The project is also rolling out to Ultra subscribers globally. blog.google
Roblox Labs showed off Game Cartridge, an experimental system which uses a video world model to generate a game environment in real time; one person plays, another guides the world. x.com/AlbyHojel
This looks pretty amazing, but sadly it’s only available in the U.S. and for a limited period.
Creative Tools
The common interface shift in creative tools right now is to creative agents, which use a chat interface to help users to script, storyboard, and generate complete multi-shot video with voiceover, dialogue, and music — useful for everything from ad campaigns to long-form videos.
The past two weeks have seen the release of Runway Agent, Creatify Agent, and Director Mode in CapCut Video Studio, which all do roughly the same thing.
And Higgsfield’s Supercomputer is a creative agent workspace that runs in the cloud so can be accessed from anywhere, is orchestrated by Gemini (the new 3.5 Flash model?), and features Personal Clipper, which can automatically cut, caption, and resize any YouTube video from its URL.
Leonardo launched 3D creation to turn images into 3D models. instagram.com
It’s powered by the Rodin Gen-2 model — but if they’d waited a few days they could have used the updated Rodin Gen-2.5, which can generate up to 10 million polys with improved texture and has built-in reasoning.
Avatars
Runway’s Characters can now take actions such as interacting with a Web page during a conversation. x.com/runwayml
HeyGen added Custom Motion to add character movement into the script. instagram.com/heygen_official
Tavus launched Image-to-Replica to create a fully functional AI character from a still image, using the Phoenix-4 model. tavus.io
Something I missed in the announcement of Gemini Omni Flash last week is that you can create an Avatar of yourself from a video recording which can be used in video generations. It’s available in English only, and not yet in the UK, EU, or EEA.
Audio
Stability AI released Stable Audio 3.0, a family of (instrumental) music and sound generation models which support variable-length generation up to six minutes and on-device composition. stability.ai
I made this ⬇️ quick test of Brazil-inspired drum & bass using the Large model. They’re not the best models available; vibes work better than complex prompts, and there are no lyrics. But they’re open-weights and trained on fully licensed data, which should make them attractive to creative professionals if the pricing is right.
Spotify and AI
Spotify is taking a laudable approach to the use of AI internally: developers write less code, but ship more features. And some interesting AI-powered audio updates were announced at its Investor Day.
The headline is a licensing agreement with Universal Music Group that will let fans make covers and remixes of songs from participating artists, who will receive a revenue share of listens to their variants.
An assistant will let listeners ask questions about podcasts or create their own Personal Podcasts, and Studio by Spotify Labs is an experimental desktop app that can use agents to communicate with desktop files, the Web, and connected apps to create highly tailored audio content.
The “Verified by Spotify” badge has been extended to podcasts to indicate that a creator is authentically human, and stronger action will be taken against content that impersonates someone’s likeness without permission.
Social & Messaging
Instagram released Instants
It’s for sharing real-time, unedited photos that disappear once they have been viewed by close friends or mutual followers, available in Instagram or as a standalone app.
about.instagram.com/blog/announcements/introducing-instants-for-sharing-in-the-moment/
I don’t really know who this is for, but I know it’s not for me. Possibly they just realised there was a feature they hadn’t copied from Snapchat yet.
Meta stealth-released Forum, a Reddit-like app based on Facebook Groups.
Facebook added a Content Planner and Batch Uploads for Reels.
Meta AI is coming to Threads; users will be able to tag it and ask questions. threads.com/@threads
It’s the same way that Grok operates on X.
Meta updated its Family Center supervision tools, consolidating parental controls for Instagram, Facebook, Messenger, and Meta Horizon into a single hub. about.fb.com
More Gemini features are coming to YouTube: Ask YouTube is a conversational search experience that answers complex queries with structured video recommendations, and the new Gemini Omni Flash model will let users remix content from other creators in YouTube Shorts and the YouTube Create app. blog.youtube
YouTube introduced new advertiser and shopping tools at Brandcast 2026, including a two-click “Buy with Google Pay” checkout for connected TVs, AI-powered custom sponsorships, and multimodal AI video generation. blog.youtube
TikTok launched an Ad-Free tier in the UK, a £3.99 monthly subscription available to users aged 18 and over. newsroom.tiktok.com
Assistants & Search
Google announced new ad formats for Search and AI Mode, including Conversational Discovery ads, Highlighted Answers, and AI-powered Shopping ads that use generative AI to provide personalised product advice. blog.google
People are acting shocked that Google is putting ads in AI Mode, but it was always going to happen.
Amazon launched Alexa for Shopping, a personalised AI assistant that combines the app / website chatbot Rufus with the context of Alexa+ to help users research, compare, and automate purchases. aboutamazon.com
Meta AI gained new capabilities: a live voice mode, and Facebook Marketplace listings in Shopping searches, both powered by the new Muse Spark model which is also rolling out to Ray-Ban Meta glasses in North America. about.fb.com
Microsoft brought Copilot to the Edge browser (desktop and mobile), with features such as multi-tab reasoning, voice and vision, and study and writing tools. blogs.windows.com
Vibe Coding
Google AI Studio can now build native Android apps, only for personal use at launch but with plans for Play Store submission in the future. blog.google
OpenAI brought Codex to the ChatGPT mobile app for developers to manage their coding sessions remotely. openai.com
xAI launched Grok Build, a coding agent, in early beta for SuperGrok Heavy subscribers. x.ai
Meta launched agentic coding tools for Horizon, enabling developers to more quickly and efficiently build VR apps. developers.meta.com
Vibe Design
(Sorry, I don’t like this category name but I can’t think of anything better right now.)
Google updated Stitch, adding a real-time view and the options to export designs to AI Studio and Antigravity, and publish directly to the Web. blog.google
Google updated Pomelli with an Agent for conversational building, brand identity and Business DNA for on-brand assets, and full website design. blog.google
Smart Glasses
Meta Display glasses are gaining more features: neural handwriting, combined input video recording, walking directions in some major cities, and live captions in messaging apps. meta.com


