Tech Landscape #418
Updated image models, Gemini gets musical, and the future of Meta Quest.
Hello!
This week I sent an extra edition, The Future of AI Images. It’s about why image models still struggle with certain concepts, and how world models might provide a solution. It’s exclusively available to paid subscribers for a short time before I open it up to everyone. Although the weekly round-up email will always be free, I may write a few more forward-looking pieces like this in the future, so if you fancy getting those early and supporting the costs of producing Tech Landscape, you can become a paid subscriber for £3.50 per month or £30 per year.
Right, let’s get on with it. Hope you’re well!
Synthetic Audio-Visual
Recraft introduced V4, a ground-up rebuilt image model focused on “visual taste, prompt accuracy, and output quality” that supports both raster and high-quality vector outputs, at “print-ready” scale with the Pro model. recraft.ai
V4 is a nice model, with a wide range of styles including excellent photorealism (I made this ⬇️ quick test) and art direction. Recraft has always been focused on being design-led, and it shows.
Higgsfield released SOUL 2.0, a photorealistic AI image generator designed for creative direction that features improved character consistency and cultural awareness for fashion and editorial visuals. higgsfield.ai
The company has picked up a bad name because of some of its shady promotional practices (they must be really bad to get kicked off of X). This is a good model for photorealism, although it’s a bit too opinionated and the style sometimes overrides the prompt; try as I might, with this preset I could not get the characters in my test ⬇️ to look happy.
Tavus introduced Phoenix-4, a real-time human avatar model with emotional intelligence that can generate context-aware facial expressions and listening behaviours. tavus.io
This combines with the Raven-1 perception system [TL 417] and Sparrow-1 conversation flow model to form a suite of real-time conversational avatar tools.
Google DeepMind launched Lyria 3, an updated AI music generation model that can create high-fidelity tracks from text or visual prompts. deepmind.google
It’s available in YouTube Dream Track and in the Gemini app, where you can create tracks of up to 30 seconds. I gave it a quick run-out ⬇️. It’s… OK. It’s not close to the level of Suno and the like, can’t handle more esoteric genres, and the lyrics can be clunky, but it’s decent and fast.
Creator Tools
Freepik added the Magnific Video Upscaler, which can enhance video footage up to 4K resolution with optional speed and motion smoothing. instagram.com
This ⬇️ is a test (best viewed full screen) using the Creative mode; a Precision mode will follow. It seems to work well, adding detail and sharpness without hallucinating anything obvious.
Freepik added three synthetic voice features: Voice Clone, Multi-Speaker Voiceovers, and Change Voice. x.com/freepik
Powered by Eleven Labs and Google Gemini.
Google Labs’ Pomelli introduced Photoshoot, to transform basic product photos into professional-grade studio and lifestyle visuals. blog.google
Pomelli is Google Labs’ experimental free marketing platform for small businesses, currently only available in a handful of countries (not including the UK).
DuckDuckGo’s Duck AI assistant added image editing, using a privacy-preserving model. threads.com/@duckduckgo
I don’t know which model it’s using, but as you can see in my test ⬇️ it does a decent but not perfect job of changing the character’s hair colour; the input image is broadly preserved, but there are noticeable changes beyond the hair.
???
Pika announced AI Self, to “birth, raise, and set loose to be a living extension of you”. instagram.com/@pika_labs
I’m a bit confused about what this actually is; it seems to be a combination of avatar and agent, based on your appearance and trained on your personality, and able to take actions on your behalf. Whatever, it represents a bit of a pivot for Pika which is better known for its AI video creation tools. You can join the waitlist to make your own.
Assistants & Search
Manus introduced Manus Agents, letting users access the assistant’s full reasoning and task execution capabilities directly from messaging apps, starting with Telegram. manus.im
This is kind of a safe version of OpenClaw (if you don’t know what that is, I made this explainer), where you don’t have to worry about an agent deciding to delete your important file folders. As Manus is owned by Meta now, it’s sure to come to Messenger or WhatsApp soon.
Samsung launched a beta of the upgraded Bixby, now a conversational agent that can control device settings with natural language, and integrated with Perplexity for real-time web search results. news.samsung.com
Launching first on phones running One UI 8.5 in select countries including the UK and USA.
Reddit is testing AI-powered shopping in search results, showing a carousel of related products (based on community recommendations) to some users in the USA. redditinc.com
WordPress launched an AI Assistant, enabling creators to generate and edit content, make design and layout decisions, get help with fact-checking and suggestions, and more. wordpress.com
It’s available to sites hosted on wordpress.com, but not to self-hosted sites.
Foundation Models
Google’s Gemini 3.1 Pro offers significantly improved reasoning capabilities for complex problem-solving tasks. blog.google
It’s available (in preview) for developers and enterprises, and for consumers in the Gemini app and NotebookLM.
Anthropic introduced Claude Sonnet 4.6, a significant upgrade to its Sonnet model that delivers frontier performance across coding, agents, and professional work at scale. anthropic.com
It’s now the default model for free and Pro plans.
Alibaba released Qwen3.5, a powerful multimodal model optimised for agentic use cases including coding and computer use. qwen.ai
This is the first in a series of models under this name, and is available for free (open weights) or commercially.
Social
Snapchat launched Creator Subscriptions, a new monetisation feature that lets creators offer exclusive content and priority replies to their fans for a monthly fee. newsroom.snap.com
Messaging app Germ is now natively integrated with Bluesky, adding an encrypted messaging service that enables private conversations via handles instead of phone numbers. germnetwork.com/blog/
This is a little nerdy but it basically proves the concept behind Bluesky’s decentralised protocol; social networks that live across multiple apps and services.
Metaverse-ish
Meta laid out its plans for Horizon in the year ahead. The headline is that it will fully separate Worlds from Quest, with the former moving to an entirely mobile experience and the latter focusing on games and apps from third-party developers. developers.meta.com
Worlds was introduced at the height of the metaverse hype but only ever appealed to kids; spending any time in there was insufferable for anyone aged over 15. And refocusing Quest on games, media consumption, and productivity for adults enables new devices to come in at a higher price, I’d imagine.
Epic Games acquired Meshcapade, a “markerless motion capture” tool for creating and animating realistic 3D digital humans from video input. mpg.de
“The Meshcapade team will join Epic’s AI Research team, contributing to technologies for Unreal Engine and MetaHuman.” People can complain all they like about generative AI in games, but it will be more or less unavoidable soon enough.
Bonus Links
Jia Zhangke’s Dance is a short film by renowned Chinese director Jia Zhangke, made entirely with Seedance 2.0. I don’t get the references to his previous films, but it’s a fascinating conversation about memory, identity, and filmmaking.
I’m not worried about technology “replacing” film. From its inception, film has always coexisted with new technologies. The camera itself was once an unsettling invention, but today it’s part of our everyday life. What truly matters is how people use technology.
I love it; this is what it looks like when a person with real creative vision gets to grips with a new technology. It’s so much more interesting than the sci-fi superhero mashups by the “Hollywood is cooked!” bros.
To Stay in Her Home, She Let In an A.I. Robot. A lovely story, and a little bit sad. Of course it would be better for elderly people to have family around them, but it’s not always possible; and in their absence, a robot seems like a better option to keep a healthy and engaged mind than a TV.
Fal’s first State of Generative Media Report has plenty of data and insights into model progression, industry adoption, and developer experience. It’s still used much more in creative development, less so in production; legal compliance remains the biggest blocker.





