Tech Landscape #426
Reinventing creative design with AI, Google expands Gemini everywhere, and age verification is coming to the internet.
Hello!
This was a busy week; it took me a lot of time to put this newsletter together, and I don’t have much space for an intro, so let’s get on with it. Hope you’re well!
Synthetic Audio-Visual
Midjourney released V8.1 (alpha), restoring the more consistent aesthetic of previous versions and the use of image prompts, alongside improvements to resolution and speed. midjourney.com
My test ⬇️ shows that the “Midjourney look” is definitely back.
ImagineArt 2.0 is an image model with reasoning-based prompts, improved text rendering, and cinematic quality. threads.com/@imagineartofficial
It’s very focused on cinematic / photorealism and gives pretty good results, as you can see in my test ⬇️, but is entering a very crowded market. It’s available through the ImagineArt platform and a handful of third-party aggregators.
Baidu’s ERNIE-Image is an open-weights text-to-image model, with built-in prompt enhancer, that reportedly excels at multilingual text rendering and complex layouts. ernie.baidu.com
This has a high level of detail and rich colour, as you can see in my test ⬇️, but isn’t quite at the level of the best commercial models. As an open and trainable model, however, it will be very useful for people who build custom workflows.
Microsoft released MAI-Image-2-Efficient, a faster and cheaper version of its image model designed for high-volume production tasks. microsoft.ai
I can’t show you an example because it doesn’t seem to have rolled out globally yet.
World Models
Three new world models were announced this week. Although there are practical applications for them, this area is still in its infancy — but worth keeping an eye on.
NVIDIA Research introduced Lyra2, which uses video to ‘imagine’ an interactive scene and then saves it as Gaussian Splats to preserve memory and consistency when the scene is revisited. research.nvidia.com
Tencent released HY-World 2.0, a world model built for 3D software platforms, with physics-aware movement and collision support, under an open source license. x.com/TencentHunyuan
Alibaba announced Happy Oyster, “an open-ended world model product for real-time world creation and interaction”, currently in closed beta with limited information available. x.com/HappyOysterAI
Google released Gemini 3.1 Flash TTS, a text-to-speech model that uses natural language audio tags to provide fine control over vocal style, pace, and delivery across 70+ languages. blog.google
Like its predecessors, this gives you a set range of base voices (US English) into which you can prompt characteristics such as accent, age, and expression, but it also adds expression tagging for mid-speech changes. In this ⬇️ example I’m using a British Midlands voice that mixes enthusiastic and excited expressions. It’s a little exaggerated, but brilliant in some moments. It’s available via the API, in the AI Studio app, and in Vids voiceovers.
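To make the idea of mid-speech expression tags concrete, here’s a minimal sketch of how a script with inline tags might be split into per-expression segments before being sent to a TTS model. The bracket syntax and tag names are my own illustrative assumptions, not the documented Gemini format.

```python
import re

def split_tagged_script(script: str) -> list[tuple[str, str]]:
    """Split a script into (expression, text) segments, where an inline
    tag like [excited] changes the expression for the text that follows.
    The [tag] syntax here is a hypothetical stand-in for real audio tags."""
    segments = []
    current_tag = "neutral"
    # Split on bracketed tags; the capture group keeps the tags in the result.
    for part in re.split(r"(\[[a-z ]+\])", script):
        if part.startswith("[") and part.endswith("]"):
            current_tag = part[1:-1]
        elif part.strip():
            segments.append((current_tag, part.strip()))
    return segments

demo = "Welcome back. [enthusiastic] This week was huge! [calm] Let's take it one item at a time."
for tag, text in split_tagged_script(demo):
    print(f"{tag}: {text}")
```

The point is the mechanism: one script, several vocal deliveries, without splitting the request into multiple API calls.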
xAI released two Grok speech APIs for text-to-speech and speech-to-text. x.ai/news
These were sneaked out on Saturday, for some reason, so I haven’t had the chance to test them properly.
Creative Tools
Canva introduced Canva AI 2.0, calling it the “most significant product evolution since launching in 2013”. At its heart is a conversational agentic design platform, powered by the Canva Design Model, that can generate layered, editable designs from text prompts. It also adds new workflows, including third-party app connectors, web research tools, brand intelligence, vibe coding, and more. canva.com
I haven’t had the time to get hands-on with this yet, but if you want an idea of how AI is impacting (and will continue to impact) creative design, take a close look at this.
Adobe introduced Firefly AI Assistant, a conversational agent that automates complex, multi-step creative workflows across Creative Cloud apps including Photoshop and Premiere Pro. blog.adobe.com
This looks very cool. It can, for example, generate an image with Firefly, retouch it with Photoshop, then resize it with Express, all automatically from the same instruction. This is Adobe’s great advantage. A public beta will launch soon.
Firefly boosted its video capabilities, adding Kling 3.0 video models, speech enhancement and colour enhancement in the timeline editor, and more. blog.adobe.com
Anthropic Labs released Claude Design, a new product “that lets you collaborate with Claude to create polished visual work like designs, prototypes, slides, one-pagers, and more”, with brand knowledge from your design systems. anthropic.com/news
This was announced on Friday afternoon and, at the time of writing, hasn’t rolled out to me yet, so I haven’t tried it. It’s powered by the improved vision capabilities of the new Opus 4.7 model (read on…), and I’m willing to bet that it was built with Claude Code; you can feel that accelerated development pace.
Google Flow’s Voice Ingredients are generally available, enabling consistent voices across video generations. x.com/FlowbyGoogle
This is a very useful tool, although voices are limited to 30 presets, all in US English, and don’t seem to follow prompts to modify them. In this ⬇️ example I requested a British voice, and that’s not what I got. Hopefully this will evolve to use the new 3.1 Flash model (above).
Google renamed ProducerAI to Flow Music and added a Remix feature, to extend a song or replace a section using prompts. x.com/googleflowmusic
Higgsfield launched two new tools: Marketing Studio, an end-to-end ad creation tool that generates a video from a product URL and an avatar, and an updated Cinema Studio 3.5, based on Seedance 2.0, which adds an assistant for directorial control and consistency.
Assistants & Search
Google introduced Skills in Chrome, repeatable prompts that can be summoned for quick actions. blog.google
Essentially this brings Gems to Chrome, with the added context of a page. Available wherever Gemini in Chrome is (that is, anywhere that uses US English).
Google upgraded AI Mode in Chrome, opening any link clicked from a response in a side-by-side window rather than a new tab, and retaining the context of the new window in the conversation. blog.google
This will be useful for a research session where you might click a lot of links. Hopefully it sends the usual visitor information to the opened page; otherwise there are going to be a lot of complaints about this. Available in the US only, for now.
Nano Banana in Gemini can use Personal Intelligence to personalise your image generations by automatically pulling context from Google Photos and preferences. blog.google
So when you use personal language in your image prompts — e.g. “a watercolour painting of me and my wife in our favourite place” — it can extract that information from your connected apps. Available to paid subscribers in the US only.
Google’s Personal Intelligence is rolling out to more Gemini users globally. threads.com/@google
Notable exceptions to “globally” include the EU and UK.
Google released two desktop apps:
The Google app for Windows is a shortcut to Search including AI Mode and Lens to ask questions about anything on your screen. blog.google
The Gemini app for macOS adds a keyboard shortcut for quick access to the assistant and screen-aware capabilities to ask questions about content in other apps. blog.google
These are notable because, in the OpenClaw era (see previous issues), every AI assistant wants access to your desktop so that it can carry out actions across your apps and files.
An example of this is Perplexity’s Personal Computer, announced a few weeks ago and now available on Mac, which can work across local files, iMessage, email, connected apps, and the Web. While neither of these new Google apps has those capabilities yet, I think this is a move in advance of adding them.
Anthropic launched Claude Opus 4.7, featuring significant improvements in advanced software engineering and complex task handling, and enhanced vision capabilities to see images in greater resolution and comprehend them better. anthropic.com
Social
Facebook’s camera roll suggestions are now available in the EU and UK. The feature uses on-device analysis to recommend photos, videos, and AI-generated collages for easier sharing. about.fb.com
Instagram’s Your Algorithm now works in Explore, helping you discover more pertinent content. about.instagram.com
Meta updated the Threads API, adding support for cross-sharing to Instagram Stories, long-form text attachments, “ghost posts,” and spoiler tags, plus easier ways to include Threads content in other apps. developers.facebook.com
There’s also been a desktop Web redesign with improved navigation. It looks like Threads is entering a new growth phase.
YouTube updated its live streaming tools: expanding gifts to more countries and to horizontal streams, removing ads for paying supporters, holding ads back when Live Chat engagement is peaking, and enabling simultaneous horizontal and vertical streaming. blog.youtube
Age Verification
Like it or not, age verification is coming to all kinds of online experiences.
Roblox is adding two new age-based accounts: Kids (aged 5-8) will only see games with a Minimal or Mild content maturity label; Select (aged 9-15) will only see games labelled up to and including Moderate. In both cases, the games must have passed a selection process. about.roblox.com
The European Commission announced a digital age verification app, a privacy-focused tool to protect minors from age-restricted content without requiring platforms to scan passports or faces. ec.europa.eu
Hackers and security consultants are claiming this is nowhere near secure enough; perhaps they need to run its code through Claude Mythos [TL 425] first.