Tech Landscape #388
AI tools for professionals, the mid-2010s social comeback, and a new Grok for those who love risk.
Hello!
After weeks of illness and injury I’m finally back up on my feet and able to socialise. I spend a lot of time researching, testing, and writing about technology, which leads many people to think that I’m chronically online. I’m honestly not; I love to spend time alone reading about history, or listening to diverse podcasts, or watching TV; and I especially love spending time with friends, whether that’s at the pub with long-time friends or a group walk with new friends. It keeps me mentally healthy, and I believe it enables me to keep a healthy skepticism about the various claims made about technology.
Anyway, let’s get on with it. Hope you’re well!
Synthetic Visuals
Moonvalley launched Marey, a ‘clean’ video model
Pro-level features include motion and pose control, and camera and trajectory guidance, and it’s trained on licensed data so it’s commercially safe to use in production.
It’s really important to have a video model that’s commercially safe and I had very high hopes for this, but my results have been… well, let's generously say ‘mixed’ (but actually they've been mostly poor). The first six videos I generated ⬇️, using a mix of image and text prompts, all came out pretty bad; the detail is good, but the motion is off in every one.
I had better results when I used an existing video (by cottonbro studio from Pexels) as a pose transfer source ⬇️:
Maybe it’s just a learning curve I need to get past, which is annoying if so because I used up almost all of my monthly credits on getting even this far. But I suspect it’s just that the first version of the model isn’t great, and if that’s the case I hope it improves quickly because I really want this to succeed.
Intangible is a new AI tool for professional creatives
It uses a 3D scene editor with ‘spatial intelligence’ that gives control over pose, motion, and camera, to guide the generation of multiple shots and clips in a scene with AI video.
x.com/intangibleai/status/1942977256592384214
Intangible is made by people from the graphics industry and is aimed at people in that industry too. It’s complex at first look; I dipped into it quickly but it certainly needs more time to master than your standard consumer AI tools. But that complexity offers a high level of control, which will be powerful if it works as promised.
Google’s Veo 3 (Fast) can now generate videos from a keyframe image, including videos with speech and soundFX. blog.google
It’s available in Flow and rolling out to the Gemini app. Sadly we in the UK & EU can’t use keyframe images that show realistic human faces, which massively limits its usefulness.
Vidu added Reference to Video, to generate videos that include multiple reference characters, objects, or scenes. instagram.com
I ran a quick test ⬇️ and it does a good job with the human, a decent job with the drink can, and a very vibe-y job with the environment.
Vivago added Multi-Element Reference Video, to generate videos that include multiple reference characters, objects, or scenes. x.com/vivago_ai
This seems very similar to what Vidu offers, but Vivago seems to have ended its free daily credits scheme so I can’t compare it.
Freepik added Video Extender which can extend any generated or uploaded video clip by around four seconds. x.com/freepik
Here’s a quick example ⬇️ I made, showing the source (from Pexels; see above) compared to Freepik Extend. It’s pretty good.
Letz AI added Chat, a conversational interface for generating and editing images individually or in batches. instagram.com
I’ve included this mainly because I think the swing back to prompting, enabled by new in-context image generation models (TL #382), is worth noting.
Kling updated its KOLORS image model with improved prompt adherence, portrait aesthetic, and cinematic shots. instagram.com
Higgsfield added Soul ID, consistent characters for use in the Soul image generation model. threads.com/@higgsfield.ai
You can train a character model with 20-70 images; I made one of me and brought it to life ⬇️.
Assistants & Search
xAI released Grok 4 with improvements to reasoning and problem solving, creative and contextual understanding, and a new Heavy mode which uses multiple agents for deep ‘thinking’. x.ai
Is it better? Probably. I have no useful way of measuring these things. It sets a bunch of new benchmark records, but user testing isn’t universally positive. Fundamentally, however, Grok may be a very capable model but it’s so bound to the beliefs and whims of one erratic man that I’m not sure how anyone could possibly rely on it.
Perplexity launched Comet, a web browser with built-in agentic AI systems. perplexity.ai
It’s only available for subscribers to the ($2000 per year) Perplexity Max tier for now. It’s clear that browsers are the battleground for the war to become your default AI interface to the web but, as I’ve said before, getting people to switch browsers is hard; our habits are ingrained.
Genspark added AI Pods, to create a short audio ‘podcast’ on any subject. x.com/genspark_ai
Google is adding AI Mode to Circle to Search, enabling Gemini-powered deep-dives for anything on screen. blog.google
AI Mode is still only available in the US and India for now.
Gemini is coming to smart watches, rolling out to devices running Wear OS 4 and above. blog.google
Social
Bluesky will introduce age verification in the UK to comply with the Online Safety Act. bsky.social
Users can verify their age using a payment card, ID document, or facial scanning. This will come to other platforms soon too.
It was a good week for news about mid-2010s social apps.
Social check-in app Swarm got an overhaul on iOS with a redesigned timeline and new map tab. threads.com/@foursquare
When Foursquare (remember that?) turned itself into a city guide app it spun out the location check-in service as Swarm. Now Foursquare is gone, and Swarm lives on.
Digg launched its desktop web client, although still in closed Beta. x.com/JustinMezzell
Digg and Reddit were major rivals back in the day, but Reddit out-competed Digg. Now, with the rise of LLMs and the value of human conversation and preferences as search and training data, Digg is on its way back.