Tech Landscape #350

The future of Roblox, OpenAI's new reasoning model, and an incredible video generator.

Sep 16, 2024

Hello!

Here we are again, after another week’s break. I don’t take the breaks because I’m lazy, by the way; it’s because some weeks are very quiet. I’m very conscious that you’ve invited me into your inbox so I don’t want to waste that by sending piddling updates; I’d rather pause for a week and send you something a bit meatier.

As you can see from the title above this is issue #350, which means I’ve been at this for at least seven years (probably closer to eight). Some of you have been with me since the very beginning, and to you I say, very sincerely: thank you.

Right, let’s get on with it. Hope you’re well!

A low angle shot of a crowd in silhouette looking up to the sky at night. A brightly coloured LED drone formation spells the word "350" at huge scale. The sky is dark and filled with stars. The ground is covered with people holding up their phones to take photos. — Thanks for reading. Generated with Ideogram.

Immersive & Metaverse-ish

Roblox laid out its near-future roadmap

The company has a lofty goal to get 10% of all gaming revenue worldwide, and 300 million daily active users. It says it wants to achieve that by:

Helping creators 1) bring people together, 2) scale their creations and audience, and 3) build their businesses.

Some of the ways it plans to do that include: 1) making it easier to join a Party, and improving Communities; 2) a 3D foundational generative model to help creators easily make games, assets, and interactions, plus more focus on music discovery, including an artist upload and share tool; and 3) improved pricing options for retailers, and a higher revenue share for purchases using real currencies.

corp.roblox.com/newsroom/2024/09/rdc-2024-robloxs-next-frontier

I write about Roblox because there’s something really interesting emerging there: a creator-made immersive social space with a functioning economy. It’s what we used to call ‘the metaverse’.

a cute brown girl character wearing white Adidas leisurewear, chunky sneakers, a green beanie, and a gold chain, standing in a blocky video game world of trees and rocks. brilliant vibrant colors. — This is meant to be metaverse-ish. Generated with Leonardo AI.

The Pico 4 Ultra Start XR headset is coming to Europe, priced at £529. instagram.com
Compared to the Meta Quest 3 this is a slightly more capable headset with fewer unique gaming titles at a higher price. Reviews aren’t effusive.

Social

Instagram added comments to Stories and new creative features to DMs: cutout stickers and image decorations. threads.net/@creators, threads.net/@instagram
I was a little leery of comments on Stories, but by default only mutual follows can leave them.
Bluesky added video uploads: one per post, of up to sixty seconds. bsky.app/@bsky
Adult content is permitted, and seems to be very popular. Bluesky has seen a big uptick in use thanks to the ban on X (Twitter) in Brazil, and now claims over nine million users. That’s still small compared to a lot of other social networks, but it’s three million more than this time last week.
X launched a beta version of its TV app for Android / Google TV, Amazon Fire TV, and select smart TVs. androidpolice.com
Keep calling yourself “a video-first platform” for long enough and maybe people will start to believe it.

Spread the Word

If you’d like to help me continue to send this newsletter, please tell your friends and colleagues. That’s just as useful to me as a paid subscription, although you can do that too.

Synthetic Content

Chinese startup MiniMax launched the Video-01 text-to-video model, available on its Hailuo AI platform (in Chinese). scmp.com/tech
The quality is phenomenal, with natural human movement better than any of its rivals (see below). Given how easy it is to generate celebrities, it’s clearly been trained on lots of ripped copyrighted material, so most businesses won’t want to touch this.

Adobe previewed its upcoming Firefly Video model, a generative AI tool designed to enhance video editing workflows across Creative Cloud, Experience Cloud, and Adobe Express. blog.adobe.com
This looks decent, but it’s always hard to tell from cherry-picked examples; we’ll find out how good it really is when it launches (in beta) later this year. The integration with tools such as Premiere Pro, to extend shots or fill in B-roll, is really interesting. I hope it’s good, because a model trained on owned / licensed data is critical if businesses are to embrace this technology.

Feature updates in video generation tools: Runway added video-to-video for Gen-3, enabling style transfer on up to 10 seconds of video • Dream Machine added camera controls, for directing text- and image-to-video output • Vidu added Reference to Video to keep characters or products consistent in motion videos • DZine added Image-to-Video, generating five seconds of motion from a still.

Here’s a very quick test I ran using Gen-3’s video-to-video feature:

HeyGen launched Avatar 3.0, full-body synthetic avatars with ‘emotional’ expressions and voices. x.com/HeyGen_Official
Suno added Covers, which can generate new versions of a song while preserving vocals and melody. instagram.com/@sunomusic
This is very interesting; if you’re a musician or producer you can quickly make iterations of an idea you had to see how it works in different genres and styles. It even works if you just upload your voice: not just autotune, but autotune with full production.
LetzAI launched v3 of its foundation image model, with improved photorealism and prompt understanding. letz.ai
This model is decent; not the best, but pretty good. In a crowded market, LetzAI’s model doesn’t stand out; its (hopeful) differentiator as a service is that it makes it easy to train and share custom styles.

Feature updates in image generation tools: Krea continues its wholesale integration of the FLUX.1 model with FLUX Style Mixer, which enables mixing different styles to guide an image generation, and FLUX realtime (you can guess what that does) • Stable Assistant added style references from uploaded images • Midjourney added personalisation to its Niji model (users will need to rate 200 images to train their ‘taste’).

A grid of various images — Google’s Imagen 3 is now available in the UK, through the Gemini app.

Google expanded its virtual try-on feature to include dresses. The try-on uses generative AI to show how clothes would look on a diverse range of models. blog.google

Assistants & Chat

OpenAI introduced o1

The new AI models — o1-preview and the smaller o1-mini — are designed to enhance complex reasoning capabilities and problem-solving by working in multiple steps; slower, but hopefully more accurate.

openai.com/o1

The models were announced less than 24 hours ago as I write this, so I haven’t really had time to make any useful comparisons with previous versions. I suggest you read this post by Ethan Mollick, who had early access:

One Useful Thing

Something New: On OpenAI's "Strawberry" and Reasoning

I have had access to the much-rumored OpenAI “Strawberry” enhanced reasoning system for awhile, and now that it is public, I can finally share some thoughts. It is amazing, still limited, and, perhaps most importantly, a signal of where things are heading…

Ethan Mollick

Google is expanding Gemini Live to free users, available on Android in English. x.com/GeminiApp
I got access to this by signing up to a year’s free trial of Gemini Advanced. I’ve used it exactly once — I just don’t have many occasions where it seems useful.
Apple held its latest hardware event, where it announced two iPhone 16 models and two iPhone 16 Pro models, a new Watch, and new AirPods. It also said Apple Intelligence will launch in the US next month, and in the UK and other English-language territories in December. theverge.com
Even the most ardent Apple fans have struggled to be enthusiastic about this event, where the biggest announcement was a dedicated camera button on the new phones.