Tech Landscape #341
Artificial Intelligence meets Apple Intelligence, plus the other major announcements from WWDC, and your usual technology news round-up.
Hello!
A day late this week as I wanted to wait for Apple’s keynote address at WWDC.
So let’s get right into that. Hope you’re well!
AI at Apple WWDC
Apple kicked off its latest developer event with some truly cringe videos and a lot of announcements. Much of it was Apple catching up on some fronts while pulling ahead on others, but unlike recent years it was at least interesting.
iOS largely focused on customisation options, and iPadOS has some neat new Pencil tricks including automated handwriting that clones your style. macOS can now wirelessly mirror your iPhone and Safari uses AI to summarise web pages to extract the important information. watchOS, tvOS, and AirPods will all get quality-of-life improvements too. There were also some new features for developers, lest we forget who the notional audience of this event was.
VisionOS got its first major update, with the ability to turn 2D photos into spatial photos, a larger and higher resolution output for Mac Virtual Display, new hand gestures, new APIs for developers, and a slate of new Immersive Video content. The Apple Vision Pro will launch in eight more countries, including the UK on 12 July.
But the biggest news was Apple Intelligence (AI — get it?!). Powered by on-device models, it focuses on four key areas: Language, Image, Actions, and Personal Context.
Language is largely catching up with rivals. Copy suggestions, smart replies, rewriting, proofreading, summarising, transcribing, and categorising (emails, messages, and notifications). If you’ve used any Google or Microsoft product recently you’ll know the score, although of course it’s done with typical Apple polish and flourish.
Image generation is coming in the shape of Image Playground, a feature in several core apps as well as a standalone app. Apple has taken a very safety-first approach, with a limited range of inputs and outputs; no text prompting, no photorealistic images. It seems to be more like Meta’s AI tool, focused on stickers, emoji, and fun use cases. It made for a pretty interesting demo, though.
It was notable that while Apple said that its AI models wouldn’t be trained on your personal data, it didn’t say where the data used to train the models came from. The extremely limited nature of Image Playground may be a clue that it only used ethically sourced data.
Actions is the really interesting area. A redesigned Siri will be able to understand and take action with the content of apps; you might say “find the photo I took of the menu at Nando’s and send it to Sam”, and it will be able to do that. A new Intents API will let developers make app features available to Siri in the future.
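To make that concrete, here is a minimal, hypothetical sketch of what exposing an app action to Siri might look like with Apple’s App Intents framework. The struct name, parameters, and dialog below are my own illustrative choices, not code Apple showed at WWDC:

```swift
import AppIntents

// Hypothetical intent: lets Siri find and send a photo on the user's behalf.
struct SendPhotoIntent: AppIntent {
    static var title: LocalizedStringResource = "Send a Photo"

    // What to look for, e.g. "the menu at Nando's"
    @Parameter(title: "Search term")
    var searchTerm: String

    // Who to send it to, e.g. "Sam"
    @Parameter(title: "Recipient")
    var recipient: String

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // A real app would search the photo library here and hand the
        // match off to Messages; this sketch just confirms the request.
        return .result(dialog: "Sending the \(searchTerm) photo to \(recipient).")
    }
}
```

The point is that once features are declared this way, the system (rather than the developer) decides when to surface them, which is what would let a smarter Siri chain actions across apps.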
When devices such as the Humane Ai Pin and Rabbit R1 launched I asked what would happen to them when Siri improved; I think we’re about to get our answer.
All of this is made more powerful by Personal Context, which will mine users’ data — in a privacy-preserving way — to make Actions more relevant. And if the on-board models can’t handle a user request they can send it to a more powerful cloud model, starting with ChatGPT (and others in future).
All of this looks super-impressive on paper, but then, so did Siri when it launched. And it was noticeable that unlike the platform updates, which had fixed release schedules, a lot of these features had the vaguer timeline of “in the coming months” and “in the future” (to borrow from Google). Getting AI right is harder than the more declarative, logic-based software scripting that Apple has done to date.
But Apple’s major strength is its close control over its entire tech stack, so I’m hopeful that if anyone can push the field forward, it can.
Expect to see more detail about all of these announcements (and a few more) as WWDC kicks off in earnest, reporters and reviewers get hands-on, and Apple does the interview circuit. I’ll include the best and most relevant next week.
Social & Messaging
YouTube is expanding Posts to more creators, enabling them to add posts with polls, quizzes, and more to their Community page. youtube.com
There’s no aggregated feed, so this isn’t like a social network; it’s more like a blog without comments, or Meta’s Channels.
Instagram updated Broadcast Channels with customisation options and live one-to-one chat. threads.net/@creators
Messenger introduced Communities, allowing users to create and organise chats within groups of up to 5,000 people. techcrunch.com
Meta has confirmed this launch, although hasn’t officially announced it. Without having seen it myself, it sounds like Facebook Groups in Messenger.
WhatsApp added new features for businesses, including AI tools and Meta Verified. about.fb.com
Telegram added animated message effects and global hashtag search, enhancing user interaction and content discovery. telegram.org
Telegram also introduced Stars, enabling users to purchase digital goods and services via in-app payments. telegram.org
Synthetic Media
Udio added audio uploads for users to expand their own sounds into songs. x.com/udiomusic
Stability AI released Stable Audio Open, an open-source AI model, trained on freely licensed audio data, for generating short audio clips based on text descriptions. stability.ai
You can try it for yourself in this Hugging Face Space.
Here’s an example of the two combined; the first 12 seconds of this song fragment are a drum loop I generated with Stable Audio Open, and the remaining 30 seconds or so were completed with Udio.
eBay added AI background replacement, for sellers to enhance the quality of their product photos. innovation.ebayinc.com
Generative images: from miracle to mundane in just a few years.
Stylar added Insert Object to… well, insert objects (such as products) into generated images. x.com/stylar_ai
Chinese video app Kuaiying showed off Kling, a text-to-video model that can generate up to two minutes of video in 1080p. kling.kuaishou.com (Chinese)
The provided examples are very impressive. It’s interesting to see the different approaches in China and the West: Google and OpenAI are being very cautious and seeding their models only to select partners; Kuaiying is opening a private beta which anyone can request to join.
Quick addendum to the above: I believe it will be a very long time before generative video is anywhere close to being ready for regular use. If you work in video production, don’t be scared about AI coming for your job, just learn as much as you can about it while it’s still in its infancy.
Assistants & Chat
Google rolled out the Gemini mobile app in the UK and Europe and added a feature to analyse what’s on screen. 9to5google.com
Gemini might not objectively be the best chat assistant on the market (it’s too verbose!) but having it on millions of phones, available at the touch of a button, is going to make it very popular, very quickly. I have a working theory that, for most people, the best assistant will be the nearest one.
Google released NotebookLM, an AI-powered research and writing assistant, globally, and added new features and a wider variety of sources. blog.google
I’ve been waiting for this to launch here since I first heard about it. It lets you connect different sources which you can then summarise, query, and more; I suspect it’s going to be very handy for my job.
Everything Else
Meta Quest’s latest update improves video passthrough by reducing visual distortion and adds continuous background audio. meta.com
Meta also posted an announcement of new games and updates coming to the platform, and made a big deal about its price — just ahead of Apple announcing a wider roll-out of the Apple Vision Pro.
Gaming & Metaverse-ish
IKEA is recruiting people to ‘work’ paid shifts in its upcoming Roblox store (it’s more like roleplaying than actually working).
Walmart Realm is a virtual shopping experience. It’s apparently aimed at younger consumers, although I don’t really understand how making online shopping more limited and difficult appeals to anyone.