
How We’re Trying to Protect MacStories from AI Bots and Web Crawlers – And How You Can, Too

Over the past several days, we’ve made some changes at MacStories to address the ingestion of our work by web crawlers operated by artificial intelligence companies. We’ve learned a lot, so we thought we’d share what we’ve done in case anyone else would like to do something similar.

If you read MacStories regularly, or listen to our podcasts, you already know that Federico and I think that crawling the Open Web to train large language models is unethical. Industry-wide, AI companies have scraped the content of websites like ours, using it as the raw material for their chatbots and other commercial products without the consent or compensation of publishers and other creators.

Now that the horse is out of the barn, some of those companies are respecting publishers’ robots.txt files, while others seemingly aren’t. That doesn’t make up for the tens of thousands of articles and images that have already been scraped from MacStories. Nor is robots.txt a complete solution, so it’s just one of four approaches we’re taking to protect our work.

Read more


The Origin Story of Apple Podcasts’ Transcripts

Ari Saperstein, writing for The Guardian, interviewed Ben Cave, Apple’s global head of podcasts, and Sarah Herrlinger, who manages accessibility policy for the company, about Apple Podcasts transcripts. The feature, which was introduced in March, automatically generates transcripts of podcast episodes in Apple’s catalog and has been a big accessibility win for podcast fans.

Apple’s transcription efforts began modestly:

Apple’s journey to podcast transcripts started with the expansion of a different feature: indexing. It’s a common origin story at a number of tech companies like Amazon and Yahoo – what begins as a search tool evolves into a full transcription initiative. Apple first deployed software that could identify specific words in a podcast back in 2018.

“What we did then is we offered a single line of the transcript to give users context on a result when they’re searching for something in particular,” Cave recalls. “There’s a few different things that we did in the intervening seven years, which all came together into this [transcript] feature.”

Drawing from technologies and designs used by Apple Music and Books, the feature has been lauded by the accessibility community:

“I was knocked out on how accurate it was,” says Larry Goldberg, a media and technology accessibility pioneer who created the first closed captioning system for movie theaters. The fidelity of auto-transcription is something that’s long been lacking, he adds. “It’s improved, it has gotten better … but there are times when it is so wrong.”

My experience with Podcasts’ transcripts tracks with that of the people interviewed for Saperstein’s story. Automatically generated transcription is hard. I’ve tried various services in the past, and I’ve never been happy enough with any of them to publish their output on MacStories. Apple’s solution isn’t perfect, but it’s easily the best I’ve seen, tipping into what I consider publishable territory. The feature makes it easy to search, select text, and generate time-stamped URLs for quoting snippets of an episode, which makes the app an excellent tool for researching and writing about podcasts, too.

Permalink

The Issues of iPadOS 18’s New Tab Bars

Earlier today on Mastodon, I shared some concerns regarding the Books app in iPadOS 18 and how Apple implemented the new tab bar design in the app. Effectively, by eschewing a sidebar, the app has returned to feeling like a blown-up iPhone version – something I hoped we had left behind when Apple announced they wanted to make iPad apps more desktop-class two years ago.

Unfortunately, it gets worse than Books. As documented by Nico Reese, the developer of Gamery, the new tab bars seem to fall short of matching the previous design’s visual affordances as well as flexibility for developers. For starters, the new tabs are just text labels, which may work well in English, but not necessarily other languages:

Since the inception of the iPhone, tabs in a tab bar have always included a glyph and a label. With the new tab style, the glyphs are gone. Glyphs play a crucial role in UX design, allowing users to quickly recognize parts of the app for fast interaction. Now, users need to read multiple text labels to find the content they want, which is slower to perceive and can cause issues in languages that generally use longer words, such as German. Additionally, because tab bars are now customizable, they can even scroll if too many tabs are added!

You’ll want to check out Nico’s examples here, but this point is spot-on: since tab bars now sit alongside toolbar items, the entire UI can get very condensed, with buttons often ending up hidden away in an overflow menu:

Although Apple’s goal was to save space on the iPad screen, in reality, it makes things even more condensed. Apps need to compress actions because they take up too much horizontal space in the navigation bar. This constant adjustment of button placement in the navigation bar as windows are resized prevents users from building muscle memory. The smaller the window gets, the more items collapse.

If the goal was to simplify the iPad’s UI, well, now iPad users will end up with three ways to navigate apps instead of two, with the default method (the top bar) now generally displaying fewer items than before, without glyphs to make them stand out:

For users, it can be confusing why the entire navigation scheme changes with window resizing, and now they must adjust to three different variations. Navigation controls can be located at the top, the bottom, or the left side (with the option to hide the sidebar!), which may not be very intuitive for users accustomed to consistent navigation patterns.

The best way I can describe this UI change is that it feels like something conceived by the same people who thought the compact tab bar in Safari for iPad was a good idea, down to how tabs hide other UI elements and make them less discoverable.

Nico’s post has more examples you should check out. I think Marcos Tanaka (who knows a thing or two about iPad apps) put it well:

It makes me quite sad that one of the three iPad-specific features we got this year seems to be missing the mark so far. I hope we’ll see some improvements and updates on this front over the next three months before this feature ships to iPad users.

Permalink

WWDC 2024: The AppStories Interviews with ADA and Swift Student Challenge Distinguished Winners

Devin Davies, the developer of Crouton.


To wrap up our week of WWDC coverage, we just published a special episode of AppStories that was recorded in the Apple Podcasts Studio at Apple Park. Federico and I interviewed three of this year’s Apple Design Award winners:

  • Devin Davies, the creator of Crouton, which won an ADA in the Interaction category
  • Katarina Lotrič, CEO and co-founder, and Jasna Krmelj, CTO and co-founder, of Gentler Streak, which won an ADA in the Social Impact category
  • James Cuda, CEO, and Michael Shaw, CTO, of Procreate, which won an ADA for Procreate Dreams in the Innovation category

We also interviewed two of the Swift Student Challenge Distinguished Winners:

  • Dezmond Blair, a student at the Apple Developer Academy in Detroit. His app marries his passion for biking and the outdoors with technology, which creates an immersive experience.
  • Adelaide Humez, a high school student from Lille, France. Her winning app, Egretta, allows users to create a journal of their dreams based on emotions.

In addition to being available as always in your favorite podcast app as an audio-only podcast, this special episode of AppStories is available on our new MacStories YouTube channel, which is also the home of Comfort Zone, one of the two podcasts we launched last week, and other video projects.


We deliver AppStories+ to subscribers with bonus content, ad-free, and at a high bitrate early every week.

To learn more about the benefits included with an AppStories+ subscription, visit our Plans page or read the AppStories+ FAQ.

Permalink

The Latest from Magic Rays of Light, Comfort Zone, and MacStories Unwind

Enjoy the latest episodes from MacStories’ family of podcasts:

This week on Magic Rays of Light, Sigmund and Devon recap the Apple TV and entertainment announcements at WWDC – including tvOS 18, visionOS 2, Immersive Video updates, and more – and score their event predictions.


We’re back! After surviving our first challenge together, the gang is back for more with new goodies, an unexpectedly heavy topic, and a new mysterious challenge we didn’t see coming.


This week, John is joined by Jonathan Reed and Sigmund Judge for an explanation of how John missed his first episode of AppStories in seven years this week, an update from Sigmund on what’s coming to tvOS and Apple TV+, plus a bunch of picks from everyone.

Read more


Opting Out of AI Model Training

Dan Moren has an excellent guide on Six Colors that explains how to exclude your website from the web crawlers used by Apple, OpenAI, and others to train large language models for their AI products. For many sites, the process simply requires a few edits to the robots.txt file on your server:

If you’re not familiar with robots.txt, it’s a text file placed at the root of a web server that can give instructions about how automated web crawlers are allowed to interact with your site. This system enables publishers to not only entirely block their sites from crawlers, but also specify just parts of the sites to allow or disallow.
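As a rough illustration of the kind of rules Dan describes, here is a minimal sketch that checks a sample robots.txt with Python's standard-library parser. The file contents are hypothetical and are not MacStories' actual robots.txt. The user-agent tokens `GPTBot` (OpenAI) and `Applebot-Extended` (Apple's AI training opt-out) don't appear in the excerpt above, so treat them as assumptions and check each company's documentation for the current names. `SomeOtherCrawler` and the example.com URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The user-agent tokens are assumptions
# to verify against each company's own crawler documentation.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-article"  # placeholder URL
    for agent in ("GPTBot", "Applebot-Extended", "SomeOtherCrawler"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

A check like this only shows what a well-behaved crawler would conclude from the file. It does nothing to stop a crawler that ignores robots.txt.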

The process is a little more complicated with something like WordPress, which MacStories uses, and Dan covers that too.

Unfortunately, as Dan explains, editing robots.txt isn’t a solution for companies that ignore the file. It’s simply a convention that doesn’t carry any legal or regulatory weight. Nor does it help with Google or Microsoft’s use of your website’s copyrighted content unless you’re also willing to remove your site from the biggest search engines.

Although I’m glad there is a way to block at least some AI web crawlers prospectively, it’s cold comfort. We and many sites have years of articles that have already been crawled to train these models, and you can’t unring that bell. That said, MacStories’ robots.txt file has been updated to ban Apple and OpenAI’s crawlers, and we’re investigating additional server-level protections.

If you listen to Ruminate or follow my writing on MacStories, you know that I think what these companies are doing is wrong both in the moral and legal sense of the word. However, nothing captures it quite as well as this Mastodon post by Federico today:

If you’ve ever read the principles that guide us at MacStories, I’m sure Federico’s post came as no surprise. We care deeply about the Open Web, but ‘open’ doesn’t give tech companies free rein to appropriate our work to build their products.

Yesterday, Federico linked to Apple’s Machine Learning Research website where it was disclosed that the company has indexed the web to train its model without the consent of publishers. I was as disappointed in Apple as Federico. I also immediately thought of this 2010 clip of Steve Jobs near the end of his life, reflecting on what ‘the intersection of Technology and the Liberal Arts’ meant to Apple:

I’ve always loved that clip. It speaks to me as someone who loves technology and creates things for the web. In hindsight, I also think that Jobs was explaining what he hoped his legacy would be. It’s ironic that he spoke about ‘technology married with Liberal Arts,’ which superficially sounds like what Apple and others have done to create their AI models but couldn’t be further from what he meant. It’s hard to watch that clip now and not wonder if Apple has lost sight of what guided it in 2010.


You can follow all of our WWDC coverage through our WWDC 2024 hub or subscribe to the dedicated WWDC 2024 RSS feed.

Permalink

Designing Dark Mode App Icons

Apple’s announcement of “dark mode” icons has me thinking about how I would approach adapting “light mode” icons for dark mode. I grabbed 12 icons we made at Parakeet for our clients to illustrate some ways of going about it.

Before that though, let’s take some inventory. Of the 28 icons in Apple’s preview image of this feature, only nine have white backgrounds in light mode. However, all icons in dark mode have black backgrounds.

Actually, it’s worth noting that five “light mode” icons have black backgrounds, which Apple slightly adjusted to have a consistent subtle black gradient found on all of their new dark mode icons. Four of these—Stocks, Wallet, TV, and Watch—all seem to be the same in both modes. However, no other (visible) icons are.

Fantastic showcase by Louie Mantia of how designers should approach the creation of dark mode Home Screen icons in iOS 18. In all the examples, I prefer Mantia’s take to the standard black background version.

See also: Gavin Nelson’s suggestion, Apple’s Human Interface Guidelines on dark mode icons, and the updated Apple Design Resources for iOS 18.

Permalink

Apple Details Its AI Foundation Models and Applebot Web Scraping

From Apple’s Machine Learning Research¹ blog:

Our foundation models are trained on Apple’s AXLearn framework, an open-source project we released in 2023. It builds on top of JAX and XLA, and allows us to train the models with high efficiency and scalability on various training hardware and cloud platforms, including TPUs and both cloud and on-premise GPUs. We used a combination of data parallelism, tensor parallelism, sequence parallelism, and Fully Sharded Data Parallel (FSDP) to scale training along multiple dimensions such as data, model, and sequence length.

We train our foundation models on licensed data, including data selected to enhance specific features, as well as publicly available data collected by our web-crawler, AppleBot. Web publishers have the option to opt out of the use of their web content for Apple Intelligence training with a data usage control.

We never use our users’ private personal data or user interactions when training our foundation models, and we apply filters to remove personally identifiable information like social security and credit card numbers that are publicly available on the Internet. We also filter profanity and other low-quality content to prevent its inclusion in the training corpus. In addition to filtering, we perform data extraction, deduplication, and the application of a model-based classifier to identify high quality documents.

It’s a very technical read, but it shows how Apple approached building AI features in their products and how their on-device and server models compare to others in the industry (on servers, Apple claims their model is essentially neck and neck with GPT-4-Turbo, OpenAI’s older model).

This blog post, however, pretty much parallels my reaction to the WWDC keynote. Everything was fun and cool until they showed generative image creation that spits out slop “resembling” (strong word) other people; and in this post, everything was cool until they mentioned how – surprise! – Applebot had already indexed web content to train their model without the consent of publishers, who can only opt out now. (This was also confirmed by Apple executives elsewhere.)

As a creator and website owner, I guess that these things will never sit right with me. Why should we accept that certain data sets require a licensing fee but anything that is found “on the open web” can be mindlessly scraped, parsed, and regurgitated by an AI? Web publishers (and especially indie web publishers these days, who cannot afford lawsuits or hiring law firms to strike expensive deals) deserve better.

It’s disappointing to see Apple muddy an otherwise compelling set of features (some of which I really want to try) with practices that are no better than the rest of the industry.


  1. How long until this becomes the ‘Apple Intelligence Research’ website?
Permalink

The Latest from AppStories and Ruminate

Enjoy the latest episodes from MacStories’ family of podcasts:

For the latest WWDC episode of AppStories, Federico is joined by Myke Hurley to talk about the Vision Pro and Apple Intelligence before John pops up with some AI tidbits and a WWDC vibe check from in and around Apple Park.



For this special episode of AppStories, Federico is joined by Jonathan and Niléane live in the Club MacStories+ Discord community to share their first impressions of the WWDC 2024 Keynote.

This episode is sponsored by:

  • Kolide – It ensures that if a device isn’t secure it can’t access your apps. It’s Device Trust for Okta. Watch the demo now.

Recorded live in the Club MacStories Discord, Federico and John share their final preparations and plans for WWDC 2024 along with some last-minute predictions.

On AppStories+, Federico reveals his trio of iPad Pros and we take questions from Club members about WWDC.

This episode is sponsored by:

  • CleanMyMac X: Your Mac. As good as new. Get 15% off today with code APPSTORIES15.
  • Kolide – It ensures that if a device isn’t secure it can’t access your apps. It’s Device Trust for Okta. Watch the demo now.


This week, new MacStories podcasts, the Ruminate intro song is back, snack news, some keyboard accessories, and an alternative to the small web.

Read more