This Week's Sponsor:

Washing Machine X9

Spring Clean Your Mac Effortlessly


Posts tagged with "gemini"

Using Simon Willison’s LLM CLI to Process YouTube Transcripts in Shortcuts with Claude and Gemini

Video Processor.

Video Processor.

I’ve been experimenting with different automations and command line utilities to handle audio and video transcripts lately. In particular, I’ve been working with Simon Willison’s LLM command line utility as a way to interact with cloud-based large language models (primarily Claude and Gemini) directly from the macOS terminal.

For those unfamiliar, Willison’s LLM CLI tool is a command line utility that lets you communicate with services like ChatGPT, Gemini, and Claude using shell commands and dedicated plugins. The llm command is extremely flexible when it comes to input and output; it supports multiple modalities like audio and video attachments for certain models, and it offers custom schemas to return structured output from an API. Even for someone like me – not exactly a Terminal power user – the different llm commands and options are easy to understand and tweak.

Today, I want to share a shortcut I created on my Mac that takes long transcripts of YouTube videos and:

  1. reformats them for clarity with proper paragraphs and punctuation, without altering the original text,
  2. extracts key points and highlights from the transcript, and
  3. organizes highlights by theme or idea.

I created this shortcut because I wanted a better system for linking to YouTube videos, along with interesting passages from them, on MacStories. Initially, I thought I could use an app I recently mentioned on AppStories and Connected to handle this sort of task: AI Actions by Sindre Sorhus. However, when I started experimenting with long transcripts (such as this one with 8,000 words from Theo about Electron), I immediately ran into limitations with native Shortcuts actions. Those actions were running out of memory and randomly stopping the shortcut.

I figured that invoking a shell script using macOS’ built-in ‘Run Shell Script’ action would be more reliable. Typically, Apple’s built-in system actions (especially on macOS) aren’t bound to the same memory constraints as third-party ones. My early tests indicated that I was right, which is why I decided to build the shortcut around Willison’s llm tool.

Read more


Gemini for iOS Gets Lock Screen Widgets, Control Center Integration, Basic Shortcuts Actions

Gemini for iOS.

Gemini for iOS.

When I last wrote about Gemini for iOS, I noted the app’s lackluster integration with several system features. But since – unlike others in the AI space – the team at Google is actually shipping new stuff on a weekly basis, I’m not too surprised to see that the latest version of Gemini for iOS has brought extensive support for widgets.

Specifically, Gemini for iOS now offers a collection of Lock Screen widgets that also appear as controls in iOS 18’s Control Center, and there are barebones Shortcuts actions to go along with them. In both the Lock Screen’s widget gallery and Control Center, you’ll find Gemini widgets to:

  • type a prompt,
  • Talk Live,
  • open the microphone (for dictation),
  • open the camera,
  • share an image (with a Photos picker), and
  • share a document (with a Files picker).

It’s nice to see these integrations with Photos and Files; notably, Gemini now also has a share extension that lets you add the same media types – plus URLs from webpages – to a prompt from anywhere on iOS.

The Shortcuts integration is a little less exciting since Google implemented old-school actions that do not support customizable parameters. Instead, Gemini only offers actions to open the app in three modes: type, dictate, or Talk Live. That’s disappointing, and I would have preferred to see the ability to pass text or images from Shortcuts directly to Gemini.

While today’s updates are welcome, Google still has plenty of work left to do on Apple’s platforms. For starters, they don’t have an iPad version of the Gemini app. There are no Home Screen widgets yet. And the Shortcuts integration, as we’ve seen, could go much deeper. Still, the inclusion of controls, basic Shortcuts actions, and a share extension goes a long way toward making Gemini easier to access on iOS – that is, until the entire assistant is integrated as an extension for Apple Intelligence.


Gemini 2.0 and LLMs Integrated with Apps

Busy day at Google today: the company rolled out version 2.0 of its Gemini AI assistant (previously announced in December) with a variety of new and updated models to more users. From the Google blog:

Today, we’re making the updated Gemini 2.0 Flash generally available via the Gemini API in Google AI Studio and Vertex AI. Developers can now build production applications with 2.0 Flash.

We’re also releasing an experimental version of Gemini 2.0 Pro, our best model yet for coding performance and complex prompts. It is available in Google AI Studio and Vertex AI, and in the Gemini app for Gemini Advanced users.

We’re releasing a new model, Gemini 2.0 Flash-Lite, our most cost-efficient model yet, in public preview in Google AI Studio and Vertex AI.

Finally, 2.0 Flash Thinking Experimental will be available to Gemini app users in the model dropdown on desktop and mobile.

Read more