Using Google Gemini AI with Drafts

For integrations with other AI platforms, see the Artificial Intelligence article in the User Guide.

Drafts provides scripting wrappers to simplify interaction with Google Gemini through the Google Generative Language API.

This article covers the setup required to use actions that integrate with Gemini and provides example actions for common use cases, like text manipulation and prompts.

Setting Up Google Gemini API Access

In order to use any of these integrations, you will need a Google account and an API key to use with Drafts.

The Gemini API can be used free of charge within usage limits. If you exceed those limits, you will need to set up a paid account with Google. See Google's pricing page for information on costs. Generally speaking, occasional usage in Drafts will not be costly, and it is billed per use rather than as a monthly fee.

To get an API key, visit Google AI Studio. Use the “Create API key” link, and copy the generated API key for use in Drafts. You will be prompted to enter the key the first time you use one of the example actions below.

Drafts will remember the API key in its Credentials system, so you will only need to complete this step once. If you need to change the API key used, you can forget it in the Credentials pane in Drafts settings, and the next time you use a Gemini action, you will be prompted to enter a new key.

Example Actions

Below are a few example use-case actions that are meant as a starting point to demonstrate some ways Google Gemini integration can be used in Drafts:

  • Ask Gemini: This action will ask you to enter a text prompt for Gemini and insert the result in the current draft.
  • Gemini: Translate Selection: Take the selected text in the editor, and ask Gemini to translate it into another language. You will be prompted to select from a list of languages.
  • Gemini: Modify Selection: Select text in a draft, then run this action, and you will be prompted for an instruction to transform the text. Drafts will package that up in a prompt and replace the selected text with the result. You can do simple things like “uppercase” but also combine commands for things like “uppercase and insert a :tada: emoji between each word.”
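To make the "package that up in a prompt" step of Modify Selection more concrete, here is a minimal sketch of how such packaging might look. It is not the code of the actual example action. The helper name buildModifyPrompt and the prompt wording are assumptions made for illustration. The guarded section at the end assumes the Drafts editor object and the quickPrompt function of GoogleAI, and is skipped when run outside Drafts.

```javascript
// Hypothetical helper (not part of Drafts): combine the user's instruction
// and the selected text into a single prompt string.
// The exact prompt wording here is an assumption; adjust it to taste.
function buildModifyPrompt(instruction, selection) {
  return [
    "Apply the following instruction to the text below.",
    "Return only the modified text, with no commentary.",
    "",
    "Instruction: " + instruction.trim(),
    "",
    "Text:",
    selection,
  ].join("\n");
}

// Inside Drafts, the helper could be wired up roughly like this.
// The guard keeps the snippet harmless outside Drafts, where the
// GoogleAI and editor objects do not exist.
if (typeof GoogleAI !== "undefined" && typeof editor !== "undefined") {
  const selection = editor.getSelectedText();
  const ai = new GoogleAI();
  const result = ai.quickPrompt(buildModifyPrompt("uppercase", selection));
  // Only replace the selection if something came back.
  if (result) {
    editor.setSelectedText(result);
  }
}
```

Keeping the prompt construction in its own function makes it easy to experiment with different wording without touching the code that talks to the editor.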

Scripting with GoogleAI

If you wish to build your own more advanced integrations, the GoogleAI script object is your starting point.

This object provides a convenience wrapper for making API calls to the Google AI API. When making requests with this object, Drafts will take care of requesting and storing a user’s API key, providing the appropriate authentication headers, and parsing results into JavaScript objects.

The object provides several simple convenience functions, like quickPrompt, that abstract away details of the API for simple use cases, as well as the request function for building more detailed requests with all the API options. Refer to Google's API documentation for request parameters and return values.

The action below is meant as a starting point and demonstrates the use of the request function. More examples are available in the scripting reference.
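For a feel of what a request-based script involves, here is a minimal sketch. It is not the official example action. The two helpers, buildGenerateContentBody and extractText, are hypothetical names introduced here; they follow the request and response shapes of Gemini's generateContent endpoint. The request path and the MODEL_NAME placeholder are assumptions, so check the scripting reference and Google's documentation for the exact path format and a current model name.

```javascript
// Hypothetical helper: build a request body in the shape the Gemini
// generateContent endpoint expects for a single text prompt.
function buildGenerateContentBody(promptText) {
  return {
    contents: [{ parts: [{ text: promptText }] }],
  };
}

// Hypothetical helper: pull the generated text out of a parsed
// generateContent response. Returns "" if there are no usable candidates.
function extractText(responseData) {
  const candidates = (responseData && responseData.candidates) || [];
  if (candidates.length === 0) return "";
  const content = candidates[0].content;
  const parts = (content && content.parts) || [];
  return parts.map((part) => part.text || "").join("");
}

// Inside Drafts only; skipped elsewhere because GoogleAI is undefined.
if (typeof GoogleAI !== "undefined") {
  const ai = new GoogleAI();
  const response = ai.request({
    // Assumption: path format and model name are placeholders.
    path: "/models/MODEL_NAME:generateContent",
    method: "POST",
    data: buildGenerateContentBody("Tell me a joke about text editors."),
  });
  if (response.success) {
    console.log(extractText(response.responseData));
  } else {
    console.log("Request failed with status " + response.statusCode);
  }
}
```

For one-off prompts, quickPrompt is usually enough. The request function is worth the extra code when you need options that quickPrompt does not expose.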

Troubleshooting

If you run into issues running these actions, be sure to check the Action Log for detailed error messages. Common problems include rate limits and improperly configured API keys.
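If you write your own scripts, it can help to turn a failed response's HTTP status code into a readable hint before logging it. The sketch below uses a hypothetical helper, describeFailure. The mapping is an assumption based on common HTTP conventions: 429 is the standard "too many requests" status, while 400, 401, and 403 often, but not always, point to a key or permission problem. A 400 can also mean a malformed request.

```javascript
// Hypothetical helper: map an HTTP status code to a troubleshooting hint.
// The groupings are assumptions based on general HTTP conventions, not a
// definitive list of what the Gemini API returns.
function describeFailure(statusCode) {
  if (statusCode === 429) {
    return "Rate limit or quota exceeded. Wait a bit, or check your usage limits.";
  }
  if (statusCode === 400 || statusCode === 401 || statusCode === 403) {
    return "The API key may be missing or invalid, or the request was malformed.";
  }
  if (statusCode >= 500) {
    return "The service reported a server-side error. Try again later.";
  }
  return "Unexpected failure (HTTP " + statusCode + "). See the Action Log for details.";
}
```

In a Drafts script, you could pass the statusCode of a failed response to this helper and write the result to the Action Log with console.log.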

Conclusion

If you create new and interesting actions with this functionality, we hope you’ll share them in the directory and forums!

I use Gemini to summarize YouTube videos.

When I use the action to send a link to a video to summarize, it responds with the summary of a different video. Running the action again returns the summary of yet another video, but never the correct one.

Any idea what’s going on? Thanks.

Generally speaking, that is not something LLMs like Gemini are good at, because they cannot get live content from the web and summarize it. These actions make direct requests to Google’s Gemini APIs, which in turn make requests of the Gemini LLM (sorry for the redundancy).

This is not the same thing you get when using gemini.google.com. That site employs a variety of extensions and other mechanisms to draw on sources other than Gemini’s LLM when answering. For example, you’ll notice that if you put in “summarize [youtube URL]”, the response you get indicates that it made a request to YouTube for information.

Hopefully that helps clarify.

Makes perfect sense, thanks :+1: