Using Google Gemini AI with Drafts

agiletortoise · April 16, 2024, 2:20pm

For integrations with other AI platforms, see the Artificial Intelligence article in the User Guide.

Drafts provides scripting wrappers to simplify interaction with Google Gemini through the Google Generative Language API.

This article covers the setup required to use actions that integrate with Gemini and provides example actions for common use cases, like text manipulation and prompts.

Setting Up Google Gemini API Access

In order to use any of these integrations, you will need a Google account and an API key to use with Drafts.

The Gemini API can be used free of charge with usage limits. If you exceed those limits, you will need to set up a paid account with Google. Information on costs is available on their pricing page. Generally speaking, occasional usage in Drafts will not be costly, and it is billed per usage rather than as a monthly fee.

To get an API key, visit the Google AI Studio. Use the “Create API key” link, and copy the API key generated for use in Drafts. You will be prompted to enter the key the first time you use one of the example actions below.

Drafts will remember the API key in its Credentials system, so you will only need to complete this step once. If you need to change the API key used, you can forget it in the Credentials pane in Drafts settings, and the next time you use a Gemini action, you will be prompted to enter a new key.

Example Actions

Below are a few example use-case actions that are meant as a starting point to demonstrate some ways Google Gemini integration can be used in Drafts:

Ask Gemini: This action will ask you to enter a text prompt for Gemini and insert the result in the current draft.
Gemini: Translate Selection: Take the selected text in the editor, and ask Gemini to translate it into another language. You will be prompted to select from a list of languages.
Gemini: Modify Selection: Select text in a draft, then run this action, and you will be prompted for an instruction to transform the text. Drafts will package that up in a prompt and replace the selected text with the result. You can do simple things like “uppercase” but also combine commands for things like “uppercase and insert an emoji between each word.”
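
As a rough illustration of how an action like Gemini: Modify Selection might package an instruction and the selected text into a single prompt, here is a minimal sketch. The helper name `buildModifyPrompt` and the prompt wording are hypothetical and are not taken from the published action. The `new GoogleAI()` constructor and the `quickPrompt(prompt)` call shape are assumptions, so check the scripting reference before relying on them.

```javascript
// Hypothetical helper: combine an instruction and the selected text into one prompt.
function buildModifyPrompt(instruction, selection) {
  const cleaned = instruction.trim();
  if (cleaned.length === 0) {
    throw new Error("An instruction is required");
  }
  return `${cleaned}\n\nApply the instruction to the text below and return only the result:\n\n${selection}`;
}

// Only runs inside Drafts, where the GoogleAI and editor objects exist.
if (typeof GoogleAI !== "undefined" && typeof editor !== "undefined") {
  const ai = new GoogleAI();
  const selection = editor.getSelectedText();
  // Assumed call shape: quickPrompt(prompt) returns the response text.
  const result = ai.quickPrompt(buildModifyPrompt("uppercase", selection));
  if (result) {
    // Replace the selection only when something came back.
    editor.setSelectedText(result);
  }
}
```

Keeping the prompt construction in its own function makes it easy to adjust the wording without touching the part that talks to Gemini.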

Scripting with `GoogleAI`

If you wish to build your own more advanced integrations, the GoogleAI script object is your starting point.

This object provides a convenience wrapper for making API calls to the Google AI API. When making requests with this object, Drafts will take care of requesting and storing a user’s API key, providing the appropriate authentication headers, and parsing results into JavaScript objects.

The object provides several simple convenience functions, like quickPrompt, that abstract away details of the API for simple use cases, as well as the request function for building more detailed requests with all the API options. Refer to Google’s API documentation for request parameters and return values.
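
For the simple case, a sketch along these lines may be enough. The wrapper `askGemini` is a hypothetical helper, and the assumption that `quickPrompt` takes a prompt string and returns the answer text should be checked against the scripting reference. Passing the client in as a parameter keeps the helper easy to try outside Drafts with a stand-in object.

```javascript
// Hypothetical wrapper: send a prompt through any object that has quickPrompt().
function askGemini(client, prompt) {
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    return ""; // nothing to ask, so skip the API call
  }
  // Assumed call shape: quickPrompt(prompt) returns the response text.
  const answer = client.quickPrompt(trimmed);
  // Treat a missing answer as an empty string so callers can test for it.
  return typeof answer === "string" ? answer : "";
}

// Only runs inside Drafts, where GoogleAI and the current draft exist.
if (typeof GoogleAI !== "undefined" && typeof draft !== "undefined") {
  const answer = askGemini(new GoogleAI(), "Suggest a title for: " + draft.content);
  if (answer.length > 0) {
    draft.append(answer);
    draft.update();
  }
}
```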

The action below is meant as a starting point and demonstrates the use of the request function. Find more examples in the scripting reference:

Gemini: Direct Request
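
A direct request might look roughly like the sketch below. The body and response shapes follow Google's generateContent format. The helper names, the model name, and the settings keys passed to `request` (`path`, `method`, `data`) are assumptions rather than confirmed parts of the Drafts API, so compare them with the scripting reference and the installed action.

```javascript
// Build the JSON body for a generateContent call with one text part.
function buildGenerateContentBody(promptText) {
  return { contents: [{ parts: [{ text: promptText }] }] };
}

// Pull the first candidate's text out of a parsed response, or return null.
function extractText(responseData) {
  const candidates = responseData && responseData.candidates;
  if (!Array.isArray(candidates) || candidates.length === 0) return null;
  const parts = candidates[0].content && candidates[0].content.parts;
  if (!Array.isArray(parts)) return null;
  // Join all text parts; parts without text contribute nothing.
  return parts.map((part) => part.text || "").join("");
}

// Only runs inside Drafts, where the GoogleAI object exists.
if (typeof GoogleAI !== "undefined") {
  const ai = new GoogleAI();
  // Assumed settings keys and model name; adjust to match the reference.
  const response = ai.request({
    path: "/models/gemini-2.0-flash-exp:generateContent",
    method: "POST",
    data: buildGenerateContentBody("Write a haiku about note taking."),
  });
  if (response.success) {
    console.log(extractText(response.responseData));
  } else {
    console.log(response.statusCode + ": " + response.error);
  }
}
```

Checking for an empty candidates list before indexing avoids a script error when the API returns no text, for example when a response is blocked.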

Troubleshooting

If you run into issues running these actions, be sure to check the Action Log for detailed error messages. Common problems include rate limits and improperly configured API keys.
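
When scripting your own requests, a small helper can turn a failed response into a more readable hint for the Action Log. This is only a sketch: the function name is hypothetical, and the mapping from status codes to causes follows general HTTP conventions (429 is the standard "Too Many Requests" status), not a documented list of Gemini errors.

```javascript
// Hypothetical helper: map an HTTP status code to a troubleshooting hint.
function describeFailure(statusCode) {
  if (statusCode === 429) {
    return "Rate limit reached: wait a moment and retry, or review your usage limits.";
  }
  if (statusCode === 400 || statusCode === 401 || statusCode === 403) {
    // Assumption: rejected requests are often caused by a bad or missing key.
    return "Request rejected: check the API key stored in Credentials and the request body.";
  }
  if (statusCode >= 500) {
    return "Server error: the problem is likely on the API side, try again later.";
  }
  return "Unexpected status " + statusCode + ": see the Action Log for details.";
}
```

In an action, this could be logged in the failure branch with something like `console.log(describeFailure(response.statusCode))`.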

Conclusion

If you create new and interesting actions with this functionality, we hope you’ll share them in the directory and forums!

scott · June 29, 2024, 12:24pm

I use Gemini to summarize YouTube videos.

When I use the action to send a link to a video to summarize, it responds with the summary of a different video. Running the action again returns the summary of yet another video, but never the correct one.

Any idea what’s going on? Thanks.

agiletortoise · June 29, 2024, 2:01pm

Generally speaking, that is not something LLMs like Gemini are good at, because they cannot get live content from the web and summarize it. These actions make direct requests to Google’s Gemini APIs, which in turn make requests of the Gemini LLM.

This is not the same thing you get when using gemini.google.com. That site employs a variety of extensions and other mechanisms to utilize sources other than Gemini’s LLM when answering. For example, you’ll notice if you put in “summarize [youtube URL]” the response you get indicates that it made a request to YouTube for information.

Hopefully that helps clarify.
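
One way to guard against this in a custom action is to check a prompt for links before sending it, since the model only sees the URL text and not the page behind it. The sketch below is a simple heuristic with hypothetical function names, and the regular expression is deliberately loose.

```javascript
// Hypothetical helper: list the http(s) links that appear in a prompt.
function findUrls(prompt) {
  const matches = prompt.match(/https?:\/\/[^\s"'<>)]+/g);
  return matches === null ? [] : matches;
}

// True when the prompt contains a link the API request will not fetch.
function promptReliesOnLiveContent(prompt) {
  return findUrls(prompt).length > 0;
}
```

An action could show a warning, or ask for the transcript or article text to be pasted in, whenever `promptReliesOnLiveContent` returns true.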

scott · June 30, 2024, 2:39pm

Makes perfect sense, thanks

xiamumeiyoudao · December 12, 2024, 11:39am

Hi. This needs the v1beta API version (it is currently v1) so that you can use the latest models, like 2.0-flash.

Refer to API versions explained | Gemini API | Google AI for Developers.

Guess I’ve got it. But I don’t know what difference it makes to do this without the GoogleAI class.

```javascript
let http = HTTP.create(); // assumes an HTTP utility class is available

// POST a single text prompt to the v1beta generateContent endpoint.
let response = http.request({
  "url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent?key=YOUR_KEY",
  "method": "POST",
  "data": {
    contents: [{
      parts: [{
        text: "You are an intelligent assistant"
      }]
    }]
  },
  "headers": {
    "Content-Type": "application/json"
  }
});

if (response.success) {
  // The generated text is in the first part of the first candidate.
  let text = response.responseData.candidates[0].content.parts[0].text;
  let thisDraft = Draft.find("43422D83-F3E5-41E3-A183-8B29353CDBE0");
  thisDraft.append(text);
  thisDraft.update();
  console.log(thisDraft.content);
}
else {
  // Log the status code and error so they show up in the Action Log.
  console.log(response.statusCode);
  console.log(response.error);
}
```

agiletortoise · December 13, 2024, 3:15pm

The GoogleAI class is just a convenience wrapper and takes care of remembering an API key for you, but you can always use HTTP if it does not do something you need. I should expose the API version as a property, however, now that there are multiple options, so it could be overridden. Will do that.
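
Until the version is configurable on the wrapper, a plain HTTP request can take the API version as a parameter. The sketch below is one way to do that. The helper name, the credential identifier, and the field key are hypothetical, and a key stored this way is separate from the one GoogleAI remembers.

```javascript
// Hypothetical helper: build a generateContent URL for a given API version and model.
function buildGenerateContentUrl(apiVersion, modelName) {
  // Accept the model with or without its "models/" prefix.
  const model = modelName.replace(/^models\//, "");
  return `https://generativelanguage.googleapis.com/${apiVersion}/models/${model}:generateContent`;
}

// Only runs inside Drafts, where HTTP and Credential exist.
if (typeof HTTP !== "undefined" && typeof Credential !== "undefined") {
  // Hypothetical credential identifier and field key.
  const credential = Credential.create("Gemini API (custom)", "Enter your Gemini API key.");
  credential.addPasswordField("apiKey", "API Key");
  if (credential.authorize()) {
    const url = buildGenerateContentUrl("v1beta", "gemini-2.0-flash-exp") +
      "?key=" + encodeURIComponent(credential.getValue("apiKey"));
    const response = HTTP.create().request({
      url: url,
      method: "POST",
      data: { contents: [{ parts: [{ text: "Say hello." }] }] },
      headers: { "Content-Type": "application/json" },
    });
    console.log(response.success ? "Request succeeded" : response.statusCode);
  }
}
```

Storing the key with Credential avoids pasting it into the script, which is the main convenience the GoogleAI class otherwise provides.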

jsamlarose · March 30, 2025, 9:08pm

Found my way to this thread because Gemini has reportedly become more capable in recent days. Might be worth noting that the default model in the Ask Gemini action might need to be updated on first use, for anyone installing it now (time and AI move quickly, and this was first posted in 2024).

If you get a little confused (like I did), check the models list here: Gemini models | Gemini API | Google AI for Developers and copy the appropriate model into the script, being sure to retain the “models/” prefix ahead of each model in your list.
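
If you paste model names from the models list, a small helper can make sure each one carries the `models/` prefix used in Gemini API model names. This is a sketch with a hypothetical function name, and the model names in the example list are placeholders to swap for current ones.

```javascript
// Hypothetical helper: make sure a model name starts with "models/".
function normalizeModelName(name) {
  const trimmed = name.trim();
  return trimmed.startsWith("models/") ? trimmed : "models/" + trimmed;
}

// Placeholder names; replace with entries from the current models list.
const modelOptions = ["gemini-2.0-flash", "models/gemini-1.5-pro"].map(normalizeModelName);
```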

agiletortoise · March 31, 2025, 1:31pm

Thanks for the reminder. I updated the action to include a couple of the newer model options.