Using AI to Automate Google Slides/Presentations

Written by @alycerabyte on 8/29/2024, 2:54:01 PM

Google Slides API with LLM Generation

You may read a formatted version of this writing here: Using AI to Automate Google Slides Presentation

I recently completed a project that uses a large language model's response to automate the generation of Google Slides presentations. Overall, the concept is quite simple. However, a few pain points are worth discussing, mostly in the structure of the Google Slides API. For what it's worth, I was working in TypeScript on this project.

OpenAI's Structured Outputs

The LLM was prompted to return a text response in Markdown that included a title and a subtitle and marked the start of each slide as Slide 1:, Slide 2:, etc. This gave me predictable delimiters when later parsing the output into the request for the Google Slides API. I highly recommend looking into OpenAI's Structured Outputs in the API for reliable JSON schemas.
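To make the delimiter idea concrete, here is a minimal sketch of a parser for that kind of response. The exact layout is an assumption on my part (a "# " line for the title, a "## " line for the subtitle, then "Slide N: <slide title>" blocks), and the names parsePresentation, ParsedPresentation, and SlideContent are hypothetical, not taken from the project.

```typescript
// Hypothetical shapes for illustration only.
interface SlideContent {
  title: string;
  content: string;
}

interface ParsedPresentation {
  title: string;
  subtitle: string;
  slides: SlideContent[];
}

// Assumed layout: "# Title", then "## Subtitle", then blocks that start
// with "Slide N: <slide title>" followed by that slide's body lines.
function parsePresentation(markdown: string): ParsedPresentation {
  const result: ParsedPresentation = { title: '', subtitle: '', slides: [] };
  let current: SlideContent | undefined;

  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    const slideMatch = /^Slide\s+\d+:\s*(.*)$/.exec(line);

    if (slideMatch) {
      // A new delimiter closes the previous slide and opens the next one.
      current = { title: slideMatch[1] ?? '', content: '' };
      result.slides.push(current);
    } else if (current) {
      // Body lines are joined with newlines; blank lines are skipped.
      if (line) current.content += (current.content ? '\n' : '') + line;
    } else if (line.startsWith('## ')) {
      result.subtitle = line.slice(3).trim();
    } else if (line.startsWith('# ')) {
      result.title = line.slice(2).trim();
    }
  }
  return result;
}
```

With Structured Outputs you would skip this step entirely and receive the same shape as JSON, which is why I recommend it.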

Google Slides API: Create Request

The Google Slides API requires you to first send a create request, which creates a blank presentation. Within that request, you'll configure the title of the presentation. The blank presentation also includes a Title slide by default. It will be assigned to whichever user was authorized via the Google SSO process before the Slides API was hit.

const response = await slides.presentations.create({
  requestBody: {
    title: presentation.title,
  },
});

Google Slides API: Get Request

Once the presentation is created, the API responds with the ID of the newly created presentation. Within the response object, this can be found in response.data.presentationId. With the presentation ID in hand, you can use the get request within the Slides API to query for the Title slide's object IDs.

const presentationData = await slides.presentations.get({
    presentationId,
});
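Chaining the two calls looks roughly like the sketch below. SlidesLike is a hypothetical stand-in interface that covers only the two methods used here, so the example runs without the googleapis package; in the real project the client comes from that package. The createAndFetch name and the null check are my own additions for illustration.

```typescript
// Hypothetical stand-in for the Slides client: only the calls used here.
interface SlidesLike {
  presentations: {
    create(params: {
      requestBody: { title: string };
    }): Promise<{ data: { presentationId?: string | null } }>;
    get(params: {
      presentationId: string;
    }): Promise<{ data: { presentationId?: string | null; slides?: unknown[] } }>;
  };
}

async function createAndFetch(slides: SlidesLike, title: string) {
  // Step 1: create the blank presentation and read its ID from the response.
  const created = await slides.presentations.create({ requestBody: { title } });
  const presentationId = created.data.presentationId;
  if (!presentationId) {
    throw new Error('Create response did not include a presentationId');
  }

  // Step 2: fetch the presentation so the default Title slide can be inspected.
  const presentationData = await slides.presentations.get({ presentationId });
  return { presentationId, presentationData };
}
```

Guarding against a missing ID up front keeps the later get and batchUpdate calls from failing with a less obvious error.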

Google Slides API: Get Response

You'll need the presentation information from the get request because you have to set the text values of the title and subtitle objects within your default Title slide. The get request responds with the following object structure:

{
  ...,
  "data": {
    "presentationId": "<ID>",
    "pageSize": { "width": [Object], "height": [Object] },
    "slides": [ [Object] ],
    "title": "<TITLE>",
    "masters": [ [Object] ],
    "layouts": [
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object], [Object],
      [Object]
    ],
    "locale": "en",
    "revisionId": "g2FG59YEHdLUmQ",
    "notesMaster": {
      "objectId": "n",
      "pageType": "NOTES_MASTER",
      "pageElements": [Array],
      "pageProperties": [Object]
    }
  },
  ...
}

Within this object structure, you'll want to retrieve the titleObjId, which is the objectId of the title text box in your title slide, and the subtitleObjId, which is the objectId of the subtitle text box in your title slide. To update the text in either, you'll need to explicitly select these two objects by objectId. These values are randomly generated whenever an object isn't created through an API call that sets its objectId in your request. Since this slide was generated by default through your create request, its objectIds are randomly generated.

To insert the text from your structured JSON into the title and subtitle objects of your title slide, you'll need to traverse the get response for the objectIds, similar to the example below:

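// The default Title slide is the first slide in the new presentation.
const titleSlide = presentationData.data.slides?.[0];
let titleObjId: string | undefined;
let subtitleObjId: string | undefined;

if (titleSlide?.pageElements) {
  for (const pageEl of titleSlide.pageElements) {
    // The placeholder type tells the title text box from the subtitle one.
    if (pageEl.shape?.placeholder?.type === 'CENTERED_TITLE') {
      titleObjId = pageEl.objectId ?? undefined;
    } else if (pageEl.shape?.placeholder?.type === 'SUBTITLE') {
      subtitleObjId = pageEl.objectId ?? undefined;
    }
  }
}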

Google Slides: batchUpdate Request

Once you have the objectIds for both the title and subtitle of your title slide, you can construct your batchUpdate request to update the presentation with the title, subtitle, and subsequent slides. The structures below show how I built the requests for each slide:

Slide Create Requests:

[
  {
    createSlide: {
      slideLayoutReference: {
        predefinedLayout: 'TITLE_AND_BODY',
      },
      placeholderIdMappings: [
        {
          layoutPlaceholder: {
            type: 'TITLE',
            index: 0,
          },
          objectId: `title_${index}`,
        },
        {
          layoutPlaceholder: {
            type: 'BODY',
            index: 0,
          },
          objectId: `body_${index}`,
        },
      ],
    },
  },
  {
    insertText: {
      objectId: `title_${index}`,
      text: slide.title,
    },
  },
  {
    insertText: {
      objectId: `body_${index}`,
      text: slide.content,
    },
  },
];
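Putting the pieces together, the whole requests array can be assembled in one function. This is a sketch rather than my exact code: buildRequests, SlideContent, and the loose SlidesRequest type are hypothetical names, and the googleapis package ships a stricter request type you would normally use instead.

```typescript
// Hypothetical input shape, mirroring the fields used in this post.
interface SlideContent {
  title: string;
  content: string;
}

// Loose request type so the sketch stays self-contained.
type SlidesRequest = Record<string, unknown>;

function buildRequests(
  titleObjId: string,
  subtitleObjId: string,
  deck: { title: string; subtitle: string; slides: SlideContent[] },
): SlidesRequest[] {
  // Fill the default Title slide using the object IDs found via the get request.
  const requests: SlidesRequest[] = [
    { insertText: { objectId: titleObjId, text: deck.title } },
    { insertText: { objectId: subtitleObjId, text: deck.subtitle } },
  ];

  deck.slides.forEach((slide, index) => {
    requests.push(
      {
        // Create the slide and pin known IDs onto its placeholders.
        createSlide: {
          slideLayoutReference: { predefinedLayout: 'TITLE_AND_BODY' },
          placeholderIdMappings: [
            { layoutPlaceholder: { type: 'TITLE', index: 0 }, objectId: `title_${index}` },
            { layoutPlaceholder: { type: 'BODY', index: 0 }, objectId: `body_${index}` },
          ],
        },
      },
      // The IDs assigned above can be targeted later in the same batch.
      { insertText: { objectId: `title_${index}`, text: slide.title } },
      { insertText: { objectId: `body_${index}`, text: slide.content } },
    );
  });

  return requests;
}
```

The result goes into the batchUpdate call as requestBody: { requests }, alongside the presentationId. Because the placeholder IDs are set explicitly, no second get request is needed to find the new text boxes.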

Takeaways

The biggest takeaway from this project was that the Google APIs aren't very well documented and are architecturally complex. That's understandable, since these APIs need to be extensible enough for many use cases. With that in mind, I strongly recommend traversing the response object structures in your own work with the API to untangle any information you may need.

Another takeaway is that when formatting these slides, you'll need to leaf through the API documentation for each format you want and apply the indicated request. For instance, if you'd like bullet formatting but not for every single text item in your slide content, you'll need to apply a textRange.type of 'FIXED_RANGE' and provide a startIndex along with an endIndex.
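As a rough illustration of that last point, a bullet request limited to part of a text box could be built like this. The helper names bulletRange and bulletsAfterFirstLine are hypothetical, the preset is just one of the available bullet presets, and I'm assuming the indexes count UTF-16 code units with an exclusive endIndex, which matches JavaScript string offsets.

```typescript
// Builds a createParagraphBullets request that covers only part of a text box.
function bulletRange(objectId: string, startIndex: number, endIndex: number) {
  return {
    createParagraphBullets: {
      objectId,
      textRange: { type: 'FIXED_RANGE', startIndex, endIndex },
      bulletPreset: 'BULLET_DISC_CIRCLE_SQUARE',
    },
  };
}

// Example policy: leave the first line as plain text and bullet the rest.
// Returns undefined when there is nothing after the first line.
function bulletsAfterFirstLine(objectId: string, text: string) {
  const firstBreak = text.indexOf('\n');
  if (firstBreak === -1 || firstBreak === text.length - 1) return undefined;
  return bulletRange(objectId, firstBreak + 1, text.length);
}
```

The request would go into the same batchUpdate array, after the insertText request that puts the text into that object.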
