@patrickloeber
Last active June 1, 2025 15:08
AI SDK with Gemini API
Gemini models are accessible using the AI SDK by Vercel. This guide will help you get started with the AI SDK and Gemini.

For more information, see the AI SDK documentation and the AI SDK Google Generative AI provider documentation.

Setup

Install the AI SDK and the Google Generative AI integration:

npm install ai @ai-sdk/google

# pnpm: pnpm add ai @ai-sdk/google
# yarn: yarn add ai @ai-sdk/google

Set the GOOGLE_GENERATIVE_AI_API_KEY environment variable with your API key:

# macOS/Linux
export GOOGLE_GENERATIVE_AI_API_KEY="YOUR_API_KEY_HERE"

# Windows (Command Prompt or PowerShell)
setx GOOGLE_GENERATIVE_AI_API_KEY "YOUR_API_KEY_HERE"
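Alternatively, you can pass the API key explicitly by creating a custom provider instance with `createGoogleGenerativeAI`, instead of relying on the environment variable. This is a minimal sketch; the key source shown here (reading from `process.env`) is an assumption and could be any secret store:

```typescript
import { createGoogleGenerativeAI } from '@ai-sdk/google';

// Create a provider instance with an explicit API key instead of the
// default GOOGLE_GENERATIVE_AI_API_KEY environment variable lookup.
const google = createGoogleGenerativeAI({
  apiKey: process.env.MY_GEMINI_KEY ?? '', // hypothetical variable name
});

const model = google('gemini-2.0-flash');
```

This is useful when your deployment platform injects secrets under a different name, or when you manage multiple keys in one process.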

Getting Started

Here's a basic example that takes a single text input:

import { generateText } from 'ai';
import { google } from '@ai-sdk/google';

const model = google('gemini-2.0-flash');

const { text } = await generateText({
  model: model,
  prompt: 'Why is the sky blue?',
  // system: 'You are a friendly assistant!',
  // temperature: 0.7,
});

console.log(text);

Streaming

Here's a basic streaming example:

import { streamText } from 'ai';
import { google } from '@ai-sdk/google';

const model = google('gemini-2.0-flash');

const { textStream } = streamText({
  model: model,
  prompt: 'Why is the sky blue?',
});

for await (const textPart of textStream) {
  console.log(textPart);
}

Document / PDF understanding

The AI SDK supports file inputs, e.g. PDF files:

import { generateText } from 'ai';
import { google } from '@ai-sdk/google';
import { readFile } from 'fs/promises';  // Node.js built-in; for TypeScript, install @types/node

const model = google('gemini-2.0-flash');

const { text } = await generateText({
  model: model,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'Extract the date and price from the invoice',
        },
        {
          type: 'file',
          data: await readFile('./invoice.pdf'),
          mimeType: 'application/pdf',
        },
      ],
    },
  ],
});

console.log(text);

Image understanding

The AI SDK supports image inputs:

import { generateText } from 'ai';
import { google } from '@ai-sdk/google';
import { readFile } from 'fs/promises';  // Node.js built-in; for TypeScript, install @types/node

const model = google('gemini-2.0-flash');

const { text } = await generateText({
  model: model,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: 'List all items from the picture',
        },
        {
          type: 'image',
          image: await readFile('./veggies.jpeg'),
          mimeType: 'image/jpeg',
        },
      ],
    },
  ],
});

console.log(text);

Structured output

The AI SDK supports structured outputs:

import { generateObject } from 'ai';
import { z } from 'zod';
import { google } from '@ai-sdk/google';
import { readFile } from 'fs/promises';

const model = google('gemini-2.0-flash');

const { object } = await generateObject({
  model: model,
  schema: z.object({
    date: z.string(),
    total_gross_worth: z.number(),
    invoice_number: z.string()
  }),
  messages: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text: 'Extract the structured data from the following PDF file',
          },
          {
            type: 'file',
            data: await readFile('./invoice.pdf'),
            mimeType: 'application/pdf',
          },
        ],
      },
    ],
});

console.log(object);

See the AI SDK structured data guide for further resources.

Grounding with Google Search

You can configure Search grounding with Google Search:

import { generateText } from 'ai';
import { google } from '@ai-sdk/google';

const model = google('gemini-2.0-flash', { useSearchGrounding: true });

const { text, sources, providerMetadata } = await generateText({
  model: model,
  prompt: 'Who won the Super Bowl in 2025?',
});

console.log(text);

console.log('Sources:');
console.log(sources);

console.log('Metadata:');
console.log(providerMetadata?.google?.groundingMetadata);

Thinking

You can use thinking models with support for thinking budgets and thought summaries:

import { generateText } from 'ai';
import { google, type GoogleGenerativeAIProviderOptions } from '@ai-sdk/google';

const model = google('gemini-2.5-flash-preview-05-20');

const response = await generateText({
  model: model,
  prompt: 'Why is the sky blue?',
  providerOptions: {
    google: {
      thinkingConfig: {
        thinkingBudget: 2024,
        includeThoughts: true
      },
    } satisfies GoogleGenerativeAIProviderOptions,
  },
});

console.log(response.text);

// Log the reasoning summary
console.log('Reasoning:');
console.log(response.reasoning);

Tools and function calling

The AI SDK supports function calling:

import { z } from 'zod';
import { generateText, tool } from 'ai';
import { google } from '@ai-sdk/google';

const model = google('gemini-2.5-flash-preview-05-20');

const result = await generateText({
  model: model,
  prompt: 'What is the weather in San Francisco?',
  tools: {
    weather: tool({
      description: 'Get the weather in a location',
      parameters: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      // execute: An optional async function that is called with the arguments from the tool call.
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  maxSteps: 5, // Optional, enables multi step calling
});

console.log(result.text);

// Inspect the messages accumulated across the steps:
// Step 1: tool-call and tool-result messages
// Step 2: the final generated text based on the tool result
for (const message of result.response.messages) {
    console.log(message.content);
}

See the AI SDK tool calling guide for further resources.

Limitations

The Vercel AI SDK and its Google Generative AI integration are third-party libraries developed and maintained by the Vercel team. While they are designed to work seamlessly with the Gemini API, they are not official Google products.

We cannot guarantee complete feature parity or identical behavior between the Vercel AI SDK's implementation and the official Gemini API. Certain features, newly released capabilities, or specific API parameters might not be immediately available or fully supported within the SDK.
