1. File Search Overview
Gemini’s File Search is a native retrieval-augmented generation (RAG) tool built directly into the API. Unlike traditional RAG pipelines where you manage embedding infrastructure, chunk documents manually, and orchestrate vector databases, File Search handles all of this server-side — you just upload documents and query them.
1.1 When to Use File Search
| Approach | Best For | Limitations |
|---|---|---|
| File Search (RAG) | Large document collections (100s-1000s of files), dynamic content, need citations | Latency from retrieval step, retrieval quality depends on embedding |
| Context Caching | Repeated queries against same large document, cost optimization | Fixed context, requires cache management |
| Long Context (1M tokens) | Single document analysis, full-context reasoning | Expensive for large docs, no citation tracking |
from google import genai

client = genai.Client()

# File Search is used as a "tool", similar to function calling.
# The model decides when to search based on the query.
response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="What are the key findings in the Q4 report?",
    config={
        "tools": [{"file_search": {"file_search_store_names": ["fileSearchStores/my-store"]}}]
    }
)
print(response.text)
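When the list of stores is only known at runtime, a small helper can assemble the tool entry. The sketch below is a hypothetical convenience function, not part of the SDK: it only builds the same dictionary shape used in the call above, and it assumes store names follow the `fileSearchStores/<id>` pattern shown there.

```python
STORE_PREFIX = "fileSearchStores/"


def build_file_search_tool(store_ids):
    """Return a tool dict in the shape used above (hypothetical helper).

    Accepts bare IDs ("my-store") or full names ("fileSearchStores/my-store")
    and drops duplicates while keeping the original order.
    """
    names = []
    for store_id in store_ids:
        if store_id.startswith(STORE_PREFIX):
            name = store_id
        else:
            name = STORE_PREFIX + store_id
        if name not in names:
            names.append(name)
    if not names:
        raise ValueError("at least one store is required")
    return {"file_search": {"file_search_store_names": names}}


# Both spellings refer to the same store, so only one name survives.
tool = build_file_search_tool(["my-store", "fileSearchStores/my-store"])
print(tool)
```

The returned dictionary could then be passed as an element of the `"tools"` list in the request config.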
2. FileSearchStores
A FileSearchStore is a managed container for your documents. Behind the scenes, Gemini chunks your documents, generates embeddings using gemini-embedding-2, and indexes them for fast semantic retrieval.
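To build intuition for what "chunk, embed, index, retrieve" means, here is a toy version in plain Python. It is not how Gemini implements File Search: the word-count "embedding", the fixed-size word chunks, and every function name are stand-ins chosen purely for illustration.

```python
import math
from collections import Counter


def chunk_words(text, size=8):
    """Split text into chunks of at most `size` words (toy chunker)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy 'embedding': a bag-of-words count vector, not a learned model."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks ranked by similarity to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


document = (
    "Revenue grew twelve percent in the fourth quarter. "
    "The office kitchen was repainted in a cheerful yellow."
)
chunks = chunk_words(document, size=8)
best = retrieve("How much did revenue grow?", chunks)[0]
print(best)
```

A real embedding model captures meaning rather than exact word overlap, so it would also match "revenue" against phrases like "sales" or "income"; the ranking step is conceptually the same.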
2.1 Creating Stores
from google import genai
from google.genai import types

client = genai.Client()

# Create a new FileSearchStore with an embedding model
store = client.file_search_stores.create(
    config={
        "display_name": "product-documentation",
        "embedding_model": "models/gemini-embedding-2"
    }
)
print(f"Store created: {store.name}")
print(f"Display name: {store.display_name}")
print(f"Embedding model: {store.embedding_model}")
The models/gemini-embedding-2 model supports multimodal embeddings: it can embed text, images, audio, video, and PDFs into the same vector space. A File Search store can therefore hold mixed-media documents and still retrieve semantically relevant results.
from google import genai

client = genai.Client()

# Create a store optimized for code documentation
code_store = client.file_search_stores.create(
    config={
        "display_name": "api-reference-docs",
        "embedding_model": "models/gemini-embedding-2"
    }
)
print(f"Code store: {code_store.name}")

# Create a store for internal knowledge base
kb_store = client.file_search_stores.create(
    config={
        "display_name": "internal-knowledge-base",
        "embedding_model": "models/gemini-embedding-2"
    }
)
print(f"KB store: {kb_store.name}")
2.2 Listing & Deleting Stores
from google import genai

client = genai.Client()

# List all stores
print("Your FileSearchStores:")
print("-" * 50)
for store in client.file_search_stores.list():
    print(f" Name: {store.name}")
    print(f" Display: {store.display_name}")
    print(f" Model: {store.embedding_model}")
    print()

# Get a specific store by name
store = client.file_search_stores.get(name="fileSearchStores/abc123")
print(f"Retrieved: {store.display_name}")

# Delete a store (removes all documents and embeddings)
client.file_search_stores.delete(name="fileSearchStores/old-store-id")
print("Store deleted successfully")
Legal Contract Analysis Platform
A law firm caches 200-page contracts and lets attorneys ask unlimited questions. Without caching, each question costs $0.50 because the entire document is re-processed. With caching, the initial load costs $2.00 and each subsequent question costs $0.05. For a contract that needs 20 questions, that is $3.00 instead of $10.00, a saving of 70%, and the saving grows with every additional question.
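The arithmetic behind this example can be checked with a few lines of Python. The prices are the illustrative figures quoted above, expressed in cents to avoid floating-point rounding. The sketch assumes the $2.00 load fee is paid once and does not itself answer a question; the function names are hypothetical.

```python
UNCACHED_CENTS = 50     # cost per question without caching ($0.50)
CACHE_LOAD_CENTS = 200  # one-time cost to load the contract into the cache ($2.00)
CACHED_CENTS = 5        # cost per question against the cache ($0.05)


def uncached_cost(queries):
    """Total cost in cents when every question re-processes the document."""
    return UNCACHED_CENTS * queries


def cached_cost(queries):
    """Total cost in cents with one cache load plus cheap follow-up questions."""
    return CACHE_LOAD_CENTS + CACHED_CENTS * queries


def savings_fraction(queries):
    """Fraction of the uncached bill saved by caching (negative if caching costs more)."""
    base = uncached_cost(queries)
    return (base - cached_cost(queries)) / base


def first_query_count(predicate, limit=1000):
    """Smallest query count in 1..limit that satisfies predicate, or None."""
    return next((n for n in range(1, limit + 1) if predicate(n)), None)


break_even = first_query_count(lambda n: cached_cost(n) < uncached_cost(n))
reaches_75 = first_query_count(lambda n: savings_fraction(n) >= 0.75)

print(f"20 queries: ${uncached_cost(20) / 100:.2f} uncached vs ${cached_cost(20) / 100:.2f} cached")
print(f"Savings at 20 queries: {savings_fraction(20):.0%}")
print(f"Caching is cheaper from query {break_even}; savings reach 75% at query {reaches_75}")
```

Under these assumptions caching pays for itself from the fifth question, and the saving can never exceed 90% because each cached question still costs a tenth of an uncached one.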
3. Managing Documents
3.1 Uploading Documents
Once you have a store, add documents to it. Gemini automatically chunks, embeds, and indexes each document:
from google import genai

client = genai.Client()
store_name = "fileSearchStores/my-store-id"

# Upload a single document from a local file
with open("docs/architecture-guide.pdf", "rb") as f:
    doc = client.file_search_stores.documents.create(
        parent=store_name,
        file=f,
        config={"display_name": "Architecture Guide v2.1"}
    )
print(f"Uploaded: {doc.name}")
print(f"Status: {doc.status}")
from google import genai

client = genai.Client()
store_name = "fileSearchStores/my-store-id"

# List all documents in a store
print("Documents in store:")
for doc in client.file_search_stores.documents.list(parent=store_name):
    print(f" {doc.display_name} - {doc.name}")

# Get details about a specific document
doc = client.file_search_stores.documents.get(
    name=f"{store_name}/documents/doc123"
)
print(f"\nDocument: {doc.display_name}")
print(f"Size: {doc.size_bytes} bytes")
print(f"Status: {doc.status}")

# Delete a document from the store
client.file_search_stores.documents.delete(
    name=f"{store_name}/documents/doc123"
)
print("Document removed from store")
3.2 Supported File Types
| Category | Formats | Notes |
|---|---|---|
| Text | .txt, .md, .csv, .html | Direct text extraction |
| Documents | .pdf, .docx | OCR for scanned PDFs |
| Code | .py, .js, .ts, .java, .go, .rs | Language-aware chunking |
| Data | .json, .xml, .yaml | Structure-preserving chunking |
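Before a bulk upload it can be useful to filter out files whose extensions are not listed. The sketch below is a hypothetical pre-flight check, not an SDK feature: the extension sets are copied from the table above, which may not be exhaustive, and the example paths are made up.

```python
from pathlib import Path

# Extensions copied from the table above, grouped by category.
SUPPORTED_EXTENSIONS = {
    "text": {".txt", ".md", ".csv", ".html"},
    "documents": {".pdf", ".docx"},
    "code": {".py", ".js", ".ts", ".java", ".go", ".rs"},
    "data": {".json", ".xml", ".yaml"},
}


def file_category(path):
    """Return the table category for a path, or None if its extension is not listed."""
    suffix = Path(path).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None


def split_uploadable(paths):
    """Partition paths into (listed, unlisted) by file extension."""
    listed = [p for p in paths if file_category(p) is not None]
    unlisted = [p for p in paths if file_category(p) is None]
    return listed, unlisted


ok, skipped = split_uploadable(["docs/guide.PDF", "src/main.rs", "assets/logo.psd"])
print("Upload:", ok)
print("Skip:", skipped)
```

Matching is case-insensitive, so `guide.PDF` is treated the same as `guide.pdf`.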
4. Querying with File Search
4.1 Using File Search as a Tool in generateContent
File Search is passed as a tool to generateContent. The model autonomously decides when to invoke the search based on whether the query requires external knowledge:
from google import genai
from google.genai import types

client = genai.Client()

# Query using File Search tool
response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Summarize the authentication flow described in our architecture docs.",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=["fileSearchStores/my-store-id"]
                )
            )
        ]
    )
)

# The response includes the generated text
print("Answer:", response.text)

# Access grounding metadata: retrieved chunks and citations
if response.candidates[0].grounding_metadata:
    metadata = response.candidates[0].grounding_metadata
    print(f"\nRetrieved {len(metadata.grounding_chunks)} chunks")
    for chunk in metadata.grounding_chunks:
        print(f" Source: {chunk.retrieved_context.title}")
        print(f" Text: {chunk.retrieved_context.text[:100]}...")
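Several chunks often come from the same document, so a reader-facing citation list usually wants one entry per source. The following is a minimal sketch of that deduplication. It relies only on the `retrieved_context.title` and `retrieved_context.text` attributes used above, and it runs against stand-in objects rather than a live response; `format_citations` is a hypothetical helper.

```python
from types import SimpleNamespace


def format_citations(grounding_chunks, snippet_length=60):
    """Return one numbered citation line per distinct source title."""
    first_snippets = {}
    for chunk in grounding_chunks:
        context = chunk.retrieved_context
        # Keep only the first snippet seen for each title.
        first_snippets.setdefault(context.title, context.text[:snippet_length])
    return [
        f"[{i}] {title}: {snippet}"
        for i, (title, snippet) in enumerate(first_snippets.items(), 1)
    ]


def make_chunk(title, text):
    """Build a stand-in object shaped like a grounding chunk (not real API output)."""
    return SimpleNamespace(retrieved_context=SimpleNamespace(title=title, text=text))


fake_chunks = [
    make_chunk("Architecture Guide v2.1", "Tokens are issued by the auth service."),
    make_chunk("Architecture Guide v2.1", "Refresh tokens expire after 30 days."),
    make_chunk("Security Policy", "All endpoints require TLS."),
]
citations = format_citations(fake_chunks)
for line in citations:
    print(line)
```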
4.2 Interactions API Variant
The Interactions API uses a slightly different tool specification format with type discriminators:
from google import genai

client = genai.Client()

# File Search via the Interactions API
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="What deployment strategies does our documentation recommend?",
    tools=[
        {
            "type": "file_search",
            "file_search_store_names": ["fileSearchStores/my-store-id"]
        }
    ]
)
print("Answer:", interaction.output_text)

# Inspect steps to see the retrieval
for step in interaction.steps:
    print(f" Step type: {step.type}")
    if step.type == "file_search_call":
        print(f" Query: {step.file_search_call.query}")
    elif step.type == "file_search_result":
        print(f" Results: {len(step.file_search_result.chunks)} chunks")
5. Combining File Search with Generation
5.1 Structured Outputs + File Search
Combine File Search with structured output to get type-safe responses grounded in your documents:
import json

from google import genai
from google.genai import types

client = genai.Client()

# Define a schema for the response
summary_schema = types.Schema(
    type="object",
    properties={
        "title": types.Schema(type="string", description="Document title"),
        "key_points": types.Schema(
            type="array",
            items=types.Schema(type="string"),
            description="Key points from the document"
        ),
        "confidence": types.Schema(type="number", description="Confidence score 0-1")
    },
    required=["title", "key_points", "confidence"]
)

response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Extract the top 5 key points from the security policy document.",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=["fileSearchStores/security-docs"]
                )
            )
        ],
        response_mime_type="application/json",
        response_schema=summary_schema
    )
)

# Parse the JSON response and print each field
result = json.loads(response.text)
print(f"Title: {result['title']}")
print(f"Confidence: {result['confidence']}")
for i, point in enumerate(result['key_points'], 1):
    print(f" {i}. {point}")
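Even with a response schema, it is prudent to validate the parsed JSON before passing it downstream. The sketch below checks the three required fields and the 0-1 confidence range from the schema above. `DocumentSummary` and `parse_summary` are hypothetical names, and the sample string is hand-written to stand in for `response.text`.

```python
import json
from dataclasses import dataclass


@dataclass
class DocumentSummary:
    title: str
    key_points: list
    confidence: float


def parse_summary(raw_json):
    """Parse and sanity-check a JSON string shaped like the schema above."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = {"title", "key_points", "confidence"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(data["key_points"], list):
        raise ValueError("key_points must be a list")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    key_points = [str(point) for point in data["key_points"]]
    return DocumentSummary(str(data["title"]), key_points, confidence)


# Hand-written sample standing in for response.text.
sample = '{"title": "Security Policy", "key_points": ["Rotate keys"], "confidence": 0.9}'
summary = parse_summary(sample)
print(summary)
```

Raising `ValueError` on a malformed payload lets the caller decide whether to retry the request or surface the error.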
5.2 Multi-Store Queries
Query multiple stores simultaneously to cross-reference different document collections:
from google import genai
from google.genai import types

client = genai.Client()

# Search across multiple stores at once
response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Compare the authentication approaches in our API docs vs our security policy.",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[
                        "fileSearchStores/api-docs",
                        "fileSearchStores/security-policy"
                    ]
                )
            )
        ]
    )
)
print(response.text)

# Check which stores contributed to the answer
if response.candidates[0].grounding_metadata:
    for chunk in response.candidates[0].grounding_metadata.grounding_chunks:
        print(f" Source: {chunk.retrieved_context.title}")
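A comparison question is only well grounded if both collections actually contributed chunks. The sketch below counts chunks per document title and flags expected titles that never appear. It assumes titles are a usable proxy for the source collection, and it uses hypothetical titles and stand-in objects rather than a live response.

```python
from collections import Counter
from types import SimpleNamespace


def coverage_by_title(grounding_chunks, expected_titles):
    """Map each expected title to the number of retrieved chunks citing it."""
    counts = Counter(chunk.retrieved_context.title for chunk in grounding_chunks)
    return {title: counts.get(title, 0) for title in expected_titles}


def make_chunk(title):
    """Build a stand-in object shaped like a grounding chunk (not real API output)."""
    return SimpleNamespace(retrieved_context=SimpleNamespace(title=title))


fake_chunks = [make_chunk("API Reference"), make_chunk("API Reference")]
coverage = coverage_by_title(fake_chunks, ["API Reference", "Security Policy"])
silent = [title for title, count in coverage.items() if count == 0]
print(coverage)
print("No chunks from:", silent)
```

If a title shows up in `silent`, the answer was generated without evidence from that source, and the comparison may be one-sided.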
Next in the Gemini SDK Track
In Part 9: The Interactions API (Beta), we’ll explore the architectural shift from stateless generateContent to stateful server-managed conversations — the steps timeline, previous_interaction_id chaining, polymorphic response formats, and streaming events.