We explain how to integrate Ceramic Search with LangChain to build RAG pipelines and ground agent responses in high-quality web search results.

Installation

pip install langchain langchain-openai langchain-ceramic

API keys

Create a free Ceramic account to get an API key, then set the key as an environment variable:
export CERAMIC_API_KEY="your-api-key"
Also set any other API keys you need, for example for OpenAI:
export OPENAI_API_KEY="your-api-key"
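
Before running the examples, it can help to confirm that both keys are actually visible to the Python process. The following is a minimal sketch using only the standard library; the helper name `missing_keys` is hypothetical and not part of `langchain-ceramic`.

```python
import os

# Environment variables used by the examples on this page
REQUIRED_KEYS = ("CERAMIC_API_KEY", "OPENAI_API_KEY")


def missing_keys(names=REQUIRED_KEYS, env=None):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required API keys are set.")
```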

Example usage

Tool calling

LangChain agents can call Ceramic Search as a tool to support their responses with sources from the web. Ceramic uses lexical (keyword-based) search; see Best Practices for how to use Ceramic Search most effectively. When the agent invokes Ceramic Search through a tool call, the LLM automatically converts the natural language query into a keyword-based query optimized for search.
from langchain_ceramic import CeramicSearch
from langchain_openai import ChatOpenAI
from langchain.agents import create_agent

# Initialize the Ceramic search tool and retrieve a maximum of five results
ceramic_search = CeramicSearch(max_results=5)

# Initialize the agent with the Ceramic search tool
agent = create_agent(
    model=ChatOpenAI(model="gpt-5.5"), 
    tools=[ceramic_search],
    system_prompt="You are a helpful research assistant. Use web search to find accurate, up-to-date information."
)

# Generate a response using natural language queries
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Tell me about California rental laws."}]}
)
print(result["messages"][-1].content)

RAG pipeline

Use the CeramicSearchRetriever retriever to obtain relevant documents for RAG pipelines. Because Ceramic uses lexical search, the pipeline first uses an LLM to convert the natural language query into keywords and retrieves with those keywords. The original natural language query is still passed through to the answer prompt.
from langchain_ceramic import CeramicSearchRetriever
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

# Initialize the LLM and Ceramic Search Retriever
llm = ChatOpenAI(model="gpt-5.5")
retriever = CeramicSearchRetriever(k=5)

# Convert the natural language query to keywords before retrieval
keyword_prompt = PromptTemplate.from_template(
    """
    Rewrite the following question as a 2-8 word keyword query for a lexical search engine.
    
    Rules:
    - Extract specific entities, topics, locations, and dates
    - Replace conversational phrasing with concrete keywords
    - Do not include uninformative words such as articles (the, a, an). Avoid prepositions (on, about, in, for, of, at, by, with) unless they are within established phrases or names (United States of America, Into the Wild).
    - Include relevant synonyms explicitly when terminology is ambiguous
    - Keep word order meaningful (`house cat` and `cat house` return different results)
    - Good keyword query examples:
        - "2026 Super Bowl halftime performer"
        - "climate change effects global warming impact"
        - "beginner investing strategies stocks bonds basics"
    
    Return only the keyword query with no explanation.

    Question: {query}
    """
)
keyword_chain = keyword_prompt | llm | StrOutputParser()

# Format the prompt with the query and retrieved search context
answer_prompt = ChatPromptTemplate.from_template(
    "Answer the query based on the provided context.\n\nQuery: {query}\n\nContext: {context}"
)

# Create the complete chain, which involves keyword_chain and passes the formatted prompt to the LLM
# RunnablePassthrough() preserves the natural language query for the answer prompt
chain = (
    {"query": RunnablePassthrough(), "context": keyword_chain | retriever}
    | answer_prompt
    | llm
    | StrOutputParser()
)

# Generate the response
answer = chain.invoke("What are the latest AI chip export restrictions?")
print(answer)
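
The keyword prompt above asks for a bare 2-8 word query, but model output is not guaranteed to follow that instruction exactly. As a sketch under that assumption, a small post-processing step could normalize the output before retrieval. The helper `clean_keyword_query` is hypothetical and not part of LangChain or `langchain-ceramic`.

```python
def clean_keyword_query(text, max_words=8):
    """Strip surrounding whitespace and quotes, then keep at most `max_words` words."""
    # Remove outer whitespace first, then any wrapping quote or backtick characters
    words = text.strip().strip("\"'`").split()
    return " ".join(words[:max_words])


print(clean_keyword_query('  "AI chip export restrictions 2026"\n'))
```

If this step is useful, it could probably be placed between the two stages as `keyword_chain | clean_keyword_query | retriever`, since LangChain's pipe operator wraps plain functions as runnables.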
Each retrieved Document has:
  • page_content: the result description
  • metadata["title"]: page title
  • metadata["url"]: source URL
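
These fields can be combined into a context string that keeps each source attributable. The following is a minimal sketch: `format_docs` is a hypothetical helper, and the sample objects are made-up stand-ins that only mimic the `page_content` and `metadata` attributes listed above.

```python
from types import SimpleNamespace


def format_docs(docs):
    """Join documents into a numbered context string with title and URL."""
    blocks = []
    for i, doc in enumerate(docs, start=1):
        title = doc.metadata.get("title", "Untitled")
        url = doc.metadata.get("url", "")
        blocks.append(f"[{i}] {title} ({url})\n{doc.page_content}")
    return "\n\n".join(blocks)


# Stand-in objects with made-up data, used here instead of real search results
sample_docs = [
    SimpleNamespace(
        page_content="Example description one.",
        metadata={"title": "Page One", "url": "https://example.com/1"},
    ),
    SimpleNamespace(
        page_content="Example description two.",
        metadata={"title": "Page Two", "url": "https://example.com/2"},
    ),
]

print(format_docs(sample_docs))
```

In the RAG chain, a function like this could follow the retriever (for example `keyword_chain | retriever | format_docs`) so that the answer prompt receives numbered sources rather than raw Document objects.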

Async usage

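Both CeramicSearchRetriever and CeramicSearch support async invocation via ainvoke. Call it from inside a coroutine, or from an environment that allows top-level await, such as a notebook: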
docs = await retriever.ainvoke("California rental laws")
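
Async invocation is most useful when several searches can run concurrently. The following is a sketch of that pattern with `asyncio.gather`; `StubRetriever` is a hypothetical stand-in that only mimics the `ainvoke` coroutine so the example runs without network access, and `search_many` is likewise an illustrative helper.

```python
import asyncio


class StubRetriever:
    """Hypothetical stand-in exposing an `ainvoke` coroutine like a retriever."""

    async def ainvoke(self, query):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [f"result for: {query}"]


async def search_many(retriever, queries):
    """Run one `ainvoke` call per query concurrently; map each query to its results."""
    results = await asyncio.gather(*(retriever.ainvoke(q) for q in queries))
    return dict(zip(queries, results))


queries = ["California rental laws", "AI chip export restrictions"]
results = asyncio.run(search_many(StubRetriever(), queries))
for query, docs in results.items():
    print(query, "->", docs)
```

With a real retriever, the stub would be replaced by a CeramicSearchRetriever instance; `asyncio.gather` returns results in the same order as the queries passed in.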

Parameters

CeramicSearch

Parameter    Type        Description                                                  Default
api_key      str | None  Ceramic API key (falls back to CERAMIC_API_KEY env var)      None
max_results  int         Maximum number of results to include in the response string  5

CeramicSearchRetriever

Parameter  Type        Description                                              Default
api_key    str | None  Ceramic API key (falls back to CERAMIC_API_KEY env var)  None
k          int         Maximum number of results to return                      10
