LangChain with Continuous Learning
Learn how to finetune each agent in your workflow without changing a line of code.
LangChain (and LangGraph) make it easy to wire up multi-step agent workflows. The hard part is keeping quality high as your workflow evolves—without turning every prompt tweak into a months-long finetuning project.
This guide shows a simple pattern: treat each LangGraph node as its own learning surface. We route different parts of a user request to specialist agents, log the traffic per node, and continuously improve each node’s model without rewriting your application code.
What you’ll build
- 3 specialist agents (GitHub, Slack, Notion)
- A router model that decides which specialists to call
- A synthesizer model that merges outputs into a final answer
- A clean separation of logs so each component can be optimized independently
Why per-node finetuning matters
If you optimize the entire agent workflow as one giant model, your training data becomes a soup of unrelated tasks: retrieval, routing, synthesis, formatting, and domain knowledge. Per-node containers keep the objective focused—so a small model can get very good at one job.
Create Maniac containers
A Maniac container represents a specific task in your workflow. In a LangGraph application, we create one container per LLM-backed node in the graph. Each container tracks its inference logs separately, so every model is optimized on as narrow a task as possible.
In this example, we create 5 containers:
- github: answers questions about the codebase
- slack: answers questions that live in the Slack discussions
- notion: answers questions about internal documentation
- router: routes queries to the specialist containers above
- synthesize: merges the specialists' results into a final answer
Maniac will produce an optimized model for each container based on the task it is assigned to and the actual data that runs through the workflow.
from maniac import Maniac

maniac = Maniac()

for label in ["github", "slack", "notion", "router", "synthesize"]:
    maniac.containers.create(label=label, initial_model="openai/gpt-5.2")

Create LangChain agents
We create a tool and a LangChain agent for each of the github, slack, and notion containers, and a plain chat model for the router and the synthesizer.
At a high level:
- Each specialist agent calls an OpenAI-compatible API at https://inference.maniac.ai/v1 (a raw request against this endpoint is sketched below)
- The only thing that changes per agent is the model name (maniac:github, maniac:slack, etc.)
- Maniac uses the container name to keep logs, evals, and future optimized models scoped correctly
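Because the endpoint is OpenAI-compatible, you can sanity-check a container with the plain OpenAI Python client before involving LangChain at all. This is a minimal sketch: the base URL, API key variable, and model name come from this guide, while the sample question is only illustrative.

```python
import os

from openai import OpenAI

# Any OpenAI-compatible client works; only base_url, api_key, and model change.
client = OpenAI(
    base_url="https://inference.maniac.ai/v1",
    api_key=os.environ["MANIAC_API_KEY"],
)

response = client.chat.completions.create(
    model="maniac:github",  # swap to maniac:slack / maniac:notion for the other specialists
    messages=[{"role": "user", "content": "Where is request authentication implemented?"}],
)
print(response.choices[0].message.content)
```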
import os

from langchain.agents import create_agent
from langchain_openai import ChatOpenAI
from langchain.tools import tool

from tools import search_code_impl, search_docs_impl, search_discussions_impl


@tool
def search_code(query: str) -> str:
    """
    Search the GitHub codebase for relevant implementation details.
    """
    return search_code_impl(query, top_k=3)


github_agent = create_agent(
    model=ChatOpenAI(
        model="maniac:github",
        api_key=os.environ["MANIAC_API_KEY"],
        base_url="https://inference.maniac.ai/v1",
    ),
    tools=[search_code],
    system_prompt="You are a codebase expert. Use tools to answer implementation questions.",
)
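
# Optional sanity check before wiring the graph: invoke the GitHub specialist
# directly. This is an illustrative sketch; the sample question is a placeholder
# and the block can be removed once the full workflow below exists.
sanity = github_agent.invoke(
    {"messages": [{"role": "user", "content": "Where do we validate API tokens?"}]}
)
print(sanity["messages"][-1].content)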
@tool
def search_discussions(query: str) -> str:
    """
    Search the Slack discussions for prior decisions/context.
    """
    return search_discussions_impl(query, top_k=3)


slack_agent = create_agent(
    model=ChatOpenAI(
        model="maniac:slack",
        api_key=os.environ["MANIAC_API_KEY"],
        base_url="https://inference.maniac.ai/v1",
    ),
    tools=[search_discussions],
    system_prompt="You are a discussions expert. Use tools to answer 'what did we decide?' questions.",
)


@tool
def search_docs(query: str) -> str:
    """
    Search the Notion documentation for policies/process.
    """
    return search_docs_impl(query, top_k=3)


notion_agent = create_agent(
    model=ChatOpenAI(
        model="maniac:notion",
        api_key=os.environ["MANIAC_API_KEY"],
        base_url="https://inference.maniac.ai/v1",
    ),
    tools=[search_docs],
    system_prompt="You are a documentation expert. Use tools to answer policy/process questions.",
)


# The router and synthesizer are plain chat models (no tools).
router_llm = ChatOpenAI(
    model="maniac:router",
    api_key=os.environ["MANIAC_API_KEY"],
    base_url="https://inference.maniac.ai/v1",
    temperature=0.0,
)

synthesizer_llm = ChatOpenAI(
    model="maniac:synthesize",
    api_key=os.environ["MANIAC_API_KEY"],
    base_url="https://inference.maniac.ai/v1",
)

Create LangGraph router state + schemas
We’ll represent the workflow state as a typed dictionary. The router returns a list of targeted sub-questions (one per specialist), then we fan out to the specialists, and finally we synthesize the results.
If you’re using structured outputs for routing, keep the schema small and stable—routing is a great candidate for distillation, because you want it to be fast and consistent.
import operator
from typing import Annotated, Literal, TypedDict

from langgraph.types import Send
from pydantic import BaseModel, Field


class AgentInput(TypedDict):
    query: str


class AgentOutput(TypedDict):
    source: str
    result: str


class Classification(TypedDict):
    source: Literal["github", "notion", "slack"]
    query: str


class RouterState(TypedDict):
    query: str
    classifications: list[Classification]
    results: Annotated[list[AgentOutput], operator.add]
    final_answer: str


class ClassificationResult(BaseModel):
    classifications: list[Classification] = Field(
        description="List of agents to invoke with targeted sub-questions"
    )


def classify_query(state: RouterState) -> dict:
    structured = router_llm.with_structured_output(ClassificationResult)
    result = structured.invoke(
        [
            {
                "role": "system",
                "content": (
                    "Route the user query to the relevant specialists.\n"
                    "- github: code/implementation details\n"
                    "- notion: documentation/policies\n"
                    "- slack: discussions/decisions\n"
                    "Return ONLY relevant sources with a targeted sub-question per source."
                ),
            },
            {"role": "user", "content": state["query"]},
        ]
    )
    return {"classifications": result.classifications}


def route_to_agents(state: RouterState) -> list[Send]:
    # Fan out: one Send per classification, addressed to the matching node.
    return [Send(c["source"], {"query": c["query"]}) for c in state["classifications"]]


def query_github(state: AgentInput) -> dict:
    res = github_agent.invoke(
        {"messages": [{"role": "user", "content": state["query"]}]}
    )
    return {"results": [{"source": "github", "result": res["messages"][-1].content}]}


def query_notion(state: AgentInput) -> dict:
    res = notion_agent.invoke(
        {"messages": [{"role": "user", "content": state["query"]}]}
    )
    return {"results": [{"source": "notion", "result": res["messages"][-1].content}]}


def query_slack(state: AgentInput) -> dict:
    res = slack_agent.invoke(
        {"messages": [{"role": "user", "content": state["query"]}]}
    )
    return {"results": [{"source": "slack", "result": res["messages"][-1].content}]}


def synthesize_results(state: RouterState) -> dict:
    if not state["results"]:
        return {"final_answer": "No results from any specialist."}
    formatted = "\n\n".join(
        f"From {r['source'].title()}:\n{r['result']}" for r in state["results"]
    )
    synth = synthesizer_llm.invoke(
        [
            {
                "role": "system",
                "content": f"Synthesize results to answer: {state['query']}",
            },
            {"role": "user", "content": formatted},
        ]
    )
    return {"final_answer": synth.content}

Create the LangGraph workflow
This is the wiring: classify chooses which specialists to call, then the specialists run in parallel, and synthesize merges everything into a final response.
In production, this shape helps a lot with continuous learning:
- Routing failures are isolated to the router container
- Hallucinations are often isolated to one specialist container
- Bad synthesis is isolated to the synthesizer container
from langgraph.graph import StateGraph, START, END
from langgraph.types import Send

workflow = (
    StateGraph(RouterState)
    .add_node("classify", classify_query)
    .add_node("github", query_github)
    .add_node("notion", query_notion)
    .add_node("slack", query_slack)
    .add_node("synthesize", synthesize_results)
    .add_edge(START, "classify")
    .add_conditional_edges("classify", route_to_agents, ["github", "notion", "slack"])
    .add_edge("github", "synthesize")
    .add_edge("notion", "synthesize")
    .add_edge("slack", "synthesize")
    .add_edge("synthesize", END)
    .compile()
)

Run the workflow
Once this runs in real traffic, Maniac will accumulate logs per container. That gives you clean training/eval data for each node without extra instrumentation.
out = workflow.invoke({"query": "How do I authenticate API requests?"})
print(out["final_answer"])

Next: turn this into continuous learning
After you’ve shipped the first version, the goal is to remove guesswork:
- Define evals for each node (e.g., routing accuracy, retrieval relevance, synthesis completeness); a minimal routing-accuracy sketch follows this list.
- Optimize / distill each container’s model against those evals.
- Deploy the best model back into the same container name so your app code stays unchanged.
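To make the first step concrete, here is a minimal sketch of a routing-accuracy eval. It assumes you have exported the router container's logged decisions to a JSONL file with predicted_sources and human-labeled expected_sources fields; the file name and field names are hypothetical, not part of the Maniac API.

```python
import json

# Hypothetical export of router logs: one JSON object per routed query, e.g.
# {"query": "...", "predicted_sources": ["github"], "expected_sources": ["github", "slack"]}
def routing_accuracy(log_path: str) -> float:
    exact, total = 0, 0
    with open(log_path) as f:
        for line in f:
            record = json.loads(line)
            predicted = set(record["predicted_sources"])
            expected = set(record["expected_sources"])
            if predicted == expected:
                exact += 1
            total += 1
    return exact / total if total else 0.0

print(f"router exact-match accuracy: {routing_accuracy('router_logs.jsonl'):.1%}")
```

With a number like this per node, optimizing or distilling a container's model becomes a measurable change rather than a guess, and deploying the winner back under the same maniac:router name leaves your LangGraph code untouched.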
For more details, see: