Hacker News | kacper-vstorm's comments

Author here. pydantic-deep is an open-source AI agent framework built on Pydantic AI. This release adds ACP (Agent Client Protocol) support so agents can run inside editors like Zed with streaming and tool visibility.

Other highlights: subagents are now full deep agents by default (filesystem, web, memory), thinking is enabled by default, Anthropic prompt caching is on, and there are 5 new lifecycle hooks for session tracking and error handling.

CHANGELOG: https://github.com/vstorm-co/pydantic-deepagents/blob/main/C...


We maintain 5 open-source libraries built on Pydantic AI: a guardrails/shields library, a subagent delegation library, a context summarization library, a filesystem backend, and a full-stack template.

When Pydantic AI shipped capabilities in v1.71+, we migrated all of them. The biggest lesson: we deleted our entire middleware abstraction layer (MiddlewareAgent, MiddlewareChain, ParallelMiddleware, pipeline compiler, config loaders). All replaced by subclassing AbstractCapability and overriding a few methods.

The article walks through what capabilities/hooks/agent specs actually are, with before/after code from each of our repos.

Key architectural insight: capabilities compose automatically. Before-hooks fire in registration order, after-hooks reverse, wrap-hooks nest as middleware. You don't build composition logic anymore -- the framework handles it.
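That ordering can be illustrated with plain functions (a toy model of the described behavior, not Pydantic AI's actual `AbstractCapability` API):

```python
# Toy model of capability hook ordering: before-hooks fire in
# registration order, after-hooks in reverse, so later capabilities
# nest inside earlier ones like middleware.
calls = []

def make_hooks(name):
    def before():
        calls.append(f"{name}:before")
    def after():
        calls.append(f"{name}:after")
    return before, after

def run_with_capabilities(names):
    hooks = [make_hooks(n) for n in names]
    for before, _ in hooks:           # registration order
        before()
    calls.append("agent:run")
    for _, after in reversed(hooks):  # reverse order
        after()

run_with_capabilities(["logging", "guardrails"])
print(calls)
# ['logging:before', 'guardrails:before', 'agent:run',
#  'guardrails:after', 'logging:after']
```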

Happy to answer questions about the migration or the specific libraries.

- Shields/guardrails: https://github.com/vstorm-co/pydantic-ai-middleware
- Subagents: https://github.com/vstorm-co/pydantic-ai-subagents
- Summarization: https://github.com/vstorm-co/pydantic-ai-summarization


Everyone has opinions about AI frameworks. Few people show code.

We maintain full-stack-ai-agent-template — a production template for AI/LLM applications with FastAPI, Next.js, and 75+ configuration options. One of those options is the AI framework. You pick from Pydantic AI, LangChain, LangGraph, or CrewAI during setup, and the template generates the exact same chat application with the exact same API, database schema, WebSocket streaming, and frontend. Only the AI layer differs.

This gave us a unique opportunity: a controlled comparison. Same functionality, same tests, same deployment — four implementations.


TL;DR

    BrowseComp is a web browsing benchmark, not a knowledge or reasoning test. It evaluates whether AI agents can navigate the open web to find specific, obscure information.
    Questions are “inverted” - authors start with a fact and work backwards to create a question that’s easy to verify but extremely hard to solve through search.
    Brute-force search doesn’t work. The search space is deliberately massive - thousands of papers, matches, events - making systematic enumeration impractical.
    Grading uses an LLM judge with a confidence score, creating an interesting meta-layer where one model evaluates another’s certainty.
    This benchmark reveals the gap between “can answer questions” and “can do research” - the exact capability that separates chatbots from useful AI agents.
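The grading scheme can be sketched roughly as follows; the prompt format and field names are illustrative assumptions rather than BrowseComp's actual harness, and a stub stands in for the judge model:

```python
def grade(question, reference_answer, model_answer, judge):
    """Ask a judge model whether the answer matches the reference,
    and record the graded model's stated confidence (0-100)."""
    verdict = judge(
        f"Question: {question}\n"
        f"Reference answer: {reference_answer}\n"
        f"Model answer: {model_answer}\n"
        "Reply 'correct' or 'incorrect', plus the model's stated confidence."
    )
    return {"correct": verdict["label"] == "correct",
            "confidence": verdict["confidence"]}

# Stub standing in for a real LLM call.
def stub_judge(prompt):
    return {"label": "correct", "confidence": 85}

result = grade("Which paper ...?", "Smith et al.", "Smith et al.", stub_judge)
print(result)  # {'correct': True, 'confidence': 85}
```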


## Quick Install

```bash
pip install pydantic-deep[cli]
pydantic-deep chat
```

## What is this?

The pydantic-deep CLI wraps the full [pydantic-deep](https://github.com/vstorm-co/pydantic-deepagents) agent framework into a terminal tool that works like Claude Code or LangChain's Deep Agents CLI. It gives an LLM full access to your local filesystem, shell, planning tools, and skills — so it can autonomously execute complex coding tasks.

Unlike simple chat wrappers, pydantic-deep implements the *deep agent architecture*: planning, subagent delegation, persistent memory, and context management — the same patterns powering Claude Code, Manus AI, and Devin.

## Usage

### Interactive Chat

```bash
pydantic-deep chat
pydantic-deep chat --model anthropic:claude-sonnet-4-20250514
```

Features: 17 slash commands (`/help`, `/compact`, `/context`, `/model`, ...), colored diff viewer for file approvals, visual progress bar, tool call timing, @file mentions, and Ctrl+V image paste.

### Non-Interactive (Benchmark Mode)

```bash
# stdout = response only (clean for piping), stderr = diagnostics
pydantic-deep run "Fix the failing tests in src/"
pydantic-deep run "Create a REST API with FastAPI" --model openai:gpt-4.1
pydantic-deep run "Refactor the auth module" --quiet
```

### Docker Sandbox

Run in an isolated Docker container:

```bash
pydantic-deep run "Build a web scraper" --sandbox --runtime python-web
pydantic-deep chat --sandbox --runtime python-datascience
```


I'm Kacper from Vstorm. We build AI agents for production and open-source our tooling as Pydantic AI libraries.

pydantic-ai-backend (github.com/vstorm-co/pydantic-ai-backend) provides file storage and code execution backends for Pydantic AI agents. We've supported Docker sandboxes since day one. In v0.1.12, we added Daytona as an alternative.

A few technical points:

1. *The sub-90ms number.* Daytona pre-provisions VMs and keeps them warm. "Create sandbox" is essentially "assign a pre-warmed VM" — no image pulls, no container creation, no daemon startup. The bottleneck is network latency to their API, not infrastructure setup.

2. *BaseSandbox abstraction.* We extracted an abstract base class that both Docker and Daytona implement: `execute()`, `_read_bytes()`, `write()`, `edit()`, `is_alive()`, `stop()`. Adding Daytona meant implementing 4 methods. Zero changes to agent code or toolsets.

3. *Native file APIs.* Docker pipes file I/O through shell commands (`docker exec cat ...`). Daytona has direct upload/download APIs via their SDK. Measurably faster for binary files.

4. *CompositeBackend.* You can route by path prefix — read source files from local filesystem (fast, no overhead), execute untrusted code in Daytona (isolated). The agent doesn't know — it calls `read_file()` and `execute()`, and the composite routes each call to the right backend.

5. *Docker isn't going away.* For local development and self-hosted setups, Docker is still the right choice (free, custom runtimes, no external dependency). Daytona is for cloud-native deployments where Docker daemon access is impractical — CI/CD, serverless, Kubernetes without privileged containers.
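A sketch of the `BaseSandbox` interface from point 2, with a toy in-memory implementation; the method signatures are assumptions for illustration, not the library's exact API:

```python
from abc import ABC, abstractmethod

class BaseSandbox(ABC):
    """Interface that both Docker and Daytona backends implement.
    Agent toolsets depend only on this class, so adding a provider
    never touches agent code."""

    @abstractmethod
    def execute(self, command: str) -> str: ...

    @abstractmethod
    def _read_bytes(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, content: bytes) -> None: ...

    @abstractmethod
    def is_alive(self) -> bool: ...

    def read_file(self, path: str) -> str:
        # Shared behavior lives in the base class.
        return self._read_bytes(path).decode("utf-8")

class InMemorySandbox(BaseSandbox):
    """Toy backend showing how little a new provider must implement."""
    def __init__(self):
        self.files = {}
    def execute(self, command):
        return f"ran: {command}"
    def _read_bytes(self, path):
        return self.files[path]
    def write(self, path, content):
        self.files[path] = content
    def is_alive(self):
        return True

sb = InMemorySandbox()
sb.write("hello.txt", b"hi")
print(sb.read_file("hello.txt"))  # hi
```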
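The `CompositeBackend` routing described in point 4 can be sketched like this (the prefix-matching logic is an illustrative assumption; the library's actual routing rules may differ):

```python
class CompositeBackend:
    """Routes each call to a backend by longest matching path prefix.
    The caller just uses read_file()/execute() and never sees the split."""
    def __init__(self, routes, default):
        # routes: mapping of path prefix -> backend; longest prefix wins
        self.routes = sorted(routes.items(), key=lambda kv: -len(kv[0]))
        self.default = default

    def _pick(self, path):
        for prefix, backend in self.routes:
            if path.startswith(prefix):
                return backend
        return self.default

    def read_file(self, path):
        return self._pick(path).read_file(path)

class DictBackend:
    """Stand-in for a real backend (local filesystem, Daytona, ...)."""
    def __init__(self, name, files):
        self.name, self.files = name, files
    def read_file(self, path):
        return f"[{self.name}] {self.files[path]}"

local = DictBackend("local", {"/src/app.py": "print('hi')"})
remote = DictBackend("daytona", {"/tmp/run.py": "untrusted()"})
backend = CompositeBackend({"/src": local, "/tmp": remote}, default=local)

print(backend.read_file("/src/app.py"))  # [local] print('hi')
print(backend.read_file("/tmp/run.py"))  # [daytona] untrusted()
```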

Install: `pip install pydantic-ai-backend[daytona]`

Happy to discuss the architecture or the backend abstraction design.


Great post!


Hey HN, I've built an open-source CLI tool that generates production-ready full-stack templates for AI/LLM applications. It's designed to cut down on boilerplate so you can focus on building features – think chatbots, assistants, or ML-powered SaaS. Install via `pip install fastapi-fullstack`, then run `fastapi-fullstack new` for an interactive wizard to customize your project. Key features:

- Backend: FastAPI with Pydantic v2, async APIs, auth (JWT/OAuth/API keys), databases (async PostgreSQL/MongoDB/SQLite with Alembic), background tasks (Celery/Taskiq/ARQ), rate limiting, webhooks, Redis caching
- Frontend: Optional Next.js 15 (App Router, React 19, Tailwind, dark mode, i18n) with real-time WebSocket chat UI
- AI/LLM: Integrated PydanticAI for type-safe agents, tool calling, streaming responses, and conversation persistence
- Observability: Logfire for tracing everything from API requests to agent runs; plus Sentry/Prometheus
- CLI: Django-style management commands with auto-discovery (e.g., user creation, DB seeding, custom scripts)
- DevOps: Docker Compose, CI/CD templates (GitHub Actions/GitLab), Kubernetes manifests
- 20+ configurable integrations to pick what you need – no bloat

Repo: https://github.com/vstorm-co/full-stack-fastapi-nextjs-llm-t...

Inspired by tiangolo's FastAPI templates and others, but with a stronger AI focus, modern frontend, and more enterprise-grade options out of the box. Screenshots, demo GIFs, architecture diagrams, and full docs are in the README.

This has sped up my own projects a lot – curious about yours: what pain points do you hit with full-stack AI setups? Any features to add (e.g., more LLM frameworks like LangChain coming soon)? Contributions welcome! Thanks


r/PydanticAI • VanillaOk4593 • Pydantic-DeepAgents: Autonomous Agents with Planning, File Ops, and More in Python

Excited to share a new open-source project I just released: Pydantic-DeepAgents – a framework that extends Pydantic-AI with powerful “deep agent” capabilities, making it easy to build production-ready autonomous agents while keeping everything fully type-safe and lightweight.

Repo: https://github.com/vstorm-co/pydantic-deepagents

What it adds to Pydantic-AI

It brings advanced agent patterns directly into the Pydantic-AI ecosystem:

    Built-in planning loops (TodoToolset)

    Filesystem access and file upload handling

    Subagent delegation

    Extensible skills system (define new behaviors with simple markdown prompts)

    Multiple state backends: in-memory, persistent filesystem, secure DockerSandbox, and CompositeBackend

    Automatic conversation summarization for long sessions

    Human-in-the-loop confirmation workflows

    Full streaming support

    Native structured outputs via Pydantic models (output_type)

Key features list:

    Multiple Backends: StateBackend, FilesystemBackend, DockerSandbox, CompositeBackend

    Rich Toolsets: TodoToolset, FilesystemToolset, SubAgentToolset, SkillsToolset

    File Uploads: run_with_files() and deps.upload_file()

    Skills System: markdown-based skill definitions

    Structured Output: type-safe Pydantic responses

    Context Management: auto-summarization

    Human-in-the-Loop: built-in approval steps

    Streaming: token-by-token responses
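The markdown-based skills idea can be sketched like this; the parsing convention (first heading as the skill name, body as the prompt) is an assumption for illustration, not the framework's actual format:

```python
import re

def parse_skill(markdown: str) -> dict:
    """Parse a markdown skill file: treat the first heading as the
    skill name and the remaining body as the prompt injected into
    the agent's context."""
    lines = markdown.strip().splitlines()
    match = re.match(r"#\s+(.*)", lines[0])
    name = match.group(1) if match else "unnamed"
    prompt = "\n".join(lines[1:]).strip()
    return {"name": name, "prompt": prompt}

skill = parse_skill("""
# code-review
Review the diff for bugs, style issues, and missing tests.
Respond with a numbered list of findings.
""")
print(skill["name"])  # code-review
```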
There’s a complete demo app in the repo that shows streaming UI, file uploads, reasoning traces, and human confirmation flows: https://github.com/vstorm-co/pydantic-deepagents/tree/main/e...

Quick demo video: https://drive.google.com/file/d/1hqgXkbAgUrsKOWpfWdF48cqaxRh...

Why it fits the Pydantic-AI philosophy

It stays true to Pydantic’s strengths – strong typing, validation, and simplicity – while adding the agent-specific tools many of us have been missing. Compared to heavier alternatives (LangChain, CrewAI, AutoGen), it’s deliberately minimal, easier to customize, and includes production-oriented extras like Docker sandboxing out of the box.

Would love feedback from the Pydantic-AI community – especially ideas on deeper integration with upcoming Pydantic features or new agent patterns! Stars, forks, issues, and PRs are very welcome.

Thanks!


I just released Pydantic-DeepAgents, an open-source Python framework built on top of Pydantic-AI for creating production-grade autonomous agents. It helps developers quickly build agents with features like planning, filesystem ops, subagent delegation, and extensible skills. Key features:

- Multiple Backends: StateBackend (in-memory), FilesystemBackend, DockerSandbox, CompositeBackend
- Rich Toolsets: TodoToolset, FilesystemToolset, SubAgentToolset, SkillsToolset
- File Uploads: upload files for agent processing with run_with_files() or deps.upload_file()
- Skills System: extensible skill definitions with markdown prompts
- Structured Output: type-safe responses with Pydantic models via output_type
- Context Management: automatic conversation summarization for long sessions
- Human-in-the-Loop: built-in support for human confirmation workflows
- Streaming: full streaming support for agent responses

There's a demo app example in the repo: https://github.com/vstorm-co/pydantic-deepagents/tree/main/e...

Quick demo video: https://drive.google.com/file/d/1hqgXkbAgUrsKOWpfWdF48cqaxRh...

Repo: https://github.com/vstorm-co/pydantic-deepagents (see the README for a screenshot overview)

Feedback and contributions welcome! If you're into AI agents or Python automation, I'd love to hear thoughts.

