DossierKit / AI Expert

AI / Agent / LLM Expert

Building intelligent systems and delightful experiences with AI.

Projects

Explore things I've built and shipped.

Experience

A look at my journey and impact.

Open Roles

I'm open to meaningful opportunities.

Contact: demo@example.com

Target Roles

Each role page is generated from roles.yaml and reassembles work, lab notes, and proof for a target role.

A fit for teams moving Agents from demos to production, covering runtime, tool-use, evaluation, permissions, and enterprise integration.

A fit for teams that need to turn production Agent feedback into preference data, evaluation sets, and training loops.

A fit for teams that need business process fluency, field integration, and LLM product engineering in one role.

Deep case studies covering architecture, collaboration, evaluation, and lessons.

2026

Turned production Agent bad cases into preference-data loops that can be evaluated, labeled, and used for training.

2025

Built an evaluable, auditable, scalable Agent runtime and tool-use layer for enterprise workflows.

Lab notes showing evaluation, data loops, and applied algorithm depth.

DPO

Used chosen/rejected preference data to improve enterprise Agent tool-use decisions.
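
As a rough illustration of how chosen/rejected pairs become a training signal, the standard per-pair DPO objective can be sketched in plain Python. This is a minimal sketch, not the pipeline used in the project: the function name `dpo_loss`, the `beta` default, and the scalar log-probability inputs are assumptions made for illustration.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how strongly the margin is scaled
) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    (for example, one tool-use decision) under the policy or the
    frozen reference model.
    """
    # How much more the policy prefers "chosen" over "rejected"
    # than the reference model does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    # -log(sigmoid(z)) equals softplus(-z); computed in a numerically stable form.
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

A loss of log 2 (about 0.693) means the policy has no preference yet relative to the reference; the loss falls as the policy favors the chosen response more strongly than the reference does.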

Replay Eval

Built reproducible Agent regression evaluation from production traces for prompt, tool, and model changes.
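
The idea of replaying production traces as a regression check can be sketched as follows. This is a simplified sketch under stated assumptions, not the project's implementation: the `Trace` fields, the `replay` and `passes_gate` helpers, and the 0.95 threshold are all hypothetical, and the agent is reduced to a function that maps an input to a tool name.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Trace:
    """One recorded production interaction (hypothetical schema)."""
    trace_id: str
    user_input: str
    expected_tool: str  # the tool call accepted as correct for this input


def replay(
    traces: List[Trace], agent: Callable[[str], str]
) -> Tuple[float, List[str]]:
    """Re-run every trace through a candidate agent.

    Returns the pass rate and the ids of traces whose tool choice changed.
    """
    failures = [t.trace_id for t in traces if agent(t.user_input) != t.expected_tool]
    pass_rate = 1.0 - len(failures) / len(traces) if traces else 1.0
    return pass_rate, failures


def passes_gate(pass_rate: float, threshold: float = 0.95) -> bool:
    """Release gate: block a prompt, tool, or model change below the threshold."""
    return pass_rate >= threshold
```

Running the same fixed trace set before and after a change makes any difference in pass rate attributable to the change itself, and the list of failing ids points directly at the regressed cases.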

DossierKit favors proof through work, experiments, metrics, and lessons rather than keyword density.

Can decompose Agent platforms into runtime, tool-use, evaluation, and permission audit modules.

Experienced in turning bad cases into preference data and post-training experiments.

A fit for roles requiring architecture, implementation, and business collaboration.

Method notes, retrospectives, and structured thinking.

2026-06-26

Turning Agent evaluation from subjective trial and error into replayable, layered engineering that gates releases.

Contact

Best for conversations around Agent platforms, LLM application architecture, post-training data loops, and AI product delivery.