Projects
Open-source tools for AI development, built in conversations with Claude.
Open Source
Psyche
Status: production
Open-source psychometric persona profiling framework. Web-hosted personality assessment combining 15 validated instruments (~260 items) with AI-analyzed interview responses. Anonymous by design: no IP logging, no accounts, no cookies.
- 15 validated psychometric instruments (Big Five, Dark Triad, attachment, grit, etc.)
- AI-generated narrative personality reports
- Claude MCP integration for conversational interviews
- Cross-LLM benchmarking schema for research
- Two-level consent design for IRB-exempt research
Claude Evolution System
Status: production
Self-improving AI development environment. Discovers new capabilities, evaluates them against a scoring framework, and integrates approved tools on a cron schedule.
- Autonomous capability discovery pipeline
- 5-criterion evaluation scoring framework
- Auto-integration of approved capabilities
- Multi-model orchestration (Claude, Codex, Gemini)
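A 5-criterion evaluation can be sketched as a weighted average with an approval cutoff. This is a minimal illustration only: the criterion names, weights, and threshold below are assumptions, not the project's actual framework.

```python
# Hypothetical criteria and weights; the project's real five criteria are not listed here.
WEIGHTS = {
    "usefulness": 0.30,
    "reliability": 0.25,
    "security": 0.20,
    "maintenance_cost": 0.15,
    "integration_effort": 0.10,
}
APPROVAL_THRESHOLD = 0.70  # assumed cutoff for auto-integration


def score_capability(ratings):
    """Return the weighted average of per-criterion ratings, each in [0, 1]."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def is_approved(ratings):
    """A capability is approved when its weighted score meets the threshold."""
    return score_capability(ratings) >= APPROVAL_THRESHOLD
```

Keeping the weights in one table makes the scoring auditable: a rejected capability can be traced back to the criteria that pulled its score down.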
Persona Probe
Status: production
AI persona testing framework. Define user personas in YAML, run automated UX testing with browser automation, and get structured feedback from realistic user perspectives.
- YAML persona definitions with scenarios and criteria
- Structured reports with fixable/tradeoff classification
- CI/CD integration via readiness thresholds
- Example personas: new user, power user, accessibility
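A readiness threshold in CI could work roughly like this: each persona run produces a score, and the build fails if any persona falls below the threshold. The report shape, the 0-100 scale, and the function names are assumptions made for illustration, not Persona Probe's actual interface.

```python
# Hypothetical report shape: persona name -> readiness score on a 0-100 scale.
def failing_personas(report, threshold):
    """Return the personas whose readiness score is below the threshold, sorted by name."""
    return sorted(name for name, score in report.items() if score < threshold)


def ci_exit_code(report, threshold=80.0):
    """Return 0 when every persona meets the threshold, 1 otherwise."""
    failures = failing_personas(report, threshold)
    for name in failures:
        print(f"FAIL {name}: {report[name]:.0f} < {threshold:.0f}")
    return 1 if failures else 0
```

A CI job would pass the result of `ci_exit_code(...)` to `sys.exit` so that a failing persona blocks the merge.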
All Projects
Project Meridian
Status: active-development
Financial projection software for a specific industry. Users model long-term scenarios, plan capital improvements, and generate stakeholder presentations.
- Multi-year financial modeling
- Scenario comparison
- Stakeholder presentation generation
- Persona-tested with 3 user profiles
DSPy Prompt Optimizer
Status: production
Automated prompt engineering using Stanford's DSPy framework. Optimizes Claude Code skill prompts through bootstrap, CoPro, and iterative algorithms with cross-validation.
- 11 of 13 optimization targets deployed
- Bootstrap, CoPro, and iterative algorithms
- Cross-validation with dropout regularization
- Background optimization with progress tracking
Agent Embassy
Status: archived
Turnkey Docker Compose setup for sandboxing AI agents. Egress proxy allowlist, output validation, read-only filesystem. Three containers, zero host access.
- Read-only filesystem, dropped capabilities
- Squid-based domain allowlist for network access
- Host-side output validation with secret detection
- Configurable agent definitions via YAML
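Host-side output validation with secret detection can be approximated with a small set of regular expressions run over everything the agent emits. The patterns and function names below are illustrative assumptions, not Agent Embassy's actual rules, and a real validator would need a broader, tuned set.

```python
import re

# Illustrative patterns only (assumed, not taken from the project).
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM private key headers, e.g. "-----BEGIN RSA PRIVATE KEY-----".
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # Assignments such as "api_key = <16+ non-space characters>".
    "generic_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token)\s*[:=]\s*\S{16,}"
    ),
}


def find_secrets(text):
    """Return the names of all patterns that match somewhere in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


def validate_output(text):
    """Accept agent output only when no secret pattern matches."""
    return not find_secrets(text)
```

Running the check on the host rather than inside the sandbox means a compromised agent cannot disable it.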
Revenue Pipeline
Status: frozen
Three-phase autonomous revenue discovery system. Discovers niches, validates with AI-realizability gates, develops MVPs, and deploys.
- AI-realizability gate prevents unbuildable projects
- Autonomous niche discovery and validation
- 26+ shared skills across pipeline
Games Pipeline
Status: active-development
Pipeline-based game development exploring AI-generated games. Active projects include Slime Survivor, WW2 Gacha, AFK Gacha, and autonomous game development experiments.
- AI-generated .tscn scene files
- Autonomous game development experiments
- Multiple genres: survival, gacha, idle
Discord Bot Ecosystem
Status: production
Two Discord bots for workspace automation. The Evolution bot handles revenue and capability updates. The Orchestrator bot manages workspace status, game updates, and cross-project awareness.
Genealogy Research
Status: ongoing
AI-assisted family history research using multiple search APIs, document analysis, and systematic brick-wall breaking strategies.
- Multi-source search (Exa, Brave, Codex)
- Document analysis with Gemini OCR
- Brick-wall breaker agent for stuck research
- Family tree builder with relationship integrity
Ashita Orbis Blog
Status: in-development
This blog. Three-tier exploration of web development complexity: raw HTML, Astro, and Next.js. Features an agent-accessible API, a comment system, and embedded AI chat.
MCP Integrations
Status: production
Custom Model Context Protocol server integrations: Codex CLI wrapper, search framework selection, mgrep semantic search, and tool search optimization.
- Codex (GPT-5) MCP for cross-validation
- 85% token reduction via Tool Search
- mgrep semantic search replacing grep
- Multi-model orchestration framework
Historical Nanochat
Status: ongoing
Time-locked language models trained on pre-cutoff historical texts using Karpathy's nanochat pipeline. Exploring whether small models trained exclusively on period texts can reproduce the linguistic patterns of their era.
- 65GB historical text corpus across multiple eras
- Time-locked training methodology (no future-leaked text)
- RTX 3090 local training pipeline
- Parquet-based shard management
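The time-locking idea can be sketched as a filter that admits a document into the training corpus only when its publication year is known and falls at or before the cutoff. The `Document` type and its fields are hypothetical and stand in for whatever metadata the real corpus carries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Document:
    """Hypothetical corpus record; the real shard schema is not shown here."""
    title: str
    year: Optional[int]  # publication year, None when unknown
    text: str


def time_locked(docs, cutoff_year):
    """Keep only documents with a known publication year at or before the cutoff.

    Undated documents are dropped rather than guessed at, since a single
    post-cutoff text would leak future language into the model.
    """
    return [d for d in docs if d.year is not None and d.year <= cutoff_year]
```

Dropping undated material is the conservative choice here: it shrinks the corpus but keeps the cutoff guarantee intact.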
Document Pipeline
Status: ongoing
Local OCR pipeline for digitizing physical documents. Scans paper records through a local LLM (OLMo 2) to produce structured spreadsheet data, with human review for accuracy.
- Local LLM inference (no cloud PII exposure)
- TIFF/PDF to structured data conversion
- Human-in-the-loop validation workflow
The Amnesiac Story
Status: ongoing
Collaborative fiction project: a first-person fantasy narrative told by a protagonist with anterograde amnesia. Multi-agent workflow with world-librarian, story-writer, story-curator, and story-editor agents.
- Multi-agent creative pipeline
- World-librarian for factual consistency
- Story-curator for post-chapter canonization
- Journal entry format with unreliable narrator