Dev
GitHub repos gaining traction - what high-signal users are starring and what's climbing the board, captured daily and enriched from GitHub. Raw material for spotting new tech and patterns worth building on.
653 repos tracked
153 surfaced this week
141 created < 30d
Top language: Python (45 repos)
-
Official Repo: AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery
-
OSWorld 2.0: Benchmarking Computer Use Agents on Long-Horizon Real-World Tasks
-
The agent that grows with you
-
A clean, modular SDK for building AI agents with OpenHands V1.
-
[NeurIPS 2025] The first web-based benchmark and platform to evaluate visual reasoning and interaction capabilities of MLLM powered agents through diverse and dynamic CAPTCHA puzzles.
-
CEO-Bench: Can Agents Play the Long Game?
-
Peekaboo is a macOS CLI & optional MCP server that enables AI agents to capture screenshots of applications, or the entire system, with optional visual question answering through local or remote AI models.
-
Agent skills for Claude Code and other AI agents
-
Interactive error analysis skill for AI agents. Studies LLM trace datasets, builds a review UI, monitors annotations, categorizes failure modes, proposes new samples.
-
Omnigent is an open-source AI agent framework and meta-harness: orchestrate Claude Code, Codex, Cursor, Pi, and custom agents — swap harnesses without rewriting, enforce policies and sandboxing, and collaborate in real time from any device.
-
Framework for evaluating and improving agents
-
Stores all your tweets nicely claw-able for agents.
-
Ask the oracle when you're stuck. Invoke GPT-5 Pro with a custom context and files.
-
Scripts for agents, shared between my repositories.
-
A plain-language writing skill for AI agents, with a revision view that shows what changed.
-
Skills for coding agents related to marimo
-
Turn any browser into your terminal & command your agents on the go.
-
Useful skills for agents and claws.
-
MCP orchestrator that converts MCP servers to agents.
-
Makes your AI agent think like the laziest senior dev in the room. The best code is the code you never wrote.
-
Agents' Last Exam
-
Measuring frontier coding agents on original, long-horizon engineering tasks
-
Native Agent CLIs manager for macOS. Ghostty Terminals + Codex App Features/UX = Ghostex! Embedded browser & IDE. Strong agents support.
-
Custom AI agent platform to speed up your work.
-
A system for agentic LLM-powered data processing and ETL
-
📜 Entire CLI hooks into your Git workflow to capture AI agent sessions as you work. Sessions are indexed alongside commits, creating a searchable record of how code was written in your repo.
-
My workflows for AI agents like Codex and Claude
-
Landing page + leaderboard for SWE-Bench benchmark
-
CLI for Better Stack to fetch logs, ClickHouse SQL style. Made for humans and agents.
-
Your agent runs on a Mac that isn't your daily driver. agentcookie keeps its sessions in sync with the Mac you actually browse on, continuously, encrypted over Tailscale, so OpenClaw, Hermes, or any other agent runtime wakes up authenticated. macOS, peer-to-peer, no cloud middleman.
-
Search infrastructure for AI
-
Bash for Agents
-
A benchmark for evaluating AI agents on realistic business workflows
-
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
-
🚀 Ultra Recipe for Training Long-Horizon Search Agents - matching frontier AI's search capability with a 20B model + stateful harness
-
Sandboxed tools and JS runtime for AI agents
-
Agent multiplexer that lives in your terminal.
-
KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.
-
Data Science Skills for AI agents like Claude Code
-
Use agent to learn agent - A skeleton course on how to design, build, and operate production AI agents
-
Open source Ghostty-based macOS terminal with vertical tabs and notifications for AI coding agents. Built for multitasking, organization, and programmability.