

OpenThoughts-Agent-SFT-100K

OpenThoughts-Agent is an open-source effort to curate the best datasets for training agents. Our release includes datasets, models and our research codebase.

OpenThoughts-Agent-SFT-100K is the 100,000-example point of the OpenThoughts-Agent SFT scaling ladder (sizes 316 / 1K / 3.16K / 10K / 31.6K / 100K). It contains (task, agent-trajectory) pairs used to fine-tune OpenThinkerAgent-8B-SFT-100K and OpenThinkerAgent-32B-SFT-100K. The 100K set is the final OpenThoughts-Agent SFT dataset described in the paper.
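The ladder sizes listed above match half-decade steps on a log scale (10^2.5, 10^3, ..., 10^5). The card does not say how the sizes were chosen, so the following is only a sketch of that observation; the function name `ladder_sizes` is hypothetical.

```python
def ladder_sizes(first_half_exp: int = 5, last_half_exp: int = 10) -> list[int]:
    """Return round(10 ** (k / 2)) for k in [first_half_exp, last_half_exp].

    With the defaults, the exponents are 2.5, 3.0, 3.5, 4.0, 4.5, 5.0.
    Assumption: the ladder is spaced by half a decade; the card only lists the sizes.
    """
    return [round(10 ** (k / 2)) for k in range(first_half_exp, last_half_exp + 1)]


if __name__ == "__main__":
    for size in ladder_sizes():
        print(size)
```

Under this reading, 3.16K and 31.6K are the rounded forms of 3,162 and 31,623.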

- Homepage: https://www.openthoughts.ai/blog/agent

- Repository: https://github.com/open-thoughts/OpenThoughts-Agent

Data

Tasks are drawn from the Top-4 task sources identified by our ablations: SWE-Smith, StackExchange-SuperUser, StackExchange-Tezos (synthetically augmented to expand task diversity), and IssueTasks. Agentic trajectories are generated by GLM-4.7-AWQ acting as the teacher in the terminus-2 harness inside Daytona sandboxes, then filtered to traces with at least 5 model turns.
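The turn-count filter described above can be sketched as follows. This is a minimal illustration rather than the project's actual filtering code: it assumes each message is a dict with a `role` key and that model turns use the role name `assistant`, neither of which is confirmed here. The helper names are hypothetical.

```python
from typing import Any

Message = dict[str, Any]


def count_model_turns(conversations: list[Message], model_role: str = "assistant") -> int:
    """Count messages whose role equals model_role (assumed to be 'assistant')."""
    return sum(1 for message in conversations if message.get("role") == model_role)


def keep_trace(row: dict[str, Any], min_turns: int = 5) -> bool:
    """Return True if the row's trajectory has at least min_turns model turns."""
    return count_model_turns(row["conversations"]) >= min_turns


def filter_traces(rows: list[dict[str, Any]], min_turns: int = 5) -> list[dict[str, Any]]:
    """Keep only rows that pass the turn-count threshold."""
    return [row for row in rows if keep_trace(row, min_turns)]
```

A row with a system prompt followed by four user/assistant exchanges would be dropped at the default threshold, since it has only four model turns.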

| Field | Description |
|---|---|
| conversations | the multi-turn agent trajectory, stored as a list of role/content messages |
| task | identifier of the task given to the agent (for example `swesmith-31327_copy0002`) |
| trace_source | label of the trace within a rollout (observed values include `main` and `summarization-1-summary`) |
| agent, model, model_provider | rollout harness and teacher metadata |
| result, episode, run_id, trial_name, date | rollout bookkeeping; `result` is null or an error class such as `AgentTimeoutError` |
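As a rough illustration of how the bookkeeping fields might be used, the sketch below tallies rows by their `result` value. It works on plain dicts so it does not depend on any particular loader; the sample rows are made up for the example, and the helper name `tally_results` is hypothetical.

```python
from collections import Counter
from typing import Any, Iterable


def tally_results(rows: Iterable[dict[str, Any]]) -> Counter:
    """Count rows per `result` value, mapping a missing or null result to 'none'."""
    return Counter(
        "none" if row.get("result") is None else str(row["result"])
        for row in rows
    )


if __name__ == "__main__":
    # Illustrative rows only; real rows carry the full set of fields.
    sample_rows = [
        {"run_id": "run-a", "result": None},
        {"run_id": "run-a", "result": "AgentTimeoutError"},
        {"run_id": "run-b", "result": None},
    ]
    for label, count in tally_results(sample_rows).most_common():
        print(label, count)
```

The same function can be applied to any iterable of row dicts, for example rows streamed from a dataset loader.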

- Rows: 100,000

- Teacher: GLM-4.7-AWQ · Harness: terminus-2

Links

- 🌐 OpenThoughts-Agent project page

- 💻 OpenThoughts-Agent GitHub repository

- 📚 OpenThinker-Agent collection

- 🤖 OpenThinkerAgent-32B-SFT-100K model

- 🤖 OpenThinkerAgent-8B-SFT-100K model

Citation

@misc{openthoughts-agent,
  author = {Team, OpenThoughts-Agent},
  title = {{OpenThoughts-Agent: Data Recipes for Agentic Models}},
  howpublished = {https://www.openthoughts.ai/blog/agent},
  year = {2026}
}
