@LottoLabs: Interesting model here 35b a3b trained for agentic use It gets 60.7 on Terminal Bench2 qwen 3.6 27b gets 59.3 Essential…

X AI KOLs Following 06/08/26, 02:10 PM Models

agentic-model open-source coding-agent terminal-bench agentic-thinking large-language-model

Summary

Nex-AGI releases Nex-N2, an open-source agentic model series (Nex-N2-Pro and Nex-N2-mini) with an Agentic Thinking framework that unifies reasoning, tool use, and environment execution, achieving top-tier performance on agentic and coding benchmarks.

Interesting model here 35b a3b trained for agentic use It gets 60.7 on Terminal Bench2 qwen 3.6 27b gets 59.3 Essentially the same Going to have to try it out https://t.co/cJ0G7Nm5Yu

Original Article

View Cached Full Text

Cached at: 06/08/26, 05:24 PM

Interesting model here

35b a3b trained for agentic use

It gets 60.7 on Terminal Bench2 qwen 3.6 27b gets 59.3

Essentially the same

Going to have to try it out

https://t.co/cJ0G7Nm5Yu

nex-agi/Nex-N2-mini · Hugging Face

Source: https://huggingface.co/nex-agi/Nex-N2-mini

An agentic model with Agentic Thinking.

Today, we are officially releasing and open-sourcing our next-generation model,Nex-N2— an agent model built for real-world productivity scenarios. With first-tier coding and agentic capabilities, Nex-N2 keeps driving complex, long-horizon tasks forward in real environments to deliver stable, end-to-end results.

Over the past year, a paradigm shift led by Vibe Coding and Harness Engineering has been redefining the limits of LLM agents. From dialogue, to reasoning, to agents that execute long-horizon tasks with environmental feedback, the tasks models must handle keep growing harder, the contexts longer, and the environments more realistic. The core of next-generation model competition is no longerwhether a model can think, but whether it can reliably and efficiently turn thinking into actions that are executable, verifiable, and iterable.

Rather than treating reasoning, tool use, and environment execution as separate capabilities, Nex-N2 unifies them through anAgentic Thinkingframework that connects requirement understanding, task planning, code implementation, environmental feedback, evaluation and debugging, and continuous iteration into a single closed loop. The framework has two parts:

Adaptive Thinkinglets the model decide on its own when to think and how deeply — executing simple actions quickly while reasoning thoroughly on critical decisions.
Coherent Thinkingcarries one consistent reasoning paradigm across general reasoning and diverse agentic tasks, staying consistent across tasks and modalities to enable stable capability transfer.

Across real agentic workflows — agentic coding, deep research, tool calling, and terminal execution — Nex-N2 reaches first-tier performance, with substantial gains over the previous-generation Nex-N1 on multiple authoritative benchmarks. In real productivity scenarios such as OpenClaw one-person-company workflows, end-to-end game development, and web and multimodal generation, it likewise demonstrates outstanding usability, robustness, and stability.

https://huggingface.co/nex-agi/Nex-N2-mini#open-sourceOpen Source

In keeping with our commitment to open source, we are releasing bothNex-N2-ProandNex-N2-minias open-source models starting today.

Nex-N2-Pro:Hugging Face|ModelScope
Nex-N2-mini:Hugging Face|ModelScope
Early Access:SiliconFlow

We welcome developers and enterprises to integrate and try Nex-N2 and share their feedback.

https://huggingface.co/nex-agi/Nex-N2-mini#performancePerformance

We evaluate Nex-N2 in real agentic workflows along three directions — agentic tasks, coding tasks, and general tasks — covering benchmarks across tool calling, search-based decision-making, software engineering, and terminal execution. Nex-N2-Pro delivers strong performance that keeps pace with top-tier models such as GPT-5.5 and Opus 4.7: it excels at coding (e.g., 75.3 on Terminal-Bench 2.1) and long-horizon tasks (1585 on GDPval), and shows especially strong generalization and competitiveness on newer benchmarks like SWE-Atlas and DeepSWE. On general capability and core reasoning, it stands on par with leading frontier models.

Nex-N2 ships in two variants, both post-trained on the Qwen3.5 series:Nex-N2-Pro(built onQwen3\.5\-397B\-A17B) andNex-N2-mini(built onQwen3\.5\-35B\-A3B\-Base), covering different latency and quality trade-offs. The table below reports their scores alongside leading proprietary and open models across our full evaluation suite.

BenchmarkNex-N2-mini****Nex-N2-ProGPT-5.5Opus 4.7Kimi-K2.6GLM-5.1MiniMax M3DeepSeek-V4-ProAgentBrowseComp74.183.784.479.883.279.383.583.4GDPval140215851769175314811535-1554Toolathlon33.351.955.652.850.040.7-51.8WildClawBench47.753.558.262.2-48.2-43.7WideSearch62.075.6--80.8---TAU365.971.1---70.6--Coding & SWESWE-Bench Pro50.258.858.664.358.658.459.055.4Terminal-Bench 2.160.775.383.469.7-58.766.072.0DeepSWE8.033.670542418-8SWE-Bench Verified74.480.882.987.680.2-80.580.6SWE Atlas QnA31.537.945.445.2--37.9-SWE Atlas RF30.032.944.848.6----SWE Atlas TW23.340.042.638.2--30.8-General & ReasoningGPQA Diamond82.690.793.694.290.586.2-90.1IFEval89.194.0--94.594.5-91.9Apex9.436.5--24.011.5-38.3

https://huggingface.co/nex-agi/Nex-N2-mini#usageUsage

https://huggingface.co/nex-agi/Nex-N2-mini#local-deploymentLocal Deployment

**Note:**For the best performance with Nex-series models, we recommend serving them with our customizedsglangfork.

First, install oursglangfork:

# Use the customized `sglang` fork
git clone https://github.com/nex-agi/sglang.git
cd sglang

# Install the python packages
pip install --upgrade pip
pip install -e "python"

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-proNex-N2-Pro

Launch the server (example on two 8× H100 servers with CUDA 13.0):

# Multi-node (2 nodes). Run the same command on every node with:
#   <node-rank> = 0 on the head node, 1 on the other node
#   <node0-ip>  = IP of the head node (reachable from all others)
python -m sglang.launch_server \
  --model-path /path/to/your/model  \
  --tp 16 \
  --nnodes 2 \
  --node-rank <node-rank> \
  --dist-init-addr <node0-ip>:20000 \
  --reasoning-parser qwen3 \
  --tool-call-parser qwen3_coder \
  --mamba-scheduler-strategy extra_buffer

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-miniNex-N2-mini

Launch the server (example on one 2× H100 server with CUDA 13.0):

python -m sglang.launch_server \
  --model-path /path/to/your/model  \
  --tp 2 \
  --reasoning-parser qwen3 \
  --tool-call-parser qwen3_coder \
  --mamba-scheduler-strategy extra_buffer

https://huggingface.co/nex-agi/Nex-N2-mini#docker-deploymentDocker Deployment

We also provide a prebuilt Docker image with our customizedsglangfork preinstalled:nexagi/sglang:v0\.5\.12. The launch command is the same as above.

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-pro-1Nex-N2-Pro

# Multi-node (2 nodes). Run the same command on every node with:
#   <node-rank> = 0 on the head node, 1 on the other node
#   <node0-ip>  = IP of the head node (reachable from all others)
docker run --gpus all --shm-size 32g --network host \
  -v /path/to/your/model:/model \
  nexagi/sglang:v0.5.12 \
  python3 -m sglang.launch_server \
    --model-path /model \
    --tp 16 \
    --nnodes 2 \
    --node-rank <node-rank> \
    --dist-init-addr <node0-ip>:20000 \
    --host 0.0.0.0 --port 30000 \
    --reasoning-parser qwen3 \
    --tool-call-parser qwen3_coder \
    --mamba-scheduler-strategy extra_buffer

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-mini-1Nex-N2-mini

Single node with 2× H100:

docker run --gpus all --shm-size 32g --ipc=host \
  -p 30000:30000 \
  -v /path/to/your/model:/model \
  nexagi/sglang:v0.5.12 \
  python3 -m sglang.launch_server \
    --model-path /model \
    --tp 2 \
    --host 0.0.0.0 --port 30000 \
    --reasoning-parser qwen3 \
    --tool-call-parser qwen3_coder \
    --mamba-scheduler-strategy extra_buffer

https://huggingface.co/nex-agi/Nex-N2-mini#recommended-sampling-parametersRecommended Sampling Parameters

For the best generation quality, we recommend the following sampling parameters:

temperature: 0.7
top\_p: 0.95
top\_k: 40

https://huggingface.co/nex-agi/Nex-N2-mini#function-callingFunction Calling

Nex-series models support robust function-calling capabilities. To enable function calling, add the\-\-tool\-call\-parser qwen3\_coderflag when launching the server:

python -m sglang.launch_server --model-path /path/to/your/model --tool-call-parser qwen3_coder

https://huggingface.co/nex-agi/Nex-N2-mini#reasoning-parserReasoning Parser

Nex-series models emit explicit reasoning traces. Add the\-\-reasoning\-parser qwen3flag to parse the reasoning content separately from the final response. It can be combined with the function-calling parser above:

python -m sglang.launch_server --model-path /path/to/your/model --tool-call-parser qwen3_coder --reasoning-parser qwen3

@LottoLabs: Interesting model here 35b a3b trained for agentic use It gets 60.7 on Terminal Bench2 qwen 3.6 27b gets 59.3 Essential…

nex-agi/Nex-N2-mini · Hugging Face

https://huggingface.co/nex-agi/Nex-N2-mini#open-sourceOpen Source

https://huggingface.co/nex-agi/Nex-N2-mini#performancePerformance

https://huggingface.co/nex-agi/Nex-N2-mini#usageUsage

https://huggingface.co/nex-agi/Nex-N2-mini#local-deploymentLocal Deployment

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-proNex-N2-Pro

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-miniNex-N2-mini

https://huggingface.co/nex-agi/Nex-N2-mini#docker-deploymentDocker Deployment

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-pro-1Nex-N2-Pro

https://huggingface.co/nex-agi/Nex-N2-mini#nex-n2-mini-1Nex-N2-mini

https://huggingface.co/nex-agi/Nex-N2-mini#recommended-sampling-parametersRecommended Sampling Parameters

https://huggingface.co/nex-agi/Nex-N2-mini#function-callingFunction Calling

https://huggingface.co/nex-agi/Nex-N2-mini#reasoning-parserReasoning Parser

Similar Articles

@ModelScope2022: Nex-N2 is now open source！An agentic model series from Nex AGI built for coding, tool use, deep research, and long-hori…

@rohanpaul_ai: Qwen 3.7 Max is super close to the frontier models for coding and agentic abilities. And and it’s now available on AI/M…

NVIDIA just announced the release of Nemotron 3 Ultra (2 minute read)

@jinyuhou0: On popular benchmarks, our 30B model matches systems 20-30x its size (gpt-5.4-xhigh, DeepSeek-V3.2, Kimi-K2.5), while u…

@KyleHessling1: Hello again, everyone! We've got another really fun 9b, this one specifically trained for tool calling and agentic codi…

Submit Feedback

Similar Articles

@ModelScope2022: Nex-N2 is now open source！An agentic model series from Nex AGI built for coding, tool use, deep research, and long-hori…

@rohanpaul_ai: Qwen 3.7 Max is super close to the frontier models for coding and agentic abilities. And and it’s now available on AI/M…

NVIDIA just announced the release of Nemotron 3 Ultra (2 minute read)

@jinyuhou0: On popular benchmarks, our 30B model matches systems 20-30x its size (gpt-5.4-xhigh, DeepSeek-V3.2, Kimi-K2.5), while u…

@KyleHessling1: Hello again, everyone! We've got another really fun 9b, this one specifically trained for tool calling and agentic codi…