About
I’m Charles, a development engineer at ByteDance Seed; before that I worked at Ant Group. I live and work in Hangzhou, where I focus on infrastructure, in particular Agent Infra, DevOps, LLMs, Agents, and MLSys.
What I’m building lately:
- AIO Sandbox — packs the capabilities an Agent constantly needs (browser, code execution, terminal, file system) into a single Docker image, so an Agent can finish a multi-step task in one unified environment instead of shuttling data and re-authenticating across separate sandboxes.
- UI-TARS Desktop — a GUI Agent driven by a multimodal vision model that lets the model see the screen and operate interfaces the way a human does.
This blog is about the problems I actually got stuck on and eventually worked out: how MCP shifts the way Agent apps are built, how to design a Sandbox architecture, and the old front-end engineering questions around module packaging and monorepos. I try to skip the obvious and write down the parts that tripped me up.
Co-authored Papers:
- UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning (arXiv:2509.02544)
- UI-TARS: Pioneering Automated GUI Interaction with Native Agents (arXiv:2501.12326)
Find me: