Rokcso's Blog

译> 我的 AI 应用之路

本文由 AI 翻译,rokcso 修正。

原文链接:My AI Adoption Journey

My experience adopting any meaningful tool is that I’ve necessarily gone through three phases: (1) a period of inefficiency, (2) a period of adequacy, then finally (3) a period of workflow and life-altering discovery.

我的经验是,任何有意义的工具,都要经历三个阶段方能掌握:(1)低效适应期(2)基本应用期(3)革新蜕变期。

In most cases, I have to force myself through phase 1 and 2 because I usually have a workflow I’m already happy and comfortable with. Adopting a tool feels like work, and I do not want to put in the effort, but I usually do in an effort to be a well-rounded person of my craft.

由于已有得心应手的工作流,我往往要逼着自己突破前两个阶段。尝试新工具总像额外负担,虽心有不甘,但为成为全面发展的多面手,我总会坚持投入。

This is my journey of how I found value in AI tooling and what I’m trying next with it. In an ocean of overly dramatic, hyped takes, I hope this represents a more nuanced, measured approach to my views on AI and how they’ve changed over time.

这是我探索 AI 工具价值的历程,以及正在探索的新可能。在当下众声喧哗的 AI 热潮中,我希望能以这份审慎从容的思考,记录自己对 AI 认知的演进轨迹。

This blog post was fully written by hand, in my own words. I hate that I have to say that but especially given the subject matter, I want to be explicit about it.

本文全程手写,字字皆出己思。在此强调似乎多余,但鉴于主题特殊性,特此说明。

第一步:告别聊天机器人

Step 1: Drop the Chatbot

Immediately cease trying to perform meaningful work via a chatbot (e.g. ChatGPT, Gemini on the web, etc.). Chatbots have real value and are a daily part of my AI workflow, but their utility in coding is highly limited because you’re mostly hoping they come up with the right results based on their prior training, and correcting them involves a human (you) to tell them they’re wrong repeatedly. It is inefficient.

请立即停止试图通过聊天机器人(如 ChatGPT、网页版的 Gemini 等)完成实质性工作。这类工具确有价值,也是我日常 AI 工作流的一部分,但它们在编程领域的效用极其有限,因为你本质上是在赌训练数据能恰好生成正确结果,而纠错过程更需人工反复指正,效率低下。

I think everyone’s first experience with AI is a chat interface. And I think everyone’s first experience trying to code with AI has been asking a chat interface to write code.

我相信多数人的 AI 初体验都始于聊天界面,编程初尝试也是让聊天机器人代写代码。

While I was still a heavy AI skeptic, my first “oh wow” moment was pasting a screenshot of Zed’s command palette into Gemini, asking it to reproduce it with SwiftUI, and being truly flabbergasted that it did it very well. The command palette that ships for macOS in Ghostty today is only very lightly modified from what Gemini produced for me in seconds.

在我仍是 AI 怀疑论者时,曾将 Zed 编辑器命令面板的截图粘贴至 Gemini,要求其用 SwiftUI 复刻,当它在数秒内交出近乎完美的成品时,确实令我震撼。如今 Ghostty for macOS 的默认命令面板,正是在 Gemini 生成的代码基础之上微调而成。

But when I tried to reproduce that behavior for other tasks, I was left disappointed. In the context of brownfield projects, I found the chat interface produced poor results very often, and I found myself very frustrated copying and pasting code and command output to and from the interface. It was very obviously far less efficient than me doing the work myself.

但当我试图在其他任务中复现这种成功时,结果往往令人失望。尤其在已有项目中,聊天机器人频繁输出低质结果,让我陷入复制粘贴代码和命令输出的繁琐循环。这种模式效率之低下,明显不如亲自动手。

To find value, you must use an agent. An agent is the industry-adopted term for an LLM that can chat and invoke external behavior in a loop. At a bare minimum, the agent must have the ability to: read files, execute programs, and make HTTP requests.

要真正释放价值,必须转向使用智能体。这是行业公认的术语,指具备循环交互能力并能触发外部行为的 LLM 系统 [1]。一个合格的智能体至少应具备:文件读取、程序执行和 HTTP 请求发起能力。
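
译注:为帮助理解「智能体 = 能在循环中调用外部工具的 LLM」这一定义,下面给出一个极简的 Python 示意。其中 call_llm() 只是占位函数,并非任何真实 API;三个工具分别对应文中所说的最低能力:读取文件、执行程序、发起 HTTP 请求。

```python
# 译注:极简的「智能体循环」示意(非原文内容,仅为说明结构)。
import json
import subprocess
import urllib.request
from pathlib import Path


def read_file(path: str) -> str:
    """工具 1:读取文件内容。"""
    return Path(path).read_text()


def run_command(cmd: str) -> str:
    """工具 2:执行程序,返回输出与报错,供模型自我纠错。"""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr


def http_get(url: str) -> str:
    """工具 3:发起 HTTP 请求(例如查文档、拉取 issue)。"""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()


TOOLS = {"read_file": read_file, "run_command": run_command, "http_get": http_get}


def call_llm(messages: list[dict]) -> dict:
    """占位函数:调用任意支持工具调用的 LLM,
    返回 {"tool": 名称, "args": {...}} 或 {"answer": 最终答复}。"""
    raise NotImplementedError("接入你自己使用的模型 API")


def agent(task: str, max_steps: int = 20) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):  # 智能体 = 在循环中反复「思考 → 调工具 → 看结果」
        reply = call_llm(messages)
        if "answer" in reply:  # 模型认为任务已完成
            return reply["answer"]
        output = TOOLS[reply["tool"]](**reply["args"])
        messages.append(
            {"role": "tool", "content": json.dumps({"tool": reply["tool"], "output": output}, ensure_ascii=False)}
        )
    return "达到步数上限,任务未完成"
```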

第二步:复刻你的工作流程

Step 2: Reproduce Your Own Work

In the next phase of my journey, I tried Claude Code. I’ll cut to the chase: I initially wasn’t impressed. I just wasn’t getting good results out of my sessions. I felt I had to touch up everything it produced and this process was taking more time than if I had just done it myself. I read blog posts, watched videos, but just wasn’t that impressed.

我接下来的尝试是使用 Claude Code。长话短说:初期体验并不惊艳。产出质量不尽人意,总觉得需要逐行修改,耗时甚至超过亲手完成。尽管研读了技术博客、观看演示视频,仍未见其精妙之处。

Instead of giving up, I forced myself to reproduce all my manual commits with agentic ones. I literally did the work twice. I’d do the work manually, and then I’d fight an agent to produce identical results in terms of quality and function (without it being able to see my manual solution, of course).

但我没有放弃,而是强制自己用智能体复现所有手动提交的代码。字面意思地将同一项工作重复完成两次。先手动实现,再引导智能体产出功能与质量完全相同的结果(当然不会让它参考我的手写代码)。

This was excruciating, because it got in the way of simply getting things done. But I’ve been around the block with non-AI tools enough to know that friction is natural, and I can’t come to a firm, defensible conclusion without exhausting my efforts.

这个过程堪称煎熬,因为它违背了「高效完成」的基本诉求。但多年使用非 AI 工具的经验告诉我,磨合期的阵痛在所难免,唯有全力尝试后才能得出经得起推敲的结论。

But, expertise formed. I quickly discovered for myself from first principles what others were already saying, but discovering it myself resulted in a stronger fundamental understanding.

正是在这种刻意练习中,专业认知逐渐形成。我很快从第一性原理出发,验证了他人已提出的观点,而亲身体验带来的理解更为深刻:

  1. Break down sessions into separate clear, actionable tasks. Don’t try to “draw the owl” in one mega session.
    任务拆解:将复杂任务拆分为清晰可行的子任务,避免试图「一步登天」
  2. For vague requests, split the work into separate planning vs. execution sessions.
    规划分离:模糊需求应先进行规划,再执行实施
  3. If you give an agent a way to verify its work, it more often than not fixes its own mistakes and prevents regressions.
    自检机制:赋予智能体验证自身工作的能力,它往往能自主修正错误并防止倒退(第 2、3 条的具体做法,可参考列表后的示意代码)
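
译注:下面用一段假想的 Python 流程示意上面第 2、3 条可以是什么样子。run_agent_session() 是假想的封装(例如包装某个智能体 CLI 的一次会话),任务内容与 make test 命令也只是示例,并非原文或任何真实项目的配置。

```python
# 译注:示意「规划会话 → 执行会话 → 自动验证」的拆分方式(假想代码,非原文内容)。
import subprocess


def run_agent_session(prompt: str) -> str:
    """假想封装:启动一次智能体会话,结束后返回其最终输出。"""
    raise NotImplementedError("替换为你实际使用的智能体工具")


# 1. 规划会话:只产出计划,不改代码
plan = run_agent_session(
    "阅读这个仓库,把『为命令面板增加模糊搜索』这个需求"
    "拆成 3~5 个可独立验证的小任务,每行列出一个。"
)

# 2. 执行会话:每个小任务单独开一个会话,避免在一个超长会话里「一步画完猫头鹰」
for task in plan.splitlines():
    if not task.strip():
        continue
    run_agent_session(
        f"完成这个任务:{task}\n"
        "完成后运行 `make test`(示例命令,按你的项目替换)验证;"
        "测试不通过就继续修复,直到通过为止。"
    )

# 3. 人工最终把关:自己再跑一遍同样的验证命令
subprocess.run("make test", shell=True, check=False)
```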

More generally, I also found the edges of what agents – at the time – were good at, what they weren’t good at, and, for the tasks they were good at, how to achieve the results I wanted.

更重要的是,我摸清了当时智能体的能力边界:擅长什么、不擅长什么,以及如何在其优势领域达成目标。

All of this led to significant efficiency gains, to the point where I was starting to naturally use agents in a way that I felt was no slower than doing it myself (but I still didn’t feel it was any faster, since I was mostly babysitting an agent).

这一切带来显著的效率提升,以至于我开始自然而然地使用智能体,虽然尚未感觉更快(因为多数时间仍在监督执行),但至少不再迟滞。

The negative space here is worth reiterating: part of the efficiency gains here were understanding when not to reach for an agent. Using an agent for something it’ll likely fail at is obviously a big waste of time and having the knowledge to avoid that completely leads to time savings.

特别需要强调的是:效率提升的一部分,恰恰来自懂得何时不使用智能体。明知某件事智能体多半会失败却仍强行调用,显然是浪费时间;而具备提前避开这类任务的判断力,本身就是在节约时间 [2]。

At this stage, I was finding adequate value with agents that I was happy to use them in my workflow, but still didn’t feel like I was seeing any net efficiency gains. I didn’t care though, I was content at this point with AI as a tool.

至此,智能体已足够好用,我乐意将其纳入工作流,虽未实现净效率增益,但作为工具已令我满意。

第三步:日结智能体

Step 3: End-of-Day Agents

To try to find some efficiency, I next started up a new pattern: block out the last 30 minutes of every day to kick off one or more agents. My hypothesis was that perhaps I could gain some efficiency if the agent can make some positive progress in the times I can’t work anyways. Basically: instead of trying to do more in the time I have, try to do more in the time I don’t have.

为了进一步提升效率,我开始尝试新模式:每天留出最后 30 分钟,启动一个或多个智能体任务。我的假设是:如果智能体能在我无法工作的时间段取得进展,或许能实现效率增益。本质上是将生产力从「拥有的时间」延伸到「本不工作的时间」。

Similar to the previous task, I at first found this both unsuccessful and annoying. But, I once again quickly found different categories of work that were really helpful:

与上一阶段类似,初期尝试既无成效又令人烦躁。但我很快又发现了几类真正适合此模式的工作:

To be clear, I did not go as far as others went to have agents running in loops all night. In most cases, agents completed their tasks in less than half an hour. But, in the latter part of the working day, I’m usually tired and coming out of flow and find myself too personally inefficient, so shifting my effort to spinning up these agents I found gave me a “warm start” the next morning that got me working more quickly than I would’ve otherwise.

需要说明的是,我并未像某些实践者那样让智能体整夜循环运行。多数任务能在半小时内完成。但在一天工作的尾声,我通常已经疲惫、正在脱离心流状态,个人效率很低,此时把精力转去启动这些智能体任务,反而能为次日早晨提供「热启动」,让我更快进入工作状态。

I was happy, and I was starting to feel like I was doing more than I was doing prior to AI, if only slightly.

至此,我开始感受到 AI 带来了超越以往的产能提升,尽管幅度尚微,但已足够令人欣喜。

第四步:外包「稳赢」任务

Step 4: Outsource the Slam Dunks

By this point, I was getting very confident about what tasks my AI was and wasn’t great at. I had really high confidence with certain tasks that the AI would achieve a mostly-correct solution. So the next step on my journey was: let agents do all of that work while I worked on other tasks.

此时,我已经非常清楚 AI 擅长和不擅长哪些任务。对于某些特定任务,我能高度确信 AI 能给出基本正确的解决方案。因此,我旅程的下一步是:在我处理其他任务的同时,让智能体包揽所有那些它擅长的工作。

More specifically, I would start each day by taking the results of my prior night’s triage agents, filter them manually to find the issues that an agent will almost certainly solve well, and then keep them going in the background (one at a time, not in parallel).

更具体地说,我每天会先查看前一夜分类智能体的结果,手动筛选出那些智能体几乎肯定能完美解决的 Issue,然后让它们在后台运行(一次一个,不并行处理)。
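
译注:「一次只跑一个、不并行」的后台处理流程,大致可以想象成下面这段假想脚本。run_agent_session() 与 slam_dunks.txt(人工筛选后的 issue 编号列表)都是为说明而虚构的,并非原文提到的任何真实工具或文件。

```python
# 译注:「筛选后逐个交给智能体处理」的假想示意(非原文内容)。
from pathlib import Path


def run_agent_session(prompt: str) -> str:
    """假想封装:同步运行一次智能体会话,结束后返回结果摘要。"""
    raise NotImplementedError("替换为你实际使用的智能体工具")


# slam_dunks.txt:假定为早上人工筛选出的「稳赢」issue 编号,每行一个
for issue in Path("slam_dunks.txt").read_text().splitlines():
    issue = issue.strip()
    if not issue:
        continue
    # 关键点:一次只跑一个,跑完再启动下一个,而不是并行开一堆会话
    summary = run_agent_session(
        f"解决 issue #{issue},完成后运行测试验证,并给出改动摘要。"
    )
    print(f"#{issue}: {summary}")
```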

Meanwhile, I’d work on something else. I wasn’t going to social media (any more than usual without AI), I wasn’t watching videos, etc. I was in my own, normal, pre-AI deep thinking mode working on something I wanted to work on or had to work on.

与此同时,我会去处理别的事情。我不会去刷社交媒体(和使用 AI 之前一样),也不会看视频等等。我会进入自己惯有的、AI 出现之前的深度思考模式,去处理我想做或必须做的任务。

Very important at this stage: turn off agent desktop notifications. Context switching is very expensive. In order to remain efficient, I found that it was my job as a human to be in control of when I interrupt the agent, not the other way around. Don’t let the agent notify you. During natural breaks in your work, tab over and check on it, then carry on.

这个阶段非常重要的一点是:关闭智能体的桌面通知。上下文切换的成本非常高。为了保持效率,我发现我的职责是掌控何时去打断智能体,而不是反过来被它打断。不要让智能体通知你。在你工作的自然间隙,切换标签页去检查它的进度,然后继续你的工作。

Importantly, I think the “work on something else” helps counteract the highly publicized Anthropic skill formation paper. Well, you’re trading off: not forming skills for the tasks you’re delegating to the agent while continuing to form skills naturally in the tasks you continue to work on manually.

重要的是,我认为「处理其他任务」这一点有助于抵消广为人知的 Anthropic 关于技能形成的论文 中提到的影响。这其实是一种权衡:你将任务委托给智能体,可能会影响你在这些任务上的技能形成,但同时你通过继续手动处理的任务,技能仍在自然地形成。

At this point I was firmly in the “no way I can go back” territory. I felt more efficient, but even if I wasn’t, the thing I liked the most was that I could now focus my coding and thinking on tasks I really loved while still adequately completing the tasks I didn’t.

到了这个阶段,我已经坚定地处于「绝不可能再回到过去」的状态了。我感觉效率更高了,但即便没有,我最喜欢的一点是,我现在可以将编码和思考的精力集中在我真正热爱的任务上,同时又能妥善完成那些我不太喜欢的任务。

第五步:设计约束框架

Step 5: Engineer the Harness

At risk of stating the obvious: agents are much more efficient when they produce the right result the first time, or at worst produce a result that requires minimal touch-ups. The most sure-fire way to achieve this is to give the agent fast, high quality tools to automatically tell it when it is wrong.

有一点可能显而易见:当智能体首次就能产出正确结果,或者最差也只需极少修改时,其效率会大幅提升。实现这一目标最可靠的方法是,为智能体提供快速、高质量的工具,使其能自动判断对错。

I don’t know if there is a broad industry-accepted term for this yet, but I’ve taken to calling this “harness engineering.” It is the idea that anytime you find an agent makes a mistake, you take the time to engineer a solution such that the agent never makes that mistake again. I don’t need to invent any new terms here; if another one exists, I’ll jump on the bandwagon.

我不知道业界对此是否已有广泛接受的术语,但我逐渐把这种做法称为「约束框架工程」(harness engineering)。其核心思想是:每当发现智能体犯了某个错误,就花时间从工程上加以解决,确保它永不再犯同样的错误。这里我无意发明新词;如果已有现成的说法,我很乐意采纳。

This comes in two forms:

这主要体现在两种形式中:

  1. Better implicit prompting (AGENTS.md). For simple things, like the agent repeatedly running the wrong commands or finding the wrong APIs, update the AGENTS.md (or equivalent). Here is an example from Ghostty. Each line in that file is based on a bad agent behavior, and it almost completely resolved them all.
    改进隐式提示(AGENTS.md)。针对简单问题,比如智能体反复运行错误命令或使用错误 API,更新 AGENTS.md(或类似文件)。这里有 一个 Ghostty 的示例。该文件中每一行都基于一次智能体的不当行为,而它几乎完全解决了所有这些问题。
  2. Actual, programmed tools. For example, scripts to take screenshots, run filtered tests, etc. This is usually paired with an AGENTS.md change to let the agent know these tools exist.
    实际的程序化工具。例如,用于截图、运行筛选测试等的脚本(列表后附有一个示意脚本)。这通常与 AGENTS.md 的更改配合使用,以让智能体知晓这些工具的存在。
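
译注:针对上面第 2 类「程序化工具」,下面是一个假想的「筛选测试」脚本示意。脚本路径与 pytest 命令都只是示例(并非 Ghostty 的真实文件),配套做法是在 AGENTS.md 中写明「验证改动请运行 python scripts/run_filtered_tests.py <关键词>」。

```python
#!/usr/bin/env python3
# 译注:假想的筛选测试脚本 scripts/run_filtered_tests.py(非真实项目文件)。
# 目的:让智能体只运行与关键词相关的测试,快速、明确地得到通过 / 失败信号。
import subprocess
import sys


def main() -> int:
    if len(sys.argv) < 2:
        print("用法: run_filtered_tests.py <关键词>")
        return 2
    keyword = sys.argv[1]
    # 这里以 pytest 的 -k 过滤为例;实际命令按项目使用的测试框架替换
    result = subprocess.run(
        ["pytest", "-q", "-k", keyword],
        capture_output=True,
        text=True,
    )
    # 输出保持简短,末尾给出机器可读的 PASS / FAIL,方便智能体自我判断
    print(result.stdout[-2000:])
    print("PASS" if result.returncode == 0 else "FAIL")
    return result.returncode


if __name__ == "__main__":
    sys.exit(main())
```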

This is where I’m at today. I’m making an earnest effort whenever I see an agent do a Bad Thing to prevent it from ever doing that bad thing again. Or, conversely, I’m making an earnest effort for agents to be able to verify they’re doing a Good Thing.

这就是我目前的状态。每当看到智能体做出不当行为时,我都会认真投入努力以防止它再次犯错。或者反过来,我正努力让智能体能够自我验证其行为是否正确。

第六步:保持智能体持续运行

Step 6: Always Have an Agent Running

Simultaneous to step 5, I’m also operating under the goal of having an agent running at all times. If an agent isn’t running, I ask myself “is there something an agent could be doing for me right now?”

在与第五步同步推进的同时,我还设定了一个目标:始终保持至少一个智能体在运行。如果没有智能体在运行,我会问自己:「现在有什么事情可以让智能体替我处理吗?」

I particularly like to combine this with slower, more thoughtful models like Amp’s deep mode (which is basically just GPT-5.2-Codex) which can take upwards of 30+ minutes to make small changes. The flip side of that is that it does tend to produce very good results.

我特别喜欢将这个目标与 Amp 的 深度模式 这类速度较慢但更善于思考的模型结合使用(该模式基本上就是 GPT-5.2-Codex),它可能需要超过 30 分钟来完成一些小改动。但好处是,它往往能产出非常出色的结果。

I’m not [yet?] running multiple agents, and currently don’t really want to. I find having the one agent running is a good balance for me right now between being able to do deep, manual work I find enjoyable, and babysitting my kind of stupid and yet mysteriously productive robot friend.

我(目前?)还没有运行多个智能体,也暂时不打算这样做。我发现,现阶段只运行一个智能体,对于我平衡两方面需求很合适:既能进行我喜欢的深度手动工作,又能照看我那位有点「笨拙」却又莫名高效的神秘机器人伙伴。

The “have an agent running at all times” goal is still just a goal. I’d say right now I’m maybe effective at having a background agent running 10 to 20% of a normal working day. But, I’m actively working to improve that.

「始终保持智能体运行」这个目标目前仍然只是一个目标。可以说,目前在一个正常的工作日里,我大概能有 10% 到 20% 的时间有效地让一个智能体在后台运行。但我正在积极努力提升这个比例。

I don’t want to run agents for the sake of running agents. I only want to run them when there is a task I think would be truly helpful to me. Part of the challenge of this goal is improving my own workflows and tools so that I can have a constant stream of high quality work to do that I can delegate. Which, even without AI, is important!

我不想为了运行智能体而运行智能体。我只在认为某项任务真正对我有帮助时才会启动它们。这个目标的部分挑战在于,需要改进我自己的工作流和工具,以便能持续产生高质量的、可以委托出去的任务。这一点,即使没有 AI,也同样重要!

当下

Today

And that’s where I’m at today.

这就是我目前的处境。

Through this journey, I’ve personally reached a point where I’m having success with modern AI tooling and I believe I’m approaching it with the proper measured view that is grounded in reality. I really don’t care one way or the other if AI is here to stay, I’m a software craftsman that just wants to build stuff for the love of the game.

通过这段旅程,我个人在使用现代 AI 工具方面已经取得了一些成功,并且我相信我正以一种基于现实的、审慎的态度来对待它。我其实并不太在意 AI 是否会永久存在 [3],我是一个软件工匠,仅仅因为热爱这份事业而想要创造东西。

The whole landscape is moving so rapidly that I’m sure I’ll look back at this post very quickly and laugh at my naivete. But, as they say, if you can’t be embarrassed about your past self, you’re probably not growing. I just hope I’ll grow in the right direction!

整个领域发展如此之快,我敢肯定我很快就会回头看这篇文章,并嘲笑自己的天真。但是,正如人们所说,如果你不为过去的自己感到尴尬,那你可能并没有在成长。我只希望自己能朝着正确的方向成长!

I have no skin in the game here, and there are of course other reasons beyond utility to avoid using AI. I fully respect anyone’s individual decisions regarding it. I’m not here to convince you! For those interested, I just wanted to share my personal approach to navigating these new tools and give a glimpse about how I approach new tools in general, regardless of AI.

我在这方面并无既得利益 [4],而且除了实用性的考量之外,当然还有其他选择不用 AI 的理由。我完全尊重任何人的个人决定。我写这些并不是为了说服你!我只是想和那些感兴趣的人分享一下我个人使用这些新工具的方法,并大致展示我通常是如何对待新工具的,无论是否与 AI 相关。


  1. Modern coding models like Opus and Codex are specifically trained to bias towards using tools compared to conversational models.
    相较于对话模型,像 Opus 和 Codex 这样的现代编程模型在训练时就被专门设计为更倾向于使用工具。 ↩︎

  2. Due to the rapid pace of innovation in models, I have to constantly revisit my priors on this one.
    由于模型迭代速度极快,我必须持续修正自己对此的原有认知。 ↩︎

  3. The skill formation issues particularly in juniors without a strong grasp of fundamentals deeply worries me, however.
    但最让我深感忧虑的是,这对基础不扎实的初级开发者可能造成的技能养成缺陷。 ↩︎

  4. I don’t work for, invest in, or advise any AI companies.
    我不受雇于任何 AI 公司,也未进行相关投资或提供咨询服务。 ↩︎

#Translation