Let a Thousand Apps Bloom

Looking back on the development of Inksightful, my diary-scanning app, the experience I remember most clearly came in the spring of this year. For the first time, I was using my app to scan a diary that I was still writing in rather than one written years ago, and I found I needed to rescan a page, which called for a brand-new feature. I gave a description of the feature to Codex, and it started implementing it. It was a tricky feature that touched the guts of the app and changed some of the fundamental assumptions I’d made when I designed it (I never expected pages to change after scanning). It took several attempts to get things right, and there were initially bugs in the implementation. No big deal… except that every iteration required me to run Codex’s latest attempt on my phone (the only thing with a real camera) and tap buttons to rescan a page. I’d designed a system in which, instead of being the developer, I was forced to be a one-man QA team for Codex.

My productivity breakthrough came after I worked with Codex to refactor my application: I moved all core business logic to a module that could be compiled for both macOS and iOS, and then created a command-line tool that could exercise the core logic. At first glance, it may look like silly work, as no human will ever use the command-line tool. It exists just for Codex. However, it was one of the most important steps I took while developing the application. After this change, I no longer had to tap buttons on my phone to test changes to “rescanning” or other core features: Codex could initiate its own rescans using the command-line tool and some JPG files. This change in the guts of my project simultaneously led to better output from Codex (it could spot and correct its own mistakes) and required less oversight from me. The work became both more rewarding and faster!
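For readers who want to picture that layout, here is a minimal sketch of a Swift package manifest with a shared core module and a command-line target. It is not Inksightful's actual configuration: the names `InksightfulCore`, `InksightfulCLI`, and `inksightful-cli`, along with the platform versions, are assumptions made up for illustration.

```swift
// swift-tools-version:5.9
// Hypothetical Package.swift: every name and version below is illustrative,
// not taken from the real project.
import PackageDescription

let package = Package(
    name: "InksightfulCore",
    // Declare both platforms so the same core module builds for the
    // iOS app and for a command-line tool on a Mac.
    platforms: [.macOS(.v13), .iOS(.v16)],
    products: [
        // Core business logic, linked into the iOS app.
        .library(name: "InksightfulCore", targets: ["InksightfulCore"]),
        // Command-line tool that an agent can run without a phone.
        .executable(name: "inksightful-cli", targets: ["InksightfulCLI"]),
    ],
    targets: [
        // Sources/InksightfulCore: scanning and rescanning logic, no UI code.
        .target(name: "InksightfulCore"),
        // Sources/InksightfulCLI: thin wrapper that feeds image files to the core.
        .executableTarget(
            name: "InksightfulCLI",
            dependencies: ["InksightfulCore"]
        ),
    ]
)
```

With a layout along these lines, an agent working on a Mac could build and launch the tool with `swift run inksightful-cli` and point it at sample JPG files, so the core logic gets exercised without anyone touching a phone.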

My conclusion from this experience: it’s one thing to read a tweet that says, “Give agents a way to check their own work,” and quite another to figure out how to do that for a project and feel how much more fun it becomes.

That’s not the only time working on this project made me realize a huge gulf of understanding exists between reading about something on LinkedIn and experiencing it. If all you did was read about “the AI Inflection Point,” you missed the simultaneously exhilarating, disorienting, and scary feeling that comes when you realize you can hand medium-sized features to Codex and trust that they’ll get done well. When I lived through it last December, it felt like one of those weird science-fiction phase shifts, a little glitch after which everything looks the same but all of the rules have changed.

In my day job, I’m an Engineering Director at Duolingo. My primary responsibility is to support the career development of individual engineers and to help teams perform well. I believe, deep in my bones, I would be worse at my job if my understanding of AI coding tools came only from reading about them:

The people I support are starting to ask hard questions, like: What value do I bring to the company when AI writes code faster than I do? How does my career grow now that the meaning of “software engineer” is redefined every few months? I can help people answer these questions because I’ve had to come up with my own authentic answers when working on my own project.
My experience with Inksightful’s command-line tool convinced me that we now have a new expectation for excellent engineers: Do you create systems that maximize the effectiveness of AI agents? The manager in me has noticed a gap: this important new trait is not yet captured in the career ladders we use to help people plan their career progression.

However, my job at Duolingo also puts me in meetings for much of the workday, and the sad reality is the time left over is often in fragmented blocks that hinder deep, focused work. I don’t write code for my job anymore, and rarely ask agents to write code on my behalf. I’m grateful I’ve gotten my hands-on experience with Codex, but we can’t expect all managers to sacrifice their evenings and weekends on indie software projects. One of the big challenges software companies face in the age of AI is figuring out how to organize teams so their managers have both the opportunity and the expectation to regularly use AI to ship features. Given the rate of change, a one-time hackathon isn’t good enough.

Until that time comes, shipping an indie app is a strangely effective way for managers to get the hands-on experience that will help them in their jobs. If you want to try this yourself, I have some tips:

Really ship the app. Put something out in the world with your name on it. It will force you to pay attention to quality. Without that, you may miss the frustration at the difficulty of getting the details just right when working with AI. The engineers you support are feeling that frustration. You’ll do better if you’ve felt it, too.
Don’t spend more than $20/month on an AI plan. Rate limits are your friend for an evening / weekend project.
Pick a weird project that’s meaningful to you. Writing software is fun, but shipping software is hard. Working on a project that was deeply meaningful to me but that nobody else was working on gave me the motivation to keep going when things started to drag.

I look forward to all of the weird and meaningful projects engineering managers ship with AI. Let a thousand apps bloom!

Flowers in a garden at the Center for Urban Horticulture in Seattle, WA.