My name is Jay and this site serves as a simple place to collect and organize my thoughts.
Now, full disclosure, the title is a bit tongue-in-cheek. I am not shoving my head in the ground like an ostrich and refusing to see the realities of the world. AI, or more precisely Large Language Models, are and will remain a part of the programming world for the foreseeable future. That does not mean, however, that their broader effects should not be called out and articulated. Thousands of proverbial man-hours and gallons of virtual ink have already been expended discussing the pros, the cons, the hype, and the hatred of LLMs for programmer assistance, programmer replacement, and who knows what else. A brave new world has been aggressively declared by some, while the same reality has been decried by others.
I am not going to attempt to reword what so many others have already written. Instead, I want to focus on a secondary effect that I have not seen much active discussion of, but one that cuts deep to the heart of what it means to be an engineer on a team or in a community.
Trust

It underpins everything about how a group of engineers functions and interacts in all technical contexts. When you discuss a project's architecture, you are trusting that your teammates have the experience and viewpoints to back up their assertions. When you review a patch submitted by a teammate, you are trusting that they have the contextual and organizational knowledge to competently solve the given task. As much as we would like to think we treat all engineers equally and apply the same level of critique and thoughtfulness to everyone, in the real world that is rarely the case. Implicit biases and reputations, both fair and unfair, underpin all your interactions with your fellow engineers. A classic illustration of this is simply the way you review patches submitted by a junior engineer versus a seasoned peer. If the junior has not yet earned a level of trust, the patch must mentally go through a more robust review, ensuring it is free of mistakes that you trust your other peers not to make. This is not to say that you should review your peers' patches with flippancy or a rubber stamp. But the reality is that trust brings a certain amount of "benefit of the doubt." In the open source community, it can take many, many patches and many years to earn "street cred." If we consult the Linux kernel documentation on being a maintainer, we find the following quote:
Maintainers must be human, therefore, it is not acceptable to add a mailing list or a group email as a maintainer. Trust and understanding are the foundation of kernel maintenance and one cannot build trust with a mailing list. Having a mailing list in addition to humans is perfectly fine
https://docs.kernel.org/maintainer/feature-and-driver-maintainers.html#multiple-maintainers
In open source communities, your reputation is everything: it is built over years and can be lost in an instant.
It is hardly interesting to read that trust is fundamental to all parts of engineering and, in a larger scope, to all of life. But this is not a blog about life, and I don't purport to have the experience or education to make such sweeping statements. I will instead stay in the lane I know, which is engineering and leading teams. It is in this area that I find the subtleties of LLMs flip the traditional calculus of trust that forms the foundation of competent projects.
An LLM-augmented engineer is capable of outputting far, far more "Lines of Code" than a non-augmented engineer could ever hope to achieve. Indeed, if "Lines of Code" were the primary output that software engineering as a field was judged upon, then I dare say the advent of Large Language Models would be the ultimate and final death of the entire industry. Seeing as that has not yet happened, and does not appear to be on the immediate horizon, there must be other factors at play. So, if an LLM-augmented engineer is capable of vastly increased "Lines of Code" output, it stands to reason that their other abilities must be augmented as well, right? Well, that is what some people in the industry would tell you. Many of those same people also seem to be oddly close, financially, to the concept itself, but that is a separate blog post and a separate rant entirely.
The reality is that LLMs enable an inexperienced engineer to punch far above their proverbial weight class. That is to say, they can immediately work with concepts that might otherwise have taken days, months, or even years to reach that level of output. This sounds great: another lofty abstraction to remove the headache of low-level knowledge requirements. A breakthrough of the kind we saw when the first assembler was written, then the first Fortran compiler, followed by the packaging of established concepts like garbage collection and JIT compilation into the HotSpot Virtual Machine, helping Java take yet another leap forward in the general-purpose availability of "the ability to write code that kinda works." If one were knowledgeable about the industry's history, or had even lived through these innovations, it would be an understandable, dare I say even expected, assumption that LLMs merely represent the next great industry breakthrough. And yet, this time something is fundamentally different, something sinister and dangerous... the ability to be wrong, or, to be clear, the fundamental baked-in design principle of being wrong. While the industry-leaping abstractions that came before focused on removing complexity, they did so with the fundamental assertion that the abstraction they created was correct. That is not to say they were perfect, or that they never caused bugs or failures. But those events were failures of a given implementation, a departure from what the abstraction was SUPPOSED to do, and every mistake, once patched, led to a safer, more robust system. LLMs, by their very design, are probabilistic prediction engines; they merely approximate correctness for varying amounts of time.
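To make that distinction concrete, here is a minimal, purely illustrative sketch in Python. The tiny "next token" table is invented for this post and is nothing like a real model; the point is only that a classical abstraction maps the same input to the same output every time, while a probabilistic predictor samples from a distribution and is, by design, sometimes wrong.

import random

# A classical abstraction: deterministic. If it is wrong, that is a bug
# in the implementation, and fixing it makes the abstraction more robust.
def add_one(x: int) -> int:
    return x + 1

# A probabilistic predictor: it samples a "plausible" continuation from a
# distribution. Being wrong some fraction of the time is not a bug, it is
# the operating principle. (Toy, hand-written distribution; real models
# condition on vastly more context and a vastly larger vocabulary.)
NEXT_TOKEN = {"return x +": [("1", 0.90), ("2", 0.07), ("y", 0.03)]}

def sample_next_token(prefix: str) -> str:
    tokens, weights = zip(*NEXT_TOKEN[prefix])
    return random.choices(tokens, weights=weights, k=1)[0]

print(add_one(41))                      # always 42
print(sample_next_token("return x +"))  # usually "1", occasionally not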
Once that time runs out, there is a sharp drop-off in model accuracy; it simply cannot continue to offer you an output that even approximates something workable. I have taken to calling this phenomenon the "AI Cliff," as it is very sharp and very sudden. It is as if the LLM will happily paddle you out to sea with promises of a paradise on the other side, then throw away your oar and disappear.
These facts are well understood, but how does this new type of "abstraction" contribute to a fundamental change in trust within engineering teams? The answer is simple: by allowing inexperienced engineers to approximate a correct output for something they lack experience with, LLMs subvert the standard markers of trust. No longer can one rely on the simple uniqueness of an environment to force an original, deliberately designed solution. LLMs provide approximately contextual solutions to a given problem that may or may not be correct. The old assumption that the ability to generate a novel working solution in a unique context IMPLIED a level of understanding of that solution is no longer accurate. A novel solution can be generated while being completely misunderstood at the same time. That being said, this is hardly a NEW occurrence; for decades people have joked about software engineers being "copy paste" experts or overly relying on sites like StackOverflow to provide all their code. LLMs, however, ratchet the copy-paste ability up to an exponentially higher level, generating more complex solutions that are possibly not understood by the engineer submitting the changes. This ability to submit relatively novel solutions that are also possibly not understood means the onus is on the reviewer or the community to ensure, at a deeper level, that a given patch does what it is supposed to do. No longer can one rely on the competency markers that previously signaled a given engineer had the experience and skills to consider all manner of edge cases and pitfalls inherent to their domain. In a perfect world, this development is not an issue; skilled, cohesive teams will continue to function at a high level, as each member can be trusted not to submit a patch they do not fully understand. That team structure is not, however, a universal nicety everyone enjoys. Modern software teams are made up of all manner of skill sets, trust levels on these teams vary, and the engineers have generally self-selected into a loose hierarchy. It is on these teams that placing an even STRICTER review burden on the reviewers will have the most painful effect. If they are reviewing a patch submitted by a team member who they know likes to use LLM-generated code, the review requires far deeper scrutiny than an equivalent review of someone who uses them sparingly or not at all. This effect will be felt most strongly in open source communities, where the trust level of contributors is inherently low to start with.
LLMs are clearly here to stay. If I were to guess, there will be a varied response to this erosion of trust. Some possible options are:
These are, of course, just general guesses, and none solves the problem in a particularly compelling way. I do not have answers, only a distinct impression that large changes will be required for the industry to continue to function.