LLMs may allow users to outsource some thinking, research, and decision making to focus on higher-level tasks. This is similar to consultancy in companies, paying for expert advice, etc. Some people worry that "outsourcing your thinking" in general will lead to atrophy of necessary skills. What can we learn from historical examples of outsourcing by companies and individuals? Please first list tons of examples where this happens (100 ideally). I'll follow up by asking you to do a deep dive into a particular one or something.
- Scribes (ancient Mesopotamia/Egypt/Rome): elites outsource writing, recordkeeping, correspondence.
- Medieval monastic copyists: outsourcing book reproduction and textual preservation to monasteries.
- Notaries/public scribes (Middle Ages–modern): drafting contracts, wills, legal forms for the illiterate/busy.
- Court clerks and chancery staff: rulers outsource administration, decrees, and bureaucratic memory.
- Town criers: communities outsource public announcements and dissemination of information.
- Libraries/archivists: institutions outsource long-term memory and document retrieval.
- Accountants/bookkeepers (Renaissance onward): merchants outsource double-entry bookkeeping and financial control.
- Tax farming (ancient Rome, Ottoman, French Ancien Régime): states outsource tax collection to contractors.
- Customs brokers: firms outsource navigating tariffs, import rules, and paperwork.
- Freight forwarders: companies outsource route planning, shipping coordination, documentation.
- Shipping insurers/Lloyd’s: merchants outsource risk assessment and pooling for maritime trade.
- Underwriters/actuaries: organizations outsource probabilistic forecasting (life, fire, shipping).
- Bailiffs/debt collectors: creditors outsource collections and enforcement.
- Mercenaries/condottieri: states outsource military capability and tactical expertise.
- Privateers: governments outsource naval warfare to licensed entrepreneurs.
- Intelligence networks/spies: rulers outsource information gathering and analysis.
- Diplomatic envoys: sovereigns outsource negotiation, cultural translation, political judgment.
- Astrologers at court: elites outsource decision timing/strategy to “expert” star-reading.
- Oracles (Delphi, etc.): individuals/states outsource high-stakes decisions to religious authorities.
- Priests/confessors: people outsource moral guidance and personal counsel.
- Matchmakers: families outsource partner search, vetting, and negotiation.
- Midwives/doulas: families outsource childbirth expertise and risk management.
- Physicians: individuals outsource diagnosis and treatment planning.
- Pharmacists/apothecaries: outsourcing compounding, drug knowledge, interactions.
- Nurses: patients outsource monitoring and day-to-day clinical judgment.
- Dentists: people outsource oral health maintenance/decisions.
- Veterinarians: farmers outsource animal health expertise and interventions.
- Agronomists/extension agents: farmers outsource crop planning, pest management, soil advice.
- Weather forecasters/almanacs: sailors/farmers outsource prediction of conditions.
- Navigators/pilots (age of sail): captains outsource specialized navigation in hazardous waters.
- Cartographers: states and merchants outsource map-making and geographic knowledge.
- Surveyors: landowners outsource measurement, boundary setting, technical accuracy.
- Engineers: organizations outsource design and safety calculations (bridges, rail, factories).
- Architects: clients outsource spatial planning, codes, aesthetics, structural integration.
- Master builders/guilds: patrons outsource construction methods and quality control.
- Project managers: firms outsource coordination, scheduling, risk tracking.
- Quality inspectors: factories outsource checking and standards compliance.
- Standards bodies (weights/measures): markets outsource trust and calibration to institutions.
- Auditors: investors/boards outsource verification and fraud detection.
- Credit rating agencies: markets outsource assessment of default risk (e.g., bonds).
- Credit bureaus/credit scores: lenders outsource borrower assessment to centralized scoring.
- Loan officers/underwriting departments: banks outsource lending decisions from individuals to process.
- Insurance adjusters: claims handling and judgment outsourced from policyholders to professionals.
- Lawyers/solicitors: individuals/firms outsource legal interpretation, strategy, drafting.
- Patent attorneys/agents: inventors outsource claims strategy and filings.
- Compliance officers: firms outsource interpreting regulations and designing controls.
- HR recruiters/headhunters: companies outsource candidate sourcing, screening, negotiation.
- Payroll providers: firms outsource wage calculation, tax withholding, filings.
- Benefits administrators: companies outsource complex plan management and enrollment.
- Training/continuing education vendors: companies outsource skill-building programs.
- Management consultants: executives outsource diagnosis of organizational problems and strategy options.
- Strategy “war rooms”/think tanks: governments outsource policy research and long-range analysis.
- Lobbyists: organizations outsource political persuasion and legislative monitoring.
- PR agencies: firms outsource messaging, media relations, crisis communications.
- Advertising agencies: companies outsource market research, creative strategy, campaign execution.
- Market research firms (surveys/focus groups): firms outsource understanding customer preferences.
- Branding firms: startups outsource naming, positioning, identity design.
- Sales outsourcing/rep firms: manufacturers outsource customer acquisition and relationship management.
- Customer support call centers: companies outsource troubleshooting and user guidance.
- Virtual assistants/executive assistants: individuals outsource scheduling, reminders, correspondence triage.
- Switchboard operators (early telecom): users outsource call routing and directory knowledge.
- Concierges: travelers outsource local planning and reservations.
- Travel agents: people outsource itinerary design, booking, problem resolution.
- Event planners: clients outsource logistics, vendor negotiation, scheduling.
- Caterers: hosts outsource meal planning and execution.
- Household servants/butlers (historical): elites outsource domestic management and social coordination.
- Nannies/governesses: families outsource childcare and education.
- Tutors: students outsource guidance and feedback for learning.
- Editors: authors outsource clarity, structure, fact-checking, style.
- Ghostwriters: public figures outsource drafting speeches/books/op-eds.
- Translators/interpreters: organizations outsource language understanding and cultural nuance.
- Court reporters: legal system outsources accurate transcription and record.
- Secretarial pools/typists: offices outsource document production and formatting.
- Dictation machines + transcriptionists: professionals outsource writing from speech to specialists.
- Spellcheck/grammar check (software): individuals outsource proofreading and correctness.
- Style guides/templates: organizations outsource consistency decisions to predefined rules.
- Photocopiers/printing services: offices outsource document reproduction at scale.
- IT managed service providers: companies outsource system administration and uptime.
- Cloud hosting: firms outsource infrastructure capacity planning and maintenance.
- Cybersecurity firms/SOCs: companies outsource threat monitoring and incident response.
- Penetration testers: orgs outsource security evaluation and adversarial thinking.
- Data analysts/BI teams: executives outsource metric definition and interpretation.
- Data science teams: product groups outsource forecasting, ranking, and experimentation analysis.
- A/B testing platforms: companies outsource experiment plumbing and statistical routines.
- Algorithmic trading systems: investors outsource trading decisions to automated strategies.
- Portfolio managers/financial advisors: individuals outsource asset allocation and planning.
- Robo-advisors: individuals outsource investment decisions to standardized algorithms.
- Retirement plan target-date funds: savers outsource rebalancing and glidepath decisions.
- Estate planners/trust officers: families outsource intergenerational planning and tax strategy.
- Professional negotiators/buyers: firms outsource supplier negotiation and procurement strategy.
- Procurement platforms: companies outsource vendor selection processes to systems/workflows.
- Legal document automation: firms outsource contract drafting patterns to templates/software.
- Cooks/meal kit services: individuals outsource meal planning and ingredient sourcing.
- Personal trainers: people outsource exercise programming and adherence coaching.
- Therapists/coaches: individuals outsource structured reflection, coping strategies, goal planning.
- Diagnostic labs: clinicians outsource testing and interpretation to specialized labs.
- Radiologists/pathologists: hospitals outsource image/tissue interpretation to specialists.
- GPS navigation (consumer): drivers outsource route planning and spatial memory to devices.
- Calculators/spreadsheets: workers outsource arithmetic and many routine quantitative steps.
- Search engines: people outsource information retrieval, “where do I find this?” memory, and initial research.
If you tell me which one (or a small cluster) you want to zoom in on, I can do a historical deep dive focused on what skills atrophied, what new skills emerged, and what institutions/practices mitigated the downsides.
Can you use these examples, or any others, to critique or support the points in this post?
Outsourcing thinking
Erik Johannes Husom, 30 Jan 2026

First, a note to the reader: This blog post is longer than usual, as I decided to address multiple connected issues in the same post, without being too restrictive on length. With modern browsing habits and the amount of available online media, I suspect this post will be quickly passed over in favor of more interesting reading material. Before you immediately close this tab, I invite you to scroll down and read the conclusion, which hopefully can give you some food for thought along the way. If, however, you manage to read the whole thing, I applaud your impressive attention span.
A common criticism of the use of large language models (LLMs) is that it can deprive us of cognitive skills. The typical argument is that outsourcing certain tasks can easily cause some kind of mental atrophy. To what extent this is true is an ongoing discussion among neuroscientists, psychologists and others, but to me, the understanding that with certain skills you have to "use it or lose it" seems intuitively and empirically sound.
The more relevant question is whether certain kinds of use are better or worse than others, and if so, which? In the blog post The lump of cognition fallacy, Andy Masley discusses this in detail. His entry point to the problem is to challenge the idea that "there is a fixed amount of thinking to do", and how it leads people to the conclusion that "outsourcing thinking" to chatbots will make us lazy, less intelligent, or in other ways be negative for our cognitive abilities. He compares this to the misconception that there is only a finite amount of work that needs to be done in an economy, which often is referred to as "the lump of labour fallacy". His viewpoint is that "thinking often leads to more things to think about", and therefore we shouldn't worry about letting machines do the thinking for us — we will simply be able to think about other things instead.
Reading Masley's blog post prompted me to write down my own thoughts on the matter, as it has been churning in my mind for a long time. I realized that it could be constructive to use his blog post as a reference and starting point, because it contains arguments that are often brought up in this discussion. I will use some examples from Masley's post to show how I think differently about this, but I'll extend the scope beyond the claimed fallacy that there is a limited amount of thinking to be done. I have done my best to write this text in a way that does not require reading Masley's post first. My aim is not to refute all of his arguments, but to explain why the issue is much more complicated than "thinking often leads to more things to think about". Overall, the point of this post is to highlight some critical issues with "outsourcing thinking".
When should we avoid using generative language models? Is it possible to define categories of activities where the use of LLMs (typically in the form of chatbots) is more harmful than helpful? Masley lists certain cases where, in his view, it is obviously detrimental to outsource thinking. To fully describe my own perspective, I'll take the liberty to quote the items on his list. He writes it's "bad to outsource your cognition when it:"
- Builds complex tacit knowledge you'll need for navigating the world in the future.
- Is an expression of care and presence for someone else.
- Is a valuable experience on its own.
- Is deceptive to fake.
- Is focused in a problem that is deathly important to get right, and where you don't totally trust who you're outsourcing it to.

I was surprised to discover that we are to a large extent in agreement on this list, despite having fundamentally different views otherwise. The disagreement lies, I believe, in the amount of activities that fall within the categories outlined above, particularly three of them.
Personal communication and writing

Let's start with the point "Is deceptive to fake". Masley uses the example of:
If someone’s messaging you on a dating app, they want to know what you’re actually like.
Very true, but in my view, it's not only in such intimate or private situations where it is deceptive to fake what you are like. Personal communication in general is an area where it matters how we express ourselves, both for ourselves and those we talk or write to. When we communicate with each other, there are certain expectations framing the whole exchange. Letting our words and phrases be transformed by a machine is a breach of those expectations. The words we choose and how we formulate our sentences carry a lot of meaning, and direct communication will suffer if we let language models pollute this type of interaction. Direct communication is not only about the information being exchanged, it's also about the relationship between the communicators, formed by who we are and how we express ourselves.
I think this is not only relevant for communication between two humans, but also for text with a personal sender conveyed to a human audience in general. To a certain extent, the same principles apply. There has been a debate in the Norwegian media lately regarding the undisclosed use of LLMs in public writing, with allegations and opinions flying around. I'm very happy to see this discussion reaching broad daylight, because we need to clarify our expectations of communication, now that chatbots are being so widely used. While I clearly think that it is beneficial to keep human-to-human communication free from an intermediate step of machine transformation, not everyone shares that view. If, going forward, our written communication will for the most part be co-authored with AI models, we need to be aware of it, and shift our expectations accordingly. Some have started disclosing when they have used AI in their writing, which I think is a good step towards better understanding of our use of LLMs. Knowing whether a text is written or "co-authored" by an LLM has an important effect on how a receiver views it; pretending otherwise is simply false.
Many see LLMs as a great boon for helping people express their opinions more clearly, particularly for people not using their native language or those who have learning disabilities. As long as the meaning originates from a person, LLMs can help express that meaning in correct and effective language. I have two main objections to this. The first one is about what happens to the text: In most cases it's impossible to separate the meaning from the expression of it. That is in essence what language is — the words are the meaning. Changing the phrasing changes the message. The second one is about what happens to us: We rob ourselves of the opportunity to grow and learn, without training wheels. LLMs can certainly help people improve the text, but the thinking process — developing the ideas — will be severely amputated when the phrasing is left up to an AI model. They quickly become a replacement instead of help, depriving us of the opportunity to discover our own voice and who we can be and become when we stand on our own two feet.
With great care, one may be able to use a chatbot without being affected by these two drawbacks, but the problem is that with LLMs, there is an exceptionally thin line between getting help with spelling or grammar, and having the model essentially write for you, thereby glossing over your own voice. This is unavoidable with the current design of chatbots and LLM-powered tools; the step from old-school autocorrect to a generative language model is far too big. If we really envision LLMs as a tool for helping people become better at writing, we need to have a much more carefully considered interface than the chatbots we have today.
At the same time, I realize many are far more utilitarian. They just want to get the job done, finish their work, file that report, get that complaint through, answer that email, in the most efficient way possible, and then get on with their day. Getting help from an LLM to express oneself in a second language also seems useful, without considering how much or little one learns from it (I would be more positive toward LLMs for translation if it wasn't for the fact that current state-of-the-art LLMs are simply very bad at producing Norwegian text. I can only hope the situation is better for other non-English languages, or that it will improve over time). Additionally, LLMs seem to be efficient for people who are fighting with bureaucracy, such as filing complaints and dealing with insurance companies. In this case the advantage seems greater. We must, however, remember that the "weapon" exists on both sides of the table. What will happen to bureaucratic processes when all parties involved are armed with word generators?
It is not without reservation that I express these opinions, because it may come across as if I want to deny people something that looks like a powerful tool. The point is that I think this tool will make you weaker, not stronger. LLMs don't really seem to empower people. Part of the effect I currently see is that the number of applications to various calls (internships, research proposals, job openings) is multiplying while the quality is dropping. Students are asking chatbots for help with solving collaborative tasks, not realizing that everyone is asking the same chatbot, robbing us of the diversity of ideas that could have formed if they took a minute to think for themselves.
The chatbots may have lowered the threshold for participation, but the competition's ground rules haven't changed. To get better at writing, you need to write. The same goes for thinking. Applying for a job means showing who you are, not who the LLM thinks you are, or should be. Participating in the public debate means working out how to express opinions in clear language. Am I really participating if I'm not finding my own words?
It is important to note that not all text is affected in the same way. The category of writing that I like to call "functional text", which covers things like computer code and pure conveyance of information (e.g., recipes, information signs, documentation), is not exposed to the same issues. But text that has a personal author addressing a human audience has particular role expectations and rests on a particular trust. An erosion of that trust will be a loss for humanity.
A pragmatic attitude would be to just let the inflation of text ensue, and take stock after the dust has settled. What will be left of language afterwards? My conservative viewpoint stems from believing that what we will lose is of greater worth than what we gain. While LLMs can prove useful in the short term, using them is treating a symptom instead of the problem. It is a crutch, although some may truly be in need of that crutch. My only advice would be to make sure you actually need it before you lean on it.
Valuable experiences

Using LLMs is not only about writing. Masley mentions that it's bad to outsource activities that are "a valuable experience on its own". I couldn't agree more, but I suspect that he will disagree when I say that I think this category encompasses a lot of what we already do in life. Major LLM providers love to show how their chatbots can be used to plan vacations, organize parties, and create personal messages to friends and family. I seldom feel more disconnected from the technological society than when I watch these advertisements.
To me, this highlights a problem that goes to the core of what it means to be human. Modern life brings with it a great many activities that can feel like chores, but at the same time it seems like we are hell-bent on treating everything as a chore as well. Humans are surprisingly good at finding discontentment in nearly anything, maybe because of an expectation in modern society that we should be able to do anything we want, anytime we want it — or perhaps more importantly, that we should be able to avoid doing things we don't feel like doing. Our inability to see opportunities and fulfillment in life as it is leads to the inevitable conclusion that life is never enough, and we would always rather be doing something else.
In theory, I agree that automating some things can free up time for other things that are potentially more meaningful and rewarding, but we have already reached a stage where even planning our vacation is a chore that apparently a lot of people would like to avoid doing. I hope that AI's alleged ability to automate "nearly anything" helps us realize what is worth spending time and effort on, and rediscover the value of intentional living.
Building knowledge

The third point I would like to address is that we shouldn't use chatbots when it "builds complex tacit knowledge you'll need for navigating the world in the future", according to Masley. Again, I agree completely, and again, I think that this point encompasses a great deal of daily life. Building knowledge happens not only when you sit down to learn something new, but also when you do repetitive work.
This misconception is not new with chatbots, but has been present since we started carrying smartphones in our pockets. With the internet at hand at all times, there's apparently no need to remember information anymore. Instead of using our brains for storing knowledge, we can access information online when we need it, and spend more time learning how to actually use the information and think critically. The point we are missing here is that acquiring and memorizing knowledge is a huge part of learning to use the knowledge. It is naive to think that we can simply separate the storage unit from the processing unit, as if we were a computer.
I learned this lesson as a piano student. I was trying to understand jazz, and figure out how good improvisers could learn to come up with new phrases so easily on the spot. How does one practice improvisation? Is it possible to exercise the ability to come up with something new that immediately sounds good? I ended up playing similar riffs almost every time I tried. After a while I became convinced that good jazz players must be born with some inherent creativity, some inner musical inspiration that hummed melodies inside their heads for them to play.
One of my tutors taught me the real trick: Good improvisation comes not from just practicing improvisation. You need to play existing songs and tunes, many of them, over and over, learn them by heart, get the chord progressions and motifs under your skin. This practice builds your intuition for what sounds good, and your improvisation can spring from that. Bits and pieces of old melodies are combined into new music. In that sense, we are more like a machine learning model than a computer, but do not make the mistake of thinking that is actually what we are.
There is a need for clarification here: I'm not saying that nothing should be automated by LLMs. But I think many are severely underestimating the knowledge we are building from boring tasks, and we are in danger of losing that knowledge when the pressure for increased efficiency makes us turn to the chatbots.
The extended mind

As a sidenote, I would like to contest the idea of the extended mind, as explained by Masley:
[M]uch of our cognition isn’t limited to our skull and brain, it also happens in our physical environment, so a lot of what we define as our minds could also be said to exist in the physical objects around us.
It seems kind of arbitrary whether it’s happening in the neurons in your brain or in the circuits in your phone.
This statement is simply absurd, even when read in context. The fact that something happens in your brain rather than on a computer makes all the difference in the world. Humans are something more than information processors. Yes, we process information, but it is extremely reductionist to treat ourselves as objects where certain processes can be outsourced to external devices without consequences. Does it really matter if I remember my friend's birthday, when I can have a chatbot send them an automated congratulation? Yes, it matters because in the first case you are consciously remembering and thinking about your friend, consolidating your side of the relationship.
The quoted statement above is followed up with:
It’s true that you could lose your phone and therefore lose the stored knowledge, but you could also have a part of your brain cut out.
Losing your phone and losing a part of your brain are two tremendously different things, both in terms of likelihood and consequences. Not only does the statement above significantly underestimate the processes that happen in our brains, but to even liken having a part of your brain cut out to losing your phone reveals that the premise of the argument is severely detached from reality.
The design of our built environments is also brought up to show how it's beneficial to minimize the amount of thinking we do:
[M]ost of our physical environments have been designed specifically to minimize the amount of thinking we have to do to achieve our daily goals.
Try to imagine how much additional thinking you would need to do if things were designed differently.
This doesn't hold up to scrutiny. Yes, if our environment suddenly changed, it would require extra mental effort from us to navigate. For a time. But then we would grow familiar with that alternative design and adapt. The only case where we would have to keep doing additional thinking is if the design of our physical environments changed all the time.
What we think about does matter

Regarding the "lump of cognition fallacy", I fully agree that we need not worry about "draining a finite pool" of thinking, leaving "less thinking" — whatever that means — for humans. There is, however, another fallacy at play here, which is that "it does not matter what we think about, as long as we think about something". It is easy to be convinced that if a computer can do the simple, boring tasks for me, I can deal with more complex, exciting stuff myself. But we must be aware that certain mental tasks are important for us to do, even though a machine technically could do them for us.
To illustrate: If I outsource all my boring project administration tasks to a chatbot, it can leave more time for my main task: research. But it will also rob me of the opportunity to feel ownership of the project and build a basis for making high-level decisions in it. In a hypothetical situation where a chatbot performs all administrative tasks perfectly on my behalf, I will still have lost something, which may in turn have an impact on the project. I'm not saying that no tasks should be automated at all, but we must be aware that we always lose something when automating a process.
Comparing with the "lump of labour" fallacy again: While it may be true that outsourcing physical work to machines will simply create new types of work to do, it doesn't mean that the new work is useful, fulfilling, or beneficial for individuals and society. The same goes for thinking. We must acknowledge that all kinds of thinking have an effect on us, even the boring and tedious kinds. Removing the need for some cognitive tasks can have just as much influence, positive or negative, as taking up new types of cognitive tasks.
Conclusion

We have a major challenge ahead of us in figuring out what chatbots are suitable for in the long term. Personal communication may change forever (that is to say, maybe it won't stay personal anymore), education systems will require radical adaptations, and we need to reflect more carefully about which experiences in life actually matter. What is truly exciting about this new type of technology is that it forces us to face questions about our humanity and values. Many formerly theoretical questions of philosophy are becoming relevant for our daily lives.
A fundamental point I'm trying to bring forth is that how we choose to use chatbots is not only about efficiency and cognitive consequences; it's about how we want our lives and society to be. I have tried to argue that there are good reasons for protecting certain human activities against the automation of machines. This is in part based on my values, and does not rely on research into whether or not our efficiency at work or cognitive abilities are affected by it. I cannot tell other people what they should do, but I challenge everyone to consider what values they want to build our communities on, and let that weigh in alongside what the research studies tell us.
Below are places where historical “outsourcing” strongly supports Husom’s concerns, and places where it pushes back—using analogues from writing, bureaucracy, navigation, management, medicine, etc. The throughline from history is: outsourcing rarely causes a simple “people stop thinking”; it changes which thinking happens, who does it, and what failure modes appear.
1) Personal communication, authenticity, and “deceptive to fake”
What history supports in the post
Husom’s claim: undisclosed LLM help in personal writing violates expectations; phrasing carries meaning; trust erodes.
Strong historical analogues:
- Ghostwriters and speechwriters. For centuries, public figures have outsourced words. The social system adapted by creating role-based expectations: a president is expected to have speechwriters; a friend texting you is not. The key variable has been norms + disclosure. When ghostwriting is hidden in contexts that presume personal authorship, it’s treated as deception.
- Love letters via professional scribes. In many cultures, illiterate (or elite-but-busy) people used scribes for intimate letters. It “worked” socially when it was known/assumed (e.g., literacy constraints) and failed when it violated expectations of personal expression.
- Form letters and secretarial correspondence. Executives have long used assistants to send “their” emails/letters. Recipients often discount warmth/authenticity accordingly—again: expectations matter.
Takeaway: History backs Husom’s emphasis on shared expectations. “It’s fine” or “it’s deceptive” has depended less on the tool and more on whether the audience reasonably believes the sender personally authored the text.
Where history complicates his conclusion
Husom leans toward “keep human-to-human communication free of machine transformation.” The historical counterpoint is that societies often keep authenticity by creating conventions, not by banning mediation:
- Handwritten notes, signatures, voice calls, and later “sent from my iPhone” style markers all served as authenticity signals.
- Today’s equivalent could be disclosure norms (“AI-assisted”), provenance metadata, or social conventions about when AI-polish is acceptable (e.g., workplace HR emails vs. condolences to a friend).
So the lesson is less “mediation destroys communication,” and more: when a medium enables easy impersonation or over-polish, communities invent new authenticity signals—or lose trust until they do.
2) “The thin line”: tool vs replacement
Husom argues the line between spellcheck and LLM authorship is exceptionally thin and current interfaces push people into replacement.
History supports this interface-based concern. Many outsourcing waves created “slippery slopes” when the default workflow moved from assist to substitute:
- GPS navigation: from “help me confirm my route” to “I no longer form a route model at all.”
- Autopilot in aviation: from reducing workload to manual flying skill decay, prompting recurrent training requirements.
- Clinical decision support: from “double-check” to “I trust the system and stop practicing differential diagnosis,” leading to well-known deskilling worries.
Lesson: the design of the tool and default workflow predicts substitution more than the mere existence of the tool. Husom’s call for different interfaces is historically well-grounded.
3) “Functional text” vs “personal author to human audience”
His distinction (code/recipes/docs vs personal writing) maps onto older divisions:
- Notaries / standardized legal forms: society decided certain writing is primarily instrumental; originality/voice is secondary. Outsourcing is expected.
- Literature / personal correspondence: voice and authorship matter; outsourcing triggers backlash unless explicitly framed (co-authored, dictated, edited, etc.).
But the boundary moves. Historically, once an activity becomes high-volume and standardized, it tends to be reclassified as “functional” (e.g., much of business communication). That suggests Husom is right to worry about “inflation of text”—because institutional incentives often push communication into template-land.
4) Bureaucracy arms race: “the weapon exists on both sides”
Husom asks what happens when both parties have word generators.
History says: arms races are real and usually end with filters, gates, or new scarcity signals.
- Email spam → spam filters + reputation systems.
- SEO content farms → ranking algorithm changes + brand authority signals.
- Mass-produced cover letters/resumes (templates) → heavier reliance on referrals, work tests, credential screens, and automated parsing.
- PR/advertising growth → consumer skepticism + regulation + fact-checking + “earned trust” channels.
So the likely equilibrium isn’t “everyone writes more forever”; it’s:
- a glut,
- erosion of trust/attention,
- new filtering institutions (automation, verification, reputational moats),
- communication channels that re-price “human effort” as a credibility signal.
That supports his worry that “quality drops, volume rises,” and adds: systems tend to respond by shifting the bottleneck elsewhere.
5) Tacit knowledge, repetition, and “boring work builds you”
His jazz anecdote is a strong model: repeated internalization enables fluent generation. Historically, the strongest evidence is from domains where automation removed routine practice:
Supports his point (deskilling is real)
- Navigation: sailors and later drivers relying on tools can lose internal wayfinding models (the “I can’t navigate without GPS” phenomenon).
- Aviation: over-reliance on automation has been linked (in industry discussions and training doctrine) to degraded manual handling and situation awareness, hence mandated manual practice and simulator time.
- Medicine: over-reliance on imaging/labs can reduce physical exam skill and clinical reasoning practice; institutions respond with training standards and checklists.
These match Husom’s claim: you can’t cleanly separate “storage” from “processing” in humans; internalized knowledge changes what you can perceive and generate.
Complicates his point (deskilling is not the only outcome)
Outsourcing also creates new tacit knowledge, often higher-level:
- Double-entry bookkeeping outsourced to accountants didn’t eliminate financial thinking; it shifted merchants/executives toward capital allocation, risk, and strategy.
- CAD tools reduced hand-drafting skill but created tacit knowledge about parametric constraints, simulation, manufacturing pipelines.
- Search engines reduced memorized trivia but increased tacit skill in query formulation, source evaluation, triangulation (when taught).
Lesson: Husom is right that automation can amputate learning-by-doing; the historical fix is not “never automate,” but choose what must remain practiced internally and institutionalize that practice.
6) Outsourcing admin work and losing ownership / judgment
His example: if a chatbot does all project admin “perfectly,” you may lose ownership and the basis for high-level decisions.
This is extremely consistent with management history.
- When leaders outsource too much to executive assistants or middle management, they can lose contact with operational reality (“management by memo”).
- When companies outsource analysis to management consultants, they can gain speed and frameworks but risk hollowing out internal judgment: teams stop building the muscle to diagnose their own systems, and decisions become “because the deck said so.”
- When firms outsource customer contact to call centers, they often lose tacit knowledge about customer pain, which later harms product decisions—prompting some companies to force engineers/PMs to do support rotations.
Lesson: “perfect execution” of low-level tasks can still degrade situational awareness. Historically, organizations correct this by designing feedback loops (rotations, reviews, direct exposure) so decision makers keep grounded models.
7) “Valuable experiences”: planning vacations, writing messages, etc.
Husom’s stance is value-laden (“we treat everything as a chore”), but history shows a recurring pattern:
- Technologies that reduce friction (washing machines, prepared foods, travel agents, package holidays) often redefine what counts as the meaningful part of an activity.
- Some people experience that as liberation; others experience it as hollowing out.
There isn’t a single historical verdict—there are stable subcultures:
- Some preserved “slow” craft as identity (handmade goods, handwritten letters, vinyl records).
- Others embraced convenience and reallocated effort elsewhere.
The historical warning that supports him: once convenience becomes the default, social expectations shift (“why didn’t you just…”) and people who want the “valuable experience” may have to actively defend it. That’s a real dynamic.
8) The “extended mind” dispute
Husom calls Masley’s “arbitrary whether it’s neurons or circuits” absurd, and argues remembering a birthday matters because it reinforces relationships.
History gives ammunition to both sides:
Supports Husom
- Outsourcing memory to calendars can reduce spontaneous recall, which is tied to affect and salience. Remembering unprompted signals care; automated messages can feel empty. Social meaning attaches to “costly signals” of attention and effort.
Pushes back on Husom (the strongest counterexample: writing itself)
The oldest “outsourcing thinking” debate is essentially Socrates (via Plato’s Phaedrus) criticizing writing: external memory would weaken internal memory and understanding. Yet literacy enabled:
- cumulative science,
- complex law and governance,
- long-range coordination,
- deeper multi-generational reasoning.
So “externalizing cognition” isn’t arbitrary, but it also isn’t inherently corrupting. The historical lesson is: external memory can weaken some forms of recall while enabling new forms of thought at scale.
A more precise critique of Masley than Husom’s “absurd” would be: it’s not arbitrary; location matters because internal vs external processes produce different incentives, failure modes, and skill formation. But externalization can still be a genuine part of cognition.
9) “Most environments are designed to minimize thinking”
Husom says we’d adapt quickly; only changing environments cause extra thinking.
History suggests a middle view:
- Stable affordances (signage, standardized controls, road rules) do reduce cognitive load and accidents. That’s why they persist.
- But that reduction can create skill fragility: when the environment deviates (unusual airport signage, road closures, tool failures), some people lack the internal model to cope.
So: yes, humans adapt to stable designs—but designs also shape what skills remain practiced. This again matches the deskilling/reskilling pattern.
10) Critique of the “lump of cognition” framing
Husom agrees there’s no fixed pool of thinking, but argues “what we think about matters,” and that replacing boring tasks can remove important formation.
History supports his correction. Division of labor and outsourcing often increase total output while:
- making work more fragmented,
- reducing craft mastery,
- shifting agency upward,
- creating alienation or dependency.
The “lump of labor” analogy is instructive in exactly the way he uses it: new work appears, but it isn’t automatically meaningful or socially beneficial.
Where Husom may overreach is implying a near-universal weakening: historical outcomes depend heavily on:
- whether people still practice the core skill somewhere (training, hobby, rotations),
- whether institutions reward understanding or just output,
- whether the outsourced system is reliable and aligned (safety-critical vs low stakes),
- whether authenticity norms are clear.
A synthesis: what history suggests we should expect with LLMs
If LLMs become ubiquitous, historical outsourcing patterns predict:
- Text inflation + trust erosion in channels where authorship mattered, until new norms/verification arise.
- Deskilling in “routine cognition” (drafting, summarizing, first-pass reasoning) unless practice is intentionally preserved.
- Reskilling and new tacit knowledge (prompting, evaluation, orchestration, taste, domain judgment, provenance checking).
- Bottleneck migration: as generating words becomes cheap, scarcity shifts to attention, credibility, and demonstrated competence.
- Institutional responses: training requirements, audits, disclosure norms, and “human-only” rituals for high-trust contexts.
If you want to pick one historical analogue to deep-dive (e.g., GPS, calculators in education, speechwriters/ghostwriting, management consulting, autopilot in aviation, or scribes/writing), I can map it point-by-point to likely LLM outcomes and the mitigations that actually worked.
What does "history suggest" for software development teams figuring out how to adopt AI agents generating code for them?
History suggests software teams will do best with AI code agents if they treat them less like “automatic programmers” and more like a powerful form of outsourcing + automation—which reliably shifts bottlenecks, changes skill formation, and creates new failure modes unless teams add guardrails.
Below are the most consistent historical patterns (from prior “automation waves” like compilers/IDEs/codegen, offshoring/contractors, spreadsheetization of finance, autopilot in aviation, and safety-critical checklists), translated into concrete implications for AI-in-dev.
1) Expect the bottleneck to move: generation gets cheap; verification gets expensive
Pattern: When production becomes easier (printing presses, templates, spreadsheets, code generators), organizations don’t just “do the same work faster”; they produce more, and the limiting factor becomes review, trust, and integration.
In software with agents:
- You’ll likely ship more diffs/PRs, but spend more time on:
- code review quality,
- test adequacy,
- architecture coherence,
- debugging “looks right but isn’t” failures,
- dependency/security review.
What to do:
- Invest early in fast tests, CI, linting, type checking, and reproducible builds.
- Set explicit “definition of done” that makes review cheaper (tests required, no silent behavior changes, performance baselines, threat model notes).
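As one concrete illustration of the "definition of done" idea above, here is a minimal sketch of a pre-merge check that rejects source changes arriving without test changes. It assumes a git-based workflow, a `src/` and `tests/` layout, and an `origin/main` base branch; those names are assumptions to adapt to your repository.

```python
#!/usr/bin/env python3
"""Minimal pre-merge check: changes to source code must come with test changes.

A sketch only -- it assumes a src/ and tests/ layout and an origin/main base
branch; adapt the paths and branch name to your repository.
"""
import subprocess
import sys

BASE_BRANCH = "origin/main"  # assumption: the branch PRs are merged into


def changed_files(base: str = BASE_BRANCH) -> list[str]:
    """Return the files changed relative to the base branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line.strip()]


def main() -> int:
    files = changed_files()
    touches_src = any(f.startswith("src/") for f in files)
    touches_tests = any(f.startswith("tests/") for f in files)
    if touches_src and not touches_tests:
        print("Definition-of-done check failed: source changed but no tests were added or updated.")
        return 1
    print("Definition-of-done check passed.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Run as one step in CI alongside the tests, linters, and type checks; the point is that the gate is mechanical, so review time isn't spent re-litigating it.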
2) Outsourcing works when responsibility stays internal
Pattern: With consultants/contractors, teams succeed when they keep clear internal ownership. When they outsource and relinquish understanding, they get “hollowed out.”
In software with agents:
- If “the agent wrote it” becomes an accepted explanation, you’ll accumulate unowned code and fragile systems.
What to do:
- Require a human owner per change who can explain:
- why it’s correct,
- how it fails,
- how to operate/roll back.
- Treat agent output like code from a new contractor: acceptable only with internal comprehension and supportability.
3) Deskilling is real—so preserve “manual flying” time
Pattern: Aviation automation improved safety but introduced skill decay, so the industry created recurrent training and manual practice.
In software with agents:
- Likely atrophy areas: routine implementation, debugging muscle, reading unfamiliar code, API fluency, mental models of the system.
What to do (pragmatic, not ideological):
- Bake in “manual reps”:
- some tickets/class of work done without agent assistance,
- periodic debugging/incident rotations,
- code-reading sessions.
- Evaluate people on systems understanding, not just throughput.
4) Standardization helps—until it erases local context
Pattern: Templates/forms increase speed but can flatten nuance; organizations respond by introducing structured inputs (forms) plus human judgment gates.
In software with agents:
- Agents are strongest when you can constrain the problem: clear patterns, bounded scope, good examples, stable APIs.
- They are weakest in “high-context” code: tricky concurrency, security boundaries, distributed systems behavior, complex domain invariants.
What to do:
- Start adoption in high-constraint zones:
- scaffolding, migrations with known patterns,
- test generation + refactors,
- internal tooling,
- “boring” glue code behind strong tests.
- Be conservative in high-context zones unless you have excellent specs/tests.
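One way to make the "behind strong tests" condition concrete is a characterization (golden) test that pins current behavior before an agent touches the code. A rough sketch, assuming pytest; `legacy_pricing.quote`, the sample cases, and the golden file are hypothetical stand-ins for whatever module you are about to refactor.

```python
"""Characterization ("golden") test sketch: pin existing behavior before an
agent-driven refactor. The module, inputs, and golden file are hypothetical."""
import json
import pathlib

import pytest

from legacy_pricing import quote  # hypothetical module under refactor

GOLDEN = pathlib.Path(__file__).parent / "golden_quotes.json"

CASES = [
    {"sku": "A-100", "qty": 3, "region": "EU"},
    {"sku": "B-200", "qty": 1, "region": "US"},
]


@pytest.mark.parametrize("case", CASES, ids=lambda c: c["sku"])
def test_quote_matches_golden(case):
    # The golden file is generated once from the pre-refactor code and committed,
    # so any behavioral drift introduced by the refactor shows up as a diff.
    expected = json.loads(GOLDEN.read_text())[case["sku"]]
    assert quote(**case) == expected
```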
5) When output is cheap, “authenticity” is replaced by provenance
Pattern: When text becomes easy to mass-produce (printing → spam → SEO), systems shift to provenance signals (reputation, signatures, verification).
In software with agents:
- You’ll want to know: what was generated, from which prompt/context, using which model, with which dependencies.
What to do:
- Track provenance in a lightweight way:
- label PRs as AI-assisted,
- keep prompts for non-trivial changes (especially security-sensitive),
- record tool/model versions for audits.
- This isn’t about policing; it’s about later debugging, compliance, and incident response.
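A lightweight way to do this, sketched below, is one small JSON record per AI-assisted change, keyed by commit hash. The field names, the `.provenance/` directory, and the example values are assumptions to adapt to your own workflow, not an established format.

```python
"""Sketch of a lightweight provenance record for an AI-assisted change.
Field names and the .provenance/ directory are assumptions, not a standard."""
import datetime
import json
import pathlib
import subprocess


def record_provenance(model: str, prompt_summary: str, assisted_files: list[str]) -> pathlib.Path:
    # Key the record by the current commit so it can be found during audits or incidents.
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"], capture_output=True, text=True, check=True
    ).stdout.strip()
    record = {
        "commit": commit,
        "model": model,                    # tool/model version used
        "prompt_summary": prompt_summary,  # short description, not the full transcript
        "assisted_files": assisted_files,
        "recorded_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    out_dir = pathlib.Path(".provenance")
    out_dir.mkdir(exist_ok=True)
    path = out_dir / f"{commit}.json"
    path.write_text(json.dumps(record, indent=2))
    return path


if __name__ == "__main__":
    # Hypothetical example values.
    record_provenance(
        model="example-model-2025-01",
        prompt_summary="Generated pagination helpers for the orders API",
        assisted_files=["src/orders/pagination.py"],
    )
```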
6) Quality control must become more formal, not less
Pattern: Mass production drove the invention of QA, standards bodies, audits, checklists.
In software with agents:
- “Looks plausible” code increases the risk of subtle bugs and security issues.
What to do:
- Strengthen gates that are hard for plausible text to fake:
- property-based tests, fuzzing,
- differential testing vs reference implementations,
- static analysis/SAST and dependency scanning,
- performance regression tests,
- threat modeling for sensitive components.
- Use agents to help write these tests, but don’t waive them.
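To make two of these gates concrete, here is a sketch using the hypothesis library: a differential test against a deliberately simple reference implementation, plus one behavioral property. `fast_tokenize` is a hypothetical agent-written function standing in for whatever you actually need to verify.

```python
"""Property-based and differential test sketch using hypothesis.
`fast_tokenize` is a hypothetical function under test."""
from hypothesis import given, strategies as st

from mypackage.text import fast_tokenize  # hypothetical agent-written function


def reference_tokenize(text: str) -> list[str]:
    """Deliberately simple, obviously correct reference: whitespace splitting."""
    return text.split()


@given(st.text())
def test_matches_reference(text):
    # Differential test: the fast implementation must agree with the reference
    # on arbitrary generated inputs, not just hand-picked examples.
    assert fast_tokenize(text) == reference_tokenize(text)


@given(st.text())
def test_tokens_contain_no_whitespace(text):
    # Behavioral property: no returned token should contain whitespace.
    assert all(not any(ch.isspace() for ch in token) for token in fast_tokenize(text))
```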
7) Agents change team structure: fewer pure implementers, more integrators
Pattern: With spreadsheets and analytics tools, value moved from arithmetic to modeling, interpretation, and governance.
In software with agents:
- The scarce skills shift toward:
- architecture,
- product-domain modeling,
- writing good specs,
- evaluating tradeoffs,
- debugging and incident handling,
- security and reliability engineering.
What to do:
- Train for “spec → evaluate → integrate” workflows:
- teach people to write crisp acceptance criteria,
- require design notes for non-trivial changes,
- emphasize reading code and reasoning about behavior.
8) The predictable failure mode: local optima and incoherent systems
Pattern: Outsourced piecework (or heavy templating) can optimize each part while degrading the whole; companies respond with stronger architecture governance.
In software with agents:
- You can get a lot of individually “fine” modules that don’t compose well: inconsistent patterns, duplicated logic, leaky abstractions.
What to do:
- Enforce architecture constraints:
- approved patterns, reference implementations,
- “one way to do X” docs,
- library-first reuse,
- periodic refactor budgets.
- Use agents deliberately for refactors after you’ve chosen the target pattern.
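A small architecture gate can be as simple as an AST walk that flags imports crossing a layering rule. In the sketch below the rule itself (only code under `src/app/repositories/` may import the `app.db` package) is hypothetical; substitute whatever "one way to do X" constraints your team has chosen.

```python
"""Sketch of a simple architecture gate: flag modules outside the approved
data-access layer that import the database package directly. The package and
directory names are hypothetical placeholders for your own layering rule."""
import ast
import pathlib
import sys

FORBIDDEN_PREFIX = "app.db"                       # assumption: low-level database package
ALLOWED_IMPORTERS = ("src/app/repositories/",)    # assumption: the only layer allowed to touch it


def violations(root: str = "src") -> list[str]:
    found = []
    for path in pathlib.Path(root).rglob("*.py"):
        if path.as_posix().startswith(ALLOWED_IMPORTERS):
            continue
        tree = ast.parse(path.read_text(), filename=str(path))
        for node in ast.walk(tree):
            names = []
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                names = [node.module]
            if any(name.startswith(FORBIDDEN_PREFIX) for name in names):
                found.append(f"{path}: imports {', '.join(names)}")
    return found


if __name__ == "__main__":
    problems = violations()
    for line in problems:
        print(line)
    sys.exit(1 if problems else 0)
```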
9) Rollout lesson: pilots, not big-bang adoption
Pattern: New operational technologies succeed with staged adoption + feedback loops (instead of mandate-first).
What to do:
- Run a 4–8 week pilot with:
- clear use cases,
- baseline metrics (lead time, escaped defects, review time, incident rate),
- qualitative feedback (does it reduce toil or just move it?).
- Expand only where metrics don’t degrade quality/operability.
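For the baseline metrics, even a small script over an export of merged pull requests is enough to start. The sketch below assumes a CSV named `merged_prs.csv` with ISO 8601 `created_at`, `first_review_at`, and `merged_at` columns; those names are placeholders for whatever your forge or dashboard actually exports.

```python
"""Sketch of a pilot baseline: median lead time and review time from a CSV
export of merged pull requests. File and column names are assumptions."""
import csv
import statistics
from datetime import datetime


def hours_between(start: str, end: str) -> float:
    # Both timestamps are assumed to be ISO 8601 strings.
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds() / 3600.0


def summarize(path: str = "merged_prs.csv") -> dict:
    lead_times, review_times = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            lead_times.append(hours_between(row["created_at"], row["merged_at"]))
            review_times.append(hours_between(row["first_review_at"], row["merged_at"]))
    return {
        "median_lead_time_h": statistics.median(lead_times),
        "median_review_time_h": statistics.median(review_times),
        "prs": len(lead_times),
    }


if __name__ == "__main__":
    print(summarize())
```

Capture these numbers before the pilot starts, then compare against the same export at the end; the comparison matters more than the absolute values.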
A simple “historically informed” operating rule
Treat an AI agent as a very fast junior developer with perfect confidence and unknown training data:
- great at drafts and variations,
- weak at truth, context, and long-tail correctness,
- requires strong supervision, tests, and clear standards.
If you tell me your team context (language, size, deployment frequency, regulated or not, and whether you already have strong CI/test coverage), I can propose a concrete adoption policy: which tasks to allow/forbid, what review checklist to add, and what metrics to watch so you catch hollowing-out early.