What happens when cognition becomes a design material?
We understand physical materials well today: wood has grain, steel has tensile strength, plastic can be molded, and we design with these properties in mind. In the last few decades, data has joined them as a material we shape and structure for information systems. Now, as we build systems that reason, cognition itself has become a material with its own properties that we need to understand well in order to craft quality products.
Beyond Traditional Boundaries
Steve Jobs said of design: "People think it's this veneer, that the designers are handed this box and told, make it look good! That's not what we think design is. It's not just what it looks like and feels like. Design is how it works."
Of course, many background processes involving LLMs can fairly be called implementation details, but the reality is that most agentic systems don't hide behind clean interfaces like traditional software, nor do they behave deterministically. They interact directly with users while handling complex processes in the background, which means their contextual decisions directly shape the user experience. Understanding how cognitive materials behave becomes both an engineering and a design problem.
Craft, Design, and Engineering
What's the difference between craft and design?
Craft refers to the process of making while doing, where the material gives feedback that influences the final output. You might have loose plans, but not actual rigid specs. The form emerges from the interaction between maker and material.
When a potter works clay, it's a continuous exchange of information between intention and material, where the maker learns from the material as much as they inform it. Working with the material, rather than against it.
The Industrial Revolution severed this connection between craftsman and material by separating design and planning from mass production. What was lost was the feedback that allowed makers to adjust their approach based on how materials behaved in their hands.
Software development inherited this same division. Typically, designers create visuals, wireframes, and flows; engineers implement them; and only after deployment is it discovered what was lost in between. With AI, this separation creates more risk than it does in predictable, deterministic systems with established precedents and patterns. Unlike traditional software, language models exhibit their own agency. That agency is part of their superpower, but it also means they can hallucinate and derail in unexpected ways.
Treating cognition as a material means restoring the connection between maker and material. You build through experimentation, observe how the system actually behaves, and adjust when it pushes back.
This approach means learning from the material's behavior as you're designing and building the system, rather than learning from it after it's built. This blurs the lines a bit between design and engineering, or at least brings the teams together, since planning and implementation influence each other through iterations.
Material Properties
Every material has inherent characteristics that determine how it behaves under stress. Some plastics become brittle in cold weather, wood expands and contracts with humidity, and metals do the same with heat. The key insight is learning to design with these properties in mind rather than against them, because fighting a material's nature inevitably leads to failure or complication.
A materials professor told me about a client who demanded a massive conference table made from solid acrylic, "like a block of ice." Despite being warned that it would sag, the client insisted. Naturally, within a few years it developed a swooping curve as the acrylic gave way. Glass doesn't move at room temperature, but acrylic is made of slippery, long polymer chains that flow over time, a behavior known as creep. It was one of many lessons in what happens when you ignore material properties.
When I shifted to emerging tech, these lessons carried over. Digital systems have their own grain and failure modes that are just as real as those in physical materials. Craftsmanship comes from understanding, honoring, and leveraging the natural characteristics of the materials at hand.
This principle applies to digital, and now cognitive, materials. Language models have attention patterns, context limits, and failure modes. The difference between mere implementation and craft lies in recognizing these properties and building products and systems that work with the cognitive material's natural behavior rather than forcing it into 'unnatural' patterns.
Shaping the Material
Working with cognition as a material means shaping its processes. Prompts define tasks, guides, constraints, structure, and boundaries, while agent architecture organizes cognitive effort by coordinating worker-agents, sub-tasks and tools, enabling specialization and division of labor in AI systems.
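To make that concrete, here's a minimal sketch of the idea in Python. Everything in it is illustrative: `complete()` stands in for whatever model call you use, and the planner's line-based protocol is simply the smallest thing that demonstrates delegation across worker agents.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for any chat-completion call; wire it to your provider.
def complete(system_prompt: str, user_message: str) -> str:
    raise NotImplementedError

@dataclass
class WorkerAgent:
    """A narrowly scoped agent: one role, one prompt, a small toolset."""
    role: str
    system_prompt: str                          # defines task, constraints, boundaries
    tools: list = field(default_factory=list)

    def run(self, subtask: str) -> str:
        return complete(self.system_prompt, subtask)

@dataclass
class Orchestrator:
    """Organizes cognitive effort: splits a request and routes each piece to a specialist."""
    workers: dict[str, WorkerAgent]

    def handle(self, request: str) -> str:
        # A planning prompt decides which specialists are needed, one "role: subtask" per line.
        plan = complete(
            "Break the request into sub-tasks. Output one line per sub-task, "
            "formatted as 'role: subtask', using only these roles: " + ", ".join(self.workers),
            request,
        )
        results = []
        for line in plan.splitlines():
            role, _, subtask = line.partition(":")
            if role.strip() in self.workers:
                results.append(self.workers[role.strip()].run(subtask.strip()))
        # A final pass synthesizes the specialists' outputs into one answer.
        return complete("Combine these sub-results into a single coherent answer.",
                        "\n\n".join(results))
```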
Tools as Interfaces and Capabilities
Tools act as interfaces between agents and the external world. The LLM operates on text, structured data, code, and internal representations, while tools bridge this internal space with external systems and APIs: web browsers, Gmail, Slack, or Linear, for example.
Tools are deterministic functions that extend agent capabilities beyond simple back-and-forth text generation, letting them browse the web, store memories, send emails, manipulate files, or call APIs. This gives agents the power to absorb up-to-date information and actually change external systems.
Defining tools explicitly makes agent actions understandable and manageable. Curating available tools determines what the agent can do, like defining physical affordances, letting you equip agents with precise functionalities for their role. The benefit is modular, composable capabilities that any agent can use within the cognitive architecture.
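As a rough sketch, a tool is just a deterministic function paired with a schema the model can see. The email example below is illustrative (it assumes a local SMTP relay), and the schema format mirrors common function-calling conventions rather than any specific provider's API.

```python
import smtplib
from email.message import EmailMessage

# A deterministic tool: given valid inputs it either succeeds or raises.
def send_email(to: str, subject: str, body: str) -> str:
    msg = EmailMessage()
    msg["To"], msg["Subject"] = to, subject
    msg.set_content(body)
    with smtplib.SMTP("localhost") as smtp:   # assumes a local mail relay
        smtp.send_message(msg)
    return f"sent to {to}"

# The schema is what the model actually 'sees': a named affordance with typed
# parameters. Curating this list defines what the agent can and cannot do.
send_email_schema = {
    "name": "send_email",
    "description": "Send a plain-text email to a single recipient.",
    "parameters": {
        "type": "object",
        "properties": {
            "to": {"type": "string", "description": "Recipient address"},
            "subject": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["to", "subject", "body"],
    },
}
```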
Cognitive Ergonomics in System Architecture
Attention is the scarce resource in AI systems. Unlike traditional software, where processing behaves predictably, transformers face hard constraints on how they allocate attention. Semantic crowding degrades performance when too many similar options are presented simultaneously, and models can become easily overloaded.
One small example of this is a set of simple tests I ran across eight different language models. When given 128 tools (admittedly, an absurd number) all at once, the models were overwhelmed, and the agent selected the correct tool only 71.4% of the time. Organizing the tools hierarchically with a progressive disclosure system got seven of the eight models to 100% accuracy, while also improving tool-call parameter quality when scored by an LLM-as-a-judge. You can read the full experiment in The Architecture of Attention, but the key insight is preserving cognitive resources for actual reasoning.
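One way such a progressive-disclosure layer can look is sketched below. The categories and tool names are invented for illustration, not the actual setup from the experiment; the point is that the agent starts with three small meta-tools instead of 128 full schemas.

```python
# Tools grouped into categories, with the agent drilling down instead of
# seeing everything at once.
TOOL_CATALOG = {
    "email":    ["send_email", "search_inbox", "draft_reply"],
    "calendar": ["create_event", "list_events", "reschedule_event"],
    "files":    ["read_file", "write_file", "search_files"],
    # ... further categories, on the order of 128 tools in total
}

TOOL_DETAILS = {}  # name -> full parameter schema, loaded lazily

# The agent initially sees only these three meta-tools and pulls in detail
# as it narrows down to the tool it actually needs.
def list_categories() -> list[str]:
    return sorted(TOOL_CATALOG)

def list_tools(category: str) -> list[str]:
    return TOOL_CATALOG.get(category, [])

def describe_tool(name: str) -> dict:
    # Only now does the full parameter schema enter the context window.
    return TOOL_DETAILS[name]
```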
Effective cognitive ergonomics means treating attention budget as precious and allocating it strategically rather than saturating models with information. The goal is designing information architectures that work with transformer attention mechanisms. Just as physical ergonomics prevents strain, cognitive ergonomics prevents reasoning degradation.
Different model architectures show varying resilience to cognitive overload. In the few tests I ran, Claude models maintained noticeably more consistent performance under semantic crowding than the GPT series. The GPT-series models (4.1 and 5) showed dramatic improvements from hierarchical organization, but also more dramatic degradation when overwhelmed.
In short, cognitive materials, in this case actual language models, have distinct architectures and properties that require different handling. It's beginning to feel as though much of what we call model limitation may actually be poor information architecture getting in the way of the model's intelligence.
Building with Cognitive Awareness
Viewing cognition as a material encourages a more deliberate and principled approach to designing intelligent systems. What are the inherent properties of the cognitive resources we are using? How can we best shape their inputs and outputs? How do we structure workflows to minimize unnecessary load and potential failure points? How can we integrate specialized tools effectively to augment core cognitive capabilities?
The field of cognitive ergonomics, traditionally focused on human-computer interaction, offers a valuable framework for designing the internal workings of agentic systems.
Material Properties: Understanding the Cognitive Substrate
Deterministic functions fail completely with invalid inputs, but when language models lack the information to complete a task accurately, instead of refusing or erroring, they'll often fabricate plausible responses or "simulate" the response that should follow. The thing is, language models are always hallucinating; Andrej Karpathy called it 'dreaming'. When you provide them with what they need, their dreams will to some extent match, describe, or reflect outer reality accurately. When they're not given the training data or prompt context they need, those hallucinations are more like dreams: fuzzy, inconclusive, vague, and airy, lacking particularity or detail.
What's dangerous about these systems is that a complete failure can easily be disguised as success to someone unfamiliar with the correct answers to their own questions (which sort of defeats the purpose). It could be said that without safeguards, wrappers, and subsystems, language models have a built-in failure mode of dishonesty, which makes designing against it all the more important.
Working with the grain means designing prompts that follow these natural patterns:
- Avoid putting critical instructions in the middle of a long context window, where attention is the weakest
- Use if-conditions to help agents handle ambiguous circumstances
- Use hierarchical organization: whether multi-agent delegation or progressive tool disclosure (categories → lists → details)
- Separate and structure different types of information clearly with delineators (Markdown or, preferably, XML)
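Here's a minimal sketch of what assembling a prompt along these lines can look like. The tag names and layout are one illustrative choice, not a prescription.

```python
def build_prompt(instructions: str, documents: list[str], task: str) -> str:
    """Assemble a context window that follows the grain described above:
    clearly delineated sections, critical instructions kept out of the middle."""
    docs = "\n".join(
        f"<document index='{i}'>\n{d}\n</document>" for i, d in enumerate(documents)
    )
    return (
        f"<instructions>\n{instructions}\n</instructions>\n"       # start: strong attention
        f"<reference_material>\n{docs}\n</reference_material>\n"   # middle: bulk material
        f"<task>\n{task}\n"
        "Re-read the instructions above before answering.\n"       # end: restate what matters
        "</task>"
    )
```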
Working against the grain means forcing models to process information in unnatural ways:
- Dumping all possibly relevant content into a context window without curation
- Burying key instructions or details (needles) in the middle of long contexts (haystacks)
- Mixing different information types without clear structure
- Placing critical information where attention is weakest
"Lost in the Middle" and "Context Rot" research shows what happens when working against the grain at scale. Models maintain confidence even as they lose coherence because the architecture is stressed beyond its natural material limitations. Each additional token actively degrades the model's ability to process all previous tokens as the attention mechanism is forced to spread focus thinner and begins to fail silently.
Plasticity varies dramatically between models and contexts. Claude's failure mode is more likely to be over-conservatism, refusing tasks when confused, while GPT maintains a worrying level of confidence as its accuracy drops.
Contextual confusion creates another challenging failure class: the model thinks it's operating in a different environment than it actually is. An agent might believe it's in a loop with multiple opportunities to continue working, so it ends prematurely with "I'll do that for you now, one moment please," then stops entirely. This appears cooperative but represents total failure from misunderstanding its operational context.
This failure shows that cognitive materials need to see into their operating environment to work well. Agents require bidirectional awareness of their embedded system. They need to understand whether they're in single-shot or persistent sessions, what features are enabled, which integrations are available, user preferences and permissions, environmental context like timezones and locations, and current system state affecting possible actions.
They need to see enough app internals to understand their actions' implications. This shifts from treating cognitive materials as isolated components to recognizing them as integrated parts of larger systems. It's about giving sufficient context about the operating environment for appropriate decisions.
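In practice this often means rendering the operating context explicitly into the prompt. The sketch below is one illustrative way to do it; the session fields are hypothetical, not a fixed schema.

```python
from datetime import datetime, timezone

def environment_block(session: dict) -> str:
    """Render the agent's operating context so it can't mistake its environment.
    The session fields here are illustrative, not a fixed schema."""
    return (
        "<environment>\n"
        f"  session_type: {session.get('mode', 'single_shot')}\n"   # single_shot | persistent
        f"  enabled_features: {', '.join(session.get('features', []))}\n"
        f"  available_integrations: {', '.join(session.get('integrations', []))}\n"
        f"  user_timezone: {session.get('timezone', 'UTC')}\n"
        f"  current_time: {datetime.now(timezone.utc).isoformat()}\n"
        f"  permissions: {', '.join(session.get('permissions', []))}\n"
        "</environment>"
    )
```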
Language models can also exhibit a different pathology: coherent yet totally excessive tool usage. Where a simple task might require two or three function calls, these models can chain together 20-30, completely overdoing the work. This is problematic for several reasons, among them the slow processing time's impact on user experience and the cost of so many requests per task.
Different model architectures are genuinely different materials that require tailored approaches. Systems designed to switch between models should account for these differences, not paper over them with a single generic approach.
Hybrids
Raw cognition is rarely useful alone. These systems require persistence, memory, capabilities, and an environment.
Consider a typical agent. It operates within a deterministic conversation framework that manages context and history. It calls deterministic tools that interact with external systems. Classifiers act as triggers, making binary decisions about whether to save memories or respond to queries.
Deterministic functions, frameworks, and guardrails provide the structure and reliability that the cognitive material lacks. Like alloying, introducing a second material strengthens the whole against certain failure modes.
It's important to note that the interface between cognitive and deterministic components becomes a potential fracture point.
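A rough sketch of what those deterministic seams can look like: a cheap classifier gating a memory write, and a validation layer sitting at that fracture point before any tool actually runs. The registry structure and field names are assumptions for illustration.

```python
def should_save_memory(message: str, classify) -> bool:
    """A cheap binary classifier gates the expensive memory-write path.
    `classify` is any small model or heuristic that returns free text."""
    verdict = classify(
        "Does this message contain a durable fact about the user worth remembering? "
        "Answer only yes or no.\n\n" + message
    )
    return verdict.strip().lower().startswith("yes")

def guarded_tool_call(agent_output: dict, registry: dict) -> dict:
    """Deterministic checks at the fracture point between the model's intent
    and real tool execution. Registry entries hold a schema and a callable."""
    name = agent_output.get("tool")
    if name not in registry:
        return {"error": f"unknown tool '{name}'"}              # bounce back, don't improvise
    required = registry[name]["schema"]["parameters"]["required"]
    args = agent_output.get("arguments", {})
    missing = [p for p in required if p not in args]
    if missing:
        return {"error": f"missing required arguments: {missing}"}
    return {"result": registry[name]["fn"](**args)}
```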
Context Hygiene and Craftsmanship
Clean context recognizes that every piece of information has a cost. Information must be worth its cognitive weight. This means aggressive pruning, hierarchical organization, and temporal or relevance filtering where every element serves a purpose.
Context hygiene means active maintenance. Systems must identify and remove outdated information, resolve conflicts before they reach the model or memory store, and maintain semantic coherence across sessions. It's an ongoing process that should run like clockwork.
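As a sketch, a hygiene pass might look something like this. The item fields (TTL, relevance score, token count) are illustrative; the point is that pruning is an explicit, recurring step rather than an afterthought.

```python
from datetime import datetime, timedelta

def prune_context(items: list[dict], now: datetime, budget: int) -> list[dict]:
    """A simple hygiene pass: drop stale or superseded items, then keep only
    what fits the token budget, most relevant first. Item fields are illustrative."""
    fresh = [
        i for i in items
        if now - i["last_confirmed"] < timedelta(days=i.get("ttl_days", 90))
        and not i.get("superseded_by")            # conflicts resolved upstream
    ]
    fresh.sort(key=lambda i: i["relevance"], reverse=True)

    kept, used = [], 0
    for item in fresh:
        if used + item["tokens"] > budget:        # every element must earn its weight
            break
        kept.append(item)
        used += item["tokens"]
    return kept
```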
The best craftsmanship makes the best products because material and maker collaborate, exchanging ideas, pushing back, encouraging each other.
Instead of forcing models to work within chaos, provide clear structure. Instead of making them track complex state, maintain it deterministically. Instead of asking them to remember everything, build selective memory systems that work alongside the model.
Cognition is a material with specific properties, failure modes, and optimal working methods, and this understanding changes how we build intelligent systems. We're not just programming computers anymore. We're creating new patterns with a new material. The better we understand its properties, the better we can shape it while leveraging its inherent nature.