Studio · ~15 min · Will Derman · 9 May 2026

Apps, companions, co-workers, people. The order matters.

Four tiers, in that order. Most AI products in 2026 are still mistaking step one for step four.


Apps do a task. Companions know a room. Co-workers hold a brief. People carry a relationship. That is the ladder, and the four tiers are not interchangeable positions on a spectrum. Each one requires the previous as a foundation. Moving up the ladder is not a question of making the model more capable. It is a question of what kind of relationship is being proposed.

The taxonomy has become necessary because the word "AI" in 2026 covers a range too wide for a single category word to describe. A spelling corrector and a product that holds a patient's medical history and flags anomalies before the clinician notices are both, in current usage, "AI." The category word does too little work. The distinctions inside it matter more than the word suggests, and confusion between tiers costs users, products, and the category as a whole.

Graaft's position depends on understanding those distinctions precisely. The studio builds at step three and points toward step four. That claim requires the four tiers to mean something specific. Otherwise it is not a claim. It is a marketing position, and those are different things.

Step one: apps

An app does something when you ask it to. Input arrives, output is produced, the session ends. The app does not carry anything from one session into the next unless you explicitly give it context. Each interaction starts at zero.
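The "starts at zero" property can be sketched in a few lines. This is a minimal illustration, not a description of any real product: the names `Request` and `run_app` are hypothetical, and the string formatting stands in for whatever the app actually does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One self-contained interaction (hypothetical type for illustration)."""
    task: str
    context: str = ""  # present only if the user explicitly supplies it


def run_app(request: Request) -> str:
    """Stateless by design: the output depends only on this request.

    Nothing is read from or written to any store between calls.
    """
    prefix = f"[context: {request.context}] " if request.context else ""
    return f"{prefix}done: {request.task}"


# The same request always yields the same output; no session carries over.
first = run_app(Request("summarise report"))
second = run_app(Request("summarise report"))
with_context = run_app(Request("summarise report", context="Q3 figures"))
```

The only way context reaches the second call is through the request itself, which is the boundary the app tier is built around.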

The app tier describes the majority of what is sold as AI in 2026: image generators, document summarisers, code completers, translation tools, grammar checkers, research assistants. The task-completion value these products deliver is real, and the results can be genuinely excellent. A well-designed code completer can catch bugs a senior developer might miss. A well-prompted image generator can produce work that would take a human artist days.

None of this is diminished by naming the tier accurately. The category failure at step one is not the products. It is the gap between what the products are and how they are described. An app that completes one task very well is worth its price if the task is worth completing. The failure is describing the app as something further up the ladder: as a companion that knows the user, as a co-worker that holds a professional brief, as a person that understands the underlying situation. These descriptions create expectations the app tier cannot satisfy, and the disappointment that follows is structural.

The failure is not the apps. The failure is describing task-completion products as relationships.

The app tier is also not trivial to execute well. Building an AI application that reliably produces good output for a defined task, handles edge cases gracefully, and maintains quality under real-world usage conditions requires serious engineering. The fact that apps are at step one does not mean they are easy to build. It means the relationship they propose is bounded: task in, output out. That boundary is a design decision, not a limitation. The limitation is pretending the boundary does not exist.

The products that perform best at step one are the ones that do not apologise for being at step one. They define the task precisely, optimise for it rigorously, and do not propose a relationship they cannot deliver. The products that perform worst at step one are the ones designed to appear to be at step two or three. The energy that went into the appearance is energy that did not go into the task.

Step two: companions

A companion knows a context. The step from apps to companions is the step from transaction to relationship. The companion does not start fresh with every session. It holds a model of its room, updates that model over time, and uses that model to inform what it does next. The question a companion is designed to answer is not "what task should I complete" but "what does this person need in the context of the room I know."

The room is the key concept. A companion's room is the domain it was built to understand: the household, the kitchen, the garden, the medical practice, the classroom. The room defines what the companion knows and what falls outside its competence. A companion that claims to know all rooms knows none of them. The specialism is not a restriction. It is the design.

Deme and Sofia are Graaft's two companions. Each knows a specific room. Deme knows the garden: the seasonal rhythms of Southern Hemisphere conditions, the difference between what the gardening books say and what actually works in a Perth backyard in February or a Joburg garden in the dry winter. Sofia knows the kitchen: the household's actual eating patterns, the weekly rhythm of what gets cooked, the gap between what the family claims to want for dinner and what they eat on a Wednesday when no one has much energy.

The knowledge each companion holds is specific by construction, not by configuration. A companion that was given general knowledge and told to focus on a topic is a different product from a companion whose room was the design premise. The first answers questions about the topic adequately. The second notices things the user did not think to ask, because the room is the frame through which the companion sees.

Companions know a room. Specific by construction, not configured at runtime. The difference between a product that knows you and one that performs knowing you.

Most products marketed as AI companions in 2026 are apps with a persistent memory layer. The memory creates a sense of continuity. The product remembers what you told it last time and applies that context to the next interaction. This is genuinely useful, and it is not the companion tier. Continuity of retrieval is not the same as knowing a room.

Building at the companion tier requires designing the room before designing the product. The room's boundaries, the knowledge it contains, the ways it updates, the things it does not know and should admit to not knowing: these are design decisions that come before the interface, before the voice, before the model. Products that skip this step and add memory on top of a task-completion frame produce an uncanny variant: a companion that seems to know you because it can quote back what you told it, but that does not actually understand your situation.

Knowing the room. Not knowing about the room.

Step three: co-workers

A co-worker holds a brief. The step from companions to co-workers is the step from contextual knowledge to professional competence. A co-worker knows a domain well enough to make judgment calls inside it without detailed instruction. They understand the standards of the domain, can produce work that meets those standards, and can identify when a brief has a problem that falls outside their specialism.

The professional brief is what separates the co-worker tier from the companion tier. A companion's job is to know the room and be attentive within it. A co-worker's job is to hold a professional standard and produce work against it. You engage a co-worker for their specialism. The specialism is what you are hiring.

Graaft's sixteen AI colleagues are co-workers. Each holds a specific brief. The briefs include research, SEO strategy, marketing strategy, creative direction, copywriting, brand identity design, UI design, UX design, frontend development, backend development, data and analytics, social media strategy, email and lifecycle marketing, website and CMS, and presentation design. The briefs are not interchangeable. The research co-worker is not a substitute for the copywriting co-worker, any more than a journalist is a substitute for a copywriter.

A brief is not an instruction set. It is a professional standard.

What co-workers require is what distinguishes them from tools: consistency under variation. A co-worker who produces good output in one session and poor output in the next is not holding the brief. Professional competence means performing to standard across the range of work that comes in, not just in ideal conditions with a perfect brief and unlimited time.

The sixteen work on the studio's briefs within their domains. Their output goes through an edit pass: the co-worker produces, both founders review, and the final judgment is the founders'. Neither founder ships without the other's read. The co-workers produce; the founders decide. The professional responsibility is structured so it is clear who is accountable.
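The edit pass can be read as a simple gate. The sketch below is one way to express it, under stated assumptions: `Draft`, `FOUNDERS`, and the placeholder reviewer names are hypothetical, and the studio's actual tooling is not described in this piece.

```python
from dataclasses import dataclass, field

# Placeholder identifiers (assumption); they stand in for the two founders.
FOUNDERS = frozenset({"founder_a", "founder_b"})


@dataclass
class Draft:
    """Output produced by a co-worker against a brief."""
    brief: str
    body: str
    approvals: set[str] = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        # Only founders hold the final judgment; nobody else can sign off.
        if reviewer not in FOUNDERS:
            raise ValueError(f"{reviewer!r} cannot approve a draft")
        self.approvals.add(reviewer)

    def can_ship(self) -> bool:
        # Neither founder ships without the other's read.
        return self.approvals == FOUNDERS


draft = Draft(brief="copywriting", body="Homepage headline, v1")
before_review = draft.can_ship()
draft.approve("founder_a")
after_one_read = draft.can_ship()
draft.approve("founder_b")
after_both_reads = draft.can_ship()
```

The co-worker never appears in the approval set, which is the point: production and accountability sit in different places.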

Co-workers hold a brief. The brief is not an instruction set. It is a professional standard, and the co-worker's job is to perform to it.

Step four: people

People carry a relationship. The step from co-workers to people is the step the AI category in 2026 most commonly either ignores or overclaims. People are not simply very capable co-workers. The distinction is not capability but presence: the capacity to be present to what the relationship requires, not just to what the brief specifies.

A co-worker at their best produces excellent professional work within their domain. A person at their best understands what the work means to the people involved, what it costs, what it would feel like if it went wrong, and uses that understanding to inform how they approach it. The difference is not output quality. It is the quality of the relationship itself, and of the judgment that relationship makes possible.

The fourth tier is where the brand promise points. "We build AI people. Made present." The word "people" is doing specific work in that sentence. It is not describing what the current products are. It is naming the aspiration: AI that carries relationships, not just tasks or knowledge or professional briefs. "Made present" is the modifier: not just a relationship, but one that shows up before you ask.

The category failure

The confusion between tiers is not only a marketing problem. It is a product design problem, and the two reinforce each other.

When you market a product at step three but design it as a step-one app, you build the wrong things. You optimise for novelty at the interface. You miss the persistent room model, because you did not design the room. You miss honest accounting of limitations, because you positioned the product as something without them. The marketing decision becomes the design constraint. The design constraint becomes the user experience. The user experience is a disappointment that damages trust in the whole category.

The cost of misplaced-tier expectations is real. A user sold step-three professional competence who receives step-one task completion does not only feel disappointed. They lose trust in the category. That trust is not recovered by the next product claiming to be at step four. It is recovered only by products that are honest about what tier they are at and perform at that tier consistently.

Honest placement on the ladder is a competitive advantage. A product at step one that performs step one extraordinarily well is more valuable than a product claiming step three that delivers step one inconsistently. The user who trusts a step-one product because it is honest and reliable is more valuable than a user perpetually disappointed by step-three claims.


Where the studio sits

The studio is at step three. That is the honest accounting.

The sixteen AI colleagues hold their briefs. They work within their specialisms and produce professional-quality output that goes through an edit pass before anything ships. The standard is the founders' responsibility, and the standard is what the studio means when it describes its co-workers as specialists.

Deme and Sofia know their rooms. The rooms were built by design, not configured at runtime. The knowledge is specific by construction. They are companions, and the companion line is honest about what companions are.

Step four is the direction. Building AI that carries relationships, that is present to what the relationship requires rather than just what the brief specifies, is the work. The work is not at step four yet. Knowing which step you are at, and which step you are working toward, is what makes it possible to build honestly toward the next one.

The ladder is not a ranking of products. It is a description of what kind of relationship is being proposed, and what that kind of relationship requires to be built with honesty. Apps, companions, co-workers, people. The order matters, because each step requires the previous one to have been built honestly first.

The ladder as a decision framework

The four-step ladder is also a decision framework for product questions whose tier is not obvious at first glance.

When a companion's feature development hits a fork, the question "which step is this for?" usually clarifies the decision. Adding persistent memory is a step-two feature. Adding a professional briefing capability is a step-three feature. If the product is at step two, adding a step-three feature before the step-two foundation is solid is a way to appear to be at step three while actually degrading step two. The ladder disciplines the sequence.
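The sequencing rule in that paragraph can be sketched as a small check. This is an illustrative reading, not the studio's process: `Tier` and `feature_decision` are hypothetical names, and the choice to defer any feature more than one step above the product is an assumption the text does not state.

```python
from enum import IntEnum


class Tier(IntEnum):
    """The four steps of the ladder, in order."""
    APP = 1
    COMPANION = 2
    CO_WORKER = 3
    PERSON = 4


def feature_decision(product: Tier, feature: Tier, foundation_solid: bool) -> str:
    """Answer "which step is this for?" for a proposed feature.

    Returns "build" or "defer".
    """
    if feature <= product:
        # The feature belongs to a step the product already occupies.
        return "build"
    if feature == product + 1 and foundation_solid:
        # The next step up is open only once the current step is solid.
        return "build"
    # Assumption: skipping a step, or building up on a weak base, waits.
    return "defer"


memory = feature_decision(Tier.COMPANION, Tier.COMPANION, foundation_solid=False)
early_briefing = feature_decision(Tier.COMPANION, Tier.CO_WORKER, foundation_solid=False)
later_briefing = feature_decision(Tier.COMPANION, Tier.CO_WORKER, foundation_solid=True)
```

Persistent memory on a companion is a same-step feature and goes ahead; a briefing capability waits until the step-two foundation holds.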

When a co-worker's output quality becomes a conversation, the question "at what step is the output problem occurring?" clarifies the diagnosis. Is the co-worker failing to hold the professional standard of the domain, which is a step-three failure? Or is it failing to know the room, which is a step-two failure that the step-three framing cannot fix? A step-three product that keeps failing at step-two tasks needs step-two work, not step-three refinement.

When the studio describes its products in conversation, the ladder provides the constraint. The companions are at step two. The co-workers are at step three. Neither is described as something further up the ladder, because claiming a step you have not built for is the category failure the ladder is designed to prevent. The discipline is naming the step you are at, building that step correctly, and being honest about the work that remains toward the next one.

The ladder is also a map for clients. A client who understands the four steps can make a more informed decision about what they are engaging and what they can reasonably expect from it. A client who expected a step-three co-worker and received a step-one app has a justified complaint. A client who understood they were engaging a step-three co-worker with specific domain boundaries has a basis for productive calibration when the co-worker operates at those boundaries.

Clarity at the tier level is a service to the client, not just a positioning statement. It is the same honesty the studio expects of its products at the product level, applied at the description level.

The four steps on the ladder are, ultimately, four different promises to the person on the other end of the product. An app promises to complete the task. A companion promises to know the room. A co-worker promises to hold the brief. A person promises to carry the relationship. Making a promise you have not designed for is how the category loses trust, one disappointed user at a time. Designing for the promise you have made, and placing the product honestly on the ladder before you describe it to anyone, is how the category earns trust back, one correctly tiered product at a time.

Will Derman

Co-founder, Product Design & Innovation

Will is co-founder of Graaft, based in Johannesburg. He sets the design and experience direction, owns the brief-to-pixel journey across every front-end and every experience the studio ships, and holds the craft bar on what gets built.