Text Mode vs Sketch Mode: When to Use Each

CloudDiagram.ai gives you two distinct paths to a professional architecture diagram. Both produce the same interactive, editable output — but they start from very different inputs. Here is how to choose.

Text Mode: Start from an Idea

Text mode is ideal when you know what you want but have not drawn anything yet. You type a description like:

"Serverless event processing pipeline with API Gateway, SQS queue, Lambda consumers, DynamoDB for storage, and CloudWatch for monitoring."

The AI asks 3–5 follow-up questions to clarify your intent — things like which database engine you prefer, whether you need multi-AZ, or how many Lambda functions are involved. Then it generates the full diagram.

Best for:

Greenfield designs where you are exploring options
Quickly iterating on architecture ideas during planning meetings
Generating standard patterns (three-tier, serverless, microservices)
People who think in words rather than pictures

Sketch Mode: Start from a Drawing

Sketch mode shines when you already have something visual — a whiteboard photo, a napkin sketch, or even a screenshot from another tool. Upload the image and the AI identifies every component, label, and connection, then converts it into a clean diagram with real AWS icons.

Best for:

Preserving whiteboard session outputs as proper documentation
Converting hand-drawn mockups into presentation-ready diagrams
Digitizing existing paper-based architecture drawings
Reverse-engineering a diagram from a screenshot

How the AI Processes Each Mode

Text Mode Pipeline

1. You submit a description
2. AI generates clarifying questions
3. You answer questions
4. AI generates a complete diagram JSON with nodes, edges, and groups
5. The canvas renders the interactive diagram
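
The exact schema of the diagram JSON is not described here. As a rough illustration only, a document with nodes, edges, and groups might look like the sketch below. Every field name (`id`, `service`, `label`, `source`, `target`, `members`) and the `dangling_references` helper are assumptions made for this example, not CloudDiagram.ai's actual format.

```python
import json

# Hypothetical diagram document. Field names are illustrative assumptions,
# not CloudDiagram.ai's real schema.
diagram = {
    "nodes": [
        {"id": "api", "service": "API Gateway", "label": "Public API"},
        {"id": "queue", "service": "SQS", "label": "Events queue"},
        {"id": "worker", "service": "Lambda", "label": "Consumer"},
        {"id": "table", "service": "DynamoDB", "label": "Event store"},
    ],
    "edges": [
        {"source": "api", "target": "queue"},
        {"source": "queue", "target": "worker"},
        {"source": "worker", "target": "table"},
    ],
    "groups": [
        {"id": "processing", "label": "Processing", "members": ["queue", "worker"]},
    ],
}


def dangling_references(doc):
    """Return node ids used by edges or groups that no node defines."""
    known = {node["id"] for node in doc["nodes"]}
    referenced = set()
    for edge in doc["edges"]:
        referenced.update((edge["source"], edge["target"]))
    for group in doc["groups"]:
        referenced.update(group["members"])
    return sorted(referenced - known)


print(json.dumps(diagram, indent=2))
print(dangling_references(diagram))
```

A consistency check like this is one way a renderer could reject a document whose edges or groups point at components that do not exist.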

Sketch Mode Pipeline

1. You upload an image (with optional context)
2. AI uses vision capabilities to analyze every element in the image
3. AI generates clarifying questions specific to what it detected
4. You answer questions
5. AI generates the diagram JSON matching your sketch
6. The canvas renders the interactive diagram

Sketch mode costs 3 credits per generation, versus 1 for text mode, because its additional vision analysis step requires more compute.
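
To see how those costs add up over a session, here is a small sketch of the arithmetic. The per-mode costs come from the paragraph above; the `total_credits` helper and the mode names used as keys are hypothetical.

```python
# Credit cost per generation, as stated in the article.
CREDITS_PER_GENERATION = {"text": 1, "sketch": 3}


def total_credits(generations):
    """Sum the credit cost of a sequence of mode names."""
    return sum(CREDITS_PER_GENERATION[mode] for mode in generations)


# One text-mode draft followed by two sketch-mode regenerations.
session = ["text", "sketch", "sketch"]
print(total_credits(session))  # 1 + 3 + 3 = 7
```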

Can I Mix Both?

Currently, each generation uses a single mode. You can still combine the two across generations with a workflow like this:

1. Start with text mode to generate an initial diagram
2. Edit the diagram interactively on the canvas
3. Export as .drawio for further modifications
4. Take a screenshot of the modified version
5. Upload the screenshot in sketch mode to regenerate the diagram with your changes

This hybrid approach lets you leverage both AI speed and human judgment.

Tips for Better Results

Text mode tips:

Be specific about AWS services (say "Aurora PostgreSQL" not just "database")
Mention networking requirements (VPC, subnets, AZs)
Include security components (WAF, IAM, security groups)
State the pattern name if applicable ("three-tier", "microservices", "event-driven")
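
One way to apply all four tips consistently is to assemble the description from explicit parts. The `build_prompt` helper below is a hypothetical convenience for composing the text you would type, not part of CloudDiagram.ai; it only builds a string.

```python
def build_prompt(pattern, services, networking=(), security=()):
    """Assemble a text-mode description from a pattern name, specific
    services, and optional networking and security requirements."""
    parts = [f"{pattern} architecture using {', '.join(services)}"]
    if networking:
        parts.append("networking: " + ", ".join(networking))
    if security:
        parts.append("security: " + ", ".join(security))
    return "; ".join(parts) + "."


prompt = build_prompt(
    "Three-tier",
    ["CloudFront", "ALB", "ECS Fargate", "Aurora PostgreSQL"],
    networking=["VPC with public and private subnets across 2 AZs"],
    security=["WAF", "IAM roles", "security groups"],
)
print(prompt)
```

Leaving a part empty makes the gap obvious before you submit, which is roughly what the follow-up questions would otherwise have to uncover.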

Sketch mode tips:

Use clear, dark lines on a light background
Label your components legibly
Draw arrows to show data flow direction
Group related services visually (even rough circles help)
Add context in the text field to tell the AI what your sketch represents
