// guide
What is Markdown?
The complete guide to .md files.
Markdown is a lightweight markup language that lets you format plain text using simple symbols. It is the standard format for README files, documentation, AI context files, and LLM inputs.
// quick answer
Markdown is a plain text formatting syntax created by John Gruber in 2004. Files use the .md or .markdown extension. You write plain text with simple symbols — # for headings, **text** for bold — and tools convert it to formatted HTML, PDF, or other outputs. It is the de-facto format for developer documentation, AI context files, and RAG pipelines.
What is a .md file?
A .md file is a plain text file written in Markdown syntax. You can open it in any text editor, such as Notepad, VS Code, or Vim. On platforms like GitHub and GitLab, .md files are automatically rendered as formatted HTML.
Common .md files you've likely encountered:
README.md: Project introduction on GitHub
CLAUDE.md: Codebase context for Claude Code
CHANGELOG.md: Version history and release notes
CONTRIBUTING.md: Contribution guidelines for open-source
skill.md: Claude Code skill definitions
context.md: Topic context for LLM sessions
Markdown syntax — quick reference
Markdown uses symbols you already type. There is no configuration or installation required.
Syntax                       Result               Notes
# Heading 1                  Large heading (H1)   One # per heading level
## Heading 2                 Sub-heading (H2)
**bold text**                Bold text            Or __bold__
_italic text_                Italic text          Or *italic*
- list item                  Bullet list          Or * or +
1. ordered item              Numbered list
`inline code`                Inline code          Backtick
```python\ncode\n```         Code block           Fenced with language
[Link](https://url.com)      Hyperlink
| Col | Col |\n|---|---|     Table                GFM only
> blockquote                 Blockquote
---                          Horizontal rule
What is Markdown used for?
Markdown has become the universal language for structured text on the internet and inside AI systems.
Software documentation
README.md, CHANGELOG.md, and CONTRIBUTING.md are Markdown by convention, and GitHub renders them automatically. Most open-source projects on GitHub use Markdown for at least their README.
Documentation sites
Tools like Docusaurus, GitBook, MkDocs, and Nextra take a folder of .md files and generate a full documentation website. Markdown is the content layer; the framework handles design and navigation.
Blogs and publishing
Ghost, Hashnode, and Dev.to use Markdown for post authoring. Static site generators (Hugo, Astro, Next.js) treat Markdown files as content sources.
Note-taking and knowledge management
Obsidian stores notes as plain .md files on disk, so they stay portable rather than locked in a proprietary format. Notion, Roam Research, and Bear use Markdown-style syntax for writing and can export notes as Markdown.
AI context files and LLM inputs
CLAUDE.md, llms.txt, context.md, and skill.md are all Markdown files (llms.txt keeps a .txt extension but is written in Markdown) that brief AI models on projects, codebases, and topics. Markdown's structure is readable by both humans and language models.
RAG pipelines and vector databases
Before documents are embedded into a vector database, they are often converted to clean Markdown. The format preserves semantic structure (headings, tables) while minimizing token noise.
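One way a pipeline can use that preserved structure is to split the Markdown at its headings before embedding, so each chunk carries its own section title. The sketch below is a minimal illustration, not a reference implementation: the function name `chunk_by_heading` and the chunk dictionary layout are assumptions made for this example, and it only recognises ATX headings (lines starting with #).

```python
import re

# Matches ATX headings such as "# Title" or "## Section" (1 to 6 # characters).
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # three backticks, the fenced code block marker


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split Markdown into chunks, starting a new chunk at every heading.

    Hypothetical helper for illustration only.
    """
    chunks = []
    current = {"heading": None, "level": 0, "lines": []}
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # "#" lines inside code fences are not headings
        match = None if in_fence else HEADING.match(line)
        if match:
            # Close the previous chunk if it has any content.
            if current["heading"] is not None or current["lines"]:
                chunks.append(current)
            current = {
                "heading": match.group(2).strip(),
                "level": len(match.group(1)),
                "lines": [],
            }
        else:
            current["lines"].append(line)
    if current["heading"] is not None or current["lines"]:
        chunks.append(current)
    return [
        {"heading": c["heading"], "level": c["level"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]


doc = "intro\n# A\ntext a\n## B\ntext b"
for chunk in chunk_by_heading(doc):
    print(chunk)
```

Each chunk can then be embedded together with its heading, which keeps some of the document hierarchy attached to the vector.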
Why do LLMs prefer Markdown?
Everything you feed an LLM is counted in tokens. Markdown encodes much of the same structure as HTML or XML with far fewer characters, which means fewer tokens, lower cost, and more room for actual content.
Approximate token counts by format:
Raw PDF text: ~52,400 tokens
HTML: ~38,000 tokens
Clean Markdown: ~21,800 tokens
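To make the comparison concrete, the short calculation below works out the relative saving. It assumes the three figures above are token counts for the same content in each format; the dictionary and function names are made up for this example.

```python
# Approximate figures quoted above, assumed to describe the same content.
token_counts = {
    "Raw PDF text": 52_400,
    "HTML": 38_000,
    "Clean Markdown": 21_800,
}


def savings_vs_markdown(counts: dict[str, int]) -> dict[str, float]:
    """Percent of tokens saved by converting each format to clean Markdown."""
    markdown = counts["Clean Markdown"]
    return {
        name: round(100 * (tokens - markdown) / tokens, 1)
        for name, tokens in counts.items()
        if name != "Clean Markdown"
    }


print(savings_vs_markdown(token_counts))
# {'Raw PDF text': 58.4, 'HTML': 42.6}
```

On these numbers, clean Markdown needs roughly 58% fewer tokens than the raw PDF text and roughly 43% fewer than the HTML.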
Beyond token count, Markdown's structure helps models understand hierarchy. A document with ## Section headings gives the model explicit signals about content organisation. Raw text or HTML forces the model to infer structure from context — wasting attention on formatting rather than meaning.
Does the best Markdown format differ by model?
Yes. Different LLMs were trained on different data and respond best to different Markdown structures.
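Claude
Works well with XML-style tags. Anthropic recommends structuring long inputs with XML tags, for example <document> and <section>. The tags act as unambiguous delimiters that help Claude tell one part of the input from another; they are not a special syntax with a dedicated parser. For Claude, this kind of tag structure is generally a better fit than YAML frontmatter.
<document>
<section>
## Heading
content
</section>
</document>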
GPT-4o
Generally works well with standard ATX Markdown (# headings) plus YAML frontmatter for metadata. XML wrappers are usually unnecessary here and mostly add tokens. Removing redundant blank lines and keeping a clean heading hierarchy helps most.
---
title: Document
---
## Section
Content here.
Gemini
With a context window of up to 1M tokens, chunking is rarely necessary. Clean prose with consistent headings is sufficient: minimal metadata, no XML wrapping.
## Section
Content here.

## Next Section
More content.
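The three styles described above (XML wrapping, YAML frontmatter, plain headings) can be sketched as a single helper. This is only an illustration of the idea, not how any particular converter implements it: the function `format_for_model`, its `title` parameter, and the profile strings "claude", "gpt-4o", and "gemini" are all assumptions made for this example.

```python
def format_for_model(markdown: str, profile: str, title: str = "Document") -> str:
    """Wrap Markdown in the structure described above for a given profile.

    Hypothetical helper; the profile names are illustrative, not an official API.
    """
    body = markdown.strip()
    if profile == "claude":
        # XML-style tags as explicit delimiters around the content.
        return f"<document>\n<section>\n{body}\n</section>\n</document>"
    if profile == "gpt-4o":
        # YAML frontmatter for metadata, then plain ATX Markdown.
        return f"---\ntitle: {title}\n---\n{body}"
    if profile == "gemini":
        # Plain headings and prose, no wrapper.
        return body
    raise ValueError(f"unknown profile: {profile}")


sample = "## Section\nContent here."
for name in ("claude", "gpt-4o", "gemini"):
    print(f"[{name}]")
    print(format_for_model(sample, name))
    print()
```

Keeping the Markdown body identical and changing only the wrapper makes it easy to test each profile against the same content.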
SuperMD markitdown automatically applies the right Markdown format for each model when you convert a file. Select Claude, GPT-4o, or Gemini in the profile selector and the output is optimised accordingly.
How to create a Markdown file
Create a .md file in any text editor
Open VS Code, Notepad, or any editor. Create a new file and save it with the .md extension — for example, README.md or notes.md. That's all it takes. There is no special software required.
Write using Markdown syntax
Use # for H1 headings, ## for H2, **bold** for bold, _italic_ for italics, - for bullet lists, and ``` for code blocks. The syntax is designed to be readable even without rendering.
Preview or render it
VS Code has a built-in Markdown preview (Ctrl+Shift+V). GitHub renders .md files automatically. Tools like Pandoc convert .md to PDF, DOCX, or HTML. Online editors like StackEdit give a live side-by-side preview.
Convert existing files to Markdown with SuperMD
If you have a PDF, Word doc, or spreadsheet, SuperMD markitdown converts it to clean, LLM-optimized Markdown in your browser. No upload, no account, free.
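The first two steps above (create a .md file, write Markdown syntax) can also be done from a script. The sketch below writes a small notes.md into a temporary directory using only the Python standard library; the helper name `write_markdown` and the sample content are invented for this example.

```python
from pathlib import Path
import tempfile

FENCE = "`" * 3  # three backticks, the fenced code block marker

# A small document using the syntax from step 2: headings, bold, italic,
# a bullet list, and a fenced code block.
content = "\n".join([
    "# Project notes",
    "",
    "## Setup",
    "",
    "Install the **required** packages, then run the _optional_ checks.",
    "",
    "- first item",
    "- second item",
    "",
    FENCE + "python",
    'print("hello")',
    FENCE,
    "",
])


def write_markdown(directory: Path, name: str, text: str) -> Path:
    """Save text as a UTF-8 file; the .md extension is all that marks it as Markdown."""
    path = directory / name
    path.write_text(text, encoding="utf-8")
    return path


notes = write_markdown(Path(tempfile.mkdtemp()), "notes.md", content)
print(notes)
print(notes.read_text(encoding="utf-8"))
```

The resulting file can be opened in any editor or previewed as described in step 3.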
Frequently asked questions
What is Markdown?
Markdown is a lightweight markup language created by John Gruber in 2004. It uses plain text with simple symbols (like # for headings and ** for bold) to define formatting. A Markdown file uses the .md or .markdown extension and can be converted to HTML, PDF, or other formats.
What is an .md file?
An .md file is a plain text file written in Markdown syntax. It can be opened in any text editor. On GitHub, .md files are automatically rendered as formatted HTML. CLAUDE.md, README.md, and skill.md are common examples of .md files used in software development.
Why do LLMs prefer Markdown?
LLMs prefer Markdown because it encodes structure (headings, lists, tables) using fewer tokens than HTML or XML. Converting raw PDF text or HTML to clean Markdown can cut the token count by roughly 40–60%. Markdown also makes the model's job easier, because structure is explicit without being noisy.
What is Markdown used for?
Markdown is used for README files on GitHub, documentation sites (Docusaurus, GitBook), blog posts (Ghost, Hashnode), note-taking apps (Obsidian, Notion), AI context files (CLAUDE.md, llms.txt), and RAG pipelines for LLMs. It is the de-facto format for developer documentation and AI-ready content.
How is Markdown different from HTML?
HTML uses verbose tags like <h1>Title</h1> and <strong>bold</strong>. Markdown uses # Title and **bold**. Markdown is faster to write, easier to read as plain text, and produces fewer tokens — making it better for LLM consumption. Markdown is typically converted to HTML for display.
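To show the correspondence between the two notations, here is a deliberately tiny converter that handles only headings, bold text, and paragraphs. It is a sketch for illustration, not a real Markdown parser; the function name `tiny_markdown_to_html` is made up for this example, and a production setup would use an established Markdown library instead.

```python
import html
import re


def tiny_markdown_to_html(markdown: str) -> str:
    """Convert headings, bold, and paragraphs only. Not a full Markdown parser."""
    out = []
    for line in markdown.splitlines():
        if not line.strip():
            continue  # blank lines only separate blocks
        # Escape &, <, > so literal text cannot be mistaken for HTML tags.
        text = html.escape(line.strip(), quote=False)
        heading = re.match(r"^(#{1,6})\s+(.*)$", text)
        if heading:
            inner, tag = heading.group(2), f"h{len(heading.group(1))}"
        else:
            inner, tag = text, "p"
        # **bold** becomes <strong>bold</strong>
        inner = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", inner)
        out.append(f"<{tag}>{inner}</{tag}>")
    return "\n".join(out)


source = "# Title\n\nSome **bold** text."
print(tiny_markdown_to_html(source))
# <h1>Title</h1>
# <p>Some <strong>bold</strong> text.</p>
```

The HTML output is longer than the Markdown input for the same content, which is the overhead the answer above refers to.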
What is GitHub Flavored Markdown (GFM)?
GitHub Flavored Markdown (GFM) is a strict superset of CommonMark that adds tables, task lists (- [ ] item), strikethrough (~~text~~), and autolinks. Fenced code blocks (```javascript) are already part of CommonMark; GitHub adds syntax highlighting based on the language tag. GFM is one of the most widely used Markdown dialects, and most LLMs read and write it well.
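As a small illustration of one GFM extension, the sketch below pulls task list items out of a Markdown string. The regular expression covers the common "- [ ]" and "- [x]" forms only, and the function name `parse_tasks` is an assumption made for this example.

```python
import re

# A list marker (-, *, or +), then [ ] or [x], then the task text.
TASK = re.compile(r"^\s*[-*+]\s+\[( |x|X)\]\s+(.*)$")


def parse_tasks(markdown: str) -> list[tuple[bool, str]]:
    """Return (done, text) for every GFM task list item found."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK.match(line)
        if match:
            tasks.append((match.group(1) != " ", match.group(2).strip()))
    return tasks


todo = "- [ ] write docs\n- [x] ship release\n- plain item"
print(parse_tasks(todo))
# [(False, 'write docs'), (True, 'ship release')]
```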