URL encoding vs HTML encoding
URL encoding and HTML encoding solve different problems with similar-looking syntax. Both convert special characters into safer representations. Both use a special prefix character. Both are mandatory in their respective contexts. But they’re not interchangeable, and using one where the other is needed produces broken output.
This article explains what each one does, when to use each, and what happens when you mix them up.
The short version
URL encoding is for putting data into URLs. It converts unsafe characters into %XX percent-encoded byte sequences, so the URL stays inside the ASCII set HTTP requires.
HTML encoding is for putting data into HTML markup. It converts characters that would otherwise be interpreted as HTML syntax (<, >, &) into named or numeric entities (&lt;, &gt;, &amp;).
Different jobs, different schemes, different output. They don’t overlap and they don’t substitute for each other.
Side-by-side comparison
Take the string Hello <World> & "Friends" and encode it both ways:
| Original | URL-encoded | HTML-encoded |
|---|---|---|
| Hello | Hello | Hello |
| space | %20 | space |
| < | %3C | &lt; |
| > | %3E | &gt; |
| & | %26 | &amp; |
| " | %22 | &quot; |
Both schemes solve the “these characters have special meaning in this context” problem — but for different contexts, with different special-meaning characters.
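The table can be reproduced with a few lines of JavaScript. This is a minimal sketch: encodeURIComponent is built in, while htmlEncode is a hypothetical helper written here for illustration (JavaScript has no built-in HTML encoder), and it covers only the five characters that matter in text and quoted attribute values.

```javascript
// Hypothetical helper, not a built-in: maps the five HTML-significant
// characters to entities. The map name and function name are assumptions.
const HTML_ENTITIES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

function htmlEncode(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

const sample = 'Hello <World> & "Friends"';

const urlEncoded = encodeURIComponent(sample);
const htmlEncoded = htmlEncode(sample);

console.log(urlEncoded);  // Hello%20%3CWorld%3E%20%26%20%22Friends%22
console.log(htmlEncoded); // Hello &lt;World&gt; &amp; &quot;Friends&quot;
```

Note that the space is percent-encoded in the URL form but left alone in the HTML form, which matches the table row for row.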
When to use URL encoding
You’re putting data into a URL. The data needs to travel through HTTP. The URL parser will encounter your data and try to interpret structural characters (?, &, =) that might be part of your data.
- Query parameter values: ?q=Hello%2C%20World%21
- Path segments: /users/John%20Doe
- Fragment identifiers (after the #)
- Embedded URLs (the inner one is fully encoded)
- Form submissions sent via GET (the browser handles this automatically)
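A short sketch of the first few cases, using only built-ins (encodeURIComponent, URL, URLSearchParams). The paths, the next parameter, and the example.com host are made up for illustration.

```javascript
// Path segment: encode the piece of data, not the whole path.
const userName = "John Doe";
const profilePath = `/users/${encodeURIComponent(userName)}`;
// profilePath: /users/John%20Doe

// Embedded URL: the inner URL is encoded in full, including its own ? & =.
const returnTo = "https://example.com/a?b=1&c=2";
const loginUrl = `/login?next=${encodeURIComponent(returnTo)}`;
// loginUrl: /login?next=https%3A%2F%2Fexample.com%2Fa%3Fb%3D1%26c%3D2

// Query string via URLSearchParams: form-style encoding, so a space becomes "+".
const formQuery = new URLSearchParams({ q: "Hello, World!" }).toString();
// formQuery: q=Hello%2C+World%21

// Parsing reverses the encoding and recovers the original inner URL.
const parsedLogin = new URL(loginUrl, "https://example.com");
const recovered = parsedLogin.searchParams.get("next");

console.log(profilePath, loginUrl, formQuery, recovered);
```

encodeURIComponent leaves a handful of punctuation characters such as ! unescaped, so its output for "Hello, World!" differs slightly from the %21 form shown in the list. Both forms decode to the same value.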
When to use HTML encoding
You’re putting data into HTML markup. The data will be rendered by a browser. The HTML parser will encounter your data and try to interpret structural characters: < and > as tag delimiters, and & as the start of an entity.
- Text inside HTML tags: <p>User said: &quot;Hello&quot;</p>
- Attribute values: <a href="page.html" title="Smith &amp; Sons">
- Anything user-generated rendered as HTML (security-critical: this prevents XSS)
The case where they combine: URLs inside HTML
The most confusing case is when you have a URL inside an HTML attribute. The URL has been URL-encoded, then the whole thing needs to be HTML-encoded for the attribute:
// User input
const query = "Rock & Roll";
// Step 1: URL-encode for the query string
const url = `/search?q=${encodeURIComponent(query)}`;
// → /search?q=Rock%20%26%20Roll
// Step 2: HTML-encode for the href attribute
// (most templating engines do this automatically)
const html = `<a href="${htmlEncode(url)}">Search</a>`;
// → <a href="/search?q=Rock%20%26%20Roll">Search</a>
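In practice, a URL built from encodeURIComponent output is almost HTML-safe already: it contains only letters, digits, percent escapes, and a small set of punctuation (such as ?, =, and &) that is mostly harmless inside a double-quoted attribute. The one exception is the & that separates parameters, which should be written as &amp; for strict HTML compliance, although browsers accept a raw & in href attributes. Note also that encodeURIComponent leaves the apostrophe (') unescaped, which matters if the attribute is single-quoted.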
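The & case is easier to see with a second parameter. In the sketch below, htmlEncode is again a hypothetical helper (not a built-in), and the page parameter is an assumption added only to put a separator into the URL.

```javascript
// Hypothetical helper, not a built-in: encodes the five HTML-significant characters.
const ENTITY_MAP = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
const htmlEncode = (text) => String(text).replace(/[&<>"']/g, (ch) => ENTITY_MAP[ch]);

// Step 1: URL-encode each value, then join with literal & separators.
const searchUrl = `/search?q=${encodeURIComponent("Rock & Roll")}&page=${encodeURIComponent("2")}`;
// searchUrl: /search?q=Rock%20%26%20Roll&page=2

// Step 2: HTML-encode the finished URL for the attribute.
const anchorHtml = `<a href="${htmlEncode(searchUrl)}">Search</a>`;
// anchorHtml: <a href="/search?q=Rock%20%26%20Roll&amp;page=2">Search</a>

console.log(searchUrl);
console.log(anchorHtml);
```

The & that was part of the data is already %26 after step 1, so step 2 only touches the separator. The browser turns &amp; back into & when it parses the attribute, and the server receives the URL from step 1.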
What goes wrong when you mix them up
URL encoding HTML
// Wrong — URL-encoding HTML content
const html = `<p>Hello</p>`;
const encoded = encodeURIComponent(html);
// → "%3Cp%3EHello%3C%2Fp%3E"
// This is no longer HTML — the browser sees percent signs, not tags.
If you display this on a page, users see literal %3Cp%3EHello%3C%2Fp%3E as text. You’ve URL-encoded data that needed to remain HTML.
HTML-encoding URL data
// Wrong — HTML-encoding a URL value
const value = htmlEncode("Hello & World");
// → "Hello &amp; World"
const url = `/search?q=${value}`;
// → "/search?q=Hello &amp; World"
// The server sees q="Hello " and a separate parameter named "amp; World"
Now your URL has the literal text &amp; in it. The server’s URL parser treats that & as a separator starting a new parameter.
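To see what a parser makes of each version, the built-in URLSearchParams can stand in for the server. This is a sketch under that assumption; real server frameworks may differ in details such as how they treat the semicolon.

```javascript
// Wrong: the value was HTML-encoded, so the query contains a raw "&".
const wrongQuery = new URLSearchParams("q=Hello &amp; World");
const wrongNames = [...wrongQuery.keys()];
const wrongValue = wrongQuery.get("q");
// wrongNames: ["q", "amp; World"]
// wrongValue: "Hello "

// Right: the value was URL-encoded, so the "&" travels as %26.
const rightQuery = new URLSearchParams(`q=${encodeURIComponent("Hello & World")}`);
const rightNames = [...rightQuery.keys()];
const rightValue = rightQuery.get("q");
// rightNames: ["q"]
// rightValue: "Hello & World"

console.log(wrongNames, JSON.stringify(wrongValue));
console.log(rightNames, JSON.stringify(rightValue));
```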
Real-world security: XSS prevention
The biggest reason to know the difference is XSS (cross-site scripting). If user input is rendered directly into HTML without encoding, an attacker can inject script tags:
// User submits: <script>alert(document.cookie)</script>
// Wrong — rendering raw
<p>${userInput}</p>
// → <p><script>alert(document.cookie)</script></p>
// The browser executes the script.
// Right — HTML-encoding
<p>${htmlEncode(userInput)}</p>
// → <p>&lt;script&gt;alert(document.cookie)&lt;/script&gt;</p>
// The browser displays the text harmlessly.
URL encoding does NOT protect against XSS. It neutralizes URL syntax, not HTML syntax, and percent-encoded input that gets decoded before it is rendered is raw HTML again. Different defense for a different attack.
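Templating engines that escape automatically do roughly what the following tagged template does. This is an illustrative sketch, not how any particular engine is implemented; safeHtml and escapeHtml are hypothetical names.

```javascript
// Hypothetical escaping helper: encodes the five HTML-significant characters.
const ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
const escapeHtml = (value) => String(value).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);

// Tagged template: the literal markup passes through untouched,
// every interpolated value is HTML-encoded before it is spliced in.
function safeHtml(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHtml(values[i]) : ""),
    ""
  );
}

const userInput = "<script>alert(document.cookie)</script>";
const page = safeHtml`<p>${userInput}</p>`;

console.log(page);
// <p>&lt;script&gt;alert(document.cookie)&lt;/script&gt;</p>
```

The sketch is only appropriate for element text and quoted attribute values. Values placed inside a script block, a style block, or an unquoted attribute need different handling.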
Encoding functions by language
| Language | HTML encode | URL encode |
|---|---|---|
| JavaScript | manual replace | encodeURIComponent |
| Python | html.escape | urllib.parse.quote |
| PHP | htmlspecialchars | rawurlencode |
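| Java | StringEscapeUtils.escapeHtml4 (Apache Commons Text) | URLEncoder.encode |
| Go | html.EscapeString | url.QueryEscape |
| C# | HttpUtility.HtmlEncode | Uri.EscapeDataString |
Note that URLEncoder.encode and url.QueryEscape produce form-style output, where a space becomes + rather than %20. That is fine for query strings but not for path segments.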
One-line decision tree
Where is the data going?
- Into a URL → URL encode it
- Into HTML (between tags) → HTML encode it
- Into an HTML attribute (like href or src) — and the value is a URL → URL encode first, then HTML encode (most templates do the HTML encoding automatically)
- Into JavaScript code → JavaScript escape (yet another scheme — backslash-escape special chars)
- Into a JSON value → JSON encode (the JSON.stringify function handles this)
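For the JSON case, a small sketch of what JSON.stringify does with characters that are special inside a JSON string. The message field and the sample text are made up for illustration.

```javascript
// A value containing a double quote and a newline, both special in JSON strings.
const userText = 'She said "hi"\n';

// JSON.stringify backslash-escapes them; no percent signs or entities involved.
const jsonText = JSON.stringify({ message: userText });
// jsonText: {"message":"She said \"hi\"\n"}

// Parsing restores the original value exactly.
const roundTrip = JSON.parse(jsonText).message;

console.log(jsonText);
console.log(roundTrip === userText); // true
```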
Five contexts, each with its own encoding rules. The schemes share a family resemblance, but no two are interchangeable. Pick the one that matches the destination.