URL Encoding Explained

URL encoding, also called percent-encoding, is the mechanism that lets a URL carry characters that would otherwise be unsafe or ambiguous within it. A URL has a strict grammar in which certain characters — slashes, question marks, ampersands — have structural meaning, so any data containing those characters must be escaped to avoid being misread. The scheme is simple: a problematic byte is replaced by a `%` followed by its two-digit hexadecimal value, so a space becomes `%20`. This guide explains which characters are reserved versus safe, why encoding is necessary, the important difference between `encodeURIComponent` and `encodeURI`, and the perennial confusion around spaces and the `+` character.

  1. Why URLs need encoding

    A URL is not free-form text; it has a defined structure in which characters like `/`, `?`, `#`, `&`, and `=` separate its components and parameters. If a piece of data you put into a URL contains one of those characters, it is interpreted as structure rather than data, and the URL breaks or silently changes meaning. Encoding solves this by replacing such characters with a safe `%`-escaped form, so a search for `a&b` becomes `a%26b` and is no longer mistaken for two query parameters. Encoding also handles characters that are not allowed in a URL at all, such as spaces and non-ASCII characters.
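
The `a&b` case above can be reproduced in a few lines. This is a minimal sketch; the host `example.com` and the parameter name `q` are illustrative assumptions, not part of any real API.

```javascript
// Hypothetical search term that happens to contain a delimiter.
const rawTerm = "a&b";

// Naive concatenation: the & inside the data is read as a parameter separator.
const brokenUrl = new URL("https://example.com/search?q=" + rawTerm);
const brokenKeys = [...brokenUrl.searchParams.keys()]; // ["q", "b"]

// Encoding the value first keeps it a single piece of data.
const safeUrl = new URL(
  "https://example.com/search?q=" + encodeURIComponent(rawTerm)
);
const safeValue = safeUrl.searchParams.get("q"); // "a&b"

console.log(brokenKeys, safeUrl.search, safeValue);
```

In the unencoded version the server would see `q=a` plus an empty parameter named `b`, which is the misreading that encoding prevents.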

  2. Reserved vs unreserved characters

    RFC 3986 divides characters into groups. Unreserved characters — the ASCII letters, digits, and `-`, `_`, `.`, `~` — are always safe and never need encoding. Reserved characters such as `: / ? # [ ] @ ! $ & ' ( ) * + , ; =` carry special meaning in a URL and must be encoded when they appear as literal data rather than as delimiters. Everything else, including spaces and non-ASCII characters, is encoded too. The key idea is that the same character can be a delimiter or data depending on context, and encoding is how you signal “treat this as data.”
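
As a rough illustration of the three groups, the sketch below classifies a single character. The helper name `classifyChar` is hypothetical, and it assumes its argument is exactly one character.

```javascript
// Unreserved set from the text: ASCII letters, digits, and - _ . ~
const UNRESERVED_CHAR = /^[A-Za-z0-9\-._~]$/;
// Reserved set as listed in the text.
const RESERVED_CHARS = ":/?#[]@!$&'()*+,;=";

// Hypothetical helper; assumes `ch` is a single character.
function classifyChar(ch) {
  if (UNRESERVED_CHAR.test(ch)) return "unreserved"; // never needs encoding
  if (RESERVED_CHARS.includes(ch)) return "reserved"; // encode when used as data
  return "other"; // spaces, non-ASCII, and the rest are encoded
}

console.log(["a", "~", "&", "/", " ", "é"].map(classifyChar));
```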

  3. How percent-encoding works

    To encode a character, you take its byte value and write it as `%` followed by two hexadecimal digits. So a space (byte `0x20`) becomes `%20`, an ampersand becomes `%26`, and a forward slash becomes `%2F`. Non-ASCII characters are first encoded to bytes using UTF-8 and then each byte is percent-encoded, so `é` becomes `%C3%A9` — two bytes, two escapes. Decoding simply reverses the process, turning each `%XX` back into its byte and then interpreting the bytes as UTF-8.
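
The steps above can be sketched by hand to show what happens byte by byte. This is an illustration only, not a replacement for the built-in functions; `percentEncode` is a hypothetical name, and it assumes a runtime that provides `TextEncoder` (modern browsers and Node.js).

```javascript
// Hypothetical helper: escapes every byte that is not an unreserved character.
function percentEncode(text) {
  const utf8Bytes = new TextEncoder().encode(text); // string -> UTF-8 bytes
  let encoded = "";
  for (const byte of utf8Bytes) {
    const ch = String.fromCharCode(byte);
    if (/[A-Za-z0-9\-._~]/.test(ch)) {
      encoded += ch; // unreserved: keep as-is
    } else {
      // % followed by two uppercase hex digits
      encoded += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return encoded;
}

console.log(percentEncode(" "));  // %20
console.log(percentEncode("&"));  // %26
console.log(percentEncode("/"));  // %2F
console.log(percentEncode("é"));  // %C3%A9 (two UTF-8 bytes, two escapes)
```

Passing the result to the built-in `decodeURIComponent` reverses the process and yields the original string.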

  4. encodeURIComponent vs encodeURI

    JavaScript offers two functions that differ in how aggressively they encode. `encodeURIComponent` is for a single piece of data (one query value, one path segment) and encodes everything except the unreserved characters and `!`, `'`, `(`, `)`, `*`, so `/`, `?`, `&`, and `=` are all escaped. `encodeURI` is for encoding a whole URL and deliberately leaves structural characters such as `:`, `/`, `?`, `#`, `&`, and `=` intact so the URL still works. The practical rule is to use `encodeURIComponent` on each individual value you are inserting into a URL, and reserve `encodeURI` for the rare case of fixing up an entire URL string at once.
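
A side-by-side comparison makes the difference concrete. The sample strings and the `example.com` URL are made up for illustration.

```javascript
// A single value that contains structural characters.
const sampleValue = "rock&roll/jazz?";

const asComponent = encodeURIComponent(sampleValue);
// "rock%26roll%2Fjazz%3F" : delimiters are escaped, safe to insert as data

const asWholeUri = encodeURI(sampleValue);
// "rock&roll/jazz?" : delimiters are left alone, NOT safe to insert as data

// A whole URL that only needs its illegal characters fixed up.
const roughUrl = "https://example.com/my files/?q=a b";
const fixedUrl = encodeURI(roughUrl);
// "https://example.com/my%20files/?q=a%20b" : structure preserved

console.log(asComponent, asWholeUri, fixedUrl);
```

Running `encodeURIComponent` on `roughUrl` instead would escape the `:`, `/`, and `?` as well, leaving a string that no longer works as a URL.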

  5. Encoding query strings

    A query string is a series of `key=value` pairs joined by `&`, so any key or value that itself contains `=`, `&`, or other special characters must be encoded individually before assembly. If you build the string by hand, run `encodeURIComponent` on each key and each value, then join them — encoding the whole thing at the end would wrongly escape the `=` and `&` that hold it together. In practice, prefer `URLSearchParams` (or your platform’s equivalent), which encodes each pair correctly for you and removes a whole class of bugs.
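
The sketch below builds the same query string three ways: by hand, with `URLSearchParams`, and incorrectly by encoding the assembled string. The parameter names and values are invented for the example.

```javascript
// Hypothetical parameters whose values contain & and =.
const queryFields = { q: "tom&jerry", filter: "price=low" };

// By hand: encode each key and value, then join.
const byHand = Object.entries(queryFields)
  .map(([key, val]) => `${encodeURIComponent(key)}=${encodeURIComponent(val)}`)
  .join("&");
// "q=tom%26jerry&filter=price%3Dlow"

// Preferred: let URLSearchParams encode each pair.
const viaParams = new URLSearchParams(queryFields).toString();
// "q=tom%26jerry&filter=price%3Dlow"

// Wrong: encoding after assembly escapes the = and & that hold it together.
const encodedTooLate = encodeURIComponent("q=tom&jerry&filter=price=low");
// "q%3Dtom%26jerry%26filter%3Dprice%3Dlow"

console.log(byHand, viaParams, encodedTooLate);
```

The two correct results agree here because the sample values contain no spaces; the two approaches differ in how they write a space.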

  6. The space and + pitfall

    Spaces are the classic source of confusion because there are two conventions. In the path and in general percent-encoding, a space is `%20`. In the `application/x-www-form-urlencoded` format used for HTML form submissions and many query strings, a space is encoded as `+`, and a literal `+` must therefore be written as `%2B`. This is why `encodeURIComponent` produces `%20` for a space while form encoding produces `+`, and why a `+` typed into a query value can silently turn into a space when decoded. When in doubt, use `%20` for spaces in paths, let `URLSearchParams` handle query encoding, and always encode a literal `+` as `%2B`.
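
The two conventions can be seen next to each other in a short sketch. The parameter name `q` and the sample values are illustrative assumptions.

```javascript
const spacedValue = "a b+c";

// General percent-encoding: space -> %20, literal + -> %2B.
const percentStyle = encodeURIComponent(spacedValue); // "a%20b%2Bc"

// Form encoding (what URLSearchParams emits): space -> +, literal + -> %2B.
const formStyle = new URLSearchParams({ q: spacedValue }).toString(); // "q=a+b%2Bc"

// Decoding a raw + with form rules silently turns it into a space.
const plusLost = new URLSearchParams("q=1+1").get("q"); // "1 1"
const plusKept = new URLSearchParams("q=1%2B1").get("q"); // "1+1"

// decodeURIComponent does not apply form rules, so + stays +.
const plainDecode = decodeURIComponent("1+1"); // "1+1"

console.log(percentStyle, formStyle, plusLost, plusKept, plainDecode);
```

The mismatch to watch for is encoding with one convention and decoding with the other, which is why a literal `+` is safest written as `%2B`.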
