URI Encoding in Attributes


 

URI encoding (also called percent-encoding) applies to attributes whose data type is URI or URIs.

Important: After any required URI-encoding (see below), any remaining ampersand (&) characters must then be escaped as &amp; (or &#38;) before inserting the URI into the attribute. Also, if a pair of single quote (') characters is used to delimit the attribute, any single quotes within the attribute value itself must be escaped as &#39; (or &apos; if compatibility with HTML isn't necessary - see Character Entities Predefined by XML 1.0). These requirements are necessary to comply with the constraints of the XML 1.0 Specification.
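
The order of operations described above (URI-encode first, then escape for XML) can be sketched as follows. This is only an illustration using Python's standard library; the helper name href_attribute and the example values are hypothetical and do not come from the XHTML or XML specifications.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def href_attribute(base, params, delimiter='"'):
    """Build an href attribute: URI-encode first, then XML-escape.

    Hypothetical helper for illustration; `base` is assumed to be
    already URI-encoded.
    """
    # Step 1: URI-encode each query name and value (safe="" also encodes "/").
    query = "&".join(
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in params
    )
    uri = f"{base}?{query}"
    # Step 2: escape the ampersands (and "<", ">") that remain, for XML.
    value = escape(uri)
    # Step 3: escape whichever quote character delimits the attribute.
    # This runs after step 2 so the "&" in "&#39;" is not escaped again.
    if delimiter == "'":
        value = value.replace("'", "&#39;")
    else:
        value = value.replace('"', "&quot;")
    return f"href={delimiter}{value}{delimiter}"


print(href_attribute("search", [("q", "fish & chips"), ("lang", "en")]))
# href="search?q=fish%20%26%20chips&amp;lang=en"
```

The ampersand inside the data ("fish & chips") becomes %26 in the first step, so only the ampersand that separates the two parameters is left to be escaped as &amp; in the second.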

The characters A-Z, a-z, 0-9, hyphen/dash (-), underscore (_), full stop (.) and tilde (~) should be used without encoding in a URI. Others may have to be encoded, depending upon their purpose in the URI.

Characters within a URI may be grouped into the following categories:

Unreserved characters
These are characters which may be used as they are, without any encoding, and comprise:
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789-_.~
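
As a small sketch of how the unreserved set can be used in practice, the Python snippet below lists the characters in a string that fall outside it. The name needs_encoding is hypothetical, and the final check assumes Python 3.7 or later, where urllib.parse.quote leaves the tilde unencoded.

```python
import string
from urllib.parse import quote

# The unreserved set described above: letters, digits, and - _ . ~
UNRESERVED = string.ascii_letters + string.digits + "-_.~"


def needs_encoding(text):
    """Return the characters in text that fall outside the unreserved set."""
    return [ch for ch in text if ch not in UNRESERVED]


print(needs_encoding("report_2009-v1.2~draft"))   # []
print(needs_encoding("my report.html"))           # [' ']

# With safe="" nothing beyond the unreserved set is exempt from encoding,
# so the whole set should pass through unchanged.
print(quote(UNRESERVED, safe="") == UNRESERVED)   # True
```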

Reserved characters
These 19 characters have a reserved purpose and should usually be URI-encoded unless they are being used for that purpose. They comprise the 18 reserved characters defined by RFC 3986 plus the percent sign (%), which introduces an encoded character:
:/?#[]@!$&'()*+,;=%
Examples:
  • The plus sign (+) is often used to represent a space character in query strings (form-encoded data) and should not be encoded if used for that purpose.
  • The percent sign (%) is used to begin an encoded character and, if not being used for that purpose, it should be encoded as %25.
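
The two examples above, along with the general rule that a reserved character is encoded only when it is data, can be tried out with Python's urllib.parse module. This is an illustrative sketch only; the sample strings are made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# A literal percent sign must become %25 so it is not read as an escape.
print(quote("100% wool"))             # 100%25%20wool

# quote_plus() follows the form-style convention:
# space -> "+", and a literal "+" -> %2B.
print(quote_plus("1 + 1"))            # 1+%2B+1

# Decoding must use the matching convention, or "+" stays a plus sign.
print(unquote_plus("1+%2B+1"))        # 1 + 1
print(unquote("1+%2B+1"))             # 1+++1

# "/" is left alone by default because it usually separates path segments;
# pass safe="" when the slash is data rather than a separator.
print(quote("docs/a b.html"))         # docs/a%20b.html
print(quote("AC/DC", safe=""))        # AC%2FDC
```
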

Other characters
All other characters (not listed above), including all whitespace, must be URI-encoded.

See RFC 3986: URI Generic Syntax (Section 2) for more information on the structure of URIs.

The table below shows how common characters should be URI-encoded.

Unicode Character Name  Char     URI-encoding                       Other Name
SPACE                   (space)  %20 or +                           -
EXCLAMATION MARK        !        %21                                -
QUOTATION MARK          "        %22                                Double Quote
NUMBER SIGN             #        %23                                Hash
DOLLAR SIGN             $        %24                                -
PERCENT SIGN            %        %25                                Percentage Sign
AMPERSAND               &        %26                                -
APOSTROPHE              '        %27                                Single Quote
LEFT PARENTHESIS        (        %28                                Left Bracket
RIGHT PARENTHESIS       )        %29                                Right Bracket
ASTERISK                *        %2A                                Star
PLUS SIGN               +        %2B                                -
COMMA                   ,        %2C                                -
SOLIDUS                 /        %2F                                Slash
COLON                   :        %3A                                -
SEMICOLON               ;        %3B                                -
LESS-THAN SIGN          <        %3C                                -
EQUALS SIGN             =        %3D                                -
GREATER-THAN SIGN       >        %3E                                -
QUESTION MARK           ?        %3F                                -
COMMERCIAL AT           @        %40                                At Sign
LEFT SQUARE BRACKET     [        %5B                                -
REVERSE SOLIDUS         \        %5C                                Backslash
RIGHT SQUARE BRACKET    ]        %5D                                -
CIRCUMFLEX ACCENT       ^        %5E                                Caret
GRAVE ACCENT            `        %60                                Backtick
LEFT CURLY BRACKET      {        %7B                                Left Brace
VERTICAL LINE           |        %7C                                Pipe
RIGHT CURLY BRACKET     }        %7D                                Right Brace
POUND SIGN              £        %C2%A3 (UTF-8), %A3 (ISO-8859-1)   -
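
A few rows of the table can be reproduced with the short Python sketch below, which also shows why the pound sign has two possible encodings: the bytes that are percent-encoded depend on the character encoding chosen. The helper name encode_char is hypothetical.

```python
from urllib.parse import quote


def encode_char(ch, encoding="utf-8"):
    """Percent-encode one character, with no characters exempted."""
    return quote(ch, safe="", encoding=encoding)


# Print a handful of the ASCII rows from the table.
for ch in ' "#&/<?':
    print(repr(ch), encode_char(ch))

# A non-ASCII character yields one escape per byte of its encoded form.
print(encode_char("£"))                 # %C2%A3
print(encode_char("£", "iso-8859-1"))   # %A3
```

Because the two results differ, whoever decodes the URI has to assume the same character encoding as whoever encoded it.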

Page Footer & Copyright

Copyright © Sally Maughan 2005-2009 (Page last updated on 16 May 2009)

Valid XHTML 1.1 - hosted by Openstrike

Content based on the W3C Working Draft: XHTML 1.1 and Recommendation: XHTML Modularisation 1.1.

W3C, XHTML, XML, HTML, CSS and MathML are Trademarks of the W3C (MIT, ERCIM, Keio) with which the site's author has no connection.

