<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pingback="http://madskills.com/public/xml/rss/module/pingback/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Kirk Jackson's Page of Words - AntiXSS</title>
    <link>http://pageofwords.com/blog/</link>
    <description>Run the ink across this page of words</description>
    <language>en-us</language>
    <copyright>Kirk Jackson</copyright>
    <lastBuildDate>Fri, 09 Oct 2009 09:00:18 GMT</lastBuildDate>
    <generator>newtelligence dasBlog 1.9.6264.0</generator>
    <managingEditor>kirkj@paradise.net.nz</managingEditor>
    <webMaster>kirkj@paradise.net.nz</webMaster>
    <item>
      <trackback:ping>http://pageofwords.com/blog/Trackback.aspx?guid=e8f31c63-4081-4e58-8d7e-accb2315d8bb</trackback:ping>
      <pingback:server>http://pageofwords.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://pageofwords.com/blog/PermaLink,guid,e8f31c63-4081-4e58-8d7e-accb2315d8bb.aspx</pingback:target>
      <dc:creator>Kirk Jackson</dc:creator>
      <wfw:comment>http://pageofwords.com/blog/CommentView,guid,e8f31c63-4081-4e58-8d7e-accb2315d8bb.aspx</wfw:comment>
      <wfw:commentRss>http://pageofwords.com/blog/SyndicationService.asmx/GetEntryCommentsRss?guid=e8f31c63-4081-4e58-8d7e-accb2315d8bb</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
To prevent cross-site scripting, it's important to encode data before outputting it.
</p>
        <p>
Up until now, it has been quite hard to ensure you're encoding everywhere throughout
your app.
</p>
        <p>
It's great to see the new syntax in ASP.NET 4 to automatically encode:
</p>
        <blockquote>
          <pre class="csharpcode">First Name: <span class="asp">&lt;%</span><span class="kwrd">:</span> Model.FirstName <span class="asp">%&gt;</span>
Last Name: <span class="asp">&lt;%</span><span class="kwrd">:</span> Model.LastName <span class="asp">%&gt;</span>
<span class="kwrd">&lt;</span><span class="html">form</span> <span class="attr">method</span><span class="kwrd">="post"</span><span class="kwrd">&gt;</span>
  <span class="asp">&lt;%</span><span class="kwrd">:</span> Html.TextBox(<span class="str">"FirstName"</span>) <span class="asp">%&gt;</span>
  <span class="asp">&lt;%</span><span class="kwrd">:</span> Html.TextBox(<span class="str">"LastName"</span>) <span class="asp">%&gt;</span>
<span class="kwrd">&lt;/</span><span class="html">form</span><span class="kwrd">&gt;</span></pre>
        </blockquote>
        <p>
(From <a href="http://haacked.com/archive/2009/09/25/html-encoding-code-nuggets.aspx">Phil
Haack's blog</a>)
</p>
        <p>
This means that for all new web applications, you can build using &lt;%: %&gt; instead
of &lt;%= %&gt;, which is great for ASP.NET MVC applications where that syntax is
common. 
</p>
        <p>
For older applications you will be able to opt in to the new encoding syntax, but
your old code will keep working exactly as it already does (perhaps insecurely, if
you're not encoding!)
</p>
        <p>
Here's hoping that we'll be able to replace the standard HtmlEncode with the <a href="http://www.codeplex.com/AntiXSS">AntiXSS</a> goodness
I described here:
</p>
        <ul>
          <li>
            <a href="http://pageofwords.com/blog/2009/02/25/WhatIsEncodingCrossSiteScriptingAndTheAntiXSSEncodingMethods.aspx">What
is encoding? Cross site scripting and the AntiXSS encoding methods</a>
          </li>
        </ul>
        <p>
Kirk
</p>
      </body>
      <title>Syntax support for HTML Encoding in ASP.NET 4</title>
      <guid isPermaLink="false">http://pageofwords.com/blog/PermaLink,guid,e8f31c63-4081-4e58-8d7e-accb2315d8bb.aspx</guid>
      <link>http://pageofwords.com/blog/2009/10/09/SyntaxSupportForHTMLEncodingInASPNET4.aspx</link>
      <pubDate>Fri, 09 Oct 2009 09:00:18 GMT</pubDate>
      <description>&lt;p&gt;
To prevent cross-site scripting, it's important to encode data before outputting it.
&lt;/p&gt;
&lt;p&gt;
Up until now, it has been quite hard to ensure you're encoding everywhere throughout
your app.
&lt;/p&gt;
&lt;p&gt;
It's great to see the new syntax in ASP.NET 4 to automatically encode:
&lt;/p&gt;
&lt;blockquote&gt; &lt;pre class="csharpcode"&gt;First Name: &lt;span class="asp"&gt;&amp;lt;%&lt;/span&gt;&lt;span class="kwrd"&gt;:&lt;/span&gt; Model.FirstName &lt;span class="asp"&gt;%&amp;gt;&lt;/span&gt;
Last Name: &lt;span class="asp"&gt;&amp;lt;%&lt;/span&gt;&lt;span class="kwrd"&gt;:&lt;/span&gt; Model.LastName &lt;span class="asp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="kwrd"&gt;&amp;lt;&lt;/span&gt;&lt;span class="html"&gt;form&lt;/span&gt; &lt;span class="attr"&gt;method&lt;/span&gt;&lt;span class="kwrd"&gt;="post"&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="asp"&gt;&amp;lt;%&lt;/span&gt;&lt;span class="kwrd"&gt;:&lt;/span&gt; Html.TextBox(&lt;span class="str"&gt;"FirstName"&lt;/span&gt;) &lt;span class="asp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="asp"&gt;&amp;lt;%&lt;/span&gt;&lt;span class="kwrd"&gt;:&lt;/span&gt; Html.TextBox(&lt;span class="str"&gt;"LastName"&lt;/span&gt;) &lt;span class="asp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="kwrd"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="html"&gt;form&lt;/span&gt;&lt;span class="kwrd"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
(From &lt;a href="http://haacked.com/archive/2009/09/25/html-encoding-code-nuggets.aspx"&gt;Phil
Haack's blog&lt;/a&gt;)
&lt;/p&gt;
&lt;p&gt;
This means that for all new web applications, you can build using &amp;lt;%: %&amp;gt; instead
of &amp;lt;%= %&amp;gt;, which is great for ASP.NET MVC applications where that syntax is
common. 
&lt;/p&gt;
&lt;p&gt;
For older applications you will be able to opt in to the new encoding syntax, but
your old code will keep working exactly as it already does (perhaps insecurely, if
you're not encoding!)
&lt;/p&gt;
&lt;p&gt;
Here's hoping that we'll be able to replace the standard HtmlEncode with the &lt;a href="http://www.codeplex.com/AntiXSS"&gt;AntiXSS&lt;/a&gt; goodness
I described here:
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="http://pageofwords.com/blog/2009/02/25/WhatIsEncodingCrossSiteScriptingAndTheAntiXSSEncodingMethods.aspx"&gt;What
is encoding? Cross site scripting and the AntiXSS encoding methods&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
Kirk
&lt;/p&gt;
</description>
      <comments>http://pageofwords.com/blog/CommentView,guid,e8f31c63-4081-4e58-8d7e-accb2315d8bb.aspx</comments>
      <category>AntiXSS;Security</category>
    </item>
    <item>
      <trackback:ping>http://pageofwords.com/blog/Trackback.aspx?guid=4dd3d79c-7169-401e-8ecd-71f75c4dd2db</trackback:ping>
      <pingback:server>http://pageofwords.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://pageofwords.com/blog/PermaLink,guid,4dd3d79c-7169-401e-8ecd-71f75c4dd2db.aspx</pingback:target>
      <dc:creator>Kirk Jackson</dc:creator>
      <wfw:comment>http://pageofwords.com/blog/CommentView,guid,4dd3d79c-7169-401e-8ecd-71f75c4dd2db.aspx</wfw:comment>
      <wfw:commentRss>http://pageofwords.com/blog/SyndicationService.asmx/GetEntryCommentsRss?guid=4dd3d79c-7169-401e-8ecd-71f75c4dd2db</wfw:commentRss>
      <slash:comments>1</slash:comments>
      <title>What is encoding? Cross site scripting and the AntiXSS encoding methods</title>
      <guid isPermaLink="false">http://pageofwords.com/blog/PermaLink,guid,4dd3d79c-7169-401e-8ecd-71f75c4dd2db.aspx</guid>
      <link>http://pageofwords.com/blog/2009/02/25/WhatIsEncodingCrossSiteScriptingAndTheAntiXSSEncodingMethods.aspx</link>
      <pubDate>Wed, 25 Feb 2009 03:57:16 GMT</pubDate>
      <description>&lt;p&gt;
Encoding is &amp;quot;the process of transforming information from one format into another&amp;quot;
[&lt;a href="http://en.wikipedia.org/w/index.php?title=Encoding&amp;amp;oldid=272528119"&gt;Wikipedia&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;
In the web development world, when we talk about encoding text, we are normally talking
about taking some input text and making it &lt;em&gt;appropriate to use&lt;/em&gt; in a &lt;em&gt;given
context&lt;/em&gt;. For example, taking the user's first name and last name, and making
it safe to put in a &amp;lt;b&amp;gt; tag within an html page.
&lt;/p&gt;
&lt;p&gt;
We care about encoding most when we take input that we don't trust from our users.
If we ever display that input, we have to be careful to neutralise any characters that
may interfere with the display of our web pages, cause javascript to run, or allow
other malicious actions.
&lt;/p&gt;
&lt;p&gt;
This article will help you understand what encoding is, why you need to do it and
how that helps prevent cross-site scripting, and give a little introduction to the &lt;a href="http://www.codeplex.com/AntiXSS"&gt;AntiXSS&lt;/a&gt; library.
&lt;/p&gt;
&lt;h2&gt;A bold example
&lt;/h2&gt;
&lt;p&gt;
As a running example, let's say we are letting the user enter anything they want for
their name - in an input box like this on our website:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_2.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="58" alt="Text box to collect name from the user" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb.png" width="289" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
We then take the text they enter and store it in our database. Later on when we display
it on the web page, we wrap the text in bold tags so that it stands out:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_4.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="51" alt="Welcome to the website, Kirk!" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_1.png" width="225" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
In ASP.NET one way of doing this would be to put an ASP.NET label between &amp;lt;b&amp;gt;
tags:
&lt;/p&gt;
&lt;pre class="code"&gt;Welcome to the website, &lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;b&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;asp&lt;/span&gt;&lt;span style="color: blue"&gt;:&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;Label &lt;/span&gt;&lt;span style="color: red"&gt;ID&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;NameLabel&amp;quot; &lt;/span&gt;&lt;span style="color: red"&gt;runat&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;server&amp;quot;&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;asp&lt;/span&gt;&lt;span style="color: blue"&gt;:&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;Label&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;b&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;!&lt;/pre&gt;
&lt;p&gt;
...and then in the code behind, take the name from our database and assign it to the
Text property:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: rgb(43,145,175)"&gt;User &lt;/span&gt;user = GetFromDatabase();
NameLabel.Text = user.Name;&lt;/pre&gt;
&lt;h2&gt;Trust no-one
&lt;/h2&gt;
&lt;p&gt;
The problem is, we've received this name directly from our user (who, of course, we
shouldn't trust), and we've stored it in a column in our database (which we now can't
trust either), so we can't safely display it on our website without sanitising it or
otherwise making it trustworthy.
&lt;/p&gt;
&lt;p&gt;
The number one lesson I try to give in my presentations on web security is &lt;em&gt;&amp;quot;Don't
trust...&amp;quot;&lt;/em&gt;. You can't trust your user, you can't trust your employees, your
students, or even your mother. There is no such thing as &amp;quot;safe input&amp;quot; that
you receive over the Internet; everything you receive is suspect. 
&lt;/p&gt;
&lt;p&gt;
(Even people who are otherwise trustworthy might not be in control of their faculties
if they have spyware or are virus-infected)
&lt;/p&gt;
&lt;p&gt;
Everything is fine if the user enters only plain letters and spaces:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_6.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="115" alt="User enters Kirk Jackson, output is safe" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_2.png" width="287" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
But what happens if the user enters some html into the input box?
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_8.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="110" alt="The user enters html, the page layout changes." src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_3.png" width="286" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
The user is now able to change how our page looks! Indeed, they can inject HTML, script
or other content directly into pages on our website!
&lt;/p&gt;
&lt;p&gt;
This is known as Cross-site scripting, or XSS, and is the bane of our existence as
web developers.
&lt;/p&gt;
&lt;h2&gt;What went wrong?
&lt;/h2&gt;
&lt;p&gt;
The ASP.NET label outputs the Text &lt;em&gt;directly&lt;/em&gt; into the HTML output of the page:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;p&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt; &lt;/span&gt;Welcome
to the website, &lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;b&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;span &lt;/span&gt;&lt;span style="color: red"&gt;id&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;NameLabel&amp;quot;&amp;gt;&lt;/span&gt;Kirk &lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;b&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;i&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;Jackson&lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;i&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;span&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;b&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;! &lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;p&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;
The problem here is that the ASP.NET label is not encoding the text before outputting
it. The text is not &lt;em&gt;appropriate&lt;/em&gt; to use in an &lt;em&gt;HTML context&lt;/em&gt;, as it
contains characters that have meaning in HTML (namely the characters making the &amp;lt;/b&amp;gt;
and &amp;lt;i&amp;gt; tags).
&lt;/p&gt;
&lt;p&gt;
To make the user's name safe to use in an HTML context, we need to encode the characters
that have special meaning in HTML:
&lt;/p&gt;
&lt;pre class="code"&gt;Kirk &lt;span style="color: red"&gt;&amp;amp;lt;&lt;/span&gt;/b&amp;gt;&lt;span style="color: red"&gt;&amp;amp;lt;&lt;/span&gt;i&amp;gt;Jackson&lt;span style="color: red"&gt;&amp;amp;lt;&lt;/span&gt;/i&amp;gt;&lt;/pre&gt;
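&lt;p&gt;
As a minimal sketch of that fix, assuming we are happy to use the HtmlEncode method
of the HttpUtility class in System.Web, a small helper could encode the name before
it is assigned to the label. The NameDisplay class and SafeName method are names made
up for this illustration:
&lt;/p&gt;
&lt;pre class="code"&gt;using System.Web;

public static class NameDisplay
{
    // Hypothetical helper: returns the untrusted name encoded for an HTML context.
    public static string SafeName(string untrustedName)
    {
        return HttpUtility.HtmlEncode(untrustedName);
    }
}&lt;/pre&gt;
&lt;p&gt;
In the code behind from earlier, the assignment would then become NameLabel.Text = NameDisplay.SafeName(user.Name);
&lt;/p&gt;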
&lt;h2&gt;HTML Encoding
&lt;/h2&gt;
&lt;p&gt;
HTML encoding is turning a string into a safe block of text for insertion in an HTML
web page. 
&lt;/p&gt;
&lt;p&gt;
This means it should not use any of the special characters that are used to mark the
beginning or end of tags (&amp;lt; and &amp;gt;), attribute values (&amp;quot;) or the ampersand
character on its own (&amp;amp;). If those characters are left in the string, then they
could be used to start or stop HTML tags and change the behaviour of our page.
&lt;/p&gt;
&lt;p&gt;
To remove these characters, HTML encoding requires them to be turned into character
entity references, or numeric entity references. This stops them from being treated
as special characters for formatting an HTML page, and just treats them as a character
to be displayed.
&lt;/p&gt;
&lt;blockquote&gt; 
&lt;table cellspacing="0" cellpadding="2" border="1"&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th valign="top"&gt;
Original character&lt;/th&gt;
&lt;th valign="top"&gt;
Character Entity Reference&lt;/th&gt;
&lt;th valign="top"&gt;
Numeric Entity Reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
&amp;lt; (less-than sign)&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;lt;&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;#60;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
&amp;gt; (greater-than sign)&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;gt;&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;#62;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
&amp;quot; (double quote)&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;quot;&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;#34;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
&amp;amp; (ampersand)&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;amp;&lt;/td&gt;
&lt;td valign="top"&gt;
&amp;amp;#38;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;
The above table shows a few examples of how to encode special characters. For a more
complete reference, see &lt;a href="http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references"&gt;Wikipedia&lt;/a&gt; or &lt;a href="http://www.w3.org/TR/html4/sgml/entities.html"&gt;W3C&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
Note that since the ampersand character is used to start an encoded character sequence,
it can't be used on its own as a regular character. This is why ampersands should
be encoded as &amp;amp;amp; in HTML.
&lt;/p&gt;
&lt;/blockquote&gt; 
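&lt;p&gt;
To make the table concrete, here is a hand-rolled sketch that replaces just those four
characters. HtmlEncodingSketch is a name invented for this example, and this is not
how the framework or AntiXSS implement encoding. In real code, prefer a library method,
which handles many more characters:
&lt;/p&gt;
&lt;pre class="code"&gt;public static class HtmlEncodingSketch
{
    // Illustration only: replaces the four characters from the table
    // with their character entity references.
    public static string Encode(string input)
    {
        // The ampersand goes first, so the entities added by the
        // later replacements are not themselves encoded a second time.
        return input
            .Replace(&amp;quot;&amp;amp;&amp;quot;, &amp;quot;&amp;amp;amp;&amp;quot;)
            .Replace(&amp;quot;&amp;lt;&amp;quot;, &amp;quot;&amp;amp;lt;&amp;quot;)
            .Replace(&amp;quot;&amp;gt;&amp;quot;, &amp;quot;&amp;amp;gt;&amp;quot;)
            .Replace(&amp;quot;\&amp;quot;&amp;quot;, &amp;quot;&amp;amp;quot;&amp;quot;);
    }
}&lt;/pre&gt;
&lt;p&gt;
The order matters: if the ampersand were replaced last, the ampersands introduced by
the other three replacements would be encoded again.
&lt;/p&gt;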
&lt;p&gt;
Once the user's name is encoded, it will then be in the HTML as &lt;span style="color: red"&gt;&amp;amp;lt;&lt;/span&gt;i&amp;gt;
instead of &amp;lt;i&amp;gt;, which means that in the above example, italic mode won't turn
on:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_10.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="113" alt="The user's text is now encoded correctly." src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_4.png" width="358" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
The screenshot above looks a little weird, but the page now displays the text exactly
as the user typed it in, without treating the user's input as special HTML markup.
&lt;/p&gt;
&lt;h2&gt;Attribute Encoding
&lt;/h2&gt;
&lt;p&gt;
Attribute encoding is turning a string into a safe block of text for use within an
attribute of an HTML tag.
&lt;/p&gt;
&lt;p&gt;
Attributes are the name/value pairs on a tag node in HTML (or SGML and XML, for that
matter). For example, in the following HTML, the &lt;em&gt;a&lt;/em&gt; tag has a &lt;em&gt;title &lt;/em&gt;attribute:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a &lt;/span&gt;&lt;span style="color: red"&gt;href&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;foo.html&amp;quot; &lt;/span&gt;&lt;span style="color: red"&gt;title&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;test&amp;quot;&amp;gt;&lt;/span&gt;thing&lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_20.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="81" alt="The title attribute is displayed as a tooltip" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_9.png" width="89" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
The text inside the title attribute is used to create a tool tip when the mouse pointer
hovers over the hyperlink.
&lt;/p&gt;
&lt;p&gt;
This HTML contains an &lt;em&gt;a&lt;/em&gt; tag (an anchor tag), which has two attributes set: &lt;em&gt;href &lt;/em&gt;and &lt;em&gt;title&lt;/em&gt;.
The &lt;em&gt;a&lt;/em&gt; tag also contains some HTML within it: the text 'thing'. The contained
text must be HTML encoded if you only want text within the &lt;em&gt;a &lt;/em&gt;tag, and the
two attributes must be attribute encoded.
&lt;/p&gt;
&lt;p&gt;
At a simplistic level, text is valid inside an attribute as long as it doesn't contain
double quotes (&amp;quot;), ampersands (&amp;amp;) or less-than symbols (&amp;lt;), as the double
quote would prematurely end the attribute, and the other two characters must be encoded
anywhere they are used within an HTML document (except when creating tags).
&lt;/p&gt;
&lt;p&gt;
To extend our earlier example, imagine the user's name is used as the tooltip of a
link, to pop up before they follow the link. If we naively output the user's name as
a title attribute without encoding it, the user could inject some additional behaviour
into our page, e.g.:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a &lt;/span&gt;&lt;span style="color: red"&gt;href&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;foo.html&amp;quot; &lt;/span&gt;&lt;span style="color: red"&gt;title&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;&lt;/span&gt;&lt;span style="background: rgb(255,238,98); -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"&gt;&amp;lt;%&lt;/span&gt;=
User.Name &lt;span style="background: rgb(255,238,98); -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"&gt;%&amp;gt;&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;quot;&amp;gt;&lt;/span&gt;thing&lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;
If the user enters something malicious, for example by entering a double-quote followed
by some javascript, then they have managed to inject extra HTML or javascript behaviour
into our site:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_24.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="155" alt="User enters script into Name field" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_11.png" width="309" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
The hover for the hyperlink looks okay, but when the user clicks the link, malicious
javascript can run:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_26.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="189" alt="Malicious javascript running" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_12.png" width="213" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
This is because the HTML that we have sent to the client's browser actually contains
an onclick attribute that we didn't intend:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a &lt;/span&gt;&lt;span style="color: red"&gt;href&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;foo.html&amp;quot; &lt;/span&gt;&lt;span style="color: red"&gt;title&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;Kirk&amp;quot; &lt;/span&gt;&lt;span style="color: red"&gt;onclick&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;alert('Hi')&amp;quot;&amp;gt;&lt;/span&gt;thing&lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;
Encoding the user's data before sending it to the browser would have protected us from
this, and then the HTML sent would look like this:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a &lt;/span&gt;&lt;span style="color: red"&gt;href&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;foo.html&amp;quot; &lt;/span&gt;&lt;span style="color: red"&gt;title&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;Kirk&amp;amp;quot;
onclick=&amp;amp;quot;alert('Hi')&amp;quot;&amp;gt;&lt;/span&gt;thing&lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;
Which correctly displays exactly what the user entered:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_22.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="83" alt="Tooltip now shows complete text entered" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_10.png" width="191" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
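&lt;p&gt;
One way to produce that encoded output, sketched here on the assumption that the
HtmlAttributeEncode method of the HttpUtility class in System.Web is acceptable for
the job, is to encode the name as the link is built. LinkBuilder and BuildLink are
hypothetical names used only for this illustration:
&lt;/p&gt;
&lt;pre class="code"&gt;using System.Web;

public static class LinkBuilder
{
    // Hypothetical helper: builds the link from the example above,
    // attribute encoding the untrusted text used for the tooltip.
    public static string BuildLink(string untrustedName)
    {
        return &amp;quot;&amp;lt;a href=\&amp;quot;foo.html\&amp;quot; title=\&amp;quot;&amp;quot;
            + HttpUtility.HtmlAttributeEncode(untrustedName)
            + &amp;quot;\&amp;quot;&amp;gt;thing&amp;lt;/a&amp;gt;&amp;quot;;
    }
}&lt;/pre&gt;
&lt;p&gt;
Because the double quote in the name is encoded, it can no longer close the title
attribute early, so the onclick text stays inside the tooltip.
&lt;/p&gt;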
&lt;h2&gt;URL Encoding
&lt;/h2&gt;
&lt;p&gt;
URL encoding is turning a string into a safe block of text for appending on the query
string of a URL.
&lt;/p&gt;
&lt;p&gt;
The original specification for HTTP URLs (&lt;a href="http://www.rfc-editor.org/rfc/rfc1738.txt"&gt;RFC
1738&lt;/a&gt;) specifies that URLs should only include certain characters, and all others
must be encoded. This is similar to the case of HTML encoding, but there is a much
smaller set of characters allowed, and the way you encode them is different.
&lt;/p&gt;
&lt;p&gt;
To encode a character to append to a URL, you use a percent sign, followed by
the two-digit hex number representing that character. For example:
&lt;/p&gt;
&lt;blockquote&gt; 
&lt;table cellspacing="0" cellpadding="2" width="324" border="1"&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th valign="top"&gt;
Original character&lt;/th&gt;
&lt;th valign="top"&gt;
URL Encoded Form&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
space&lt;/td&gt;
&lt;td valign="top"&gt;
%20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
/ (forward slash)&lt;/td&gt;
&lt;td valign="top"&gt;
%2F&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
&amp;quot; (double quote)&lt;/td&gt;
&lt;td valign="top"&gt;
%22&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td valign="top"&gt;
? (question mark)&lt;/td&gt;
&lt;td valign="top"&gt;
%3F&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;
The above table shows a few examples of how to URL encode special characters. For
a more complete reference, see Brian Wilson's &lt;a href="http://www.blooberry.com/indexdot/html/topics/urlencoding.htm"&gt;URL
Encoding&lt;/a&gt; page.
&lt;/p&gt;
&lt;/blockquote&gt; 
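&lt;p&gt;
The rule of a percent sign plus a two-digit hex number can be sketched in a line of
code. PercentEncodingSketch is a name invented for this example, and it only handles
single ASCII characters such as those in the table; characters outside that range need
more work, which is one more reason to use a library method in real code:
&lt;/p&gt;
&lt;pre class="code"&gt;public static class PercentEncodingSketch
{
    // Illustration only: a percent sign followed by the two-digit
    // upper-case hex code of a single ASCII character.
    public static string Encode(char c)
    {
        return &amp;quot;%&amp;quot; + ((int)c).ToString(&amp;quot;X2&amp;quot;);
    }
}&lt;/pre&gt;
&lt;p&gt;
Called with a space this returns %20, and with a forward slash it returns %2F, matching
the table above.
&lt;/p&gt;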
&lt;p&gt;
We need to encode strings before appending them to a URL, to make sure that untrusted
input is not able to change the URL.
&lt;/p&gt;
&lt;p&gt;
For example, if our page above constructed a URL to search Google for the name that
the user entered into the website, it could look like this:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_14.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="96" alt="Construct a search url by joining two strings together" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_6.png" width="452" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
When the user clicks the link, they will search Google for their name.
&lt;/p&gt;
&lt;p&gt;
Here the naive code is just constructing a url by joining the two strings together:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: rgb(43,145,175)"&gt;User &lt;/span&gt;user = GetFromDatabase(); &lt;span style="color: blue"&gt;string &lt;/span&gt;url
= &lt;span style="color: rgb(163,21,21)"&gt;&amp;quot;http://www.google.com/search?q=&amp;quot; &lt;/span&gt;+
user.Name;&lt;/pre&gt;
&lt;p&gt;
But if a name with spaces is entered, then we're generating an invalid URL:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_16.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="91" alt="Create a url with spaces in it" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_7.png" width="505" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
The URL is invalid because it contains an illegal character - a space that should
be encoded as %20.
&lt;/p&gt;
&lt;p&gt;
We could also be opening our users up to cross-site scripting bugs, because we are
effectively letting them create any url they want. For example:
&lt;/p&gt;
&lt;p&gt;
&lt;a href="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_18.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="91" alt="Create a url with ampersands in it" src="http://pageofwords.com/blog/content/binary/WindowsLiveWriter/WhatisencodingCrosssitescriptingandtheAn_128B5/image_thumb_8.png" width="505" border="0" /&gt;&lt;/a&gt; 
&lt;/p&gt;
&lt;p&gt;
Here we are appending the ampersand (&amp;amp;) that the user entered directly to the
end of the URL, so rather than all of their text being passed to the server as the &amp;quot;q&amp;quot;
parameter, we're letting them add other query string parameters (in this case, the
&amp;quot;I'm feeling lucky!&amp;quot; button). The solution in this case is to encode the
ampersand as %26.
&lt;/p&gt;
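&lt;p&gt;
As a minimal sketch of that fix, the snippet below runs the name through the framework's
Uri.EscapeDataString before appending it. That method is only a stand-in, chosen so the
example has no extra dependencies. The BuildSearchUrl helper and the sample names are made
up for illustration.
&lt;/p&gt;
&lt;pre class="code"&gt;using System;

public static class SearchLinkSketch
{
    // Hypothetical helper: percent-encodes the name so it stays inside the q parameter
    public static string BuildSearchUrl(string name)
    {
        return &amp;quot;http://www.google.com/search?q=&amp;quot; + Uri.EscapeDataString(name);
    }

    public static void Main()
    {
        // The space is written as %20
        Console.WriteLine(BuildSearchUrl(&amp;quot;Kirk Jackson&amp;quot;));

        // The ampersand is written as %26, so extra is not treated as a second parameter
        Console.WriteLine(BuildSearchUrl(&amp;quot;Kirk&amp;amp;extra=1&amp;quot;));
    }
}&lt;/pre&gt;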
&lt;h2&gt;The AntiXSS library
&lt;/h2&gt;
&lt;p&gt;
The &lt;a href="http://www.codeplex.com/AntiXSS"&gt;AntiXSS library&lt;/a&gt; (currently at version
3.0 beta) has been built by the &lt;a href="http://blogs.msdn.com/ace_team/"&gt;&lt;strike&gt;Microsoft
ACE Security and Performance Team&lt;/strike&gt;&lt;/a&gt; [ooops! By the &lt;a href="http://blogs.msdn.com/cisg/"&gt;Connected
Information Security Group&lt;/a&gt;, sorry!]
&lt;/p&gt;
&lt;p&gt;
The library provides two related functions:
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
Encoding methods to make text safe for a variety of contexts 
&lt;/li&gt;
&lt;li&gt;
An HttpHandler to automatically encode your ASP.NET controls 
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
I'll cover the Security Runtime Engine HttpHandler in another post.
&lt;/p&gt;
&lt;p&gt;
The encoding methods take a more robust approach than the existing methods in the HttpUtility
class of the .NET framework: they encode everything except a known-safe set of characters,
rather than only a short list of dangerous ones. You should use them in preference when
encoding your data.
&lt;/p&gt;
&lt;p&gt;
&lt;span style="color: blue"&gt;public class &lt;/span&gt;&lt;span style="color: rgb(43,145,175)"&gt;AntiXss 
&lt;br /&gt;
&lt;/span&gt;{ 
&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160; &lt;span style="color: blue"&gt;public static string &lt;/span&gt;HtmlAttributeEncode(&lt;span style="color: blue"&gt;string &lt;/span&gt;input); 
&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160; &lt;span style="color: blue"&gt;public static string &lt;/span&gt;HtmlEncode(&lt;span style="color: blue"&gt;string &lt;/span&gt;input); 
&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160; &lt;span style="color: blue"&gt;public static string &lt;/span&gt;JavaScriptEncode(&lt;span style="color: blue"&gt;string &lt;/span&gt;input); 
&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160; &lt;span style="color: blue"&gt;public static string &lt;/span&gt;UrlEncode(&lt;span style="color: blue"&gt;string &lt;/span&gt;input); 
&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160; &lt;span style="color: blue"&gt;public static string &lt;/span&gt;VisualBasicScriptEncode(&lt;span style="color: blue"&gt;string &lt;/span&gt;input); 
&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160; &lt;span style="color: blue"&gt;public static string &lt;/span&gt;XmlAttributeEncode(&lt;span style="color: blue"&gt;string &lt;/span&gt;input); 
&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160; &lt;span style="color: blue"&gt;public static string &lt;/span&gt;XmlEncode(&lt;span style="color: blue"&gt;string &lt;/span&gt;input); 
&lt;br /&gt;
}
&lt;/p&gt;
&lt;p&gt;
You need to decide which &lt;em&gt;context &lt;/em&gt;you're outputting the text into, and then choose
the appropriate method to encode it.
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HtmlEncode &lt;/strong&gt;- use for all HTML output, except for when you're adding
text inside an attribute of a tag (e.g. use for &amp;lt;b&amp;gt;...&amp;lt;/b&amp;gt;) 
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HtmlAttributeEncode &lt;/strong&gt;- use for text that will appear inside attributes
of tags (e.g. &amp;lt;a title=&amp;quot;...&amp;quot;&amp;gt;) 
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UrlEncode &lt;/strong&gt;- use for text that you are appending as a value in a URL
query string (e.g. http://google.com/search?q=...) 
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JavaScriptEncode &lt;/strong&gt;- use when you want to put the string into a JavaScript
variable (e.g. var foo = '...'). This method will also create the surrounding quotes. 
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VisualBasicScriptEncode &lt;/strong&gt;- use if you're unlucky enough to be creating
pages with VBScript on them 
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XmlEncode, XmlAttributeEncode&lt;/strong&gt; - the XML equivalents of the above
HTML methods 
&lt;/li&gt;
&lt;/ul&gt;
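&lt;p&gt;
To make the choice of context concrete, here is a rough sketch that encodes one untrusted
value four ways using the methods listed above. It assumes the AntiXSS assembly is referenced
and that the AntiXss class lives in the Microsoft.Security.Application namespace. The
ContextSketch class and the sample value are hypothetical.
&lt;/p&gt;
&lt;pre class="code"&gt;using System;
using Microsoft.Security.Application; // assumed namespace of the AntiXss class

public static class ContextSketch
{
    public static void Main()
    {
        // Hypothetical untrusted value containing markup and an ampersand
        string name = &amp;quot;Kirk &amp;lt;b&amp;gt; &amp;amp; co&amp;quot;;

        string inHtml = AntiXss.HtmlEncode(name);               // text between tags
        string inAttribute = AntiXss.HtmlAttributeEncode(name); // text inside an attribute value
        string inQuery = AntiXss.UrlEncode(name);               // value in a query string
        string inScript = AntiXss.JavaScriptEncode(name);       // JavaScript string, quotes included

        Console.WriteLine(inHtml);
        Console.WriteLine(inAttribute);
        Console.WriteLine(&amp;quot;http://google.com/search?q=&amp;quot; + inQuery);
        Console.WriteLine(&amp;quot;var foo = &amp;quot; + inScript + &amp;quot;;&amp;quot;);
    }
}&lt;/pre&gt;
&lt;p&gt;
The same input comes out differently each time, which is why the method has to match the
place the text ends up.
&lt;/p&gt;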
&lt;p&gt;
To use inline in your ASPX page, you can call the library methods directly:
&lt;/p&gt;
&lt;pre class="code"&gt;&lt;span style="color: blue"&gt;&amp;lt;&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a &lt;/span&gt;&lt;span style="color: red"&gt;href&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;foo.html&amp;quot; &lt;/span&gt;&lt;span style="color: red"&gt;title&lt;/span&gt;&lt;span style="color: blue"&gt;=&amp;quot;&lt;/span&gt;&lt;span style="background: rgb(255,238,98); -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"&gt;&amp;lt;%&lt;/span&gt;= AntiXss.HtmlAttributeEncode(User.Name) &lt;span style="background: rgb(255,238,98); -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"&gt;%&amp;gt;&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;quot;&amp;gt;&lt;/span&gt;thing&lt;span style="color: blue"&gt;&amp;lt;/&lt;/span&gt;&lt;span style="color: rgb(163,21,21)"&gt;a&lt;/span&gt;&lt;span style="color: blue"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;
To use from your code-behind, decide whether your control outputs its content as
an attribute or in an HTML context, and then call the appropriate method:
&lt;/p&gt;
&lt;pre class="code"&gt;Label1.Text = &lt;span style="color: rgb(43,145,175)"&gt;AntiXss&lt;/span&gt;.HtmlEncode(User.Name);&lt;/pre&gt;
&lt;p&gt;
Deciding which context you're in and which encoding method to use is a major annoyance,
so be sure to look at the Security Runtime Engine, which does it for you. I'll write
more about that in a future blog post, so please &lt;a href="http://feeds2.feedburner.com/pageofwords"&gt;subscribe
to my RSS&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
Hopefully this article has helped you understand what encoding is, why you need to
encode untrusted input, and how that helps prevent cross-site scripting, and has given
you a little intro to the AntiXSS library.
&lt;/p&gt;
&lt;p&gt;
Kirk
&lt;/p&gt;
</description>
      <comments>http://pageofwords.com/blog/CommentView,guid,4dd3d79c-7169-401e-8ecd-71f75c4dd2db.aspx</comments>
      <category>AntiXSS;Security;Web</category>
    </item>
  </channel>
</rss>