Context-aware Autoescaping in Go
I’ve been working with server-side generated HTML for several years now, and the problem of code injection into HTML pages has been pervasive. A couple of days back, I discovered something fantastic that Go has built right into the standard library to help with this: context-aware autoescaping in HTML templates.
I’ve never really worked with Go - I’m running these examples on the Go playground - but to begin with, let’s look at the normal templating that Go offers.
package main
import (
"os"
"text/template"
)
func main() {
data := make(map[string]string)
data["name"] = "Harry"
data["school"] = "Hogwarts"
tmplContent := "{{ .name }} goes to {{ .school }}"
tmpl, _ := template.New("test").Parse(tmplContent)
tmpl.Execute(os.Stdout, data)
}
and as output, we get "Harry goes to Hogwarts"
. Apart from the unfamiliar syntax, this seems simple enough. Pretty much the same thing as string formatting in, say, Python. This code:
data = {
'name': 'Harry',
'school': 'Hogwarts'
}
tmplContent = "{name} goes to {school}"
print(tmplContent.format(**data))
# or tmplContent.format(name="Harry", school="Hogwarts")
gives us the same output.
However, this gets interesting when you try to generate HTML the same way. If the data was instead
data["name"] = "Harry<script>alert('you have been pwned')</script>"
data["school"] = "Hogwarts"
we’d get "Harry<script>alert('you have been pwned')</script> goes to Hogwarts"
which is definitely not nice. But here’s where the cool stuff starts. Instead of the text/template
, let’s use the drop-in replacement html/template
.
package main
import (
"os"
"html/template" // note the change here
)
func main() {
data := make(map[string]string)
data["name"] = "Harry<script>alert('you have been pwned')</script>"
data["school"] = "Hogwarts"
tmplContent := "{{ .name }} goes to {{ .school }}"
tmpl, _ := template.New("test").Parse(tmplContent)
tmpl.Execute(os.Stdout, data)
}
Try it out on the Go playground. The output is "Harry<script>alert('you have been pwned')</script> goes to Hogwarts"
. The troublesome characters were escaped automagically, and now we don’t have a code injection anymore. Awesome!
So we’ve seen the autoescaping, but what’s this about “context-awareness”? That’s the part that really impressed me. Go’s html/template
package understands HTML, along with CSS, Javascript and URIs. It knows what kind of escaping needs to be done where.
Let’s see a few examples to make better sense of this. For the data string "<b>You're</b> weird"
, we get the following outputs from the following templates. (.
can be used to represent the entire single input given to the template.)
<p>{{ . }}</p>
=> <p><b>You're</b> weird</p>
<p class="{{ . }}"></p>
=> <p class="<b>You're</b> weird"></p>
(this time the single quote got encoded too)
<a href="{{ . }}"></a>
=> <a href="%3cb%3eYou%27re%3c/b%3e%20weird"></a>
(and there's URL encoding)
<script>var s = '{{ . }}';</script>
=> <script>var s = '\x3cb\x3eYou\x27re\x3c\/b\x3e weird';</script>
That’s just HTML content, attributes, URLs, and Javascript. There’s a lot more! Go’s docs for the html/template package have detailed information.
I thought this was a fantastic way to prevent accidental code injection in HTML. Developers are prone to make mistakes, and there have been an enormous number of cases (Facebook, Stack overflow, server-side React, etc.) involving XSS vulnerabilities in websites. A templating system that intelligently handles escaping for you? Kudos!