11 Semgrep Rules for Go Web Projects
I’ve mentioned semgrep a few times in recent articles, and I thought it would be good to introduce this new(ish) tool and demonstrate a few rules that you can use to find problems in your Go web apps.
At the end of this article you will:
- understand what semgrep is and what it can do
- have some idea of the limits of semgrep’s power
- have some rules that you can immediately apply to your own projects
Fair warning: I am not a semgrep expert by any stretch of the imagination. If you are, and you think these rules can be improved, please drop a note to brian at universalglue.dev.

Semgrep isn’t named for semaphore flags… but it does offer a pretty good signal.
What is semgrep anyway?
Semgrep is a static analysis tool that allows users to write custom rules that match against patterns found in source code. The tool comes with rules to detect errors – especially security errors – in multiple languages. It has support for a bunch of programming languages including Go.
There is an interactive tutorial that I highly recommend. It only takes a few minutes to go through it.
For this article, I’m just going to show rules and sample code that exercises the rules. Installation is simple: semgrep ships as a Python CLI and can be installed into a Python virtual environment in your local directory (this “just worked” for me on Debian and Ubuntu systems):
$ python3 -m venv .venv
$ . .venv/bin/activate
(.venv) $ pip install semgrep
(.venv) $ semgrep --version
We’ll put all of our rules and sample code in a rules/ directory:
(.venv) $ mkdir rules
Rule 1: AbortWithStatus Should Immediately be Followed by return
Last week I mentioned that calling c.AbortWithStatus without calling return is an easy error to make when writing a gin handler. Let’s see how we can catch that error with a simple semgrep rule.
At the top level of this rule is patterns, which means that all the child elements must match.
The first child is pattern-either, which will match if any of its children match. Its children are simple patterns that match any of the AbortWithXXX function calls we want to make sure are followed by a return.
The next three children are pattern-not-inside. If any of these patterns is present in the code, that child fails to match, and so the top-level rule does not match either.
The overall effect of this rule is to say “if any of these match BUT not if any of these other things match”, then log a warning.
Here is rules/abortwithstatus-followed-by-return.yaml:
---
rules:
  - id: abortwithstatus-followed-by-return
    languages: [go]
    message: c.AbortWithError, AbortWithStatus, and AbortWithStatusJSON should always be followed by return
    severity: WARNING
    patterns:
      - pattern-either:
          - pattern: $C.AbortWithError(...)
          - pattern: $C.AbortWithStatus(...)
          - pattern: $C.AbortWithStatusJSON(...)
      - pattern-not-inside: |
          $C.AbortWithError(...)
          return
      - pattern-not-inside: |
          $C.AbortWithStatus(...)
          return
      - pattern-not-inside: |
          $C.AbortWithStatusJSON(...)
          return
We can test rules by putting some code in a file of the same name but with the target extension. In comments in the file we identify each line that should trigger with ruleid: and the name of the rule. On lines that should not trigger, use ok: and the name of the rule. The test runner will ensure that lines preceded by the former comments trigger reports, and that none of the lines preceded by the latter comments do.
Run tests like this:
(.venv) $ semgrep -q --test rules
✓ All tests passed!
(Yes, it’s almost too meta that there are tests for what are almost like tests. But it’s a handy way to make sure the rule works, especially while learning how to use the tool.)
Here is rules/abortwithstatus-followed-by-return.go:
package main
import (
"log"
"net/http"
"github.com/gin-gonic/gin"
)
func test1(c *gin.Context) {
// ok: abortwithstatus-followed-by-return
c.AbortWithError(http.StatusInternalServerError)
return
}
func test2(c *gin.Context) {
// ruleid: abortwithstatus-followed-by-return
c.AbortWithError(http.StatusInternalServerError)
log.Printf("asdf")
}
func test3(c *gin.Context) {
if true {
// ok: abortwithstatus-followed-by-return
c.AbortWithStatus(http.StatusBadRequest)
return
}
}
func test4(c *gin.Context) {
if false {
// ruleid: abortwithstatus-followed-by-return
c.AbortWithStatus(http.StatusInternalServerError)
log.Printf("asdf")
}
}
func test5(c *gin.Context) {
// ok: abortwithstatus-followed-by-return
c.AbortWithStatusJSON(http.StatusInternalServerError)
return
}
func test6(c *gin.Context) {
// ruleid: abortwithstatus-followed-by-return
c.AbortWithStatusJSON(http.StatusInternalServerError)
log.Printf("asdf")
}
func test7(c *gin.Context) {
// ruleid: abortwithstatus-followed-by-return
c.AbortWithStatusJSON(http.StatusInternalServerError)
log.Printf("other stuff in between is not allowed")
return
}
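To see why the missing return matters, here is a minimal sketch using a stand-in context type. Note that fakeContext, buggyHandler, and fixedHandler are hypothetical names invented for this illustration; this is not gin's implementation, just a demonstration that a method call on the context cannot stop the calling handler from running its remaining statements.

```go
package main

import "fmt"

// fakeContext is a hypothetical stand-in for gin.Context, used only to
// illustrate control flow. It is not gin's implementation.
type fakeContext struct {
	status int
	body   []string
}

// AbortWithStatus records the status. Like any ordinary method call, it
// cannot prevent the caller from continuing to execute.
func (c *fakeContext) AbortWithStatus(code int) { c.status = code }

// Write appends a fragment to the response body.
func (c *fakeContext) Write(s string) { c.body = append(c.body, s) }

// buggyHandler forgets the return, so it keeps writing after aborting.
func buggyHandler(c *fakeContext, authorized bool) {
	if !authorized {
		c.AbortWithStatus(403)
	}
	c.Write("secret data")
}

// fixedHandler returns immediately after aborting.
func fixedHandler(c *fakeContext, authorized bool) {
	if !authorized {
		c.AbortWithStatus(403)
		return
	}
	c.Write("secret data")
}

func main() {
	buggy, fixed := &fakeContext{}, &fakeContext{}
	buggyHandler(buggy, false)
	fixedHandler(fixed, false)
	fmt.Println(buggy.status, len(buggy.body)) // 403 1: body written despite the abort
	fmt.Println(fixed.status, len(fixed.body)) // 403 0
}
```

The buggy variant is exactly the shape the rule above is meant to flag: an abort call that is not immediately followed by return.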
Rule 2: Handler Naming Scheme Enforcement
If I don’t have a strong naming scheme, my function names tend to end up a jumbled mix. This rule shows how to enforce a naming scheme.
The first pattern matches a Gin handler function. Note the use of a metavariable $FUNC to match the function name.
The second pattern applies to that metavariable. It uses pattern-not-regex to match (triggering a warning) whenever the function name does not match the given regex.
Even if you hate my naming scheme, hopefully this is clear enough that you can implement a rule for whatever you prefer.
This goes in rules/handler-naming.yaml, with tests in rules/handler-naming.go:
---
rules:
  - id: handler-naming
    languages: [go]
    message: Naming of handlers should be <thing><CrudAction><Method>
    severity: WARNING
    patterns:
      - pattern: |
          func $FUNC($C *gin.Context) {
            ...
          }
      - metavariable-pattern:
          metavariable: $FUNC
          patterns:
            # Regex alternatives avoid having a name like thingDeleteDelete...
            - pattern-not-regex: "^[a-z]+((Index|Show|New|Edit)(Get|Post|Patch)|Delete)$"
package main
import (
"github.com/gin-gonic/gin"
)
// missing http method
// ruleid: handler-naming
func fooIndex(c *gin.Context) {}
// missing crudAction
// ruleid: handler-naming
func bookGet(c *gin.Context) {}
// useless repetition
// ruleid: handler-naming
func thingDeleteDelete(c *gin.Context) {}
// XXX consider allowing exported handlers
// ruleid: handler-naming
func ThingIndexGet(c *gin.Context) {}
// ok: handler-naming
func thingIndexGet(c *gin.Context) {}
// ok: handler-naming
func thingShowGet(c *gin.Context) {}
// ok: handler-naming
func thingNewPost(c *gin.Context) {}
// ok: handler-naming
func thingEditPatch(c *gin.Context) {}
// ok: handler-naming
func thingDelete(c *gin.Context) {}
// ok: handler-naming
func AnythingGoes() {}
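If you want to sanity-check the naming regex outside of semgrep, a small Go program is enough. This is only a sketch: conforms is a hypothetical helper name, and it relies on the fact that this particular regex uses no lookahead, so Go's regexp package (which does not support lookahead) accepts it unchanged. Semgrep itself uses a different regex engine, so treat this as a check of the pattern's intent rather than of semgrep's behavior.

```go
package main

import (
	"fmt"
	"regexp"
)

// handlerName is the same regex used in the handler-naming rule.
var handlerName = regexp.MustCompile(`^[a-z]+((Index|Show|New|Edit)(Get|Post|Patch)|Delete)$`)

// conforms reports whether name follows the <thing><CrudAction><Method> scheme.
func conforms(name string) bool {
	return handlerName.MatchString(name)
}

func main() {
	names := []string{
		"fooIndex",          // missing http method
		"bookGet",           // missing crud action
		"thingDeleteDelete", // useless repetition
		"ThingIndexGet",     // exported
		"thingIndexGet",     // conforms
		"thingDelete",       // conforms
	}
	for _, name := range names {
		fmt.Printf("%-18s conforms=%v\n", name, conforms(name))
	}
}
```

Only the last two names conform, which lines up with the ruleid: and ok: annotations in the test file above.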
Rule 3: r.GET/r.POST use correct handler
Email subscribers know that using snippets can greatly reduce copy-paste errors, but if you’re someone who hasn’t jumped on the snippet bandwagon yet, you might still occasionally make a copy-paste error.
One area I’ve made this mistake is when adding a new route. It’s easy, just copy-paste an existing route, change a couple of things, and you’re done… unless you forget to change the handler name, and you connect a new POST route to an existing GET handler.
This is very likely to be caught in tests, but it might be convenient to have a rule to warn you before you have to catch it in a test.
The first child says that this should only match inside a function that takes the router (a *gin.Engine) as its argument. Note that it uses the metavariable $R to match the router variable. This is used below.
The second child says that either of its children can match. Each of those children is in turn a list of subpatterns that all must match. The first subpattern matches on a call to the router’s ($R) GET method, capturing the $HANDLER function name in a metavariable. Then there’s a metavariable subpattern that will match (triggering a warning) if that handler does not match a regex, in this case one ending with “Get”, which the previous rule requires of all GET handlers.
A similar pair of subpatterns handles POST handlers. Adding subpatterns and tests for the other methods you have in your app is left as a fun exercise for the reader.
This goes in rules/route-handlers.yaml and rules/route-handlers.go:
---
rules:
  - id: route-handlers
    languages: [go]
    message: Make sure route handler functions match the method
    severity: WARNING
    patterns:
      - pattern-inside: |
          func $FUNC($R *gin.Engine) {
            ...
          }
      - pattern-either:
          - patterns:
              - pattern: $R.GET(..., $HANDLER)
              - metavariable-pattern:
                  metavariable: $HANDLER
                  patterns:
                    - pattern-not-regex: "Get$"
          - patterns:
              - pattern: $R.POST(..., $HANDLER)
              - metavariable-pattern:
                  metavariable: $HANDLER
                  patterns:
                    - pattern-not-regex: "Post$"
package main
import "github.com/gin-gonic/gin"
func setupRoutes(r *gin.Engine) {
// ruleid: route-handlers
r.POST("/blah", blahIndexGet)
// ruleid: route-handlers
r.GET("/blah", blahIndexPost)
// ruleid: route-handlers
r.GET("/blah", blahIndexHandler)
// ok: route-handlers
r.POST("/blah", blahIndexPost)
// ok: route-handlers
r.GET("/blah", blahIndexGet)
}
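The check this rule performs boils down to "does the handler name end with the suffix implied by the HTTP method?". Here is a rough sketch of that logic in plain Go. The names wantSuffix and mismatched are hypothetical, and the sketch only models the rule's intent; it says nothing about how semgrep evaluates the patterns internally.

```go
package main

import (
	"fmt"
	"strings"
)

// wantSuffix maps an HTTP method to the handler-name suffix that the
// naming scheme from Rule 2 implies. Only GET and POST are covered,
// mirroring the rule above.
var wantSuffix = map[string]string{"GET": "Get", "POST": "Post"}

// mismatched reports whether handler lacks the suffix expected for method.
// Methods without an entry in wantSuffix are never reported.
func mismatched(method, handler string) bool {
	suffix, ok := wantSuffix[method]
	return ok && !strings.HasSuffix(handler, suffix)
}

func main() {
	routes := []struct{ method, handler string }{
		{"POST", "blahIndexGet"},
		{"GET", "blahIndexPost"},
		{"GET", "blahIndexHandler"},
		{"POST", "blahIndexPost"},
		{"GET", "blahIndexGet"},
	}
	for _, r := range routes {
		fmt.Printf("%-4s %-17s mismatched=%v\n", r.method, r.handler, mismatched(r.method, r.handler))
	}
}
```

The first three routes are reported as mismatched and the last two are not, matching the annotations in the test file.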
Rule 4: Templates Match Naming Scheme
Just like with handler functions, it’s easy to end up with templates that have apparently random naming conventions.
It’s also pretty easy to enforce a convention. Note that this rule uses the generic language – which is still an experimental part of semgrep.
This rule enforces a two-level template naming scheme.
The first pattern-inside only allows matches inside a template. Note that this only matches on a two-level template. If you have one- or three-level templates then it won’t match. It also won’t match if you have a “.html” suffix in the template name. Adjust as needed for your codebase.
The metavariable-regex uses a negative lookahead assertion – it won’t match if this template is in the base directory. This is where I keep templates that build the foundation of other templates. If you have other directories like this, or “special” directories, you could add them to this regex so they don’t trigger the rule.
The last pattern uses another negative lookahead to avoid matching when the page conforms to the allowed list of template types.
This rule is pretty rigid – a real-world app would need to be more flexible. But this gives a demonstration of how such a rule can work.
Put this in rules/template-naming.yaml and rules/template-naming.html:
---
rules:
  - id: template-naming
    languages: [generic]
    paths:
      include:
        - "*.html"
    message: html template does not conform to naming scheme
    severity: WARNING
    patterns:
      - pattern-inside: |
          {{ define "$DIR/$PAGE" }}
          ...
          {{ end }}
      - metavariable-regex:
          metavariable: $DIR
          regex: (?!base)
      - metavariable-regex:
          metavariable: $PAGE
          regex: '(?!^(delete|edit|new|list|show)$)'
// ok: template-naming
{{ define "base/blah" }} anything {{ end }}
// ruleid: template-naming
{{ define "thing/blah" }} anything {{ end }}
// ok: template-naming
{{ define "stuff/new" }} anything {{ end }}
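The two negative lookaheads can be a little hard to read, so here is the same decision written as plain string logic. This is a sketch of the rule's intent as I understand it, not of semgrep's matching engine: violatesNaming and allowedPages are hypothetical names, and Go's regexp package has no lookahead support, which is why the sketch avoids regexes entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedPages is the list of page names the rule accepts.
var allowedPages = map[string]bool{
	"delete": true, "edit": true, "new": true, "list": true, "show": true,
}

// violatesNaming reports whether a template name would trigger the rule.
func violatesNaming(name string) bool {
	dir, page, ok := strings.Cut(name, "/")
	// The rule only applies to plain two-level names; one-level names,
	// deeper names, and names with a ".html" suffix are not matched.
	if !ok || strings.ContainsAny(page, "/.") {
		return false
	}
	// (?!base) is not anchored at the end, so any directory whose name
	// starts with "base" is skipped.
	if strings.HasPrefix(dir, "base") {
		return false
	}
	return !allowedPages[page]
}

func main() {
	for _, name := range []string{"base/blah", "thing/blah", "stuff/new"} {
		fmt.Printf("%-10s violates=%v\n", name, violatesNaming(name))
	}
}
```

Only thing/blah is reported, which agrees with the test file above.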
Rule 5: Templates Contain Header/Footer
Except for templates in base/, we want all templates to include the header and footer. If we have a reliable semgrep rule for this, we don’t have to add tests that look for a fragment from the header and footer in all the pages.
As with the previous rule, this also uses the generic language.
The first pattern matches inside a (two-level) template and the second pattern avoids matching in the base directory.
The third pattern, pattern-not-inside, avoids matching (and thus avoids triggering a warning) if the template includes both a header at the top and a footer at the bottom. If we wanted to be slightly more flexible with the placement (e.g. allowing other content above the header or below the footer) we could add ellipses above the header or below the footer.
It’s also worth noting that in semgrep’s generic language an ellipsis will only match up to ten lines. I’ve got five in this rule, so this will work with templates up to 50 lines long. This works for me but you may need to adjust if you have very long templates.
This goes in rules/template-header-footer.yaml and rules/template-header-footer.html:
rules:
  - id: template-header-footer
    languages: [generic]
    paths:
      include:
        - "*.html"
    message: html templates include header+footer
    severity: WARNING
    patterns:
      - pattern-inside: |
          {{ define "$DIR/$PAGE" }}
          ...
          {{ end }}
      - metavariable-regex:
          metavariable: $DIR
          regex: (?!base)
      - pattern-not-inside: |
          {{ define "$DIR/$PAGE" }}
          {{ template "base/header" . }}
          ... ... ... ... ...
          {{ template "base/footer" . }}
          {{ end }}
// ok: template-header-footer
{{ define "base/blah" }}
anything
{{ end }}
// ruleid: template-header-footer
{{ define "thing/blah" }}
anything
{{ end }}
// ok: template-header-footer
{{ define "stuff/new" }}
{{ template "base/header" . }}
anything
{{ template "base/footer" . }}
{{ end }}
// ruleid: template-header-footer
{{ define "stuff/new" }}
anything
{{ template "base/footer" . }}
{{ end }}
// ruleid: template-header-footer
{{ define "stuff/new" }}
{{ template "base/header" . }}
anything
{{ end }}
Rule 6: Templates Post to Self
In the article on handling forms in Gin, I showed a form that is loaded from /books/new and is posted to /books/new. That pattern is something we’ll see show up in multiple places.
This is another place where copy-paste errors can show up: if you copy a template from, say, /books/new into /authors/new and forget to change the action= attribute in the form, then your app will have a bug. You can catch this with a test, but you have to explicitly remember to add a fragment looking for the correct action= in every form.
Here’s a rule that uses patterns similar to what I’ve shown in previous rules to enforce a convention that all forms must post to the route that matches the template in which they are contained.
Put this in rules/template-posts-to-self.yaml and rules/template-posts-to-self.html:
---
rules:
  - id: template-posts-to-self
    languages: [generic]
    paths:
      include:
        - "*.html"
    message: html template form should post to itself
    severity: WARNING
    patterns:
      - pattern-inside: |
          {{ define "$DIR/$TEMPLATE" }}
          ...
          {{ end }}
      - pattern: <form action="...">...</form>
      - pattern-not: <form action="/$DIR/$TEMPLATE">...</form>
{{ define "abc/xyz" }}
<!-- ok: template-posts-to-self -->
<form action="/abc/xyz"></form>
<!-- ruleid: template-posts-to-self -->
<form action="/mno/xyz"></form>
<!-- ruleid: template-posts-to-self -->
<form action="abc/xyz"></form>
<!-- ruleid: template-posts-to-self -->
<form action="xyz"></form>
{{ end }}
Rule 7: Label Must Have Matching Input
Another convention is that every <label> must have an <input> with a matching name and id. Enforcing this in a test requires fragments for every label and name, and requires the html to conform to rigid matching, which makes the tests more fragile. (Or the tests have to use regex matching against the fragments, which makes them more complex.)
Note in the tests that the label and input are allowed to have other attributes before the for=, name=, and id=. This makes the rule less prone to false positives. The extra pattern-not with the attributes in reversed order allows them to appear in the template in either order.
These go in rules/template-label-has-input.yaml and rules/template-label-has-input.html:
---
rules:
  - id: template-label-has-input
    languages: [generic]
    paths:
      include:
        - "*.html"
    message: html template label must have corresponding input
    severity: WARNING
    patterns:
      - pattern-inside: |
          <label ... for="$NAME" ...>...</label>
          ...
      - pattern: <input ...>
      - pattern-not: <input ... name="$NAME" ... id="$NAME" ...>
      - pattern-not: <input ... id="$NAME" ... name="$NAME" ...>
{{ define "abc" }}
<form action="/mno/xyz">
<label for="aaa">Aaa</label>
<!-- ok: template-label-has-input -->
<input type="text" name="aaa" id="aaa">
<label class="my-label" for="aaa">Aaa</label>
<!-- ok: template-label-has-input -->
<input name="aaa" type="text" class="xyz" id="aaa">
<label class="my-label" for="aaa">Aaa</label>
<!-- ok: template-label-has-input -->
<input id="aaa" name="aaa" type="text" class="xyz">
<label for="aaa">Aaa</label>
<!-- ruleid: template-label-has-input -->
<input type="text" name="bbb" id="aaa">
<label for="aaa">Aaa</label>
<!-- ruleid: template-label-has-input -->
<input type="text" name="aaa" id="bbb">
<label for="aaa">Aaa</label>
<!-- ruleid: template-label-has-input -->
<input type="text" name="aaa">
<label for="aaa">Aaa</label>
<!-- ruleid: template-label-has-input -->
<input type="text" id="aaa">
</form>
{{ end }}
Rule 8: Tests Use t.Parallel
The final three rules enforce some conventions in test code.
The first convention is that all tests call t.Parallel. The rule does this by matching on any test function, where “test function” is defined as a function taking a single *testing.T argument.
Two pattern-not clauses are used to exclude code that properly calls t.Parallel as the first statement in the function, and to exclude helper functions that call t.Helper as their first statement.
Put these in rules/tests-are-parallel.yaml and rules/tests-are-parallel.go:
---
rules:
  - id: tests-are-parallel
    languages: [go]
    message: test cases must call t.Parallel
    severity: WARNING
    patterns:
      - pattern: |
          func $F($T *testing.T) {
            ...
          }
      - pattern-not: |
          func $F($T *testing.T) {
            $T.Parallel()
            ...
          }
      - pattern-not: |
          func $F($T *testing.T) {
            $T.Helper()
            ...
          }
package main
import (
"log"
"testing"
)
// ruleid: tests-are-parallel
func test1(t *testing.T) {
// Note: parallel must be called.
}
// ruleid: tests-are-parallel
func test2(t *testing.T) {
// Note: parallel has to be called first.
log.Print("abc")
t.Parallel()
}
// ok: tests-are-parallel
func test3(t *testing.T) {
// Note: parallel is called first. This is ok.
t.Parallel()
log.Print("abc")
}
// ruleid: tests-are-parallel
func testHelper1(t *testing.T) {
// Note: helper has to be called first.
log.Print("abc")
t.Helper()
}
// ok: tests-are-parallel
func testHelper2(t *testing.T) {
// Note: helper is called first. This is ok.
t.Helper()
log.Print("abc")
}
Rule 9: Helper Functions Never Return Error
Another convention for tests is that helper functions should not return errors.
This can be enforced with a rule that matches any function that has *testing.T in the argument list, calls t.Helper, and returns error.
The ellipses (..., $T *testing.T, ...) in the argument list will match if the *testing.T is anywhere in the argument list. In theory this should also work in the return list, but when I was building this rule I found what appears to be a semgrep bug – in the second pattern shown it doesn’t match error anywhere in the return list, it only matches in that specific position. If the linked bug is fixed, this rule would match more flexibly and it should only need the second pattern.
Here are rules/test-helpers-dont-return-error.yaml and rules/test-helpers-dont-return-error.go:
---
rules:
  - id: test-helpers-dont-return-error
    languages: [go]
    message: test helpers must not return error
    severity: WARNING
    pattern-either:
      - pattern: |
          func $F(..., $T *testing.T, ...) error {
            $T.Helper()
            ...
          }
      # XXX The pattern below doesn't work the way it should. See
      # https://github.com/returntocorp/semgrep/issues/4896
      - pattern: |
          func $F(..., $T *testing.T, ...) (..., error, ...) {
            $T.Helper()
            ...
          }
package main
import (
"testing"
)
// ruleid: test-helpers-dont-return-error
func testHelper1(t *testing.T) error {
t.Helper()
return nil
}
// ok: test-helpers-dont-return-error
func testHelper2(t *testing.T) int {
t.Helper()
return 0
}
// ok: test-helpers-dont-return-error
func testHelper3(t *testing.T) {
t.Helper()
}
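For context, here is a sketch of what a helper that follows this convention might look like: it reports failure through t instead of handing an error back to the caller. Everything here is illustrative. The failer interface, mustAtoi, and recorder are hypothetical names; failer is a small subset of the methods *testing.T provides, declared so the sketch can run outside of go test. In a real test file the parameter would simply be t *testing.T.

```go
package main

import (
	"fmt"
	"strconv"
)

// failer is a hypothetical subset of *testing.T's method set, declared
// here so the sketch runs as an ordinary program.
type failer interface {
	Helper()
	Fatalf(format string, args ...any)
}

// mustAtoi is an example helper: it reports bad input through t rather
// than returning an error for every caller to check.
func mustAtoi(t failer, s string) int {
	t.Helper()
	n, err := strconv.Atoi(s)
	if err != nil {
		t.Fatalf("mustAtoi(%q): %v", s, err)
		return 0 // only reached with the fake below; the real Fatalf stops the test
	}
	return n
}

// recorder is a fake failer that remembers failure messages.
type recorder struct{ failures []string }

func (r *recorder) Helper() {}

func (r *recorder) Fatalf(format string, args ...any) {
	r.failures = append(r.failures, fmt.Sprintf(format, args...))
}

func main() {
	r := &recorder{}
	fmt.Println(mustAtoi(r, "42"), len(r.failures))   // 42 0
	fmt.Println(mustAtoi(r, "nope"), len(r.failures)) // 0 1
}
```

The payoff of the convention is at the call site: a test can write n := mustAtoi(t, "42") without an if err != nil block after every helper call.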
Rule 10: t.Error + t.FailNow should be t.Fatal
This rule enforces a maintenance nit: instead of calling t.Error followed by t.FailNow, just call t.Fatal. It uses pattern-either to enforce this against four different flavors of the same code construct.
Put these in rules/error-failnow-fatal.yaml and rules/error-failnow-fatal.go:
---
rules:
  - id: error-failnow-fatal
    languages: [go]
    message: t.Error or t.Log followed by t.FailNow should just call t.Fatal
    severity: WARNING
    pattern-either:
      - pattern: |
          $T.Error(...)
          $T.FailNow()
      - pattern: |
          $T.Errorf(...)
          $T.FailNow()
      - pattern: |
          $T.Log(...)
          $T.FailNow()
      - pattern: |
          $T.Logf(...)
          $T.FailNow()
package main
import "testing"
func test1(t *testing.T) {
// ruleid: error-failnow-fatal
t.Error("abc")
t.FailNow()
}
func test2(t *testing.T) {
// ruleid: error-failnow-fatal
t.Errorf("%s", "abc")
t.FailNow()
}
func test3(t *testing.T) {
// ruleid: error-failnow-fatal
t.Log("abc")
t.FailNow()
}
func test4(t *testing.T) {
// ruleid: error-failnow-fatal
t.Logf("%s", "abc")
t.FailNow()
}
Rule 11: Naming Conventions for Handler Tests
This is just another naming convention. Let’s assume that any test that calls postHasStatus or getHasStatus is a handler-testing function. We can enforce that handler test names use a parallel construction to the handler functions.
The rule works by matching on test functions, where a $HELPER is called, and that helper matches a regex. If we add other similar helper functions we can add them to this regex.
The final regex match against $FUNC is similar to the regex in rule 2 above. Note that it is not anchored at the end, so that if multiple functions are needed to test a given handler they can be given unique suffixes.
Here are rules/test-handler-naming.yaml and rules/test-handler-naming.go:
---
rules:
  - id: test-handler-naming
    languages: [go]
    message: Naming of tests for handlers should be test<Thing><CrudAction><Method><Any>
    severity: WARNING
    patterns:
      - pattern-inside: |
          func $FUNC($T *testing.T) {
            ...
            $HELPER($T, ...)
            ...
          }
      - metavariable-regex:
          metavariable: $HELPER
          regex: "(post|get)HasStatus"
      - metavariable-regex:
          metavariable: $FUNC
          regex: "^test(?!([A-Z][a-zA-Z]*)(Index|Show|New|Edit)(Delete|Get|Patch|Post))"
package main
import "testing"
// Doesn't match because it doesn't use getHasStatus/postHasStatus.
// ok: test-handler-naming
func testFoo(t *testing.T) {}
// Has all the parts.
// ok: test-handler-naming
func testThingIndexGet(t *testing.T) {
getHasStatus(t)
}
// Has all the parts, plus extra.
// ok: test-handler-naming
func testThingIndexGetStuff(t *testing.T) {
getHasStatus(t)
}
// Has Thing, but missing crud+method.
// ruleid: test-handler-naming
func testThing(t *testing.T) {
getHasStatus(t)
}
// Missing "Thing"
// ruleid: test-handler-naming
func testIndexGet(t *testing.T) {
getHasStatus(t)
}
// Missing method
// ruleid: test-handler-naming
func testThingIndex(t *testing.T) {
getHasStatus(t)
}
// Wrong order for crud+method
// ruleid: test-handler-naming
func testThingGetIndex(t *testing.T) {
getHasStatus(t)
}
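The $FUNC regex relies on a negative lookahead, which Go's regexp package does not support, so it can't be pasted into a Go program as-is. As a sanity check, here is a sketch that inverts it: the regex below describes the names that conform, and a name would trigger the rule when it does not match. The names handlerTestName and conforms are hypothetical, and this models only the $FUNC check, not the $HELPER condition.

```go
package main

import (
	"fmt"
	"regexp"
)

// handlerTestName is the positive form of the rule's negative lookahead.
// Like the original, it is deliberately not anchored at the end, so extra
// suffixes are allowed.
var handlerTestName = regexp.MustCompile(`^test[A-Z][a-zA-Z]*(Index|Show|New|Edit)(Delete|Get|Patch|Post)`)

// conforms reports whether a handler test name follows the scheme
// test<Thing><CrudAction><Method><Any>.
func conforms(name string) bool {
	return handlerTestName.MatchString(name)
}

func main() {
	names := []string{
		"testThingIndexGet",      // has all the parts
		"testThingIndexGetStuff", // has all the parts, plus extra
		"testThing",              // missing crud+method
		"testIndexGet",           // missing "Thing"
		"testThingIndex",         // missing method
		"testThingGetIndex",      // wrong order for crud+method
	}
	for _, name := range names {
		fmt.Printf("%-23s conforms=%v\n", name, conforms(name))
	}
}
```

Only the first two names conform; the other four are the ones the rule reports when they call one of the status helpers.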
Coming Up
On Friday we will integrate the tool and these rules into our book tracking project (including fixing up some code that doesn’t yet conform to the conventions).
Next week we will look at Gin validation and error reporting.