Description
We often need to construct an RR from its name, type, TTL and content (for example, when reading records from the database or from user API requests.)
We currently use something similar to `NewRR(fmt.Sprintf("%s %d IN %s %s", name, ttl, type, content))`, but that requires a fair amount of hacky extra work:
- The content (or even the name!) could contain semicolons, which would be interpreted as starting a comment.
- The content could contain newlines, causing the A record content `1.2.3.4\n5.6.7.8` to parse without errors (as `1.2.3.4`).
- Perhaps it would even be possible to inject an `$INCLUDE` somehow, so we need to disable that as a precaution.
- We have to deal with special cases around quoting and escaping.
- We ignore leading and trailing whitespace, meaning `fe80::1\t` is considered a valid AAAA record content.
- etc.
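To make the kind of extra work listed above concrete, here is a minimal, stdlib-only sketch of a pre-flight check one might run on the content before formatting it into a zone-file line. `checkContent` is a hypothetical helper, not part of this library, and it only covers three of the cases (surrounding whitespace, embedded newlines, unquoted semicolons); it assumes that a semicolon inside double quotes or after a backslash does not start a comment.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// checkContent is a hypothetical pre-flight check for record content that
// is about to be interpolated into a zone-file line. It is deliberately
// incomplete: it only illustrates a few of the pitfalls.
func checkContent(content string) error {
	if content != strings.TrimSpace(content) {
		return errors.New("leading or trailing whitespace")
	}
	if strings.ContainsAny(content, "\r\n") {
		return errors.New("embedded newline")
	}
	// Walk the content while tracking quoting and backslash escapes, so
	// that only semicolons that would start a comment are rejected.
	inQuote, escaped := false, false
	for _, r := range content {
		switch {
		case escaped:
			escaped = false
		case r == '\\':
			escaped = true
		case r == '"':
			inQuote = !inQuote
		case r == ';' && !inQuote:
			return errors.New("unquoted semicolon would start a comment")
		}
	}
	return nil
}

func main() {
	for _, c := range []string{"1.2.3.4", "1.2.3.4\n5.6.7.8", "fe80::1\t", `"a;b"`, "a ; b"} {
		fmt.Printf("%q: %v\n", c, checkContent(c))
	}
}
```

Even this small check has to know about quoting and escaping rules, which is exactly the sort of logic that would be better kept inside the parser.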
It's also currently impossible to parse generic (RFC 3597) records into a `*dns.RFC3597` when the underlying type is known to this library, because `NewRR` always returns the known type's struct in that case, even when the content starts with a `\#` token. `ToRFC3597` can be used to convert it back to a `*dns.RFC3597`, but that calls `pack()`, which changes behavior depending on the underlying type. (For example, calling `NewRR` with `. 1 IN TYPE9999 \# 3 abcdef` produces Rdata `\# 3 abcdef` as expected, whereas changing `TYPE9999` to `TYPE48` results in `\# 4 abcdef00` instead.)
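For reference, the generic text form from RFC 3597 is just the `\#` token, a decimal byte count, and zero or more words of hex, so it can be decoded without knowing anything about the record type. The following is a rough stdlib-only sketch of that decoding; `parseGeneric` is a hypothetical name, this is not how the library parses it, and splitting with `strings.Fields` is a simplification that is more lenient about whitespace than a strict content parser should be.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseGeneric is a hypothetical decoder for the RFC 3597 generic rdata
// text form: `\# <length> [hex words...]`. It returns the raw rdata bytes
// unchanged, independent of the record type.
func parseGeneric(content string) ([]byte, error) {
	// Simplification: strings.Fields accepts any whitespace between words.
	fields := strings.Fields(content)
	if len(fields) < 2 || fields[0] != `\#` {
		return nil, errors.New(`expected "\# <length> [hex...]"`)
	}
	want, err := strconv.ParseUint(fields[1], 10, 16)
	if err != nil {
		return nil, fmt.Errorf("bad length %q: %w", fields[1], err)
	}
	var data []byte
	for _, word := range fields[2:] {
		// Each word must contain an even number of hex digits.
		b, err := hex.DecodeString(word)
		if err != nil {
			return nil, fmt.Errorf("bad hex word %q: %w", word, err)
		}
		data = append(data, b...)
	}
	if uint64(len(data)) != want {
		return nil, fmt.Errorf("length says %d bytes, got %d", want, len(data))
	}
	return data, nil
}

func main() {
	data, err := parseGeneric(`\# 3 abcdef`)
	fmt.Printf("% x, err=%v\n", data, err)
}
```

With a decoder like this, `\# 3 abcdef` stays three bytes regardless of whether the type is `TYPE9999` or `TYPE48`, which is the round-trip behavior we are after.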
This is far from ideal for our use case, so I've taken a stab at trying to build something more reasonable, which would require additional public interface. In short, I've been trying to build something that parses only the record's content, without any special zone file syntax.
My first thought was that this would be fairly simple to do with a single function (this one isn't super clean; consider it an example or first draft):
```go
func ParseRdata(newFn func() RR, h *RR_Header, origin string, r io.Reader) (RR, error) {
	if origin != "" {
		origin = Fqdn(origin)
		if _, ok := IsDomainName(origin); !ok {
			return nil, &ParseError{"", "bad initial origin name", lex{}}
		}
	}

	c := newZLexer(r)

	var rr RR
	// A nil newFn means: parse as a generic (RFC 3597) record.
	parseAsRFC3597 := newFn == nil

	// If a newFn function was provided but the content starts with
	// a `\#` token, ignore newFn and parse as RFC 3597 regardless.
	if c.Peek().token == `\#` {
		parseAsRFC3597 = true
	}

	if parseAsRFC3597 {
		rr = &RFC3597{Hdr: *h}
	} else {
		rr = newFn()
		*rr.Header() = *h
	}

	if err := rr.parse(c, origin); err != nil {
		// err is a concrete *ParseError without the file field set.
		// There is no file in this context, so return a copy with an
		// empty file. err.lex may be the zero value, in which case
		// it is left unset.
		if err.lex == (lex{}) {
			return nil, &ParseError{err: err.err}
		}
		return nil, &ParseError{"", err.err, err.lex}
	}

	if err := c.Err(); err != nil {
		return nil, err
	}

	// The first draft read one more token here without using it (which
	// does not compile); presumably the intent is to reject any input
	// left over after the record content.
	if n, ok := c.Next(); ok {
		return nil, &ParseError{"", "unexpected data after record content", n}
	}

	return rr, nil
}
```
This gets us most of the way there, but I realized the lexer is also context-aware and parses comments. The former is easily worked around by initializing the `zlexer` with `owner: false`, which makes it expect content instead of an owner directive. The latter is more problematic and seems to require an entirely new, purpose-built lexer. Since `RR.parse()` expects a `*zlexer`, that would involve either making `zlexer` an interface or adding the new logic to the existing `zlexer` struct. In addition, I noticed all `RR.parse()` functions call `slurpRemainder()`, which would then possibly have to be moved out.
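To illustrate the "interface" option, here is a rough, standalone sketch of what a content-only lexer behind a small interface could look like. Everything in it (`tokenSource`, `contentLexer`, `tokens`) is a hypothetical name invented for this example and does not match the library's internal `zlexer` or `lex` types; it only assumes that the parse functions need a stream of blank-separated tokens plus an error accessor.

```go
package main

import (
	"errors"
	"fmt"
)

// tokenSource is a hypothetical interface for what a parse function needs
// from a lexer: a stream of tokens and a way to retrieve a lexing error.
type tokenSource interface {
	Next() (tok string, ok bool)
	Err() error
}

// contentLexer is a hypothetical content-only lexer. It splits on blanks,
// keeps quoted strings and escaped bytes inside a token, treats ';' as an
// ordinary character, and rejects newlines outright.
type contentLexer struct {
	in  string
	pos int
	err error
}

func (l *contentLexer) Err() error { return l.err }

func (l *contentLexer) Next() (string, bool) {
	if l.err != nil {
		return "", false
	}
	// Skip blanks between tokens.
	for l.pos < len(l.in) && (l.in[l.pos] == ' ' || l.in[l.pos] == '\t') {
		l.pos++
	}
	if l.pos >= len(l.in) {
		return "", false
	}
	start, inQuote, escaped := l.pos, false, false
	for ; l.pos < len(l.in); l.pos++ {
		c := l.in[l.pos]
		switch {
		case c == '\n' || c == '\r':
			l.err = errors.New("newline in record content")
			return "", false
		case escaped:
			escaped = false
		case c == '\\':
			escaped = true
		case c == '"':
			inQuote = !inQuote
		case (c == ' ' || c == '\t') && !inQuote:
			return l.in[start:l.pos], true
		}
	}
	if inQuote {
		l.err = errors.New("unterminated quote")
		return "", false
	}
	return l.in[start:], true
}

// tokens drains a tokenSource; it works with any implementation.
func tokens(src tokenSource) []string {
	var out []string
	for tok, ok := src.Next(); ok; tok, ok = src.Next() {
		out = append(out, tok)
	}
	return out
}

func main() {
	fmt.Printf("%q\n", tokens(&contentLexer{in: `10 mail.example.com.`}))
	fmt.Printf("%q\n", tokens(&contentLexer{in: `"v=spf1; -all" extra`}))
}
```

The point of the sketch is only that comment handling and line structure disappear entirely from such a lexer; whether the existing parse functions can be adapted to an interface like this without a performance cost is an open question.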
I'd be happy to work on this, but before I spend too much time on it I wanted to ask whether this is a change you'd consider merging at all (hopefully), and, if so, whether someone with a deeper understanding of the parser and lexer has any design thoughts for me. I know you all tend to be conservative when it comes to adding new public interface (which is good), but I think this could be a useful addition that would enable use cases that are otherwise very difficult to achieve correctly.