* PromQL: Use a sync.Pool for the generated parser structure
The generated PromQL parser allocates a struct about 4kb in size on every run.
This puts a high load on the garbage collector.
To reduce that load, a sync.Pool is used to recycle these structures.
On small queries this makes parsing 2-3 times faster.
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
During the PromQL parser rewrite there was some logic put in place that allowed switching between the non generated and the generated parser. Since the parser is now fully generated this is not needed anymore.
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
* Add parser method to produce errors messages about unexpected items
* PromQL: use parser.unexpected in generated parser
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
* Add grammar for label_sets
* Parse label Sets using the generated parser
* Allow trailing commas for label sets and selectors
* Add test to trigger all possible error messages for label matchers
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
The most common format (used by go, gcc and clang) for compiler error positions seems to be
`filename:line:char:` or `line:char:` if the filename is unknown.
This PR adapts the PromQL parser to use this convention.
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
* promql: Allow injecting fake tokens into the generated parser
Yacc grammars do not support having multiple start symbols.
To work around that restriction, it is possible to inject fake tokens into the lexer stream,
as described here https://www.gnu.org/software/bison/manual/html_node/Multiple-start_002dsymbols.html .
This is part of the parser rewrite effort described in #6256.
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
For yacc generated parsers there is the convention to capitalize the names of item types provided by the lexer, which makes it easy to distinct lexer tokens (capitalized) from nonterminal symbols (not capitalized) in language grammars.
This convention is also followed by the (non generated) go compiler (see https://golang.org/pkg/go/token/#Token).
Part of the parser rewrite described in #6256.
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
This is the first step towards a generated lexer as described in #6256.
It adds methods to the parser struct, that make it implement the yyLexer interface required by a yacc generated parser, as described here: https://godoc.org/golang.org/x/tools/cmd/goyacc .
The yyLexer interface is implemented by the parser struct instead of the lexer struct for the following reasons:
* Both parsers have a lookahead that the lexer does not know about. This solution makes it possible to synchronize these lookaheads when switching parsers.
* The routines to handle parser errors are not accessible to the lexer.
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
* promql: Clean up parser struct
The parser struct used two have two somewhat misused fields:
peekCount int
token [3]item
By reading the code carefully one notices, that peekCount always has the value 0 or 1 and that only the first element of token is ever accessed.
To make this clearer, this commit replaces the token array with a single variable and the peekCount int with a boolean.
Signed-off-by: Tobias Guggenmos <tguggenm@redhat.com>
i) Uses the more idiomatic Wrap and Wrapf methods for creating nested errors.
ii) Fixes some incorrect usages of fmt.Errorf where the error messages don't have any formatting directives.
iii) Does away with the use of fmt package for errors in favour of pkg/errors
Signed-off-by: tariqibrahim <tariq181290@gmail.com>
* Expose lexer item types
We have generally agreed to expose AST types / values that are necessary
to make sense of the AST outside of the promql package. Currently the
`UnaryExpr`, `BinaryExpr`, and `AggregateExpr` AST nodes store the lexer
item type to indicate the operator type, but since the individual item
types aren't exposed, an external user of the package cannot determine
the operator type. So this PR exposes them.
Although not all item types are required to make sense of the AST (some
are really only used in the lexer), I decided to expose them all here to
be somewhat more consistent. Another option would be to not use lexer
item types at all in AST nodes.
The concrete motivation is my work on the PromQL->Flux transpiler, but
this ought to be useful for other cases as well.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
* Fix item type names in tests
Signed-off-by: Julius Volz <julius.volz@gmail.com>
Although it is spelling mistakes, it might make an affects
while reading.
Co-Authored-By: Kim Bao Long longkb@vn.fujitsu.com
Signed-off-by: Nguyen Hai Truong <truongnh@vn.fujitsu.com>
When there was an error in the parser, the
lexer goroutine was left running.
Also make runtime panic test actually test things.
Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* parser test cleanup
- Test against the exported package functions instead of the private functions.
* Improves readability of TestParseSeries
- Moves package function closer to parser function
For instant vectors, if "stale" is the newest sample
ignore the timeseries.
For range vectors, filter out "stale" samples.
Make it possible to inject "stale" samples in promql tests.
The current separation between lexer and parser is a bit fuzzy when it
comes to operators, aggregators and other keywords. The lexer already
tries to determine the type of a token, even though that type might
change depending on the context.
This led to the problematic behavior that no tokens known to the lexer
could be used as label names, including operators (and, by, ...),
aggregators (count, quantile, ...) or other keywords (for, offset, ...).
This change additionally checks whether an identifier is one of these
types. We might want to check whether the specific item identification
should be moved from the lexer to the parser.