public interface Flags
TokenizerProperties object: The setting affects all Tokenizer instances that share this TokenizerProperties object, as well as the TokenizerProperty objects registered in this TokenizerProperties that have not set the flag locally.
Tokenizer (see Tokenizer.changeParseFlags(int, int)): A single Tokenizer can deviate from the setting in the TokenizerProperties object it uses, but still follows the setting for a single TokenizerProperty object. Only a limited number of flags can be set for a Tokenizer, chiefly the "dynamic" flags that apply to the tokenizing process rather than describe an attribute of a TokenizerProperty, e.g. F_COUNT_LINES and F_KEEP_DATA.
TokenizerProperty: This setting affects only the handling of the property and overrules both the settings of the TokenizerProperties that contains the property and the settings of a Tokenizer using the TokenizerProperties object. Only a limited number of flags can be set for a single property, including descriptive flags like F_NO_CASE, F_ALLOW_NESTED_COMMENTS and F_SINGLE_LINE_STRING.
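The precedence between the three levels can be sketched as follows. This is only an illustrative model, not the library's implementation: the class FlagPrecedenceSketch, the nested Level type, the method effective and the bit values are all hypothetical. Each level carries its flag values plus a mask of the flags it sets explicitly, which is one plausible way to express "has not set the flag locally".

```java
/** Sketch of flag resolution across the three levels (all names are illustrative). */
final class FlagPrecedenceSketch {
    // Hypothetical bit values; the real constants may differ.
    static final int F_NO_CASE = 0x1;
    static final int F_RETURN_LINE_COMMENTS = 0x2;

    /** One configuration level: flag values plus a mask of the flags set explicitly there. */
    static final class Level {
        final int flags;
        final int mask;

        Level(int flags, int mask) {
            this.flags = flags;
            this.mask = mask;
        }

        boolean defines(int flag) { return (mask & flag) != 0; }
        boolean isSet(int flag) { return (flags & flag) != 0; }
    }

    /** Property settings win over tokenizer settings, which win over the shared properties object. */
    static boolean effective(int flag, Level properties, Level tokenizer, Level property) {
        if (property.defines(flag)) return property.isSet(flag);
        if (tokenizer.defines(flag)) return tokenizer.isSet(flag);
        return properties.isSet(flag);
    }

    public static void main(String[] args) {
        Level properties = new Level(F_NO_CASE, F_NO_CASE);   // global: case-insensitive
        Level tokenizer = new Level(F_RETURN_LINE_COMMENTS, F_RETURN_LINE_COMMENTS);
        Level keyword = new Level(0, F_NO_CASE);               // this property: explicitly case-sensitive
        Level untouched = new Level(0, 0);                     // sets nothing locally

        System.out.println(effective(F_NO_CASE, properties, tokenizer, keyword));
        System.out.println(effective(F_NO_CASE, properties, tokenizer, untouched));
        System.out.println(effective(F_RETURN_LINE_COMMENTS, properties, tokenizer, untouched));
    }
}
```

In this model the keyword property ignores the global F_NO_CASE setting, while a property that sets nothing locally inherits it.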
See also: TokenizerProperties
Modifier and Type | Field and Description |
---|---|
static short | F_ALLOW_NESTED_COMMENTS: Nested block comments are normally not allowed. |
static short | F_CASE: Deprecated. For properties with a case handling different to the global settings of a TokenizerProperties instance, use the constructor TokenizerProperty(int, java.lang.String[], java.lang.Object, int, int). |
static short | F_COUNT_LINES: Tells a Tokenizer to count lines and columns. |
static short | F_FREE_PATTERN: Treat patterns the same way as whitespaces, separators or special sequences. |
static short | F_KEEP_DATA: Set this flag to let a Tokenizer buffer all data. |
static short | F_NO_CASE: When this flag is set globally for a TokenizerProperties instance (see #setParseFlags), input data is generally treated case-insensitively. |
static short | F_RETURN_BLOCK_COMMENTS: Return block comments. |
static short | F_RETURN_IMAGE_PARTS: By setting this flag for a TokenizerProperties instance, a Tokenizer or a single property, a tokenizer returns not only the token images but also image parts (see Token.getImageParts()). |
static short | F_RETURN_LINE_COMMENTS: Return line comments. |
static short | F_RETURN_SIMPLE_WHITESPACES: Return simple whitespaces. |
static short | F_RETURN_WHITESPACES: In many cases, parsers are not interested in whitespaces. |
static short | F_SINGLE_LINE_STRING: By default, strings are all characters between and including a pair of string start and end sequences, regardless of whether there are line separators in between. |
static short | F_TOKEN_POS_ONLY: For performance and memory reasons, this flag is used to avoid copy operations for every token. |
static final short F_NO_CASE
When this flag is set globally for a TokenizerProperties instance (see #setParseFlags), input data is generally treated case-insensitively. Specific properties may still be treated case-sensitively (with this flag set in the flag mask and cleared in the corresponding flags).
The flag can be set for TokenizerProperties and TokenizerProperty instances. It should not be used dynamically (Tokenizer.changeParseFlags(int, int)).
static final short F_CASE
Deprecated. For properties with a case handling different to the global settings of a TokenizerProperties instance, use the constructor TokenizerProperty(int, java.lang.String[], java.lang.Object, int, int).
This flag complements F_NO_CASE. If F_NO_CASE is set via TokenizerProperties.setParseFlags(int), F_CASE can be used for single properties where case-sensitivity is necessary in spite of the global case-insensitivity.
If neither F_CASE nor F_NO_CASE is set, F_CASE is assumed. If both flags are set, F_CASE takes precedence.
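The resolution rule for the two case flags can be written down compactly. The following is a minimal sketch under assumptions: the class CaseFlagSketch, its methods and the bit values are hypothetical and only illustrate the rule stated above.

```java
/** Sketch of the F_CASE / F_NO_CASE resolution rule (illustrative names and values). */
final class CaseFlagSketch {
    // Hypothetical bit values; the real constants may differ.
    static final int F_NO_CASE = 0x1;
    static final int F_CASE = 0x2;

    /**
     * Neither flag set: F_CASE is assumed (case-sensitive).
     * Both flags set: F_CASE takes precedence (case-sensitive).
     * Only F_NO_CASE set: case-insensitive.
     */
    static boolean isCaseSensitive(int flags) {
        boolean caseFlag = (flags & F_CASE) != 0;
        boolean noCaseFlag = (flags & F_NO_CASE) != 0;
        return caseFlag || !noCaseFlag;
    }

    /** Compares a candidate image with a keyword according to the resolved case handling. */
    static boolean matches(String keyword, String image, int flags) {
        return isCaseSensitive(flags) ? keyword.equals(image) : keyword.equalsIgnoreCase(image);
    }

    public static void main(String[] args) {
        System.out.println(matches("begin", "BEGIN", 0));
        System.out.println(matches("begin", "BEGIN", F_NO_CASE));
        System.out.println(matches("begin", "BEGIN", F_NO_CASE | F_CASE));
    }
}
```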
The flag can be set for TokenizerProperties and TokenizerProperty instances. It should not be used dynamically (Tokenizer.changeParseFlags(int, int)).
static final short F_TOKEN_POS_ONLY
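For performance and memory reasons, this flag is used to avoid copy operations for every token. The token image is not copied into the Token instance, only its position and length in the input stream.
The flag can be set for TokenizerProperties and TokenizerProperty instances. It should also be a dynamic flag that can be switched on and off during runtime using Tokenizer.changeParseFlags(int, int).
static final short F_KEEP_DATA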
Set this flag to let a Tokenizer buffer all data. Usually, a tokenizer applies a strategy to allocate only a reasonable amount of memory.
The flag can be set for TokenizerProperties and Tokenizer objects, but not for single TokenizerProperty instances. It could also be a dynamic flag that can be switched on and off during runtime of a tokenizer (Tokenizer.changeParseFlags(int, int)), although it is generally set before parsing starts.
static final short F_COUNT_LINES
Tells a Tokenizer to count lines and columns. The tokenizer may use java.lang.System.getProperty("line.separator") to obtain the end-of-line sequence, or accept different line separator sequences for better portability: a single carriage return (classic Mac OS), a single line feed (Unix), or a combination of carriage return and line feed (Windows OS).
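The portable variant, accepting all three separator conventions in the same input, can be sketched as follows. This is a minimal illustration rather than the code of any actual Tokenizer implementation; the class LineCountSketch and its method are hypothetical names.

```java
/** Sketch of portable line-separator counting (illustrative, not a Tokenizer implementation). */
final class LineCountSketch {

    /** Counts line separators, treating "\r\n" as one separator and lone '\r' or '\n' as one each. */
    static int countLineSeparators(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                count++;
                // A line feed directly after a carriage return belongs to the same separator.
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
            } else if (c == '\n') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // One Windows, one Unix and one classic Mac OS separator in the same input.
        System.out.println(countLineSeparators("a\r\nb\nc\rd"));
    }
}
```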
Line separators inside some tokens may go uncounted, since it is not guaranteed that a Tokenizer implementation finds these occurrences. This is in order to maintain good performance, since otherwise there would be a potentially huge number of unsuccessful newline scans in these tokens. If you cannot live with the described limitation, consider defining special sequences for '\r', '\n' and '\r\n' alone and removing them from the whitespace set.
The flag can be set for TokenizerProperties and Tokenizer objects, but not for single TokenizerProperty instances. It could also be a dynamic flag that can be switched on and off during runtime of a tokenizer, although it is generally set before parsing starts.
static final short F_ALLOW_NESTED_COMMENTS
Nested block comments are normally not allowed; this flag allows them.
The flag can be set for TokenizerProperties and TokenizerProperty instances. It should not be used dynamically (as in versions of JTopas prior to 0.8).
static final short F_FREE_PATTERN
Treat patterns the same way as whitespaces, separators or special sequences.
The flag can be set for TokenizerProperties and TokenizerProperty instances. It should not be used dynamically.
static final short F_RETURN_SIMPLE_WHITESPACES
Return simple whitespaces, i.e. the whitespaces defined via #setWhitespaces. The flag is part of the composite mask F_RETURN_WHITESPACES.
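How a composite mask relates to its member flags can be illustrated with plain bit operations. The sketch below rests on assumptions: the bit values are hypothetical, and the change method shows only the conventional flags-plus-mask update that a two-argument call such as Tokenizer.changeParseFlags(int, int) suggests, not its verified behaviour.

```java
/** Sketch of a composite flag mask and a flags/mask update (illustrative values and names). */
final class WhitespaceMaskSketch {
    // Hypothetical bit values; the real constants may differ.
    static final int F_RETURN_SIMPLE_WHITESPACES = 0x1;
    static final int F_RETURN_BLOCK_COMMENTS = 0x2;
    static final int F_RETURN_LINE_COMMENTS = 0x4;

    // The composite mask is the bitwise OR of its three member flags.
    static final int F_RETURN_WHITESPACES =
            F_RETURN_SIMPLE_WHITESPACES | F_RETURN_BLOCK_COMMENTS | F_RETURN_LINE_COMMENTS;

    /** Only the bits selected by mask are changed; they take their new values from flags. */
    static int change(int current, int flags, int mask) {
        return (current & ~mask) | (flags & mask);
    }

    public static void main(String[] args) {
        // Switch on all three members at once via the composite mask.
        int all = change(0, F_RETURN_WHITESPACES, F_RETURN_WHITESPACES);
        // Switch line comments off again, leaving the other two members untouched.
        int withoutLineComments = change(all, 0, F_RETURN_LINE_COMMENTS);
        System.out.println((withoutLineComments & F_RETURN_BLOCK_COMMENTS) != 0);
        System.out.println((withoutLineComments & F_RETURN_LINE_COMMENTS) != 0);
    }
}
```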
The flag can be set for TokenizerProperties and Tokenizer, but not for single TokenizerProperty instances. It is also a dynamic flag that can be switched on and off during runtime of a tokenizer (Note: flags for a single TokenizerProperty take precedence over other settings).
static final short F_RETURN_BLOCK_COMMENTS
Return block comments. The flag is part of the composite mask F_RETURN_WHITESPACES.
The flag can be set for TokenizerProperties, Tokenizer and single TokenizerProperty instances. It is also a dynamic flag that can be switched on and off during runtime of a tokenizer (Note: flags for a single TokenizerProperty take precedence over other settings).
static final short F_RETURN_LINE_COMMENTS
Return line comments. The flag is part of the composite mask F_RETURN_WHITESPACES.
The flag can be set for TokenizerProperties, Tokenizer and single TokenizerProperty instances. It is also a dynamic flag that can be switched on and off during runtime of a tokenizer (Note: flags for a single TokenizerProperty take precedence over other settings).
static final short F_RETURN_WHITESPACES
In many cases, parsers are not interested in whitespaces. Where they are, this composite mask combines F_RETURN_SIMPLE_WHITESPACES, F_RETURN_BLOCK_COMMENTS and F_RETURN_LINE_COMMENTS, and can be set either generally for a TokenizerProperties or a single Tokenizer object, or even more specifically for a single TokenizerProperty.
static final short F_SINGLE_LINE_STRING
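By default, strings are all characters between and including a pair of string start and end sequences, regardless of whether there are line separators in between. This flag restricts strings to a single line, either for a TokenizerProperties instance in general or for a single string property.
The flag can be set for TokenizerProperties and TokenizerProperty instances. It should not be used dynamically.
static final short F_RETURN_IMAGE_PARTS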
By setting this flag for a TokenizerProperties instance, a Tokenizer or a single property, a tokenizer returns not only the token images but also image parts (see Token.getImageParts()).
The flag can be set for TokenizerProperties, Tokenizer and single TokenizerProperty instances.