Note
Support for handling many Unicode whitespace characters is currently not available, but will be added in the future.
Note
We would like to add a procedure to merge adjacent segments with the same transition destination into a single segment.
Type | Visibility | Attributes | Name | Initial
---|---|---|---|---
type(segment_t) | public | parameter | SEG_ANY | segment_t(UTF8_CODE_MIN, UTF8_CODE_MAX)
type(segment_t) | public | parameter | SEG_CR | segment_t(13, 13)
type(segment_t) | public | parameter | SEG_DIGIT | segment_t(48, 57)
type(segment_t) | public | parameter | SEG_EMPTY | segment_t(UTF8_CODE_EMPTY, UTF8_CODE_EMPTY)
type(segment_t) | public | parameter | SEG_EPSILON | segment_t(-1, -1)
type(segment_t) | public | parameter | SEG_FF | segment_t(12, 12)
type(segment_t) | public | parameter | SEG_INIT | segment_t(UTF8_CODE_MAX+2, UTF8_CODE_MAX+2)
type(segment_t) | public | parameter | SEG_LF | segment_t(10, 10)
type(segment_t) | public | parameter | SEG_LOWERCASE | segment_t(97, 122)
type(segment_t) | public | parameter | SEG_SPACE | segment_t(32, 32)
type(segment_t) | public | parameter | SEG_TAB | segment_t(9, 9)
type(segment_t) | public | parameter | SEG_UNDERSCORE | segment_t(95, 95)
type(segment_t) | public | parameter | SEG_UPPER | segment_t(UTF8_CODE_MAX+1, UTF8_CODE_MAX+1)
type(segment_t) | public | parameter | SEG_UPPERCASE | segment_t(65, 90)
type(segment_t) | public | parameter | SEG_ZENKAKU_SPACE | segment_t(12288, 12288)
This interface block provides the `.in.` operator, which checks whether the left operand is contained in the right operand. It covers three cases: an integer in a segment, an integer in a list of segments, and a segment in a segment.
Checks if the given integer is within the specified segment.
This function determines whether the integer `a` falls within the range defined by the `min` and `max` values of the `segment_t` type.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
integer(kind=int32) | intent(in) | | | a
type(segment_t) | intent(in) | | | seg
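
As a minimal sketch of this check, the program below defines a stand-in `segment_t` and a `.in.` operator for an integer and a segment. The module name `in_segment_sketch_m` and the function name `arg_in_segment` are assumptions made for illustration; this is not the library's own source.

```fortran
module in_segment_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, operator(.in.)

   ! Stand-in for the documented type: an inclusive range [min, max].
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   end type segment_t

   interface operator(.in.)
      module procedure :: arg_in_segment
   end interface

contains

   ! True when seg%min <= a <= seg%max (both bounds inclusive).
   pure function arg_in_segment(a, seg) result(res)
      integer(int32),  intent(in) :: a
      type(segment_t), intent(in) :: seg
      logical :: res

      res = seg%min <= a .and. a <= seg%max
   end function arg_in_segment

end module in_segment_sketch_m

program in_segment_demo
   use, intrinsic :: iso_fortran_env, only: int32
   use :: in_segment_sketch_m
   implicit none
   type(segment_t), parameter :: digit = segment_t(48, 57)   ! same bounds as SEG_DIGIT

   print *, iachar('7', kind=int32) .in. digit   ! code 55 lies inside: true
   print *, iachar('a', kind=int32) .in. digit   ! code 97 lies outside: false
end program in_segment_demo
```
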
Checks if the given integer is within any of the specified segments in a list.
This function determines whether the integer `a` falls within any of the ranges defined by the `min` and `max` values of the `segment_t` elements in the provided list of segments.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
integer(kind=int32) | intent(in) | | | a
type(segment_t) | intent(in) | | | seg_list(:)
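
The list variant can be sketched with the `any` intrinsic applied over the whole array. As before, `segment_t` here is a local stand-in, and the names `in_segment_list_sketch_m` and `arg_in_segment_list` are hypothetical.

```fortran
module in_segment_list_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, operator(.in.)

   ! Stand-in for the documented type: an inclusive range [min, max].
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   end type segment_t

   interface operator(.in.)
      module procedure :: arg_in_segment_list
   end interface

contains

   ! True when a lies in at least one segment; an empty list yields false.
   pure function arg_in_segment_list(a, seg_list) result(res)
      integer(int32),  intent(in) :: a
      type(segment_t), intent(in) :: seg_list(:)
      logical :: res

      res = any(seg_list%min <= a .and. a <= seg_list%max)
   end function arg_in_segment_list

end module in_segment_list_sketch_m

program in_segment_list_demo
   use, intrinsic :: iso_fortran_env, only: int32
   use :: in_segment_list_sketch_m
   implicit none
   ! Digits, uppercase letters, underscore, lowercase letters.
   type(segment_t), parameter :: word(4) = &
      [segment_t(48, 57), segment_t(65, 90), segment_t(95, 95), segment_t(97, 122)]

   print *, iachar('_', kind=int32) .in. word   ! code 95 matches the third segment: true
   print *, iachar('-', kind=int32) .in. word   ! code 45 matches no segment: false
end program in_segment_list_demo
```
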
Checks if one segment is completely within another segment.
This function determines whether the segment `a` is entirely within the range specified by the segment `b`.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | a
type(segment_t) | intent(in) | | | b
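
Containment of one segment in another reduces to comparing both pairs of bounds. The sketch below assumes inclusive bounds and uses a stand-in `segment_t`; the names `segment_in_segment_sketch_m` and `segment_in_segment` are illustrative only.

```fortran
module segment_in_segment_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, operator(.in.)

   ! Stand-in for the documented type: an inclusive range [min, max].
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   end type segment_t

   interface operator(.in.)
      module procedure :: segment_in_segment
   end interface

contains

   ! True when every code point of a is also a code point of b.
   pure function segment_in_segment(a, b) result(res)
      type(segment_t), intent(in) :: a, b
      logical :: res

      res = b%min <= a%min .and. a%max <= b%max
   end function segment_in_segment

end module segment_in_segment_sketch_m

program segment_in_segment_demo
   use :: segment_in_segment_sketch_m
   implicit none
   type(segment_t), parameter :: digit     = segment_t(48, 57)
   type(segment_t), parameter :: printable = segment_t(32, 126)

   print *, digit .in. printable            ! 48..57 sits inside 32..126: true
   print *, printable .in. digit            ! the wider range does not fit: false
   print *, segment_t(50, 60) .in. digit    ! partial overlap only: false
end program segment_in_segment_demo
```
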
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | seg
type(segment_t) | intent(in) | | | list(:)
This interface block provides a not-equal operator for comparing segments.
Checks if two segments are not equivalent.
This function determines whether the segment `a` is not equivalent to the segment `b`, meaning their `min` or `max` values differ.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | a
type(segment_t) | intent(in) | | | b
This interface block provides an equality operator for comparing segments.
Checks if one segment is exactly equal to another segment.
This function determines whether the segment `a` is equivalent to the segment `b`, meaning both their `min` and `max` values are identical.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | a
type(segment_t) | intent(in) | | | b
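
The two comparison operators described above can be sketched by extending `==` and `/=` for a stand-in `segment_t`. The module and function names below are assumptions chosen for the example, not taken from the library source.

```fortran
module segment_compare_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, operator(==), operator(/=)

   ! Stand-in for the documented type: an inclusive range [min, max].
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   end type segment_t

   interface operator(==)
      module procedure :: segment_equivalent
   end interface

   interface operator(/=)
      module procedure :: segment_not_equivalent
   end interface

contains

   ! Equal only when both bounds match.
   pure function segment_equivalent(a, b) result(res)
      type(segment_t), intent(in) :: a, b
      logical :: res

      res = a%min == b%min .and. a%max == b%max
   end function segment_equivalent

   ! Not equal as soon as either bound differs.
   pure function segment_not_equivalent(a, b) result(res)
      type(segment_t), intent(in) :: a, b
      logical :: res

      res = a%min /= b%min .or. a%max /= b%max
   end function segment_not_equivalent

end module segment_compare_sketch_m

program segment_compare_demo
   use :: segment_compare_sketch_m
   implicit none
   type(segment_t), parameter :: lower = segment_t(97, 122)

   print *, lower == segment_t(97, 122)   ! same bounds: true
   print *, lower /= segment_t(97, 122)   ! same bounds: false
   print *, lower /= segment_t(97, 121)   ! max differs: true
end program segment_compare_demo
```
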
This derived type represents a contiguous range of the Unicode character set as a `min` and `max` value. It provides an efficient way to represent ranges of characters when building automata in which a range of characters shares the same transition destination.
Type | Visibility | Attributes | Name | Initial
---|---|---|---|---
integer(kind=int32) | public | | max | UTF8_CODE_MAX+2
integer(kind=int32) | public | | min | UTF8_CODE_MAX+2
procedure, public :: print => segment_for_print | |
procedure, public :: validate => segment_is_valid |
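
The default component values mean that a segment declared without an explicit value carries the same bounds as `SEG_INIT` in the constants table. The sketch below illustrates this with a stand-in type; `CODE_MAX_ASSUMED` is a hypothetical placeholder for `UTF8_CODE_MAX`, whose actual value is not stated on this page.

```fortran
module segment_default_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, CODE_MAX_ASSUMED

   ! Assumed value standing in for UTF8_CODE_MAX; the real constant may differ.
   integer(int32), parameter :: CODE_MAX_ASSUMED = 2097151_int32

   ! Stand-in type with the same default-initialization pattern as documented.
   type :: segment_t
      integer(int32) :: min = CODE_MAX_ASSUMED + 2
      integer(int32) :: max = CODE_MAX_ASSUMED + 2
   end type segment_t

end module segment_default_sketch_m

program segment_default_demo
   use :: segment_default_sketch_m
   implicit none
   type(segment_t) :: untouched   ! components take their default values
   type(segment_t) :: lower

   lower = segment_t(97, 122)     ! one segment covers all 26 lowercase letters

   ! Both bounds of the untouched segment equal the sentinel: prints two true values.
   print *, untouched%min == CODE_MAX_ASSUMED + 2, untouched%max == CODE_MAX_ASSUMED + 2
   ! Number of code points covered by the single segment: 26.
   print *, lower%max - lower%min + 1
end program segment_default_demo
```
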
This function converts an input symbol into the segment corresponding to it.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | symbol
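
One way to picture this conversion is to decode the symbol's UTF-8 bytes into a code point and wrap it in a single-point segment. The sketch below makes several assumptions: the name `symbol_to_segment_sketch`, the sentinel `EMPTY_CODE_ASSUMED` used for an empty symbol, and that `ichar` maps each byte to 0..255 (as common compilers do). It does not validate malformed input and may differ from the library's behavior.

```fortran
module symbol_to_segment_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, symbol_to_segment_sketch

   ! Stand-in for the documented type: an inclusive range [min, max].
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   end type segment_t

   ! Assumed sentinel for an empty symbol; the real UTF8_CODE_EMPTY may differ.
   integer(int32), parameter :: EMPTY_CODE_ASSUMED = 0_int32

contains

   ! Decode the first UTF-8 character of symbol into a one-point segment.
   pure function symbol_to_segment_sketch(symbol) result(seg)
      character(len=*), intent(in) :: symbol
      type(segment_t) :: seg
      integer(int32) :: lead, code
      integer :: i, n

      if (len(symbol) == 0) then
         seg = segment_t(EMPTY_CODE_ASSUMED, EMPTY_CODE_ASSUMED)
         return
      end if

      lead = ichar(symbol(1:1), kind=int32)
      if (lead < 128) then          ! 0xxxxxxx: one byte
         n = 1
         code = lead
      else if (lead < 224) then     ! 110xxxxx: two bytes
         n = 2
         code = iand(lead, 31_int32)
      else if (lead < 240) then     ! 1110xxxx: three bytes
         n = 3
         code = iand(lead, 15_int32)
      else                          ! 11110xxx: four bytes
         n = 4
         code = iand(lead, 7_int32)
      end if

      ! Each continuation byte (10xxxxxx) contributes six more bits.
      do i = 2, min(n, len(symbol))
         code = ior(ishft(code, 6), iand(ichar(symbol(i:i), kind=int32), 63_int32))
      end do

      seg = segment_t(code, code)
   end function symbol_to_segment_sketch

end module symbol_to_segment_sketch_m

program symbol_to_segment_demo
   use :: symbol_to_segment_sketch_m
   implicit none
   type(segment_t) :: seg

   seg = symbol_to_segment_sketch('a')
   print *, seg%min, seg%max   ! 97 97

   ! Bytes E3 80 80 encode U+3000, the code point used by SEG_ZENKAKU_SPACE.
   seg = symbol_to_segment_sketch(char(227)//char(128)//char(128))
   print *, seg%min, seg%max   ! 12288 12288
end program symbol_to_segment_demo
```
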
This function takes an array of segments and a character as arguments, and returns, as a rank-1 array, the segment to which `symbol` belongs (that is, the segment whose interval includes it).
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | segments(:)
character(len=*) | intent(in) | | | symbol
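
A simplified sketch of this lookup filters the array with the `pack` intrinsic. To stay short it assumes a non-empty, single-byte (ASCII) symbol; the function name `segments_containing` is hypothetical, and the real function may handle multi-byte symbols and shape its result differently.

```fortran
module which_segment_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, segments_containing

   ! Stand-in for the documented type: an inclusive range [min, max].
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   end type segment_t

contains

   ! Return every segment whose interval includes the symbol's code.
   ! Assumption: symbol is non-empty and its first character is ASCII.
   pure function segments_containing(segments, symbol) result(res)
      type(segment_t),  intent(in) :: segments(:)
      character(len=*), intent(in) :: symbol
      type(segment_t), allocatable :: res(:)
      integer(int32) :: code

      code = iachar(symbol(1:1), kind=int32)
      res = pack(segments, segments%min <= code .and. code <= segments%max)
   end function segments_containing

end module which_segment_sketch_m

program which_segment_demo
   use :: which_segment_sketch_m
   implicit none
   type(segment_t), parameter :: classes(3) = &
      [segment_t(48, 57), segment_t(65, 90), segment_t(97, 122)]
   type(segment_t), allocatable :: hit(:)

   hit = segments_containing(classes, 'q')
   print *, size(hit), hit(1)%min, hit(1)%max   ! 1 97 122

   hit = segments_containing(classes, '-')
   print *, size(hit)                           ! 0: no segment contains '-'
end program which_segment_demo
```
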
Checks if the given integer is within the specified segment.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
integer(kind=int32) | intent(in) | | | a
type(segment_t) | intent(in) | | | seg
Checks if the given integer is within any of the specified segments in a list.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
integer(kind=int32) | intent(in) | | | a
type(segment_t) | intent(in) | | | seg_list(:)
Checks if one segment is completely within another segment.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | a
type(segment_t) | intent(in) | | | b
Checks if one segment is exactly equal to another segment.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | a
type(segment_t) | intent(in) | | | b
Converts a segment to a printable string representation.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
class(segment_t) | intent(in) | | | seg
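
The output format of this function is not described here, so the following is only one possible rendering, written as a type-bound function on a stand-in type. The helper `code_label` and the choice to show printable ASCII as characters and everything else as `<code>` are assumptions for this sketch.

```fortran
module segment_print_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t

   ! Stand-in type with a type-bound print function, mirroring the documented binding.
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   contains
      procedure :: print => segment_for_print
   end type segment_t

contains

   ! Render a single point as one label and a wider range as [lo-hi].
   function segment_for_print(seg) result(res)
      class(segment_t), intent(in) :: seg
      character(:), allocatable :: res

      if (seg%min == seg%max) then
         res = code_label(seg%min)
      else
         res = '[' // code_label(seg%min) // '-' // code_label(seg%max) // ']'
      end if
   end function segment_for_print

   ! Hypothetical helper: visible ASCII as itself, anything else as <code>.
   function code_label(code) result(res)
      integer(int32), intent(in) :: code
      character(:), allocatable :: res
      character(len=12) :: buf

      if (33 <= code .and. code <= 126) then
         res = achar(code)
      else
         write (buf, '(i0)') code
         res = '<' // trim(buf) // '>'
      end if
   end function code_label

end module segment_print_sketch_m

program segment_print_demo
   use :: segment_print_sketch_m
   implicit none
   type(segment_t) :: lower, space
   character(:), allocatable :: text

   lower = segment_t(97, 122)
   space = segment_t(32, 32)

   text = lower%print()
   print '(a)', text   ! [a-z]
   text = space%print()
   print '(a)', text   ! <32>
end program segment_print_demo
```
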
Checks if a segment is valid.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
class(segment_t) | intent(in) | | | self
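
The validity criterion is not spelled out on this page. The sketch below assumes the natural one for a range, namely that `min` does not exceed `max`; treat that rule as an assumption rather than the library's definition.

```fortran
module segment_validate_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t

   ! Stand-in type with a type-bound validate function, mirroring the documented binding.
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   contains
      procedure :: validate => segment_is_valid
   end type segment_t

contains

   ! Assumed rule: a segment is valid when its bounds are in order.
   pure function segment_is_valid(self) result(res)
      class(segment_t), intent(in) :: self
      logical :: res

      res = self%min <= self%max
   end function segment_is_valid

end module segment_validate_sketch_m

program segment_validate_demo
   use :: segment_validate_sketch_m
   implicit none
   type(segment_t) :: ordered, reversed

   ordered  = segment_t(48, 57)
   reversed = segment_t(57, 48)

   print *, ordered%validate()    ! bounds in order: true
   print *, reversed%validate()   ! bounds swapped: false
end program segment_validate_demo
```
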
Check if two segments are not equivalent.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(in) | | | a
type(segment_t) | intent(in) | | | b
This subroutine inverts a list of segment ranges representing Unicode characters. It computes the complement of the given ranges and modifies the list accordingly.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(inout) | | allocatable | list(:)
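
Computing a complement can be sketched as collecting the gaps between consecutive segments. The version below assumes the list is allocated and sorted by `min`, and it restricts the domain to the hypothetical bounds `DOMAIN_MIN` and `DOMAIN_MAX` (ASCII only, purely for readability). The library works over its own code-point limits and may prepare the list differently.

```fortran
module invert_segments_sketch_m
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   private
   public :: segment_t, invert_sorted_segments

   ! Stand-in for the documented type: an inclusive range [min, max].
   type :: segment_t
      integer(int32) :: min
      integer(int32) :: max
   end type segment_t

   ! Hypothetical domain bounds, chosen small so the result is easy to read.
   integer(int32), parameter :: DOMAIN_MIN = 0_int32
   integer(int32), parameter :: DOMAIN_MAX = 127_int32

contains

   ! Replace list with its complement inside [DOMAIN_MIN, DOMAIN_MAX].
   ! Precondition (assumed): list is allocated and sorted by min.
   subroutine invert_sorted_segments(list)
      type(segment_t), intent(inout), allocatable :: list(:)
      type(segment_t), allocatable :: inverted(:)
      integer(int32) :: next_free   ! smallest code point not yet covered
      integer :: i

      allocate (inverted(0))
      next_free = DOMAIN_MIN
      do i = 1, size(list)
         ! A gap before this segment belongs to the complement.
         if (list(i)%min > next_free) then
            inverted = [inverted, segment_t(next_free, list(i)%min - 1_int32)]
         end if
         next_free = max(next_free, list(i)%max + 1_int32)
      end do
      ! Whatever remains after the last segment also belongs to the complement.
      if (next_free <= DOMAIN_MAX) then
         inverted = [inverted, segment_t(next_free, DOMAIN_MAX)]
      end if

      call move_alloc(inverted, list)
   end subroutine invert_sorted_segments

end module invert_segments_sketch_m

program invert_segments_demo
   use :: invert_segments_sketch_m
   implicit none
   type(segment_t), allocatable :: list(:)
   integer :: i

   list = [segment_t(48, 57), segment_t(97, 122)]   ! digits and lowercase letters
   call invert_sorted_segments(list)

   ! Complement within 0..127: 0-47, 58-96, 123-127
   do i = 1, size(list)
      print '(i0, "-", i0)', list(i)%min, list(i)%max
   end do
end program invert_segments_demo
```

Using `move_alloc` hands the new array to the caller without an extra copy, which matches the `intent(inout), allocatable` dummy argument shown in the table.
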
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(inout) | | allocatable | segments(:)
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
type(segment_t) | intent(inout) | | allocatable | segments(:)