Procedures

Procedure Location Procedure Type Description
add_nfa_state forgex_nfa_state_set_m Subroutine

This subroutine adds a specified state (s) to an NFA state set state_set by setting the corresponding element in state_set%vec to true.
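
The idea of a state set stored as a logical vector can be sketched in a few lines. This is an illustrative Python sketch, not the library's Fortran source; the function name and the 0-based indexing are assumptions made for the example.

```python
def add_state(vec, s):
    """Mark state s as a member of the set by setting its flag to True."""
    vec[s] = True


# A set over 5 hypothetical NFA states, all initially absent.
state_vec = [False] * 5
add_state(state_vec, 2)

# Membership is then a single indexed lookup.
has_state_2 = state_vec[2]
has_state_3 = state_vec[3]
```

Storing the set as a fixed-length flag vector makes both insertion and the membership test constant-time, at the cost of one flag per possible state.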

adjustl_multi_byte forgex_utf8_m Function
arg_in_segment forgex_segment_m Function

Checks if the given integer is within the specified segment.

arg_in_segment_list forgex_segment_m Function

Checks if the given integer is within any of the specified segments in a list.

ascii2seg forgex_bitmap_m Subroutine
ascii__add_character_char forgex_bitmap_m Subroutine
ascii__add_character_codepoint forgex_bitmap_m Subroutine
ascii__add_character_range forgex_bitmap_m Subroutine
assignment(=) forgex_cube_m Interface
automaton__build_nfa forgex_automaton_m Subroutine
automaton__compute_reachable_state forgex_automaton_m Function

This function calculates a set of possible NFA states from the current DFA state by the input character symbol.

automaton__construct_dfa forgex_automaton_m Subroutine

This subroutine gets the destination index of the DFA node reached from the current index with the given symbol, adding a DFA node if necessary.

automaton__destination forgex_automaton_m Function

This function returns the DFA transition object, which contains the destination index and the corresponding set of transitionable NFA states.

automaton__epsilon_closure forgex_automaton_m Subroutine

Compute the ε-closure for a set of NFA states.

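An ε-closure is the set of states reachable from a starting set using only ε-transitions. The following is a minimal Python sketch of the general technique, not the library's Fortran implementation; the dictionary representation of ε-edges and all names are assumptions made for illustration.

```python
def epsilon_closure(eps_edges, start_states):
    """Return every state reachable from start_states via epsilon edges only."""
    closure = set(start_states)
    stack = list(start_states)
    while stack:
        state = stack.pop()
        for target in eps_edges.get(state, ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


# Hypothetical epsilon edges: 0 -> 1, 0 -> 2, 1 -> 3, 3 -> 1 (a cycle).
eps_edges = {0: [1, 2], 1: [3], 3: [1]}
closure = epsilon_closure(eps_edges, {0})
```

The visited check keeps the traversal from looping forever on ε-cycles such as the 1 → 3 → 1 cycle above.
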
automaton__initialize forgex_automaton_m Subroutine

This subroutine reads the tree and tree_top variables, constructs the NFA graph, and then initializes the DFA graph.

automaton__print_dfa forgex_automaton_m Subroutine

This subroutine prints DFA states and transitions to a given unit number.

automaton__print_info forgex_automaton_m Subroutine

This subroutine provides summarized information about the automata.

automaton__register_state forgex_automaton_m Subroutine

This subroutine takes an nfa_state_set_t argument as input and registers the set as a DFA state node in the DFA graph.

best forgex_syntax_tree_optimize_m Function
best_factor forgex_syntax_tree_optimize_m Subroutine

This is a recursive procedure that traverses a given syntax tree.

bmp2seg forgex_bitmap_m Subroutine
bmp__add_character_char forgex_bitmap_m Subroutine
bmp__add_character_codepoint forgex_bitmap_m Subroutine
bmp__add_character_range forgex_bitmap_m Subroutine
bubble_sort forgex_sort_m Subroutine

Replacing this algorithm with insertion sort is under consideration.

char_utf8 forgex_utf8_m Function

The char_utf8 function takes a Unicode code point as an integer and returns the corresponding character as a UTF-8 binary string.

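The mapping from a code point to UTF-8 bytes follows the standard bit layout for 1 to 4 byte sequences. The sketch below shows that layout in Python; it is not the library's Fortran code, the function name is hypothetical, and it performs no validation of surrogates or out-of-range values.

```python
def encode_code_point(cp):
    """Encode an integer code point as UTF-8 bytes (no validation)."""
    if cp < 0x80:  # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:  # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:  # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


encoded = encode_code_point(0x3042)  # HIRAGANA LETTER A
```
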
character_string_to_array forgex_character_array_m Subroutine

This subroutine parses a pattern string for a character class and outputs an array of type character_array_t. If it encounters an invalid value along the way, it returns early.

check_nfa_state forgex_nfa_state_set_m Function

This function checks if the argument 'state' (a set of NFA states) includes state 's'.

clear forgex_priority_queue_m Subroutine

The clear subroutine deallocates the queue.

compute_reachable_state forgex_dense_dfa_m Function

This function calculates a set of possible NFA states from the current DFA state.

construct_dense_dfa forgex_dense_dfa_m Subroutine

This subroutine converts an NFA into a fully compiled DFA.

convert_escaped_character_into_segments forgex_syntax_tree_graph_m Subroutine

This subroutine converts the escaped character in the argument chara into the segment list seg_list.

copy_dfa_transition forgex_lazy_dfa_node_m Subroutine

This subroutine copies the data of a specified transition into the variables of another dfa_transition_t.

count_token forgex_utf8_m Function

This function counts the occurrences of a specified character (token) in a given string.

cube__bmp2seg forgex_cube_m Subroutine
cube__dump_sps forgex_cube_m Subroutine
cube__first_codepoint forgex_cube_m Function
cube__invert forgex_cube_m Subroutine
cube__number_of_flagged_bits forgex_cube_m Function
cube__switch_ascii_to_bmp forgex_cube_m Subroutine
cube_add__cube forgex_cube_m Subroutine
cube_add__segment forgex_cube_m Subroutine
cube_add__segment_list forgex_cube_m Subroutine
cube_add__symbol forgex_cube_m Subroutine
cube_flag__epsilon forgex_cube_m Subroutine
cube_flag__is_flagged_epsilon forgex_cube_m Function
cube_t__codepoint_in_cube forgex_cube_m Function
cube_t__cube_assign forgex_cube_m Subroutine
cube_t__equal forgex_cube_m Function
cube_t__symbol_in_cube forgex_cube_m Function
deallocate_tree forgex_syntax_tree_node_m Subroutine

This subroutine deallocates the syntax tree.

dense_dfa__add_transition forgex_lazy_dfa_graph_m Subroutine
dequeue forgex_priority_queue_m Subroutine

The dequeue subroutine removes the highest-priority segment from the queue and returns it.

destination forgex_dense_dfa_m Subroutine

This subroutine gets the index of the next DFA node from the current index and stores the result in next and next_set. If the DFA state is already registered, it returns the index; otherwise it returns DFA_INVALID_INDEX.

dfa_state_node__add_transition forgex_lazy_dfa_node_m Subroutine

This subroutine adds the given transition to the list held by dfa_state_node_t.

dfa_state_node__get_transition_top forgex_lazy_dfa_node_m Function

This function returns the index of the top transition in the list held by dfa_state_node_t.

dfa_state_node__increment_transition_top forgex_lazy_dfa_node_m Subroutine

This subroutine increments the value of the top transition index.

dfa_state_node__initialize_transition_top forgex_lazy_dfa_node_m Subroutine

This subroutine initializes the top index of the DFA node's transition array with the value of the given argument.

dfa_state_node__is_registered_transition forgex_lazy_dfa_node_m Function
dfa_state_node__is_registered_transition_cube forgex_lazy_dfa_node_m Function
dfa_state_node__reallocate_transition_forward forgex_lazy_dfa_node_m Subroutine

This subroutine allocates the initial transition array or extends it with additional storage.

disjoin forgex_segment_disjoin_m Interface

Interface for the procedure disjoin_kernel.

disjoin_kernel forgex_segment_disjoin_m Subroutine

Disjoins overlapping segments and creates a new list of non-overlapping segments.

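Disjoining turns overlapping ranges into pieces that never partially overlap any input range. The Python sketch below uses a simple boundary sweep to show the idea; it is an assumption-laden illustration, not the library's algorithm, and it represents each segment as a hypothetical inclusive (min, max) tuple.

```python
def disjoin(segments):
    """Split overlapping inclusive (min, max) ranges into non-overlapping pieces."""
    # Every range start, and every position just past a range end, is a cut point.
    cuts = sorted({lo for lo, _ in segments} | {hi + 1 for _, hi in segments})
    pieces = []
    for a, b in zip(cuts, cuts[1:]):
        piece = (a, b - 1)
        # Keep the piece only if some input range covers it.
        if any(lo <= piece[0] and piece[1] <= hi for lo, hi in segments):
            pieces.append(piece)
    return pieces


disjoined = disjoin([(1, 5), (3, 8)])
```

Here (1, 5) and (3, 8) become (1, 2), (3, 5), and (6, 8), so each piece lies entirely inside or entirely outside every original range.
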
disjoin_nfa_each_transition forgex_nfa_graph_m Subroutine

This subroutine updates the NFA state transitions by disjoining the segments.

do_matching_exactly forgex_api_internal_m Subroutine

This subroutine is intended to be called from the forgex API module.

do_matching_including forgex_api_internal_m Subroutine

This procedure reads a text, performs regular expression matching using an automaton, and stores the string index in the argument if it contains a match.

dump_tree_table forgex_syntax_tree_graph_m Subroutine
enqueue forgex_priority_queue_m Subroutine

The enqueue subroutine allocates the heap structure and stores the disjoined segment data in ascending priority order.

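A heap-backed priority queue returns its smallest element first regardless of insertion order. The sketch below demonstrates that behavior with Python's standard heapq module; it is not the library's Fortran queue, and ordering segments by their (min, max) tuple is an assumption made for the example.

```python
import heapq

# Hypothetical segments as inclusive (min, max) tuples, pushed out of order.
queue = []
for segment in [(5, 9), (1, 3), (4, 4)]:
    heapq.heappush(queue, segment)  # enqueue

# Popping repeatedly yields segments in ascending order.
ordered = [heapq.heappop(queue) for _ in range(len(queue))]  # dequeue
```
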
equivalent_nfa_state_set forgex_nfa_state_set_m Function

This function determines if two NFA state sets (logical vectors) are equivalent.

extract_literal forgex_syntax_tree_optimize_m Subroutine

This is the public procedure of this module for obtaining each literal from the AST.

function__regex forgex Function

The function implemented for the regex_f function.

generate_nfa forgex_nfa_graph_m Subroutine
generate_nfa_closure forgex_nfa_graph_m Subroutine
generate_nfa_concatenate forgex_nfa_graph_m Subroutine
get_error_message forgex_error_m Function

This function accepts an integer error status code and returns the corresponding error message.

get_index_comma forgex_utility_m Subroutine

This procedure finds the position of the first comma and the number of commas in the given pattern. It is used to parse the repetition count of an expression.

get_index_list_forward forgex_utility_m Subroutine

This subroutine creates an array containing a list of the positions of the prefixes that exist in the text.

get_literal forgex_syntax_tree_optimize_m Function

Wrapping function to retrieve literals: all, prefix, suffix, factor.

get_token forgex_syntax_tree_node_m Subroutine

Gets the currently focused character (1 to 4 bytes) from the entire string inside the tape_t derived type, and stores the enumerator's numeric value in the current_token component. This is a type-bound procedure of tape_t.

hex2seg forgex_segment_m Subroutine

This subroutine converts a character string that represents a hexadecimal value into the segment corresponding to its integer value.

ichar_utf8 forgex_utf8_m Function

Take a UTF-8 character as an argument and return the integer (also known as "code point" in Unicode) representing its UTF-8 binary string.

idxutf8 forgex_utf8_m Function

This function returns the index of the end of the (multibyte) character, given the string str and the current index curr. Classes of invalid UTF-8 characters: (1) invalid lead byte, (2) invalid trail byte, (3) overrun, (4) overlong encoding, (5) incomplete multibyte sequence, (6) invalid character range (U+D800-U+DFFF), (7) BOM appears in the middle, (8) isolated trail byte.

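Finding the end of a multibyte character comes down to reading the sequence length from the lead byte. The Python sketch below shows only that step; it is a simplified illustration rather than the library's Fortran routine, uses 0-based indices, has a hypothetical name, and checks only the first of the invalid classes listed above.

```python
def end_index(data, curr):
    """Return the index of the last byte of the character starting at curr."""
    lead = data[curr]
    if lead < 0x80:            # 0xxxxxxx: single byte
        length = 1
    elif lead >> 5 == 0b110:   # 110xxxxx: two bytes
        length = 2
    elif lead >> 4 == 0b1110:  # 1110xxxx: three bytes
        length = 3
    elif lead >> 3 == 0b11110: # 11110xxx: four bytes
        length = 4
    else:
        raise ValueError("invalid lead byte")
    return curr + length - 1


sample = "aé日".encode("utf-8")  # 1-byte, 2-byte, and 3-byte characters
ends = [end_index(sample, i) for i in (0, 1, 3)]
```
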
index_ca forgex_character_array_m Function
index_list_from_segment_list forgex_segment_disjoin_m Subroutine

Extracts a sorted list of unique indices from a list of segments.

init_state_set forgex_nfa_state_set_m Subroutine
insertion_sort forgex_sort_m Subroutine
interpret_class_string forgex_syntax_tree_graph_m Subroutine

This subroutine parses a pattern string and outputs a list of segment_t values.

invert_segment_list forgex_segment_m Subroutine

This subroutine inverts a list of segment ranges representing Unicode characters. It computes the complement of the given ranges and modifies the list accordingly.
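
Computing the complement of a list of ranges can be done in one pass over the sorted input. The Python sketch below illustrates the technique under stated assumptions: segments are inclusive (min, max) tuples, the universe is 0 to U+10FFFF, and the function returns a new list rather than modifying its argument. None of these details are taken from the library's source.

```python
CODE_MIN, CODE_MAX = 0, 0x10FFFF  # assumed bounds of the code point range


def invert_segments(segments):
    """Return the ranges within [CODE_MIN, CODE_MAX] not covered by segments."""
    result = []
    next_free = CODE_MIN  # smallest code point not yet known to be covered
    for lo, hi in sorted(segments):
        if lo > next_free:
            result.append((next_free, lo - 1))
        next_free = max(next_free, hi + 1)
    if next_free <= CODE_MAX:
        result.append((next_free, CODE_MAX))
    return result


non_digits = invert_segments([(0x30, 0x39)])  # complement of ASCII digits
```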

is_eqv_str forgex_test_m Function
is_first_byte_of_character forgex_utf8_m Function

This function determines if a given character is the first byte of a UTF-8 multibyte character. It takes a 1-byte character as input and returns a logical value indicating if it is the first byte of a UTF-8 binary string.
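
In UTF-8, continuation bytes always have the bit pattern 10xxxxxx, so any other byte starts a character. The Python sketch below applies that rule; it is not the library's Fortran code, the name is hypothetical, and treating single-byte ASCII characters as first bytes is an assumption.

```python
def is_first_byte(byte):
    """True unless the byte is a UTF-8 continuation byte (10xxxxxx)."""
    return (byte & 0xC0) != 0x80


# "a" is 1 byte and "é" is 2 bytes (0xC3 0xA9) in UTF-8.
flags = [is_first_byte(b) for b in "aé".encode("utf-8")]
```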

is_first_byte_of_character_array forgex_utf8_m Subroutine

This subroutine determines if each character in a given string is the first byte of a UTF-8 multibyte character. It takes a UTF-8 string and returns a logical array indicating for each position if it is the first byte.

is_integer forgex_utility_m Function

This function determines whether the input character string can be parsed into an integer.

is_overlap_to_seg_list forgex_segment_disjoin_m Function

Checks if a segment overlaps with any segments in a list.

is_prime_semgment forgex_segment_disjoin_m Function

Checks if a segment is a prime segment within a disjoined list.

is_there_caret_at_the_top forgex_utility_m Function

This function returns .true. if the pattern contains the caret character at the top that matches the beginning of a line.

is_there_dollar_at_the_end forgex_utility_m Function

This function returns .true. if the pattern contains the dollar character at the end that matches the ending of a line.

is_valid__error forgex_test_m Function

This function checks whether it returns the correct error for a given pattern and text.

is_valid__in forgex_test_m Function

This function checks if a pattern is found within a string and compares the result to the correct_answer.

is_valid__match forgex_test_m Function

This function checks if a pattern matches exactly a string and compares the result to the correct answer.

is_valid__pattern forgex_test_m Function

This function checks if the given pattern is valid as a regex pattern and compares the result to the correct_answer.

is_valid__prefix forgex_test_m Function

This function checks whether the correct prefix is extracted for a given pattern.

is_valid__regex forgex_test_m Function

This function checks if a pattern matches a string using the regex function and compares the result to the expected answer.

is_valid__suffix forgex_test_m Function

This function checks whether the correct suffix is extracted for a given pattern.

is_valid_multiple_byte_character forgex_utf8_m Function

This function checks whether the input byte string is valid as a single UTF-8 character.

is_valid_regex forgex Interface

The generic name for the is_valid_regex function implemented as is_valid_regex_pattern.

is_valid_regex_pattern forgex Function

The function that validates a given regex pattern.

join_two_segments forgex_segment_m Function

This function converts two isolated segments into a single fused segment and returns it.

lazy_dfa__add_transition forgex_lazy_dfa_graph_m Subroutine

This subroutine constructs a new transition object from the arguments and invokes the type-bound procedure of dfa_state_node_t with it.

lazy_dfa__preprocess forgex_lazy_dfa_graph_m Subroutine

This subroutine determines the number of DFA nodes the graph has and allocates the array.

lazy_dfa__reallocate forgex_lazy_dfa_graph_m Subroutine

This subroutine reallocates the array that represents the DFA graph.

lazy_dfa__registered_index forgex_lazy_dfa_graph_m Function

Returns the index of the DFA state if it is already registered, or DFA_INVALID_INDEX if it is not registered.

len_trim_utf8 forgex_utf8_m Function

This function calculates the length of a UTF-8 string excluding trailing spaces.

len_utf8 forgex_utf8_m Function

This function calculates the length of a UTF-8 string.

make_atom forgex_syntax_tree_node_m Function
make_repeat_node forgex_syntax_tree_node_m Function
make_replacement_char forgex_utf8_m Function
make_tree_node forgex_syntax_tree_node_m Function
match_dense_dfa_exactly forgex_dense_dfa_m Function

This procedure reads a text, performs regular expression matching using a compiled DFA, and returns .true. if it matches exactly.

match_dense_dfa_including forgex_dense_dfa_m Subroutine

This procedure reads a text, performs regular expression matching using an automaton, and stores the string index in the argument if it contains a match.

merge_segments forgex_segment_m Subroutine
move forgex_dense_dfa_m Function

This function returns the DFA transition object, which contains the destination index and the corresponding set of transitionable NFA states.

nchar forgex_test_m Function

nchar means 'negative char'.

next_idxutf8 forgex_utf8_m Function

This function returns the index of the next character, given the string str and the current index curr. If the current index is for the last character, it returns the invalid value.

next_idxutf8_strict forgex_utf8_m Subroutine

This subroutine returns the index of the next UTF-8 character contained in str. It is used to handle strings that may not be encoded in UTF-8.

next_state_dense_dfa forgex_dense_dfa_m Function

This function returns the index of the destination DFA state from the index of the current automaton DFA state array and the input symbol.

nfa__add_transition forgex_nfa_node_m Subroutine
nfa__add_transition_cube forgex_nfa_node_m Subroutine
nfa__merge_segments_of_transition forgex_nfa_node_m Subroutine
nfa__reallocate_transition_forward forgex_nfa_node_m Subroutine
nfa_graph__build forgex_nfa_graph_m Subroutine
nfa_graph__collect_epsilon_transition forgex_nfa_graph_m Subroutine
nfa_graph__disjoin forgex_nfa_graph_m Subroutine
nfa_graph__is_exceeded forgex_nfa_graph_m Function
nfa_graph__mark_epsilon_transition forgex_nfa_graph_m Subroutine
nfa_graph__new_node forgex_nfa_graph_m Subroutine
nfa_graph__print forgex_nfa_graph_m Subroutine
nfa_graph__reallocate forgex_nfa_graph_m Subroutine
operator(.in.) forgex_segment_m Interface

This interface block provides the .in. operator, which checks whether the left operand is contained in the right operand in three cases: an integer in a segment, an integer in a list of segments, and a segment in a segment.
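
The three containment cases can be sketched as a single dispatching function. This Python sketch is illustrative only and does not reflect the Fortran interface; it assumes segments are inclusive (min, max) tuples and segment lists are Python lists, and the function name is hypothetical.

```python
def contains(item, container):
    """Check containment for integer/segment, integer/list, and segment/segment."""
    if isinstance(container, list):
        # Integer in a list of segments: true if any segment contains it.
        return any(contains(item, segment) for segment in container)
    lo, hi = container
    if isinstance(item, tuple):
        # Segment in a segment: both ends must lie inside.
        return lo <= item[0] and item[1] <= hi
    # Integer in a segment.
    return lo <= item <= hi


int_in_segment = contains(5, (1, 10))
int_in_list = contains(11, [(1, 10), (20, 30)])
segment_in_segment = contains((2, 3), (1, 10))
```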

operator(.in.) forgex Interface

Interface for the user-defined .in. operator.

operator(.in.) forgex_cube_m Interface
operator(.match.) forgex Interface

Interface for the user-defined .match. operator.

operator(/=) forgex_segment_m Interface

This interface block provides a not-equal operator for comparing segments.

operator(==) forgex_segment_m Interface

This interface block provides an equality operator for comparing segments.

operator(==) forgex_cube_m Interface
operator__in forgex Function

The function implemented for the .in. operator.

operator__match forgex Function

The function implemented for the .match. operator.

parse_backslash_and_hyphen_in_char_array forgex_character_array_m Subroutine

This subroutine processes a character array, and outputs the corresponding flagged array. It removes backslash and hyphen characters, and then flags the current element in character_array_t type array.

parse_escape_sequence_with_argument forgex_character_array_m Subroutine
parse_segment_width_in_char_array forgex_character_array_m Subroutine

This subroutine assigns the expected segment size from the character c of the current array element to its seg_size.

print_class_simplify forgex_syntax_tree_graph_m Function
print_hex forgex_test_m Subroutine
print_nfa_state_set forgex_nfa_state_set_m Subroutine
print_tree_internal forgex_syntax_tree_graph_m Subroutine
print_tree_wrap forgex_syntax_tree_graph_m Subroutine
prop2seg forgex_segment_m Subroutine
reallocate_tree forgex_syntax_tree_node_m Subroutine
regex forgex Interface

The generic name for the regex subroutine implemented as subroutine__regex.

regex_f forgex Interface

The generic name for the regex_f function implemented as function__regex.

register_seg_list forgex_segment_disjoin_m Subroutine

Registers a new segment into a list if it is valid.

register_segment_to_list forgex_segment_m Subroutine

This procedure registers the given segment_t value in a segment_t array, increments the counter of the array's actual size, and initializes a temporary variable.

repeat forgex_test_m Function

This function generates a string by repeating a given pattern a specified number of times.

return_class_closure forgex_syntax_tree_optimize_m Function
reverse_utf8 forgex_utf8_m Function
runner_error forgex_test_m Subroutine

This subroutine runs the is_valid__error function and prints its result.

runner_in forgex_test_m Subroutine

This subroutine runs the is_valid__in function and prints the result.

runner_match forgex_test_m Subroutine

This subroutine runs the is_valid__match function and prints the result.

runner_prefix forgex_test_m Subroutine

This subroutine runs the is_valid__prefix function and prints the result.

runner_regex forgex_test_m Subroutine

This subroutine runs the is_valid__regex function and prints the result.

runner_suffix forgex_test_m Subroutine

This subroutine runs the is_valid__suffix function and prints the result.

runner_validate forgex_test_m Subroutine

This subroutine runs the is_valid__pattern function and prints the result.

same_part_of_prefix forgex_syntax_tree_optimize_m Function
same_part_of_suffix forgex_syntax_tree_optimize_m Function
seg_in_segment forgex_segment_m Function

Checks if one segment is completely within another segment.

seg_in_segment_list forgex_segment_m Function
segment_equivalent forgex_segment_m Function

Checks if one segment is exactly equal to another segment.

segment_for_print forgex_segment_m Function

Converts a segment to a printable string representation.

segment_is_valid forgex_segment_m Function

Checks if a segment is valid.

segment_not_equiv forgex_segment_m Function

Checks if two segments are not equivalent.

set_continuation_byte forgex_utf8_m Function

This function takes one byte, sets the first two bits to 10, and returns one byte of the continuation part.
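
Forcing the top two bits of a byte to 10 is a mask followed by an OR. The Python sketch below shows that bit manipulation on integers in the range 0 to 255; it is an illustration with a hypothetical name, not the library's Fortran function.

```python
def to_continuation_byte(byte):
    """Keep the low 6 bits and force the top two bits to 10."""
    return (byte & 0x3F) | 0x80


# 0x29 = 00101001 becomes 10101001 = 0xA9, the trail byte of "é" (0xC3 0xA9).
trail = to_continuation_byte(0x29)
```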

sort_segment_by_min forgex_segment_m Subroutine
subroutine__regex forgex Subroutine

The subroutine implemented for the regex subroutine.

symbol_to_segment forgex_segment_m Function

This function converts an input symbol into the segment corresponding to it.

total_width_of_segment forgex_segment_m Function
tree_graph__build_syntax_tree forgex_syntax_tree_graph_m Subroutine

This procedure builds an AST corresponding to a given regular expression pattern.

tree_graph__char_class forgex_syntax_tree_graph_m Subroutine

This subroutine handles a character class expression and does not call any other recursive procedures.

tree_graph__connect_left forgex_syntax_tree_graph_m Subroutine
tree_graph__connect_right forgex_syntax_tree_graph_m Subroutine
tree_graph__deallocate forgex_syntax_tree_graph_m Subroutine

This procedure deallocates the nodes of tree_t.

tree_graph__get_top forgex_syntax_tree_graph_m Function
tree_graph__hexadecimal_to_codepoint forgex_syntax_tree_graph_m Subroutine
tree_graph__hexadecimal_to_segment forgex_syntax_tree_graph_m Subroutine

This procedure handles an escape sequence with '\x'.

tree_graph__make_tree_caret_dollar forgex_syntax_tree_graph_m Subroutine

This subroutine constructs a tree node for carriage return (CR) and line feed (LF) characters.

tree_graph__make_tree_crlf forgex_syntax_tree_graph_m Subroutine
tree_graph__primary forgex_syntax_tree_graph_m Subroutine
tree_graph__reallocate forgex_syntax_tree_graph_m Subroutine

This procedure handles the reallocation of the tree_node_t array held as a component of the tree_t object. However, it is not used in v4.2.

tree_graph__regex forgex_syntax_tree_graph_m Subroutine
tree_graph__register_connector forgex_syntax_tree_graph_m Subroutine
tree_graph__register_node forgex_syntax_tree_graph_m Subroutine
tree_graph__shorthand forgex_syntax_tree_graph_m Subroutine

This subroutine handles shorthand escape sequences (\t, \n, \r, \d, \D, \w, \W, \s, \S). It does not call any other recursive procedures.

tree_graph__suffix_op forgex_syntax_tree_graph_m Subroutine
tree_graph__term forgex_syntax_tree_graph_m Subroutine
tree_graph__times forgex_syntax_tree_graph_m Subroutine

This subroutine handles a quantifier range, and does not call any other recursive procedures.

tree_graph__unicode_property forgex_syntax_tree_graph_m Subroutine
trim_invalid_utf8_byte forgex_utf8_m Function
which_segment_symbol_belong forgex_segment_m Function

This function takes an array of segments and a character as arguments, and returns, as a rank-1 array, the segment to which the symbol belongs (that is, the segment whose interval includes it).

width_of_segment forgex_segment_m Function