forgex_utf8_m Module

The forgex_utf8_m module provides procedures that treat byte-indexed character strings as UTF-8 strings.


Functions

public pure function adjustl_multi_byte(chara) result(res)

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: chara

Return Value character(len=:), allocatable

public pure function char_utf8(code) result(str)

The char_utf8 function takes a Unicode code point as an integer and returns the corresponding character as a UTF-8 byte string.

Arguments

Type Intent Optional Attributes Name
integer(kind=int32), intent(in) :: code

Return Value character(len=:), allocatable
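
As an illustration of the encoding step described above, here is a minimal sketch of turning a code point into UTF-8 bytes. It is not the library's implementation: the module name utf8_encode_sketch and the functions encode_utf8_sketch and continuation_of are hypothetical, and the sketch assumes a compiler where char() accepts values 0 to 255 (as gfortran does). Surrogate code points are not rejected here.

```fortran
module utf8_encode_sketch
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
contains

   ! Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
   pure function encode_utf8_sketch(code) result(str)
      integer(int32), intent(in) :: code
      character(len=:), allocatable :: str

      if (code < 0_int32 .or. code > 1114111_int32) then
         str = ''                                   ! outside U+0000..U+10FFFF
      else if (code < 128_int32) then               ! 1 byte:  0xxxxxxx
         str = char(code)
      else if (code < 2048_int32) then              ! 2 bytes: 110xxxxx 10xxxxxx
         str = char(ior(192_int32, ishft(code, -6))) // continuation_of(code, 0)
      else if (code < 65536_int32) then             ! 3 bytes: 1110xxxx + 2 continuation
         str = char(ior(224_int32, ishft(code, -12))) &
               // continuation_of(code, 6) // continuation_of(code, 0)
      else                                          ! 4 bytes: 11110xxx + 3 continuation
         str = char(ior(240_int32, ishft(code, -18))) &
               // continuation_of(code, 12) // continuation_of(code, 6) &
               // continuation_of(code, 0)
      end if
   end function encode_utf8_sketch

   ! Hypothetical helper: build the continuation byte 10xxxxxx from the
   ! six bits of code that start at bit position shift.
   pure function continuation_of(code, shift) result(c)
      integer(int32), intent(in) :: code
      integer, intent(in) :: shift
      character(len=1) :: c
      c = char(ior(128_int32, iand(ishft(code, -shift), 63_int32)))
   end function continuation_of

end module utf8_encode_sketch
```

With this sketch, the code point 12354 (U+3042) becomes the three bytes 227, 129, 130.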

public pure function count_token(str, token) result(count)

This function counts the occurrences of a specified character (token) in a given string.

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: str
character(len=1), intent(in) :: token

Return Value integer
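
Because token is declared as character(len=1), the count is necessarily a byte-wise comparison. A minimal sketch of such a count, assuming nothing about the library's internals (the names count_token_sketch and count_char_sketch are hypothetical):

```fortran
module count_token_sketch
   implicit none
contains

   ! Hypothetical helper: count the bytes of str that equal token.
   pure function count_char_sketch(str, token) result(n)
      character(len=*), intent(in) :: str
      character(len=1), intent(in) :: token
      integer :: n, i

      n = 0
      do i = 1, len(str)
         if (str(i:i) == token) n = n + 1
      end do
   end function count_char_sketch

end module count_token_sketch
```

A one-byte token can only match a single-byte (ASCII) character, so a separator such as '|' is a typical argument.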

public pure function ichar_utf8(chara) result(res)

This function takes a UTF-8 character as an argument and returns the integer (known in Unicode as the "code point") represented by its UTF-8 byte sequence.

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: chara

Return Value integer(kind=int32)
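
The decoding direction can be sketched as follows. This is an illustration of the technique rather than the library's code: the names utf8_decode_sketch and decode_utf8_sketch are hypothetical, the -1 return value for malformed input is a choice made for this sketch, and ichar() is assumed to return values 0 to 255 for default characters. Continuation bytes are not validated.

```fortran
module utf8_decode_sketch
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
contains

   ! Hypothetical helper: decode the first UTF-8 character of chara
   ! into its code point; returns -1 when the lead byte is not valid
   ! or the string is too short.
   pure function decode_utf8_sketch(chara) result(code)
      character(len=*), intent(in) :: chara
      integer(int32) :: code
      integer(int32) :: lead
      integer :: nbytes, i

      code = -1_int32
      if (len(chara) < 1) return
      lead = ichar(chara(1:1), kind=int32)

      if (lead < 128_int32) then              ! 0xxxxxxx
         nbytes = 1
         code = lead
      else if (lead >= 192_int32 .and. lead < 224_int32) then   ! 110xxxxx
         nbytes = 2
         code = iand(lead, 31_int32)
      else if (lead >= 224_int32 .and. lead < 240_int32) then   ! 1110xxxx
         nbytes = 3
         code = iand(lead, 15_int32)
      else if (lead >= 240_int32 .and. lead < 248_int32) then   ! 11110xxx
         nbytes = 4
         code = iand(lead, 7_int32)
      else
         return                               ! continuation or invalid byte
      end if

      if (len(chara) < nbytes) then
         code = -1_int32
         return
      end if

      ! Append the low six bits of each continuation byte.
      do i = 2, nbytes
         code = ior(ishft(code, 6), &
                    iand(ichar(chara(i:i), kind=int32), 63_int32))
      end do
   end function decode_utf8_sketch

end module utf8_decode_sketch
```

For the byte sequence 227, 129, 130 this sketch yields 12354, which is U+3042.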

public pure function idxutf8(str, curr) result(tail)

Given the string str and the current index curr, this function returns the index of the last byte of the (possibly multibyte) character at curr.

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: str
integer(kind=int32), intent(in) :: curr

Return Value integer(kind=int32)
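
One way to find the end of a character is to read its width from the lead byte. The sketch below illustrates that idea under stated assumptions and is not the library's implementation: the names utf8_tail_sketch and tail_index_sketch are hypothetical, curr is assumed to satisfy 1 <= curr <= len(str), and ichar() is assumed to return values 0 to 255.

```fortran
module utf8_tail_sketch
   implicit none
contains

   ! Hypothetical helper: index of the last byte of the character
   ! whose first byte is str(curr:curr).
   pure function tail_index_sketch(str, curr) result(tail)
      character(len=*), intent(in) :: str
      integer, intent(in) :: curr
      integer :: tail
      integer :: lead, width

      lead = ichar(str(curr:curr))
      if (lead >= 240) then
         width = 4          ! 11110xxx (248..255 never occur in valid UTF-8)
      else if (lead >= 224) then
         width = 3          ! 1110xxxx
      else if (lead >= 192) then
         width = 2          ! 110xxxxx
      else
         width = 1          ! ASCII, or a stray continuation byte
      end if

      ! Never point past the end of a truncated string.
      tail = min(curr + width - 1, len(str))
   end function tail_index_sketch

end module utf8_tail_sketch
```

A caller can walk a string character by character by setting the next start index to the returned value plus one.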

public pure function is_first_byte_of_character(chara) result(res)

This function determines whether a given character is the first byte of a UTF-8 multibyte character. It takes a 1-byte character as input and returns a logical value indicating whether it is the first byte of a UTF-8 byte sequence.

Arguments

Type Intent Optional Attributes Name
character(len=1), intent(in) :: chara

Return Value logical
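
In UTF-8, every continuation byte has the bit pattern 10xxxxxx, so a byte starts a character exactly when its two most significant bits are not 10. A minimal sketch of that test follows; the names utf8_first_byte_sketch and is_lead_byte_sketch are hypothetical, and treating ASCII bytes as first bytes is an assumption of this sketch.

```fortran
module utf8_first_byte_sketch
   implicit none
contains

   ! Hypothetical helper: .true. unless c is a continuation byte (10xxxxxx).
   elemental function is_lead_byte_sketch(c) result(res)
      character(len=1), intent(in) :: c
      logical :: res

      ! Mask the top two bits (192 = 11000000) and compare with 10000000.
      res = iand(ichar(c), 192) /= 128
   end function is_lead_byte_sketch

end module utf8_first_byte_sketch
```

Declaring the sketch elemental lets the same function be applied to an array of single bytes.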

public pure function is_valid_multiple_byte_character(chara) result(res)

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: chara

Return Value logical

public pure function len_trim_utf8(str) result(count)

This function calculates the length of a UTF-8 string, excluding trailing spaces.

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: str

Return Value integer

public pure function len_utf8(str) result(count)

This function calculates the length of a UTF-8 string.

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: str

Return Value integer
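
The length in characters differs from the intrinsic len, which counts bytes. One common way to count characters is to count every byte that is not a continuation byte. The sketch below shows that approach and is not necessarily how the library computes it; the names utf8_length_sketch and count_chars_sketch are hypothetical.

```fortran
module utf8_length_sketch
   implicit none
contains

   ! Hypothetical helper: number of UTF-8 characters in str, counted
   ! as the number of bytes that are not continuation bytes (10xxxxxx).
   pure function count_chars_sketch(str) result(n)
      character(len=*), intent(in) :: str
      integer :: n, i

      n = 0
      do i = 1, len(str)
         if (iand(ichar(str(i:i)), 192) /= 128) n = n + 1
      end do
   end function count_chars_sketch

end module utf8_length_sketch
```

Applying the sketch to trim(str) gives the character count without trailing spaces.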

public pure function trim_invalid_utf8_byte(chara) result(res)

Arguments

Type Intent Optional Attributes Name
character(len=*), intent(in) :: chara

Return Value character(len=:), allocatable

private pure function set_continuation_byte(byte) result(res)

This function takes one byte, sets its two most significant bits to 10, and returns the result as one continuation byte.

Arguments

Type Intent Optional Attributes Name
integer(kind=int8), intent(in) :: byte

Return Value integer(kind=int8)
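
The bit manipulation described above can be sketched with the ibclr and ibset intrinsics. The names continuation_byte_sketch and make_continuation_sketch are hypothetical, and this is only one possible way to write it.

```fortran
module continuation_byte_sketch
   use, intrinsic :: iso_fortran_env, only: int8
   implicit none
contains

   ! Hypothetical helper: force the top two bits of byte to 10,
   ! keeping the low six bits unchanged.
   pure function make_continuation_sketch(byte) result(res)
      integer(int8), intent(in) :: byte
      integer(int8) :: res

      ! Bit 7 is the most significant bit; bit 6 is the next one.
      res = ibset(ibclr(byte, 6), 7)
   end function make_continuation_sketch

end module continuation_byte_sketch
```

Because int8 is signed, the result is negative on two's-complement processors: the input 2 (00000010) becomes 10000010, which reads as -126.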


Subroutines

public pure subroutine is_first_byte_of_character_array(str, array, length)

This subroutine determines, for each byte in a given string, whether it is the first byte of a UTF-8 multibyte character. It takes a UTF-8 string and returns a logical array indicating, for each position, whether that byte is a first byte.

Arguments

Type Intent Optional Attributes Name
character(len=length), intent(in) :: str
logical, intent(inout), allocatable :: array(:)
integer(kind=int32), intent(in) :: length
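
A per-position mask of this kind can be sketched as below. This is an illustration under stated assumptions, not the library's code: the names utf8_first_byte_array_sketch and mark_lead_bytes_sketch are hypothetical, the sketch takes the length from len(str) instead of a separate length argument, and it always reallocates the output array.

```fortran
module utf8_first_byte_array_sketch
   implicit none
contains

   ! Hypothetical helper: flags(i) is .true. when byte i of str is not
   ! a continuation byte (10xxxxxx), i.e. when it starts a character.
   pure subroutine mark_lead_bytes_sketch(str, flags)
      character(len=*), intent(in) :: str
      logical, allocatable, intent(inout) :: flags(:)
      integer :: i

      if (allocated(flags)) deallocate(flags)
      allocate(flags(len(str)))

      do i = 1, len(str)
         flags(i) = iand(ichar(str(i:i)), 192) /= 128
      end do
   end subroutine mark_lead_bytes_sketch

end module utf8_first_byte_array_sketch
```

Counting the .true. entries of the resulting array with count(flags) gives the number of characters in the string.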