The forgex_utf8_m module processes byte-indexed character strings as UTF-8 strings.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | chara
The char_utf8 function takes a Unicode code point as an integer and returns the corresponding character as a UTF-8 binary string.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
integer(kind=int32) | intent(in) | | | code
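The encoding step can be illustrated with a minimal, self-contained sketch. It is not the module's source: `encode_utf8`, `cont`, and `show` are hypothetical names, and the behaviour for out-of-range input (an empty string) is an assumption made for this example.

```fortran
program encode_utf8_sketch
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none

   call show(65_int32)        ! U+0041, fits in 1 byte
   call show(233_int32)       ! U+00E9, needs 2 bytes
   call show(12354_int32)     ! U+3042, needs 3 bytes (E3 81 82)
   call show(128512_int32)    ! U+1F600, needs 4 bytes

contains

   ! Hypothetical stand-in for char_utf8; returns '' for out-of-range input.
   ! Surrogate code points (U+D800..U+DFFF) are not rejected in this sketch.
   pure function encode_utf8(code) result(res)
      integer(int32), intent(in) :: code
      character(:), allocatable :: res

      if (code < 0_int32 .or. code > 1114111_int32) then
         res = ''
      else if (code < 128_int32) then
         res = char(code)
      else if (code < 2048_int32) then
         res = char(ior(192_int32, ishft(code, -6))) // cont(code)
      else if (code < 65536_int32) then
         res = char(ior(224_int32, ishft(code, -12))) &
               // cont(ishft(code, -6)) // cont(code)
      else
         res = char(ior(240_int32, ishft(code, -18))) &
               // cont(ishft(code, -12)) // cont(ishft(code, -6)) // cont(code)
      end if
   end function encode_utf8

   ! Continuation byte: 10xxxxxx carrying the low six bits of x.
   pure function cont(x) result(c)
      integer(int32), intent(in) :: x
      character(1) :: c
      c = char(ior(128_int32, iand(x, 63_int32)))
   end function cont

   ! Print the code point followed by the encoded bytes in hexadecimal.
   subroutine show(code)
      integer(int32), intent(in) :: code
      character(:), allocatable :: s
      integer :: i
      s = encode_utf8(code)
      print '(a, z6.6, a, *(1x, z2.2))', 'U+', code, ' ->', (ichar(s(i:i)), i = 1, len(s))
   end subroutine show

end program encode_utf8_sketch
```

The sketch relies on `char` and `ichar` mapping byte values 128 to 255 one-to-one, which is processor-dependent but holds for common compilers such as gfortran.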
This function counts the occurrences of a specified character (token) in a given string.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | str
character(len=1) | intent(in) | | | token
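Because the token is a single byte, a byte-wise scan is enough to count it. The sketch below uses a hypothetical `count_char` function and is only an illustration of that idea, not the module's implementation.

```fortran
program count_token_sketch
   implicit none

   ! Three separators among five bytes.
   print *, count_char('a|b|c|', '|')

contains

   ! Hypothetical stand-in: count the bytes of str that equal token.
   pure function count_char(str, token) result(n)
      character(*), intent(in) :: str
      character(1), intent(in) :: token
      integer :: n, i

      n = 0
      do i = 1, len(str)
         if (str(i:i) == token) n = n + 1
      end do
   end function count_char

end program count_token_sketch
```

This is safe for ASCII tokens in UTF-8 text, since bytes below 128 never occur inside a multibyte sequence.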
Takes a UTF-8 character as an argument and returns the integer (known as the "code point" in Unicode) that its UTF-8 binary string represents.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | chara
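Decoding reverses the encoding: the lead byte determines the sequence length, and each continuation byte contributes six bits. The following is a sketch under stated assumptions; `decode_utf8` is a hypothetical name, and returning -1 for malformed input is a choice made for this example, not documented library behaviour.

```fortran
program decode_utf8_sketch
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none

   ! The bytes E3 81 82 are the UTF-8 encoding of U+3042.
   character(3), parameter :: hiragana_a = char(227) // char(129) // char(130)

   print '(a, z6.6)', 'U+', decode_utf8('A')
   print '(a, z6.6)', 'U+', decode_utf8(hiragana_a)

contains

   ! Hypothetical stand-in for ichar_utf8. Decodes the first character of chara.
   ! Returns -1 for empty, truncated, or malformed input (an assumption of this
   ! sketch). Overlong encodings are not rejected.
   pure function decode_utf8(chara) result(code)
      character(*), intent(in) :: chara
      integer(int32) :: code
      integer(int32) :: acc, byte
      integer :: n, i

      code = -1_int32
      if (len(chara) == 0) return
      acc = int(ichar(chara(1:1)), int32)

      ! The lead byte gives the sequence length and the first payload bits.
      if (acc < 128) then
         n = 1
      else if (acc >= 192 .and. acc < 224) then
         n = 2
         acc = iand(acc, 31_int32)
      else if (acc >= 224 .and. acc < 240) then
         n = 3
         acc = iand(acc, 15_int32)
      else if (acc >= 240 .and. acc < 248) then
         n = 4
         acc = iand(acc, 7_int32)
      else
         return   ! stray continuation byte or invalid lead byte
      end if
      if (len(chara) < n) return

      ! Each continuation byte (10xxxxxx) appends six more bits.
      do i = 2, n
         byte = int(ichar(chara(i:i)), int32)
         if (iand(byte, 192_int32) /= 128_int32) return
         acc = ior(ishft(acc, 6), iand(byte, 63_int32))
      end do
      code = acc
   end function decode_utf8

end program decode_utf8_sketch
```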
This function returns the index of the end of the (multibyte) character, given the string str and the current index curr.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | str
integer(kind=int32) | intent(in) | | | curr
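An end-of-character index makes it possible to walk a byte-indexed string one character at a time. The sketch below assumes that `curr` points at the first byte of a character and that `1 <= curr <= len(str)`; `char_end` is a hypothetical name, and the handling of stray or truncated bytes is an assumption, not the module's documented behaviour.

```fortran
program char_end_sketch
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   character(:), allocatable :: s
   integer(int32) :: i, j

   ! 'a', U+00E9 (C3 A9), U+3042 (E3 81 82), 'z': 7 bytes, 4 characters.
   s = 'a' // char(195) // char(169) // char(227) // char(129) // char(130) // 'z'

   ! Step through the string character by character.
   i = 1
   do while (i <= len(s))
      j = char_end(s, i)
      print '(a, i0, a, i0)', 'character spans bytes ', i, '..', j
      i = j + 1
   end do

contains

   ! Hypothetical stand-in for the end-index function described above.
   pure function char_end(str, curr) result(tail)
      character(*), intent(in) :: str
      integer(int32), intent(in) :: curr
      integer(int32) :: tail
      integer :: b, n

      ! The lead byte alone determines the length of the sequence.
      b = ichar(str(curr:curr))
      if (b >= 240) then
         n = 4
      else if (b >= 224) then
         n = 3
      else if (b >= 192) then
         n = 2
      else
         n = 1   ! ASCII, or a stray continuation byte treated as one unit
      end if

      ! Clamp so that a truncated sequence never points past the string.
      tail = int(min(curr + n - 1, len(str)), int32)
   end function char_end

end program char_end_sketch
```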
This function determines whether a given character is the first byte of a UTF-8 multibyte character. It takes a 1-byte character as input and returns a logical value indicating whether it is the first byte of a UTF-8 binary string.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=1) | intent(in) | | | chara
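One common way to make this test is to check that the byte is not a continuation byte, that is, that its top two bits are not `10`. The sketch below takes that approach with a hypothetical `is_first_byte` function; treating ASCII bytes as first bytes is an assumption of this example.

```fortran
program first_byte_sketch
   implicit none

   ! U+3042 is encoded as E3 81 82: one lead byte and two continuation bytes.
   print *, is_first_byte('a')          ! T
   print *, is_first_byte(char(227))    ! T
   print *, is_first_byte(char(129))    ! F

contains

   ! Hypothetical stand-in for the first-byte test described above.
   pure function is_first_byte(chara) result(res)
      character(1), intent(in) :: chara
      logical :: res

      ! Continuation bytes match 10xxxxxx; every other byte starts a character.
      res = iand(ichar(chara), 192) /= 128
   end function is_first_byte

end program first_byte_sketch
```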
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | chara
This function calculates the length of a UTF-8 string, excluding trailing spaces.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | str
This function calculates the length of a UTF-8 string.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | str
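The difference between the byte length, the character length, and the character length without trailing spaces can be shown with a short sketch. It counts every byte that is not a continuation byte, which is one possible way to measure length and may differ from how the module does it; `count_chars` is a hypothetical name.

```fortran
program len_utf8_sketch
   implicit none
   character(:), allocatable :: s
   character(10) :: padded

   s = 'a' // char(227) // char(129) // char(130) // 'z'   ! 5 bytes, 3 characters
   padded = s                                              ! blank-padded to 10 bytes

   print *, len(s), count_chars(s)              ! byte length vs. character length
   print *, len(padded), count_chars(padded)    ! trailing blanks are counted
   print *, count_chars(trim(padded))           ! trailing blanks are excluded

contains

   ! Hypothetical stand-in: count bytes that are not continuation bytes (10xxxxxx).
   pure function count_chars(str) result(n)
      character(*), intent(in) :: str
      integer :: n, i

      n = 0
      do i = 1, len(str)
         if (iand(ichar(str(i:i)), 192) /= 128) n = n + 1
      end do
   end function count_chars

end program len_utf8_sketch
```

Applying the count to `trim(padded)` corresponds to the variant that excludes trailing spaces.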
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=*) | intent(in) | | | chara
This function takes one byte, sets its first two bits to 10, and returns the result as one byte of the continuation part.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
integer(kind=int8) | intent(in) | | | byte
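Setting the top two bits of an `int8` value to `10` can be sketched with the bit intrinsics `ibset` and `ibclr`, which avoid writing the literal 128 (it does not fit in a signed 8-bit integer). `to_continuation` is a hypothetical name, and the module may use a different formulation.

```fortran
program continuation_byte_sketch
   use, intrinsic :: iso_fortran_env, only: int8
   implicit none
   integer(int8) :: b

   ! 66 has the bit pattern 01000010; the result has the pattern 10000010 (0x82).
   b = to_continuation(66_int8)
   print '(a, z2.2)', 'continuation byte: ', b

contains

   ! Hypothetical stand-in: force bit 7 to 1 and bit 6 to 0, keep the low six bits.
   pure function to_continuation(byte) result(res)
      integer(int8), intent(in) :: byte
      integer(int8) :: res

      res = ibset(ibclr(byte, 6), 7)
   end function to_continuation

end program continuation_byte_sketch
```

As a signed 8-bit integer the result is negative, because bit 7 is the sign bit.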
This subroutine determines whether each byte in a given string is the first byte of a UTF-8 multibyte character. It takes a UTF-8 string and returns a logical array indicating, for each position, whether it is the first byte.
Type | Intent | Optional | Attributes | Name
---|---|---|---|---
character(len=length) | intent(in) | | | str
logical | intent(inout) | | allocatable | array(:)
integer(kind=int32) | intent(in) | | | length
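The array form can be sketched by applying the continuation-byte test to every position. The argument order follows the table above, but `mark_first_bytes` is a hypothetical name, and the reallocation of the array on entry is an assumption of this example.

```fortran
program first_byte_array_sketch
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
   character(:), allocatable :: s
   logical, allocatable :: mask(:)

   ! 'a', U+3042 (E3 81 82), 'z': the mask should be T T F F T.
   s = 'a' // char(227) // char(129) // char(130) // 'z'
   call mark_first_bytes(s, mask, int(len(s), int32))
   print '(*(l2))', mask

contains

   ! Hypothetical stand-in for the subroutine described above.
   pure subroutine mark_first_bytes(str, array, length)
      integer(int32), intent(in) :: length
      character(len=length), intent(in) :: str
      logical, allocatable, intent(inout) :: array(:)
      integer(int32) :: i

      ! Start from a fresh array of the requested size.
      if (allocated(array)) deallocate(array)
      allocate(array(length))

      ! A byte starts a character unless it matches 10xxxxxx.
      do i = 1, length
         array(i) = iand(ichar(str(i:i)), 192) /= 128
      end do
   end subroutine mark_first_bytes

end program first_byte_array_sketch
```

Counting the true entries of the mask gives the number of characters in the string.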