transforms #

This file implements the building blocks for transforming a collection of input strings to the desired format in order to calculate the WER or CER.

In principle, for word error rate calculations, every string of a sentence needs to be collapsed into a list of strings, where each string is a single word. This is done with transforms.ReduceToListOfListOfWords. A composition of multiple transformations must therefore always end with transforms.ReduceToListOfListOfWords.

For the character error rate, every string of a sentence also needs to be collapsed into a list of strings, but here each string is a single character. This is done with transforms.ReduceToListOfListOfChars. Similarly, a composition of multiple transformations must always end with transforms.ReduceToListOfListOfChars.
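
To make the two target formats concrete, here is a minimal sketch in plain Python that does not use jiwer itself. The helper names `to_words` and `to_chars` are hypothetical stand-ins for the shapes that ReduceToListOfListOfWords and ReduceToListOfListOfChars are documented to produce for a list of sentences.

```python
from typing import List


def to_words(sentences: List[str], word_delimiter: str = " ") -> List[List[str]]:
    # One inner list per sentence; empty tokens are dropped.
    return [[w for w in s.split(word_delimiter) if len(w) >= 1] for s in sentences]


def to_chars(sentences: List[str]) -> List[List[str]]:
    # One inner list per sentence; spaces are kept as characters.
    return [list(s) for s in sentences]


sentences = ["hi", "this is"]
print(to_words(sentences))  # [['hi'], ['this', 'is']]
print(to_chars(sentences))  # [['h', 'i'], ['t', 'h', 'i', 's', ' ', 'i', 's']]
```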

AbstractTransform #

Bases: object

The base class of a Transform.

Source code in src/jiwer/transforms.py

class AbstractTransform(object):
    """
    The base class of a Transform.
    """

    def __call__(self, sentences: Union[str, List[str]]):
        """
        Transforms one or more strings.

        Args:
            sentences: The strings to transform.

        Returns:
            (Union[str, List[str]]): The transformed strings.

        """
        if isinstance(sentences, str):
            return self.process_string(sentences)
        elif isinstance(sentences, list):
            return self.process_list(sentences)
        else:
            raise ValueError(
                "input {} was expected to be a string or list of strings".format(
                    sentences
                )
            )

    def process_string(self, s: str):
        raise NotImplementedError()

    def process_list(self, inp: List[str]):
        return [self.process_string(s) for s in inp]
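
The listing above suggests that a new transform only needs to override `process_string`; list handling and input validation come from the base class. The sketch below copies the base class as a stand-in so that it runs without jiwer, and `RemoveDigits` is a hypothetical transform invented for illustration.

```python
from typing import List, Union


class AbstractTransform(object):
    # Stand-in copied from the listing above so the sketch runs without jiwer.
    def __call__(self, sentences: Union[str, List[str]]):
        if isinstance(sentences, str):
            return self.process_string(sentences)
        elif isinstance(sentences, list):
            return self.process_list(sentences)
        else:
            raise ValueError(
                "input {} was expected to be a string or list of strings".format(
                    sentences
                )
            )

    def process_string(self, s: str):
        raise NotImplementedError()

    def process_list(self, inp: List[str]):
        return [self.process_string(s) for s in inp]


class RemoveDigits(AbstractTransform):
    # Hypothetical transform: only process_string is overridden.
    def process_string(self, s: str):
        return "".join(c for c in s if not c.isdigit())


transform = RemoveDigits()
print(transform("r2d2"))              # rd
print(transform(["a1", "b22", "c"]))  # ['a', 'b', 'c']
```

Passing anything other than a string or a list (for example a tuple) raises `ValueError`, as in the base class.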

__call__ #

__call__(sentences)

Transforms one or more strings.

Parameters:

Name	Type	Description	Default
`sentences`	`Union[str, List[str]]`	The strings to transform.	required

Returns:

Type	Description
`Union[str, List[str]]`	The transformed strings.

Source code in src/jiwer/transforms.py

def __call__(self, sentences: Union[str, List[str]]):
    """
    Transforms one or more strings.

    Args:
        sentences: The strings to transform.

    Returns:
        (Union[str, List[str]]): The transformed strings.

    """
    if isinstance(sentences, str):
        return self.process_string(sentences)
    elif isinstance(sentences, list):
        return self.process_list(sentences)
    else:
        raise ValueError(
            "input {} was expected to be a string or list of strings".format(
                sentences
            )
        )

Compose #

Bases: object

Chain multiple transformations back-to-back to create a pipeline combining multiple transformations.

Note that each composition needs to end with either ReduceToListOfListOfWords or ReduceToListOfListOfChars, depending on whether word error rate or character error rate is desired.

Example

import jiwer

jiwer.Compose([
    jiwer.RemoveMultipleSpaces(),
    jiwer.ReduceToListOfListOfWords()
])

Source code in src/jiwer/transforms.py

class Compose(object):
    """
    Chain multiple transformations back-to-back to create a pipeline combining multiple
    transformations.

    Note that each transformation needs to end with either `ReduceToListOfListOfWords`
    or `ReduceToListOfListOfChars`, depending on whether word error rate,
    or character error rate is desired.

    Example:
        ```python3
        import jiwer

        jiwer.Compose([
            jiwer.RemoveMultipleSpaces(),
            jiwer.ReduceToListOfListOfWords()
        ])
        ```
    """

    def __init__(self, transforms: List[AbstractTransform]):
        """

        Args:
            transforms: The list of transformations to chain.
        """
        self.transforms = transforms

    def __call__(self, text):
        for tr in self.transforms:
            text = tr(text)

        return text
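
Because `__call__` simply feeds the output of each step into the next, the order of the steps matters: string-level steps have to run before the final reduction to lists. A minimal sketch, assuming plain functions (`collapse_spaces` and `split_words` are hypothetical) in place of jiwer transforms:

```python
import re
from typing import Callable, List


class Compose(object):
    # Stand-in with the same loop as the listing above.
    def __init__(self, transforms: List[Callable]):
        self.transforms = transforms

    def __call__(self, text):
        for tr in self.transforms:
            text = tr(text)
        return text


# Hypothetical plain-function steps standing in for jiwer transforms.
def collapse_spaces(sentences: List[str]) -> List[str]:
    return [re.sub(r"\s\s+", " ", s) for s in sentences]


def split_words(sentences: List[str]) -> List[List[str]]:
    return [[w for w in s.split(" ") if len(w) >= 1] for s in sentences]


pipeline = Compose([collapse_spaces, split_words])
print(pipeline(["this   is  an example"]))  # [['this', 'is', 'an', 'example']]
```

Swapping the two steps in this sketch would hand lists of words to `collapse_spaces`, which expects strings, so `re.sub` would raise a `TypeError`.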

__init__ #

__init__(transforms)

Parameters:

Name	Type	Description	Default
`transforms`	`List[AbstractTransform]`	The list of transformations to chain.	required

Source code in src/jiwer/transforms.py

def __init__(self, transforms: List[AbstractTransform]):
    """

    Args:
        transforms: The list of transformations to chain.
    """
    self.transforms = transforms

ExpandCommonEnglishContractions #

Bases: AbstractTransform

Replace common contractions such as let's to let us.

Currently, this transform performs the following replacements. Note that ␣ is used to indicate a space (` `) to get around markdown rendering constraints.

Contraction	Transformed into
`won't`	`will not`
`can't`	`can not`
`let's`	`let us`
`n't`	`␣not`
`'re`	`␣are`
`'s`	`␣is`
`'d`	`␣would`
`'ll`	`␣will`
`'t`	`␣not`
`'ve`	`␣have`
`'m`	`␣am`

Example

import jiwer

sentences = ["she'll make sure you can't make it", "let's party!"]

print(jiwer.ExpandCommonEnglishContractions()(sentences))
# prints: ["she will make sure you can not make it", "let us party!"]

Source code in src/jiwer/transforms.py

class ExpandCommonEnglishContractions(AbstractTransform):
    """
    Replace common contractions such as `let's` to `let us`.

    Currently, this method will perform the following replacements. Note that `␣` is
     used to indicate a space (` `) to get around markdown rendering constraints.

    | Contraction   | transformed into |
    | ------------- |:----------------:|
    | `won't`       | `will not`       |
    | `can't`       | `can not`        |
    | `let's`       | `let us`         |
    | `n't`         | `␣not`           |
    | `'re`         | `␣are`           |
    | `'s`          | `␣is`            |
    | `'d`          | `␣would`         |
    | `'ll`         | `␣will`          |
    | `'t`          | `␣not`           |
    | `'ve`         | `␣have`          |
    | `'m`          | `␣am`            |

    Example:
        ```python
        import jiwer

        sentences = ["she'll make sure you can't make it", "let's party!"]

        print(jiwer.ExpandCommonEnglishContractions()(sentences))
        # prints: ["she will make sure you can not make it", "let us party!"]
        ```

    """

    def process_string(self, s: str):
        # definitely a non exhaustive list

        # specific words
        s = re.sub(r"won't", "will not", s)
        s = re.sub(r"can\'t", "can not", s)
        s = re.sub(r"let\'s", "let us", s)

        # general attachments
        s = re.sub(r"n\'t", " not", s)
        s = re.sub(r"\'re", " are", s)
        s = re.sub(r"\'s", " is", s)
        s = re.sub(r"\'d", " would", s)
        s = re.sub(r"\'ll", " will", s)
        s = re.sub(r"\'t", " not", s)
        s = re.sub(r"\'ve", " have", s)
        s = re.sub(r"\'m", " am", s)

        return s
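
The substitutions are case-sensitive and purely textual, which has two consequences worth seeing in action. The sketch below repeats the same `re.sub` calls in a standalone function (`expand` is a hypothetical name): a capitalized `Can't` misses the specific rule and falls through to the general `n't` rule, and a possessive `'s` is expanded as if it were `is`.

```python
import re


def expand(s: str) -> str:
    # Same substitutions, in the same order, as the listing above.
    s = re.sub(r"won't", "will not", s)
    s = re.sub(r"can\'t", "can not", s)
    s = re.sub(r"let\'s", "let us", s)
    s = re.sub(r"n\'t", " not", s)
    s = re.sub(r"\'re", " are", s)
    s = re.sub(r"\'s", " is", s)
    s = re.sub(r"\'d", " would", s)
    s = re.sub(r"\'ll", " will", s)
    s = re.sub(r"\'t", " not", s)
    s = re.sub(r"\'ve", " have", s)
    s = re.sub(r"\'m", " am", s)
    return s


print(expand("you can't go"))  # you can not go
print(expand("You Can't go"))  # You Ca not go
print(expand("john's book"))   # john is book
```

Lowercasing the input first (for example with ToLowerCase earlier in the composition) avoids the `Ca not` case.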

ReduceToListOfListOfChars #

Bases: AbstractTransform

Transforms a single input sentence, or a list of input sentences, into a list with lists of characters, which is the expected format for calculating the edit operations between two input sentences on a character-level.

A sentence is assumed to be a string. Each string is expected to contain only a single sentence.

Example

import jiwer

sentences = ["hi", "this is an example"]

print(jiwer.ReduceToListOfListOfChars()(sentences))
# prints: [['h', 'i'], ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', 'n', ' ', 'e', 'x', 'a', 'm', 'p', 'l', 'e']]

Source code in src/jiwer/transforms.py

class ReduceToListOfListOfChars(AbstractTransform):
    """
    Transforms a single input sentence, or a list of input sentences, into
    a list with lists of characters, which is the expected format for calculating the
    edit operations between two input sentences on a character-level.

    A sentence is assumed to be a string. Each string is expected to contain only a
    single sentence.

    Example:
        ```python
        import jiwer

        sentences = ["hi", "this is an example"]

        print(jiwer.ReduceToListOfListOfChars()(sentences))
        # prints: [['h', 'i'], ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', 'n', ' ', 'e', 'x', 'a', 'm', 'p', 'l', 'e']]
        ```
    """

    def process_string(self, s: str):
        return [[w for w in s]]

    def process_list(self, inp: List[str]):
        sentence_collection = []

        for sentence in inp:
            list_of_words = self.process_string(sentence)[0]

            sentence_collection.append(list_of_words)

        if len(sentence_collection) == 0:
            return [[]]

        return sentence_collection
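
Two edge cases follow from the listing above: a single string still yields a nested list, and an empty input list yields `[[]]` rather than `[]`. A minimal sketch with the two methods rewritten as standalone functions:

```python
from typing import List


def process_string(s: str) -> List[List[str]]:
    # A single sentence is wrapped so the result is always a list of lists.
    return [[c for c in s]]


def process_list(inp: List[str]) -> List[List[str]]:
    sentence_collection = [process_string(sentence)[0] for sentence in inp]
    if len(sentence_collection) == 0:
        return [[]]
    return sentence_collection


print(process_string("hi"))      # [['h', 'i']]
print(process_list(["hi", ""]))  # [['h', 'i'], []]
print(process_list([]))          # [[]]
```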

ReduceToListOfListOfWords #

Bases: AbstractTransform

Transforms a single input sentence, or a list of input sentences, into a list with lists of words, which is the expected format for calculating the edit operations between two input sentences on a word-level.

A sentence is assumed to be a string, where words are delimited by a token (such as ` `, space). Each string is expected to contain only a single sentence. Empty strings (no output) are removed from the list.

Example

import jiwer

sentences = ["hi", "this is an example"]

print(jiwer.ReduceToListOfListOfWords()(sentences))
# prints: [['hi'], ['this', 'is', 'an', 'example']]

Source code in src/jiwer/transforms.py

class ReduceToListOfListOfWords(AbstractTransform):
    """
    Transforms a single input sentence, or a list of input sentences, into
    a list with lists of words, which is the expected format for calculating the
    edit operations between two input sentences on a word-level.

    A sentence is assumed to be a string, where words are delimited by a token
    (such as ` `, space). Each string is expected to contain only a single sentence.
    Empty strings (no output) are removed from the list.

    Example:
        ```python
        import jiwer

        sentences = ["hi", "this is an example"]

        print(jiwer.ReduceToListOfListOfWords()(sentences))
        # prints: [['hi'], ['this', 'is', 'an', 'example']]
        ```
    """

    def __init__(self, word_delimiter: str = " "):
        """
        Args:
            word_delimiter: the character which delimits words. Default is ` ` (space).
        """
        self.word_delimiter = word_delimiter

    def process_string(self, s: str):
        return [[w for w in s.split(self.word_delimiter) if len(w) >= 1]]

    def process_list(self, inp: List[str]):
        sentence_collection = []

        for sentence in inp:
            list_of_words = self.process_string(sentence)[0]
            sentence_collection.append(list_of_words)

        if len(sentence_collection) == 0:
            return [[]]

        return sentence_collection
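
The splitting rule in `process_string` is `str.split` on the delimiter followed by dropping empty tokens. The sketch below isolates that rule in a hypothetical helper `split_sentence` to show how repeated delimiters and a custom delimiter behave; note that with the default delimiter a tab does not separate words.

```python
from typing import List


def split_sentence(s: str, word_delimiter: str = " ") -> List[str]:
    # Same expression as process_string above, without the outer list.
    return [w for w in s.split(word_delimiter) if len(w) >= 1]


print(split_sentence("  this  is an   example "))  # ['this', 'is', 'an', 'example']
print(split_sentence("a,,b", word_delimiter=","))  # ['a', 'b']
print(split_sentence("tab\tseparated words"))      # ['tab\tseparated', 'words']
```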

__init__ #

__init__(word_delimiter=' ')

Parameters:

Name	Type	Description	Default
`word_delimiter`	`str`	the character which delimits words. Default is ` ` (space).	`' '`

Source code in src/jiwer/transforms.py

def __init__(self, word_delimiter: str = " "):
    """
    Args:
        word_delimiter: the character which delimits words. Default is ` ` (space).
    """
    self.word_delimiter = word_delimiter

ReduceToSingleSentence #

Bases: AbstractTransform

Transforms multiple sentences into a single sentence. This operation can be useful when the number of reference and hypothesis sentences differs, and you want to do a minimal alignment over these lists. Note that this makes the result depend on the order of the sentences: wer([a, b], [a, b]) might not be equal to wer([b, a], [b, a]).

Example

import jiwer

sentences = ["hi", "this is an example"]

print(jiwer.ReduceToSingleSentence()(sentences))
# prints: ['hi this is an example']

Source code in src/jiwer/transforms.py

class ReduceToSingleSentence(AbstractTransform):
    """
    Transforms multiple sentences into a single sentence.
    This operation can be useful when the number of reference and hypothesis sentences
    differ, and you want to do a minimal alignment over these lists.
    Note that this creates an invariance: `wer([a, b], [a, b])` might not be equal to
    `wer([b, a], [b, a])`.

    Example:
        ```python3
        import jiwer

        sentences = ["hi", "this is an example"]

        print(jiwer.ReduceToSingleSentence()(sentences))
        # prints: ['hi this is an example']
        ```
    """

    def __init__(self, word_delimiter: str = " "):
        """
        :param word_delimiter: the character which delimits words. Default is ` ` (space).
        """
        self.word_delimiter = word_delimiter

    def process_string(self, s: str):
        return s

    def process_list(self, inp: List[str]):
        filtered_inp = [i for i in inp if len(i) >= 1]

        if len(filtered_inp) == 0:
            return []
        else:
            return ["{}".format(self.word_delimiter).join(filtered_inp)]
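
As the listing shows, empty strings are dropped before joining, and a list with nothing left produces `[]`. A minimal sketch of `process_list` as a standalone function (`reduce_to_single_sentence` is a hypothetical name):

```python
from typing import List


def reduce_to_single_sentence(inp: List[str], word_delimiter: str = " ") -> List[str]:
    # Drop empty strings, then join what remains into one sentence.
    filtered_inp = [i for i in inp if len(i) >= 1]
    if len(filtered_inp) == 0:
        return []
    return [word_delimiter.join(filtered_inp)]


print(reduce_to_single_sentence(["hi", "", "this is an example"]))
# ['hi this is an example']
print(reduce_to_single_sentence(["", ""]))                       # []
print(reduce_to_single_sentence(["a", "b"], word_delimiter="|"))  # ['a|b']
```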

__init__ #

__init__(word_delimiter=' ')

Parameters:

Name	Type	Description	Default
`word_delimiter`	`str`	the character which delimits words. Default is ` ` (space).	`' '`

Source code in src/jiwer/transforms.py

def __init__(self, word_delimiter: str = " "):
    """
    :param word_delimiter: the character which delimits words. Default is ` ` (space).
    """
    self.word_delimiter = word_delimiter

RemoveEmptyStrings #

Bases: AbstractTransform

Remove empty strings, including strings that contain only whitespace, from a list of strings.

Example

import jiwer

sentences = ["", "this is an example", " ",  "                "]

print(jiwer.RemoveEmptyStrings()(sentences))
# prints: ['this is an example']

Source code in src/jiwer/transforms.py

class RemoveEmptyStrings(AbstractTransform):
    """
    Remove empty strings from a list of strings.

    Example:
        ```python
        import jiwer

        sentences = ["", "this is an example", " ",  "                "]

        print(jiwer.RemoveEmptyStrings()(sentences))
        # prints: ['this is an example']
        ```
    """

    def process_string(self, s: str):
        return s.strip()

    def process_list(self, inp: List[str]):
        return [s for s in inp if self.process_string(s) != ""]
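
Two details of the listing above are easy to miss: strings that survive the filter are returned unstripped, and a single string input is stripped rather than removed. A minimal sketch with the two methods as standalone functions:

```python
from typing import List


def process_string(s: str) -> str:
    return s.strip()


def process_list(inp: List[str]) -> List[str]:
    # strip() is only used for the emptiness test; kept strings are unchanged.
    return [s for s in inp if process_string(s) != ""]


print(process_list(["", " padded ", "\t\n"]))  # [' padded ']
print(repr(process_string("   ")))             # ''
```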

RemoveKaldiNonWords #

Bases: AbstractTransform

Remove any word between [] and <>. This can be useful when working with hypotheses from the Kaldi project, which can output non-words such as [laugh] and <unk>.

Example

import jiwer

sentences = ["you <unk> like [laugh]"]

print(jiwer.RemoveKaldiNonWords()(sentences))

# prints: ["you  like "]
# note the extra spaces

Source code in src/jiwer/transforms.py

class RemoveKaldiNonWords(AbstractTransform):
    """
    Remove any word between `[]` and `<>`. This can be useful when working
    with hypotheses from the Kaldi project, which can output non-words such as
    `[laugh]` and `<unk>`.

    Example:
        ```python
        import jiwer

        sentences = ["you <unk> like [laugh]"]

        print(jiwer.RemoveKaldiNonWords()(sentences))

        # prints: ["you  like "]
        # note the extra spaces
        ```
    """

    def process_string(self, s: str):
        return re.sub(r"[<\[][^>\]]*[>\]]", "", s)
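
The regular expression reads as: an opening `<` or `[`, any run of characters that are neither `>` nor `]`, then a closing `>` or `]`. Since the opening and closing classes are independent, mismatched pairs are removed too, while an unclosed tag is left alone. A short sketch using the same pattern in a hypothetical standalone function:

```python
import re

PATTERN = r"[<\[][^>\]]*[>\]]"


def remove_kaldi_non_words(s: str) -> str:
    return re.sub(PATTERN, "", s)


print(repr(remove_kaldi_non_words("you <unk> like [laugh]")))  # 'you  like '
print(repr(remove_kaldi_non_words("mixed <noise] tag")))       # 'mixed  tag'
print(repr(remove_kaldi_non_words("unclosed <unk")))           # 'unclosed <unk'
```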

RemoveMultipleSpaces #

Bases: AbstractTransform

Filter out multiple spaces between words.

Example

import jiwer

sentences = ["this is   an   example ", "  hello goodbye  ", "  "]

print(jiwer.RemoveMultipleSpaces()(sentences))
# prints: ['this is an example ', " hello goodbye ", " "]
# note that there are still trailing spaces

Source code in src/jiwer/transforms.py

class RemoveMultipleSpaces(AbstractTransform):
    """
    Filter out multiple spaces between words.

    Example:
        ```python
        import jiwer

        sentences = ["this is   an   example ", "  hello goodbye  ", "  "]

        print(jiwer.RemoveMultipleSpaces()(sentences))
        # prints: ['this is an example ', " hello goodbye ", " "]
        # note that there are still trailing spaces
        ```

    """

    def process_string(self, s: str):
        return re.sub(r"\s\s+", " ", s)

    def process_list(self, inp: List[str]):
        return [self.process_string(s) for s in inp]
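
The pattern `\s\s+` matches any run of two or more whitespace characters, not only spaces, and replaces the run with one space. A lone tab or newline is therefore left untouched. A short sketch with the same `re.sub` call in a hypothetical standalone function:

```python
import re


def remove_multiple_spaces(s: str) -> str:
    return re.sub(r"\s\s+", " ", s)


print(repr(remove_multiple_spaces("this is   an   example ")))  # 'this is an example '
print(repr(remove_multiple_spaces("line one \n\t line two")))   # 'line one line two'
print(repr(remove_multiple_spaces("a\tb")))                     # 'a\tb'
```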

RemovePunctuation #

Bases: BaseRemoveTransform

This transform filters out punctuation. The punctuation characters are defined as all unicode characters whose category name starts with P. See https://www.unicode.org/reports/tr44/#General_Category_Values for more information.

Example

import jiwer

sentences = ["this is an example!", "hello. goodbye"]

print(jiwer.RemovePunctuation()(sentences))
# prints: ['this is an example', "hello goodbye"]

Source code in src/jiwer/transforms.py

class RemovePunctuation(BaseRemoveTransform):
    """
    This transform filters out punctuation. The punctuation characters are defined as
    all unicode characters whose category name starts with `P`.
    See [here](https://www.unicode.org/reports/tr44/#General_Category_Values) for more
    information.
    Example:
        ```python
        import jiwer

        sentences = ["this is an example!", "hello. goodbye"]

        print(jiwer.RemovePunctuation()(sentences))
        # prints: ['this is an example', "hello goodbye"]
        ```
    """

    def __init__(self):
        punctuation_characters = _get_punctuation_characters()
        super().__init__(punctuation_characters)
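
The helper `_get_punctuation_characters` is not shown on this page. One way to build such a set, offered here only as an assumption about what it might do, is to scan all code points with the standard `unicodedata` module and keep those whose category starts with `P`. The names `punctuation_characters` and `remove_punctuation` are hypothetical.

```python
import sys
import unicodedata


def punctuation_characters() -> set:
    # Hypothetical helper: every code point whose Unicode category starts with "P".
    return {
        chr(i)
        for i in range(sys.maxunicode + 1)
        if unicodedata.category(chr(i)).startswith("P")
    }


PUNCTUATION = punctuation_characters()


def remove_punctuation(s: str) -> str:
    return "".join(c for c in s if c not in PUNCTUATION)


print(remove_punctuation("this is an example!"))  # this is an example
print(remove_punctuation("¿qué?"))                # qué
print(remove_punctuation("$5 + 5"))               # $5 + 5
```

Under this definition symbols such as `$` (category `Sc`) and `+` (category `Sm`) are not punctuation, while the apostrophe (category `Po`) is.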

RemoveSpecificWords #

Bases: SubstituteWords

Can be used to filter out certain words. As words are replaced with a ` ` character, make sure that RemoveMultipleSpaces, Strip() and RemoveEmptyStrings are present in the composition after RemoveSpecificWords.

Example

import jiwer

sentences = ["yhe awesome", "the apple is not a pear", "yhe"]

print(jiwer.RemoveSpecificWords(["yhe", "the", "a"])(sentences))
# prints: ['  awesome', '  apple is not   pear', ' ']
# note the extra spaces

Source code in src/jiwer/transforms.py

class RemoveSpecificWords(SubstituteWords):
    """
    Can be used to filter out certain words.
    As words are replaced with a ` ` character, make sure that
    `RemoveMultipleSpaces`, `Strip()` and `RemoveEmptyStrings` are present
    in the composition _after_ `RemoveSpecificWords`.

    Example:
        ```python
        import jiwer

        sentences = ["yhe awesome", "the apple is not a pear", "yhe"]

        print(jiwer.RemoveSpecificWords(["yhe", "the", "a"])(sentences))
        # prints: ['  awesome', '  apple is not   pear', ' ']
        # note the extra spaces
        ```
    """

    def __init__(self, words_to_remove: List[str]):
        """
        Args:
            words_to_remove: List of words to remove.
        """
        mapping = {word: " " for word in words_to_remove}

        super().__init__(mapping)
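
The recommended clean-up after removing words can be sketched without jiwer as follows. The functions `remove_words` and `clean` are hypothetical; each step is a plain-Python equivalent of the transform named in its comment, based on the source listings on this page.

```python
import re
from typing import List


def remove_words(s: str, words: List[str]) -> str:
    # Whole-word replacement by a single space, as in SubstituteWords.
    for word in words:
        s = re.sub(r"\b{}\b".format(re.escape(word)), " ", s)
    return s


def clean(sentences: List[str], words: List[str]) -> List[str]:
    removed = [remove_words(s, words) for s in sentences]    # RemoveSpecificWords
    collapsed = [re.sub(r"\s\s+", " ", s) for s in removed]  # RemoveMultipleSpaces
    stripped = [s.strip() for s in collapsed]                # Strip
    return [s for s in stripped if s != ""]                  # RemoveEmptyStrings


sentences = ["yhe awesome", "the apple is not a pear", "yhe"]
print(clean(sentences, ["yhe", "the", "a"]))
# ['awesome', 'apple is not pear']
```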

__init__ #

__init__(words_to_remove)

Parameters:

Name	Type	Description	Default
`words_to_remove`	`List[str]`	List of words to remove.	required

Source code in src/jiwer/transforms.py

def __init__(self, words_to_remove: List[str]):
    """
    Args:
        words_to_remove: List of words to remove.
    """
    mapping = {word: " " for word in words_to_remove}

    super().__init__(mapping)

RemoveWhiteSpace #

Bases: BaseRemoveTransform

This transform filters out white space characters. Note that by default space (` `) is also removed, which will make it impossible to split a sentence into a list of words by using ReduceToListOfListOfWords or ReduceToSingleSentence. This can be prevented by replacing all whitespace with the space character. If so, make sure that jiwer.RemoveMultipleSpaces, Strip() and RemoveEmptyStrings are present in the composition after RemoveWhiteSpace.

Example

import jiwer

sentences = ["this is an example", "hello world\t"]

print(jiwer.RemoveWhiteSpace()(sentences))
# prints: ["thisisanexample", "helloworld"]

print(jiwer.RemoveWhiteSpace(replace_by_space=True)(sentences))
# prints: ["this is an example", "hello world  "]
# note the trailing spaces

Source code in src/jiwer/transforms.py

class RemoveWhiteSpace(BaseRemoveTransform):
    """
    This transform filters out white space characters.
    Note that by default space (` `) is also removed, which will make it impossible to
    split a sentence into a list of words by using `ReduceToListOfListOfWords` or
    `ReduceToSingleSentence`.
    This can be prevented by replacing all whitespace with the space character.
    If so, make sure that `jiwer.RemoveMultipleSpaces`,
    `Strip()` and `RemoveEmptyStrings` are present in the composition _after_
    `RemoveWhiteSpace`.

    Example:
        ```python
        import jiwer

        sentences = ["this is an example", "hello world\t"]

        print(jiwer.RemoveWhiteSpace()(sentences))
        # prints: ["thisisanexample", "helloworld"]

        print(jiwer.RemoveWhiteSpace(replace_by_space=True)(sentences))
        # prints: ["this is an example", "hello world  "]
        # note the trailing spaces
        ```
    """

    def __init__(self, replace_by_space: bool = False):
        """

        Args:
            replace_by_space: every white space character is replaced with a space (` `)
        """
        characters = [c for c in string.whitespace]

        if replace_by_space:
            replace_token = " "
        else:
            replace_token = ""

        super().__init__(characters, replace_token=replace_token)
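
The characters removed are those in Python's `string.whitespace`. The base class BaseRemoveTransform is not shown on this page, so the sketch below assumes that it replaces each listed character independently with the replacement token; `remove_white_space` is a hypothetical function name.

```python
import string


def remove_white_space(s: str, replace_by_space: bool = False) -> str:
    # Assumption: each whitespace character is replaced independently.
    replace_token = " " if replace_by_space else ""
    for c in string.whitespace:
        s = s.replace(c, replace_token)
    return s


print(repr(string.whitespace))                                 # ' \t\n\r\x0b\x0c'
print(remove_white_space("a\tb\nc d"))                         # abcd
print(remove_white_space("a\tb\nc d", replace_by_space=True))  # a b c d
```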

__init__ #

__init__(replace_by_space=False)

Parameters:

Name	Type	Description	Default
`replace_by_space`	`bool`	every white space character is replaced with a space (` `)	`False`

Source code in src/jiwer/transforms.py

def __init__(self, replace_by_space: bool = False):
    """

    Args:
        replace_by_space: every white space character is replaced with a space (` `)
    """
    characters = [c for c in string.whitespace]

    if replace_by_space:
        replace_token = " "
    else:
        replace_token = ""

    super().__init__(characters, replace_token=replace_token)

Strip #

Bases: AbstractTransform

Removes all leading and trailing spaces.

Example

import jiwer

sentences = [" this is an example ", "  hello goodbye  ", "  "]

print(jiwer.Strip()(sentences))
# prints: ['this is an example', "hello goodbye", ""]
# note that there is an empty string left behind which might need to be cleaned up

Source code in src/jiwer/transforms.py

class Strip(AbstractTransform):
    """
    Removes all leading and trailing spaces.

    Example:
        ```python
        import jiwer

        sentences = [" this is an example ", "  hello goodbye  ", "  "]

        print(jiwer.Strip()(sentences))
        # prints: ['this is an example', "hello goodbye", ""]
        # note that there is an empty string left behind which might need to be cleaned up
        ```
    """

    def process_string(self, s: str):
        return s.strip()

SubstituteRegexes #

Bases: AbstractTransform

Transform strings by replacing every substring that matches a regular expression with another substring.

Example

import jiwer

sentences = ["is the world doomed or loved?", "edibles are allegedly cultivated"]

# note: the regex string "\b(\w+)ed\b", matches every word ending in 'ed',
# and "\1" stands for the first group ('\w+). It therefore removes 'ed' in every match.
print(jiwer.SubstituteRegexes({r"doom": r"sacr", r"\b(\w+)ed\b": r"\1"})(sentences))

# prints: ["is the world sacr or lov?", "edibles are allegedly cultivat"]

Source code in src/jiwer/transforms.py

class SubstituteRegexes(AbstractTransform):
    r"""
    Transform strings by substituting substrings matching regex expressions into
    another substring.

    Example:
        ```python
        import jiwer

        sentences = ["is the world doomed or loved?", "edibles are allegedly cultivated"]

        # note: the regex string "\b(\w+)ed\b", matches every word ending in 'ed',
        # and "\1" stands for the first group ('\w+). It therefore removes 'ed' in every match.
        print(jiwer.SubstituteRegexes({r"doom": r"sacr", r"\b(\w+)ed\b": r"\1"})(sentences))

        # prints: ["is the world sacr or lov?", "edibles are allegedly cultivat"]
        ```
    """

    def __init__(self, substitutions: Mapping[str, str]):
        """

        Args:
            substitutions: a mapping of regex expressions to replacement strings.
        """
        self.substitutions = substitutions

    def process_string(self, s: str):
        for key, value in self.substitutions.items():
            s = re.sub(key, value, s)

        return s
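
Because `process_string` applies the substitutions one after another in the mapping's iteration order, the output of an earlier rule is visible to later rules. A minimal sketch with a hypothetical standalone function `substitute` shows that the same two rules give different results in a different order:

```python
import re
from typing import Mapping


def substitute(s: str, substitutions: Mapping[str, str]) -> str:
    # Same loop as process_string above.
    for key, value in substitutions.items():
        s = re.sub(key, value, s)
    return s


print(substitute("ab", {r"a": r"b", r"b": r"c"}))  # cc
print(substitute("ab", {r"b": r"c", r"a": r"b"}))  # bc
```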

__init__ #

__init__(substitutions)

Parameters:

Name	Type	Description	Default
`substitutions`	`Mapping[str, str]`	a mapping of regex expressions to replacement strings.	required

Source code in src/jiwer/transforms.py

def __init__(self, substitutions: Mapping[str, str]):
    """

    Args:
        substitutions: a mapping of regex expressions to replacement strings.
    """
    self.substitutions = substitutions

SubstituteWords #

Bases: AbstractTransform

This transform can be used to replace a word with another word. Note that the whole word is matched. If the word you're attempting to substitute is a substring of another word, it will not be affected. For example, if you're substituting foo with bar, the word foobar will NOT be turned into barbar.

Example

import jiwer

sentences = ["you're pretty", "your book", "foobar"]

print(jiwer.SubstituteWords({"pretty": "awesome", "you": "i", "'re": " am", 'foo': 'bar'})(sentences))

# prints: ["i am awesome", "your book", "foobar"]

Source code in src/jiwer/transforms.py

class SubstituteWords(AbstractTransform):
    """
    This transform can be used to replace a word into another word.
    Note that the whole word is matched. If the word you're attempting to substitute
    is a substring of another word it will not be affected.
    For example, if you're substituting `foo` into `bar`, the word `foobar` will NOT
    be substituted into `barbar`.

    Example:
        ```python
        import jiwer

        sentences = ["you're pretty", "your book", "foobar"]

        print(jiwer.SubstituteWords({"pretty": "awesome", "you": "i", "'re": " am", 'foo': 'bar'})(sentences))

        # prints: ["i am awesome", "your book", "foobar"]
        ```

    """

    def __init__(self, substitutions: Mapping[str, str]):
        """
        Args:
            substitutions: A mapping of words to replacement words.
        """
        self.substitutions = substitutions

    def process_string(self, s: str):
        for key, value in self.substitutions.items():
            s = re.sub(r"\b{}\b".format(re.escape(key)), value, s)

        return s
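
The whole-word behaviour comes from wrapping each key in `\b` word boundaries, and `re.escape` ensures that characters such as `.` in a key are matched literally. Matching is also case-sensitive. A short sketch with the same loop in a hypothetical standalone function:

```python
import re
from typing import Mapping


def substitute_words(s: str, substitutions: Mapping[str, str]) -> str:
    # Same loop as process_string above.
    for key, value in substitutions.items():
        s = re.sub(r"\b{}\b".format(re.escape(key)), value, s)
    return s


print(substitute_words("foo foobar", {"foo": "bar"}))      # bar foobar
print(substitute_words("costs 3.5 or 345", {"3.5": "x"}))  # costs x or 345
print(substitute_words("Foo", {"foo": "bar"}))             # Foo
```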

__init__ #

__init__(substitutions)

Parameters:

Name	Type	Description	Default
`substitutions`	`Mapping[str, str]`	A mapping of words to replacement words.	required

Source code in src/jiwer/transforms.py

def __init__(self, substitutions: Mapping[str, str]):
    """
    Args:
        substitutions: A mapping of words to replacement words.
    """
    self.substitutions = substitutions

ToLowerCase #

Bases: AbstractTransform

Convert every character into lowercase.

Example

import jiwer

sentences = ["You're PRETTY"]

print(jiwer.ToLowerCase()(sentences))

# prints: ["you're pretty"]

Source code in src/jiwer/transforms.py

class ToLowerCase(AbstractTransform):
    """
    Convert every character into lowercase.
    Example:
        ```python
        import jiwer

        sentences = ["You're PRETTY"]

        print(jiwer.ToLowerCase()(sentences))

        # prints: ["you're pretty"]
        ```
    """

    def process_string(self, s: str):
        return s.lower()

ToUpperCase #

Bases: AbstractTransform

Convert every character to uppercase.

Example

import jiwer

sentences = ["You're amazing"]

print(jiwer.ToUpperCase()(sentences))

# prints: ["YOU'RE AMAZING"]

Source code in src/jiwer/transforms.py

class ToUpperCase(AbstractTransform):
    """
    Convert every character to uppercase.

    Example:
        ```python
        import jiwer

        sentences = ["You're amazing"]

        print(jiwer.ToUpperCase()(sentences))

        # prints: ["YOU'RE AMAZING"]
        ```
    """

    def process_string(self, s: str):
        return s.upper()
