Type: Package
Title: Syllabifier for CMU Dictionary Transcriptions
Version: 0.1.1
Author: Josef Fruehwald
Maintainer: Josef Fruehwald <jofrhwld@gmail.com>
Description: Implements tidy syllabification of transcription. Based on @kylebgorman's 'python' implementation https://github.com/kylebgorman/syllabify.
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.1
Suggests: testthat
License: MIT + file LICENSE
Imports: dplyr, purrr, stringr, tibble, tidyr
Depends: R (≥ 2.10)
NeedsCompilation: no
Packaged: 2020-10-24 15:25:36 UTC; joseffruehwald
Repository: CRAN
Date/Publication: 2020-10-24 15:40:02 UTC

make onset indices

Description

make onset indices

Usage

make_onset_indices(nuclei_indices)

CMU pronunciation check

Description

CMU pronunciation check

Usage

pronunciation_check_cmu(pron)

Syllabify

Description

Takes a CMU dictionary transcription as input and returns its syllabification as a data frame.

Usage

syllabify(pron, alaska_rule = TRUE)

Arguments

pron

The CMU dictionary pronunciation, either as a vector, or a string with labels separated by spaces

alaska_rule

Logical. If TRUE (the default), don't maximize the onset in lax vowel + s sequences
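
The two accepted forms of pron differ only in whether the labels have already been separated. As a small base R illustration (it does not call the package, and the variable names are just for this example), splitting the string form on single spaces yields the vector form:

```r
# The string form used in the examples on this page
pron_string <- "AO0 S T R EY1 L Y AH0"

# Splitting on single spaces gives one label per element
pron_vector <- strsplit(pron_string, " ", fixed = TRUE)[[1]]

# Joining the labels with spaces gives the string form back
pron_rejoined <- paste(pron_vector, collapse = " ")
```

Either form should describe the same pronunciation, so which one to pass is mostly a matter of how the transcriptions are stored.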

Value

Returns a data frame with the following columns

syll

A numeric index for each syllable

part

What part of the syllable each phone belongs to

phone

The phone label from the transcription

stress

The syllable stress

Examples

# String input
syllabify("AO0 S T R EY1 L Y AH0")

# Vector input
syllabify(c("AO0", "S", "T", "R", "EY1", "L", "Y", "AH0"))

# Hiatus
syllabify("HH AY0 EY1 T AH0 S")

# Deficient transcriptions (has warning)
syllabify(c("M"))
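
A result with the four documented columns can be summarised with ordinary data frame tools. The sketch below uses a hand-built data frame as a stand-in, not actual package output. It assumes that the part column uses the labels onset, nucleus and coda, that the phone column holds labels without the stress digit, and that stress is stored as a character; check a real result before relying on any of these.

```r
# Hand-built stand-in for a syllabify() result; NOT actual package output.
# Assumptions: 'part' uses onset/nucleus/coda, 'phone' has no stress digit,
# and 'stress' is a character column.
syll_df <- data.frame(
  syll   = c(1, 1, 2, 2, 3, 3),
  part   = c("onset", "nucleus", "onset", "nucleus", "onset", "nucleus"),
  phone  = c("B", "AH", "N", "AE", "N", "AH"),
  stress = c("0", "0", "1", "1", "0", "0"),
  stringsAsFactors = FALSE
)

# Number of syllables
n_syll <- length(unique(syll_df$syll))

# Phones of each syllable, joined into one string per syllable
syll_strings <- tapply(syll_df$phone, syll_df$syll, paste, collapse = " ")

# Index of the syllable carrying primary stress ("1" in CMU notation)
primary <- unique(syll_df$syll[syll_df$stress == "1"])
```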

Syllabify to a list

Description

Takes a CMU dictionary transcription as input and returns its syllabification as a list.

Usage

syllabify_list(pron, alaska_rule = TRUE)

Arguments

pron

The CMU dictionary pronunciation, either as a vector, or a string with labels separated by spaces

alaska_rule

Logical. If TRUE (the default), don't maximize the onset in lax vowel + s sequences

Value

A list with one value per syllable. Each value is itself a list with three values: onset, nucleus, and coda. Each contains a vector of the phones that belong to that constituent part of the syllable. Any empty constituent part has the value character(0).

Examples

# String input
syllabify_list("AO0 S T R EY1 L Y AH0")

# Vector input
syllabify_list(c("AO0", "S", "T", "R", "EY1", "L", "Y", "AH0"))

# Hiatus
syllabify_list("HH AY0 EY1 T AH0 S")

# Deficient transcriptions (has warning)
syllabify_list(c("M"))
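
The nested list described under Value can be handled with base R list tools. The sketch below builds such a list by hand as a stand-in, not actual package output, and assumes that the nucleus keeps its stress digit. It shows how empty constituents stored as character(0) behave when counting or flattening.

```r
# Hand-built stand-in for a syllabify_list() result; NOT actual package output.
# Assumption: the nucleus keeps its stress digit.
sylls <- list(
  list(onset = "B", nucleus = "AH0", coda = character(0)),
  list(onset = "N", nucleus = "AE1", coda = character(0)),
  list(onset = "N", nucleus = "AH0", coda = character(0))
)

# One list element per syllable
n_syll <- length(sylls)

# Which syllables are open, i.e. have an empty coda?
is_open <- vapply(sylls, function(s) length(s$coda) == 0, logical(1))

# Flatten back to a single vector of phones; character(0) parts vanish
phones <- unlist(lapply(sylls, function(s) c(s$onset, s$nucleus, s$coda)))
```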

syllabify test dict

Description

trans

CMU transcription

word

The word being transcribed

Syllabify: A package for doing tidy syllabification

Description

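This is a package to do tidy syllabification of phonetic transcriptions. The syllabifier "maximizes onset": consonants between two vowels are assigned to the onset of the following syllable wherever they form a permissible onset, rather than to the coda of the preceding syllable. The algorithmic approach to this is adapted from Kyle Gorman's Python implementation (https://github.com/kylebgorman/syllabify).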
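
To make onset maximization concrete, here is a minimal base R sketch of the core step. It is not the package's own code. The helper split_cluster() and the tiny legal_onsets inventory are hypothetical and exist only for this illustration. A real syllabifier needs a much fuller onset inventory and extra handling such as the alaska_rule option.

```r
# Hypothetical, deliberately tiny inventory of legal onsets, written as
# space-joined phone labels. "" stands for the empty onset.
legal_onsets <- c("", "B", "K", "L", "N", "R", "S", "T", "S T", "T R", "S T R")

# Split the consonants found between two nuclei into coda and onset,
# giving the onset the longest suffix that is a legal onset.
split_cluster <- function(cluster) {
  n <- length(cluster)
  for (n_coda in 0:n) {
    # Try the shortest coda first, so the onset is as long as possible
    onset <- utils::tail(cluster, n - n_coda)
    if (paste(onset, collapse = " ") %in% legal_onsets) {
      return(list(coda = utils::head(cluster, n_coda), onset = onset))
    }
  }
}

# "S T R" is a legal onset here, so the whole cluster goes to the onset
full_onset <- split_cluster(c("S", "T", "R"))

# "K S T R" is not, so "K" stays behind as the coda
split_onset <- split_cluster(c("K", "S", "T", "R"))
```

Because the empty onset is always in the inventory, the loop is guaranteed to return by the time the whole cluster has been assigned to the coda.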

Functions

The key function is syllabify(). Given a CMU transcription, it will return a tibble. See ?syllabify() for more info.

Also available is syllabify_list(), which returns the syllables as a list instead. See ?syllabify_list() for more info.
