Complex Setswana Parts of Speech Tagging


Journal article


G. Malema, Boago Okgetheng, Bopaki Tebalo, Moffat Motlhanka, Goaletsa Rammidi
RAIL, 2020

Cite

APA
Malema, G., Okgetheng, B., Tebalo, B., Motlhanka, M., & Rammidi, G. (2020). Complex Setswana Parts of Speech Tagging. RAIL.


Chicago/Turabian
Malema, G., Boago Okgetheng, Bopaki Tebalo, Moffat Motlhanka, and Goaletsa Rammidi. “Complex Setswana Parts of Speech Tagging.” RAIL (2020).


MLA
Malema, G., et al. “Complex Setswana Parts of Speech Tagging.” RAIL, 2020.


BibTeX

@article{g2020a,
  title = {Complex Setswana Parts of Speech Tagging},
  year = {2020},
  journal = {RAIL},
  author = {Malema, G. and Okgetheng, Boago and Tebalo, Bopaki and Motlhanka, Moffat and Rammidi, Goaletsa}
}

Abstract

Setswana is one of the Bantu languages written disjunctively. Some of its parts of speech, such as qualificatives and some adverbs, consist of a group of words rather than a single word. This disjunctive style of writing poses a challenge for both tokenization and tagging. A few studies have addressed the identification of multi-word parts of speech. In this study we go further and tokenize complex parts of speech, which are formed by extending the basic forms of multi-word parts of speech: further parts of speech are recursively concatenated to a basic form. We developed rules for building complex relative parts of speech. A morphological analyzer and Python NLTK are used to tag individual words and the basic forms of multi-word parts of speech; the developed rules are then used to identify complex parts of speech. Results on 300 sentences of text give a performance of 74%. The tagger fails when it encounters expansion rules that are not implemented and when the tagging by the morphological analyzer is incorrect.
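
The extension step described in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation: the tag names (`REL`, `ADV`, `NOUN`, `COMPLEX_REL`), the set of tags allowed to extend a relative, and the function `merge_complex` are all hypothetical, since the abstract does not list the actual tagset or rules. The input is assumed to be a sentence already tagged at the level of single words and basic multi-word forms, and placeholder tokens (`w1`, `w2`, ...) stand in for Setswana words. The repeated extension is written as a loop rather than as literal recursion.

```python
# Hypothetical tags; the paper's real tagset and expansion rules are not given in the abstract.
BASIC = "REL"                        # a basic multi-word relative, already grouped
EXTENDERS = {"ADV", "NOUN", "REL"}   # assumed tags that may extend a relative
COMPLEX = "COMPLEX_REL"              # label for an extended (complex) relative


def merge_complex(tagged):
    """Absorb every extender that directly follows a basic relative.

    `tagged` is a list of (text, tag) pairs. Each basic relative keeps
    swallowing the next token while that token's tag is in EXTENDERS.
    """
    merged = []
    i = 0
    while i < len(tagged):
        text, tag = tagged[i]
        if tag != BASIC:
            merged.append((text, tag))
            i += 1
            continue
        parts = [text]
        j = i + 1
        while j < len(tagged) and tagged[j][1] in EXTENDERS:
            parts.append(tagged[j][0])
            j += 1
        # Relabel only if at least one extender was absorbed.
        new_tag = COMPLEX if j > i + 1 else BASIC
        merged.append((" ".join(parts), new_tag))
        i = j
    return merged


sentence = [("w1", "NOUN"), ("w2 w3", "REL"), ("w4", "ADV"),
            ("w5 w6", "REL"), ("w7", "VERB")]
print(merge_complex(sentence))
```

In this sketch a sentence fails in the same two ways the abstract reports: an extender tag missing from `EXTENDERS` stops the merge early, and a wrong tag on an input token leads to a wrong grouping.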

