Introduction of fullstop-punctuation-multilang-large
Model Details of fullstop-punctuation-multilang-large
This model predicts the punctuation of English, Italian, French and German texts. We developed it to restore the punctuation of transcribed spoken language.
This multilanguage model was trained on the Europarl Dataset provided by the SEPP-NLG Shared Task.
Please note that this dataset consists of political speeches. Therefore the model might perform differently on texts from other domains.
The model restores the following punctuation markers:
"." "," "?" "-" ":"
Sample Code
We provide a simple Python package that allows you to process text of any length.
from deepmultilingualpunctuation import PunctuationModel
model = PunctuationModel()
text = "My name is Clara and I live in Berkeley California Ist das eine Frage Frau Müller"
result = model.restore_punctuation(text)
print(result)
Output
My name is Clara and I live in Berkeley, California. Ist das eine Frage, Frau Müller?
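The package handles long inputs for you. As a rough illustration of the general idea only, and not the package's actual implementation, a long word sequence can be split into overlapping windows so that each window stays within a model's input limit. The function name `overlapping_chunks` and the default `chunk_size` and `overlap` values are hypothetical and chosen for this sketch.

```python
from typing import Iterator, List


def overlapping_chunks(
    words: List[str], chunk_size: int = 200, overlap: int = 5
) -> Iterator[List[str]]:
    """Yield windows of at most `chunk_size` words.

    Each window repeats the last `overlap` words of the previous one, so a
    model sees some context on both sides of every window boundary.
    The default values are illustrative assumptions, not the package's settings.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    start = 0
    while start < len(words):
        yield words[start:start + chunk_size]
        if start + chunk_size >= len(words):
            break  # the current window already reached the end of the text
        start += step


# Small demonstration with tiny windows so the overlap is easy to see.
demo_words = "my name is clara and i live in berkeley california".split()
for chunk in overlapping_chunks(demo_words, chunk_size=4, overlap=1):
    print(chunk)
```

With `PunctuationModel.restore_punctuation` none of this is needed in user code; the sketch only shows why input length does not have to be a constraint.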
Predict Labels
from deepmultilingualpunctuation import PunctuationModel
model = PunctuationModel()
text = "My name is Clara and I live in Berkeley California Ist das eine Frage Frau Müller"
clean_text = model.preprocess(text)
labeled_words = model.predict(clean_text)
print(labeled_words)
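The predicted labels can be post-processed before they are merged back into text. The sketch below assumes that `predict` returns one `[word, label, score]` entry per word and that the label `"0"` means "no punctuation after this word"; check the output of your installed version before relying on this. The helper `join_labeled_words` and its `min_score` parameter are hypothetical names introduced for this example, and the scores in the sample data are made up.

```python
from typing import Sequence

NO_PUNCT = "0"  # assumed label for "no punctuation follows this word"


def join_labeled_words(labeled: Sequence[Sequence], min_score: float = 0.0) -> str:
    """Rebuild a string from (word, label, score) entries.

    A predicted marker is appended to its word only if the score is at
    least `min_score`; otherwise the word is kept without punctuation.
    """
    parts = []
    for word, label, score in labeled:
        if label != NO_PUNCT and score >= min_score:
            parts.append(word + label)
        else:
            parts.append(word)
    return " ".join(parts)


# Hand-written sample in the assumed format (scores are invented).
sample = [
    ["Berkeley", ",", 0.90],
    ["California", ".", 0.95],
    ["Ist", "0", 0.99],
]
print(join_labeled_words(sample))
print(join_labeled_words(sample, min_score=0.92))
```

Raising `min_score` trades recall for precision: fewer markers are inserted, but the ones that remain are those the model was more confident about.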
Performance varies by punctuation marker, because hyphens and colons are often optional and can be replaced by either a comma or a full stop. The model achieves the following F1 scores for the different languages:
@inproceedings{guhr-EtAl:2021:fullstop,
title={FullStop: Multilingual Deep Models for Punctuation Prediction},
author = {Guhr, Oliver and Schumann, Anne-Kathrin and Bahrmann, Frank and Böhme, Hans Joachim},
booktitle = {Proceedings of the Swiss Text Analytics Conference 2021},
month = {June},
year = {2021},
address = {Winterthur, Switzerland},
publisher = {CEUR Workshop Proceedings},
url = {http://ceur-ws.org/Vol-2957/sepp_paper4.pdf}
}