openai / whisper

Convert speech in audio to text

replicate.com
Total runs: 62.0M
24-hour runs: 0
7-day runs: 1.8M
30-day runs: 7.6M
Model's Last Updated: November 27 2024

Readme

Whisper Large-v3

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition, translation, and language identification.

This version runs only the most recent Whisper model, large-v3. It’s optimized for high performance and simplicity.

Model Versions
Model Size    Version
large-v3      link
large-v2      link
all others    link

While this implementation only uses the large-v3 model, we maintain links to previous versions for reference.

For users who need different model sizes, check out our multi-model version.

Model Description

Approach

Whisper uses a Transformer sequence-to-sequence model trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. All of these tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing for a single model to replace many different stages of a traditional speech processing pipeline.
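As a rough illustration of how several tasks can share one decoder, the sketch below builds the kind of special-token prefix that tells the decoder which language and task to handle. The helper name `build_decoder_prefix` is hypothetical, and the token spellings are assumed from the Whisper paper and open-source repository rather than stated on this page; the real tokenizer maps such strings to integer ids.

```python
from typing import List

# Tasks the decoder can be asked to perform via a task token (assumed spellings).
SUPPORTED_TASKS = ("transcribe", "translate")


def build_decoder_prefix(language: str, task: str, timestamps: bool = False) -> List[str]:
    """Return the special tokens that precede the predicted text tokens.

    Hypothetical helper for illustration only; it is not part of any Whisper API.
    """
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported task: {task!r}")
    # Start marker, then the language token, then the task token.
    prefix = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    # Without timestamps, an extra token tells the decoder to emit text only.
    if not timestamps:
        prefix.append("<|notimestamps|>")
    return prefix
```

Switching from transcription to translation changes only one token in the prefix, which is how a single model can stand in for several pipeline stages.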

[Blog] [Paper] [Model card]

License

The code and model weights of Whisper are released under the MIT License. See LICENSE for further details.

Citation
@misc{https://doi.org/10.48550/arxiv.2212.04356,
  doi = {10.48550/ARXIV.2212.04356},
  url = {https://arxiv.org/abs/2212.04356},
  author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  title = {Robust Speech Recognition via Large-Scale Weak Supervision},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Pricing of whisper replicate.com

Run time and cost

This model costs approximately $0.00064 to run on Replicate, or about 1,562 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
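The arithmetic behind these figures can be checked with a short sketch. It assumes the quoted per-run price is a flat average, which the page itself says is only approximate; the function names are illustrative.

```python
from decimal import Decimal

# Approximate per-run price quoted above; actual cost varies with inputs.
COST_PER_RUN = Decimal("0.00064")


def runs_per_dollar(cost_per_run: Decimal = COST_PER_RUN) -> int:
    """Whole number of runs that fit into one dollar at the given price."""
    return int(Decimal("1") // cost_per_run)


def estimated_cost(runs: int, cost_per_run: Decimal = COST_PER_RUN) -> Decimal:
    """Estimated total cost in dollars for a number of runs."""
    return cost_per_run * runs
```

At this price, `runs_per_dollar()` gives 1562 (1 / 0.00064 = 1562.5, rounded down), and 10,000 runs would cost an estimated $6.40.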

This model runs on Nvidia T4 (High-memory) GPU hardware. Predictions typically complete within 3 seconds.

Runs of openai whisper on replicate.com

Total runs: 62.0M
24-hour runs: 0
3-day runs: 0
7-day runs: 1.8M
30-day runs: 7.6M

More Information About whisper replicate.com Model

whisper replicate.com

whisper on replicate.com is a hosted version of the openai whisper model, which converts speech in audio to text and can be used instantly. replicate.com offers a free trial of the model as well as paid use. The model can be called through an API from Node.js, Python, or plain HTTP.
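A minimal sketch of an HTTP call from Python follows. It only builds the request and does not send it. The endpoint and bearer-token header are assumed from Replicate's public HTTP API, the input field name `audio` is an assumption about this model's schema, and the version id is a placeholder that must be copied from the model page.

```python
import json
import urllib.request

# Assumed endpoint of Replicate's HTTP API for creating a prediction.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(api_token: str, version_id: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks for a transcription.

    Hypothetical helper: "audio" is an assumed input name, and version_id
    is a placeholder for the model version shown on the model page.
    """
    payload = {"version": version_id, "input": {"audio": audio_url}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the prediction; the response format and any polling for results are outside this sketch.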

openai whisper online free

replicate.com is a platform for trying models online and calling them through an API. It hosts whisper, including its API service, and offers a free online trial; you can try whisper for free at the link below.

URL for trying openai whisper online for free on replicate.com:

https://replicate.com/openai/whisper

whisper install

whisper is an open-source model available on GitHub, and any user can find it there and install it for free. replicate.com also hosts an installed copy of whisper that users can use directly for debugging and trial, and this hosted copy can be called through the API as well.

whisper install URL on replicate.com:

https://replicate.com/openai/whisper

whisper install URL on GitHub:

https://github.com/replicate/cog-whisper
