tiiuae / falcon-180B-chat

huggingface.co
Total runs: 68.1K
24-hour runs: 0
7-day runs: 50.4K
30-day runs: 67.4K
Model's Last Updated: November 07 2023
text-generation

Model Details of falcon-180B-chat

🚀 Falcon-180B-Chat

Falcon-180B-Chat is a 180B-parameter causal decoder-only model built by TII, based on Falcon-180B and finetuned on a mixture of Ultrachat, Platypus, and Airoboros. It is made available under the Falcon-180B TII License and Acceptable Use Policy.

Paper coming soon 😊

🤗 To get started with Falcon (inference, finetuning, quantization, etc.), we recommend reading this great blogpost from HF or this one from the release of the 40B! Note that since the 180B is larger than what can easily be handled with transformers + accelerate, we recommend using Text Generation Inference.
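
For readers who follow the Text Generation Inference recommendation, the sketch below shows one way to query a running TGI server over HTTP using only the Python standard library. The endpoint URL and the helper names `build_payload` and `generate` are assumptions made for illustration; the sampling parameters simply mirror the ones used elsewhere in this card.

```python
import json
import urllib.request

# Hypothetical local endpoint; adjust to wherever your TGI server listens.
TGI_URL = "http://localhost:8080/generate"


def build_payload(prompt: str, max_new_tokens: int = 100, top_k: int = 10) -> dict:
    """Build the JSON body for TGI's /generate route."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "do_sample": True,
            "top_k": top_k,
        },
    }


def generate(prompt: str, url: str = TGI_URL) -> str:
    """POST the prompt to a running TGI server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["generated_text"]
```

Calling `generate("Daniel: Hello, Girafatron!\nGirafatron:")` only works once a server hosting the model is actually running at the assumed address.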

You will need at least 400GB of memory to swiftly run inference with Falcon-180B.
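
As a back-of-the-envelope check on that figure, the sketch below estimates the memory taken by the weights alone at a few precisions. It assumes a nominal 180e9 parameters and ignores activations, the key/value cache, and framework overhead, so real requirements will be higher; the `weight_memory_gb` helper is purely illustrative.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


N_PARAMS = 180e9  # nominal "180B" figure taken from the model name

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(N_PARAMS, dtype):.0f} GB")
```

In bfloat16 the weights come to about 360 GB, which leaves roughly 40 GB of a 400 GB budget for everything else.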

Why use Falcon-180B-chat?

💬 This is a Chat model, which may not be ideal for further finetuning. If you are interested in building your own instruct/chat model, we recommend starting from Falcon-180B.

💸 Looking for a smaller, less expensive model? Falcon-7B-Instruct and Falcon-40B-Instruct are Falcon-180B-Chat's little brothers!

💥 Falcon LLMs require PyTorch 2.0 for use with transformers!

Model Card for Falcon-180B-Chat

Model Details
Model Description
Model Source
  • Paper: coming soon.
Uses

See the acceptable use policy.

Direct Use

Falcon-180B-Chat has been finetuned on a chat dataset.

Out-of-Scope Use

Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.

Bias, Risks, and Limitations

Falcon-180B-Chat is mostly trained on English data, and will not generalize appropriately to other languages. Furthermore, because it is trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online.

Recommendations

We recommend that users of Falcon-180B-Chat develop guardrails and take appropriate precautions for any production use.

How to Get Started with the Model

To run inference with the model in full bfloat16 precision you need approximately 8xA100 80GB or equivalent.

from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

# Hugging Face Hub identifier of the chat model
model = "tiiuae/falcon-180b-chat"

tokenizer = AutoTokenizer.from_pretrained(model)
# Build a text-generation pipeline; device_map="auto" spreads the weights over the available GPUs
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
# Sample one completion, capped at 200 tokens including the prompt
sequences = pipeline(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Girafatron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
# Each result is a dict whose "generated_text" holds the prompt plus the completion
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

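
The call above passes `do_sample=True` with `top_k=10`, which restricts each sampling step to the ten highest-scoring tokens. The following is a minimal, dependency-free sketch of that idea on made-up logits; it is not the code that transformers runs internally, and `top_k_sample` is a hypothetical helper.

```python
import math
import random


def top_k_sample(logits: list[float], k: int, rng: random.Random) -> int:
    """Sample a token index from the k highest-scoring logits."""
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the surviving logits (subtract the max for stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)
toy_logits = [2.0, 0.5, -1.0, 3.0, 0.1]  # made-up scores for a 5-token vocabulary
samples = [top_k_sample(toy_logits, k=2, rng=rng) for _ in range(1000)]
print(sorted(set(samples)))  # only indices 0 and 3 can appear
```

A smaller `k` makes the output more conservative, while a larger `k` admits less likely tokens.
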
Training Details

Falcon-180B-Chat is based on Falcon-180B.

Training Data

Falcon-180B-Chat is finetuned on a mixture of Ultrachat, Platypus, and Airoboros.

The data was tokenized with the Falcon tokenizer.

Evaluation

Paper coming soon.

See the OpenLLM Leaderboard for early results.

Technical Specifications
Model Architecture and Objective

Falcon-180B-Chat is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token).
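
To make the next-token objective concrete, here is a small sketch of the loss it implies: every token is scored by the probability the model assigns to it given only the tokens before it. The `predict` callable and the toy uniform model are stand-ins invented for this example, not part of any Falcon API.

```python
import math


def next_token_loss(token_ids: list[int], predict) -> float:
    """Average negative log-likelihood of each token given its prefix.

    `predict(prefix)` is a hypothetical callable returning a probability
    distribution (a list of floats) over the vocabulary for the next token.
    """
    losses = []
    for t in range(1, len(token_ids)):
        probs = predict(token_ids[:t])  # the model sees tokens 0..t-1 only
        losses.append(-math.log(probs[token_ids[t]]))  # score the true token t
    return sum(losses) / len(losses)


def uniform(prefix: list[int]) -> list[float]:
    """Toy stand-in for a model: uniform over a 4-token vocabulary."""
    return [0.25] * 4


print(next_token_loss([0, 2, 1, 3], uniform))  # log(4) for a uniform model
```

Training lowers this quantity by shifting probability toward the token that actually follows each prefix.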

The architecture is broadly adapted from the GPT-3 paper (Brown et al., 2020), with the following differences:

For multiquery, we use an internal variant with independent keys and values per tensor-parallel degree.
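
One practical effect of sharing keys and values is a much smaller key/value cache at inference time. The sketch below compares the cache size for one full-length sequence under standard multi-head attention and under a variant with one key/value pair per tensor-parallel rank. The layer count, model width, head size, and sequence length come from the hyperparameter table in this section; the tensor-parallel degree of 8 and the bfloat16 cache are assumptions chosen only for illustration.

```python
LAYERS, D_MODEL, HEAD_DIM, SEQ_LEN = 80, 14848, 64, 2048  # from the hyperparameter table
BYTES = 2  # assumes a bfloat16 cache


def kv_cache_bytes(n_kv_heads: int, seq_len: int = SEQ_LEN) -> int:
    """Bytes of cached keys and values for one sequence across all layers."""
    return 2 * n_kv_heads * HEAD_DIM * LAYERS * seq_len * BYTES  # 2 = keys + values


n_query_heads = D_MODEL // HEAD_DIM  # query heads implied by the table
tp_degree = 8                        # hypothetical tensor-parallel degree

full = kv_cache_bytes(n_query_heads)  # one K/V pair per query head
shared = kv_cache_bytes(tp_degree)    # one K/V pair per tensor-parallel rank
print(f"multi-head: {full / 1e9:.2f} GB, multiquery variant: {shared / 1e9:.2f} GB")
```

Under these assumptions the shared variant needs 29 times less cache per sequence, since 232 query heads share 8 key/value pairs.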

Hyperparameter    Value   Comment
Layers            80
d_model           14848
head_dim          64      Reduced to optimise for FlashAttention
Vocabulary        65024
Sequence length   2048

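
The table values can be combined into a rough parameter count. The sketch below derives the number of query heads and an order-of-magnitude estimate of the total. It assumes a 4x MLP expansion and treats the shared key/value projections as negligible; neither assumption is stated in this card, so the result is a sanity check and not an exact figure.

```python
LAYERS, D_MODEL, HEAD_DIM, VOCAB = 80, 14848, 64, 65024  # values from the table

n_heads = D_MODEL // HEAD_DIM  # query heads implied by the table
embedding = VOCAB * D_MODEL    # token-embedding matrix

# Assumptions (not stated in the source): a 4x MLP expansion, and key/value
# projections small enough to ignore because multiquery shares them.
attention = 2 * D_MODEL * D_MODEL  # query and output projections
mlp = 2 * D_MODEL * (4 * D_MODEL)  # up and down projections
total = LAYERS * (attention + mlp) + embedding

print(f"heads: {n_heads}")
print(f"estimated parameters: {total / 1e9:.1f}B")
```

The estimate lands in the neighborhood of the nominal 180B, which suggests the table is internally consistent with the model's name.
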
Compute Infrastructure
Hardware

Falcon-180B-Chat was trained on AWS SageMaker, on up to 4,096 A100 40GB GPUs in P4d instances.

Software

Falcon-180B-Chat was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO and high-performance Triton kernels (FlashAttention, etc.).
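
In a 3D parallelism layout, the total number of GPUs is the product of the tensor-parallel, pipeline-parallel, and data-parallel degrees. The sketch below works through that bookkeeping for the 4,096-GPU upper bound quoted in the Hardware section. The tensor and pipeline degrees of 8 are hypothetical values picked for illustration, since this card does not state the actual configuration.

```python
def data_parallel_degree(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return the data-parallel degree left after TP and PP are chosen."""
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel:
        raise ValueError("world size must be divisible by TP x PP")
    return world_size // model_parallel


WORLD_SIZE = 4096  # upper bound quoted in the Hardware section
TP, PP = 8, 8      # hypothetical degrees, chosen only for illustration

dp = data_parallel_degree(WORLD_SIZE, TP, PP)
weights_per_gpu_gb = 180e9 * 2 / (TP * PP) / 1e9  # bfloat16 weights split over TP x PP
print(f"DP={dp}, about {weights_per_gpu_gb:.1f} GB of weights per GPU")
```

Splitting the weights this way is what lets each replica fit on 40GB devices; optimizer state and activations add to the per-GPU total and are not counted here.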

Citation

Paper coming soon 😊. In the meantime, you can use the following information to cite:

@article{falcon,
  title={The Falcon Series of Language Models: Towards Open Frontier Models},
  author={Almazrouei, Ebtesam and Alobeidli, Hamza and Alshamsi, Abdulaziz and Cappelli, Alessandro and Cojocaru, Ruxandra and Debbah, Merouane and Goffinet, Etienne and Hesslow, Daniel and Launay, Julien and Malartic, Quentin and Noune, Badreddine and Pannier, Baptiste and Penedo, Guilherme},
  year={2023}
}

To learn more about the pretraining dataset, see the 📓 RefinedWeb paper.

@article{refinedweb,
  title={The {R}efined{W}eb dataset for {F}alcon {LLM}: outperforming curated corpora with web data, and web data only},
  author={Guilherme Penedo and Quentin Malartic and Daniel Hesslow and Ruxandra Cojocaru and Alessandro Cappelli and Hamza Alobeidli and Baptiste Pannier and Ebtesam Almazrouei and Julien Launay},
  journal={arXiv preprint arXiv:2306.01116},
  eprint={2306.01116},
  eprinttype = {arXiv},
  url={https://arxiv.org/abs/2306.01116},
  year={2023}
}
Contact

[email protected]

falcon-180B-chat huggingface.co Url

https://huggingface.co/tiiuae/falcon-180B-chat
