The model is based on Flan-T5-Large and predicts a binary label: 1 for supported and 0 for unsupported.
The model makes predictions at the sentence level. It takes a document and a sentence as input and determines
whether the sentence is supported by the document:
MiniCheck-Model(document, claim) -> {0, 1}
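Since the model scores one sentence at a time, a multi-sentence LLM response would typically be split into sentences, each checked against the document independently. A minimal sketch of that decomposition — the `check_sentence` word-overlap stub below is purely hypothetical and only stands in for a real MiniCheck call so the example is runnable:

```python
import re

def check_sentence(document: str, sentence: str) -> int:
    # Hypothetical stand-in for MiniCheck-Model(document, claim) -> {0, 1}.
    # A naive word-overlap test replaces the real model here.
    sent_words = set(re.findall(r"\w+", sentence.lower()))
    doc_words = set(re.findall(r"\w+", document.lower()))
    return 1 if len(sent_words & doc_words) >= 0.5 * len(sent_words) else 0

def check_response(document: str, response: str) -> list[tuple[str, int]]:
    # Split the response into sentences and score each one independently.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    return [(s, check_sentence(document, s)) for s in sentences]

doc = "The library closes at 9pm on weekdays."
for sentence, label in check_response(doc, "The library closes at 9pm. It is open all night."):
    print(label, sentence)
```

The per-sentence labels can then be aggregated however the application needs, e.g. flagging any unsupported sentence.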
MiniCheck-Flan-T5-Large is fine-tuned from google/flan-t5-large (Chung et al., 2022) on a combination of 35K training examples.
The performance of these models is evaluated on our newly collected benchmark (unseen by our models during training), LLM-AggreFact, drawn from 10 recent human-annotated datasets on fact-checking and grounding LLM generations. Our most capable model, MiniCheck-Flan-T5-Large, outperforms all existing specialized fact-checkers of similar scale by a large margin (4-10% absolute increase) and is on par with GPT-4, while being 400x cheaper. See the full results in our work.
Note: We only evaluated the performance of our models on real claims, i.e. without any human intervention such as injecting certain error types into model-generated claims. Such edited claims do not reflect LLMs' actual behaviors.
Model Usage Demo
Please first clone our GitHub Repo and install the necessary packages from requirements.txt.
Below is a simple use case:
from minicheck.minicheck import MiniCheck

doc = "A group of students gather in the school library to study for their upcoming final exams."
claim_1 = "The students are preparing for an examination."
claim_2 = "The students are on vacation."

# model_name can be one of ['roberta-large', 'deberta-v3-large', 'flan-t5-large']
scorer = MiniCheck(model_name='flan-t5-large', device='cuda:0', cache_dir='./ckpts')
pred_label, raw_prob, _, _ = scorer.score(docs=[doc, doc], claims=[claim_1, claim_2])

print(pred_label)  # [1, 0]
print(raw_prob)    # [0.9805923700332642, 0.007121307775378227]
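If you want to apply your own decision threshold instead of the returned labels, the raw support probabilities can be re-binarized directly. A small sketch — the 0.5 cutoff is an assumption for illustration, not necessarily the model's internal threshold:

```python
def probs_to_labels(raw_probs, threshold=0.5):
    # Binarize support probabilities with a custom cutoff.
    # NOTE: the 0.5 default is an illustrative assumption, not a
    # documented MiniCheck internal setting.
    return [1 if p >= threshold else 0 for p in raw_probs]

print(probs_to_labels([0.9805923700332642, 0.007121307775378227]))  # [1, 0]
```

Raising the threshold trades recall for precision, which can be useful when flagging only confidently supported claims.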
@misc{tang2024minicheck,
title={MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents},
author={Liyan Tang and Philippe Laban and Greg Durrett},
year={2024},
eprint={2404.10774},
archivePrefix={arXiv},
primaryClass={cs.CL}
}