Capability
SQuAD v2 Benchmark-Aligned Answer Span Prediction
18 artifacts provide this capability.
Top Matches
via “span-based answer annotation with character-level indexing”
150K reading comprehension questions including unanswerable ones.
Unique: Indexes answer spans by character offset rather than by token position, so the gold answers do not depend on any tokenizer. This allows fair comparison across models with different tokenizers and keeps token-boundary off-by-one errors out of the annotations.
vs others: More precise to score than free-form answer generation (which is typically evaluated with overlap metrics such as BLEU or ROUGE) and more tokenizer-agnostic than token-level span prediction, enabling reproducible evaluation across different model architectures.
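
The character-level indexing described above can be illustrated with a minimal sketch. It assumes the commonly used record layout in which each example carries an `answers` field with parallel `text` and `answer_start` lists, and an unanswerable question has empty lists. The helper names (`char_span`, `gold_span`, `token_span`, `whitespace_offsets`) and the sample sentence are hypothetical and used only for illustration; the whitespace tokenizer stands in for whatever tokenizer a model actually uses.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]


def char_span(context: str, answer_text: str, answer_start: int) -> Optional[Span]:
    """Return the half-open (start, end) character span of the answer,
    or None if the offset does not actually point at the answer text."""
    end = answer_start + len(answer_text)
    if context[answer_start:end] != answer_text:
        return None
    return answer_start, end


def gold_span(context: str, answers: Dict[str, list]) -> Optional[Span]:
    """Return the first gold span, or None for an unanswerable question
    (assumed here to be represented by empty answer lists)."""
    if not answers["text"]:
        return None
    return char_span(context, answers["text"][0], answers["answer_start"][0])


def whitespace_offsets(text: str) -> List[Span]:
    """Toy tokenizer (an assumption): split on whitespace and record
    each token's half-open character offsets."""
    offsets, pos = [], 0
    for tok in text.split():
        start = text.index(tok, pos)
        pos = start + len(tok)
        offsets.append((start, pos))
    return offsets


def token_span(offsets: List[Span], start: int, end: int) -> Optional[Span]:
    """Map a character span to inclusive (first, last) token indices by
    collecting every token that overlaps the span."""
    hits = [i for i, (s, e) in enumerate(offsets) if s < end and e > start]
    return (hits[0], hits[-1]) if hits else None


context = "The Normans settled in Normandy, a region in France."
answerable = {"text": ["Normandy"], "answer_start": [23]}
unanswerable = {"text": [], "answer_start": []}

span = gold_span(context, answerable)
print(span, repr(context[span[0]:span[1]]))
print(token_span(whitespace_offsets(context), *span))
print(gold_span(context, unanswerable))
```

Because the gold span is stored in characters, only `whitespace_offsets` would need to change to evaluate a model with a different tokenizer; the annotation itself stays fixed.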