**PromptBench** (Benchmark 30/100, via "prompt-engineering-technique-library-with-chain-of-thought")
PromptBench is a tool for scrutinizing and analyzing how large language models interact with various prompts. It provides convenient infrastructure for simulating **black-box** adversarial **prompt attacks** on models and evaluating their performance.
Unique: Implements a modular library of prompt engineering techniques (CoT, Emotion, Expert, etc.) as composable transformations rather than hard-coded strategies, allowing researchers to apply, combine, and evaluate techniques systematically across datasets and models.
vs others: More comprehensive than single-technique tools because it collects multiple prompt engineering methods in one framework, enabling comparative evaluation, technique composition, and systematic study of which techniques work for which models and tasks.
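The idea of techniques as composable transformations can be sketched as follows. This is a minimal illustration, not PromptBench's actual API: the function names (`chain_of_thought`, `expert`, `emotion`, `compose`) are hypothetical, chosen to mirror the technique names mentioned above.

```python
from typing import Callable

# A prompt technique is just a function from prompt text to prompt text.
PromptTransform = Callable[[str], str]

def chain_of_thought(prompt: str) -> str:
    # CoT: append a step-by-step reasoning cue.
    return prompt + "\nLet's think step by step."

def expert(prompt: str) -> str:
    # Expert: prepend a role-assignment preamble.
    return "You are an expert in this domain. " + prompt

def emotion(prompt: str) -> str:
    # Emotion: append an emotional-stakes cue.
    return prompt + "\nThis is very important to my career."

def compose(*transforms: PromptTransform) -> PromptTransform:
    # Chain techniques left to right into a single transformation.
    def apply(prompt: str) -> str:
        for t in transforms:
            prompt = t(prompt)
        return prompt
    return apply

# Techniques can now be applied, swapped, or stacked uniformly,
# which is what makes systematic comparison across datasets possible.
technique = compose(expert, chain_of_thought)
print(technique("What is 17 * 24?"))
```

Because every technique shares the same signature, a benchmark harness can iterate over a list of transformations (or their compositions) and score each variant on the same dataset and model.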