Capability
3 artifacts provide this capability.
via “conditional-unconditional score function learning”
* ⭐ 08/2022: [Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation (DreamBooth)](https://arxiv.org/abs/2208.12242)
Unique: Uses conditioning dropout during training to force a single model to learn both conditional and unconditional score functions within shared parameters, rather than training separate models or using external classifiers for guidance
vs others: More parameter-efficient than separate conditional and unconditional models, and avoids external classifier dependencies compared to classifier guidance, but requires careful multi-objective training and may suffer from objective interference
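As a rough illustration of conditioning dropout, the sketch below replaces a label with a null token at random during training, so one model sees both conditional and unconditional examples. The names `NULL_LABEL`, `drop_condition`, and `guided_prediction` are hypothetical and do not come from the listed paper. The blending formula is one common parameterization of guidance at sampling time, included here as an assumption.

```python
import random

NULL_LABEL = -1  # hypothetical "no condition" token (an assumption for this sketch)


def drop_condition(labels, p_uncond, rng):
    """Replace each label with NULL_LABEL with probability p_uncond.

    Training on the result exposes a single model to both conditional
    (label kept) and unconditional (label dropped) examples.
    """
    return [NULL_LABEL if rng.random() < p_uncond else y for y in labels]


def guided_prediction(eps_cond, eps_uncond, w):
    """Blend the two predictions of the same model at sampling time.

    w = 0 gives the unconditional prediction, w = 1 the conditional one,
    and w > 1 extrapolates past the conditional prediction.
    """
    return [u + w * (c - u) for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    rng = random.Random(0)
    print(drop_condition([3, 7, 1, 4], p_uncond=0.5, rng=rng))
    print(guided_prediction([1.0, 2.0], [0.0, 1.0], w=2.0))
```

Because both predictions come from the same parameters, no second model or external classifier is needed. The dropout probability `p_uncond` becomes a tuning knob that trades off the two objectives.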
via “trajectory-conditioned solution generation with scoring feedback”
* ⏫ 10/2023: [Eureka: Human-Level Reward Design via Coding Large Language Models (Eureka)](https://arxiv.org/abs/2310.12931)
Unique: Encodes the full optimization history as in-context examples rather than using a learned surrogate model or explicit reward function. The LLM implicitly learns to recognize patterns in the trajectory (e.g., 'solutions with property X scored higher') and applies those patterns to generate the next candidate, enabling adaptation without explicit model updates.
vs others: Simpler and faster to implement than Bayesian optimization or neural surrogate models, while capturing richer semantic patterns than random search or grid search by leveraging the LLM's pre-trained understanding of solution quality.
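The loop described above can be sketched as follows, under stated assumptions: `propose` stands in for an LLM call and `score` for whatever evaluator rates a candidate. Both are hypothetical callables supplied by the caller, and the prompt wording is illustrative rather than taken from the listed paper.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, float]]  # (candidate, score) pairs


def build_prompt(task: str, history: History, k: int = 5) -> str:
    """Serialize the k best past attempts, worst to best, as in-context examples."""
    best = sorted(history, key=lambda h: h[1])[-k:]
    lines = [f"Task: {task}", "Previous candidates and scores (worst to best):"]
    lines += [f"- score={s:.3f}: {c}" for c, s in best]
    lines.append("Propose a new candidate that should score higher.")
    return "\n".join(lines)


def optimize(
    task: str,
    propose: Callable[[str], str],   # hypothetical LLM wrapper: prompt -> candidate
    score: Callable[[str], float],   # hypothetical evaluator: candidate -> score
    rounds: int,
) -> Tuple[str, float]:
    """Run `rounds` (>= 1) propose/score iterations and return the best pair."""
    history: History = []
    for _ in range(rounds):
        candidate = propose(build_prompt(task, history))
        history.append((candidate, score(candidate)))
    return max(history, key=lambda h: h[1])
```

No model weights change between rounds in this sketch. The only state carried forward is the scored history embedded in the prompt.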
via “trajectory filtering and quality-based curriculum learning”
* Other Papers
Unique: Applies curriculum learning to trajectory-based policy optimization, so agents can learn from mixed-quality data by prioritizing successful examples. Uniform trajectory sampling, by contrast, treats all trajectories equally.
vs others: More sample-efficient than uniform sampling because high-quality trajectories contribute more to learning, and more robust than filtering alone because it gradually includes harder cases rather than discarding them
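A minimal sketch of one way to combine quality-based prioritization with a curriculum, assuming each trajectory carries a positive quality score. The function names, the linear schedule, and the `start_frac` parameter are illustrative assumptions, not details from the listed artifact.

```python
import random
from typing import List, Tuple

Trajectory = Tuple[str, float]  # (trajectory id, quality score); layout is an assumption


def curriculum_pool(
    trajectories: List[Trajectory], progress: float, start_frac: float = 0.2
) -> List[Trajectory]:
    """Return the trajectories admitted at training progress in [0, 1].

    Training starts with the top `start_frac` of trajectories by quality and
    linearly widens to the full set, so lower-quality cases are introduced
    gradually instead of being discarded.
    """
    ranked = sorted(trajectories, key=lambda t: t[1], reverse=True)
    frac = start_frac + (1.0 - start_frac) * min(max(progress, 0.0), 1.0)
    n = max(1, round(frac * len(ranked)))
    return ranked[:n]


def sample_batch(
    pool: List[Trajectory], batch_size: int, rng: random.Random
) -> List[Trajectory]:
    """Sample from a non-empty pool with probability proportional to quality."""
    weights = [max(q, 1e-6) for _, q in pool]
    return rng.choices(pool, weights=weights, k=batch_size)


if __name__ == "__main__":
    data = [("a", 0.9), ("b", 0.1), ("c", 0.5), ("d", 0.7), ("e", 0.3)]
    rng = random.Random(0)
    for progress in (0.0, 0.5, 1.0):
        pool = curriculum_pool(data, progress)
        print(progress, [name for name, _ in pool], sample_batch(pool, 3, rng))
```

Weighting by quality inside the admitted pool keeps high-quality trajectories dominant even after the pool has widened to the full dataset.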
© 2026 Unfragile. The platform for software for agents.