{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"pypi_pypi-peft","slug":"pypi-peft","name":"peft","type":"finetune","url":"https://github.com/huggingface/peft","page_url":"https://unfragile.ai/pypi-peft","categories":["model-training"],"tags":["deep","learning"],"pricing":{"model":"open_source","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"pypi_pypi-peft__cap_0","uri":"capability://code.generation.editing.low.rank.adapter.injection.with.dynamic.module.wrapping","name":"low-rank adapter injection with dynamic module wrapping","description":"Injects trainable low-rank decomposition matrices (LoRA) into transformer model layers by wrapping linear modules with a parallel adapter path that computes A @ B^T additions to activations. Uses a registry-based dispatch mechanism (src/peft/mapping.py) to identify target layers by name pattern, then replaces them with LoRALinear wrappers that maintain frozen base weights while training only the rank-r adapter matrices, achieving 0.1-2% parameter overhead per adapter.","intents":["Fine-tune a 7B parameter LLM using only 10-50MB of trainable weights instead of 14GB","Create task-specific adapters that can be swapped without reloading the base model","Reduce VRAM requirements from 80GB to 16GB for training large language models"],"best_for":["ML engineers fine-tuning large pretrained models on consumer/enterprise GPUs","Teams deploying multiple task-specific models from a single base checkpoint","Researchers experimenting with adapter composition and multi-task learning"],"limitations":["LoRA rank selection requires manual tuning; no automated rank discovery (use AdaLoRA for dynamic ranks)","Adapter inference adds ~5-10% latency overhead due to additional matrix multiplications","Cannot be applied to embedding layers or non-linear modules without custom implementation","Merging adapters into base weights is irreversible without saving original checkpoint"],"requires":["PyTorch 1.13+","transformers library 4.20+","Base model in HuggingFace format (safetensors or PyTorch)","Python 3.8+"],"input_types":["pretrained transformer model (LLaMA, Mistral, BERT, ViT, etc.)","LoRA configuration (rank, alpha, target_modules, lora_dropout)","training dataset with input_ids and attention_mask tensors"],"output_types":["adapter weights (A and B matrices) as .safetensors or .bin","adapter_config.json with hyperparameters","merged model weights (optional, full model size)"],"categories":["code-generation-editing","model-adaptation"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_1","uri":"capability://code.generation.editing.dynamic.rank.allocation.with.gradient.based.importance.scoring","name":"dynamic rank allocation with gradient-based importance scoring","description":"AdaLoRA extends LoRA by maintaining per-layer importance scores that guide automatic rank allocation during training. The implementation computes Hadamard products of adapter gradients to estimate parameter importance, then dynamically increases ranks for high-importance layers and decreases ranks for low-importance ones, achieving 40-50% parameter reduction vs fixed-rank LoRA while maintaining task performance.","intents":["Automatically determine optimal LoRA rank per layer without manual hyperparameter search","Reduce adapter size from 50MB to 20-30MB by pruning low-importance parameters during training","Discover which transformer layers are most critical for a specific downstream task"],"best_for":["Practitioners without domain knowledge to set LoRA ranks manually","Resource-constrained deployments requiring minimal adapter footprint","Research into layer-wise importance in transformer models"],"limitations":["Requires additional forward/backward passes to compute importance scores, adding ~15-20% training time overhead","Rank reallocation happens at fixed intervals (configurable), not continuously, potentially missing optimal allocation","Importance scoring is task-specific; adapters optimized for one task may not transfer well to others","No theoretical guarantee that discovered ranks are globally optimal"],"requires":["PyTorch 1.13+ with autograd enabled","transformers 4.20+","PEFT 0.4.0+ (AdaLoRA introduced in this version)","GPU with sufficient memory for gradient computation"],"input_types":["pretrained model","AdaLoRA config (initial_rank, target_rank, lora_alpha, lora_dropout, layers_to_transform)","training dataset with labels"],"output_types":["adapter weights with variable ranks per layer","importance scores per parameter (for analysis)","adapter_config.json with final rank allocation"],"categories":["code-generation-editing","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_10","uri":"capability://code.generation.editing.adapter.merging.and.unmerging.with.weight.fusion","name":"adapter merging and unmerging with weight fusion","description":"Provides merge_adapter() and unmerge_adapter() methods that fuse adapter weights into base model weights or extract them back out. For LoRA, merging computes (W + alpha/r * A @ B^T) to create a single set of weights, reducing inference latency by eliminating the adapter computation path. Unmerging recovers the original base weights and adapter weights from the merged state, enabling reversible adapter composition. Implemented through method-specific merge logic in each tuner class.","intents":["Merge a trained LoRA adapter into base weights to eliminate adapter inference overhead","Create a production model with fused weights that doesn't require adapter loading","Unmerge adapters to recover original weights and try different adapter combinations"],"best_for":["Production inference where latency is critical and adapter overhead is unacceptable","Scenarios where model deployment requires single monolithic weights","Experimentation with adapter combinations before final deployment"],"limitations":["Merging is irreversible without saving the original base model separately","Merged models lose the ability to switch adapters at inference time","Merging quantized adapters requires dequantization, increasing model size","Not all PEFT methods support merging (e.g., prompt tuning cannot be merged)"],"requires":["PEFT 0.1.0+","PyTorch 1.13+","original base model checkpoint (for unmerging)"],"input_types":["PeftModel with trained adapter","optional: base model for unmerging"],"output_types":["merged model weights (same size as base model)","unmerged base model and adapter weights"],"categories":["code-generation-editing","automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_11","uri":"capability://safety.moderation.configuration.validation.and.compatibility.checking","name":"configuration validation and compatibility checking","description":"Validates PEFT configurations against model architecture and detects incompatibilities before training begins. The system checks that target_modules exist in the model, that adapter ranks are compatible with layer dimensions, and that method-specific constraints are satisfied. Implemented through PeftConfig validation methods and pre-training checks in get_peft_model() that raise informative errors for common misconfiguration patterns.","intents":["Catch configuration errors early (e.g., targeting non-existent modules) before training starts","Validate that LoRA rank is not larger than layer dimensions","Check that adapter method is compatible with model architecture"],"best_for":["Teams new to PEFT who may misconfigure adapters","Automated training pipelines that need early error detection","Debugging adapter training failures"],"limitations":["Validation is static; cannot detect runtime issues that emerge during training","Error messages may be cryptic for complex configuration issues","No automatic configuration suggestions; users must manually fix errors","Validation overhead is minimal but adds startup latency"],"requires":["PEFT 0.1.0+","PyTorch 1.13+","transformers 4.20+"],"input_types":["PEFT config (LoRA, AdaLoRA, etc.)","base model"],"output_types":["validation errors/warnings","validated PeftModel instance"],"categories":["safety-moderation","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_2","uri":"capability://code.generation.editing.quantization.aware.adapter.training.with.frozen.base.weights","name":"quantization-aware adapter training with frozen base weights","description":"Enables fine-tuning of 4-bit and 8-bit quantized models by freezing the quantized base weights and training only adapter parameters, implemented through integration with bitsandbytes quantization library. The system detects quantized layers (Linear4bit, Linear8bit) and injects adapters in the forward pass without dequantizing base weights, reducing memory footprint by 75-90% compared to full-precision training while maintaining numerical stability through careful gradient flow management.","intents":["Fine-tune a 70B parameter model on a single 24GB GPU by combining 4-bit quantization with LoRA","Train adapters for quantized models without requiring full-precision checkpoints","Reduce training memory from 320GB to 24GB for enterprise-scale language models"],"best_for":["Teams with limited GPU resources (consumer/mid-range hardware)","Production fine-tuning pipelines requiring minimal infrastructure cost","Researchers studying the interaction between quantization and adapter learning"],"limitations":["Quantized base weights cannot be merged into adapters; merged models require dequantization step","Gradient computation through quantized layers adds ~10-15% training time overhead","Only compatible with bitsandbytes library; other quantization frameworks (GPTQ, AWQ) require custom integration","Numerical stability depends on careful learning rate selection; higher rates can cause gradient overflow"],"requires":["bitsandbytes 0.37.0+","PyTorch 1.13+","CUDA 11.0+ (for GPU quantization)","transformers 4.30+ (for quantization config support)"],"input_types":["quantized model (loaded via load_in_4bit=True or load_in_8bit=True in transformers)","PEFT config (LoRA, QLoRA, or other method)","training dataset"],"output_types":["adapter weights (.safetensors or .bin)","adapter_config.json","dequantized merged model (optional, requires full-precision base model)"],"categories":["code-generation-editing","data-processing-analysis"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_3","uri":"capability://tool.use.integration.multi.adapter.composition.and.routing","name":"multi-adapter composition and routing","description":"Enables loading and composing multiple adapters on a single base model through add_adapter(), set_adapter(), and delete_adapter() methods that manage an adapter registry. Supports sequential composition (stacking adapters), parallel composition (weighted averaging), and task-specific routing where different adapters activate based on input characteristics. Implemented via the PeftModel wrapper maintaining a dictionary of adapter states and switching between them without reloading the base model.","intents":["Load 10 task-specific adapters (one per customer) on a single base model, switching between them per request","Stack multiple adapters (e.g., domain + task) to combine their learned transformations","Dynamically select adapters based on input tokens or metadata without model reloading"],"best_for":["Multi-tenant SaaS platforms serving different customers with a single base model","Multi-task learning systems where different tasks require different adapter combinations","Research into adapter composition and knowledge transfer"],"limitations":["Sequential adapter composition adds latency proportional to number of stacked adapters (~5ms per adapter)","Parallel composition (weighted averaging) requires manual weight tuning; no learned composition weights","Adapter switching requires state management; concurrent requests to different adapters need synchronization","No built-in mechanism for detecting adapter conflicts or incompatibilities"],"requires":["PEFT 0.4.0+","PyTorch 1.13+","transformers 4.20+","sufficient GPU memory for base model + largest adapter"],"input_types":["base model","multiple adapter checkpoints (from save_pretrained)","adapter names and composition strategy"],"output_types":["model with active adapter(s)","predictions from composed adapters","adapter state snapshots"],"categories":["tool-use-integration","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_4","uri":"capability://text.generation.language.prompt.learning.and.soft.prompt.optimization","name":"prompt learning and soft prompt optimization","description":"Implements prefix tuning and prompt tuning methods that prepend learnable soft prompt tokens to input sequences, optimizing only the prompt embeddings while freezing all model weights. The implementation maintains a learnable embedding matrix that is concatenated to input embeddings before the first transformer layer, enabling task adaptation through prompt optimization rather than weight updates. Supports both prefix (prepended to all layers) and prompt (prepended to input only) variants.","intents":["Fine-tune a frozen model by learning task-specific prompts instead of adapter weights","Create interpretable task descriptions by analyzing learned soft prompts","Reduce adapter size to <1MB by using only prompt embeddings"],"best_for":["Scenarios where model weights must remain frozen (licensing, security, or inference constraints)","Interpretability research into what prompts models learn for specific tasks","Extreme parameter efficiency requirements (<0.01% overhead)"],"limitations":["Prompt learning typically requires longer training (2-3x more steps) than LoRA to reach comparable performance","Learned prompts are often task-specific and don't transfer well to other tasks","Prompt length is a hyperparameter that must be tuned; longer prompts improve performance but increase latency","Interpretability of learned soft prompts is limited; they don't correspond to natural language"],"requires":["PEFT 0.1.0+","PyTorch 1.13+","transformers 4.20+","model with accessible embedding layer"],"input_types":["pretrained model","PrefixTuningConfig or PromptTuningConfig (num_virtual_tokens, prompt_tuning_init)","training dataset"],"output_types":["learned prompt embeddings (.safetensors)","adapter_config.json with prompt configuration","training logs showing prompt evolution"],"categories":["text-generation-language","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_5","uri":"capability://automation.workflow.adapter.serialization.and.checkpoint.management","name":"adapter serialization and checkpoint management","description":"Provides save_pretrained() and from_pretrained() methods that serialize only adapter weights and configurations to disk, enabling efficient checkpoint storage and loading. The system saves adapter parameters as .safetensors or .bin files alongside adapter_config.json containing method-specific hyperparameters, supporting both local filesystem and HuggingFace Hub uploads. Implemented through a unified serialization interface (src/peft/utils/save_and_load.py) that abstracts method-specific serialization logic.","intents":["Save a 50MB adapter checkpoint instead of a 14GB full model checkpoint","Load pre-trained adapters from HuggingFace Hub and apply them to any compatible base model","Version control adapter checkpoints in Git (small file size enables this)"],"best_for":["Teams managing many adapter checkpoints for different tasks/customers","Researchers sharing fine-tuned adapters via HuggingFace Hub","Production systems requiring efficient checkpoint storage and recovery"],"limitations":["Adapters are not portable across different base model architectures (e.g., LoRA for BERT cannot load on GPT-2)","Adapter loading requires the base model to be loaded first; no standalone inference","Checkpoint compatibility depends on matching PEFT version; older checkpoints may not load in newer versions","No built-in versioning or rollback mechanism; requires external version control"],"requires":["PEFT 0.1.0+","PyTorch 1.13+","safetensors library (for .safetensors format, recommended)","huggingface_hub library (for Hub uploads)"],"input_types":["trained PeftModel instance","save path (local or Hub repo ID)","optional: push_to_hub=True for automatic Hub upload"],"output_types":["adapter weights (.safetensors or .bin)","adapter_config.json","README.md (optional, for Hub)"],"categories":["automation-workflow","data-processing-analysis"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_6","uri":"capability://automation.workflow.distributed.training.with.adapter.synchronization","name":"distributed training with adapter synchronization","description":"Enables distributed training across multiple GPUs/TPUs by synchronizing adapter gradients using standard PyTorch DistributedDataParallel (DDP) or DeepSpeed integration. The implementation treats adapters as regular parameters in the distributed training graph, with gradient accumulation and all-reduce operations handled by the distributed backend. Supports both data parallelism (same adapters across devices) and model parallelism (adapters sharded across devices) through integration with transformers' distributed training utilities.","intents":["Train adapters on 8 GPUs with synchronized gradient updates and batch aggregation","Use DeepSpeed ZeRO-3 to shard adapter parameters across devices for memory efficiency","Scale adapter training to 100+ GPUs without code changes"],"best_for":["Teams with multi-GPU infrastructure training large models","Organizations using DeepSpeed or FSDP for distributed training","Research labs scaling fine-tuning to enterprise GPU clusters"],"limitations":["Distributed training adds communication overhead; adapter synchronization adds ~5-10% per-step latency","Requires careful learning rate scaling (typically linear scaling with number of GPUs)","Debugging distributed training issues is complex; requires understanding of DDP/DeepSpeed internals","Not all PEFT methods are equally efficient in distributed settings; some require additional synchronization"],"requires":["PyTorch 1.13+ with distributed training support","NCCL 2.0+ (for GPU communication)","transformers 4.20+","optional: DeepSpeed 0.7.0+ for advanced distributed features","multiple GPUs/TPUs with NVLink or InfiniBand interconnect"],"input_types":["PeftModel instance","distributed training config (num_processes, per_device_batch_size, gradient_accumulation_steps)","training dataset (should be sharded across processes)"],"output_types":["synchronized adapter weights","training logs aggregated across processes","final checkpoint saved from rank 0 process"],"categories":["automation-workflow","data-processing-analysis"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_7","uri":"capability://image.visual.vision.model.and.diffusion.model.adapter.support","name":"vision model and diffusion model adapter support","description":"Extends PEFT methods (LoRA, prefix tuning, etc.) to vision transformers (ViT, DeiT) and diffusion models (Stable Diffusion, DDPM) by identifying and wrapping attention/linear layers in these architectures. The implementation uses the same adapter injection mechanism as language models but adapts layer identification patterns for vision-specific architectures. Supports fine-tuning image generation, classification, and segmentation tasks with minimal parameter overhead.","intents":["Fine-tune Stable Diffusion for custom image generation with 50MB adapters instead of 4GB model updates","Adapt a ViT image classifier to a new dataset using LoRA with 0.1% parameter overhead","Create style-specific diffusion adapters that can be composed for multi-style generation"],"best_for":["Computer vision teams fine-tuning large pretrained models","Generative AI practitioners creating custom image generation models","Researchers studying adapter effectiveness across modalities"],"limitations":["Vision adapter support is less mature than language model support; fewer tested configurations","Diffusion model adapters require careful tuning of noise schedules and timestep embeddings","Adapter composition in vision models is less studied; multi-adapter effects are unpredictable","Some vision architectures (e.g., CNN-based) have limited adapter support"],"requires":["PEFT 0.3.0+","PyTorch 1.13+","diffusers library (for diffusion models)","transformers 4.20+ (for vision transformers)"],"input_types":["vision model (ViT, DeiT, or diffusion model)","PEFT config with vision-specific target_modules","image dataset with appropriate preprocessing"],"output_types":["adapter weights for vision model","adapter_config.json","fine-tuned image outputs"],"categories":["image-visual","code-generation-editing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_8","uri":"capability://code.generation.editing.custom.peft.method.registration.and.extension","name":"custom peft method registration and extension","description":"Provides a plugin architecture for implementing new PEFT methods by extending BaseTuner and registering them in the method registry (src/peft/mapping.py). New methods define a configuration class (inheriting from PeftConfig), a tuner class (inheriting from BaseTuner), and register themselves via PEFT_TYPE_TO_CONFIG_MAPPING. The system automatically handles adapter lifecycle (initialization, forward pass injection, serialization) through the base class, enabling new methods to integrate with all PEFT infrastructure without reimplementation.","intents":["Implement a novel adapter method (e.g., sparse LoRA, mixture-of-adapters) and use it with all PEFT tooling","Create domain-specific adapters (e.g., medical NLP adapters) that integrate with the PEFT ecosystem","Extend PEFT with custom layer types or composition strategies"],"best_for":["Researchers developing new parameter-efficient fine-tuning methods","Teams implementing proprietary adapter methods that should integrate with PEFT","Contributors adding methods to the PEFT library"],"limitations":["Requires deep understanding of PEFT architecture and BaseTuner interface","Custom methods must implement required abstract methods (get_trainable_parameters, forward, etc.)","No automatic testing or validation; custom methods must be thoroughly tested","Documentation for extension is limited; requires reading existing method implementations"],"requires":["PEFT 0.1.0+","PyTorch 1.13+","understanding of PEFT architecture (PeftModel, BaseTuner, PeftConfig)","familiarity with transformer model internals"],"input_types":["custom tuner class inheriting from BaseTuner","custom config class inheriting from PeftConfig","registration code in mapping.py"],"output_types":["integrated PEFT method usable via get_peft_model()","method-specific adapters and checkpoints","compatibility with all PEFT infrastructure (quantization, distributed training, etc.)"],"categories":["code-generation-editing","tool-use-integration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-peft__cap_9","uri":"capability://automation.workflow.layer.wise.learning.rate.scheduling.and.gradient.management","name":"layer-wise learning rate scheduling and gradient management","description":"Enables fine-grained control over adapter training through layer-wise learning rate schedules and gradient clipping strategies. The implementation integrates with PyTorch optimizers to apply different learning rates to different adapter layers, and supports gradient accumulation patterns specific to adapter training. Implemented through integration with transformers' Trainer API and custom callback hooks that modify optimizer parameter groups per training step.","intents":["Apply higher learning rates to later transformer layers and lower rates to earlier layers","Implement gradient clipping per adapter layer to prevent training instability","Use discriminative learning rates where each layer has a different learning rate schedule"],"best_for":["Practitioners fine-tuning large models where different layers learn at different rates","Research into layer-wise learning dynamics in adapters","Scenarios where training instability requires careful gradient management"],"limitations":["Layer-wise learning rate scheduling adds complexity to training configuration","No automatic discovery of optimal per-layer learning rates; requires manual tuning or grid search","Gradient clipping per layer can interact unexpectedly with batch normalization or layer normalization","Incompatible with some optimizers (e.g., LAMB) that don't support per-parameter learning rates"],"requires":["PyTorch 1.13+","transformers 4.20+ (for Trainer integration)","PEFT 0.2.0+","understanding of optimizer parameter groups"],"input_types":["PeftModel instance","learning rate schedule (constant, linear, cosine, etc.)","layer-wise learning rate multipliers","gradient clipping config"],"output_types":["trained adapter weights","training logs with per-layer loss/gradient statistics","learning rate schedule visualization"],"categories":["automation-workflow","planning-reasoning"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":23,"verified":false,"data_access_risk":"high","permissions":["PyTorch 1.13+","transformers library 4.20+","Base model in HuggingFace format (safetensors or PyTorch)","Python 3.8+","PyTorch 1.13+ with autograd enabled","transformers 4.20+","PEFT 0.4.0+ (AdaLoRA introduced in this version)","GPU with sufficient memory for gradient computation","PEFT 0.1.0+","original base model checkpoint (for unmerging)"],"failure_modes":["LoRA rank selection requires manual tuning; no automated rank discovery (use AdaLoRA for dynamic ranks)","Adapter inference adds ~5-10% latency overhead due to additional matrix multiplications","Cannot be applied to embedding layers or non-linear modules without custom implementation","Merging adapters into base weights is irreversible without saving original checkpoint","Requires additional forward/backward passes to compute importance scores, adding ~15-20% training time overhead","Rank reallocation happens at fixed intervals (configurable), not continuously, potentially missing optimal allocation","Importance scoring is task-specific; adapters optimized for one task may not transfer well to others","No theoretical guarantee that discovered ranks are globally optimal","Merging is irreversible without saving the original base model separately","Merged models lose the ability to switch adapters at inference time","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.05,"quality":0.34,"ecosystem":0.46,"match_graph":0.25,"freshness":0.52,"weights":{"adoption":0.35,"quality":0.2,"ecosystem":0.1,"match_graph":0.3,"freshness":0.05}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-06-17T09:51:05.295Z","last_scraped_at":"2026-05-03T15:20:15.343Z","last_commit":null},"community":{"stars":null,"forks":null,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=pypi-peft","compare_url":"https://unfragile.ai/compare?artifact=pypi-peft"}},"signature":"uAmV77YeFXOk5mpdqMDQhPJzzTE1eHA8z5aEJZn157NlAuJZrW8R/RnZOezvWe3ATMg8mkAjzTcBU/UutomeAw==","signedAt":"2026-06-23T02:16:10.521Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/pypi-peft","artifact":"https://unfragile.ai/pypi-peft","verify":"https://unfragile.ai/api/v1/verify?slug=pypi-peft","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}