Capability
Huggingface Compatible Generation Endpoint
2 artifacts provide this capability.
via “huggingface-compatible-generation-endpoint”
A self-hosted Copilot clone that uses the library behind llama.cpp to run the 6-billion-parameter Salesforce CodeGen model in 4 GB of RAM.
Unique: provides HF Inference API compatibility alongside OpenAI compatibility in the same server, letting users choose between two major API standards without running separate services; most inference servers support only one API format.
vs others: enables HF ecosystem integration, but with less complete parameter support than the native HF Transformers library.
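To illustrate the dual-API point, here is a minimal sketch of how a client might build requests in the two styles against the same server. The base URL, endpoint paths, model id, and payload fields are assumptions drawn from the public HF Inference API and OpenAI completions conventions, not from this server's documentation:

```python
import json

BASE = "http://localhost:8000"  # assumed default address; check the server's docs

def hf_request(prompt: str, max_new_tokens: int = 64) -> tuple[str, dict]:
    """HF Inference API style: a single 'inputs' field plus 'parameters'."""
    return f"{BASE}/generate", {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

def openai_request(prompt: str, max_tokens: int = 64) -> tuple[str, dict]:
    """OpenAI completions style: 'model', 'prompt', and 'max_tokens'."""
    return f"{BASE}/v1/completions", {
        "model": "codegen-6B",  # hypothetical model id
        "prompt": prompt,
        "max_tokens": max_tokens,
    }

url, body = hf_request("def fib(n):")
print(url, json.dumps(body))
```

The same prompt can be sent through either payload shape, so existing HF or OpenAI client code can target the server without modification.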