{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"hf-dataset-xlangai--ubuntu_osworld_file_cache","slug":"xlangai--ubuntu_osworld_file_cache","name":"ubuntu_osworld_file_cache","type":"dataset","url":"https://huggingface.co/datasets/xlangai/ubuntu_osworld_file_cache","page_url":"https://unfragile.ai/xlangai--ubuntu_osworld_file_cache","categories":["model-training"],"tags":["license:apache-2.0","arxiv:2404.07972","region:us"],"pricing":{"model":"open_source","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"hf-dataset-xlangai--ubuntu_osworld_file_cache__cap_0","uri":"capability://data.processing.analysis.ubuntu.os.task.execution.trajectory.caching","name":"ubuntu os task execution trajectory caching","description":"Stores pre-computed file system states and execution traces from Ubuntu desktop environment interactions, enabling rapid retrieval of realistic OS-level task demonstrations without re-executing complex multi-step workflows. The dataset captures filesystem snapshots, command sequences, and state transitions from the OSWorld benchmark, allowing models to learn from cached execution patterns rather than simulating environments from scratch.","intents":["Train agents to understand realistic Ubuntu desktop workflows and file system operations","Retrieve reference trajectories for specific OS-level tasks without environment simulation overhead","Build datasets for evaluating agent performance on desktop automation tasks","Study failure modes and edge cases in OS interaction patterns across diverse system states"],"best_for":["ML researchers training desktop automation agents","Teams building OS-level task datasets for benchmarking","Developers creating agents that must understand Linux file system semantics","Organizations evaluating LLM performance on real-world system administration tasks"],"limitations":["Dataset is Ubuntu-specific; Windows and macOS workflows not represented","Cached trajectories reflect specific system configurations and may not generalize to all Ubuntu versions or custom setups","File cache captures point-in-time snapshots; dynamic system state changes between cache points are not recorded","No built-in versioning or temporal tracking of how file states evolve across multiple task sequences"],"requires":["HuggingFace Datasets library (datasets>=2.0.0)","Python 3.8+","Sufficient disk space for ~1M+ cached file entries","Understanding of OSWorld benchmark format and task definitions"],"input_types":["task identifiers","file path queries","execution step indices"],"output_types":["file system snapshots (JSON/structured)","command execution traces","state transition records","file content and metadata"],"categories":["data-processing-analysis","model-training"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hf-dataset-xlangai--ubuntu_osworld_file_cache__cap_1","uri":"capability://search.retrieval.multi.step.task.trajectory.indexing.and.retrieval","name":"multi-step task trajectory indexing and retrieval","description":"Implements a structured index over cached execution traces that maps task identifiers to sequences of file system states, command outputs, and intermediate results. Enables efficient lookup of complete task trajectories or individual execution steps without scanning the entire dataset, using hierarchical indexing by task type, complexity, and execution outcome.","intents":["Retrieve full execution traces for a specific task to understand expected behavior","Look up intermediate file states at particular steps in a multi-step workflow","Find similar tasks based on file system patterns or command sequences","Sample diverse task trajectories for training data augmentation"],"best_for":["Researchers building few-shot learning datasets for OS tasks","Teams implementing retrieval-augmented generation (RAG) for agent prompting","Developers creating curriculum learning strategies for progressive task complexity","Benchmark evaluators needing stratified sampling across task categories"],"limitations":["Index is static and reflects the OSWorld benchmark snapshot; new tasks require dataset regeneration","Trajectory retrieval returns cached states only; real-time system state divergence is not handled","No fuzzy matching for similar file paths or command variations across trajectories","Index structure optimized for lookup speed, not for discovering cross-task dependencies or shared subtasks"],"requires":["HuggingFace Datasets library with indexing support","Python 3.8+","Familiarity with OSWorld task taxonomy and naming conventions"],"input_types":["task identifiers (string)","step indices (integer)","file path patterns (string)"],"output_types":["trajectory records (JSON)","file snapshots (binary/text)","command sequences (list)","execution metadata (structured)"],"categories":["search-retrieval","data-processing-analysis"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hf-dataset-xlangai--ubuntu_osworld_file_cache__cap_2","uri":"capability://data.processing.analysis.file.system.state.serialization.and.deserialization","name":"file system state serialization and deserialization","description":"Converts live Ubuntu file system states (directory trees, file contents, permissions, metadata) into serialized formats suitable for storage and transmission, and reconstructs those states for agent evaluation. Uses structured representations (JSON/Protocol Buffers) to capture file hierarchies, content hashes, and system metadata while maintaining semantic equivalence for task execution validation.","intents":["Serialize file system snapshots from task execution for storage in the cache","Reconstruct file system states in agent evaluation environments to match ground truth","Compare file system states before and after agent actions to validate task completion","Generate minimal diffs between consecutive task steps for efficient storage"],"best_for":["Benchmark infrastructure teams managing large-scale task execution and caching","Researchers validating agent behavior against ground-truth file system states","Teams implementing deterministic task replay and reproducibility","Developers building file system diff and validation tools for OS agents"],"limitations":["Serialization captures file metadata and content but not extended attributes (xattr) or SELinux contexts","Large binary files are stored as content hashes; full content retrieval requires separate blob storage","Symbolic links and hard links are normalized; exact link structure may not be preserved","File permissions and ownership are captured but may not be fully reproducible across different Ubuntu versions or user contexts"],"requires":["Python 3.8+","File system access permissions to read metadata and content","Storage backend for large file blobs (local disk or cloud object store)"],"input_types":["file system paths (string)","directory trees (nested structures)","file content (binary/text)","metadata (permissions, timestamps, ownership)"],"output_types":["serialized state (JSON/Protocol Buffers)","file content hashes (SHA256)","directory tree representations (nested JSON)","state diffs (structured deltas)"],"categories":["data-processing-analysis"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hf-dataset-xlangai--ubuntu_osworld_file_cache__cap_3","uri":"capability://data.processing.analysis.task.outcome.and.success.criteria.validation","name":"task outcome and success criteria validation","description":"Encodes ground-truth success criteria for each cached task (file creation, content validation, permission changes, command output matching) and provides validation functions to check whether agent actions achieve those criteria. Stores expected file states, output patterns, and side effects alongside trajectories, enabling automated evaluation without manual inspection.","intents":["Automatically validate whether an agent successfully completed a task by comparing final state to ground truth","Extract success criteria from cached trajectories to use in agent training reward functions","Identify partial task completion and intermediate milestones for curriculum learning","Generate failure diagnostics by comparing agent-produced states to expected outcomes"],"best_for":["Benchmark evaluation frameworks automating task success assessment","Researchers training agents with reward signals derived from ground-truth outcomes","Teams implementing automated testing for OS automation agents","Developers building diagnostic tools to explain agent failures"],"limitations":["Success criteria are binary or categorical; no continuous reward signals for partial progress","Validation assumes deterministic task outcomes; non-deterministic behaviors (timing, race conditions) may cause false negatives","Criteria are task-specific and may not generalize to task variants or modified environments","No support for validating side effects or unintended state changes beyond the explicit success criteria"],"requires":["Python 3.8+","Access to cached task definitions and expected outcomes","File system comparison utilities"],"input_types":["task identifiers (string)","final file system state (serialized)","command outputs (text)","execution logs (structured)"],"output_types":["success/failure boolean","validation report (structured)","mismatch details (file-level diffs)","partial completion metrics (percentage)"],"categories":["data-processing-analysis","safety-moderation"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"hf-dataset-xlangai--ubuntu_osworld_file_cache__cap_4","uri":"capability://data.processing.analysis.benchmark.dataset.versioning.and.provenance.tracking","name":"benchmark dataset versioning and provenance tracking","description":"Maintains metadata about dataset version, OSWorld benchmark version, Ubuntu system configuration, and execution environment for each cached trajectory. Enables reproducibility by documenting the exact conditions under which tasks were executed, and supports dataset evolution by tracking changes to task definitions, success criteria, or file system states across versions.","intents":["Reproduce agent evaluation results by matching dataset version and system configuration","Track changes to task definitions or success criteria across dataset versions","Understand which Ubuntu versions and configurations are represented in the cache","Compare agent performance across different dataset versions to measure benchmark drift"],"best_for":["Benchmark maintainers managing dataset evolution and backward compatibility","Researchers ensuring reproducibility of published results","Teams comparing agent performance across benchmark versions","Organizations auditing dataset provenance for compliance or transparency"],"limitations":["Provenance metadata is captured at trajectory level but not at individual file level","Version tracking does not include detailed change logs; only version identifiers are stored","System configuration metadata may be incomplete if execution environment was not fully documented","No automatic detection of breaking changes between versions; manual review required"],"requires":["HuggingFace Datasets library with metadata support","Python 3.8+","Access to dataset version history and release notes"],"input_types":["dataset version identifiers (string)","system configuration metadata (JSON)","execution environment details (structured)"],"output_types":["version information (structured)","provenance metadata (JSON)","compatibility reports (text)","change summaries (structured)"],"categories":["data-processing-analysis"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":22,"verified":false,"data_access_risk":"high","permissions":["HuggingFace Datasets library (datasets>=2.0.0)","Python 3.8+","Sufficient disk space for ~1M+ cached file entries","Understanding of OSWorld benchmark format and task definitions","HuggingFace Datasets library with indexing support","Familiarity with OSWorld task taxonomy and naming conventions","File system access permissions to read metadata and content","Storage backend for large file blobs (local disk or cloud object store)","Access to cached task definitions and expected outcomes","File system comparison utilities"],"failure_modes":["Dataset is Ubuntu-specific; Windows and macOS workflows not represented","Cached trajectories reflect specific system configurations and may not generalize to all Ubuntu versions or custom setups","File cache captures point-in-time snapshots; dynamic system state changes between cache points are not recorded","No built-in versioning or temporal tracking of how file states evolve across multiple task sequences","Index is static and reflects the OSWorld benchmark snapshot; new tasks require dataset regeneration","Trajectory retrieval returns cached states only; real-time system state divergence is not handled","No fuzzy matching for similar file paths or command variations across trajectories","Index structure optimized for lookup speed, not for discovering cross-task dependencies or shared subtasks","Serialization captures file metadata and content but not extended attributes (xattr) or SELinux contexts","Large binary files are stored as content hashes; full content retrieval requires separate blob storage","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.05,"quality":0.2,"ecosystem":0.38999999999999996,"match_graph":0.25,"freshness":0.75,"weights":{"adoption":0.3,"quality":0.25,"ecosystem":0.1,"match_graph":0.3,"freshness":0.05}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-05-24T12:16:22.764Z","last_scraped_at":"2026-05-03T14:22:48.064Z","last_commit":null},"community":{"stars":null,"forks":null,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=xlangai--ubuntu_osworld_file_cache","compare_url":"https://unfragile.ai/compare?artifact=xlangai--ubuntu_osworld_file_cache"}},"signature":"2M5lsLtT+LAd5ZY8sV/YSQ0PYGvtxN36wwH3iVGdrp9RrKhgAIGzKYIo2oNf08w4gVjLplWD+NlgJuo2Y71HDA==","signedAt":"2026-06-21T04:04:21.327Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/xlangai--ubuntu_osworld_file_cache","artifact":"https://unfragile.ai/xlangai--ubuntu_osworld_file_cache","verify":"https://unfragile.ai/api/v1/verify?slug=xlangai--ubuntu_osworld_file_cache","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}