{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"pypi_pypi-vaex","slug":"pypi-vaex","name":"vaex","type":"repo","url":"https://www.github.com/vaexio/vaex","page_url":"https://unfragile.ai/pypi-vaex","categories":["data-analysis"],"tags":[],"pricing":{"model":"open_source","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"pypi_pypi-vaex__cap_0","uri":"capability://data.processing.analysis.lazy.expression.evaluation.with.virtual.columns","name":"lazy-expression-evaluation-with-virtual-columns","description":"Implements a deferred computation model where DataFrame operations (e.g., df.x * df.y) are stored as expression trees rather than executed immediately. Virtual columns are calculated on-the-fly during materialization, avoiding intermediate memory allocation. The expression system defers actual computation until results are explicitly needed (visualization, aggregation, export), enabling efficient processing of billion-row datasets by processing only required data chunks.","intents":["Process datasets larger than available RAM without materializing intermediate results","Chain multiple column transformations without memory overhead","Defer expensive computations until results are actually needed","Create derived columns that consume no additional storage"],"best_for":["data scientists working with multi-gigabyte datasets on memory-constrained machines","teams building ETL pipelines requiring minimal intermediate storage","analysts exploring large datasets interactively without pre-computation"],"limitations":["Expression trees can become complex and difficult to debug for deeply nested operations","Some operations (e.g., certain joins) may force materialization, negating lazy benefits","Debugging lazy expressions requires understanding deferred execution semantics"],"requires":["Python 3.7+","NumPy for underlying array operations","Understanding of lazy evaluation paradigm"],"input_types":["column references (df.column_name)","numeric literals","boolean expressions","function calls on columns"],"output_types":["expression objects (not materialized)","computed arrays (when materialized)","scalar values (when aggregated)"],"categories":["data-processing-analysis","lazy-evaluation"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_1","uri":"capability://data.processing.analysis.memory.mapped.out.of.core.dataframe.access","name":"memory-mapped-out-of-core-dataframe-access","description":"Leverages OS-level memory mapping (mmap) to map data files directly into virtual address space, loading only accessed data pages into physical RAM on-demand. The DataFrame abstraction sits atop memory-mapped datasets (via dataset_mmap.py), enabling transparent access to files larger than available memory. Zero-copy operations mean column slicing and filtering create views rather than copies, with the kernel handling page faults and eviction automatically.","intents":["Work with 100GB+ datasets on machines with 8-16GB RAM","Perform column slicing and filtering without data duplication","Access specific row ranges efficiently without scanning entire file","Maintain consistent performance regardless of dataset size relative to RAM"],"best_for":["researchers analyzing large scientific datasets (astronomy, genomics) on commodity hardware","data engineers building scalable single-machine pipelines","teams avoiding cloud infrastructure costs for large-scale analysis"],"limitations":["Performance degrades if working set exceeds available RAM (causes thrashing)","Requires contiguous file formats (HDF5, Arrow, Parquet) — CSV requires full load","Memory-mapped files are OS-dependent; behavior varies on Windows vs Linux","Cannot modify memory-mapped data in-place; mutations require materialization"],"requires":["Python 3.7+","File system supporting mmap (all modern systems)","Data in HDF5, Apache Arrow, or Parquet format for optimal performance","Sufficient virtual address space (64-bit systems recommended)"],"input_types":["HDF5 files","Apache Arrow IPC format","Apache Parquet files","CSV/JSON (requires full load into memory)"],"output_types":["DataFrame views (zero-copy)","column arrays (memory-mapped)","filtered subsets (lazy views)"],"categories":["data-processing-analysis","memory-management"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_10","uri":"capability://data.processing.analysis.data.type.system.with.automatic.inference.and.conversion","name":"data-type-system-with-automatic-inference-and-conversion","description":"Implements a comprehensive data type system supporting numeric (int, float, complex), string, datetime, boolean, and categorical types with automatic inference from source data. Type conversion is lazy (deferred until materialization) and supports explicit casting via expressions. The system handles missing values (NaN, None) appropriately for each type. Array conversion to NumPy/Arrow formats is optimized for zero-copy where possible.","intents":["Automatically infer data types from imported data","Convert between data types efficiently","Handle missing values appropriately for each type","Export data to NumPy, Arrow, or other formats with correct types"],"best_for":["data import pipelines requiring type inference","teams working with heterogeneous data sources","analysis requiring specific numeric precision (float32 vs float64)"],"limitations":["Automatic type inference may be incorrect for ambiguous data (e.g., '123' as string vs int)","Type conversion can be expensive for large columns (requires materialization)","Some types (e.g., complex numbers) have limited support in some formats","Categorical types require materialization of category mapping","Missing value handling varies by type and may cause unexpected behavior"],"requires":["Python 3.7+","NumPy for type definitions","PyArrow for Arrow type system (optional)"],"input_types":["raw data (strings, numbers, dates)","explicit type specifications","source format (CSV, HDF5, etc.)"],"output_types":["typed columns (int, float, string, datetime, etc.)","NumPy arrays with correct dtype","Arrow arrays with correct type"],"categories":["data-processing-analysis","type-system"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_11","uri":"capability://data.processing.analysis.string.operations.with.vectorized.processing","name":"string-operations-with-vectorized-processing","description":"Provides vectorized string operations (substring, split, replace, case conversion, pattern matching) implemented in C++ for performance. String operations work on virtual columns without materializing intermediate results. The system supports regular expressions and Unicode handling. Operations are lazy and composed into expression trees for efficient batch processing.","intents":["Perform string transformations on large text columns efficiently","Extract patterns from text using regular expressions","Clean and normalize text data without materialization","Combine string operations with other column transformations"],"best_for":["data cleaning pipelines with text processing","teams analyzing text-heavy datasets (logs, documents, user-generated content)","NLP preprocessing requiring efficient string operations"],"limitations":["Complex string operations may require materialization for efficiency","Regular expression performance depends on pattern complexity","Unicode handling may have edge cases with certain character sets","Some advanced string operations (e.g., fuzzy matching) not supported","Memory usage scales with string length and column size"],"requires":["Python 3.7+","C++ extensions for vectorized operations","Regular expression support (built-in)"],"input_types":["string columns","pattern strings (for regex)","replacement strings","substring indices"],"output_types":["transformed string columns","boolean arrays (for pattern matching)","numeric arrays (for string length, position)"],"categories":["data-processing-analysis","text-processing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_12","uri":"capability://data.processing.analysis.statistical.aggregation.with.single.pass.computation","name":"statistical-aggregation-with-single-pass-computation","description":"Implements efficient statistical aggregations (sum, mean, std, min, max, median, percentiles, etc.) computed in a single pass over the data using Welford's algorithm and other numerically stable techniques. Aggregations work on virtual columns and support filtering and grouping. Results are computed lazily and materialized only when needed. The system maintains numerical stability for large datasets.","intents":["Compute summary statistics (mean, std, min, max) on large columns","Calculate percentiles and quantiles efficiently","Compute statistics on filtered or grouped subsets","Maintain numerical stability for large datasets"],"best_for":["statistical analysis of large datasets","data quality assessment and profiling","teams requiring robust statistics on billion-row datasets"],"limitations":["Some aggregations (e.g., median, percentiles) may require sorting or materialization","Numerical stability depends on algorithm choice (Welford's algorithm used for mean/std)","Very large datasets may accumulate floating-point errors","Some advanced statistics (e.g., skewness, kurtosis) may require multiple passes"],"requires":["Python 3.7+","NumPy for numerical operations"],"input_types":["numeric columns","aggregation function names (sum, mean, std, etc.)","grouping columns (optional)","filtering expressions (optional)"],"output_types":["scalar values (for single aggregations)","DataFrames (for grouped aggregations)","arrays (for percentiles)"],"categories":["data-processing-analysis","statistics"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_13","uri":"capability://data.processing.analysis.sorting.and.ordering.with.external.memory.techniques","name":"sorting-and-ordering-with-external-memory-techniques","description":"Provides sorting capabilities using external memory techniques (merge sort with disk spillover) for datasets larger than RAM. Sorting operations create ordered views or materialized sorted DataFrames. The system supports sorting on multiple columns with mixed sort orders (ascending/descending). Sorting is lazy when possible but may require materialization for certain operations. Index-based access enables efficient lookups on sorted data.","intents":["Sort large datasets that exceed available RAM","Create ordered views for sequential processing","Sort on multiple columns with mixed sort orders","Enable efficient lookups on sorted data via indexing"],"best_for":["data processing pipelines requiring sorted output","teams analyzing time-series or sequential data","analysis requiring top-K or bottom-K operations"],"limitations":["Sorting requires materialization (cannot remain lazy)","External memory sorting adds disk I/O overhead","Sorting on high-cardinality keys can be expensive","Multi-column sorting may require multiple passes","Sorted order is not preserved across mutations"],"requires":["Python 3.7+","Sufficient disk space for external memory sorting","Sortable data types (numeric, string, datetime)"],"input_types":["column names to sort by","sort order specification (ascending/descending)","list of columns for multi-column sort"],"output_types":["sorted DataFrame","sort indices (for reordering other columns)","top-K/bottom-K results"],"categories":["data-processing-analysis","sorting"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_14","uri":"capability://data.processing.analysis.export.to.multiple.formats.with.format.optimization","name":"export-to-multiple-formats-with-format-optimization","description":"Provides export functionality to HDF5, Apache Arrow, Apache Parquet, CSV, and other formats with automatic format selection based on use case. Export operations materialize data and write to disk with optional compression. The system supports incremental export (appending to existing files) and format conversion. Export can be parallelized across multiple threads for improved throughput.","intents":["Save processed data to disk in optimized formats","Convert between data formats for interoperability","Export subsets of data for sharing or archival","Compress data to reduce storage requirements"],"best_for":["data pipelines requiring persistent storage","teams sharing data across different tools and platforms","analysis requiring format conversion for downstream processing"],"limitations":["Export requires materializing data (cannot remain lazy)","Compression adds CPU overhead during export","Large exports can take significant time (hours for 100GB+)","Incremental export may not be supported for all formats","Format conversion may lose metadata or type information"],"requires":["Python 3.7+","Format-specific libraries (h5py, pyarrow, pandas)","Sufficient disk space for output file"],"input_types":["DataFrame objects","output file path","format specification","compression options"],"output_types":["HDF5, Arrow, Parquet, CSV, or JSON files","export status/progress","file metadata"],"categories":["data-processing-analysis","input-output"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_2","uri":"capability://automation.workflow.task.execution.engine.with.multithreading.orchestration","name":"task-execution-engine-with-multithreading-orchestration","description":"Implements a task-based execution model (via execution.py and tasks.py) where deferred expressions are compiled into tasks that execute on thread pools. The engine batches operations, manages task dependencies, and coordinates multithreaded execution across CPU cores. Tasks operate on chunked data, allowing efficient parallelization while respecting memory constraints. Progress tracking and cancellation are built into the execution pipeline.","intents":["Execute deferred expressions efficiently across multiple CPU cores","Batch multiple operations to minimize memory bandwidth overhead","Track progress of long-running computations","Cancel in-flight computations without data corruption"],"best_for":["multi-core systems (4+ cores) processing large datasets","interactive analysis requiring responsive progress feedback","batch processing pipelines with cancellation requirements"],"limitations":["GIL contention may limit speedup on CPU-bound operations with pure Python code","Task scheduling overhead (~1-5ms per task) becomes significant for very small datasets","No distributed execution — limited to single machine","Complex task graphs can consume significant memory for intermediate results"],"requires":["Python 3.7+","Multi-core CPU (single-core systems will not benefit from parallelization)","NumPy/C++ extensions for compute-intensive operations (to release GIL)"],"input_types":["expression trees (from lazy evaluation)","task dependency graphs"],"output_types":["computed arrays","aggregation results","progress events"],"categories":["automation-workflow","task-orchestration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_3","uri":"capability://data.processing.analysis.groupby.aggregation.with.hash.based.binning","name":"groupby-aggregation-with-hash-based-binning","description":"Implements efficient group-by operations using hash-based binning rather than sorting, allowing O(n) aggregations without requiring data to be pre-sorted. The GroupBy abstraction supports multiple aggregation functions (sum, mean, count, std, etc.) computed in a single pass over the data. Virtual columns enable grouping on derived expressions without materializing intermediate results. Results are returned as new DataFrames with group keys and aggregated values.","intents":["Compute aggregations (sum, mean, count, std) grouped by one or more columns","Group by derived expressions without materializing intermediate columns","Perform multiple aggregations in a single pass for efficiency","Handle high-cardinality grouping keys efficiently"],"best_for":["data analysts computing summary statistics by category","time-series analysis with temporal grouping","teams requiring fast multi-level aggregations on large datasets"],"limitations":["Hash-based binning requires sufficient memory for hash table (scales with cardinality)","Very high cardinality grouping keys (millions of unique values) may cause memory pressure","Ordered grouping (e.g., cumulative sums) requires additional sorting pass","Some advanced aggregations (e.g., percentiles) may require materialization"],"requires":["Python 3.7+","Columns to group by must be hashable types (numeric, string, datetime)","Sufficient memory for hash table proportional to cardinality"],"input_types":["column names (string)","expression objects (for derived grouping keys)","list of columns (for multi-level grouping)"],"output_types":["DataFrame with group keys and aggregated values","scalar values (for single-group aggregations)"],"categories":["data-processing-analysis","aggregation"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_4","uri":"capability://data.processing.analysis.join.operations.with.hash.and.sort.strategies","name":"join-operations-with-hash-and-sort-strategies","description":"Implements multiple join strategies (hash join, sort-merge join) selected based on data characteristics and memory availability. The join operation builds hash tables or sorts data as needed, supporting inner, left, right, and outer joins. Joins operate on DataFrames with automatic alignment of join keys, and results are returned as new DataFrames. The system optimizes join order and strategy selection based on dataset size and cardinality.","intents":["Combine data from two DataFrames based on matching keys","Perform inner, left, right, and outer joins efficiently","Join on derived expressions without materializing intermediate columns","Handle joins where one or both DataFrames are larger than RAM"],"best_for":["data integration tasks combining multiple large datasets","relational analysis requiring multi-table operations","teams building data pipelines with complex join logic"],"limitations":["Join operations may force materialization of one or both DataFrames, negating lazy benefits","Hash joins require sufficient memory for hash table (scales with smaller DataFrame size)","Sort-merge joins require O(n log n) time and temporary storage","Joining on high-cardinality keys can cause memory exhaustion","Cross joins (Cartesian product) are not supported for large datasets"],"requires":["Python 3.7+","Join keys must be hashable or sortable types","Sufficient memory for hash table or sort buffers"],"input_types":["two DataFrame objects","join key column names (string or list)","join type specification (inner, left, right, outer)"],"output_types":["DataFrame with combined columns from both inputs","rows matching join condition"],"categories":["data-processing-analysis","relational-operations"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_5","uri":"capability://data.processing.analysis.multi.format.data.import.with.format.optimization","name":"multi-format-data-import-with-format-optimization","description":"Provides unified import interface supporting HDF5, Apache Arrow, Apache Parquet, CSV, and JSON formats with automatic format detection and optimization recommendations. The system includes format-specific dataset classes (e.g., HDF5Dataset, ArrowDataset) that implement memory-mapped access where possible. CSV/JSON require full materialization but are automatically converted to optimized formats for repeated access. The import pipeline handles compression, encoding, and type inference.","intents":["Load data from multiple file formats into a unified DataFrame interface","Automatically detect optimal format for performance and storage","Convert between formats while preserving data integrity","Handle compressed and encoded data transparently"],"best_for":["data engineers building ETL pipelines with heterogeneous data sources","analysts working with datasets in multiple formats","teams optimizing storage and access patterns for large datasets"],"limitations":["CSV/JSON import requires full materialization into memory (no streaming)","Format conversion can be time-consuming for very large files (hours for 100GB+)","Some formats (CSV) lose type information; requires explicit type specification","Compression adds CPU overhead during decompression","Cloud storage (S3, GCS) requires additional dependencies and network latency"],"requires":["Python 3.7+","Format-specific libraries: h5py (HDF5), pyarrow (Arrow/Parquet), pandas (CSV/JSON)","Sufficient disk space for format conversion"],"input_types":["file paths (local or cloud URLs)","file-like objects","format specification (auto-detected or explicit)"],"output_types":["DataFrame objects","format recommendations","conversion status/progress"],"categories":["data-processing-analysis","input-output"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_6","uri":"capability://image.visual.interactive.visualization.with.server.backend","name":"interactive-visualization-with-server-backend","description":"Provides interactive visualization capabilities through a server-based architecture (vaex-server) that streams aggregated data to browser-based frontends. The visualization system computes histograms, heatmaps, and scatter plots on the server side, sending only aggregated results to the client. This enables interactive exploration of billion-row datasets with responsive UI updates. The server handles query execution, caching, and result streaming.","intents":["Explore large datasets interactively with responsive visualizations","Create histograms, heatmaps, and scatter plots without materializing full data","Share interactive dashboards with collaborators via web interface","Drill down into data subsets based on visual selections"],"best_for":["data scientists exploring large datasets interactively","teams building shared dashboards for data exploration","analysts requiring responsive UI for billion-row datasets"],"limitations":["Server-based architecture adds network latency compared to local visualization","Aggregation-based approach may hide outliers or fine-grained patterns","Requires running separate server process (additional infrastructure)","Browser-based UI may have performance limitations for very dense visualizations","Real-time updates require WebSocket support and may consume significant bandwidth"],"requires":["Python 3.7+","vaex-server package","Modern web browser (Chrome, Firefox, Safari, Edge)","Network connectivity between client and server"],"input_types":["DataFrame objects","column specifications for visualization","aggregation parameters (bin counts, ranges)"],"output_types":["interactive web-based visualizations","aggregated data (histograms, heatmaps)","drill-down results"],"categories":["image-visual","tool-use-integration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_7","uri":"capability://data.processing.analysis.machine.learning.model.integration.with.lazy.feature.engineering","name":"machine-learning-model-integration-with-lazy-feature-engineering","description":"Provides wrapper classes for scikit-learn, XGBoost, and other ML frameworks that integrate with Vaex's lazy evaluation system. Features can be engineered as virtual columns without materialization, and models are trained on materialized data only when needed. The system supports feature scaling, encoding, and transformation pipelines that operate on expressions. Model predictions can be added back as virtual columns for further analysis.","intents":["Train ML models on large datasets without materializing all features","Engineer features as virtual columns without storage overhead","Apply feature transformations (scaling, encoding) lazily","Add model predictions as virtual columns for ensemble or analysis"],"best_for":["data scientists building ML pipelines on large datasets","teams requiring efficient feature engineering without intermediate storage","analysts combining ML predictions with exploratory analysis"],"limitations":["Model training still requires materializing training data (lazy features only)","Some scikit-learn transformers may not work with Vaex expressions","Hyperparameter tuning requires multiple training passes (expensive for large data)","Predictions on new data require materializing feature vectors","Limited support for deep learning frameworks (PyTorch, TensorFlow)"],"requires":["Python 3.7+","scikit-learn, XGBoost, or other ML framework","Sufficient memory to materialize training data","vaex-ml package"],"input_types":["DataFrame objects","feature column names or expressions","target column name","model class (sklearn estimator)"],"output_types":["trained model objects","predictions (as arrays or virtual columns)","feature importance scores"],"categories":["data-processing-analysis","tool-use-integration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_8","uri":"capability://memory.knowledge.caching.system.with.smart.invalidation","name":"caching-system-with-smart-invalidation","description":"Implements a multi-level caching system that stores computed results (aggregations, filtered views, materialized columns) with automatic invalidation when source data changes. The cache tracks dependencies between operations, invalidating only affected cached results when mutations occur. Cache eviction policies balance memory usage with hit rates. The system supports both in-memory and disk-based caching for large intermediate results.","intents":["Avoid recomputing expensive aggregations and transformations","Maintain cache consistency when data is modified","Optimize memory usage through intelligent cache eviction","Speed up interactive analysis with cached intermediate results"],"best_for":["interactive analysis sessions with repeated queries","pipelines with expensive intermediate computations","teams working with stable datasets with occasional updates"],"limitations":["Cache invalidation logic can be complex for deeply nested operations","Memory overhead of cache metadata may be significant for many small operations","Disk-based caching adds I/O latency compared to in-memory caching","Cache coherency issues if data is modified externally (outside Vaex)","No distributed cache (single-machine only)"],"requires":["Python 3.7+","Sufficient memory or disk space for cache storage","Understanding of cache invalidation semantics"],"input_types":["computed results (arrays, scalars)","operation dependency graphs"],"output_types":["cached results (returned directly)","cache hit/miss statistics"],"categories":["memory-knowledge","optimization"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"pypi_pypi-vaex__cap_9","uri":"capability://data.processing.analysis.selection.and.filtering.with.boolean.indexing","name":"selection-and-filtering-with-boolean-indexing","description":"Provides efficient row filtering through boolean indexing and selection operations that create lazy views without materializing filtered data. Selections can be combined using boolean operators (AND, OR, NOT) and chained for complex filtering logic. The system supports filtering on both materialized columns and virtual (derived) columns. Filtered views maintain the original data structure and can be further processed or materialized on demand.","intents":["Filter rows based on column values or complex boolean expressions","Create subsets of data for focused analysis without copying","Combine multiple filter conditions efficiently","Apply filters on derived columns without materializing intermediate results"],"best_for":["exploratory analysis requiring frequent filtering","data cleaning pipelines with complex selection logic","teams building interactive dashboards with dynamic filtering"],"limitations":["Complex boolean expressions can become difficult to read and maintain","Filtering on high-cardinality columns may require materializing boolean arrays","Chained filters may not optimize well (requires query optimization)","Some operations (e.g., row-based filtering with UDFs) require materialization"],"requires":["Python 3.7+","Boolean expressions using standard Python operators (==, !=, <, >, &, |, ~)"],"input_types":["boolean expressions (df.column > value)","column names and values","combined boolean expressions"],"output_types":["filtered DataFrame views (lazy)","boolean arrays (when materialized)","row counts (for filtered subsets)"],"categories":["data-processing-analysis","filtering"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":25,"verified":false,"data_access_risk":"high","permissions":["Python 3.7+","NumPy for underlying array operations","Understanding of lazy evaluation paradigm","File system supporting mmap (all modern systems)","Data in HDF5, Apache Arrow, or Parquet format for optimal performance","Sufficient virtual address space (64-bit systems recommended)","NumPy for type definitions","PyArrow for Arrow type system (optional)","C++ extensions for vectorized operations","Regular expression support (built-in)"],"failure_modes":["Expression trees can become complex and difficult to debug for deeply nested operations","Some operations (e.g., certain joins) may force materialization, negating lazy benefits","Debugging lazy expressions requires understanding deferred execution semantics","Performance degrades if working set exceeds available RAM (causes thrashing)","Requires contiguous file formats (HDF5, Arrow, Parquet) — CSV requires full load","Memory-mapped files are OS-dependent; behavior varies on Windows vs Linux","Cannot modify memory-mapped data in-place; mutations require materialization","Automatic type inference may be incorrect for ambiguous data (e.g., '123' as string vs int)","Type conversion can be expensive for large columns (requires materialization)","Some types (e.g., complex numbers) have limited support in some formats","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.05,"quality":0.35,"ecosystem":0.39999999999999997,"match_graph":0.25,"freshness":0.52,"weights":{"adoption":0.3,"quality":0.2,"ecosystem":0.15,"match_graph":0.3,"freshness":0.05}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-06-17T09:51:05.295Z","last_scraped_at":"2026-05-03T15:20:22.334Z","last_commit":null},"community":{"stars":null,"forks":null,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=pypi-vaex","compare_url":"https://unfragile.ai/compare?artifact=pypi-vaex"}},"signature":"OgK1J9ikIZ4HudQr3w3liKHjwCsP1ie9bG2jjdg0qoq5CeG2NwB4Z/vMPTmQcXSgMzMl0pVDJf6X1zOJpknaAQ==","signedAt":"2026-06-22T20:09:08.393Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/pypi-vaex","artifact":"https://unfragile.ai/pypi-vaex","verify":"https://unfragile.ai/api/v1/verify?slug=pypi-vaex","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}