We host StringZilla so that it can be run on our online workstations, either with Wine or directly.


Quick description of StringZilla:

StringZilla is the Godzilla of string libraries, built for splitting, sorting, and shuffling large textual datasets. It uses a heuristic so simple it's almost stupid, but it works: it matches the first few letters of words with hyper-scalar code to achieve memcpy speeds. The implementation fits into a single C99 header file and uses different SIMD flavors, falling back to SWAR on older platforms.

In Python, the Str class is designed to replace long Python str strings and wraps the library's C-level API. The File class, on the other hand, memory-maps a file from persistent storage without loading a copy of it into RAM. The contents of that file remain immutable, and the mapping can be shared by multiple Python processes simultaneously. A standard dataset pre-processing use case is to map a sizeable textual dataset such as Common Crawl into memory, spawn child processes, and split the job between them.
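The map-and-split use case can be illustrated with a minimal sketch. It uses only Python's standard mmap module, not StringZilla's own Str or File classes, and the helper names split_ranges, count_in_range, and demo are hypothetical, chosen for this illustration.

```python
import mmap
import os
import tempfile


def split_ranges(size, parts):
    """Split [0, size) into at most `parts` contiguous byte ranges."""
    step = max(1, -(-size // parts))  # ceiling division
    return [(start, min(start + step, size)) for start in range(0, size, step)]


def count_in_range(path, needle, start, end):
    """Count occurrences of `needle` that begin inside [start, end).

    The file is mapped read-only, so its contents are not copied into
    the process and cannot be modified through the mapping.
    """
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        count = 0
        pos = mapped.find(needle, start)
        while pos != -1 and pos < end:
            count += 1
            pos = mapped.find(needle, pos + 1)
        return count


def demo():
    """Write a small sample file, split it into ranges, and sum the counts."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
        tmp.write(b"alpha beta\nbeta gamma\nalpha beta\n")
        path = tmp.name
    try:
        ranges = split_ranges(os.path.getsize(path), 3)
        return sum(count_in_range(path, b"beta", s, e) for s, e in ranges)
    finally:
        os.remove(path)
```

Because every match is attributed to the range in which it starts, no occurrence is counted twice or lost at a range boundary. Each (start, end) pair is independent, so the calls could be handed to child processes, for example with multiprocessing.Pool.starmap, with each process opening its own read-only mapping of the same file.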

Features:
  • Collection-Level Operations
  • Low-Level Python API
  • Splitting, sorting, and shuffling of large textual datasets
  • JavaScript docs
  • Python docs
  • Substring Search
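
The "match the first few letters" heuristic from the description can be pictured with a rough pure-Python sketch of substring search. This is not StringZilla's code: the real library does this in C with SIMD or SWAR instructions, and the function name find_with_prefix_filter and the 4-byte default prefix length are assumptions made for this illustration.

```python
def find_with_prefix_filter(haystack: bytes, needle: bytes, prefix_len: int = 4) -> int:
    """Return the index of the first occurrence of `needle`, or -1 if absent."""
    n = len(needle)
    if n == 0:
        return 0
    k = min(prefix_len, n)
    # Pack the first k bytes of the needle into a single integer once.
    prefix = int.from_bytes(needle[:k], "little")
    for i in range(len(haystack) - n + 1):
        # Cheap filter: compare the first k bytes as one integer.
        if int.from_bytes(haystack[i:i + k], "little") != prefix:
            continue
        # Only candidates that pass the filter get the full comparison.
        if haystack[i + k:i + n] == needle[k:]:
            return i
    return -1
```

The point of the filter is that most positions are rejected by a single integer comparison, so the more expensive comparison of the remaining bytes runs only on the few candidates whose first letters already match.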


Programming Language: C++.
Categories:
JSON

©2024. Winfy. All Rights Reserved.

By OD Group OU – Registry code: 1609791 -VAT number: EE102345621.