| Title: | Compress and Decompress Data Using the 'BLOSC' Library |
|---|---|
| Description: | Arrays of structured data types can require large volumes of disk space to store. 'Blosc' is a library that provides a fast and efficient way to compress such data. It is often applied in storage of n-dimensional arrays, such as in the case of the geo-spatial 'zarr' file format. This package can be used to compress and decompress data using 'Blosc'. |
| Authors: | Pepijn de Vries [aut, cre] (ORCID: <https://orcid.org/0000-0002-7961-6646>), Chris Maiwald [cph], Alexander Gessler [cph] |
| Maintainer: | Pepijn de Vries <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.2.0002 |
| Built: | 2026-06-05 10:46:42 UTC |
| Source: | https://github.com/pepijn-devries/blosc |
Use the Blosc library to compress or decompress data.
blosc_compress(
  x,
  compressor = "blosclz",
  level = 7L,
  shuffle = "noshuffle",
  typesize = 4L,
  ...
)

blosc_decompress(x, ...)
x |
In case of In case of |
compressor |
The compression algorithm to be used. Can be any of
|
level |
An |
shuffle |
A shuffle filter to be activated before compression.
Should be one of |
typesize |
BLOSC compresses arrays of structured data. This argument
specifies the size ( |
... |
Arguments passed to |
In case of blosc_compress() a vector of compressed raw
data is returned. blosc_decompress() returns a vector of
decompressed raw data or, in case dtype (see dtype_to_r()) is
specified, a vector of the specified type.
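The shuffle and typesize arguments are easier to reason about with a small illustration. The following is a conceptual sketch in base R of a byte-wise shuffle, not the package's or the Blosc library's implementation (Blosc applies its filters internally, per block, in compiled code). The helper names byte_shuffle() and byte_unshuffle() are hypothetical and exist only for this example. It shows why the element size matters: grouping bytes of equal significance tends to produce long runs of similar values, which compress well.

```r
# Hypothetical helpers illustrating a byte-wise shuffle (conceptual only).
# Each element occupies `typesize` bytes; the shuffle groups the first byte
# of every element, then the second byte of every element, and so on.
byte_shuffle <- function(bytes, typesize) {
  stopifnot(is.raw(bytes), length(bytes) %% typesize == 0)
  # Columns of the matrix are elements; transposing groups bytes by position
  as.vector(t(matrix(bytes, nrow = typesize)))
}

byte_unshuffle <- function(bytes, typesize) {
  stopifnot(is.raw(bytes), length(bytes) %% typesize == 0)
  # Columns of the matrix are byte positions; transposing restores elements
  as.vector(t(matrix(bytes, ncol = typesize)))
}

# Four 32-bit little-endian integers: 16 bytes, typesize = 4
x <- writeBin(1:4, raw(), size = 4L, endian = "little")
shuffled <- byte_shuffle(x, typesize = 4L)

print(x)         # low bytes are interleaved with zero bytes
print(shuffled)  # low bytes first, followed by a run of zero bytes

# The shuffle is lossless: unshuffling restores the original bytes
identical(byte_unshuffle(shuffled, typesize = 4L), x)
```

If typesize does not match the real element size, a shuffle like this would group unrelated bytes, which is why the argument should reflect the byte size of the stored data type.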
my_dat <- as.raw(sample.int(2L, 10L*1024L, replace = TRUE) - 1L)
my_dat_out <- blosc_compress(my_dat, typesize = 1L)
my_dat_decomp <- blosc_decompress(my_dat_out)

## After compressing and decompressing the data is the same as the original:
all(my_dat == my_dat_decomp)
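The example above works on raw data. As a complement, the sketch below compresses an integer vector by passing a dtype through the ... argument, the same way the blosc_info() example in this manual does, and decodes it again by giving the same dtype to blosc_decompress(). It assumes that "<i4" (little-endian 32-bit integer) is an accepted dtype and that typesize should match its 4-byte element size; treat it as an illustration rather than a reference.

```r
library(blosc)

# An integer vector; each element is stored as a 4-byte little-endian integer
x <- 1:1000

# Encode to raw using the dtype, then compress (typesize matches the dtype)
x_compressed <- blosc_compress(x, typesize = 4L, dtype = "<i4")

# Decompress and decode back to an R vector in one step
x_restored <- blosc_decompress(x_compressed, dtype = "<i4")

# The round trip should reproduce the original values
all(x_restored == x)

# Size of the compressed data in bytes, compared with the 4000 uncompressed bytes
length(x_compressed)
```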
Obtain information about raw data compressed with blosc.
blosc_info(x, ...)
x |
Raw data compressed with |
... |
Ignored |
Returns a named list with information about blosc compressed
data x.
data_compressed <- blosc_compress(volcano, typesize = 2, dtype = "<i2",
                                  compressor = "lz4", shuffle = "bitshuffle")
blosc_info(data_compressed)
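Since blosc_info() returns a named list, its contents can be inspected with generic base R tools. The sketch below makes no assumption about which elements the list contains; it only lists the names and prints the structure.

```r
library(blosc)

data_compressed <- blosc_compress(volcano, typesize = 2, dtype = "<i2",
                                  compressor = "lz4", shuffle = "bitshuffle")
info <- blosc_info(data_compressed)

# Which pieces of information are available?
names(info)

# Compact overview of all elements and their values
str(info)
```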
Use ZARR V2.0 data types to convert between R native types and raw data.
r_to_dtype(x, dtype, na_value = NA, ...)

dtype_to_r(x, dtype, na_value = NA, ...)
x |
Object to be converted |
dtype |
The data type used for encoding/decoding raw data. The The second character represents the main data type ( The following characters are numerical indicating the byte size of the data type.
For example: The main types For more details about dtypes see
ZARR V2.0
or |
na_value |
When storing raw data, you may want to reserve a value to
represent missing values. This is also what Therefore, you can use this argument to indicate which value should represent
missing values. By default it uses For more details see |
... |
Ignored |
One of the applications of BLOSC compression is in ZARR, which is used to store
n-dimensional structured data. r_to_dtype() and dtype_to_r() are convenience functions
that allow you to convert between the most common data types and R native types.
R natively only supports logical() (actually stored as 32 bit integer in memory),
integer() (signed 32 bit integers), numeric() (64 bit floating points) and complex()
(real and imaginary component both represented by a 64 bit floating point). R also has some
more complex classes, but those are generally derivatives of the aforementioned types.
The functions documented here will attempt to convert raw data to R types (or vice versa). As not all 'dtypes' have an appropriate R type counterpart, some conversions will not be possible directly and will result in an error.
For more details see vignette("dtypes").
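To make the idea of a dtype and a reserved missing value concrete, the following base R sketch mimics what an encoding such as "<i2" amounts to: little-endian, signed integer, 2 bytes per element, with -999 standing in for NA. It uses writeBin() and readBin() and is only an illustration under those assumptions, not the package's implementation; the variable names are chosen for this example.

```r
x <- c(1, 2, 3, NA, 4)
na_value <- -999

# Replace missing values by the reserved value before encoding
x_filled <- ifelse(is.na(x), na_value, x)

# Encode as 2-byte signed integers in little-endian byte order
encoded <- writeBin(as.integer(x_filled), raw(), size = 2L, endian = "little")
length(encoded)  # 5 elements of 2 bytes each

# Decode again and turn the reserved value back into NA
decoded <- readBin(encoded, what = "integer", n = length(x),
                   size = 2L, signed = TRUE, endian = "little")
decoded[decoded == na_value] <- NA
decoded
```

The reserved value has to be representable in the target type and should not occur as a genuine observation, otherwise real data would be decoded as missing.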
In case of r_to_dtype() a vector of encoded raw data is returned.
In case of dtype_to_r() a vector of an R type (appropriate for the specified dtype)
is returned if possible.
Pepijn de Vries
## Encode volcano data to 16 bit floating point values
volcano_encoded <- r_to_dtype(volcano, dtype = "<f2")

## Decode the volcano format to its original
volcano_reconstructed <- dtype_to_r(volcano_encoded, dtype = "<f2")

## The reconstruction is the same as its original:
all(volcano_reconstructed == volcano)

## Encode a numeric sequence with a missing value represented by -999
r_to_dtype(c(1, 2, 3, NA, 4), dtype = "<i2", na_value = -999)
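The volcano reconstruction above is exact because the volcano heights are small whole numbers, which a 16 bit float can represent without loss. That is not true in general. The sketch below, which assumes the same "<f2" round trip works for a single numeric value, shows that a value such as 0.1 comes back only approximately.

```r
library(blosc)

## Round trip of a value that has no exact 16 bit floating point representation
value <- 0.1
value_encoded <- r_to_dtype(value, dtype = "<f2")
value_decoded <- dtype_to_r(value_encoded, dtype = "<f2")

## Two bytes are used to store the value
length(value_encoded)

## The decoded value is close to, but not identical to, the original
value_decoded == value
abs(value_decoded - value)
```

When exact reconstruction matters, a wider dtype is the safer choice.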