schicluster.zarr.cool_ds

Module Contents

SMALL_SAMPLE_CHUNK = 1
COMPRESSOR_C_LEVEL = 3
class CoolDSSingleMatrixWriter(path, cool_table_path, value_types, chrom_sizes_path, chrom1, chrom2=None, mode='w', cooler_bin_size=10000, bin_chunk_size=510, sample_chunk_size=50, data_dtype='float32', cpu=1)
_read_cool_table(cool_table_path)
_read_chrom_info(chrom_sizes_path)
_add_root_attrs()
_init_zarr()
_save_cool_to_temp_zarr(cool_paths, output_path, cool_type, value_type, triu)
_save_small_sample_chunks()
static _save_single_bin_chunk_worker(ds_paths, zarr_path, cool_type, bin1_slice, bin2_slice, sample_id_slice, value_idx)
_save_single_bin_chunk(zarr_path, path_array, cool_type)
_save_small_sample_chunks_ds_to_final_zarr(temp_zarr_records)
_write_data()
execute()

Execute the pipeline.

generate_cool_ds(output_dir, cool_table_path, value_types, chrom_sizes_path, trans_matrix=False, mode='w', cooler_bin_size=10000, bin_chunk_size=510, sample_chunk_size=50, data_dtype='float32', cpu=1)

Generate a CoolDS zarr dataset from cool files.

Parameters:
  • output_dir – The output directory.

  • cool_table_path – Path to the cool table, which has four columns: sample, value_type, path, cool_type.

  • value_types – Dict mapping cool types to their value types.

  • chrom_sizes_path – Path to the chrom sizes file.

  • trans_matrix – Whether to generate trans-contact (chrom1 != chrom2) matrices.

  • mode – Mode to open the zarr.

  • cooler_bin_size – Cooler bin size.

  • bin_chunk_size – Chunk size of the bin1 and bin2 dimensions.

  • sample_chunk_size – Chunk size of the sample dimension.

  • data_dtype – Data type of the matrix.

  • cpu – Number of CPUs to use.
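
To get a feel for how cooler_bin_size, bin_chunk_size and sample_chunk_size interact, the arithmetic can be sketched as below. This is only an illustration, not code from schicluster: the helper names, the chromosome length and the sample count are hypothetical, and it assumes that one chunk spans bin_chunk_size × bin_chunk_size bins and sample_chunk_size samples for a single value type, stored uncompressed as float32 (4 bytes per value).

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def n_bins(chrom_length, cooler_bin_size=10000):
    """Number of bins needed to cover a chromosome of chrom_length bp."""
    return ceil_div(chrom_length, cooler_bin_size)


def chunk_bytes(bin_chunk_size=510, sample_chunk_size=50, itemsize=4):
    """Uncompressed size of one chunk under the assumption stated above."""
    return bin_chunk_size * bin_chunk_size * sample_chunk_size * itemsize


# Hypothetical inputs, chosen only for illustration.
chrom_length = 195_471_971  # chromosome length in bp
n_samples = 1000            # number of samples in the cool table

bins = n_bins(chrom_length)           # bins along bin1 (and bin2 for cis)
bin_chunks = ceil_div(bins, 510)      # chunks along each bin dimension
sample_chunks = ceil_div(n_samples, 50)  # chunks along the sample dimension

print("bins:", bins)
print("chunks per bin dimension:", bin_chunks)
print("chunks along sample dimension:", sample_chunks)
print("uncompressed bytes per chunk:", chunk_bytes())
```

With the default values, one such chunk holds about 52 MB before compression under these assumptions, so lowering bin_chunk_size or sample_chunk_size is the obvious lever if memory per worker is tight.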