schicluster.loop.merge_raw_matrix#

Module Contents#

_chrom_sum_iterator(cell_urls, chrom_sizes, chrom_offset, add_trans=False)[source]#

Iterate through the raw matrices and chromosomes of cells.

Parameters:
  • cell_urls – List of cell urls.

  • chrom_sizes – Dictionary of chromosome sizes.

  • chrom_offset – Dictionary of chromosome offsets.

  • add_trans – If true, will also iterate all the trans combinations (different chromosomes).

Yields:

pixel_df – Dataframe of pixels. Used by cooler.create_cooler to save to h5 file.

_save_single_matrix_type(cooler_path, bins_df, cell_urls, chrom_sizes, chrom_offset, add_trans=False)[source]#

Save a single matrix type Cool file from merging multiple cell urls.

Parameters:
  • cooler_path – Path to the output cool file.

  • bins_df – Dataframe of bins. Created from chromosome sizes and resolution.

  • cell_urls – List of cell urls to merge.

  • chrom_sizes – Dictionary of chromosome sizes.

  • chrom_offset – Dictionary of chromosome offsets.

  • add_trans – Whether add trans matrix also.

make_raw_matrix_cell_table(cell_table, resolution_str='10K')[source]#
merge_raw_scool_by_cluster(chrom_size_path, resolution, cell_table_path, output_dir, add_trans=False, cpu=1)[source]#

Sum the raw matrix of cells, no normalization.

Parameters:
  • chrom_size_path – Path to the chrom size file. This file is used to determine chromosome names and bins.

  • resolution – Resolution of the raw matrix.

  • cell_table_path – Path to the cell table. This table should contain three columns: cell_id, cell_url, cell_group; no Header. The cell_id is the id of the cell in the raw matrix. The cell_url is the path to the raw matrix. The cell_group is the group of the cells.

  • output_dir – Path to the output directory. Group cool files will be named as “<output_dir>/<cell_group>.cool”.

  • add_trans – Whether add trans matrix also.

  • cpu – Number of CPUs to use.