subsetting kallisto object using subset_kallisto()  #204

@agropyron

Hi,
@warrenmcg's advice in Issue #201 on subsetting files (below) does not work for me:

kal <- sleuth::read_bootstrap(kal_path, read_bootstrap = TRUE)
ids <- c("ENSMUST00000000001", "ENSMUST00000000003", "ENSMUST00000000129")
subset_kal <- sleuth::subset_kallisto(kal, target_ids = ids)
write_kallisto_hdf5(subset_kal, fname = "subset_kal.h5")

For importing I used read_kallisto instead; I think read_bootstrap is a typo. That worked.
subset_kallisto(kal, target_ids = ids) returns an object with the correct number of rows (616) in kal$bootstrap but with zero rows in kal$abundance.
I went ahead with write_kallisto_hdf5 and, as a sanity check, re-imported the subsetted file using read_kallisto, which fails (understandably) with the error below.

Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem, :
Not enough memory to read data! Try to read a subset of data by specifying the index or count parameter.
Found 100 bootstrap samples
Error in $<-.data.frame(*tmp*, "est_counts", value = c(776, 4054, :
replacement has 616 rows, data has 0

When I use a custom function to subset the files, setting the additional attributes (original_num_targets, num_targets, excluded_ids) the same way subset_kallisto does, I get the correct number of rows in both kal$abundance and kal$bootstrap. However, I can't trick sleuth into accepting the new attributes: is_kallisto_subset does not report the file as subsetted.
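
A custom subsetting function of the kind described above might look roughly like this. It is a minimal sketch on toy data, not sleuth's own code: make_toy_kal and subset_kal_sketch are hypothetical names, the toy object has far fewer fields than a real kallisto object, and only the attribute names and the abundance/bootstrap layout are taken from the description above.

```r
# Toy stand-in for a kallisto object (assumption: real objects carry more fields).
make_toy_kal <- function() {
  ids <- c("ENSMUST00000000001", "ENSMUST00000000003",
           "ENSMUST00000000028", "ENSMUST00000000129")
  abundance <- data.frame(target_id = ids,
                          est_counts = c(10, 20, 30, 40),
                          stringsAsFactors = FALSE)
  kal <- list(abundance = abundance,
              bootstrap = list(abundance, abundance))
  attr(kal, "num_targets") <- nrow(abundance)
  attr(kal, "original_num_targets") <- nrow(abundance)
  kal
}

# Hypothetical helper: keep only `target_ids` in the abundance table and in
# every bootstrap table, and record what was dropped as attributes.
subset_kal_sketch <- function(kal, target_ids) {
  all_ids <- kal$abundance$target_id
  keep <- all_ids %in% target_ids
  original_n <- attr(kal, "original_num_targets")

  kal$abundance <- kal$abundance[keep, , drop = FALSE]
  kal$bootstrap <- lapply(kal$bootstrap, function(b) {
    b[b$target_id %in% target_ids, , drop = FALSE]
  })

  attr(kal, "num_targets") <- nrow(kal$abundance)
  attr(kal, "original_num_targets") <- original_n
  attr(kal, "excluded_ids") <- all_ids[!keep]
  kal
}

kal <- make_toy_kal()
sub <- subset_kal_sketch(kal, c("ENSMUST00000000001", "ENSMUST00000000129"))
```

Filtering abundance and each bootstrap table with the same ID set keeps their row counts in step, which is the property that breaks in the error above.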

The culprit is either write_kallisto_hdf5 or read_kallisto_h5. I think part of the problem lies in the lines below from read_kallisto_h5, but I can't figure out why the excluded_ids are not shown when calling summary(kal).

attr(res, 'num_targets') <- nrow(abund) ##ok
attr(res, 'original_num_targets') <- nrow(abund) ## sets the number of rows again instead of the original_num_targets attribute written when the file was subsetted
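
To illustrate the suspicion, here is a minimal sketch in base R of how setting both attributes from nrow(abund) would erase the subset information, and how preferring a stored value would keep it. Everything here is an assumption made for illustration: set_target_attrs and looks_subsetted are hypothetical helpers, stored_original stands in for a value that the writer would have to save in the file, and the comparison in looks_subsetted is only a guess at the kind of check is_kallisto_subset performs, not its actual implementation.

```r
# Hypothetical reader step. `stored_original` stands in for a value read back
# from the file if the writer saved one; NULL means nothing was stored.
set_target_attrs <- function(res, abund, stored_original = NULL) {
  attr(res, "num_targets") <- nrow(abund)
  attr(res, "original_num_targets") <-
    if (is.null(stored_original)) nrow(abund) else stored_original
  res
}

# Assumed check (not sleuth's code): the object counts as subsetted when the
# current and original target counts differ.
looks_subsetted <- function(res) {
  attr(res, "num_targets") != attr(res, "original_num_targets")
}

abund <- data.frame(target_id = c("t1", "t2", "t3"),
                    est_counts = c(5, 7, 9),
                    stringsAsFactors = FALSE)
res <- list(abundance = abund)

# Both attributes taken from nrow(abund), as in the lines quoted above.
lost <- set_target_attrs(res, abund)
# Original count supplied separately (10 is an arbitrary toy value).
kept <- set_target_attrs(res, abund, stored_original = 10)

looks_subsetted(lost)
looks_subsetted(kept)
```

Under these assumptions, the first object can never look subsetted after a re-import, because both counts are always equal.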
