Hi,
@warrenmcg's advice in Issue #201 on how to subset files (below) does not work for me:
kal <- sleuth::read_bootstrap(kal_path, read_bootstrap = TRUE)
ids <- c("ENSMUST00000000001", "ENSMUST00000000003", "ENSMUST00000000129")
subset_kal <- sleuth::subset_kallisto(kal, target_ids = ids)
write_kallisto_hdf5(subset_kal, fname = "subset_kal.h5")
For importing I used read_kallisto instead, since I think read_bootstrap is a typo. That step worked.
subset_kallisto(kal, target_ids=ids) results in a file with the correct number of rows (616) in kal$bootstrap but with zero rows in kal$abundance.
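A quick way to confirm this kind of mismatch is to compare the row counts directly. The snippet below is only a sketch on a toy object: it assumes `abundance` is a data frame and `bootstrap` is a list of data frames, and both `toy_kal` and the helper `count_rows` are hypothetical names, not part of sleuth.

```r
# Toy stand-in for a subsetted kallisto object. The real object comes from
# sleuth and may be laid out differently (assumption for illustration).
toy_kal <- list(
  abundance = data.frame(target_id = character(0), est_counts = numeric(0)),
  bootstrap = list(
    data.frame(target_id = c("a", "b"), est_counts = c(776, 4054)),
    data.frame(target_id = c("a", "b"), est_counts = c(780, 4010))
  )
)

# Hypothetical helper: row count of the abundance table and of every bootstrap.
count_rows <- function(kal) {
  list(
    abundance = nrow(kal$abundance),
    bootstrap = vapply(kal$bootstrap, nrow, integer(1))
  )
}

counts <- count_rows(toy_kal)

# TRUE when at least one bootstrap table disagrees with the abundance table.
mismatch <- any(counts$bootstrap != counts$abundance)
```

On a real object, the same comparison would be run on `kal` right after subsetting and before writing the HDF5 file.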
I proceeded through write_kallisto_hdf5 and, as a sanity check, re-imported the subsetted files using read_kallisto, which fails (understandably) with the errors below.
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem, :
Not enough memory to read data! Try to read a subset of data by specifying the index or count parameter.
Found 100 bootstrap samples
Error in `$<-.data.frame`(`*tmp*`, "est_counts", value = c(776, 4054, :
  replacement has 616 rows, data has 0
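The second error is the usual base R complaint when a column is assigned to a data frame with zero rows, which fits the empty abundance table described above. A minimal reproduction with made-up values (no sleuth or HDF5 involved, so this is only an illustration of the error class):

```r
# Empty abundance-like table, standing in for the zero-row kal$abundance.
empty_abund <- data.frame(target_id = character(0))

# Assigning a non-empty column to a zero-row data frame raises an error;
# capture its message instead of stopping the script.
msg <- tryCatch(
  {
    empty_abund$est_counts <- c(776, 4054)
    NULL
  },
  error = function(e) conditionMessage(e)
)
```

Here `msg` holds the error text, which has the same "replacement has ... rows, data has 0" form as the message in the report.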
When I use a custom function to subset the files, providing additional attributes (original_num_targets, num_targets, excluded_ids) in the same way subset_kallisto does, I get the correct number of rows in kal$abundance and kal$bootstrap. However, I can't cheat sleuth into accepting the new attributes: is_kallisto_subset does not report the file as subsetted.
The culprit is either write_kallisto_hdf5 or read_kallisto_h5. I think part of the problem lies in the lines below from read_kallisto_h5, but I can't figure out why the excluded_ids are not shown when calling summary(kal).
attr(res, 'num_targets') <- nrow(abund) ## ok
attr(res, 'original_num_targets') <- nrow(abund) ## re-imports the number of rows instead of the original_num_targets attribute written when the file was subsetted
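To make the suspicion concrete, here is a base R sketch of the effect of those two assignments, next to one possible alternative that keeps a stored value when there is one. Everything except the attribute names is an assumption: `stored_original` stands for a value that would have to be read back from the file, and `looks_subsetted` is a hypothetical check, not sleuth's is_kallisto_subset, whose actual logic may differ.

```r
# Toy abundance table for a subsetted file (values are made up).
abund <- data.frame(target_id = c("a", "b", "c"),
                    est_counts = c(776, 4054, 12))

# Hypothetical: the pre-subsetting target count, as it might be stored in the file.
stored_original <- 10L

# What the two quoted lines do: both attributes receive the current row count.
res <- list(abundance = abund)
attr(res, "num_targets") <- nrow(abund)
attr(res, "original_num_targets") <- nrow(abund)

# One possible alternative: keep the stored value when one exists.
res_fixed <- list(abundance = abund)
attr(res_fixed, "num_targets") <- nrow(abund)
attr(res_fixed, "original_num_targets") <-
  if (is.null(stored_original)) nrow(abund) else stored_original

# Hypothetical check (NOT sleuth's is_kallisto_subset): treat the object as
# subsetted when the two counts differ.
looks_subsetted <- function(x) {
  attr(x, "num_targets") != attr(x, "original_num_targets")
}
```

With the quoted assignments the two attributes are always equal, so a check of this kind could never flag the re-imported object as subsetted. With the alternative, the stored count survives the import.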