@dampierch

Intro

Thank you for sharing your code.
I am using it to analyze some of my own data.
In particular, I am using the qtl scripts.
I encountered a few bugs while running eqtl_prepare_expression.py and fixed them in a local branch.
I thought I should share them with you in case the fixes are helpful for others using your code.

Fixes

  1. Added pyqtl to Dockerfile per Issue #61
  2. Added helper functions so that numeric indices in the sample_participant_lookup are handled without errors.
  3. Added an option to ignore ENSEMBL gene version numbers in the merging part of the BED preparation.
  • One should never have different ENSEMBL gene version numbers if the gene model is the same in all processing steps. Unfortunately, sometimes the gene model used in a particular step is unknown due to insufficient documentation from a collaborator or commercial service.
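
For readers who want to see what ignoring version numbers amounts to, here is a minimal sketch. It is not the code from this branch. The function name `strip_gene_version` and the regular expression are hypothetical, and it assumes IDs of the usual form `ENSG00000223972.5`, optionally followed by a suffix such as `_PAR_Y`.

```python
import re

# Hypothetical helper, not part of the repository: matches an ENSEMBL gene
# ID followed by ".<version>" at the start of the string.
_VERSION_RE = re.compile(r"^(ENS[A-Z]*G\d+)\.\d+")


def strip_gene_version(gene_id):
    """Return gene_id without its version number.

    Any trailing suffix (e.g. "_PAR_Y") is kept, and IDs that do not
    look like versioned ENSEMBL gene IDs are returned unchanged.
    """
    return _VERSION_RE.sub(r"\1", gene_id)


if __name__ == "__main__":
    for gene_id in ["ENSG00000223972.5", "ENSG00000182378.14_PAR_Y"]:
        print(gene_id, "->", strip_gene_version(gene_id))
```

Applying such a function to the ID column on both sides before merging makes the join insensitive to version differences. The cost is that genuinely different gene models can no longer be told apart, which is why it makes sense only as an opt-in.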

@ActioTom

Also, I believe the --convert-tpm option is broken because of a bug around line 100 in eqtl_prepare_expression.py:

    if args.convert_tpm:
        print('  * Converting to TPM', flush=True)
        tpm_df = tpm_df / tpm_df.sum(0) * 1e6

This fails because the first column of tpm_df is Name and contains the gene ids rather than numeric data.
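
One possible fix, sketched below, is to rescale only the sample columns and leave the identifier columns alone. The function name `normalize_to_tpm` and the `id_columns` default are assumptions for illustration. `Name` comes from the description above; `Description` is included only because GCT-style tables often carry it, and it is skipped if absent.

```python
import pandas as pd


def normalize_to_tpm(df, id_columns=("Name", "Description")):
    """Rescale each sample column so it sums to 1e6.

    Hypothetical helper: columns listed in id_columns (if present) are
    treated as identifiers and copied through unchanged.
    """
    id_cols = [c for c in id_columns if c in df.columns]
    sample_cols = [c for c in df.columns if c not in id_cols]
    out = df.copy()
    # Divide each sample column by its own total, then scale to 1e6.
    out[sample_cols] = df[sample_cols] / df[sample_cols].sum(axis=0) * 1e6
    return out


if __name__ == "__main__":
    demo = pd.DataFrame(
        {"Name": ["g1", "g2"], "s1": [1.0, 3.0], "s2": [2.0, 2.0]}
    )
    print(normalize_to_tpm(demo))
```

An alternative with the same effect would be to move `Name` into the index before the division, provided the rest of the script does not rely on it being a regular column.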
