jass_preprocessing package

Submodules

jass_preprocessing.compute_score module

jass_preprocessing.compute_score.compute_sample_size(mgwas, diagnostic_folder, trait)[source]
jass_preprocessing.compute_score.compute_z_score(mgwas)[source]

Compute the z-score value and its sign, and add the corresponding columns to the mgwas dataframe.
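For readers unfamiliar with the computation, here is a minimal sketch of one common way to derive a signed z-score: take the magnitude from a two-sided p-value and the sign from the effect estimate. The function name `z_from_pvalue` and the column names `pval` and `beta` are assumptions made for illustration, not the package's actual implementation or column layout.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def z_from_pvalue(df, p_col="pval", sign_col="beta"):
    """Return a copy of df with hypothetical 'sign' and 'z_score' columns."""
    out = df.copy()
    # Sign of the effect estimate: +1, -1, or 0.
    out["sign"] = np.sign(out[sign_col])
    # Magnitude from a two-sided p-value: |z| = -Phi^{-1}(p / 2).
    # Requires 0 < p <= 1; inv_cdf raises an error otherwise.
    std_normal = NormalDist()
    magnitude = out[p_col].map(lambda p: -std_normal.inv_cdf(p / 2.0))
    out["z_score"] = out["sign"] * magnitude
    return out


demo = pd.DataFrame({"pval": [0.05, 0.5], "beta": [0.2, -0.1]})
scored = z_from_pvalue(demo)
```

The input frame is copied rather than modified in place, which keeps the sketch free of side effects; the documented function adds its columns to `mgwas` itself.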

jass_preprocessing.dna_utils module

A few functions to compute the DNA complement

jass_preprocessing.dna_utils.dna_complement(input)[source]
jass_preprocessing.dna_utils.dna_complement_base(inputbase)[source]
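The two functions above are undocumented, so the following is only a sketch of what a base-level and a sequence-level complement typically look like. The names `complement_base` and `complement`, and the choice to upper-case the input, are assumptions and may differ from the package's behaviour.

```python
# Watson-Crick pairing for the four standard bases.
_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_base(base):
    """Return the complementary base; raises KeyError for non-ACGT input."""
    return _COMPLEMENT[base.upper()]


def complement(sequence):
    """Complement every base of a sequence, keeping the original order."""
    return "".join(complement_base(b) for b in sequence)


example = complement("ACGT")
```

Note that this is the complement, not the reverse complement: the order of the bases is preserved.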

jass_preprocessing.map_gwas module

Map GWAS

A set of functions to find GWAS files in subfolders and to map their columns

jass_preprocessing.map_gwas.convert_missing_values(df)[source]

Convert all missing-value strings to a standard np.nan value

Parameters: df (pandas dataframe) – GWAS data as a dataframe
Returns: a pandas dataframe with all missing values equal to np.nan
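A minimal sketch of this kind of normalisation is given below. The list of tokens treated as missing and the name `normalize_missing` are assumptions for illustration; the strings actually recognised by `convert_missing_values` are not listed in this documentation.

```python
import pandas as pd

# Hypothetical set of strings that stand for "missing" in raw GWAS files.
MISSING_TOKENS = ["", "NA", "N/A", "NaN", ".", "-"]


def normalize_missing(df, tokens=MISSING_TOKENS):
    """Return a copy of df where every cell matching a token becomes NaN."""
    # isin() builds a boolean mask; mask() blanks out the cells where it is True.
    return df.mask(df.isin(tokens))


demo = pd.DataFrame(
    {"snp": ["rs1", "rs2"], "beta": ["0.1", "NA"], "pval": [".", "0.05"]}
)
cleaned = normalize_missing(demo)
```

After this step, `cleaned.isna()` identifies missing entries uniformly, whatever spelling the original file used.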

Walk the GWAS path to find the GWAS tables

Parameters:
  • GWAS_table (str) – path of the folder to explore
  • findfile (str) – name of the file to find
Returns:

a pandas dataframe with one column for the filename and one column containing the complete path to the file

jass_preprocessing.map_gwas.map_columns_position(gwas_internal_link, GWAS_labels)[source]

Find the column positions for each specific GWAS

Parameters:
  • gwas_internal_link (str) – filename of the GWAS data (with path)
  • GWAS_labels (str) – filename of the csv information file
Returns:

pandas Series with column position and column names as index

jass_preprocessing.map_gwas.read_gwas(gwas_internal_link, column_map)[source]

Read the raw GWAS data, fetch columns using the positions stored in column_map, and rename the columns according to column_map.index

Parameters:
  • gwas_internal_link (str) – filename of the GWAS data (with path)
  • column_map (pandas Series) – Series containing the position of each column in the raw data
Returns:

a pandas dataframe with missing value all equal to np.nan
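The select-by-position-then-rename step described above can be sketched as follows. This assumes a tab-separated file with a header row and 0-based positions in `column_map`; the function name `read_selected_columns` and the column labels in the demo are hypothetical.

```python
import io

import pandas as pd


def read_selected_columns(source, column_map, sep="\t"):
    """Read a delimited file and keep only the columns listed in column_map.

    column_map: Series whose index holds the standard names and whose
    values hold the 0-based positions of those columns in the raw file.
    """
    raw = pd.read_csv(source, sep=sep)
    # Positional selection keeps the order given by column_map.
    selected = raw.iloc[:, column_map.to_numpy()].copy()
    selected.columns = column_map.index
    return selected


raw_text = "MarkerName\tEffect\tP\nrs1\t0.1\t0.5\nrs2\t-0.3\t0.01\n"
column_map = pd.Series({"snpid": 0, "pval": 2})
table = read_selected_columns(io.StringIO(raw_text), column_map)
```

Selecting with `iloc` after reading, rather than passing positions to `usecols`, preserves the order given in `column_map` even when it differs from the file order.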

jass_preprocessing.map_gwas.walkfs(startdir, findfile)[source]

Go through the folder and its subfolders to find the specified file

Parameters:
  • startdir (str) – path of the folder to explore
  • findfile (str) – name of the file to find
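A recursive search of this kind can be sketched with `os.walk` from the standard library. The return value of `walkfs` is not documented here, so returning a list of full paths, like the name `find_files`, is an assumption.

```python
import os
import tempfile


def find_files(startdir, findfile):
    """Return the full path of every file named findfile under startdir."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(startdir):
        if findfile in filenames:
            matches.append(os.path.join(dirpath, findfile))
    return matches


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "study_a", "raw")
    os.makedirs(nested)
    open(os.path.join(nested, "gwas.txt"), "w").close()
    found = find_files(root, "gwas.txt")
    missing = find_files(root, "absent.txt")
```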

jass_preprocessing.map_reference module

Functions for mapping GWAS data onto a reference panel

jass_preprocessing.map_reference.compute_is_aligned(mgwas)[source]

Check whether the reference panel and the GWAS data have the same reference allele. Returns a boolean vector. The result should be the complement of “is_flipped”, but both functions are still computed so that unusual cases (more than two alleles, for instance) can be detected.

jass_preprocessing.map_reference.compute_is_flipped(mgwas)[source]

Check whether the reference panel and the GWAS data have the same reference allele. Returns a boolean vector.

Parameters: mgwas (pandas dataframe) – GWAS study dataframe merged with the reference_panel
Returns: merged studies
Return type: is_flipped (pandas dataframe)

jass_preprocessing.map_reference.compute_snp_alignement(mgwas)[source]

Add a column to mgwas indicating whether the reference and coded alleles are flipped compared to the reference panel. If they are, the sign of the statistic must be flipped.

Parameters: mgwas (pandas dataframe) – GWAS data merged with the reference panel
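The alignment logic described in this module can be sketched as below. All column names (`ref`, `alt` for the panel alleles, `a1`, `a2` for the GWAS alleles, `z` for the statistic) and the function name `flag_alignment` are assumptions for illustration; the sketch also ignores strand complements, which a real implementation may need to handle.

```python
import numpy as np
import pandas as pd


def flag_alignment(df):
    """Flag aligned and flipped SNPs and orient the statistic accordingly."""
    out = df.copy()
    out["is_aligned"] = (out["a1"] == out["ref"]) & (out["a2"] == out["alt"])
    out["is_flipped"] = (out["a1"] == out["alt"]) & (out["a2"] == out["ref"])
    # +1 when aligned, -1 when flipped, NaN when neither holds
    # (for example, a third allele), so odd cases stay visible.
    out["sign_flip"] = np.select(
        [out["is_aligned"], out["is_flipped"]], [1.0, -1.0], default=np.nan
    )
    out["z_aligned"] = out["z"] * out["sign_flip"]
    return out


demo = pd.DataFrame(
    {
        "ref": ["A", "A", "A"],
        "alt": ["G", "G", "G"],
        "a1": ["A", "G", "A"],
        "a2": ["G", "A", "C"],
        "z": [1.5, 2.0, 0.7],
    }
)
aligned = flag_alignment(demo)
```

Computing both flags independently, instead of defining one as the negation of the other, is what lets the third row surface as neither aligned nor flipped.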
jass_preprocessing.map_reference.map_on_ref_panel(gw_df, ref_panel)[source]

Merge the GWAS dataframe with the reference panel, making sure that the same SNPs are present in both the reference panel and the GWAS.

Parameters:
  • gw_df (pandas dataframe) – GWAS study dataframe
  • ref_panel (pandas dataframe) – reference panel dataframe
Returns:

merged studies

Return type:

merge_GWAS (pandas dataframe)
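Keeping only the SNPs shared by both tables corresponds to an inner join. The sketch below assumes both dataframes carry a common identifier column, here called `snp_id`; that column name and the function name `merge_with_panel` are hypothetical.

```python
import pandas as pd


def merge_with_panel(gw_df, ref_panel, key="snp_id"):
    """Inner-join the reference panel with the GWAS table on a shared key."""
    # how="inner" drops SNPs that are missing from either table.
    return ref_panel.merge(gw_df, on=key, how="inner")


ref_panel = pd.DataFrame(
    {"snp_id": ["rs1", "rs2", "rs3"], "ref": ["A", "C", "G"]}
)
gw_df = pd.DataFrame({"snp_id": ["rs2", "rs3", "rs4"], "z": [0.4, -1.2, 2.1]})
merged = merge_with_panel(gw_df, ref_panel)
```

Here `rs1` (panel only) and `rs4` (GWAS only) are dropped, leaving the two shared SNPs.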

jass_preprocessing.map_reference.read_reference(gwas_reference_panel)[source]

Helper function to name the columns correctly

jass_preprocessing.save_output module

jass_preprocessing.save_output.save_output(mgwas, ImpG_output_Folder, my_study)[source]

Write the preprocessed GWAS for LD score analysis

jass_preprocessing.save_output.save_output_by_chromosome(mgwas, ImpG_output_Folder, my_study)[source]

Write the preprocessed GWAS for imputation
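Writing one file per chromosome can be sketched with a pandas `groupby`. The chromosome column name `chr`, the tab-separated format, the file naming pattern, and the function name `write_by_chromosome` are all assumptions; the actual output layout of `save_output_by_chromosome` is not described in this documentation.

```python
import os
import tempfile

import pandas as pd


def write_by_chromosome(df, out_dir, study, chr_col="chr"):
    """Write one tab-separated file per chromosome and return the paths."""
    paths = []
    # groupby sorts the keys, so files are written in chromosome order.
    for chrom, chunk in df.groupby(chr_col):
        path = os.path.join(out_dir, f"{study}_chr{chrom}.txt")
        chunk.to_csv(path, sep="\t", index=False)
        paths.append(path)
    return paths


demo = pd.DataFrame(
    {"chr": [1, 1, 2], "snp_id": ["rs1", "rs2", "rs3"], "z": [0.1, -0.5, 1.3]}
)
with tempfile.TemporaryDirectory() as out_dir:
    written = write_by_chromosome(demo, out_dir, "demo")
    names = [os.path.basename(p) for p in written]
    chr1_rows = len(pd.read_csv(written[0], sep="\t"))
```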

Module contents

map_gwas Map GWAS
dna_utils A few functions to compute the DNA complement
map_reference Functions for mapping GWAS data onto a reference panel
compute_score
save_output