Processing in R
Step 1 - Preprocessing
Create a folder for the dataset, e.g. TCGA, and within it create a folder for each project.
Run the R script Preprocessing.R, specifying the phenotypical trait and project and checking that the paths point to the data folder you created.
Save each modality's processed data using the naming convention modality_processed.RData.
The options are:
BRCA:
project = 'BRCA'
trait = 'paper_paper_BRCA_Subtype_PAM50'
LGG:
project = 'LGG'
trait = 'paper_Grade'
KIPAN:
project = 'KIPAN'
trait = 'subtype'
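The project and trait settings above always change together, so one way to keep them in sync is a small lookup. This is only a sketch: `trait_for_project`, `data_root` and `processed_path()` are illustrative names rather than variables defined in Preprocessing.R, and the assumed location of the processed files (a processed folder inside the project folder) should be adjusted to match your own layout.

```r
# Project -> trait pairs, copied from the options listed above.
trait_for_project <- list(
  BRCA  = "paper_paper_BRCA_Subtype_PAM50",
  LGG   = "paper_Grade",
  KIPAN = "subtype"
)

project <- "BRCA"                          # one of names(trait_for_project)
trait   <- trait_for_project[[project]]
stopifnot(!is.null(trait))                 # fail early on an unknown project

# Hypothetical location of the data folder; change to your own path.
data_root <- "data"

# Assumed layout: <root>/TCGA-<project>/processed/<modality>_processed.RData
processed_path <- function(modality, project, root = data_root) {
  file.path(root, paste0("TCGA-", project), "processed",
            paste0(modality, "_processed.RData"))
}

processed_path("mRNA", project)
```

The returned path can be passed to `save(..., file = ...)` once a modality has been processed, which keeps the modality_processed.RData naming consistent across modalities.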
Step 2 - Graph Generation
Point knn_graph_generation.R to the project folder containing the processed modalities.
Create a folder called raw. This is the folder from which MOGDx will be run.
Run the R script knn_graph_generation.R, specifying the phenotypical trait, the project and, in the for loop, the modalities that were downloaded.
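For readers unfamiliar with k-nearest-neighbour graphs, the following base R sketch shows the general idea on toy data. It is not the contents of knn_graph_generation.R: the function name `knn_edges()`, the use of Pearson correlation as the similarity measure, the value of `k` and the edge-list CSV format are all assumptions made for illustration.

```r
# Build a k-nearest-neighbour edge list from a samples x features matrix.
# Similarity is Pearson correlation between samples (an assumption).
knn_edges <- function(expr, k = 3) {
  stopifnot(is.matrix(expr), nrow(expr) > k, !is.null(rownames(expr)))
  sim <- cor(t(expr))            # sample-by-sample similarity matrix
  diag(sim) <- -Inf              # a sample is never its own neighbour
  edges <- lapply(rownames(sim), function(s) {
    # names of the k most similar samples to s
    nbrs <- names(sort(sim[s, ], decreasing = TRUE))[seq_len(k)]
    data.frame(from = s, to = nbrs, stringsAsFactors = FALSE)
  })
  do.call(rbind, edges)
}

# Toy expression matrix: 6 patients, 20 genes.
set.seed(1)
expr <- matrix(rnorm(6 * 20), nrow = 6,
               dimnames = list(paste0("patient", 1:6), paste0("gene", 1:20)))

graph <- knn_edges(expr, k = 2)

# Written to a temporary folder here; the real script decides the location.
write.csv(graph, file.path(tempdir(), "mRNA_graph.csv"), row.names = FALSE)
```

Each patient contributes exactly `k` outgoing edges, so the toy graph has 12 rows.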
Step 3 - SNF
Create a folder called Network outside data.
Copy each modality's modality_graph.csv to this folder.
Specify the modalities of interest in the list mod_list.
Point the SNF script to the new Network folder.
Run the R script SNF.R
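The copy step above can be scripted rather than done by hand. The sketch below assumes the graph files sit together in one source folder and are named modality_graph.csv; `copy_graphs()`, `from_dir` and `network_dir` are hypothetical names, and the demonstration runs in a temporary directory so that it does not touch real data.

```r
mod_list <- c("mRNA", "miRNA")   # modalities of interest

# Copy <modality>_graph.csv for every modality into the Network folder.
copy_graphs <- function(mod_list, from_dir, network_dir) {
  dir.create(network_dir, recursive = TRUE, showWarnings = FALSE)
  src <- file.path(from_dir, paste0(mod_list, "_graph.csv"))
  missing <- src[!file.exists(src)]
  if (length(missing) > 0) {
    stop("Missing graph files: ", paste(missing, collapse = ", "))
  }
  ok <- file.copy(src, network_dir, overwrite = TRUE)
  invisible(file.path(network_dir, basename(src))[ok])
}

# Demonstration with placeholder graph files in a temporary directory.
demo_root <- tempfile("mogdx_demo_")
from_dir  <- file.path(demo_root, "data", "TCGA-BRCA")
dir.create(from_dir, recursive = TRUE)
for (m in mod_list) {
  write.csv(data.frame(from = "a", to = "b"),
            file.path(from_dir, paste0(m, "_graph.csv")), row.names = FALSE)
}

copied <- copy_graphs(mod_list, from_dir, file.path(demo_root, "Network"))
```

Stopping when a file is missing is a deliberate choice here: it surfaces a modality listed in `mod_list` that has no graph yet before SNF.R is run.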
Example of directory structure for TCGA
data
    TCGA-BRCA
        mRNA
            mRNA.rda
        miRNA
            miRNA.rda
        processed
            mRNA_processed.RData
            miRNA_processed.RData
        raw
            datExpr_mRNA.csv
            datMeta_mRNA.csv
            datExpr_miRNA.csv
            datMeta_miRNA.csv
Network
    mRNA_graph.csv
    miRNA_graph.csv
    mRNA_miRNA_graph.csv
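The empty folder skeleton for this example could be created in one go, as sketched below. The nesting is an assumption read from the steps above (modality, processed and raw folders inside the project folder, Network beside data), and `make_skeleton()` and `root` are illustrative names; the demonstration writes to a temporary directory.

```r
# Create the folder skeleton for one project and a set of modalities.
make_skeleton <- function(root, project, modalities) {
  project_dir <- file.path(root, "data", paste0("TCGA-", project))
  dirs <- c(file.path(project_dir, modalities),            # one folder per modality
            file.path(project_dir, c("processed", "raw")), # outputs of Steps 1 and 2
            file.path(root, "Network"))                    # outside data, used in Step 3
  for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)
  invisible(dirs)
}

root <- tempfile("mogdx_layout_")
dirs <- make_skeleton(root, "BRCA", c("mRNA", "miRNA"))

# Inspect what was created, relative to root.
list.dirs(root, full.names = FALSE)
```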