mt_load

mt_load(mtdata, mtgene = NULL, mtmeta = NULL)

Arguments

mtdata

(required) data.table with read expressions.

mtgene

(optional) data.table with metadata associated to genes (rows).

mtmeta

(optional) data.table with metadata associated to samples (columns).

Value

A list of class "mt" with 3 elements.

Details

If objects of class data.frame are provided, they will be coerced to data.table. The data structure remains unchanged in most cases; however, it is recommended to have the data in data.table format before loading it.

The mtdata

The mtdata-table contains read expressions for all genes in each sample. The provided mtdata-table must be a data frame with the following requirements:

  • The rows are gene IDs and the columns are samples.

  • The gene IDs are expected to be in either the rownames of the data frame or in a column called "ID". Otherwise the function will stop with a message.

  • The column names of the data frame are the sample IDs, exactly matching those in the metadata.

  • Generally avoid special characters and spaces in row- and column names.

A minimal example is available with data("example_mtdata").
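As an illustration of the layout described above, the following base R sketch builds a small table with gene IDs in a column called "ID" and one column per sample. The gene names, sample names, and the name check are invented for illustration and are not part of mt_load itself.

```r
# Hypothetical read expression table: rows are genes, columns are samples.
mtdata <- data.frame(
  ID      = c("gene1", "gene2", "gene3"),
  sampleA = c(10, 0, 5),
  sampleB = c(3, 7, 0),
  stringsAsFactors = FALSE
)

# Gene IDs live in the "ID" column; every other column is a sample.
has_id_column <- "ID" %in% colnames(mtdata)
sample_ids <- setdiff(colnames(mtdata), "ID")

# Flag names containing anything other than letters, digits, "_" or ".".
bad_names <- grep("[^A-Za-z0-9_.]", c(mtdata$ID, sample_ids), value = TRUE)
```

An empty bad_names vector suggests that the row and column names are free of spaces and special characters.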

The mtgene

The mtgene-table contains additional information about the genes, for example reference database IDs, product, and function, which can be used during analysis. The amount of information in the mtgene-table is unlimited: it can contain any number of columns (variables). There are, however, a few requirements:

  • The gene IDs must be in the first column. These must exactly match those in the mtdata-table.

  • Column classes matter: categorical variables should be loaded as character or factor (as.character() or as.factor()), and continuous variables as numeric (as.numeric()). See below.

  • Generally avoid special characters and spaces in row- and column names.

The mt_load function will automatically use the gene IDs in the first column as rownames, but it is important to also have an actual column with gene IDs, so that it is possible to, for example, group by that column during analysis. Any unmatched genes between the mtdata-table and the mtgene-table will be removed.

A minimal example is available with data("example_mtgene").
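Because the gene IDs in the first column must match those in the mtdata-table, it can help to compare the two sets before loading. The sketch below uses invented gene IDs and annotations, and only base R set operations; it does not reproduce what mt_load does internally.

```r
# Hypothetical gene IDs taken from an mtdata-table.
mtdata_ids <- c("gene1", "gene2", "gene3")

# Hypothetical mtgene-table: gene IDs in the first column, annotations after.
mtgene <- data.frame(
  ID      = c("gene1", "gene2", "gene4"),
  product = c("kinase", "transporter", "unknown"),
  stringsAsFactors = FALSE
)

# Compare the first column with the gene IDs in the mtdata-table.
gene_ids <- mtgene[[1]]
only_in_mtgene <- setdiff(gene_ids, mtdata_ids)  # annotated but not measured
only_in_mtdata <- setdiff(mtdata_ids, gene_ids)  # measured but not annotated
```

Non-empty results point to IDs that need fixing (for example, typos or differing prefixes) before the tables are combined.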

The mtmeta

The mtmeta-table contains additional information about the samples, for example where each sample was taken, date, pH, and treatment, which is used to compare and group the samples during analysis. The amount of information in the mtmeta-table is unlimited: it can contain any number of columns (variables). There are, however, a few requirements:

  • The sample IDs must be in the first column. These sample IDs must exactly match those in the mtdata-table.

  • Column classes matter: categorical variables should be loaded as character or factor (as.character() or as.factor()), and continuous variables as numeric (as.numeric()). See below.

  • Generally avoid special characters and spaces in row- and column names.

If, for example, a column is named "Year" and the entries are entered as plain numbers (2011, 2012, 2013 etc.), R will automatically read them as numerical values and therefore treat the column as a continuous variable, even though it is a categorical variable and should be loaded as a factor or character column instead. This has consequences for the analysis, as R treats the two kinds of variables differently. Therefore either use the colClasses = argument when loading a CSV file or col_types = when loading an Excel file, or adjust the column classes manually afterwards with, for example, metadata$Year <- as.character(metadata$Year).
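A minimal sketch of both approaches for a CSV file, using base R's read.csv on inline text instead of a file. The column names and values are invented for illustration.

```r
# Hypothetical metadata as CSV text; "Year" looks numeric but is categorical.
csv_text <- "SampleID,Year,pH
sampleA,2011,7.1
sampleB,2012,6.8"

# Default import: Year is read as an integer column.
meta_default <- read.csv(text = csv_text)

# Option 1: set the class at import time with colClasses.
meta_typed <- read.csv(text = csv_text, colClasses = c(Year = "character"))

# Option 2: convert the column after loading.
meta_default$Year <- as.character(meta_default$Year)
```

In both cases pH stays numeric, so it is still treated as a continuous variable.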

The mt_load function will automatically use the sample IDs in the first column as rownames, but it is important to also have an actual column with sample IDs, so that it is possible to, for example, group by that column during analysis. Any unmatched samples between the mtdata-table and the mtmeta-table will be removed.

A minimal example is available with data("example_mtmeta").
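Since unmatched samples are removed, it can be useful to see beforehand which samples would be affected. The following base R sketch uses invented sample IDs and mimics the matching with set operations; it is an assumption about how to preview the result, not the code mt_load runs.

```r
# Hypothetical sample IDs from the column names of an mtdata-table.
mtdata_samples <- c("sampleA", "sampleB", "sampleC")

# Hypothetical mtmeta-table with sample IDs in the first column.
mtmeta <- data.frame(
  SampleID  = c("sampleA", "sampleB", "sampleD"),
  Treatment = c("control", "treated", "treated"),
  stringsAsFactors = FALSE
)

# Samples present in both tables, and samples present in only one of them.
shared    <- intersect(mtdata_samples, mtmeta[[1]])
unmatched <- union(setdiff(mtdata_samples, mtmeta[[1]]),
                   setdiff(mtmeta[[1]], mtdata_samples))

# Metadata rows that have a matching sample in the mtdata-table.
mtmeta_matched <- mtmeta[mtmeta[[1]] %in% shared, , drop = FALSE]
```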

Examples

# NOT RUN {
# Load the different components.
data("example_mtmeta")
data("example_mtgene")
data("example_mtdata")

# Combine in one object of class 'mt'.
mt <- mt_load(mtdata = example_mtdata,
              mtgene = example_mtgene,
              mtmeta = example_mtmeta)

# Show a short summary of the data by typing the name of the object in the console.
mt
# }