extract_bic {fabia}    R Documentation

Extraction of Biclusters

Description

extract_bic: extracts biclusters from the loading matrix L and the factor matrix Z; implemented in R.

Usage


extract_bic(L,Z,thresZ=0.5,thresL=NULL,lapla=NULL,Psi=NULL)

Arguments

L loading, left matrix.
Z factor, right matrix.
thresZ threshold for sample belonging to bicluster (default 0.5).
thresL threshold for loading belonging to bicluster (if not given it is estimated).
lapla inverse variance of the variational approximation for each sample and each factor.
Psi noise variance vector for observations where independent noise is assumed.

Details

Essentially the model is the sum of outer products of vectors. The number of summands p is the number of biclusters.

X = L Z + U

X = sum_{i=1}^{p} L_i (Z_i )^T + U
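
The equivalence of the two forms above can be checked numerically. The following is a minimal sketch in base R with toy dimensions chosen only for illustration; it assumes L_i denotes column i of L and Z_i denotes row i of Z, and it omits the noise term U.

```r
# Toy dimensions (assumed for illustration): n observations, l samples, p biclusters.
set.seed(1)
n <- 6; l <- 4; p <- 2
L <- matrix(rnorm(n * p), nrow = n, ncol = p)  # loadings, one column per bicluster
Z <- matrix(rnorm(p * l), nrow = p, ncol = l)  # factors, one row per bicluster

# Matrix form of the noise-free part: L Z
full <- L %*% Z

# Sum of p outer products L_i (Z_i)^T
summed <- matrix(0, n, l)
for (i in seq_len(p)) {
  summed <- summed + outer(L[, i], Z[i, ])
}

# TRUE when both forms agree up to floating-point tolerance
same <- isTRUE(all.equal(full, summed))
```

Each summand is a rank-one matrix, which is why a single pair (L_i, Z_i) can describe one bicluster.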

The hidden dimension p is used for kmeans clustering of L_i and Z_i .

U is the Gaussian noise with a diagonal covariance matrix whose entries are given by Psi.

Z is locally approximated by a Gaussian with inverse variance given by lapla.

Using these values we can compute for each j the variance of Z_i given x_j. Here

x_j = L z_j + u_j

This variance can be used to determine the information content of a bicluster.
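
As an illustration of how such a variance could be obtained, the sketch below uses the standard linear-Gaussian result Var(z_j | x_j) = (L^T Psi^{-1} L + diag(lapla_j))^{-1} and reads off its diagonal. This is an assumption made for illustration: the text does not state the exact expression that extract_bic evaluates, and the layout of lapla (row j for sample j, column i for factor i) as well as the names post_var and avg_var are hypothetical.

```r
set.seed(2)
n <- 5; l <- 3; p <- 2
L     <- matrix(rnorm(n * p), n, p)        # loadings
Psi   <- runif(n, 0.5, 1.5)                # noise variances, one per observation
lapla <- matrix(runif(l * p, 1, 2), l, p)  # assumed layout: row j = sample j, column i = factor i

# t(L) %*% diag(1 / Psi) %*% L, computed without forming the diagonal matrix
LtPsiInvL <- crossprod(L, L / Psi)

# Posterior variance of each factor i given sample j (linear-Gaussian assumption)
post_var <- matrix(NA_real_, p, l)
for (j in seq_len(l)) {
  precision <- LtPsiInvL + diag(lapla[j, ], nrow = p)
  post_var[, j] <- diag(solve(precision))
}

# Average over samples j, one value per factor i
avg_var <- rowMeans(post_var)
```

Under this assumption, a factor whose variance given the data is much smaller than 1 / lapla is strongly constrained by the observations, which is one way to read "information content".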

The L_i and Z_i are used to extract the bicluster i, where a threshold determines which observations and which samples belong to the bicluster.

In bic the biclusters are extracted according to the largest absolute values of the component i, i.e. the largest values of L_i and the largest values of Z_i . The factors Z_i are normalized to variance 1.
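
The thresholding step can be pictured with a simplified sketch for a single component i. It is not the package's code: the vectors and their names are made up, thresL = 1.0 is a hypothetical value (extract_bic estimates it when it is not given), sorting by decreasing absolute value is an assumption, and moving the scale of Z_i into L_i is an assumption made here so that the outer product stays unchanged.

```r
set.seed(3)
n <- 8; l <- 6
L_i <- rnorm(n)            # loading vector of bicluster i (hypothetical values)
Z_i <- rnorm(l, sd = 3)    # factor vector of bicluster i (hypothetical values)
names(L_i) <- paste0("gene", seq_len(n))
names(Z_i) <- paste0("sample", seq_len(l))

# Normalize the factor to variance 1 and move the scale into the loading,
# so that L_i (Z_i)^T is unchanged (assumption for this sketch).
s   <- sd(Z_i)
Z_i <- Z_i / s
L_i <- L_i * s

thresZ <- 0.5              # default from the Usage section
thresL <- 1.0              # hypothetical value for illustration

# Keep entries whose absolute value exceeds the threshold, largest first
obs_in <- L_i[abs(L_i) > thresL]
obs_in <- obs_in[order(abs(obs_in), decreasing = TRUE)]
smp_in <- Z_i[abs(Z_i) > thresZ]
smp_in <- smp_in[order(abs(smp_in), decreasing = TRUE)]
```

Normalizing Z_i first makes thresZ comparable across biclusters, since every factor is then measured in units of its own standard deviation.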

The components of bic are binp, bixv, bixn, biypv, and biypn.

binp gives the size of the bicluster: number of observations and number of samples. bixv gives the values of the extracted observations that have absolute values above a threshold. They are sorted. bixn gives the extracted observation names (e.g. gene names). biypv gives the values of the extracted samples that have absolute values above a threshold. They are sorted. biypn gives the names of the extracted samples (e.g. sample names).

In bicopp the opposite clusters to the biclusters are given. Opposite means that the negative pattern is present.

The components of opposite clusters bicopp are binn, bixv, bixn, biynv, and biynn.

binn gives the size of the opposite bicluster: number of observations and number of samples. bixv gives the values of the extracted observations that have absolute values above a threshold. They are sorted. bixn gives the extracted observation names (e.g. gene names). biynv gives the values of the opposite extracted samples that have absolute values above a threshold. They are sorted. biynn gives the names of the opposite extracted samples (e.g. sample names).

That means the samples are divided into two groups: one group shows large positive values and the other group shows negative values with large absolute values. In other words, an observation pattern can be switched on or switched off relative to the average value.
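
A small sketch of this split for one factor, using made-up sample names and values. How extract_bic assigns samples to bic and bicopp internally is an assumption here; the sketch only shows the sign-based grouping described above.

```r
# Hypothetical normalized factor values for six samples
Z_i <- c(s1 = 2.1, s2 = -1.7, s3 = 0.2, s4 = 0.9, s5 = -0.6, s6 = -0.1)
thresZ <- 0.5

pos_samples <- names(Z_i)[Z_i >  thresZ]  # large positive values: "s1" "s4"
neg_samples <- names(Z_i)[Z_i < -thresZ]  # large negative values: "s2" "s5"
```

Samples s3 and s6 fall below the threshold in absolute value and belong to neither group.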

numn gives the indexes of bic with components: numng = bix and numnp = biypn.

numnopp gives the indexes of bicopp with components: numng = bix and numnn = biynn.

Implementation in R.

Value

bic extracted biclusters.
numn indexes for the extracted biclusters.
bicopp extracted opposite biclusters.
numnopp indexes for the extracted opposite biclusters.
avini average over j of the variance Z_i given x_j.
ini for each j the variance Z_i given x_j.

Author(s)

Sepp Hochreiter

See Also

fabi, fabia, fabiap, fabias, fabiasp, mfsc, nmfdiv, nmfeu, nmfsc, nprojfunc, projfunc, make_fabi_data, make_fabi_data_blocks, make_fabi_data_pos, make_fabi_data_blocks_pos, extract_plot, myImagePlot, PlotBicluster, Breast_A, DLBCL_B, Multi_A, fabiaDemo, fabiaVersion

Examples


#---------------
# TEST
#---------------

dat <- make_fabi_data_blocks(n = 100,l= 50,p = 3,f1 = 5,f2 = 5,
  of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
  sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)

X <- dat[[1]]
Y <- dat[[2]]

resEx <- fabia(X,20,0.1,1.0,1.0,3)

rEx <- extract_bic(resEx$L,resEx$Z,lapla=resEx$lapla,Psi=resEx$Psi)

rEx$bic[1,]
rEx$bic[2,]
rEx$bic[3,]

## Not run: 
#---------------
# DEMO1
#---------------

dat <- make_fabi_data_blocks(n = 1000,l= 100,p = 10,f1 = 5,f2 = 5,
  of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
  sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)

X <- dat[[1]]
Y <- dat[[2]]

resToy <- fabia(X,200,0.4,1.0,1.0,13)

rToy <- extract_bic(resToy$L,resToy$Z,lapla=resToy$lapla,Psi=resToy$Psi)

rToy$avini

rToy$bic[1,]
rToy$bic[2,]
rToy$bic[3,]

#---------------
# DEMO2
#---------------

data(Breast_A)

X <- as.matrix(XBreast)

resBreast <- fabia(X,200,0.1,1.0,1.0,5)

rBreast <- extract_bic(resBreast$L,resBreast$Z,lapla=resBreast$lapla,Psi=resBreast$Psi)

rBreast$avini

rBreast$bic[1,]
rBreast$bic[2,]
rBreast$bic[3,]

## End(Not run)

[Package fabia version 0.1.1 Index]