make_fabi_data_pos {fabia}    R Documentation

Generation of Bicluster Data

Description

make_fabi_data_pos: R implementation of a generator for bicluster data in which the sign of the pattern mean is fixed.

Usage


make_fabi_data_pos(n,l,p,f1,f2,of1,of2,sd_noise,sd_z_noise,
              mean_z,sd_z,sd_l_noise,mean_l,sd_l)

Arguments

n number of observations.
l number of samples.
p number of biclusters.
f1 l/f1 max. additional samples are active in a bicluster.
f2 n/f2 max. additional observations that form a pattern in a bicluster.
of1 minimal active samples in a bicluster.
of2 minimal observations that form a pattern in a bicluster.
sd_noise Gaussian zero mean noise std on data matrix.
sd_z_noise Gaussian zero mean noise std for deactivated hidden factors.
mean_z Gaussian mean for activated factors.
sd_z Gaussian std for activated factors.
sd_l_noise Gaussian zero mean noise std if no observation patterns are present.
mean_l Gaussian mean for observation patterns.
sd_l Gaussian std for observation patterns.

Details

Essentially the data generation model is the sum of outer products of sparse vectors. The number of summands p is the number of biclusters.

X = L Z + U

Y = L Z

X = sum_{i=1}^{p} L_i (Z_i )^T + U

Here L_i are from R^n, Z_i from R^l, and X,Y from R^{n times l}.
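
The equivalence between the matrix form and the sum of outer products can be checked numerically. The following is a minimal base R sketch, not code from the package; it assumes that the columns of L hold the vectors L_i and the rows of Z hold the vectors Z_i, and it uses dense random vectors only to keep the example short.

```r
# Illustrative only: dense random factors stand in for the sparse L_i and Z_i.
set.seed(1)
n <- 6; l <- 4; p <- 2

L <- matrix(rnorm(n * p), nrow = n, ncol = p)  # column i is L_i (length n)
Z <- matrix(rnorm(p * l), nrow = p, ncol = l)  # row i is Z_i (length l)

# Sum of p outer products L_i (Z_i)^T
Y_sum <- matrix(0, nrow = n, ncol = l)
for (i in seq_len(p)) {
  Y_sum <- Y_sum + outer(L[, i], Z[i, ])
}

# Matrix product form Y = L Z
Y_prod <- L %*% Z

# Both forms agree up to floating point error
same <- isTRUE(all.equal(Y_sum, Y_prod))
```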

The L_i are generated sequentially using n, f2, of2, sd_l_noise, mean_l, sd_l. of2 gives the minimal number of observations participating in a bicluster, to which between 0 and n/f2 observations are added, where the number is chosen uniformly. sd_l_noise gives the noise std of observations not participating in the bicluster. mean_l and sd_l determine the Gaussian from which the values are drawn for the observations that participate in the bicluster. "POS": The sign of the mean is fixed.
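
The generation of a single L_i as described above might look roughly like the following base R sketch. The helper name sketch_loading is hypothetical and is not part of fabia. How the package picks which observations participate is not stated here, so drawing the indices at random is an assumption, as is rounding n/f2 down.

```r
# Hypothetical helper, not the fabia implementation.
sketch_loading <- function(n, f2, of2, sd_l_noise, mean_l, sd_l) {
  # Number of additional observations: uniform on 0..floor(n / f2)
  extra <- sample.int(floor(n / f2) + 1, 1) - 1
  k <- min(n, of2 + extra)

  # Assumption: participating observations are drawn at random
  idx <- sort(sample.int(n, k))

  # Non-participating observations get zero mean noise
  L_i <- rnorm(n, mean = 0, sd = sd_l_noise)

  # Participating observations are drawn around mean_l; the sign of the
  # mean is kept as given ("POS"), it is never flipped
  L_i[idx] <- rnorm(k, mean = mean_l, sd = sd_l)

  list(L_i = L_i, idx = idx)
}

set.seed(1)
res <- sketch_loading(n = 100, f2 = 5, of2 = 10,
                      sd_l_noise = 0.2, mean_l = 3.0, sd_l = 1.0)
```

With these settings the number of participating observations lies between 10 and 30. The Z_i follow the same scheme with l, f1, of1, sd_z_noise, mean_z, sd_z.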

The Z_i are generated sequentially using l, f1, of1, sd_z_noise, mean_z, sd_z. of1 gives the minimal number of samples participating in a bicluster, to which between 0 and l/f1 samples are added, where the number is chosen uniformly. sd_z_noise gives the noise std of samples not participating in the bicluster. mean_z and sd_z determine the Gaussian from which the values are drawn for the samples that participate in the bicluster.

U is the overall Gaussian zero mean noise generated by sd_noise.
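
As a small illustration of the noise term, the sketch below adds zero mean Gaussian noise with std sd_noise to a noise free matrix. It is base R only and not taken from the package; the rank one matrix Y is just a stand-in for L Z.

```r
set.seed(1)
n <- 100; l <- 50; sd_noise <- 3.0

# Stand-in for the noise free data Y = L Z (a single outer product)
Y <- outer(rnorm(n), rnorm(l))

# U: i.i.d. Gaussian noise with zero mean and std sd_noise
U <- matrix(rnorm(n * l, mean = 0, sd = sd_noise), nrow = n, ncol = l)

# Noisy data
X <- Y + U

# The empirical std of X - Y should be close to sd_noise
resid_sd <- sd(as.vector(X - Y))
```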

Implementation in R.

Value

X the noisy data from R^{n times l}.
Y the noise free data from R^{n times l}.
ZC list where ith element gives samples belonging to ith bicluster.
LC list where ith element gives observations belonging to ith bicluster.
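
The membership lists can be used to cut a bicluster out of the data matrix. The sketch below uses small made-up lists in place of real output and assumes that the elements of ZC and LC are vectors of 1-based column and row indices; that format is an assumption, not something this page confirms.

```r
# Made-up stand-ins for the returned values (not real fabia output)
n <- 8; l <- 6
Y <- matrix(0, nrow = n, ncol = l)
LC <- list(c(1, 2, 3), c(5, 6))     # observations (rows) per bicluster
ZC <- list(c(1, 2), c(4, 5, 6))     # samples (columns) per bicluster

# Mark each bicluster with its index so the blocks are easy to see
for (i in seq_along(LC)) {
  Y[LC[[i]], ZC[[i]]] <- i
}

# Submatrix of the first bicluster: its observations by its samples
block1 <- Y[LC[[1]], ZC[[1]], drop = FALSE]
```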

Author(s)

Sepp Hochreiter

See Also

fabi, fabia, fabiap, fabias, fabiasp, mfsc, nmfdiv, nmfeu, nmfsc, nprojfunc, projfunc, make_fabi_data, make_fabi_data_blocks, make_fabi_data_blocks_pos, extract_plot, extract_bic, myImagePlot, PlotBicluster, Breast_A, DLBCL_B, Multi_A, fabiaDemo, fabiaVersion

Examples


#---------------
# TEST
#---------------

dat <- make_fabi_data_pos(n = 100,l= 50,p = 3,f1 = 5,f2 = 5,
  of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
  sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)

X <- dat[[1]]  # noisy data
Y <- dat[[2]]  # noise free data

myImagePlot(Y)
dev.new()  # open a second graphics device (portable alternative to x11())
myImagePlot(X)

## Not run: 
#---------------
# DEMO
#---------------

dat <- make_fabi_data_pos(n = 1000,l= 100,p = 10,f1 = 5,f2 = 5,
  of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
  sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)

X <- dat[[1]]  # noisy data
Y <- dat[[2]]  # noise free data

myImagePlot(Y)
dev.new()  # open a second graphics device (portable alternative to x11())
myImagePlot(X)

## End(Not run)

[Package fabia version 0.1.1 Index]