Posts

Simulating genetic drift

Genetic drift is the result of a Bernoulli process acting on the survival of individuals (each with some probability of surviving) in a population over a number of independent trials (generations). There are two ways of looking at such a process: one at the individual level, the other at the population level.
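The two views can be sketched in a few lines of R. This is only an illustration, not the post's own simulation: the population size, survival probability, and number of generations are hypothetical, and every individual is assumed to share the same survival probability so that the population-level shortcut applies.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_individuals <- 100  # hypothetical population size
p_survive     <- 0.5  # hypothetical survival probability, same for everyone
n_generations <- 20   # hypothetical number of independent trials

# Individual level: one Bernoulli draw per individual, then count the survivors.
survivors_individual <- replicate(
  n_generations,
  sum(rbinom(n_individuals, size = 1, prob = p_survive))
)

# Population level: a single binomial draw per generation yields the count directly.
survivors_population <- rbinom(n_generations, size = n_individuals, prob = p_survive)
```

Both vectors follow the same distribution, because a sum of independent Bernoulli draws with a common probability is binomial. The individual-level form becomes necessary once probabilities differ between individuals.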

Missing negative from the normal

A normal function isn’t so normal
The normal density function is: \[ \large f(x) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left(-\frac{(x - \mu)^2}{2 \sigma^2}\right) \] It doesn’t make sense to calculate the probability of a single value under a continuous probability function, since it is zero by definition, but you can calculate relative likelihoods (heights).
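As a quick check of the formula, the density can be written by hand and compared with R's built-in `dnorm()`. The function name `normal_density` and the chosen values of `x`, `mu`, and `sigma` are illustrative assumptions. A dropped minus sign in the exponent shows up immediately as a mismatch.

```r
# Hand-written normal density; note the minus sign inside exp().
normal_density <- function(x, mu, sigma) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sqrt(2 * pi) * sigma)
}

mu    <- 0  # hypothetical mean
sigma <- 1  # hypothetical standard deviation

height_manual  <- normal_density(1, mu, sigma)
height_builtin <- dnorm(1, mean = mu, sd = sigma)

# A probability only exists for an interval, as a difference of cumulative values.
prob_interval <- pnorm(1.1, mu, sigma) - pnorm(0.9, mu, sigma)
```

Here `height_manual` and `height_builtin` agree up to floating-point error, while `prob_interval` is an actual probability for the interval from 0.9 to 1.1.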

Relating quantile distribution and selection intensity

What does Wikipedia have to say about the quantile function? In probability and statistics, the quantile function, associated with a probability distribution of a random variable, specifies the value of the random variable such that the probability of the variable being less than or equal to that value equals the given probability.
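In R, the quantile function of the normal distribution is `qnorm()`, the inverse of the cumulative distribution function `pnorm()`. The sketch below also computes a selection intensity using the standard textbook result for truncation selection on a standard normal trait (ordinate at the truncation point divided by the selected proportion). The selected proportion is a hypothetical value, and whether the post takes exactly this route is an assumption.

```r
p_selected <- 0.10  # hypothetical proportion of individuals selected

# Quantile function: the trait value below which a fraction 1 - p_selected falls.
truncation_point <- qnorm(1 - p_selected)

# Round trip: the CDF at that value recovers the probability.
recovered_probability <- pnorm(truncation_point)

# Selection intensity under truncation selection on a standard normal trait.
selection_intensity <- dnorm(truncation_point) / p_selected
```

The design choice here is to work on the standard normal scale, so the intensity is expressed in phenotypic standard deviation units.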

Internals of Mixed Models

Linear mixed models have recently become widely used in agriculture and plant breeding. With access to genotype data and high-resolution phenotype data, using this family of models has become more of a requirement. Mixed models allow the parameter estimates of experimental (design or outcome) variables to have probability distributions, most commonly normal, with the opportunity to specify different variance-covariance components among the levels of those variables.

The nature of code: Why is it

Many find genetics, as a field of science in its own right, charming. Many more are excited to learn about a science that fits seamlessly into the complexity-driven life of organisms, providing explanations for natural phenomena at both micro-evolutionary and macro-evolutionary scales.

Color formatting of correlation table

Correlation
Correlation is a bivariate summary statistic. It describes the direction and magnitude of the association between two variables. Besides formatting with significance stars, color coding a table of correlation coefficients can help to pick out patterns at a quick glance.

Serpentine design and sorting

Take a grid and serpentine it row-wise or column-wise
This function joins two matrices alternately column-wise, which is why it is the source of inspiration for generating a serpentine design.
alternate.cols <- function(m1, m2) {
  cbind(m1, m2)[, order(c(seq(ncol(m1)), seq(ncol(m2))))]
}
A custom function to create a serpentine design in whatever fashion specified:
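The post's own serpentine function is not shown in this excerpt. As a minimal sketch of the idea, the block below first applies `alternate.cols` to two small matrices, then builds a row-wise serpentine layout with a hypothetical helper, `serpentine_rows`, which is an illustration and not the author's implementation.

```r
alternate.cols <- function(m1, m2) {
  cbind(m1, m2)[, order(c(seq(ncol(m1)), seq(ncol(m2))))]
}

m1 <- matrix(1:6, nrow = 2)   # columns: (1,2) (3,4) (5,6)
m2 <- matrix(7:12, nrow = 2)  # columns: (7,8) (9,10) (11,12)

# Columns of m1 and m2 are interleaved: m1[,1], m2[,1], m1[,2], m2[,2], ...
interleaved <- alternate.cols(m1, m2)

# Hypothetical helper: number plots row by row, reversing every second row.
serpentine_rows <- function(n_rows, n_cols) {
  grid <- matrix(seq_len(n_rows * n_cols), nrow = n_rows, byrow = TRUE)
  even_rows <- seq_len(n_rows) %% 2 == 0
  grid[even_rows, ] <- grid[even_rows, rev(seq_len(n_cols)), drop = FALSE]
  grid
}

layout <- serpentine_rows(3, 4)
```

With these inputs, the first row of `interleaved` is 1, 7, 3, 9, 5, 11, and `layout` reads 1 to 4 in the first row, 8 down to 5 in the second, and 9 to 12 in the third. A column-wise serpentine could be obtained by applying the same helper with the dimensions swapped and transposing the result.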

Video editing using ffmpeg

Background
Video editing and format conversion have, until recently, required a great deal of domain knowledge. The open-source tool ffmpeg is such a versatile toolbox that almost any media file can be tamed to our needs. I got the chance to learn more about its features for trimming and merging media files.

Cluster dendrogram: An introduction and showcase

A cluster analysis is a classification problem. It can be dealt with in several ways, one of which is hierarchical agglomeration. The method allows for easy presentation of high-dimensional data, more so when the number of observations fits readily into a visualization.
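A minimal sketch of hierarchical agglomeration in base R follows. The dataset (`USArrests`, which ships with R), the Euclidean distance, the average linkage, and the choice of four groups are all assumptions made for illustration, not the post's own settings.

```r
# USArrests ships with base R; it is used here only as a stand-in dataset.
scaled_data <- scale(USArrests)  # put variables on a common scale

# Pairwise distances between observations (Euclidean is an assumption).
distances <- dist(scaled_data, method = "euclidean")

# Agglomerative clustering; "average" linkage is one of several options.
hc <- hclust(distances, method = "average")

# Cut the tree into a hypothetical number of groups.
groups <- cutree(hc, k = 4)
```

Calling `plot(hc)` draws the dendrogram, and `table(groups)` shows how many observations fall into each cluster. With 50 observations the leaf labels remain legible, which is the situation the paragraph above describes.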

Disease epidemiology: A simulation scenario of infectious viral disease (COVID-19)

SIR model of COVID-19 epidemiology
## # A tibble: 5,555 x 6
##   beta_id  time     S     I     R beta_value
##   <chr>   <dbl> <dbl> <dbl> <dbl>      <dbl>
## 1 1         0     0.7  0.02  0.01        3.2
## 2 1         0.5   0.