Package 'grafify'

April 30, 2023


Title Easy Graphs for Data Visualisation and Linear Models for ANOVA



Description Easily explore data by plotting graphs with a few lines of code. Use these ggplot() wrappers to quickly draw graphs of scatter/dots with box-whiskers, violins or SD error bars, data distributions, before-after graphs, factorial ANOVA and more. Customise graphs in many ways, for example, by choosing from colour blind-friendly palettes (12 discrete, 3 continuous and 2 divergent palettes). Use the simple code for ANOVA as ordinary (lm()) or mixed-effects linear models (lmer()), including randomised-block or repeated-measures designs, and fit non-linear outcomes as a generalised additive model (gam) using mgcv. Obtain estimated marginal means and perform post-hoc comparisons on fitted models (via emmeans()). Also includes small datasets for practising code and teaching basics before users move on to more complex designs. See vignettes for details on usage. Citation: <doi:10.5281/zenodo.5136508>.

License GPL (>= 2)

Imports car, emmeans, Hmisc, lme4, lmerTest, magrittr, mgcv, patchwork, purrr, stats, tidyr

Depends R (>= 4.0), ggplot2

Suggests dplyr, knitr, rlang, rmarkdown, pbkrtest, testthat (>= 3.0.0)

Author Avinash R Shenoy [cre, aut]

Maintainer Avinash R Shenoy

Date/Publication 2023-04-29 22:40:02 UTC


R topics documented:

data_1w_death
data_2w_Festing
data_2w_Tdeath
data_cholesterol
data_doubling_time
data_t_pdiff
data_t_pratio
data_zooplankton
ga_anova
ga_model
get_graf_colours
graf_colours
graf_col_palette
graf_col_palette_default
graf_palettes
make_1way_data
make_1way_rb_data
make_2way_data
make_2way_rb_data
mixed_anova
mixed_anova_slopes
mixed_model
mixed_model_slopes
plot_3d_scatterbar
plot_3d_scatterbox
plot_3d_scatterviolin
plot_4d_scatterbar
plot_4d_scatterbox
plot_4d_scatterviolin
plot_befafter_box
plot_befafter_colours
plot_befafter_shapes
plot_density
plot_dotbar_sd
plot_dotbox
plot_dotviolin
plot_gam_predict
plot_grafify_palette
plot_histogram
plot_lm_predict
plot_logscale
plot_point_sd
plot_qqline
plot_qqmodel
plot_qq_gam
plot_scatterbar_sd
plot_scatterbox
plot_scatterviolin
plot_xy_CatGroup
plot_xy_NumGroup
posthoc_Levelwise
posthoc_Pairwise
posthoc_Trends_Levelwise
posthoc_Trends_Pairwise
posthoc_Trends_vsRef
posthoc_vsRef
scale_colour_grafify
scale_fill_grafify
simple_anova
simple_model
table_summary
table_x_reorder
theme_grafify
Index

data_1w_death   In vitro experiments measuring percentage cell death in three genotypes of cells.

Description

These data are from in vitro measurements of death of host cells (measured as percentage of total cells) after infection with three different strains of a pathogenic bacterium, from five independent experiments. The three strains are three levels within the fixed factor Genotype. The five independent experiments are levels within the random variable Experiment. These data can be analysed using linear mixed effects modelling. These data are from Goddard et al, Cell Rep, 2019, doi.org/10.1016/j.celrep.2019.03.100.

Usage

data_1w_death

Format

data.frame: 15 obs. of 3 variables.

Experiment   Experiment - a random factor with 5 levels "Exp_1", "Exp_2"...
Genotype     Genotypes - a fixed factor with 3 levels: "WT", "KO_1", "KO_2".
Death        Numerical dependent variable indicating percentage cell death.
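As a rough illustration of the linear mixed effects analysis mentioned for this dataset, the sketch below fits a random-intercept model directly with lme4. It assumes that grafify and lme4 are installed and that the column names match the Format listing above; the object name m_death is made up for this example, and this is one possible model rather than a prescribed analysis.

```r
# Minimal sketch: assumes grafify (for the data) and lme4 are installed.
library(grafify)
library(lme4)

# Genotype is the fixed factor; each Experiment gets its own random intercept.
m_death <- lmer(Death ~ Genotype + (1 | Experiment), data = data_1w_death)

# Fixed-effect estimates: intercept plus one coefficient per non-reference genotype.
fixef(m_death)
```

The random intercept absorbs experiment-to-experiment shifts in overall cell death, so genotypes are compared within experiments.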

data_2w_Festing   Data from two-way ANOVA with randomised block design of treatments of strains of mice.

Description

Data from Festing, ILAR Journal (2014) 55, 472-476. These data are suitable for two-way linear mixed effects modelling. The activity of GST (numerical dependent variable) was measured in 4 strains of mice (levels within the fixed factor Strain) either treated or controls (levels within the fixed factor Treatment). One mouse each was used in two randomised blocks, which is the random factor (Block).

Usage

data_2w_Festing

Format

data.frame: 16 obs. of 4 variables:

Block       A random factor with 2 levels "A" and "B".
Treatment   A fixed factor with 2 levels: "Control" & "Treated".
Strain      A fixed factor with 4 levels: "129Ola", "A/J", "NIH" & "BALB/C".
GST         Numerical dependent variable indicating GST activity measurement.

data_2w_Tdeath   In vitro measurement of percentage cell death - two-way ANOVA design with repeated measures, and randomised blocks.

Description

These are measurements of death of infected host cells (as percentage of total cells) upon infection with two strains of bacteria, measured at two time points, in 6 independent experiments. These are repeated-measures data suitable for two-way linear mixed effects modelling with experiment and subjects as random factors.

Usage

data_2w_Tdeath

Format

data.frame: 24 obs. of 6 variables:

Experiment   A random factor with 6 levels "e1", "e2"...
Time         A fixed factor with 2 levels: "t100" & "t300".
Time2        A numeric column that allows plotting data on a quantitative "Time" axis. The "Time" column has "factor" type values that should be used for the ANOVA.
Genotype     A fixed factor with 2 levels that we want to compare "WT" & "KO".
Subject      A random factor with 12 levels: "s1", "s2"... These are cell culture wells that were measured at two time points, and indicate "subjects" that underwent repeated-measures within each of 6 experiments. Subject IDs for WT and KO are unique and clearly indicate different wells.
PI           Numerical dependent variable indicating propidium iodide dye uptake as a measure of cell death. These are percentage of dead cells out of total cells plated.

data_cholesterol   Hierarchical data from 25 subjects either treated or not at 5 hospitals - two-way ANOVA design with repeated measures.

Description

An example dataset on blood cholesterol levels measured in subjects before and after receiving a drug. Five patients were recruited at each of 5 hospitals (a-e), so that there are 25 different subjects (1-25), each measured twice. Data are from Micro/Immuno Stats

Usage

data_cholesterol

Format

tibble: 50 obs. of 4 variables:

Hospital      Factor with 5 levels (a-e), representing different hospitals where subjects were recruited.
Subject       A factor with 25 levels denoting individuals on whom measurements were made twice.
Treatment     A factor with 2 levels indicating when measurements were made, i.e. before and after drug.
Cholesterol   Numerical dependent variable indicating measured blood cholesterol level.
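The hierarchical structure described above (subjects measured twice, recruited within hospitals) could be expressed as two random intercepts. The sketch below is one hedged way to do this with lme4, assuming grafify and lme4 are installed and the column names are as listed; m_chol is a hypothetical object name. Because Subject IDs are unique across hospitals, separate (1 | Hospital) and (1 | Subject) terms describe the nesting.

```r
# Minimal sketch: assumes grafify (for the data) and lme4 are installed.
library(grafify)
library(lme4)

# Treatment (before/after drug) is the fixed factor.
# Hospital and Subject each get a random intercept; Subject IDs are unique
# across hospitals, so this encodes subjects nested within hospitals.
m_chol <- lmer(Cholesterol ~ Treatment + (1 | Hospital) + (1 | Subject),
               data = data_cholesterol)

# Fixed-effect estimates: intercept and the Treatment coefficient.
fixef(m_chol)
```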

data_doubling_time   Doubling time of E. coli measured by 10 students three independent times.

Description

An example dataset showing measurements of E. coli doubling times (in min) measured by 10 different students in 3 independent experiments each. Note that experiments are just called Exp1-Exp3 for every student, even though Exp1 of one student is not connected in any way to Exp1 of another - this will confuse R!

Data are from Micro/Immuno Stats
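One way around the shared experiment labels noted above is to build an identifier that is unique to each student-experiment combination. The sketch below uses base R's interaction() for this; it assumes grafify is installed, and the column name Exp_ID is invented for this example and is not part of the dataset.

```r
# Minimal sketch: assumes grafify is installed (for the data only).
library(grafify)

d <- as.data.frame(data_doubling_time)

# Combine Student and Experiment so that "Exp1" of one student is no longer
# confused with "Exp1" of another. Exp_ID is a hypothetical column name.
d$Exp_ID <- interaction(d$Student, d$Experiment, drop = TRUE)

# One level per student-experiment combination present in the data.
nlevels(d$Exp_ID)
```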

Usage

data_doubling_time

Format

tibble: 30 obs. of 3 variables:

Student         Factor with 10 levels, representing different students.
Experiment      A factor with 3 levels representing independent experiments.
Doubling_time   Numerical dependent variable indicating measured doubling time in min.

data_t_pdiff   Matched data from two groups where difference between them is consistent.

Description

An example dataset for paired difference Student's t-test. These are bodyweight (Mass) in grams of same mice left untreated or treated, which are two groups to be compared. The data are in a long table format, and the two groups are levels within the factor "Condition". The Subject column lists ID of matched mice that were measured without and with treatment. These data are from Sanchez-Garrido et al, Sci Signal, 2018, DOI: 10.1126/scisignal.aat6903.

Usage

data_t_pdiff

Format

data.frame: 20 obs. of 3 variables:

Subject     Factor with 10 levels, denoted by capital letters, representing individuals or subjects.
Condition   A fixed factor with 2 levels: "Untreated" & "Treated".
Mass        Numerical dependent variable indicating body mass of mice.
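A paired-difference t-test on these long-format data needs the two groups lined up by Subject first. The sketch below does this with base R only, assuming grafify is installed for the data and that the Condition levels are spelled as in the Format listing; the object names are illustrative.

```r
# Minimal sketch: assumes grafify is installed (for the data only).
library(grafify)

# Sort by Subject so the i-th untreated value and the i-th treated value
# come from the same mouse.
d <- data_t_pdiff[order(data_t_pdiff$Subject), ]
untreated <- d$Mass[d$Condition == "Untreated"]
treated   <- d$Mass[d$Condition == "Treated"]

# Paired t-test on the within-mouse differences (treated minus untreated).
res <- t.test(treated, untreated, paired = TRUE)
res
```

Passing two aligned vectors avoids relying on the formula interface for paired tests, whose behaviour has changed between R versions.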

data_t_pratio   Matched data from two groups where ratio between them is consistent.

Description

An example dataset for paired ratio Student's t-test. These are Cytokine measurements by ELISA (in ng/ml) from 33 independent in vitro experiments performed on two Genotypes that we want to compare. The data are in a long table format, and the two groups are levels within the factor "Genotype". The Experiment column lists ID of matched experiments.

Usage

data_t_pratio

Format

data.frame: 66 obs. of 3 variables:

Genotype     Factor with 2 levels, representing genotypes to be compared ("WT" & "KO").
Experiment   A random factor with 33 levels representing independent experiments, denoted as "Exp_1", "Exp_2"...
Cytokine     Numerical dependent variable indicating cytokine measured by ELISA.

data_zooplankton   Time-series data on zooplankton in lake Menon.

Description

Data provided by the Wisconsin Department of Natural Resources.

Usage

data_zooplankton

Format

tibble: 1127 obs. of 8 variables:

day               Numeric integer variable.
year              Numeric integer variable of years during which data were collected.
lake              This data is for lake Menona; data for other lakes are not included in this subset.
taxon             Names of zooplankton taxa as factor of 8 levels.
density           Numeric values of density of measurements.
density_adj       Numeric values of adjusted density.
min_density       Numeric values of minimum densities.
desnsity_scaled   Numeric value of scaled density.

ga_anova   ANOVA table from a generalised additive model (gam)

Description

The two functions ga_model and ga_anova are for fitting generalised additive models (gam) with the mgcv package. They use the gam() function in mgcv for ANOVA designs with up to two categorical fixed factors (with two or more levels; Fixed_Factor), and exactly one continuous variable (e.g. time), which is called Smooth_Factor. A smooth function is fitted with a factor-wise smooth basis function (by =). A default value for the number of nodes (the argument k in gam) may work, but a specific number can be provided using the Nodes argument. The model is fit using the REML method. When two categorical fixed factors are provided, an interaction term is included for main effects and smooth basis functions.

Usage

ga_anova(
  data,
  Y_value,
  Fixed_Factor,
  Smooth_Factor,
  Random_Factor = NULL,
  Nodes = NULL,
  ...
)

Arguments

data            a data frame where categorical independent variables are converted to factors using as.factor() first. The function will throw errors without this.
Y_value         name of column containing quantitative (dependent) variable, provided within "quotes".
Fixed_Factor    name(s) of categorical fixed factors (independent variables) provided as a vector if more than one or within "quotes". Convert to factors first with as.factor.
Smooth_Factor   the continuous variable to fit smoothly with a basis function, provided within "quotes" (only 1 Smooth_Factor allowed).
Random_Factor   name(s) of random factors to be provided in "quotes" (only 1 Random_Factor allowed). Convert to factor with as.factor first.
Nodes           number of nodes (the parameter k in gam).
...             any additional variables to pass on to gam or anova

Details

If a Random_Factor is also provided, it is fitted using bs = "re" smooth.

Value

ANOVA table of class "anova" and "data.frame".
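To make the model structure described for ga_anova and ga_model more concrete, the sketch below fits a comparable model directly with mgcv on simulated data: one categorical factor as a main effect, a factor-wise smooth of a continuous variable (by =), and REML estimation. The formula is an assumption about what such a model looks like in mgcv, not a copy of grafify's internal code, and the data and names (sim, group, x, y) are invented.

```r
# Minimal sketch with simulated data; mgcv ships with standard R installations.
library(mgcv)

set.seed(1)
n <- 200
sim <- data.frame(
  group = factor(rep(c("a", "b"), each = n / 2)),  # categorical fixed factor
  x     = runif(n, 0, 10)                          # continuous smooth variable
)
# Each group follows a different curve over x, plus noise.
sim$y <- ifelse(sim$group == "a", sin(sim$x), cos(sim$x)) + rnorm(n, sd = 0.3)

# Main effect of group plus a separate smooth of x for each level of group,
# fitted by REML. k sets the basis dimension (10 is mgcv's default for s()).
m <- gam(y ~ group + s(x, by = group, k = 10), data = sim, method = "REML")

# Approximate significance of parametric and smooth terms.
anova(m)
```

A random factor could be added to such a formula as s(rf, bs = "re"), where rf is a hypothetical factor column, matching the bs = "re" smooth mentioned in Details.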


Examples

#with zooplankton data
ga_anova(data = data_zooplankton,
         Y_value = "log(density_adj)",
         Fixed_Factor = "taxon",
         Smooth_Factor = "day")

ga_model   Fit a generalised additive model (gam)

Description

The two functions ga_model and ga_anova are for fitting generalised additive models (gam) with the mgcv package. They use the gam() function in mgcv for ANOVA designs with up to two categorical fixed factors (with two or more levels; Fixed_Factor), and exactly one continuous variable (e.g. time), which is called Smooth_Factor. A smooth function is fitted with a factor-wise smooth basis function (by =). A default value for the number of nodes (the argument k in gam) may work, but a specific number can be provided using the Nodes argument. The model is fit using the REML method. When two categorical fixed factors are provided, an interaction term is included for main effects and smooth basis functions.

Usage

ga_model(
  data,
  Y_value,
  Fixed_Factor,
  Smooth_Factor,
  Random_Factor = NULL,
  Nodes = "NULL",
  ...
)

Arguments

data            a data frame where categorical independent variables are converted to factors using as.factor() first. The function will throw errors without this.
Y_value         name of column containing quantitative (dependent) variable, provided within "quotes".
Fixed_Factor    name(s) of categorical fixed factors (independent variables) provided as a vector if more than one or within "quotes". Convert to factors first with as.factor.
Smooth_Factor   the continuous variable to fit smoothly with a basis function, provided within "quotes" (only 1 Smooth_Factor allowed).
Random_Factor   name(s) of random factors to be provided in "quotes" (only 1 Random_Factor allowed). Convert to factor with as.factor first.
Nodes           number of nodes (the parameter k in gam).
...             any additional variables to pass on to gam or anova

Details

If a Random_Factor is also provided, it is fitted using bs = "re" smooth.

Value

This function gives a generalised additive model object of class "gam", "lm" and "glm".


Examples

#fit a model with zooplankton data
z1 <- ga_model(data = data_zooplankton,
               Y_value = "log(density_adj)",
               Fixed_Factor = "taxon",
               Smooth_Factor = "day")

get_graf_colours   Get graf internal

Description

Function to make grafify colour scheme.

Thank you Dr Simon

Usage

get_graf_colours(...)

Arguments

...   internal

Details

To visualise grafify colours use plot_grafify_palette.

Value

This function returns names and hexcodes of colours in grafify as a character vector.

graf_colours   List of hexcodes of colours in grafify palettes

Description

To visualise these colours use plot_grafify_palette. okabe_ito, bright, contrast, dark, light, muted, pale, vibrant, yellow_conti from Paul Tol's post. Zesty, Pastel, Elegant from this link. Colour hexcodes for fishy, kelly, r4, safe, OrBl_div, PrGn_div, blue_conti, grey_conti taken from the cols4all package (c4a_gui). All schemes are colour blind-friendly.

Usage

graf_colours

Format

An object of class character of length 154.

Value

This is a character vector with names and hexcodes of colours used by palette functions. It is used by get_graf_colours to generate palettes.

graf_col_palette   Call grafify palettes for scale & fill functions

Description

is less than that in the palette. This is the default for grafify with ColSeq = TRUE. If the number of colours required is more than that in the discrete palette, it fills intervening colours using the colorRampPalette [grDevices] function.

Usage

graf_col_palette(palette = "okabe_ito", reverse = FALSE, ...)
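The interpolation step mentioned in the Description, where more colours are needed than a discrete palette contains, can be illustrated with grDevices::colorRampPalette on its own. The three hex codes below are illustrative values typed in for this sketch, not read from grafify, and the object names are invented.

```r
# Minimal sketch using base R only (grDevices is part of every R installation).
# Three illustrative hex codes; these are assumptions, not grafify's own data.
base_cols <- c("#E69F00", "#56B4E9", "#009E73")

# colorRampPalette() returns a function that interpolates between the
# supplied colours to produce any requested number of colours.
ramp <- grDevices::colorRampPalette(base_cols)

# Ask for 7 colours from a 3-colour palette: the end colours are kept and
# the intervening ones are interpolated.
cols7 <- ramp(7)
cols7
```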



