appDatos: Logic and Workflow

Technical documentation of the Shiny application

Published

February 21, 2026

Overview

appDatos is a Shiny application designed for the interactive exploration of environmental and food safety laboratory data. It allows the user to:

Select a matrix (type of product or sample, e.g. water, wine, fruit),
Choose the analytes to display on the X and Y axes,
Apply filters by label (rótulo) and by analytical entity,
Explore data through a scatter plot (two variables) or a histogram (one variable),
Consult a Spearman correlation table for the selected X-axis analyte.

Packages used

library(shinyWidgets)    # Enhanced widgets (pickerInput, actionBttn)
library(bslib)           # Modern theme and layout (page_navbar, card, sidebar)
library(tidyverse)       # Data manipulation and plotting (ggplot2, dplyr, stringr)
library(plotly)          # Interactive charts (ggplotly)
library(shinycssloaders) # Loading spinner while plots render
library(DT)              # Interactive tables
library(feather)         # Fast file reading (loaded but not actively used)

Data file structure

Data are stored as .rds files in the same directory as app.R. There are three file types per matrix:

Type	Name pattern	Contents
Analytical data	`<key>.rds`	Table with one column per analyte + `Rótulo`, `Entidad`, `Fracción`
Correlations	`correl_<key>.rds`	Spearman correlation matrix between analytes
Analyte names	`analisis_<key>.rds`	Character vector of available analyte names for the matrix
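
As an illustration of the naming scheme, the three paths for one matrix can be derived from its key. This is only a sketch: the helper name rutas_rds() and its dir argument are hypothetical and do not appear in app.R, and the key "agupr" is borrowed from the data-loading code shown later.

```r
# Hypothetical helper (not part of app.R): builds the three file paths
# that belong to one matrix key, following the name patterns in the table.
rutas_rds <- function(key, dir = ".") {
  list(
    datos    = file.path(dir, paste0(key, ".rds")),
    correl   = file.path(dir, paste0("correl_", key, ".rds")),
    analisis = file.path(dir, paste0("analisis_", key, ".rds"))
  )
}

rutas <- rutas_rds("agupr")
rutas$correl
```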

Available matrices (30 total)

matrices <- c(
  "AGUA PROCESO", "AGUA SUPERFICIAL", "AGUA EFLUENTE", "AGUA SUMINISTRO",
  "AGUA SUBTE", "SUELO REMEDIACIÓN", "JCONT LIMÓN", "JSIM LIMÓN",
  "ACEITE LIMÓN", "DESH TE", "JCONT NARANJA", "JSIM NARANJA",
  "JCONC MANZANA", "FF PERA", "FF MANZANA", "FF DURAZNO", "FF LIMÓN",
  "BAYAS UVA", "HORTALIZA TOMATE", "VINO TINTO", "VINO BLANCO",
  "VINO ROSADO", "PSIM DURAZNO", "PCON DURAZNO", "PSIM DAMASCO",
  "PCON DAMASCO", "PSIM MANZANA", "PCON MANZANA", "CEREAL MAÍZ",
  "TRUCHA MÚSCULO"
)

Data preparation

The .rds files consumed by the application are produced by two functions defined in carga.R: carga_datos() for the analytical data tables and correlaciones() for the Spearman correlation matrices. Both share a common data preparation pipeline applied to a raw CSV exported from the LIMS.

Common pipeline

1. Column selection and renaming

The CSV is read with data.table::fread() for speed. Only the relevant columns are kept, Resultado convertido is renamed to Resultado_conv, and the renamed column is coerced to numeric.

datos <- fread(file = archivo) %>%
  select(
    Fracción, `Tipo de producto`, Matriz, Rótulo, Entidad, Análisis,
    Resultado, `Modificador de resultado`, `Límite de detección`,
    `Límite de cuantificación`, `Resultado convertido`, `Unidad inicial`
  ) %>%
  rename(Resultado_conv = `Resultado convertido`) %>%
  mutate(Resultado_conv = as.numeric(Resultado_conv))

2. Handling censored values

Results flagged as below the detection or quantification limit are replaced by the corresponding limit value:

datos <- datos %>%
  mutate(Resultado_conv = case_when(
    `Modificador de resultado` == "nd" ~ as.numeric(`Límite de detección`),
    `Modificador de resultado` == "<"  ~ as.numeric(`Límite de cuantificación`),
    .default = Resultado_conv
  ))
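
The substitution can be checked on a few rows. The values below are invented for illustration; only the column names and the two modifier codes come from the pipeline. Note that the .default argument of case_when() requires dplyr 1.1.0 or later.

```r
library(dplyr)

# Invented rows: one quantified result, one "nd", one "<".
toy <- data.frame(
  Resultado_conv             = c(12.5, NA, NA),
  `Modificador de resultado` = c(NA, "nd", "<"),
  `Límite de detección`      = c("0.1", "0.1", "0.1"),
  `Límite de cuantificación` = c("0.5", "0.5", "0.5"),
  check.names = FALSE
)

toy <- toy %>%
  mutate(Resultado_conv = case_when(
    `Modificador de resultado` == "nd" ~ as.numeric(`Límite de detección`),
    `Modificador de resultado` == "<"  ~ as.numeric(`Límite de cuantificación`),
    .default = Resultado_conv          # rows without a modifier keep their value
  ))

toy$Resultado_conv
```

A missing modifier evaluates to NA in both conditions, so the row falls through to .default and keeps its original value.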

3. Unit conversion for scaled results

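Some rows carry results expressed with a power-of-10 multiplier encoded in Unidad inicial (e.g. "x1³" means ×10³). These rows are separated, rescaled, and rejoined with the rest. The superscript is converted to an integer code through a factor, so the mapping 1 → 10³ … 6 → 10⁸ holds only when all six superscripts (³ to ⁸) occur in the file being processed: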

dat1 <- datos %>% filter(str_detect(`Unidad inicial`, "x1"))

supin <- str_extract(dat1$`Unidad inicial`, "\\W")  # extract the superscript character

dat1 <- dat1 %>%
  mutate(y = as.numeric(as.factor(supin))) %>%       # encode as integer 1–6
  mutate(Resultado = case_when(
    y == 1 ~ Resultado * 1e3,
    y == 2 ~ Resultado * 1e4,
    y == 3 ~ Resultado * 1e5,
    y == 4 ~ Resultado * 1e6,
    y == 5 ~ Resultado * 1e7,
    y == 6 ~ Resultado * 1e8
  ))

datos <- datos %>% filter(!str_detect(`Unidad inicial`, "x1"))
datos <- full_join(dat1, datos)
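
Because the integer codes produced by as.factor() depend on which superscripts happen to be present in the data, an explicit lookup is one way to make the rescaling independent of the file contents. The following is a sketch of that alternative, not the code in carga.R; the names exponentes and multiplicador() and the example unit strings are invented for illustration.

```r
# Explicit superscript -> exponent table.
# U+00B3 is the superscript three; U+2074..U+2078 are superscripts four to eight.
exponentes <- setNames(
  3:8,
  c("\u00b3", "\u2074", "\u2075", "\u2076", "\u2077", "\u2078")
)

# Returns the multiplier for each unit string, or NA when no superscript is found.
multiplicador <- function(unidad) {
  mult <- rep(NA_real_, length(unidad))
  for (sup in names(exponentes)) {
    mult[grepl(sup, unidad, fixed = TRUE)] <- 10^exponentes[[sup]]
  }
  mult
}

unidades <- c("x10\u00b3 UFC/ml", "mg/l", "x10\u2076 UFC/ml")  # invented examples
multiplicador(unidades)
```

With this table a file that contains only, say, ³ and ⁶ still maps them to 10³ and 10⁶.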

4. Result consolidation and NA handling

Resultado_conv is the primary numeric value. Missing values are temporarily coded as 0, so wherever Resultado_conv is 0 it falls back to Resultado. Zeros are then converted back to NA and rows still missing a value are dropped, which means a genuine result of exactly 0 is treated as missing:

datos <- datos %>%
  select(Fracción, `Tipo de producto`, Matriz, Rótulo, Entidad,
         Análisis, Resultado, Resultado_conv, `Unidad inicial`) %>%
  replace(is.na(.), 0) %>%
  mutate(Resultado_conv = ifelse(Resultado_conv == 0, Resultado, Resultado_conv)) %>%
  replace(. == 0, NA) %>%
  select(Fracción, `Tipo de producto`, Matriz, Rótulo, Entidad,
         Análisis, Resultado_conv, `Unidad inicial`) %>%
  drop_na()
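
The fallback itself can also be written without a zero sentinel. The sketch below uses dplyr::coalesce() on invented values; it is an alternative formulation rather than the code in carga.R, and it behaves differently in one respect: a genuine 0 in Resultado_conv is kept instead of being replaced.

```r
library(dplyr)

# Invented values: row 1 has only Resultado, row 2 has both, row 3 has neither.
toy <- data.frame(
  Resultado      = c(5, 7, NA),
  Resultado_conv = c(NA, 3, NA)
)

toy <- toy %>%
  mutate(Resultado_conv = coalesce(Resultado_conv, Resultado)) %>%  # first non-missing value
  filter(!is.na(Resultado_conv))                                    # drop rows with no result at all

toy$Resultado_conv
```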

5. Unit harmonization

Three units are converted to a common scale before pivoting:

Original unit	Multiplied by	Resulting unit
g/l	×1 000	mg/l
mS/cm	×1 000	µS/cm
g/kg	×1 000	mg/kg

datos <- datos %>%
  mutate(Resultado_conv = case_when(
    `Unidad inicial` == "g/l"   ~ Resultado_conv * 1000,
    `Unidad inicial` == "mS/cm" ~ Resultado_conv * 1000,
    `Unidad inicial` == "g/kg"  ~ Resultado_conv * 1000,
    .default = Resultado_conv
  ))

6. Pivot to wide format

Each unique value of Análisis becomes a column; the cell values are the consolidated Resultado_conv:

datos <- pivot_wider(datos, names_from = Análisis, values_from = Resultado_conv)
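
A small example shows the reshaping. The fraction codes, analyte names, and values are invented; only the column names match the pipeline.

```r
library(tidyr)

# Long format: one row per fraction and analyte.
largo <- data.frame(
  `Fracción`     = c("F-0011", "F-0011", "F-0021"),
  `Análisis`     = c("pH", "Conductividad", "pH"),
  Resultado_conv = c(7.2, 350, 6.8),
  check.names = FALSE
)

# Wide format: one row per fraction, one column per analyte.
ancho <- pivot_wider(largo, names_from = `Análisis`, values_from = Resultado_conv)
names(ancho)
```

An analyte that was not measured for a fraction (Conductividad for "F-0021" here) appears as NA in the wide table.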

7. Fraction aggregation

The last digit of the Fracción code is standardised to "1" to group sub-fractions. Within each group, numeric columns are summed and Rótulo/Entidad are taken from the first row:

datos <- datos %>%
  mutate(Fracción = str_replace_all(Fracción, "\\d$", "1")) %>%
  replace(is.na(.), 0) %>%
  group_by(Fracción) %>%
  summarise(
    Rótulo   = first(Rótulo),
    Entidad  = first(Entidad),
    across(where(is.numeric), sum)
  ) %>%
  ungroup()
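
The effect of the aggregation can be seen on three invented rows in which two sub-fractions of the same sample carry different analytes. The data are made up, and base sub() is used in place of str_replace_all() only to keep the sketch to a single package.

```r
library(dplyr)

# Invented wide-format rows; NAs have already been replaced by 0.
toy <- data.frame(
  `Fracción` = c("F-0011", "F-0012", "F-0021"),
  `Rótulo`   = c("Muestra A", "Muestra A", "Muestra B"),
  Entidad    = c("Lab 1", "Lab 1", "Lab 2"),
  pH         = c(7.2, 0, 6.8),
  Hierro     = c(0, 0.4, 0),
  check.names = FALSE
)

agregado <- toy %>%
  mutate(`Fracción` = sub("\\d$", "1", `Fracción`)) %>%  # "F-0012" becomes "F-0011"
  group_by(`Fracción`) %>%
  summarise(
    `Rótulo` = first(`Rótulo`),
    Entidad  = first(Entidad),
    across(where(is.numeric), sum)                       # 0 acts as "not measured"
  ) %>%
  ungroup()

agregado
```

The two sub-fractions of "Muestra A" collapse into one row that holds both the pH and the Hierro result.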

8. Entity ordering and final NA restoration

Entities are sorted alphabetically and stored as a factor whose levels follow that order (the factor itself is not an ordered factor). Zeros introduced during aggregation are converted back to NA:

r <- sort(unique(datos$Entidad))
datos <- datos %>%
  mutate(Entidad = factor(Entidad, levels = r)) %>%
  replace(. == 0, NA)

`carga_datos()`: saving the analytical table

After the common pipeline, the wide-format tibble is saved directly as an .rds file ready for the application:

saveRDS(datos, guardado)

`correlaciones()`: computing and saving the Spearman matrix

After the same pipeline, correlaciones() additionally:

Drops the Fracción, Rótulo, and Entidad identifier columns to keep only the analyte columns.
Removes any analyte with ≤1 non-NA observation (required for a valid correlation estimate).
Computes the pairwise Spearman correlation matrix with cor(..., use = "pairwise.complete.obs").
Rounds values to 2 decimal places, adds an Analisis name column, and saves as .rds.

data <- datos %>% select(-(1:3))

# Keep only analytes with more than one non-NA observation
x     <- unname(colSums(!is.na(data)))  # non-NA count per analyte column
data1 <- data[, which(x > 1)]

matriz <- cor(data1, method = "spearman", use = "pairwise.complete.obs")
correl  <- as_tibble(matriz) %>% round(2)
correl  <- correl %>% mutate(Analisis = names(correl))

write_rds(correl, archivo_output)
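
The same steps can be traced on a toy table in base R. The values are invented, and as.data.frame() stands in for as_tibble() only to avoid extra packages in the sketch.

```r
# Invented analyte columns; Hierro has a single observation and is dropped.
toy <- data.frame(
  pH            = c(7.1, 6.8, 7.4, 7.0, 6.5),
  Conductividad = c(300, 310, 350, NA, 250),
  Hierro        = c(NA, NA, 0.4, NA, NA)
)

n_obs <- colSums(!is.na(toy))            # non-NA observations per analyte
toy1  <- toy[, n_obs > 1, drop = FALSE]  # keep analytes with more than one

# Each pair of analytes uses only the rows where both are present.
matriz <- cor(toy1, method = "spearman", use = "pairwise.complete.obs")

correl <- as.data.frame(round(matriz, 2))
correl$Analisis <- rownames(matriz)
correl
```

With use = "pairwise.complete.obs" the row where Conductividad is missing is excluded only from pairs involving Conductividad, so each coefficient may rest on a different number of samples.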

Note: correlaciones() also writes an analisis_*.rds file containing the vector of analyte names. In carga.R this filename is hard-coded ("analisis_trucha_musc.rds"); in practice it would need to be parameterised the same way archivo_output is.

User Interface (UI)

The UI is built with bslib::page_navbar(), which produces a top navigation bar with tabs and a collapsible sidebar.

General layout

page_navbar
├── nav_panel("Dos variables")           ← Tab: scatter plot + correlation table
├── nav_panel("Una variable")            ← Tab: histogram
├── nav_item: selectizeInput(matrices)   ← Matrix selector (embedded in navbar)
├── nav_item: uiOutput(selec_analisis_x) ← X-axis selector (dynamic)
├── nav_item: conditionalPanel           ← Y-axis selector (only in "Dos variables")
└── sidebar
    ├── ("Una variable" panel) textInput(rot1) + actionBttn(buscar1)
    ├── ("Dos variables" panel) textInput(rot) + actionBttn(buscar)
    ├── checkboxInput(log)     ← Apply log10() to BOTH axes in the data values
    ├── checkboxInput(logy)    ← Log10 axis scale on Y (axis transformation)
    ├── checkboxInput(logx)    ← Log10 axis scale on X (axis transformation)
    ├── numericInput(vert, vert2)   ← Vertical reference lines at x
    ├── ("Dos variables") numericInput(hor, hor2)  ← Horizontal reference lines at y
    ├── checkboxInput(ent)     ← Color points/bars by entity
    ├── uiOutput(selec_entidades)  ← Multi-select entity picker (dynamic)
    └── ("Dos variables") numericInput(pendiente, ordenada, ...) ← Manual regression lines

Conditional panels

conditionalPanel() is used to show or hide controls based on the active tab (input.nav):

“Dos variables” controls (Y-axis selector, horizontal lines, slope/intercept inputs) appear only in that tab.
“Una variable” controls (label filter for the histogram) appear only in that tab.
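
A minimal sketch of one such panel follows. It assumes the navbar is created with page_navbar(id = "nav", ...), so that input.nav holds the title of the active tab; the label text is invented, and the input id "hor" is the one listed in the layout above.

```r
library(shiny)

# Shown only while the "Dos variables" tab is active.
# The condition is a JavaScript expression evaluated in the browser.
panel_dos_variables <- conditionalPanel(
  condition = "input.nav == 'Dos variables'",
  numericInput("hor", "Línea horizontal (y)", value = NA)
)

class(panel_dos_variables)
```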

Dynamic widgets

Three UI outputs are rendered dynamically by the server because they depend on the loaded data:

Output	Widget produced	Depends on
`selec_analisis_x`	`selectizeInput` (X axis)	`analisis()` — analyte names for the matrix
`selec_analisis_y`	`selectizeInput` (Y axis)	`analisis()` — same as above
`selec_entidades`	`pickerInput` (multi-select)	`datos_full()$Entidad` — unique entities in the data

Server logic

Reactive flow diagram

input$matrices
    │
    ├──► datos_full()   ─────────────────────────────────────────────────┐
    │       (analytical data for the selected matrix)                     │
    │                                                                      │
    ├──► analisis()     ──► selec_analisis_x / selec_analisis_y (UI)     │
    │       (analyte names)                                                │
    │                                                                      │
    └──► correl()       ──► tablacorr (correlation table)                 │
            (Spearman correlation matrix)                                   │
                                                                           │
input$buscar  ──► rotulos()  ────────────────────► subsetted() ──► scatterplot
                   (label filter)                       │
                                                        │ also filters by entity
input$enti  ────────────────────────────────────────────┘

input$buscar1 ──► rotulos1() ───────────────────► subsetted1() ──► scatterplot2
                   (label filter for histogram)

Data loading: `datos_full()`, `analisis()`, `correl()`

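All three are eventReactive(input$matrices, {...}): they recompute only when the user changes the matrix. Each uses switch() to map the matrix name to its object: datos_full() reads the corresponding .rds file, while analisis() and correl() return objects that are already in memory (see the note below).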

datos_full <- eventReactive(input$matrices, {
  req(input$matrices)
  switch(input$matrices,
    "AGUA PROCESO"    = read_rds("agupr.rds"),
    "AGUA SUPERFICIAL" = read_rds("super_agua.rds"),
    # ... (30 matrices)
  )
})

Note: The analisis_* and correl_* objects are loaded once at session startup (outside the server function). The analisis() and correl() reactives simply select the already-loaded object from memory via switch(), without reading files on every matrix change.

Label filtering: `rotulos()` and `rotulos1()`

Both are eventReactive triggered by the “Visualizar” button (input$buscar or input$buscar1). This is an intentional design choice: the plot does not update as the user types, only when the button is clicked.

rotulos <- eventReactive(input$buscar, {
  if (input$rot != "") {
    datos_full() %>%
      filter(str_detect(Rótulo, regex(input$rot, ignore_case = TRUE))) %>%
      pull(Rótulo)
  } else {
    datos_full() %>% pull(Rótulo)  # No filter: returns all labels
  }
})

The search uses str_detect() with regex(..., ignore_case = TRUE), so it accepts regular expressions and is case-insensitive.
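
Since the text is interpreted as a regular expression, characters such as ( or . have a special meaning. The example below illustrates the matching behaviour on invented labels; the last call shows fixed() as one possible way to force a literal search, which is not what the application currently does.

```r
library(stringr)

rotulos_ejemplo <- c("Vino Tinto Malbec", "AGUA de pozo", "vino blanco")  # invented labels

# Case-insensitive substring search
simple <- str_detect(rotulos_ejemplo, regex("vino", ignore_case = TRUE))

# Regular-expression syntax is available: anchor plus alternation
patron <- str_detect(rotulos_ejemplo, regex("^vino (tinto|blanco)", ignore_case = TRUE))

# Literal search: the parentheses are matched as plain characters
literal <- str_detect("Lote (A)", fixed("(a)", ignore_case = TRUE))

simple
patron
literal
```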

Data subsetting: `subsetted()` and `subsetted1()`

These combine the entity filter (multi-select picker) and the label filter:

subsetted <- reactive({
  req(input$enti)
  datos_full() %>%
    filter(Entidad %in% input$enti & Rótulo %in% rotulos())
})

subsetted() feeds the scatter plot; subsetted1() feeds the histogram.

Visualizations

Scatter plot (`output$scatterplot`)

Built with ggplot2 and made interactive with ggplotly(). Because subsetted() depends on rotulos(), nothing is drawn until input$buscar has been pressed, and a new label filter takes effect only on the next click.

Layers always present

ggplot(subsetted(), aes(.data[[input$x]], .data[[input$y]],
                         text = paste0("Fracción: ", Fracción, "</br>",
                                       "Rótulo: ", Rótulo))) +
  geom_point() +
  geom_vline(xintercept = input$vert,  color = "red") +
  geom_vline(xintercept = input$vert2, color = "red") +
  geom_hline(yintercept = input$hor,   color = "red") +
  geom_hline(yintercept = input$hor2,  color = "red") +
  geom_abline(slope = input$pendiente,  intercept = input$ordenada,  color = "green") +
  geom_abline(slope = input$pendiente2, intercept = input$ordenada2, color = "green")

Variants based on active controls

The server covers all eight combinations of input$ent, input$logx, and input$logy via explicit if/else if branches, plus a separate case for input$log:

`input$ent`	`input$logx`	`input$logy`	`input$log`	Effect
FALSE	FALSE	FALSE	FALSE	Standard plot, no entity coloring
FALSE	TRUE	FALSE	FALSE	X axis on log10 scale
FALSE	TRUE	TRUE	FALSE	Both axes on log10 scale
FALSE	FALSE	TRUE	FALSE	Y axis on log10 scale
TRUE	FALSE	FALSE	FALSE	Points colored by `Entidad`
TRUE	TRUE	FALSE	FALSE	Entity color + X axis log10
TRUE	FALSE	TRUE	FALSE	Entity color + Y axis log10
TRUE	TRUE	TRUE	FALSE	Entity color + both axes log10
TRUE/FALSE	—	—	TRUE	`log10()` applied to the data values (not the axis scale)

Key distinction: input$logx/input$logy use scale_x/y_continuous(trans = "log10"), which transforms the axis scale while keeping the original values in the tooltip. In contrast, input$log applies log10() directly inside aes(), transforming the data before plotting — so axis labels show the log-transformed values.
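
The difference between the two kinds of log control can be sketched on invented data. The function name construir_grafico() and its arguments are hypothetical, scale_x_log10() and scale_y_log10() are used as shorthand for a log10 axis transformation, and treating input$log as overriding the axis switches is an assumption. The sketch also appends each layer conditionally instead of enumerating branches, which is not how the server is currently written.

```r
library(ggplot2)

# Hypothetical helper: the flags stand in for input$ent, input$logx,
# input$logy and input$log.
construir_grafico <- function(df, x, y, ent = FALSE, logx = FALSE,
                              logy = FALSE, log = FALSE) {
  mapeo <- if (log) {
    aes(log10(.data[[x]]), log10(.data[[y]]))  # transforms the data values
  } else {
    aes(.data[[x]], .data[[y]])                # plots the original values
  }
  p <- ggplot(df, mapeo)
  p <- p + if (ent) geom_point(aes(color = Entidad)) else geom_point()
  if (!log && logx) p <- p + scale_x_log10()   # transforms only the axis scale
  if (!log && logy) p <- p + scale_y_log10()
  p
}

toy <- data.frame(
  pH            = c(7.1, 6.8, 7.4),
  Conductividad = c(300, 310, 350),
  Entidad       = c("Lab 1", "Lab 2", "Lab 1")
)

p_eje   <- construir_grafico(toy, "pH", "Conductividad", ent = TRUE, logx = TRUE)
p_datos <- construir_grafico(toy, "pH", "Conductividad", log = TRUE)
```

In p_eje the X axis is drawn on a log10 scale but still labelled in original units; in p_datos the axes are linear and show log10 values.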

Histogram (`output$scatterplot2`)

Uses only the X axis (input$x) to show the distribution of a single analyte. Also made interactive via ggplotly().

ggplot(subsetted1(), aes(.data[[input$x]],
                          text = paste0("Fracción: ", Fracción, "</br>",
                                        "Rótulo: ", Rótulo))) +
  geom_histogram() +
  geom_vline(xintercept = input$vert,  color = "red") +
  geom_vline(xintercept = input$vert2, color = "red") +
  ylab("N° de muestras")

Variants: combination of input$ent (entity coloring via fill = Entidad) and input$logx (log10 scale on X axis), yielding four branches.

Correlation table (`output$tablacorr`)

Displays the analytes whose Spearman coefficient with the selected X-axis analyte satisfies |r| > 0.5, sorted by the signed coefficient in descending order, so strong negative correlations appear at the bottom of the table:

output$tablacorr <- renderDT({
  req(input$x, correl())

  correl() %>%
    filter(.data[[input$x]] > 0.5 | .data[[input$x]] < -0.5) %>%
    select(Analisis, all_of(input$x)) %>%
    arrange(desc(.data[[input$x]])) %>%
    datatable()
})

The table header is dynamic (output$encabezado) and displays the name of the currently selected X analyte.
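
The filtering and ordering can be reproduced outside Shiny on an invented correlation table. Here x stands in for input$x, and abs() expresses the same condition as the two-sided comparison in the server code.

```r
library(dplyr)

# Invented coefficients for one analyte column.
correl_toy <- data.frame(
  Analisis = c("pH", "Conductividad", "Hierro", "Nitrato"),
  pH       = c(1.00, 0.80, -0.65, 0.10)
)
x <- "pH"  # stands in for input$x

tabla <- correl_toy %>%
  filter(abs(.data[[x]]) > 0.5) %>%   # keep |r| > 0.5
  select(Analisis, all_of(x)) %>%
  arrange(desc(.data[[x]]))           # signed order: negatives last

tabla$Analisis
```

The selected analyte itself always appears in the table, since its correlation with itself is 1.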

Complete workflow summary

User selects a matrix
        │
        ▼
datos_full() + analisis() + correl()   ← loaded / selected from memory
        │
        ▼
X/Y axis selectors and entity picker update (renderUI)
        │
User chooses analytes, types a label filter, selects entities
        │
        ▼
Clicks "Visualizar"
        │
        ▼
rotulos() / rotulos1()   ← label filter applied
        │
        ▼
subsetted() / subsetted1()   ← data filtered by entity and label
        │
        ├──► scatterplot   ("Dos variables" tab)  +  tablacorr
        └──► scatterplot2  ("Una variable" tab / histogram)

Design observations

Strengths

Clear separation of concerns: each reactive has a single role (loading, filtering, subsetting).
“Visualizar” button pattern: prevents the plot from recomputing on every keystroke, which is efficient with large datasets.
Upfront metadata loading: analisis_* vectors and correl_* matrices are loaded once at startup to minimize latency when switching matrices.
Consistent styling: bslib layout with a corporate color scheme applied uniformly via inline CSS.

Opportunities for refactoring (for future reference)

The ~10 if/else if branches in the scatter plot renderer could be simplified by building the base plot once and appending layers conditionally.
The three switch() blocks (data, analisis, correl) all share the same matrix list and could be consolidated into a helper function or a named lookup list.
The feather package is loaded but never used and can be removed.
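
As a sketch of the lookup idea, a single named vector could hold the file key of each matrix, and a small helper could derive every file name from it. The names claves and archivo_de() are hypothetical, only the two keys visible in the data-loading code are listed, and using the same key for the correl_ and analisis_ files is an assumption based on the name patterns.

```r
# Hypothetical lookup: matrix name -> file key.
claves <- c(
  "AGUA PROCESO"     = "agupr",
  "AGUA SUPERFICIAL" = "super_agua"
)

# Builds a file name for a matrix; prefijo is "", "correl_" or "analisis_".
archivo_de <- function(matriz, prefijo = "") {
  clave <- claves[[matriz]]  # errors if the matrix is not in the table
  paste0(prefijo, clave, ".rds")
}

archivo_de("AGUA PROCESO")
archivo_de("AGUA SUPERFICIAL", prefijo = "correl_")
```

Inside the server, a call such as read_rds(archivo_de(input$matrices)) would then take the place of a switch() block, and adding a matrix would mean adding one entry to claves.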
