The .rds files consumed by the application are produced by two functions defined in carga.R: carga_datos() for the analytical data tables and correlaciones() for the Spearman correlation matrices. Both share a common data preparation pipeline applied to a raw CSV exported from the LIMS.
Common pipeline
1. Column selection and renaming
The CSV is read with data.table::fread() for speed. Only the relevant columns are kept, and Resultado convertido is renamed to Resultado_conv.
datos <- fread(file = archivo) %>%
  select(
    Fracción, `Tipo de producto`, Matriz, Rótulo, Entidad, Análisis,
    Resultado, `Modificador de resultado`, `Límite de detección`,
    `Límite de cuantificación`, `Resultado convertido`, `Unidad inicial`
  ) %>%
  rename(Resultado_conv = `Resultado convertido`) %>%
  mutate(Resultado_conv = as.numeric(Resultado_conv))
2. Handling censored values
Results flagged as below the detection or quantification limit are replaced by the corresponding limit value:
datos <- datos %>%
  mutate(Resultado_conv = case_when(
    `Modificador de resultado` == "nd" ~ as.numeric(`Límite de detección`),
    `Modificador de resultado` == "<" ~ as.numeric(`Límite de cuantificación`),
    .default = Resultado_conv
  ))
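As a minimal sketch of this substitution, the same case_when() rule can be run on a few made-up rows. All values below are invented for illustration and do not come from the LIMS export; rows with no modifier fall through to .default.

```r
library(dplyr)

# Toy rows; every value here is hypothetical.
toy <- tibble(
  `Modificador de resultado` = c("nd", "<", NA),
  `Límite de detección` = c("0.01", "0.01", "0.01"),
  `Límite de cuantificación` = c("0.05", "0.05", "0.05"),
  Resultado_conv = c(NA, NA, 1.2)
)

# Same rule as in the pipeline: "nd" takes the detection limit,
# "<" takes the quantification limit, anything else is kept as-is.
toy <- toy %>%
  mutate(Resultado_conv = case_when(
    `Modificador de resultado` == "nd" ~ as.numeric(`Límite de detección`),
    `Modificador de resultado` == "<" ~ as.numeric(`Límite de cuantificación`),
    .default = Resultado_conv
  ))
```

Note that the .default argument of case_when() requires dplyr 1.1.0 or later.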
3. Unit conversion for scaled results
Some rows carry results expressed with a power-of-10 multiplier encoded in Unidad inicial (e.g. "x1³" means ×10³). These rows are separated, rescaled, and rejoined with the rest:
dat1 <- datos %>% filter(str_detect(`Unidad inicial`, "x1"))
supin <- str_extract(dat1$`Unidad inicial`, "\\W") # extract the superscript character
dat1 <- dat1 %>%
  mutate(y = as.numeric(as.factor(supin))) %>% # encode as integer 1–6
  mutate(Resultado = case_when(
    y == 1 ~ Resultado * 1e3,
    y == 2 ~ Resultado * 1e4,
    y == 3 ~ Resultado * 1e5,
    y == 4 ~ Resultado * 1e6,
    y == 5 ~ Resultado * 1e7,
    y == 6 ~ Resultado * 1e8
  ))
datos <- datos %>% filter(!str_detect(`Unidad inicial`, "x1"))
datos <- full_join(dat1, datos)
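The integer encoding above depends on how as.factor() orders the superscript characters that happen to be present in the data. A possible alternative, shown here only as a sketch and not taken from carga.R, is an explicit lookup from superscript to exponent. The names exponentes and escalar are hypothetical, the superscript set (³ to ⁸) is assumed from the 1e3 to 1e8 multipliers, and the sketch simplifies the extraction by reading the last character of the unit string.

```r
# Hypothetical lookup from superscript character to exponent.
# The set of superscripts is an assumption based on the 1e3..1e8 multipliers.
exponentes <- setNames(
  3:8,
  c("\u00b3", "\u2074", "\u2075", "\u2076", "\u2077", "\u2078")
)

escalar <- function(resultado, unidad) {
  # Assumption: the superscript is the last character of the unit string.
  sup <- substr(unidad, nchar(unidad), nchar(unidad))
  unname(resultado * 10^exponentes[sup])
}

# Example with the "x1³" notation used above and a made-up "x1⁶" unit.
escalados <- escalar(c(2, 5), c("x1\u00b3", "x1\u2076"))
```

With a lookup like this, an unexpected superscript yields NA instead of silently shifting the other codes.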
4. Result consolidation and NA handling
Resultado_conv is the primary numeric value. Missing values are first replaced with zero so that they can be compared; when Resultado_conv is zero (i.e. originally missing), it falls back to Resultado. Zeros are then converted back to NA to mark missing data, and rows still missing a result are dropped:
datos <- datos %>%
  select(Fracción, `Tipo de producto`, Matriz, Rótulo, Entidad, Análisis,
         Resultado, Resultado_conv, `Unidad inicial`) %>%
  replace(is.na(.), 0) %>%
  mutate(Resultado_conv = ifelse(Resultado_conv == 0, Resultado, Resultado_conv)) %>%
  replace(. == 0, NA) %>%
  select(Fracción, `Tipo de producto`, Matriz, Rótulo, Entidad, Análisis,
         Resultado_conv, `Unidad inicial`) %>%
  drop_na()
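One consequence of using zero as a sentinel is that a genuinely measured zero is also treated as missing. The sketch below reproduces the fallback on invented values and compares it with dplyr::coalesce(), an alternative that is not used in carga.R and is shown only for contrast.

```r
library(dplyr)

# Toy values (hypothetical): the last row has a real zero in Resultado.
toy <- tibble(
  Resultado      = c(10, 20, NA, 0),
  Resultado_conv = c(1.5, NA, NA, NA)
)

# Sentinel approach described above: NA -> 0, fall back, 0 -> NA.
sentinel <- toy %>%
  replace(is.na(.), 0) %>%
  mutate(Resultado_conv = ifelse(Resultado_conv == 0, Resultado, Resultado_conv)) %>%
  replace(. == 0, NA)

# Alternative (not from carga.R): coalesce() takes the first non-NA value,
# so the measured zero in the last row is preserved.
alternativa <- toy %>%
  mutate(Resultado_conv = coalesce(Resultado_conv, Resultado))
```

Whether a zero result should count as missing is a domain decision; the sketch only makes the difference between the two approaches visible.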
5. Unit harmonization
Three units are converted to a common scale before pivoting:
| Original unit | Multiplied by | Resulting unit |
|---|---|---|
| g/l | ×1 000 | mg/l |
| mS/cm | ×1 000 | µS/cm |
| g/kg | ×1 000 | mg/kg |
datos <- datos %>%
  mutate(Resultado_conv = case_when(
    `Unidad inicial` == "g/l" ~ Resultado_conv * 1000,
    `Unidad inicial` == "mS/cm" ~ Resultado_conv * 1000,
    `Unidad inicial` == "g/kg" ~ Resultado_conv * 1000,
    .default = Resultado_conv
  ))
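The same conversion can be written as a lookup of factors per unit, which may be easier to extend if more units are added later. This is a sketch only: factores and armonizar are hypothetical names, and the input values are invented.

```r
# Hypothetical lookup: conversion factor per original unit.
factores <- c("g/l" = 1000, "mS/cm" = 1000, "g/kg" = 1000)

armonizar <- function(valor, unidad) {
  f <- unname(factores[unidad])
  f[is.na(f)] <- 1  # units not listed in the lookup are left unchanged
  valor * f
}

# "mg/l" is not in the lookup, so its value passes through untouched.
convertidos <- armonizar(c(2, 3, 4), c("g/l", "mg/l", "mS/cm"))
```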
6. Pivot to wide format
Each unique value of Análisis becomes a column; the cell values are the consolidated Resultado_conv:
datos <- pivot_wider(datos, names_from = Análisis, values_from = Resultado_conv)
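To illustrate the reshaping, here is a small sketch on an invented long-format table. The fraction codes and values are hypothetical; combinations that do not occur in the long table become NA in the wide one.

```r
library(tidyr)

# Toy long-format data (hypothetical values).
largo <- tibble::tibble(
  Fracción = c("A-1", "A-1", "B-1"),
  Análisis = c("pH", "Cu", "pH"),
  Resultado_conv = c(7.1, 0.3, 6.8)
)

# One column per analyte; "B-1" has no Cu result, so that cell is NA.
ancho <- pivot_wider(largo, names_from = Análisis, values_from = Resultado_conv)
```

pivot_wider() treats every column that is not pivoted as an identifier, so rows are kept separate whenever any of those columns differ.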
7. Fraction aggregation
The last digit of the Fracción code is standardised to "1" to group sub-fractions. Within each group, numeric columns are summed and Rótulo/Entidad are taken from the first row:
datos <- datos %>%
  mutate(Fracción = str_replace_all(Fracción, "\\d$", "1")) %>%
  replace(is.na(.), 0) %>%
  group_by(Fracción) %>%
  summarise(
    Rótulo = first(Rótulo),
    Entidad = first(Entidad),
    across(where(is.numeric), sum)
  ) %>%
  ungroup()
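A short sketch of the aggregation on invented rows follows. The fraction code format ("1001-1", "1001-2") is an assumption made for the example, and the NA replacement is restricted to numeric columns here, which is a small variation on the code above.

```r
library(dplyr)
library(stringr)

# Toy wide-format rows; codes and values are hypothetical.
toy <- tibble(
  Fracción = c("1001-1", "1001-2", "2002-1"),
  Rótulo = c("Muestra A", "Muestra A", "Muestra B"),
  Entidad = c("E1", "E1", "E2"),
  Cu = c(0.3, NA, 1.1),
  pH = c(NA, 7.2, 6.9)
)

agregado <- toy %>%
  # "1001-2" becomes "1001-1", so both sub-fractions share one group.
  mutate(Fracción = str_replace_all(Fracción, "\\d$", "1")) %>%
  # NA -> 0 so that sum() combines the partial rows.
  mutate(across(where(is.numeric), ~ replace(.x, is.na(.x), 0))) %>%
  group_by(Fracción) %>%
  summarise(
    Rótulo = first(Rótulo),
    Entidad = first(Entidad),
    across(where(is.numeric), sum)
  ) %>%
  ungroup()
```

Because values are summed, an analyte reported in more than one sub-fraction of the same group would be added together rather than averaged.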
8. Entity ordering and final NA restoration
Entities are sorted alphabetically and stored as a factor whose levels follow that order (factor() with explicit levels, not an ordered factor). Zeros introduced during aggregation are converted back to NA:
r <- sort(unique(datos$Entidad))
datos <- datos %>%
  mutate(Entidad = factor(Entidad, levels = r)) %>%
  replace(. == 0, NA)
carga_datos(): saving the analytical table
After the common pipeline, the wide-format tibble is saved directly as an .rds file ready for the application:
saveRDS(datos, guardado)
correlaciones(): computing and saving the Spearman matrix
After the same pipeline, correlaciones() additionally:
Drops the Fracción, Rótulo, and Entidad identifier columns to keep only the analyte columns.
Removes any analyte with ≤1 non-NA observation (required for a valid correlation estimate).
Computes the pairwise Spearman correlation matrix with cor(..., use = "pairwise.complete.obs").
Rounds values to 2 decimal places, adds an Analisis name column, and saves as .rds.
data <- datos %>% select(-(1:3))

# Keep only analytes with more than one observation
x <- sapply(1:ncol(data), function(i) data %>% filter(!is.na(data[, i])) %>% nrow())
data1 <- data[, which(x > 1)]

matriz <- cor(data1, method = "spearman", use = "pairwise.complete.obs")
correl <- as_tibble(matriz) %>% round(2)
correl <- correl %>% mutate(Analisis = names(correl))
write_rds(correl, archivo_output)
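The steps above can be sketched on a toy table in base R. The values are invented, n_obs is a hypothetical name, and colSums(!is.na(...)) is used here as a vectorised way of counting observations rather than the sapply() loop from carga.R.

```r
# Toy analyte table (hypothetical values); column C has a single observation.
data <- data.frame(
  A = c(1, 2, 3, 4, NA),
  B = c(2, 4, 5, 9, 1),
  C = c(NA, NA, 3, NA, NA)
)

# Count non-NA observations per column and keep those with more than one.
n_obs <- colSums(!is.na(data))
data1 <- data[, n_obs > 1, drop = FALSE]

# Pairwise Spearman correlation: each pair uses only rows where both are present.
matriz <- round(cor(data1, method = "spearman", use = "pairwise.complete.obs"), 2)
```

With use = "pairwise.complete.obs", different cells of the matrix can be based on different numbers of samples, which is worth keeping in mind when comparing coefficients.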
Note: correlaciones() also writes an analisis_*.rds file containing the vector of analyte names. In carga.R this filename is hard-coded ("analisis_trucha_musc.rds"); in practice it would need to be parameterised the same way archivo_output is.
User Interface (UI)
The UI is built with bslib::page_navbar(), which produces a top navigation bar with tabs and a collapsible sidebar.
General layout
page_navbar
├── nav_panel("Dos variables") ← Tab: scatter plot + correlation table
├── nav_panel("Una variable") ← Tab: histogram
├── nav_item: selectizeInput(matrices) ← Matrix selector (embedded in navbar)
├── nav_item: uiOutput(selec_analisis_x) ← X-axis selector (dynamic)
├── nav_item: conditionalPanel ← Y-axis selector (only in "Dos variables")
└── sidebar
├── ("Una variable" panel) textInput(rot1) + actionBttn(buscar1)
├── ("Dos variables" panel) textInput(rot) + actionBttn(buscar)
├── checkboxInput(log) ← Apply log10() to BOTH axes in the data values
├── checkboxInput(logy) ← Log10 axis scale on Y (axis transformation)
├── checkboxInput(logx) ← Log10 axis scale on X (axis transformation)
├── numericInput(vert, vert2) ← Vertical reference lines at x
├── ("Dos variables") numericInput(hor, hor2) ← Horizontal reference lines at y
├── checkboxInput(ent) ← Color points/bars by entity
├── uiOutput(selec_entidades) ← Multi-select entity picker (dynamic)
└── ("Dos variables") numericInput(pendiente, ordenada, ...) ← Manual regression lines
Conditional panels
conditionalPanel() is used to show or hide controls based on the active tab (input.nav):
“Dos variables” controls (Y-axis selector, horizontal lines, slope/intercept inputs) appear only in that tab.
“Una variable” controls (label filter for the histogram) appear only in that tab.
Dynamic widgets
Three UI outputs are rendered dynamically by the server because they depend on the loaded data:
| Output | Widget produced | Depends on |
|---|---|---|
| selec_analisis_x | selectizeInput (X axis) | analisis(): analyte names for the matrix |
| selec_analisis_y | selectizeInput (Y axis) | analisis(): same as above |
| selec_entidades | pickerInput (multi-select) | datos_full()$Entidad: unique entities in the data |
All three depend on reactives declared as eventReactive(input$matrices, {...}), which recompute only when the user changes the matrix and use switch() to map the matrix name to the corresponding dataset or preloaded object.
Note: The analisis_* and correl_* objects are loaded once at session startup (outside the server function). The analisis() and correl() reactives simply select the already-loaded object from memory via switch(), without reading files on every matrix change.
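A minimal sketch of this selection pattern, written as a plain function so it can run outside Shiny. The object names (analisis_agua, analisis_suelo), the matrix labels, and the helper seleccionar_analisis are hypothetical; in the app the switch() call would sit inside the reactive.

```r
# Hypothetical objects standing in for the vectors read once at startup.
analisis_agua <- c("pH", "Cu", "Zn")
analisis_suelo <- c("pH", "Pb")

# Select an already-loaded object by matrix name; no file is read here.
seleccionar_analisis <- function(matriz) {
  switch(matriz,
    "Agua" = analisis_agua,
    "Suelo" = analisis_suelo,
    stop("Unknown matrix: ", matriz)
  )
}

nombres <- seleccionar_analisis("Agua")
```

The final unnamed argument of switch() acts as the fallback, so an unexpected matrix name fails loudly instead of returning NULL.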
Label filtering: rotulos() and rotulos1()
Both are eventReactive triggered by the “Visualizar” button (input$buscar or input$buscar1). This is an intentional design choice: the plot does not update as the user types, only when the button is clicked.
subsetted() feeds the scatter plot; subsetted1() feeds the histogram.
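The button-gated behaviour can be sketched with a stripped-down server function. What rotulos() actually returns in the app is not shown in this document, so returning the raw text of input$rot is an assumption made for the example.

```r
library(shiny)

server <- function(input, output, session) {
  # Re-evaluates only when input$buscar changes. input$rot is read in
  # isolation, so typing in the text box alone does not invalidate it.
  rotulos <- eventReactive(input$buscar, {
    input$rot  # assumption: the app may post-process this value
  })
}
```

With shiny::testServer() one can check that changing input$rot without changing input$buscar leaves rotulos() at its previous value.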
Visualizations
Scatter plot (output$scatterplot)
Built with ggplot2 and made interactive with ggplotly(). The plot only updates when input$buscar is pressed, because it depends on rotulos().
Layers always present
ggplot(subsetted(), aes(.data[[input$x]], .data[[input$y]],
                        text = paste0("Fracción: ", Fracción, "</br>",
                                      "Rótulo: ", Rótulo))) +
  geom_point() +
  geom_vline(xintercept = input$vert, color = "red") +
  geom_vline(xintercept = input$vert2, color = "red") +
  geom_hline(yintercept = input$hor, color = "red") +
  geom_hline(yintercept = input$hor2, color = "red") +
  geom_abline(slope = input$pendiente, intercept = input$ordenada, color = "green") +
  geom_abline(slope = input$pendiente2, intercept = input$ordenada2, color = "green")
Variants based on active controls
The server covers every combination of three boolean inputs (input$ent, input$logx, input$logy) via explicit if/else if branches, plus separate branches for input$log:

| input$ent | input$logx | input$logy | input$log | Effect |
|---|---|---|---|---|
| FALSE | FALSE | FALSE | FALSE | Standard plot, no entity coloring |
| FALSE | TRUE | FALSE | FALSE | X axis on log10 scale |
| FALSE | TRUE | TRUE | FALSE | Both axes on log10 scale |
| FALSE | FALSE | TRUE | FALSE | Y axis on log10 scale |
| TRUE | FALSE | FALSE | FALSE | Points colored by Entidad |
| TRUE | TRUE | FALSE | FALSE | Entity color + X axis log10 |
| TRUE | FALSE | TRUE | FALSE | Entity color + Y axis log10 |
| TRUE | TRUE | TRUE | FALSE | Entity color + both axes log10 |
| TRUE/FALSE | any | any | TRUE | log10() applied to the data values (not the axis scale) |
Key distinction: input$logx/input$logy use scale_x/y_continuous(trans = "log10"), which transforms the axis scale while keeping the original values in the tooltip. In contrast, input$log applies log10() directly inside aes(), transforming the data before plotting, so axis labels show the log-transformed values.
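The two approaches can be compared side by side on a toy data frame. This is an illustrative sketch with invented values; p_escala and p_datos are hypothetical names.

```r
library(ggplot2)

df <- data.frame(x = c(1, 10, 100), y = c(2, 4, 8))

# Axis transformation: the mapped variable stays in original units and
# the scale places it on a log10-spaced axis.
p_escala <- ggplot(df, aes(x, y)) +
  geom_point() +
  scale_x_continuous(trans = "log10")

# Data transformation: log10() is applied to the values themselves,
# so the axis is linear in log units.
p_datos <- ggplot(df, aes(log10(x), y)) +
  geom_point()
```

Both plots place the points at the same positions; what differs is the axis labelling and the values carried by the mapped variable.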
Histogram (output$scatterplot2)
Uses only the X axis (input$x) to show the distribution of a single analyte. Also made interactive via ggplotly().
ggplot(subsetted1(), aes(.data[[input$x]],
                         text = paste0("Fracción: ", Fracción, "</br>",
                                       "Rótulo: ", Rótulo))) +
  geom_histogram() +
  geom_vline(xintercept = input$vert, color = "red") +
  geom_vline(xintercept = input$vert2, color = "red") +
  ylab("N° de muestras")
Variants: combination of input$ent (entity coloring via fill = Entidad) and input$logx (log10 scale on X axis), yielding four branches.
Correlation table (output$tablacorr)
Displays the analytes most correlated with the selected X-axis analyte, filtered to |r| > 0.5 and sorted in descending order.
The table header is dynamic (output$encabezado) and displays the name of the currently selected X analyte.
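The rendering code for this table is not reproduced here. As a rough sketch of the filtering it describes, assuming a tibble in the shape saved by correlaciones() (one column per analyte plus an Analisis name column), the logic could look like the following. The toy values, the name x_sel, sorting by signed r, and the exclusion of the analyte's correlation with itself are all assumptions.

```r
library(dplyr)

# Trimmed toy correlation tibble; values are invented.
correl <- tibble(
  pH = c(1.00, 0.72, -0.61, 0.10),
  Cu = c(0.72, 1.00, 0.30, 0.05),
  Analisis = c("pH", "Cu", "Zn", "Pb")
)

x_sel <- "pH"  # stands in for input$x

tabla <- correl %>%
  select(Analisis, r = all_of(x_sel)) %>%          # column of the selected analyte
  filter(Analisis != x_sel, abs(r) > 0.5) %>%      # drop self-correlation, keep |r| > 0.5
  arrange(desc(r))                                 # descending by signed r
```

Sorting by abs(r) instead would group strong negative correlations with strong positive ones; which ordering the app uses is not stated above.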
Complete workflow summary
User selects a matrix
│
▼
datos_full() + analisis() + correl() ← loaded / selected from memory
│
▼
X/Y axis selectors and entity picker update (renderUI)
│
User chooses analytes, types a label filter, selects entities
│
▼
Clicks "Visualizar"
│
▼
rotulos() / rotulos1() ← label filter applied
│
▼
subsetted() / subsetted1() ← data filtered by entity and label
│
├──► scatterplot ("Dos variables" tab) + tablacorr
└──► scatterplot2 ("Una variable" tab / histogram)
Design observations
Strengths
Clear separation of concerns: each reactive has a single role (loading, filtering, subsetting).
“Visualizar” button pattern: prevents the plot from recomputing on every keystroke, which avoids repeated re-rendering with large datasets.
Upfront metadata loading: analisis_* vectors and correl_* matrices are loaded once at startup to minimize latency when switching matrices.
Consistent styling: bslib layout with a corporate color scheme applied uniformly via inline CSS.
Opportunities for refactoring (for future reference)
The ~10 if/else if branches in the scatter plot renderer could be simplified by building the base plot once and appending layers conditionally.
The three switch() blocks (data, analisis, correl) all share the same matrix list and could be consolidated into a helper function or a named lookup list.
The feather package is loaded but never used and can be removed.
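As an illustration of the first refactoring point, the branch structure could be replaced by one base plot with layers appended conditionally. This is a sketch under assumptions, not the app's code: construir_grafico is a hypothetical helper, its arguments mirror the inputs described earlier, and the reference lines and input$log handling are omitted for brevity.

```r
library(ggplot2)

# Hypothetical helper: build the base plot once, then add layers as needed.
construir_grafico <- function(df, x, y, ent = FALSE, logx = FALSE, logy = FALSE) {
  p <- ggplot(df, aes(.data[[x]], .data[[y]]))
  # Entity coloring is the only difference in the point layer.
  p <- if (ent) p + geom_point(aes(color = Entidad)) else p + geom_point()
  # Each axis transformation is independent of the others.
  if (logx) p <- p + scale_x_continuous(trans = "log10")
  if (logy) p <- p + scale_y_continuous(trans = "log10")
  p
}

# Toy data (invented values).
df <- data.frame(
  Cu = c(1, 10, 100),
  Zn = c(2, 20, 200),
  Entidad = c("E1", "E2", "E1")
)
p <- construir_grafico(df, "Cu", "Zn", ent = TRUE, logx = TRUE)
```

Each control then maps to one conditional line, so adding a new option no longer doubles the number of branches.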