StatsWithR StatDesk User Guide

Purpose and Scope

StatsWithR StatDesk is a browser-based statistical analysis application designed for transparent applied statistics tasks. Users can import or enter data, label variables, run common statistical procedures, view exportable output, create plots, and compare methods with analogous R code examples.

This guide is intended for students, instructors, researchers, and applied analysts who need a practical reference for using the application. It focuses on routine operation and interpretation support rather than replacing a statistics textbook, a full programming environment, or a formal software validation report.

Appropriate Uses

  • Teaching introductory and applied statistics with visible analysis steps.
  • Exploring small to moderate research datasets directly in a browser.
  • Creating preliminary tables, plots, and output for review or documentation.

Uses That Require Extra Verification

StatDesk should not be relied on as the sole source for clinical, regulatory, legal, financial, publication-critical, or other high-stakes decisions. Users remain responsible for data quality review, study design, model assumptions, interpretation, reporting, and independent verification.

Basic Operating Sequence

  1. Load data by entering values in the grid, selecting Load iris sample for practice, or importing a supported file.
  2. Inspect the data in Data View and confirm that variables are recognized with the intended type.
  3. Clean the dataset by editing cells, adding labels, recoding missing indicators, or creating calculated variables.
  4. Choose an analysis from the Analyses panel or use the analysis search field to locate a method.
  5. Complete the Analysis Setup fields. Required fields must be selected before the analysis can run.
  6. Run the analysis and review the Output tab.
  7. Create supporting plots from the Plots tab when graphical review would help.
  8. Save the session in the browser or export an application state file if the work should be preserved.

Interface Overview

The interface is organized around a Variables panel, an Analyses panel, and a main work area with tabs. The top status area shows autosave status, CPU proxy information, memory information, and a feedback/donation link.

StatDesk interface tabs and their functions
Tab or area | Role in StatDesk
Data View | Primary workspace for entering, editing, importing, exporting, and clearing data.
Plots | Canvas-based plotting area with plot-type, variable, title, axis-label, draw, export, and clear controls.
Analysis Setup | Configuration area for the currently selected analysis, including required inputs and analogous R code when available.
Output | Results area with controls to clear output or export output as a PDF.
Save | Session management area for browser save, application state file export, saved state import, and browser-session clearing.
Accuracy tests | Validation-check area with controls to run validation tests or clear test results.
About | Application description, purpose, supported analysis areas, limitations, version, citation, and related notes.
Help | Quick-start guidance, data notes, formula reference, supported file formats, and legal/use disclaimer.

Data View and Dataset Management

Entering Data Directly

In Data View, use Edit data to edit cells directly in the grid. Direct editing is most useful for small datasets, classroom demonstrations, quick examples, or small corrections after import. Inline plus and minus controls add or remove rows and variables.

  • Use direct entry when the dataset is small enough to visually inspect.
  • After manual edits, use Undo or Redo for recent changes when needed.
  • Use Clear data only when you are ready to remove the current dataset from the active session.

Loading the Iris Sample

Load iris sample is useful for practice because it provides a familiar dataset with numeric flower measurements and a categorical species variable. It is a good way to test Data View, plots, descriptive statistics, grouped summaries, correlations, and simple modeling tools without importing a file.

Importing Files

Use Import data to load an external file into the browser session. Supported formats are CSV, TSV/TXT, JSON, StatDesk JSON, XLSX, Excel XML/HTML tables, SPSS SAV, Stata DTA, SPSS syntax, and Stata do-files.

Supported StatDesk import formats
Import format | Description | Practical check
CSV | Comma-separated plain text data. | Use for broad compatibility. Check delimiter, headers, missing codes, and numeric parsing.
TSV/TXT | Tab-separated or text data. | Useful when commas appear in text fields. Confirm the file structure after import.
JSON | Structured data in JSON format. | Useful for programming tasks. Confirm the row/column structure imported as intended.
StatDesk JSON | StatDesk-preserving file format. | Use when returning to StatDesk with labels and variable type information.
XLSX | Excel workbook file. | Check sheet choice, header row, date parsing, and blank rows.
Excel XML/HTML tables | Spreadsheet-style XML or HTML table sources. | Useful for table-based exports from other systems.
SPSS SAV | Native SPSS data file. | Native SPSS import is still not fully supported. Verify labels, missingness, and coding.
Stata DTA | Native Stata data file. | Native Stata import is still not fully supported. Verify labels, missingness, and coding.
SPSS syntax | SPSS syntax file capable of rebuilding data and labels. | Useful for label-preserving SPSS handoff.
Stata do-file | Stata do-file capable of rebuilding data and labels. | Useful for label-preserving Stata handoff.

Checking Imported Data

  • Confirm that the expected number of rows and variables is present.
  • Review the first few rows and several later rows to catch header or parsing problems.
  • Confirm that numeric variables were not imported as text because of commas, symbols, or nonnumeric missing codes.
  • Confirm that categorical variables have clean, intentional levels.
  • Check whether dates, identifiers, and coded variables should be treated as numeric, categorical, or text-like fields.
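Some of these checks can also be scripted outside StatDesk before a file is imported. The following is a minimal Python sketch, not part of the application. The helper name profile_csv, the sample text, and the column names are hypothetical, and StatDesk's own parsing rules may differ.

```python
import csv
import io

def profile_csv(text):
    """Return the row count, column names, and non-numeric values per column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    non_numeric = {}
    for col in columns:
        bad = []
        for row in rows:
            value = (row[col] or "").strip()
            if value == "":
                continue  # empty cells count as missing, not as text
            try:
                float(value)
            except ValueError:
                bad.append(value)  # commas, symbols, or missing codes such as NA
        non_numeric[col] = bad
    return len(rows), columns, non_numeric

# Hypothetical sample: "1,200" and "NA" would stop a column from parsing as numeric.
sample = 'id,age,income\n1,34,52000\n2,41,"1,200"\n3,NA,48000\n'
n_rows, columns, non_numeric = profile_csv(sample)
print(n_rows, columns)
print(non_numeric)
```

Any column that reports values in non_numeric is a candidate for missing-value recoding or cleanup before it is used as a numeric variable.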

Variable Types and Reordering

The Variables panel shows available variables and supports reordering variables from the sidebar. Variable type settings matter because analysis setup screens and plots depend on whether a variable is treated as numeric, categorical, or otherwise eligible for a given role.

Variable Labels

Variable labels make output and exported files easier to understand. Use Variable labels to provide readable descriptions for variables whose raw names are abbreviated, technical, or coded. Labels are especially useful when exporting StatDesk JSON, SPSS syntax, or Stata do-files because those formats are intended to preserve labels.

Missing Value Recoding

The Missing values tool converts selected values to empty cells. The interface accepts missing indicators separated by commas or line breaks. Users can apply recoding to all variables, numeric variables only, or a selected variable only. A case-sensitive matching option is available, along with Preview count and Recode to empty cells controls.

Missing-value recoding controls
Control | How to use it
Missing indicators | Values such as NA, N/A, 999, -99, refused, or unknown that should be treated as missing rather than real data.
Apply to all variables | Use only when the same missing indicators have the same meaning throughout the dataset.
Apply to numeric variables only | Useful when numeric placeholder codes should be removed but text categories should be preserved.
Apply to selected variable only | Best for targeted cleanup when one variable has special coding.
Case-sensitive matching | Use when uppercase and lowercase values should be treated as different values.
Preview count | Review how many values will be changed before committing the recode.
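The interaction between recode scope, case sensitivity, and the preview count can be illustrated with a short Python sketch. This is an illustration of the logic described above, not StatDesk's implementation. The function name recode_missing and the sample rows are hypothetical.

```python
def recode_missing(rows, indicators, columns=None, case_sensitive=False):
    """Return (new_rows, count) with matching values replaced by empty strings."""
    def norm(value):
        text = str(value).strip()
        return text if case_sensitive else text.lower()

    targets = {norm(i) for i in indicators}
    new_rows, count = [], 0
    for row in rows:
        new_row = dict(row)
        # columns=None means "apply to all variables"
        for col in (columns if columns is not None else row.keys()):
            if norm(row[col]) in targets:
                new_row[col] = ""
                count += 1
        new_rows.append(new_row)
    return new_rows, count

data = [
    {"age": "34", "status": "employed"},
    {"age": "999", "status": "Refused"},
    {"age": "NA", "status": "unknown"},
]
indicators = ["NA", "999", "refused", "unknown"]

# Preview: count the matches first and discard the recoded rows.
_, preview_all = recode_missing(data, indicators)
_, preview_age = recode_missing(data, indicators, columns=["age"])
_, preview_cs = recode_missing(data, indicators, case_sensitive=True)
print(preview_all, preview_age, preview_cs)
```

In this example the count drops when the scope is limited to one variable, and again when case-sensitive matching stops "Refused" from matching "refused". A preview count that differs from expectation is usually explained by one of these two settings.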

Calculated Variables

Calculated variables create new variables from formulas. The interface includes fields for a new variable name, optional variable label, formula, and a Create calculated variable button. Formulas use R-like syntax. Simple variable names can be written bare; variable names that contain spaces must be wrapped in brackets.

Formula Syntax

Calculated-variable formula syntax
Formula feature | Details
Variable references | Use bare names such as age. Use brackets for names with spaces, such as [pre score].
Arithmetic operators | +, -, *, /, ^, %%, %/%, and parentheses. JavaScript-style ** also works, but ^ is recommended for R-like exponent syntax.
Comparisons | Use ==, !=, >, >=, <, and <= to create logical conditions.
Logical operators | Use R-style &, |, ! or JavaScript-style && and ||.
Math functions | abs, sqrt, log, log10, exp, round, floor, ceiling/ceil, pow, sin, cos, and tan.
Row and column summaries | min, max, pmin, pmax, sum, mean, median, sd, and var work across supplied values. mean(variable) summarizes a column; mean(item1, item2, item3) summarizes row values.
Conditional logic | Use ifelse(condition, value_if_true, value_if_false).
Missing and text functions | missing/is.na, notmissing, as.numeric/asNumber, nchar/len, tolower/lower, toupper/upper, grepl, contains, paste, and paste0.

Formula Examples

  • pre_score - post_score creates a change score when lower post-test values indicate improvement.
  • [pre score] - [post score] does the same thing when variable names contain spaces.
  • ifelse(age >= 18, 1, 0) creates an adult indicator.
  • ifelse(group == "Treatment", 1, 0) creates a treatment-group indicator.
  • mean(item1, item2, item3) creates a row-wise mean across three items.
  • ifelse(is.na(score), 0, score) replaces missing score values with 0 in the new calculated variable.
  • paste0(group, "_", score) combines text and values into a new text-like field.
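For readers who want to see what these formulas compute row by row, the following Python sketch reproduces four of them on two hypothetical records. The ifelse helper is a stand-in written for this example, and missing values are represented as None. How StatDesk itself treats missing values inside arithmetic or comparisons is not covered here and should be checked in the application.

```python
from statistics import mean

def ifelse(condition, value_if_true, value_if_false):
    """Python stand-in for the formula function ifelse()."""
    return value_if_true if condition else value_if_false

# Hypothetical records; None marks a missing value.
rows = [
    {"pre score": 12, "post score": 9, "age": 17, "score": None},
    {"pre score": 15, "post score": 15, "age": 18, "score": 7},
]

for row in rows:
    # [pre score] - [post score]
    row["change"] = row["pre score"] - row["post score"]
    # ifelse(age >= 18, 1, 0)
    row["adult"] = ifelse(row["age"] >= 18, 1, 0)
    # ifelse(is.na(score), 0, score)
    row["score0"] = ifelse(row["score"] is None, 0, row["score"])
    # row-wise mean of two values, as in mean(item1, item2)
    row["item_mean"] = mean([row["pre score"], row["post score"]])

print([(r["change"], r["adult"], r["score0"], r["item_mean"]) for r in rows])
```

Each calculated value is produced from the values in the same row, which is why a formula such as mean(item1, item2, item3) gives one result per row rather than a single column summary.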

Exporting Data

Use Export data to download the current dataset. The interface asks for a file name without an extension, and the extension is added automatically. The export options are CSV, TSV, JSON rows, StatDesk JSON + labels, HTML table, Excel XML, SPSS syntax + labels, and Stata do-file + labels.

StatDesk data export options
Export option | Purpose | When to choose it
CSV | General-purpose file for spreadsheets and statistical software. | Use when labels are not needed or will be rebuilt elsewhere.
TSV | Tab-separated version of the dataset. | Useful when text values contain commas.
JSON rows | Row-oriented JSON data. | Useful for programming tasks.
StatDesk JSON + labels | Portable StatDesk-preserving file. | Best export when you want to reload data, labels, and variable settings in StatDesk.
HTML table | Readable table format. | Useful for review or web-style table output.
Excel XML | Spreadsheet-compatible XML format. | Useful when a spreadsheet-readable file is preferred.
SPSS syntax + labels | Syntax that rebuilds data and labels in SPSS. | Use for label-preserving SPSS handoff.
Stata do-file + labels | Do-file that rebuilds data and labels in Stata. | Use for label-preserving Stata handoff.

Saving, Restoring, and Privacy

The Save tab supports both browser-based session saving and portable state-file saving. The interface displays autosave status and an Estimated application state file size value. It states that the application state includes data, output, labels, settings, plot state, selected analysis, paging, tabs, and StatDesk version information.

StatDesk save and restore features
Save/restore feature | Description
Autosave | Restores the most recent session after refresh or browser restart when browser storage is available.
Save session in this browser | Stores the current session locally in the current browser.
Save application state file | Downloads a compact portable JSON file that can be kept, moved, or reloaded later.
Load saved state file | Loads a previously saved application state file. Only load files you created or trust.
Clear saved browser session | Removes the browser-stored session copy.

The browser autosave is local-only storage using IndexedDB and it does not send data to a server. However, users should still follow institutional data rules, avoid loading untrusted state files, and remember that private browsing, clearing site data, storage quotas, or mobile browser cleanup can remove browser autosave.

Output and Accuracy Tests

Output

After running an analysis, use the Output tab to review results. The interface includes Clear output and Export output PDF controls. Exported output is useful for teaching, review, and documentation, but it should not replace independent verification for important analyses.

Accuracy Tests

The Accuracy tests area includes Run Validation Tests and Clear Test Results. The tests compare StatDesk calculations with benchmark values side by side, so they serve as method checks and validation support.
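The general idea of a benchmark comparison can be sketched in a few lines of Python. This is a generic illustration, not a description of how StatDesk's validation tests are implemented. The check helper, the tolerance, and the small dataset are assumptions chosen because the benchmark values can be worked out by hand (mean 5, population SD 2, sample variance 32/7).

```python
import math
import statistics

def check(name, computed, benchmark, rel_tol=1e-9):
    """Compare a computed value with a benchmark value and record pass/fail."""
    passed = math.isclose(computed, benchmark, rel_tol=rel_tol)
    return {"test": name, "computed": computed, "benchmark": benchmark, "passed": passed}

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hand-checkable example data

results = [
    check("mean", statistics.mean(values), 5.0),
    check("population SD", statistics.pstdev(values), 2.0),
    check("sample variance", statistics.variance(values), 32 / 7),
]

for result in results:
    status = "PASS" if result["passed"] else "FAIL"
    print(f'{status}  {result["test"]}: {result["computed"]} vs {result["benchmark"]}')
```

A relative tolerance is used instead of exact equality because floating-point results from two correct implementations can differ in the last digits.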

Running Analyses

Analyses are selected from the Analyses panel. The interface also includes a search input for filtering analyses. After selecting a method, use Analysis Setup to complete the required fields and review comparable R syntax when shown.

  • Start by identifying the outcome, grouping variable, predictor variables, time variable, item variables, or classification variables required by the research question.
  • Confirm that variables have the correct type before selecting them in Analysis Setup.
  • Read the output for estimates, uncertainty, test statistics, p-values, model summaries, and warnings where applicable.
  • Use plots and descriptive summaries to support interpretation rather than relying on one inferential result in isolation.

Analysis Area Overview

This section describes the supported analysis areas listed in the StatDesk 0.9.1 interface. The exact fields shown in Analysis Setup vary by selected analysis and by the variables available in the dataset. Where the interface names a broad analysis area rather than a specific subprocedure, this guide uses that published area name and explains how the area is used in the application.

Summaries and Basic Inference

StatDesk analysis overview: Summaries and Basic Inference
Analysis area | Purpose | Typical inputs | Interpretation and cautions
Descriptive statistics | Use descriptive statistics as the first pass for numeric variables. They help users understand the center, spread, range, and completeness of variables before modeling or hypothesis testing. | Numeric variables. Review variable type settings if an expected numeric variable does not appear. | Check sample size, missingness, unusual ranges, and whether summary values make sense for the measurement scale. Descriptive statistics do not test causal or group differences by themselves.
Frequencies | Use frequencies to count values or categories. This is useful for categorical sample descriptions, data coding checks, and identifying rare or unexpected levels. | Categorical variables, binary indicators, or numeric codes treated as categories. | Check whether category labels are clean and whether missing or placeholder codes were recoded before reporting percentages.
Grouped summaries | Use grouped summaries when descriptive statistics need to be compared across levels of a grouping variable, such as treatment group, sex, site, cohort, or time category. | A numeric summary variable and a categorical grouping variable. | Grouped summaries are descriptive. Apparent differences should be followed by an appropriate inferential analysis when the research question requires it.
Correlations | Use correlations to examine pairwise associations among numeric variables. | Two or more numeric variables. | Inspect scatterplots for nonlinear patterns and outliers. Correlation does not imply causation and may be distorted by restricted range or influential cases.
t tests | Use t tests for mean-comparison questions when the design involves one mean, paired measurements, or two groups. | A numeric outcome and the comparison structure required by the selected t test. | Check design fit before running the test. For paired data, the pairing must represent the same unit measured twice or matched observations.
ANOVA | Use ANOVA when comparing a numeric outcome across more than two group levels. | A numeric outcome and a categorical group/factor variable. | ANOVA identifies evidence of mean differences but does not by itself explain which groups differ unless follow-up comparisons are performed.
Chi-square tests | Use chi-square tests for associations between categorical variables. | Two categorical variables or a contingency-table-style setup. | Sparse cells can make the test unreliable. Review counts before interpreting p-values.
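As an independent cross-check of a correlation result, the Pearson coefficient can be computed directly from its definition. The Python sketch below is one way to do this outside StatDesk. The function name, the sample values, and the choice to drop incomplete pairs are assumptions made for the example; StatDesk's handling of missing values should be confirmed in its own output.

```python
import math

def pearson_r(x, y):
    """Pearson correlation using only pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy), n

# Hypothetical data; the last pair is incomplete and is dropped.
hours = [1, 2, 3, 4, 5, None]
score = [2, 4, 5, 4, 5, 6]
r, n_used = pearson_r(hours, score)
print(f"r = {r:.3f} from {n_used} complete pairs")
```

Reporting the number of complete pairs alongside r makes it easier to notice when missing values have reduced the effective sample size.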

Regression and Predictor Evaluation

StatDesk analysis overview: Regression and Predictor Evaluation
Analysis area | Purpose | Typical inputs | Interpretation and cautions
Regression | Use regression to model a continuous numeric outcome as a function of one or more predictors. | A numeric outcome and predictor variables. Predictors may be numeric or categorical depending on setup. | Interpret coefficients in the context of the model specification. Check linearity, influential observations, residual behavior, and multicollinearity.
Generalized linear models | Use GLM tools when the outcome requires a non-normal modeling family, such as binary, count, or other generalized outcome structures. | An outcome compatible with the selected GLM family and appropriate predictors. | The family and link function determine interpretation. Verify model convergence, coding, reference groups, and final estimates externally for important work.
Mediation | Use mediation to explore whether the association between an exposure and outcome may operate through a mediator. | An exposure/predictor, mediator, outcome, and any needed covariates supported by the setup screen. | Mediation depends on strong design and causal assumptions. Treat browser output as exploratory unless confirmed through a full analytic process.
Relative weights | Use relative weights to compare predictor contribution when predictors overlap or are correlated. | A regression-style outcome and multiple predictors. | Relative importance is not the same as causal importance. Use it to understand model contribution, not to prove mechanisms.
Commonality analysis | Use commonality analysis to decompose explained variance into unique and shared parts across predictors. | A numeric outcome and multiple predictors. | Shared components can be difficult to explain substantively. Report them carefully and avoid overstating precision.
VIF | Use variance inflation factors to screen for multicollinearity among predictors. | A set of predictors from a regression-style model. | High VIF values suggest unstable coefficient estimates. Consider whether predictors are redundant, transformed, or conceptually overlapping.
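The variance inflation factor for a predictor is conventionally defined as 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. With exactly two predictors, that R² equals their squared Pearson correlation, which allows a small hand-checkable Python sketch. The function names and data are hypothetical, and this is the textbook definition rather than a statement about StatDesk's internal computation.

```python
import math

def r_squared_two(x1, x2):
    """Squared Pearson correlation between two predictors."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    return s12 ** 2 / (s11 * s22)

def vif_from_r2(r2):
    """VIF = 1 / (1 - R^2); infinite when one predictor is fully determined by the others."""
    if r2 >= 1:
        return math.inf
    return 1 / (1 - r2)

# Hypothetical predictors with a squared correlation of 0.6.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]
r2 = r_squared_two(x1, x2)
vif = vif_from_r2(r2)
print(f"R^2 = {r2:.2f}, VIF = {vif:.2f}")
```

With more than two predictors, each R² requires a multiple regression of one predictor on all the others, so this shortcut no longer applies.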

Diagnostic, Measurement, and Structured Data

StatDesk analysis overview: Diagnostic, Measurement, and Structured Data
Analysis area | Purpose | Typical inputs | Interpretation and cautions
Diagnostic accuracy | Use diagnostic accuracy tools to summarize classification performance, such as sensitivity, specificity, and related quantities. | A true outcome/classification variable and a predicted class, test result, or thresholded score depending on setup. | Always verify which level is treated as the positive condition. Mis-specified positives can reverse interpretation.
ROC/AUC | Use ROC/AUC to evaluate how well a numeric score or probability discriminates a binary outcome across thresholds. | A binary outcome and numeric score/probability. | AUC summarizes discrimination, not calibration or clinical usefulness. Threshold choice should reflect the decision context.
Survival summaries | Use survival summaries to explore time-to-event data. | A time variable and an event/censoring indicator, with optional grouping depending on setup. | Confirm event coding, censoring, and time units before interpreting results.
Longitudinal summaries | Use longitudinal summaries to inspect repeated measurements over time. | An outcome measured over time, a time variable, and subject or grouping identifiers as required. | Use summaries to understand patterns before fitting formal longitudinal models. Check missing visits and irregular timing.
Multilevel summaries | Use multilevel summaries for clustered or hierarchical data, such as patients within sites or repeated observations within people. | Outcome variables with cluster/group identifiers and any variables requested by the setup screen. | Interpret ICC and within/between summaries as structure checks. Full multilevel modeling decisions require careful design review.
Reliability | Use reliability tools to evaluate whether multiple items behave consistently as a scale or measurement set. | Multiple item variables intended to measure a common construct. | Reliability is not proof of validity. Check item coding, reverse scoring, dimensionality, and substantive item content.
PCA | Use principal component analysis to explore component structure and reduce dimensionality. | Multiple numeric variables measured on compatible scales. | PCA is exploratory and scale-sensitive. Standardization, missingness, and variable selection can materially change results.
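The caution about the positive condition can be made concrete with a short Python sketch. It computes sensitivity, specificity, and predictive values from paired classifications, and AUC as the proportion of positive/negative pairs in which the positive case has the higher score (ties count as one half). The function names and data are hypothetical, the sketch does not guard against empty cells, and it is not StatDesk's implementation.

```python
def diagnostic_summary(truth, predicted, positive):
    """Sensitivity, specificity, PPV, and NPV for a chosen positive level."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def auc(outcome, score, positive):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for o, s in zip(outcome, score) if o == positive]
    neg = [s for o, s in zip(outcome, score) if o != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data.
truth = ["yes", "yes", "yes", "no", "no", "no", "no"]
predicted = ["yes", "yes", "no", "no", "no", "yes", "no"]
score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7, 0.1]

as_yes = diagnostic_summary(truth, predicted, positive="yes")
as_no = diagnostic_summary(truth, predicted, positive="no")
print(as_yes["sensitivity"], as_yes["specificity"])
print(as_no["sensitivity"], as_no["specificity"])
print(auc(truth, score, positive="yes"))
```

Switching the positive level from "yes" to "no" swaps sensitivity with specificity, which is why the positive condition should be confirmed before any result is reported.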

Quality and Process Analysis

StatDesk analysis overview: Quality and Process Analysis
Analysis area | Purpose | Typical inputs | Interpretation and cautions
Process capability | Use process capability tools to compare observed process performance with specification limits. | Process measurement data and lower/upper specification limits as required. | Capability statistics require a stable process and appropriate assumptions. Do not interpret capability without process context.
DPMO/yield calculations | Use DPMO and yield tools for Lean Six Sigma style defect and opportunity summaries. | Counts of defects, units, opportunities, or yield inputs depending on setup. | Make sure the denominator and opportunity definition are consistent. Small definition changes can dramatically change DPMO.
Control charts | Use control charts to monitor process behavior over ordered observations or time. | Ordered process data and any subgrouping or chart inputs requested by the setup screen. | Control charts are about process stability, not merely whether points look high or low. Interpret signals using the chosen chart rules.
Pareto charts | Use Pareto charts to rank categories by frequency or impact. | A category variable and, when applicable, a count or weight variable. | Pareto charts support prioritization. They do not show root causes without additional process knowledge.
FMEA priority scoring | Use FMEA priority scoring to organize risk ratings for failure modes. | Failure-mode records and severity, occurrence, detection, or related rating fields as required. | Ratings should be defined consistently across reviewers. Treat scores as a prioritization aid, not a substitute for expert review.
Measurement-system analysis | Use measurement-system analysis to evaluate whether measurement variation is acceptable for the intended use. | Measurements organized by part/item, operator/rater, trial, or related identifiers depending on setup. | A measurement system can be statistically consistent but still unsuitable if it is biased or not aligned with operational needs.
DOE/factor screening | Use DOE and factor-screening tools for early investigation of factors that may influence an outcome. | An outcome and experimental factor variables. | Designed experiments require attention to randomization, replication, blocking, and design structure. Interpret screening results as preliminary unless the design supports stronger claims.
Taguchi loss calculations | Use Taguchi loss calculations to estimate loss associated with deviation from a target value. | Observed or expected values, a target, and loss-function information as required. | The loss function must reflect a meaningful cost or quality assumption. Results are only as good as that assumption.
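Several of these quantities have simple closed-form textbook definitions that are easy to check by hand. The Python sketch below uses the common forms: DPMO as defects per opportunity scaled to one million, Cp and Cpk from a mean and standard deviation, the FMEA risk priority number as severity × occurrence × detection, and the nominal-the-best Taguchi loss k(y - T)². These definitions, the function names, and the input values are assumptions for illustration; the exact variants StatDesk applies should be confirmed in its output and Help tab.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000

def cp_cpk(mean, sd, lsl, usl):
    """Process capability indices from summary statistics and specification limits."""
    cp = (usl - lsl) / (6 * sd)
    cpk = min(usl - mean, mean - lsl) / (3 * sd)
    return cp, cpk

def rpn(severity, occurrence, detection):
    """FMEA risk priority number."""
    return severity * occurrence * detection

def taguchi_loss(y, target, k):
    """Nominal-the-best quadratic loss for one observed value."""
    return k * (y - target) ** 2

# Hypothetical inputs.
print(dpmo(defects=25, units=500, opportunities_per_unit=10))
print(cp_cpk(mean=10.2, sd=0.25, lsl=9.0, usl=11.0))
print(rpn(severity=7, occurrence=4, detection=5))
print(taguchi_loss(y=10.5, target=10.0, k=2.0))
```

Changing opportunities_per_unit from 10 to 5 doubles the DPMO for the same defect count, which illustrates why the opportunity definition has to be fixed before results are compared.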

Detailed Plotting Reference

The Plots tab uses a canvas-based plotting tool. Select a plot type, choose the required variables, optionally enter a main title, X-axis title, and Y-axis title, then select Draw plot. Select Export PNG to download the current plot, or Clear to reset the plot area. The interface includes Variable, Y variable, optional Group/color variable, Main title optional, X-axis title optional, and Y-axis title optional fields.

StatDesk plot types, inputs, and uses
Plot type | Inputs | Best use | Reading the result
Histogram | One numeric variable. | Display the shape, center, spread, and unusual values for a single numeric variable. | Use before parametric tests and models to understand distributional shape.
Density plot | One numeric variable. | Display a smoothed version of the distribution. | Sensitive to smoothing choices; use alongside descriptive statistics.
Normal Q-Q plot | One numeric variable. | Compare observed quantiles with expected normal quantiles. | Look for strong curvature or outlying tails rather than expecting perfect alignment.
Scatterplot | X variable and Y variable. | Show the relationship between two numeric variables. | Use to inspect nonlinearity, outliers, clusters, and variance patterns.
Scatterplot with fit line | X variable and Y variable. | Show the bivariate relationship with a fitted trend line. | The line summarizes a pattern; it does not prove causation.
Line chart | Ordered X variable and Y variable. | Show change over an ordered dimension such as time. | Sort/order matters. Confirm the X variable has the intended order.
Bar chart | Categorical variable and any required value field. | Show category counts or summarized category values depending on setup. | Use clean category labels and avoid overcrowding with too many categories.
Pie chart | Categorical variable. | Show simple composition across a small number of categories. | Hard to read with many categories or small differences; bar charts are often clearer.
Boxplot by group | Numeric variable and grouping variable. | Compare distribution, median, spread, and potential outliers across groups. | Best used with group sample sizes large enough to make distributional summaries meaningful.
Mean plot with 95% CI | Numeric variable and grouping variable. | Compare group means with confidence intervals. | Confidence intervals describe uncertainty in the mean, not the full spread of individual observations.
Correlation heatmap | Multiple numeric variables. | Display correlation patterns across a set of numeric variables. | Useful for screening multicollinearity and clusters of related variables.
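To show what a normal Q-Q plot places on each axis, the Python sketch below pairs sorted observations with standard normal quantiles. It uses plotting positions (i - 0.5) / n, which is one common convention; other software, and possibly StatDesk, may use slightly different positions. The function name and data are hypothetical.

```python
from statistics import NormalDist

def normal_qq_points(values):
    """Pair sorted observations with standard normal quantiles at (i - 0.5) / n."""
    observed = sorted(values)
    n = len(observed)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))

# Hypothetical measurements.
points = normal_qq_points([5.1, 4.9, 6.3, 5.8, 5.0])
for theoretical, observed in points:
    print(f"{theoretical:7.3f}  {observed}")
```

If the data are roughly normal, the observed values increase approximately linearly with the theoretical quantiles; curvature or isolated points at the ends correspond to the skew and heavy tails mentioned in the table.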

Plot Labeling and Export Practices

  • Use axis titles when variable names are abbreviated, coded, or not reader-friendly.
  • Use group/color variables only when grouping makes the plot easier to interpret.
  • Exported PNG files are convenient for documentation and slides.

Application Feature Index

This index summarizes where major StatDesk functions appear in the application and what each function does. It is intended as a quick reference for users who know the task they want to perform but are not sure which tab or tool to open.

StatDesk application feature index
Function | Where to find it | What it does
Data editing | Data View | Use Edit data, edit cells directly, add or remove rows and variables, reorder variables from the Variables panel, and use Undo/Redo for recent edits.
Sample data | Data View | Load the iris sample for practice or demonstration.
File import | Data View | Import CSV, TSV/TXT, JSON, StatDesk JSON, XLSX, Excel XML/HTML tables, SPSS SAV, Stata DTA, SPSS syntax, or Stata do-files.
Variable labels | Variable labels | Add readable labels that can be preserved in StatDesk JSON, SPSS syntax, or Stata do-file exports.
Missing value recoding | Missing values | Convert selected indicators to empty cells, choose the recode scope, use case-sensitive matching if needed, preview counts, and apply recoding.
Calculated variables | Calculate variables | Create new variables from R-like formulas using arithmetic, comparisons, logic, math functions, summaries, conditional logic, missing-value functions, and text functions.
Data export | Export data | Export CSV, TSV, JSON rows, StatDesk JSON + labels, HTML table, Excel XML, SPSS syntax + labels, or Stata do-file + labels.
Session saving | Save | Save the session in the browser, download an application state file, load a saved state file, or clear the saved browser session.
Output review | Output | Review analysis output, clear output, or export output as a PDF.
Validation checks | Accuracy tests | Run validation tests or clear validation test results.
Analysis setup | Analysis Setup | Select variables and configure the currently selected statistical analysis.
R code examples | Analysis Setup | Review analogous R code shown for many analysis setup screens.
Plots | Plots | Create histograms, density plots, Q-Q plots, scatterplots, line charts, bar charts, pie charts, boxplots, mean plots with 95% CI, and correlation heatmaps.
Plot export | Plots | Export the current plot as a PNG or clear the current plot.
Version and citation | About | View version information, recommended citation, BibTeX entry, author information, and limitations.
Help and disclaimer | Help | Review quick-start help, formula syntax, supported formats, accuracy-test notes, and legal/use disclaimer.

Troubleshooting

This section collects common issues users may encounter while working in a browser-based statistical application.

StatDesk troubleshooting reference
Issue | Likely cause | Recommended response
A variable is missing from an analysis selector. | The variable may have the wrong type, may contain nonnumeric text, or may not be eligible for the selected analysis role. | Check the Variables panel, inspect the raw values, and clean or recode the variable if needed.
A numeric variable imported as text. | The column may contain commas, symbols, text labels, or missing indicators such as NA or 999. | Review values in Data View, recode missing indicators, and consider creating a cleaned numeric version if appropriate.
Categories look duplicated. | Levels may differ by capitalization, spaces, spelling, or punctuation. | Clean category labels before running frequencies, chi-square tests, grouped summaries, or plots.
The missing value recode count is unexpected. | The indicators may be too broad, case-sensitive matching may be wrong, or the recode scope may include too many variables. | Do not apply the recode until the preview count makes sense.
An analysis will not run. | Required fields may be incomplete, variable types may not match the analysis, or the dataset may not contain enough valid observations. | Return to Analysis Setup, complete required fields, and inspect missingness and variable type settings.
A plot is blank or unclear. | The selected variables may be incompatible with the plot type, missing, or too sparse. | Choose variables that match the plot requirements and check whether the data contain valid values.
Output seems surprising. | The model, coding, missing-data handling, or variable roles may not match the intended analysis. | Check descriptive summaries, plots, variable coding, and the analogous R code. Verify important results externally.
Browser autosave is gone. | Private browsing, clearing site data, browser cleanup, storage quotas, or mobile browser cleanup can remove local storage. | Use Save application state file for long-term backup.

Frequently Asked Questions

Does StatDesk upload my data to a server? StatDesk is a client-side web application and data are processed in the browser rather than uploaded to a StatDesk server. Browser autosave is local-only storage using IndexedDB. Users should still follow institutional data-security rules and avoid loading untrusted state files.

When should I use StatDesk JSON? Use StatDesk JSON when you want to preserve data, labels, and variable type settings for later use in StatDesk. For broad compatibility, CSV or TSV may be better. For SPSS or Stata handoff with labels, use syntax exports.

Which export formats preserve labels? Use StatDesk JSON + labels, SPSS syntax + labels, or Stata do-file + labels when variable labels need to be preserved. CSV and TSV are broadly compatible but do not preserve StatDesk label settings in the same way.

What is saved in an application state file? The application state file contains the current data, labels, variable type settings, output, selected analysis, paging, tabs, plot settings, and StatDesk version information.

Why is my variable not available for a plot or analysis? Most often, the variable type or contents do not match the selected procedure. Check whether the variable is numeric, categorical, missing, or imported as text.

How do I export a plot? Open the Plots tab, draw the plot, and use Export PNG to download the current plot as an image file.

How do I export analysis output? Open the Output tab after running an analysis and use Export output PDF. Use Clear output when you want to remove the current output from the display.

Glossary

Application state file: A portable JSON file that stores the current StatDesk session, including data, output, labels, settings, selected analysis, paging, tabs, plot settings, and version information.

Autosave: A browser-based local save mechanism intended to restore the most recent session when browser storage is available.

Categorical variable: A variable whose values represent groups, labels, or categories rather than measured numeric quantities.

Confidence interval: A range of values computed from sample data that, under the assumed statistical model, would contain the population parameter in a stated proportion of repeated samples (for example, 95%).

Data View: The StatDesk workspace for directly viewing and editing data.

DPMO: Defects per million opportunities, a Lean Six Sigma quality metric based on defect, unit, and opportunity counts.

GLM: Generalized linear model, a flexible modeling framework for outcomes that may not be normally distributed.

Missing indicator: A code such as NA, 999, -99, refused, or unknown that should be treated as missing rather than a real observed value.

Positive condition: The outcome category treated as the event or positive case in diagnostic accuracy and ROC/AUC analyses.

Variable label: A readable description attached to a variable to make output and exports easier to interpret.

VIF: Variance inflation factor, a diagnostic used to assess multicollinearity among predictors.

Citation and Version Information

The version used for this guide is Version 0.9.1, dated June 19, 2026. The recommended citation shown in the application is:

Harris, M. (2026). StatsWithR StatDesk (Version 0.9.1) [Computer software]. StatsWithR.com. https://www.statswithr.com/statdesk

The BibTeX entry is:

@software{harris2026statdesk,
  author = {Harris, Michael},
  title = {StatsWithR StatDesk},
  version = {0.9.1},
  year = {2026},
  url = {https://www.statswithr.com/statdesk},
  note = {Statistical analysis software by Michael Harris, MS, MAS.}
}

Limitations and User Responsibilities

  • StatDesk is beta software, and results are experimental.
  • Important findings should be verified against trusted statistical software and reviewed by a qualified analyst.
  • Users remain responsible for source data review, study design, missing-data decisions, model assumptions, interpretation, reporting decisions, and final quality control.
