StatDesk User Guide

Purpose and Scope

StatsWithR StatDesk is a browser-based statistical analysis application designed for transparent applied statistics tasks. Users can import or enter data, label variables, run common statistical procedures, view exportable output, create plots, and compare methods with analogous R code examples.

This guide is intended for students, instructors, researchers, and applied analysts who need a practical reference for using the application. It covers routine operation and interpretation support. It does not replace a statistics textbook, a full programming environment, or a formal software validation report.

Appropriate Uses

Teaching introductory and applied statistics with visible analysis steps.
Exploring small to moderate research datasets directly in a browser.
Creating preliminary tables, plots, and output for review or documentation.

Uses That Require Extra Verification

StatDesk should not be relied on as the sole source for clinical, regulatory, legal, financial, publication-critical, or other high-stakes decisions. Users remain responsible for data quality review, study design, model assumptions, interpretation, reporting, and independent verification.

Basic Operating Sequence

Load data by entering values in the grid, selecting Load iris sample for practice, or importing a supported file.
Inspect the data in Data View and confirm that variables are recognized with the intended type.
Clean the dataset by editing cells, adding labels, recoding missing indicators, or creating calculated variables.
Choose an analysis from the Analyses panel or use the analysis search field to locate a method.
Complete the Analysis Setup fields. Required fields must be selected before the analysis can run.
Run the analysis and review the Output tab.
Create supporting plots from the Plots tab when graphical review would help.
Save the session in the browser or export an application state file if the work should be preserved.

Interface Overview

The interface is organized around a Variables panel, an Analyses panel, and a main work area with tabs. The top status area shows autosave status, CPU proxy information, memory information, and a feedback/donation link.

StatDesk interface tabs and their functions
Tab or area	Role in StatDesk
Data View	Primary workspace for entering, editing, importing, exporting, and clearing data.
Plots	Canvas-based plotting area with plot-type, variable, title, axis-label, draw, export, and clear controls.
Analysis Setup	Configuration area for the currently selected analysis, including required inputs and analogous R code when available.
Output	Results area with controls to clear output or export output as a PDF.
Save	Session management area for browser save, application state file export, saved state import, and browser-session clearing.
Accuracy tests	Validation-check area with controls to run validation tests or clear test results.
About	Application description, purpose, supported analysis areas, limitations, version, citation, and related notes.
Help	Quick-start guidance, data notes, formula reference, supported file formats, and legal/use disclaimer.

Data View and Dataset Management

Entering Data Directly

In Data View, select Edit data to edit cells directly in the grid. This is most useful for small datasets, classroom demonstrations, quick examples, or small corrections after import. Inline plus and minus controls add or remove rows and variables.

Use direct entry when the dataset is small enough to visually inspect.
After manual edits, use Undo or Redo for recent changes when needed.
Use Clear data only when you are ready to remove the current dataset from the active session.

Loading the Iris Sample

Load iris sample is useful for practice because it provides a familiar dataset with numeric flower measurements and a categorical species variable. It is a good way to test Data View, plots, descriptive statistics, grouped summaries, correlations, and simple modeling tools without importing a file.

Importing Files

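Use Import data to load an external file into the browser session. The listed import formats are CSV, TSV/TXT, JSON, StatDesk JSON, XLSX, Excel XML/HTML tables, SPSS SAV, Stata DTA, SPSS syntax, and Stata do-files. Native SPSS SAV and Stata DTA import is not yet fully supported, so verify those imports with extra care.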

Supported StatDesk import formats
Import format	Description	Practical check
CSV	Comma-separated plain text data.	Use for broad compatibility. Check delimiter, headers, missing codes, and numeric parsing.
TSV/TXT	Tab-separated or text data.	Useful when commas appear in text fields. Confirm the file structure after import.
JSON	Structured data in JSON format.	Useful for programming tasks. Confirm the row/column structure imported as intended.
StatDesk JSON	StatDesk-preserving file format.	Use when returning to StatDesk with labels and variable type information.
XLSX	Excel workbook file.	Check sheet choice, header row, date parsing, and blank rows.
Excel XML/HTML tables	Spreadsheet-style XML or HTML table sources.	Useful for table-based exports from other systems.
SPSS SAV	Native SPSS data file.	Native SPSS import is still not fully supported. Verify labels, missingness, and coding.
Stata DTA	Native Stata data file.	Native Stata import is still not fully supported. Verify labels, missingness, and coding.
SPSS syntax	SPSS syntax file capable of rebuilding data and labels.	Useful for label-preserving SPSS handoff.
Stata do-file	Stata do-file capable of rebuilding data and labels.	Useful for label-preserving Stata handoff.

Checking Imported Data

Confirm that the expected number of rows and variables is present.
Review the first few rows and several later rows to catch header or parsing problems.
Confirm that numeric variables were not imported as text because of commas, symbols, or nonnumeric missing codes.
Confirm that categorical variables have clean, intentional levels.
Check whether dates, identifiers, and coded variables should be treated as numeric, categorical, or text-like fields.
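
The checks above can also be run outside StatDesk on the source file before import. The following is a minimal Python sketch, not part of StatDesk: the function name `summarize_columns`, the sample data, and the default missing codes are all hypothetical. It counts rows and lists, for each column, the values that would not parse as numbers.

```python
import csv
import io

def summarize_columns(csv_text, missing_codes=("", "NA")):
    """Return the row count and, per column, the values that fail numeric parsing."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    problems = {}
    for name in columns:
        bad = []
        for row in rows:
            value = (row[name] or "").strip()
            if value in missing_codes:
                continue  # treated as missing, not as a parsing problem
            try:
                float(value)
            except ValueError:
                bad.append(value)
        problems[name] = bad
    return len(rows), problems

# Hypothetical sample: "1,200" and "n/a" would keep `income` from being numeric.
sample = 'id,income,group\n1,52000,A\n2,"1,200",B\n3,n/a,A\n'
n_rows, problems = summarize_columns(sample)
```

A column such as `group` that lists every value is simply categorical. A column such as `income` that lists only a few values is the kind worth cleaning or recoding before analysis.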

Variable Types and Reordering

The Variables panel shows available variables and supports reordering variables from the sidebar. Variable type settings matter because analysis setup screens and plots depend on whether a variable is treated as numeric, categorical, or otherwise eligible for a given role.

Variable Labels

Variable labels make output and exported files easier to understand. Use Variable labels to provide readable descriptions for variables whose raw names are abbreviated, technical, or coded. Labels are especially useful when exporting StatDesk JSON, SPSS syntax, or Stata do-files because those formats are intended to preserve labels.

Missing Value Recoding

The Missing values tool converts selected values to empty cells. The interface accepts missing indicators separated by commas or line breaks. Users can apply recoding to all variables, numeric variables only, or a selected variable only. A case-sensitive matching option is available, along with Preview count and Recode to empty cells controls.

Missing-value recoding controls
Control	How to use it
Missing indicators	Values such as NA, N/A, 999, -99, refused, or unknown that should be treated as missing rather than real data.
Apply to all variables	Use only when the same missing indicators have the same meaning throughout the dataset.
Apply to numeric variables only	Useful when numeric placeholder codes should be removed but text categories should be preserved.
Apply to selected variable only	Best for targeted cleanup when one variable has special coding.
Case-sensitive matching	Use when uppercase and lowercase values should be treated as different values.
Preview count	Review how many values will be changed before committing the recode.
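
To make the effect of the matching options concrete, here is a small Python sketch of the same idea. It is an illustration only and not StatDesk's implementation: the function names are hypothetical, and trimming surrounding whitespace before matching is an assumption of this sketch that the guide does not confirm for StatDesk.

```python
import re

def parse_indicators(text):
    """Split a list of indicators separated by commas or line breaks."""
    return [part.strip() for part in re.split(r"[,\n]+", text) if part.strip()]

def recode_missing(values, indicators, case_sensitive=False):
    """Return (recoded_values, count), replacing matching indicators with ""."""
    if case_sensitive:
        targets = {s.strip() for s in indicators}
        key = lambda v: v.strip()
    else:
        targets = {s.strip().lower() for s in indicators}
        key = lambda v: v.strip().lower()
    recoded = ["" if key(v) in targets else v for v in values]
    # The count plays the role of a preview: how many cells would change.
    count = sum(1 for old, new in zip(values, recoded) if old != new)
    return recoded, count

indicators = parse_indicators("NA, 999\nrefused")
values = ["12", "na", "999", "Refused", "40"]
loose, loose_count = recode_missing(values, indicators)
strict, strict_count = recode_missing(values, indicators, case_sensitive=True)
```

In this example the case-insensitive pass would change three cells, while the case-sensitive pass would change only the literal 999. A gap like that is exactly what a preview count is meant to reveal before the recode is applied.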

Calculated Variables

Calculated variables create new variables from formulas. The interface includes fields for a new variable name, optional variable label, formula, and a Create calculated variable button. Formulas use R-like syntax, and bare variable names work for simple variable names. Use brackets for variable names that contain spaces.

Formula Syntax

Calculated-variable formula syntax
Formula feature	Details
Variable references	Use bare names such as age. Use brackets for names with spaces, such as [pre score].
Arithmetic operators	+, -, *, /, ^, %%, %/%, and parentheses. JavaScript-style ** still works, but ^ is recommended for R-like exponent syntax.
Comparisons	Use ==, !=, >, >=, <, and <= to create logical conditions.
Logical operators	Use R-style &, |, ! or JavaScript-style && and ||.
Math functions	abs, sqrt, log, log10, exp, round, floor, ceiling/ceil, pow, sin, cos, and tan.
Row and column summaries	min, max, pmin, pmax, sum, mean, median, sd, and var work across supplied values. mean(variable) summarizes a column; mean(item1, item2, item3) summarizes row values.
Conditional logic	Use ifelse(condition, value_if_true, value_if_false).
Missing and text functions	missing/is.na, notmissing, as.numeric/asNumber, nchar/len, tolower/lower, toupper/upper, grepl, contains, paste, and paste0.
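
For readers less familiar with R's arithmetic operators, the Python sketch below shows what `%%`, `%/%`, and `^` mean in R. The helper names are hypothetical. It assumes StatDesk follows R's behavior for negative operands, which the guide implies by calling the syntax R-like but does not state outright.

```python
def r_mod(a, b):
    """R's a %% b: remainder that takes the sign of the divisor (same as Python %)."""
    return a % b

def r_intdiv(a, b):
    """R's a %/% b: division rounded down to an integer (same as Python //)."""
    return a // b

def r_pow(a, b):
    """R's a ^ b: exponentiation (written ** in Python and JavaScript)."""
    return a ** b

remainder = r_mod(-7, 3)     # 2, not -1
quotient = r_intdiv(-7, 3)   # -3, not -2
cube = r_pow(2, 3)           # 8
```

If negative values matter in a calculated variable, it is worth testing one or two rows by hand to confirm which convention the formula engine applies.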

Formula Examples

pre_score - post_score creates a change score that is positive when the post-test value is lower, which suits measures where lower values indicate improvement.
[pre score] - [post score] does the same thing when variable names contain spaces.
ifelse(age >= 18, 1, 0) creates an adult indicator.
ifelse(group == "Treatment", 1, 0) creates a treatment-group indicator.
mean(item1, item2, item3) creates a row-wise mean across three items.
ifelse(is.na(score), 0, score) replaces missing score values with 0 in the new calculated variable.
paste0(group, "_", score) combines text and values into a new text-like field.
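
The formula examples above can be cross-checked by hand or in another language. The following Python sketch mirrors three of them on hypothetical rows. The helper names and the sample data are illustrative, and skipping missing items in the row-wise mean is a choice made for this sketch; the guide does not say whether StatDesk skips or propagates missing values.

```python
from statistics import mean

def ifelse(condition, if_true, if_false):
    """Scalar analogue of ifelse(condition, value_if_true, value_if_false)."""
    return if_true if condition else if_false

def row_mean(*items):
    """Row-wise mean that skips missing (None) items; None if all are missing."""
    present = [x for x in items if x is not None]
    return mean(present) if present else None

# Hypothetical rows; None stands for an empty (missing) cell.
rows = [
    {"age": 17, "item1": 3, "item2": 4, "item3": 5, "score": None},
    {"age": 42, "item1": 2, "item2": None, "item3": 4, "score": 8},
]
for row in rows:
    row["adult"] = ifelse(row["age"] >= 18, 1, 0)
    row["item_mean"] = row_mean(row["item1"], row["item2"], row["item3"])
    row["score_filled"] = ifelse(row["score"] is None, 0, row["score"])
```

Replacing missing scores with 0, as in the last line, changes later means and totals, so it should be a deliberate analytic decision and not a default.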

Exporting Data

Use Export data to download the current dataset. The interface asks for a file name without an extension, and the extension is added automatically. The export options are CSV, TSV, JSON rows, StatDesk JSON + labels, HTML table, Excel XML, SPSS syntax + labels, and Stata do-file + labels.

StatDesk data export options
Export option	Purpose	When to choose it
CSV	General-purpose file for spreadsheets and statistical software.	Use when labels are not needed or will be rebuilt elsewhere.
TSV	Tab-separated version of the dataset.	Useful when text values contain commas.
JSON rows	Row-oriented JSON data.	Useful for programming tasks.
StatDesk JSON + labels	Portable StatDesk-preserving file.	Best export when you want to reload data, labels, and variable settings in StatDesk.
HTML table	Readable table format.	Useful for review or web-style table output.
Excel XML	Spreadsheet-compatible XML format.	Useful when a spreadsheet-readable file is preferred.
SPSS syntax + labels	Syntax that rebuilds data and labels in SPSS.	Use for label-preserving SPSS handoff.
Stata do-file + labels	Do-file that rebuilds data and labels in Stata.	Use for label-preserving Stata handoff.
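
The note that TSV is useful when text values contain commas can be illustrated with a short Python sketch using the standard `csv` module. The sample text and the function name `read_delimited` are hypothetical, and the sketch makes no claim about how StatDesk quotes fields on export.

```python
import csv
import io

def read_delimited(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# Hypothetical export with a free-text comment that contains a comma.
tsv_text = "id\tcomment\n1\tlate, rescheduled\n"

as_tsv = read_delimited(tsv_text, "\t")   # comment stays in one field
as_csv = read_delimited(tsv_text, ",")    # wrong delimiter splits the comment
```

Reading the file with the wrong delimiter breaks the comment in two, which is the kind of parsing problem to look for when a downstream tool reports an unexpected number of columns.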

Saving, Restoring, and Privacy

The Save tab supports both browser-based session saving and portable state-file saving. The interface displays autosave status and an Estimated application state file size value. The application state includes data, output, labels, settings, plot state, selected analysis, paging, tabs, and StatDesk version information.

StatDesk save and restore features
Save/restore feature	Description
Autosave	Restores the most recent session after refresh or browser restart when browser storage is available.
Save session in this browser	Stores the current session locally in the current browser.
Save application state file	Downloads a compact portable JSON file that can be kept, moved, or reloaded later.
Load saved state file	Loads a previously saved application state file. Only load files you created or trust.
Clear saved browser session	Removes the browser-stored session copy.

Browser autosave uses local-only IndexedDB storage and does not send data to a server. Users should still follow institutional data rules, avoid loading untrusted state files, and remember that private browsing, clearing site data, storage quotas, or mobile browser cleanup can remove the autosaved session.

Output and Accuracy Tests

Output

After running an analysis, use the Output tab to review results. The interface includes Clear output and Export output PDF controls. Exported output is useful for teaching, review, and documentation, but it should not replace independent verification for important analyses.

Accuracy Tests

The Accuracy tests area includes Run Validation Tests and Clear Test Results. The tests compare app calculations side by side with benchmark values and serve as validation support.
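
The general idea of a benchmark comparison can be sketched in a few lines of Python. This is an illustration of the technique, not a description of StatDesk's own test suite: the data, the tolerance, and the function name `check_against_benchmark` are hypothetical.

```python
import math
from statistics import mean, stdev

def check_against_benchmark(computed, benchmark, rel_tol=1e-9):
    """Return {name: True/False} for each benchmark value, within a relative tolerance."""
    return {
        name: math.isclose(computed[name], expected, rel_tol=rel_tol)
        for name, expected in benchmark.items()
    }

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
computed = {"mean": mean(data), "sd": stdev(data)}

# Hand-derived benchmark: the values sum to 40, the squared deviations
# from the mean sum to 32, and the sample SD divides by n - 1 = 7.
benchmark = {"mean": 5.0, "sd": math.sqrt(32 / 7)}
results = check_against_benchmark(computed, benchmark)

# A benchmark built with the population SD (divide by n = 8) would not match.
mismatch = check_against_benchmark(computed, {"sd": 2.0})
```

The mismatch case shows why a failed comparison does not always mean a calculation bug. The two sides may simply use different definitions, such as sample versus population standard deviation.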

Running Analyses

Analyses are selected from the Analyses panel. The interface also includes a search input for filtering analyses. After selecting a method, use Analysis Setup to complete the required fields and review comparable R syntax when shown.

Start by identifying the outcome, grouping variable, predictor variables, time variable, item variables, or classification variables required by the research question.
Confirm that variables have the correct type before selecting them in Analysis Setup.
Read the output for estimates, uncertainty, test statistics, p-values, model summaries, and warnings where applicable.
Use plots and descriptive summaries to support interpretation rather than relying on one inferential result in isolation.

Analysis Area Overview

This section describes the supported analysis areas listed in the StatDesk 0.9.1 interface. The exact fields shown in Analysis Setup vary by selected analysis and by the variables available in the dataset. Where the interface names a broad analysis area rather than a specific subprocedure, this guide uses that published area name and explains how the area is used in the application.

Summaries and Basic Inference

StatDesk analysis overview: Summaries and Basic Inference
Analysis area	Purpose	Typical inputs	Interpretation and cautions
Descriptive statistics	Use descriptive statistics as the first pass for numeric variables. They help users understand the center, spread, range, and completeness of variables before modeling or hypothesis testing.	Numeric variables. Review variable type settings if an expected numeric variable does not appear.	Check sample size, missingness, unusual ranges, and whether summary values make sense for the measurement scale. Descriptive statistics do not test causal or group differences by themselves.
Frequencies	Use frequencies to count values or categories. This is useful for categorical sample descriptions, data coding checks, and identifying rare or unexpected levels.	Categorical variables, binary indicators, or numeric codes treated as categories.	Check whether category labels are clean and whether missing or placeholder codes were recoded before reporting percentages.
Grouped summaries	Use grouped summaries when descriptive statistics need to be compared across levels of a grouping variable, such as treatment group, sex, site, cohort, or time category.	A numeric summary variable and a categorical grouping variable.	Grouped summaries are descriptive. Apparent differences should be followed by an appropriate inferential analysis when the research question requires it.
Correlations	Use correlations to examine pairwise associations among numeric variables.	Two or more numeric variables.	Inspect scatterplots for nonlinear patterns and outliers. Correlation does not imply causation and may be distorted by restricted range or influential cases.
t tests	Use t tests for mean-comparison questions when the design involves one mean, paired measurements, or two groups.	A numeric outcome and the comparison structure required by the selected t test.	Check design fit before running the test. For paired data, the pairing must represent the same unit measured twice or matched observations.
ANOVA	Use ANOVA when comparing a numeric outcome across more than two group levels.	A numeric outcome and a categorical group/factor variable.	ANOVA identifies evidence of mean differences but does not by itself explain which groups differ unless follow-up comparisons are performed.
Chi-square tests	Use chi-square tests for associations between categorical variables.	Two categorical variables or a contingency-table-style setup.	Sparse cells can make the test unreliable. Review counts before interpreting p-values.
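
The caution about sparse cells in chi-square tests can be checked by computing expected counts directly. The Python sketch below is an independent illustration, not StatDesk's code. The function name is hypothetical, no continuity correction is applied, the table must have no empty row or column, and the threshold of 5 is a common rule of thumb, not a StatDesk setting.

```python
def chi_square(table):
    """Pearson chi-square statistic, expected counts, and number of sparse cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    sparse = sum(1 for row in expected for e in row if e < 5)
    return statistic, expected, sparse

balanced_stat, balanced_expected, balanced_sparse = chi_square([[10, 20], [20, 10]])
_, _, thin_sparse = chi_square([[1, 9], [2, 8]])
```

In the second table two of the four expected counts fall below 5, so a p-value from the chi-square approximation would deserve extra scrutiny.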

Regression and Predictor Evaluation

StatDesk analysis overview: Regression and Predictor Evaluation
Analysis area	Purpose	Typical inputs	Interpretation and cautions
Regression	Use regression to model a continuous numeric outcome as a function of one or more predictors.	A numeric outcome and predictor variables. Predictors may be numeric or categorical depending on setup.	Interpret coefficients in the context of the model specification. Check linearity, influential observations, residual behavior, and multicollinearity.
Generalized linear models	Use GLM tools when the outcome calls for a non-normal modeling family, such as a binary or count outcome.	An outcome compatible with the selected GLM family and appropriate predictors.	The family and link function determine interpretation. Verify model convergence, coding, reference groups, and final estimates externally for important work.
Mediation	Use mediation to explore whether the association between an exposure and outcome may operate through a mediator.	An exposure/predictor, mediator, outcome, and any needed covariates supported by the setup screen.	Mediation depends on strong design and causal assumptions. Treat browser output as exploratory unless confirmed through a full analytic process.
Relative weights	Use relative weights to compare predictor contribution when predictors overlap or are correlated.	A regression-style outcome and multiple predictors.	Relative importance is not the same as causal importance. Use it to understand model contribution, not to prove mechanisms.
Commonality analysis	Use commonality analysis to decompose explained variance into unique and shared parts across predictors.	A numeric outcome and multiple predictors.	Shared components can be difficult to explain substantively. Report them carefully and avoid overstating precision.
VIF	Use variance inflation factors to screen for multicollinearity among predictors.	A set of predictors from a regression-style model.	High VIF values suggest unstable coefficient estimates. Consider whether predictors are redundant, transformed, or conceptually overlapping.
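
As a worked illustration of the VIF row, the variance inflation factor for predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. With exactly two predictors, R_j^2 equals the squared Pearson correlation between them, which allows a dependency-free Python sketch. The function names and data are hypothetical, and the sketch covers only the two-predictor case.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]               # correlated with x1 (r = 0.8)
x3 = [2, -1, -2, -1, 2]            # uncorrelated with x1 (r = 0)

r12 = pearson_r(x1, x2)
vif_correlated = vif_two_predictors(x1, x2)
vif_uncorrelated = vif_two_predictors(x1, x3)
```

A VIF of 1 means the predictor shares no linear information with the other, and the value grows without bound as the correlation approaches 1 or -1.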

Diagnostic, Measurement, and Structured Data

StatDesk analysis overview: Diagnostic, Measurement, and Structured Data
Analysis area	Purpose	Typical inputs	Interpretation and cautions
Diagnostic accuracy	Use diagnostic accuracy tools to summarize classification performance, such as sensitivity, specificity, and related quantities.	A true outcome/classification variable and a predicted class, test result, or thresholded score depending on setup.	Always verify which level is treated as the positive condition. Mis-specified positives can reverse interpretation.
ROC/AUC	Use ROC/AUC to evaluate how well a numeric score or probability discriminates a binary outcome across thresholds.	A binary outcome and numeric score/probability.	AUC summarizes discrimination, not calibration or clinical usefulness. Threshold choice should reflect the decision context.
Survival summaries	Use survival summaries to explore time-to-event data.	A time variable and an event/censoring indicator, with optional grouping depending on setup.	Confirm event coding, censoring, and time units before interpreting results.
Longitudinal summaries	Use longitudinal summaries to inspect repeated measurements over time.	An outcome measured over time, a time variable, and subject or grouping identifiers as required.	Use summaries to understand patterns before fitting formal longitudinal models. Check missing visits and irregular timing.
Multilevel summaries	Use multilevel summaries for clustered or hierarchical data, such as patients within sites or repeated observations within people.	Outcome variables with cluster/group identifiers and any variables requested by the setup screen.	Interpret ICC and within/between summaries as structure checks. Full multilevel modeling decisions require careful design review.
Reliability	Use reliability tools to evaluate whether multiple items behave consistently as a scale or measurement set.	Multiple item variables intended to measure a common construct.	Reliability is not proof of validity. Check item coding, reverse scoring, dimensionality, and substantive item content.
PCA	Use principal component analysis to explore component structure and reduce dimensionality.	Multiple numeric variables measured on compatible scales.	PCA is exploratory and scale-sensitive. Standardization, missingness, and variable selection can materially change results.
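
The warning about the positive condition in the Diagnostic accuracy row is easy to demonstrate. The Python sketch below, with a hypothetical function name and made-up labels, computes sensitivity and specificity for a binary classification and shows that choosing the other level as positive swaps the two quantities.

```python
def sensitivity_specificity(truth, predicted, positive):
    """Return (sensitivity, specificity) for binary labels, given the positive level."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical data: 4 diseased cases (3 detected), 6 healthy cases (5 cleared).
truth = ["disease"] * 4 + ["healthy"] * 6
predicted = ["disease"] * 3 + ["healthy"] + ["healthy"] * 5 + ["disease"]

sens, spec = sensitivity_specificity(truth, predicted, positive="disease")
sens_flipped, spec_flipped = sensitivity_specificity(truth, predicted, positive="healthy")
```

With "disease" as the positive level, sensitivity is 3/4 and specificity is 5/6. With "healthy" as the positive level, the same data give the reverse, so the reported numbers are meaningful only once the positive level is confirmed.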

Quality and Process Analysis

StatDesk analysis overview: Quality and Process Analysis
Analysis area	Purpose	Typical inputs	Interpretation and cautions
Process capability	Use process capability tools to compare observed process performance with specification limits.	Process measurement data and lower/upper specification limits as required.	Capability statistics require a stable process and appropriate assumptions. Do not interpret capability without process context.
DPMO/yield calculations	Use DPMO and yield tools for Lean Six Sigma style defect and opportunity summaries.	Counts of defects, units, opportunities, or yield inputs depending on setup.	Make sure the denominator and opportunity definition are consistent. Small definition changes can dramatically change DPMO.
Control charts	Use control charts to monitor process behavior over ordered observations or time.	Ordered process data and any subgrouping or chart inputs requested by the setup screen.	Control charts are about process stability, not merely whether points look high or low. Interpret signals using the chosen chart rules.
Pareto charts	Use Pareto charts to rank categories by frequency or impact.	A category variable and, when applicable, a count or weight variable.	Pareto charts support prioritization. They do not show root causes without additional process knowledge.
FMEA priority scoring	Use FMEA priority scoring to organize risk ratings for failure modes.	Failure-mode records and severity, occurrence, detection, or related rating fields as required.	Ratings should be defined consistently across reviewers. Treat scores as a prioritization aid, not a substitute for expert review.
Measurement-system analysis	Use measurement-system analysis to evaluate whether measurement variation is acceptable for the intended use.	Measurements organized by part/item, operator/rater, trial, or related identifiers depending on setup.	A measurement system can be statistically consistent but still unsuitable if it is biased or not aligned with operational needs.
DOE/factor screening	Use DOE and factor-screening tools for early investigation of factors that may influence an outcome.	An outcome and experimental factor variables.	Designed experiments require attention to randomization, replication, blocking, and design structure. Interpret screening results as preliminary unless the design supports stronger claims.
Taguchi loss calculations	Use Taguchi loss calculations to estimate loss associated with deviation from a target value.	Observed or expected values, a target, and loss-function information as required.	The loss function must reflect a meaningful cost or quality assumption. Results are only as good as that assumption.
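
Two of the calculations in this table have simple textbook forms that are easy to verify by hand. The Python sketch below uses the standard definitions DPMO = defects / (units × opportunities per unit) × 1,000,000 and the quadratic Taguchi loss L = k(y - T)^2. The function names and inputs are hypothetical, and StatDesk's setup screen may define its inputs or yield differently, so treat this as a cross-check and not as a description of the app.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000

def opportunity_yield(defects, units, opportunities_per_unit):
    """Share of opportunities without a defect (one common yield definition)."""
    return 1 - defects / (units * opportunities_per_unit)

def taguchi_loss(y, target, k):
    """Quadratic (nominal-is-best) Taguchi loss: k * (y - target)^2."""
    return k * (y - target) ** 2

dpmo_10 = dpmo(defects=25, units=500, opportunities_per_unit=10)
dpmo_5 = dpmo(defects=25, units=500, opportunities_per_unit=5)
yield_10 = opportunity_yield(defects=25, units=500, opportunities_per_unit=10)
loss = taguchi_loss(y=10.5, target=10.0, k=2.0)
```

Halving the number of opportunities per unit doubles the DPMO for the same defect count, which is why the opportunity definition needs to be fixed before results are compared.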

Detailed Plotting Reference

The Plots tab uses a canvas-based plotting tool. Select a plot type, choose the required variables, optionally enter a main title, X-axis title, and Y-axis title, then select Draw plot. Select Export PNG to download the current plot, or Clear to reset the plot area. The variable fields are Variable, Y variable, and an optional Group/color variable. The title fields (Main title, X-axis title, and Y-axis title) are all optional.

StatDesk plot types, inputs, and uses
Plot type	Inputs	Best use	Reading the result
Histogram	One numeric variable.	Display the shape, center, spread, and unusual values for a single numeric variable.	Use before parametric tests and models to understand distributional shape.
Density plot	One numeric variable.	Display a smoothed version of the distribution.	Sensitive to smoothing choices; use alongside descriptive statistics.
Normal Q-Q plot	One numeric variable.	Compare observed quantiles with expected normal quantiles.	Look for strong curvature or outlying tails rather than expecting perfect alignment.
Scatterplot	X variable and Y variable.	Show the relationship between two numeric variables.	Use to inspect nonlinearity, outliers, clusters, and variance patterns.
Scatterplot with fit line	X variable and Y variable.	Show the bivariate relationship with a fitted trend line.	The line summarizes a pattern; it does not prove causation.
Line chart	Ordered X variable and Y variable.	Show change over an ordered dimension such as time.	Sort/order matters. Confirm the X variable has the intended order.
Bar chart	Categorical variable and any required value field.	Show category counts or summarized category values depending on setup.	Use clean category labels and avoid overcrowding with too many categories.
Pie chart	Categorical variable.	Show simple composition across a small number of categories.	Hard to read with many categories or small differences; bar charts are often clearer.
Boxplot by group	Numeric variable and grouping variable.	Compare distribution, median, spread, and potential outliers across groups.	Best used with group sample sizes large enough to make distributional summaries meaningful.
Mean plot with 95% CI	Numeric variable and grouping variable.	Compare group means with confidence intervals.	Confidence intervals describe uncertainty in the mean, not the full spread of individual observations.
Correlation heatmap	Multiple numeric variables.	Display correlation patterns across a set of numeric variables.	Useful for screening multicollinearity and clusters of related variables.
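
The note that a confidence interval describes uncertainty in the mean and not the spread of observations can be checked numerically. The Python sketch below computes a 95% interval for a mean using a normal critical value. That is a simplifying assumption of the sketch: the guide does not say which method StatDesk uses, and a t-based critical value gives a wider interval for small samples. The function name and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_ci(values, level=0.95):
    """Normal-approximation confidence interval for the mean: m +/- z * SE."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / math.sqrt(n)          # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return m - z * se, m + z * se

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
low, high = mean_ci(data)
ci_width = high - low
data_range = max(data) - min(data)
```

Here the interval is roughly 3 units wide while the observations span 7 units, so several individual values fall outside the interval even though the interval is doing its job.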

Plot Labeling and Export Practices

Use axis titles when variable names are abbreviated, coded, or not reader-friendly.
Use group/color variables only when grouping makes the plot easier to interpret.
Exported PNG files are convenient for documentation and slides.

Application Feature Index

This index summarizes where major StatDesk functions appear in the application and what each function does. It is intended as a quick reference for users who know the task they want to perform but are not sure which tab or tool to open.

StatDesk application feature index
Function	Where to find it	What it does
Data editing	Data View	Use Edit data, edit cells directly, add or remove rows and variables, reorder variables from the Variables panel, and use Undo/Redo for recent edits.
Sample data	Data View	Load the iris sample for practice or demonstration.
File import	Data View	Import CSV, TSV/TXT, JSON, StatDesk JSON, XLSX, Excel XML/HTML tables, SPSS SAV, Stata DTA, SPSS syntax, or Stata do-files.
Variable labels	Variable labels	Add readable labels that can be preserved in StatDesk JSON, SPSS syntax, or Stata do-file exports.
Missing value recoding	Missing values	Convert selected indicators to empty cells, choose the recode scope, use case-sensitive matching if needed, preview counts, and apply recoding.
Calculated variables	Calculate variables	Create new variables from R-like formulas using arithmetic, comparisons, logic, math functions, summaries, conditional logic, missing-value functions, and text functions.
Data export	Export data	Export CSV, TSV, JSON rows, StatDesk JSON + labels, HTML table, Excel XML, SPSS syntax + labels, or Stata do-file + labels.
Session saving	Save	Save the session in the browser, download an application state file, load a saved state file, or clear the saved browser session.
Output review	Output	Review analysis output, clear output, or export output as a PDF.
Validation checks	Accuracy tests	Run validation tests or clear validation test results.
Analysis setup	Analysis Setup	Select variables and configure the currently selected statistical analysis.
R code examples	Analysis Setup	Review analogous R code shown for many analysis setup screens.
Plots	Plots	Create histograms, density plots, Q-Q plots, scatterplots, line charts, bar charts, pie charts, boxplots, mean plots with 95% CI, and correlation heatmaps.
Plot export	Plots	Export the current plot as a PNG or clear the current plot.
Version and citation	About	View version information, recommended citation, BibTeX entry, author information, and limitations.
Help and disclaimer	Help	Review quick-start help, formula syntax, supported formats, accuracy-test notes, and legal/use disclaimer.

Troubleshooting

This section collects common issues users may encounter while working in a browser-based statistical application.

StatDesk troubleshooting reference
Issue	Likely cause	Recommended response
A variable is missing from an analysis selector.	The variable may have the wrong type, may contain nonnumeric text, or may not be eligible for the selected analysis role.	Check the Variables panel, inspect the raw values, and clean or recode the variable if needed.
A numeric variable imported as text.	The column may contain commas, symbols, text labels, or missing indicators such as NA or 999.	Review values in Data View, recode missing indicators, and consider creating a cleaned numeric version if appropriate.
Categories look duplicated.	Levels may differ by capitalization, spaces, spelling, or punctuation.	Clean category labels before running frequencies, chi-square tests, grouped summaries, or plots.
The missing value recode count is unexpected.	The indicators may be too broad, case-sensitive matching may be wrong, or the recode scope may include too many variables.	Do not apply the recode until the preview count makes sense.
An analysis will not run.	Required fields may be incomplete, variable types may not match the analysis, or the dataset may not contain enough valid observations.	Return to Analysis Setup, complete required fields, and inspect missingness and variable type settings.
A plot is blank or unclear.	The selected variables may be incompatible with the plot type, missing, or too sparse.	Choose variables that match the plot requirements and check whether the data contain valid values.
Output seems surprising.	The model, coding, missing-data handling, or variable roles may not match the intended analysis.	Check descriptive summaries, plots, variable coding, and the analogous R code. Verify important results externally.
Browser autosave is gone.	Private browsing, clearing site data, storage quotas, or mobile browser cleanup can remove the browser-stored session.	Use Save application state file for long-term backup.
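
For the "Categories look duplicated" issue in the table above, a quick external check can list levels that differ only by capitalization or spacing. The Python sketch below is one possible approach with a hypothetical function name and sample values. It does not catch spelling or punctuation differences, which still need manual review.

```python
from collections import defaultdict

def find_near_duplicate_levels(values):
    """Group levels that match after collapsing whitespace and ignoring case."""
    groups = defaultdict(set)
    for value in values:
        key = " ".join(value.split()).casefold()
        groups[key].add(value)
    # Keep only the normalized keys that map to more than one raw spelling.
    return {key: sorted(raw) for key, raw in groups.items() if len(raw) > 1}

levels = ["Treatment", "treatment", " Treatment", "Control", "control ", "Placebo"]
duplicates = find_near_duplicate_levels(levels)
```

Each key in the result is a normalized level and each list shows the raw spellings that would otherwise be counted as separate categories in frequencies, chi-square tests, grouped summaries, or plots.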

Frequently Asked Questions

Does StatDesk upload my data to a server? StatDesk is a client-side web application and data are processed in the browser rather than uploaded to a StatDesk server. Browser autosave is local-only storage using IndexedDB. Users should still follow institutional data-security rules and avoid loading untrusted state files.

When should I use StatDesk JSON? Use StatDesk JSON when you want to preserve data, labels, and variable type settings for later use in StatDesk. For broad compatibility, CSV or TSV may be better. For SPSS or Stata handoff with labels, use syntax exports.

Which export formats preserve labels? Use StatDesk JSON + labels, SPSS syntax + labels, or Stata do-file + labels when variable labels need to be preserved. CSV and TSV are broadly compatible, but choose them only when labels are not needed or will be rebuilt elsewhere.

What is saved in an application state file? The application state file contains the current data, labels, variable type settings, output, selected analysis, paging, tabs, plot settings, and StatDesk version information.

Why is my variable not available for a plot or analysis? Most often, the variable type or contents do not match the selected procedure. Check whether the variable is numeric, categorical, missing, or imported as text.

How do I export a plot? Open the Plots tab, draw the plot, and use Export PNG to download the current plot as an image file.

How do I export analysis output? Open the Output tab after running an analysis and use Export output PDF. Use Clear output when you want to remove the current output from the display.

Glossary

Application state file: A portable JSON file that stores the current StatDesk session, including data, output, labels, settings, selected analysis, paging, tabs, plot settings, and version information.

Autosave: A browser-based local save mechanism intended to restore the most recent session when browser storage is available.

Categorical variable: A variable whose values represent groups, labels, or categories rather than measured numeric quantities.

Confidence interval: A range of values computed from the sample that, under the assumed statistical model, would contain the population parameter in a stated proportion of repeated samples (for example, 95%).

Data View: The StatDesk workspace for directly viewing and editing data.

DPMO: Defects per million opportunities, a Lean Six Sigma quality metric based on defect, unit, and opportunity counts.

GLM: Generalized linear model, a flexible modeling framework for outcomes that may not be normally distributed.

Missing indicator: A code such as NA, 999, -99, refused, or unknown that should be treated as missing rather than a real observed value.

Positive condition: The outcome category treated as the event or positive case in diagnostic accuracy and ROC/AUC analyses.

Variable label: A readable description attached to a variable to make output and exports easier to interpret.

VIF: Variance inflation factor, a diagnostic used to assess multicollinearity among predictors.

Citation and Version Information

The version used for this guide is Version 0.9.1, dated June 19, 2026. The recommended citation shown in the application is:

Harris, M. (2026). StatsWithR StatDesk (Version 0.9.1) [Computer software]. StatsWithR.com. https://www.statswithr.com/statdesk

The BibTeX entry is:

@software{harris2026statdesk,
author = {Harris, Michael},
title = {StatsWithR StatDesk},
version = {0.9.1},
year = {2026},
url = {https://www.statswithr.com/statdesk},
note = {Statistical analysis software by Michael Harris, MS, MAS.}
}

Limitations and User Responsibilities

StatDesk is beta software, and results are experimental.
Important findings should be verified against trusted statistical software and reviewed by a qualified analyst.
Users remain responsible for source data review, study design, missing-data decisions, model assumptions, interpretation, reporting decisions, and final quality control.
