Summary Tables

You can create several kinds of summary tables from your data with the SummaryTables package. SummaryTables supports HTML and LaTeX output.

To write the output of a SummaryTables function to a file, call show on it with the appropriate MIME type:

tbl = table_one(...)
# save as HTML
open("tbl.html", "w") do io
    show(io, MIME"text/html"(), tbl)
end
# save as LaTeX code
open("tbl.tex", "w") do io
    show(io, MIME"text/latex"(), tbl)
end
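
If you save tables often, the two blocks above can be folded into one small function. The following is only a sketch: save_table is a hypothetical helper, not part of SummaryTables, and it assumes that the file extension (.html or .tex) is a reliable indicator of the desired format.

```julia
# Hypothetical helper (not part of SummaryTables): pick the MIME type
# from the file extension and write the rendered object to `path`.
function save_table(path::AbstractString, tbl)
    ext = lowercase(splitext(path)[2])
    mime = if ext == ".html"
        MIME"text/html"()
    elseif ext == ".tex"
        MIME"text/latex"()
    else
        throw(ArgumentError("unsupported extension: $ext"))
    end
    open(path, "w") do io
        show(io, mime, tbl)
    end
    return path
end
```

The helper accepts any object that has a show method for the chosen MIME type, so calling save_table("tbl.html", tbl) with the result of table_one should behave like the first block above.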

table_one

SummaryTables.table_one - Function
table_one(table, analyses; keywords...)

Construct a "Table 1" which summarises the patient baseline characteristics from the provided table dataset. This table is commonly used in biomedical research papers.

It handles both continuous and categorical columns in table, and the summary statistics and hypothesis tests can be customised by the user. Tables can be stratified by one or more variables using the groupby keyword.

Keywords

  • groupby: Which columns to stratify the dataset with, as a Vector{Symbol}.
  • nonnormal: A vector of column names where hypothesis tests for the :nonnormal type are chosen.
  • minmax: A vector of column names where hypothesis tests for the :minmax type are chosen.
  • tests: A NamedTuple of hypothesis test types to use for categorical, nonnormal, minmax, and normal variables.
  • combine: An object from MultipleTesting to use when combining p-values.
  • show_total: Display the total column summary. Default is true.
  • group_totals: A group Symbol or vector of symbols specifying for which group levels totals should be added. Any group levels but the topmost can be chosen (the topmost being already handled by the show_total option). Default is Symbol[].
  • total_name: The name for all total columns. Default is "Total".
  • show_n: Display the number of rows for each group key next to its label.
  • show_pvalues: Display the P-Value column. Default is false.
  • show_testnames: Display the Test column. Default is false.
  • show_confints: Display the CI column. Default is false.
  • sort: Sort the input table before grouping. Default is true. Pre-sort as desired and set to false when you want to maintain a specific group order or are using non-sortable objects as group keys.

Deprecated keywords

  • show_overall: Use show_total instead

All other keywords are forwarded to the Table constructor; refer to its docstring for details.
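
As a rough sketch of how some of these keywords combine, the call below stratifies by two variables and requests totals for the inner level. It uses only keywords listed above. The data is a reduced copy of the dataset used elsewhere on this page, and the rendered output is not shown because it has not been verified here.

```julia
using SummaryTables

tbl = [
    :sex => ["m", "f", "m", "f", "m", "f", "m", "f"],
    :age => [14, 36, 35, 63, 83, 23, 24, 26],
    :dose => [50, 50, 50, 50, 100, 100, 100, 100],
]

t = table_one(
    tbl,
    [:age => "Age (years)"];
    groupby = [:dose, :sex],  # Vector{Symbol}: stratify by dose, then by sex
    group_totals = :sex,      # totals for the inner (non-topmost) level
    total_name = "Overall",   # label used for all total columns
    show_n = true,            # show the row count next to each group label
)
```

Because :dose is the topmost level here, its total column is controlled by show_total rather than by group_totals.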

Example with table_one

using SummaryTables
using Statistics

tbl = [
    :id => [1, 2, 3, 4, 5, 6, 7, 8],
    :sex => ["m", "f", "m", "f", "m", "f", "m", "f"],
    :age => [14, 36, 35, 63, 83, 23, 24, 26],
    :wgt => [52.3, 65.8, missing, 34.2, 80.2, 77.9, 55.0, 66.7],
    :dose => [50, 50, 50, 50, 100, 100, 100, 100],
    :group => [1, 1, 1, 1, 2, 2, 2, 2],
]

table_one(
    tbl,
    [:sex => "Sex", :age => "Age (years)", :wgt => "Weight (kg)"];
    groupby = :group,
)
                                        group
                     Total              1                  2
Sex
  f                  4 (50%)            2 (50%)            2 (50%)
  m                  4 (50%)            2 (50%)            2 (50%)
Age (years)
  Mean (SD)          38 (23.3)          37 (20.1)          39 (29.4)
  Median [Min, Max]  30.5 [14, 83]      35.5 [14, 63]      25 [23, 83]
Weight (kg)
  Mean (SD)          61.7 (16)          50.8 (15.9)        70 (11.6)
  Median [Min, Max]  65.8 [34.2, 80.2]  52.3 [34.2, 65.8]  72.3 [55, 80.2]
  Missing            1 (12.5%)          1 (25%)            0 (0%)

listingtable

SummaryTables.listingtable - Function
listingtable(table, variable, [pagination];
    rows = [],
    cols = [],
    summarize_rows = [],
    summarize_cols = [],
    variable_header = true,
    table_kwargs...
)

Create a listing table Table from table which displays raw values from column variable.

Arguments

  • table: Data source which must be convertible to a DataFrames.DataFrame.
  • variable: Determines which variable's raw values are shown. Can either be a Symbol such as :ColumnA, or alternatively a Pair where the second element is the display name, such as :ColumnA => "Column A".
  • pagination::Pagination: If a pagination object is passed, the return type changes to PaginatedTable. The Pagination object may be created with keywords rows and/or cols. These must be set to Ints that determine how many group sections along each side are included in one page. These group sections are determined by the summary structure, because pagination never splits a listing table within rows or columns that are being summarized together. If summarize_rows or summarize_cols is empty or unset, each group along that side is its own section. If summarize_rows or summarize_cols has a group passed via the column => ... syntax, the group sections along that side are determined by column. If no such column is passed (i.e., the summary along that side applies to all groups), there is only one section along that side, which means that this side cannot be paginated into more than one page.

Keyword arguments

  • rows = []: Grouping structure along the rows. Should be a Vector where each element is a grouping variable, specified as a Symbol such as :Column1, or a Pair, where the first element is the symbol and the second a display name, such as :Column1 => "Column 1". Specifying multiple grouping variables creates nested groups, with the last variable changing the fastest.
  • cols = []: Grouping structure along the columns. Follows the same structure as rows.
  • summarize_rows = []: Specifies functions to summarize variable with along the rows. Should be a Vector, where each entry is one separate summary. Each summary can be given as a Function such as mean or maximum, in which case the display name is the function's name. Alternatively, a display name can be given using the pair syntax, such as mean => "Average". By default, one summary is computed over all groups. You can also pass Symbol => [...] where Symbol is a grouping column, to compute one summary for each level of that group.
  • summarize_cols = []: Specifies functions to summarize variable with along the columns. Follows the same structure as summarize_rows.
  • variable_header = true: Controls if the cell with the name of the summarized variable is shown.
  • sort = true: Sort the input table before grouping. Pre-sort as desired and set to false when you want to maintain a specific group order or are using non-sortable objects as group keys.

All other keywords are forwarded to the Table constructor; refer to its docstring for details.
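
The pagination argument described above can be sketched as follows. This is a minimal illustration that assumes Pagination is exported by SummaryTables and takes the rows keyword as described; the data is a reduced copy of the patient dataset used elsewhere on this page.

```julia
using SummaryTables
using Statistics

tbl = [
    :id => [1, 2, 3, 4, 5, 6, 7, 8],
    :sex => ["m", "f", "m", "f", "m", "f", "m", "f"],
    :wgt => [52.3, 65.8, missing, 34.2, 80.2, 77.9, 55.0, 66.7],
    :dose => [50, 50, 50, 50, 100, 100, 100, 100],
]

# Summaries are computed per :sex group, so each dose/sex combination
# forms one section; rows = 2 asks for two such sections per page.
paginated = listingtable(
    tbl,
    :wgt => "Weight (kg)",
    Pagination(rows = 2);
    rows = [:dose => "Dose (mg)", :sex => "Sex", :id => "ID"],
    summarize_rows = :sex => [mean ∘ skipmissing => "Mean"],
)
```

With four dose/sex sections and two sections per page, the rules above suggest that the result is split into two pages. If summarize_rows were given as a plain vector without the :sex => prefix, there would be a single section along the rows and the table could not be split along that side.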

Example

using Statistics

tbl = [
    :Apples => [1, 2, 3, 4, 5, 6, 7, 8],
    :Batch => [1, 1, 1, 1, 2, 2, 2, 2],
    :Checked => [true, false, true, false, true, false, true, false],
    :Delivery => ['a', 'a', 'b', 'b', 'a', 'a', 'b', 'b'],
]

listingtable(
    tbl,
    :Apples => "Number of apples",
    rows = [:Batch, :Checked => "Checked for spots"],
    cols = [:Delivery],
    summarize_cols = [sum => "total"],
    summarize_rows = :Batch => [mean => "average", sum]
)

Example with listingtable

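# `tbl` is the patient dataset from the table_one example above,
# not the apples data from the docstring example.
listingtable(
    tbl,
    :wgt => "Weight (kg)";
    rows = [:dose => "Dose (mg)", :sex => "Sex", :id => "ID"],
    summarize_rows = :sex =>
        [mean ∘ skipmissing => "Mean", (x -> count(!ismissing, x)) => "Nonmissing"],
)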
Dose (mg)  Sex  ID          Weight (kg)
50         f    2           65.8
                4           34.2
                Mean        50
                Nonmissing  2
50         m    1           52.3
                3           missing
                Mean        52.3
                Nonmissing  1
100        f    6           77.9
                8           66.7
                Mean        72.3
                Nonmissing  2
100        m    5           80.2
                7           55
                Mean        67.6
                Nonmissing  2
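
The two summary functions in this example are ordinary Julia functions applied to the values of one group. The sketch below evaluates them by hand, outside SummaryTables, on the weights of the male patients in the 50 mg group (IDs 1 and 3); the variable names are chosen here for illustration only.

```julia
using Statistics

# Weights of IDs 1 and 3 (dose 50, sex "m"); one value is missing.
wgt = [52.3, missing]

mean_nonmissing = mean ∘ skipmissing     # drop missing values, then average
nonmissing = x -> count(!ismissing, x)   # number of non-missing values

mean(wgt)             # missing, because missing propagates through mean
mean_nonmissing(wgt)  # 52.3
nonmissing(wgt)       # 1

# The pair syntax attaches a display name to each summary function.
summaries = [mean_nonmissing => "Mean", nonmissing => "Nonmissing"]
```

The values 52.3 and 1 match the Mean and Nonmissing rows shown for that group in the listing above. Without skipmissing, the mean for this group would be missing.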

summarytable

SummaryTables.summarytable - Function
summarytable(table, variable;
    rows = [],
    cols = [],
    summary = [],
    variable_header = true,
    celltable_kws...
)

Create a summary table Table from table, which summarizes values from column variable.

Arguments

  • table: Data source which must be convertible to a DataFrames.DataFrame.
  • variable: Determines which variable from table is summarized. Can either be a Symbol such as :ColumnA, or alternatively a Pair where the second element is the display name, such as :ColumnA => "Column A".

Keyword arguments

  • rows = []: Grouping structure along the rows. Should be a Vector where each element is a grouping variable, specified as a Symbol such as :Column1, or a Pair, where the first element is the symbol and the second a display name, such as :Column1 => "Column 1". Specifying multiple grouping variables creates nested groups, with the last variable changing the fastest.
  • cols = []: Grouping structure along the columns. Follows the same structure as rows.
  • summary = []: Specifies functions to summarize variable with. Should be a Vector, where each entry is one separate summary. Each summary can be given as a Function such as mean or maximum, in which case the display name is the function's name. Alternatively, a display name can be given using the pair syntax, such as mean => "Average". By default, one summary is computed over all groups. You can also pass Symbol => [...] where Symbol is a grouping column, to compute one summary for each level of that group.
  • variable_header = true: Controls if the cell with the name of the summarized variable is shown.
  • sort = true: Sort the input table before grouping. Pre-sort as desired and set to false when you want to maintain a specific group order or are using non-sortable objects as group keys.

All other keywords are forwarded to the Table constructor; refer to its docstring for details.
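
The sort keyword can be illustrated with a pre-sorted input. The sketch below assumes that DataFrames is installed, which this page does not otherwise require, and uses it only to order the rows before the call. The data is a reduced copy of the patient dataset used elsewhere on this page.

```julia
using SummaryTables
using DataFrames
using Statistics

df = DataFrame(
    dose = [50, 50, 50, 50, 100, 100, 100, 100],
    wgt = [52.3, 65.8, missing, 34.2, 80.2, 77.9, 55.0, 66.7],
)

# Put the 100 mg group first; sorting in ascending order would list 50 first.
sort!(df, :dose, rev = true)

t = summarytable(
    df,
    :wgt => "Weight (kg)";
    rows = [:dose => "Dose (mg)"],
    summary = [mean ∘ skipmissing => "Mean"],
    sort = false,  # keep the row order of `df` instead of sorting the groups
)
```

With sort = true (the default) the pre-sorting would have no effect, because the input is sorted again before grouping.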

Example

using Statistics

tbl = [
    :Apples => [1, 2, 3, 4, 5, 6, 7, 8],
    :Batch => [1, 1, 1, 1, 2, 2, 2, 2],
    :Delivery => ['a', 'a', 'b', 'b', 'a', 'a', 'b', 'b'],
]

summarytable(
    tbl,
    :Apples => "Number of apples",
    rows = [:Batch],
    cols = [:Delivery],
    summary = [length => "N", mean => "average", sum]
)

Example with summarytable

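# `tbl` is the patient dataset from the table_one example above,
# not the apples data from the docstring example.
summarytable(
    tbl,
    :wgt => "Weight (kg)";
    rows = [:sex => "Sex"],
    cols = [:dose => "Dose (mg)"],
    summary = [mean ∘ skipmissing => "Mean", (x -> count(!ismissing, x)) => "Nonmissing"],
)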
                 Dose (mg)
                 50    100
Sex              Weight (kg)
f    Mean        50    72.3
     Nonmissing  2     2
m    Mean        52.3  67.6
     Nonmissing  1     2