Data Manipulation
Base.joinDataValueTables.meltDataValueTables.meltdtDataValueTables.stackDataValueTables.stackdtDataValueTables.unstack
Joins
Base.join — Function.Join two DataValueTables
join(dt1::AbstractDataValueTable,
dt2::AbstractDataValueTable;
on::Union{Symbol, Vector{Symbol}} = Symbol[],
kind::Symbol = :inner)Arguments
dt1,dt2: the two AbstractDataValueTables to be joined
Keyword Arguments
on: a Symbol or Vector{Symbol}, the column(s) used as keys when joining; required argument except forkind = :crosskind: the type of join, options include::inner: only include rows with keys that match in bothdt1anddt2, the default:outer: include all rows fromdt1anddt2:left: include all rows fromdt1:right: include all rows fromdt2:semi: return rows ofdt1that match with the keys indt2:anti: return rows ofdt1that do not match with the keys indt2:cross: a full Cartesian product of the key combinations; every row ofdt1is matched with every row ofdt2
For the three join operations that may introduce missing values (:outer, :left, and :right), all columns of the returned data table will be nullable.
Result
::DataValueTable: the joined DataValueTable
Examples
name = DataValueTable(ID = [1, 2, 3], Name = ["John Doe", "Jane Doe", "Joe Blogs"])
job = DataValueTable(ID = [1, 2, 4], Job = ["Lawyer", "Doctor", "Farmer"])
join(name, job, on = :ID)
join(name, job, on = :ID, kind = :outer)
join(name, job, on = :ID, kind = :left)
join(name, job, on = :ID, kind = :right)
join(name, job, on = :ID, kind = :semi)
join(name, job, on = :ID, kind = :anti)
join(name, job, kind = :cross)Reshaping
DataValueTables.melt — Function.Stacks a DataValueTable; convert from a wide to long format; see stack.
DataValueTables.stack — Function.Stacks a DataValueTable; convert from a wide to long format
stack(dt::AbstractDataValueTable, [measure_vars], [id_vars];
variable_name::Symbol=:variable, value_name::Symbol=:value)
melt(dt::AbstractDataValueTable, [id_vars], [measure_vars];
variable_name::Symbol=:variable, value_name::Symbol=:value)Arguments
dt: the AbstractDataValueTable to be stackedmeasure_vars: the columns to be stacked (the measurement variables), a normal column indexing type, like a Symbol, Vector{Symbol}, Int, etc.; formelt, defaults to all variables that are notid_vars. If neithermeasure_varsorid_varsare given,measure_varsdefaults to all floating point columns.id_vars: the identifier columns that are repeated during stacking, a normal column indexing type; forstackdefaults to all variables that are notmeasure_varsvariable_name: the name of the new stacked column that shall hold the names of each ofmeasure_varsvalue_name: the name of the new stacked column containing the values from each ofmeasure_vars
Result
::DataValueTable: the long-format datavaluetable with column:valueholding the values of the stacked columns (measure_vars), with column:variablea Vector of Symbols with themeasure_varsname, and with columns for each of theid_vars.
See also stackdt and meltdt for stacking methods that return a view into the original DataValueTable. See unstack for converting from long to wide format.
Examples
d1 = DataValueTable(a = repeat([1:3;], inner = [4]),
b = repeat([1:4;], inner = [3]),
c = randn(12),
d = randn(12),
e = map(string, 'a':'l'))
d1s = stack(d1, [:c, :d])
d1s2 = stack(d1, [:c, :d], [:a])
d1m = melt(d1, [:a, :b, :e])
d1s_name = melt(d1, [:a, :b, :e], variable_name=:somemeasure)DataValueTables.unstack — Function.Unstacks a DataValueTable; convert from a long to wide format
unstack(dt::AbstractDataValueTable, rowkey, colkey, value)
unstack(dt::AbstractDataValueTable, colkey, value)
unstack(dt::AbstractDataValueTable)Arguments
dt: the AbstractDataValueTable to be unstackedrowkey: the column with a unique key for each row, if not given, find a key by grouping on anything not acolkeyorvaluecolkey: the column holding the column names in wide format, defaults to:variablevalue: the value column, defaults to:value
Result
::DataValueTable: the wide-format datavaluetable
Examples
wide = DataValueTable(id = 1:12,
a = repeat([1:3;], inner = [4]),
b = repeat([1:4;], inner = [3]),
c = randn(12),
d = randn(12))
long = stack(wide)
wide0 = unstack(long)
wide1 = unstack(long, :variable, :value)
wide2 = unstack(long, :id, :variable, :value)Note that there are some differences between the widened results above.
DataValueTables.stackdt — Function.A stacked view of a DataValueTable (long format)
Like stack and melt, but a view is returned rather than data copies.
stackdt(dt::AbstractDataValueTable, [measure_vars], [id_vars];
variable_name::Symbol=:variable, value_name::Symbol=:value)
meltdt(dt::AbstractDataValueTable, [id_vars], [measure_vars];
variable_name::Symbol=:variable, value_name::Symbol=:value)Arguments
dt: the wide AbstractDataValueTablemeasure_vars: the columns to be stacked (the measurement variables), a normal column indexing type, like a Symbol, Vector{Symbol}, Int, etc.; formelt, defaults to all variables that are notid_varsid_vars: the identifier columns that are repeated during stacking, a normal column indexing type; forstackdefaults to all variables that are notmeasure_vars
Result
::DataValueTable: the long-format datavaluetable with column:valueholding the values of the stacked columns (measure_vars), with column:variablea Vector of Symbols with themeasure_varsname, and with columns for each of theid_vars.
The result is a view because the columns are special AbstractVectors that return indexed views into the original DataValueTable.
Examples
d1 = DataValueTable(a = repeat([1:3;], inner = [4]),
b = repeat([1:4;], inner = [3]),
c = randn(12),
d = randn(12),
e = map(string, 'a':'l'))
d1s = stackdt(d1, [:c, :d])
d1s2 = stackdt(d1, [:c, :d], [:a])
d1m = meltdt(d1, [:a, :b, :e])DataValueTables.meltdt — Function.A stacked view of a DataValueTable (long format); see stackdt