Utilities
Base.dump
Base.unique
DataValueTables.completecases
DataValueTables.eltypes
DataValueTables.head
DataValueTables.names!
DataValueTables.nonunique
DataValueTables.rename
DataValueTables.rename!
DataValueTables.tail
DataValueTables.unique!
DataValues.dropna
DataValues.dropna!
StatsBase.describe
DataValueTables.eltypes
— Function. Return the element types of the columns.
eltypes(dt::AbstractDataValueTable)
Arguments
dt: the AbstractDataValueTable
Result
::Vector{Type}: the element type of each column
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
eltypes(dt)
DataValueTables.head
— Function. Show the first or last part of an AbstractDataValueTable.
head(dt::AbstractDataValueTable, r::Int = 6)
tail(dt::AbstractDataValueTable, r::Int = 6)
Arguments
dt: the AbstractDataValueTable
r: the number of rows to show
Result
::AbstractDataValueTable: the first or last part of dt
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
head(dt)
tail(dt)
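The row selection that head and tail describe can be sketched in plain Base Julia. This is only an illustration under stated assumptions: the table is modelled as a NamedTuple of column vectors, and `nrows`, `sketch_head`, and `sketch_tail` are hypothetical helpers, not part of DataValueTables. Clamping `r` to the table length is also an assumption of this sketch.

```julia
# Hypothetical stand-in for a table: a NamedTuple of equal-length column vectors.
table = (i = collect(1:10), x = collect(1:10) ./ 10)

# Row count, read from the first column.
nrows(t) = length(first(t))

# Keep the first or last r rows of every column, clamped to the table length.
sketch_head(t, r::Int = 6) = map(col -> col[1:min(r, nrows(t))], t)
sketch_tail(t, r::Int = 6) = map(col -> col[max(1, nrows(t) - r + 1):end], t)

h = sketch_head(table)           # rows 1 to 6 (the default r)
tl = sketch_tail(table, 3)       # rows 8 to 10
whole = sketch_head(table, 50)   # asking for more rows than exist keeps all 10
```

Both helpers return a new NamedTuple and leave `table` untouched, which mirrors the documented result type (a table, not a printed display).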
DataValueTables.completecases
— Function. Indexes of complete cases (rows without null values).
completecases(dt::AbstractDataValueTable)
Arguments
dt: the AbstractDataValueTable
Result
::Vector{Bool}: indexes of complete cases
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
dt[[1,4,5], :x] = DataValue()
dt[[9,10], :y] = DataValue()
completecases(dt)
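The idea of a complete-case mask can be sketched without the package. In this illustration `missing` from Base Julia stands in for a null DataValue, and `sketch_completecases` is a hypothetical helper that takes the columns directly; it is not the library's implementation.

```julia
# Hypothetical stand-in columns; `missing` plays the role of a null DataValue.
x = [missing, 0.2, 0.3, missing, missing, 0.6]
y = ["a", "b", missing, "a", "c", "b"]

# A row is complete when none of its entries is missing.
function sketch_completecases(cols::AbstractVector...)
    n = length(first(cols))
    mask = trues(n)                    # start by assuming every row is complete
    for col in cols
        mask .&= .!ismissing.(col)     # clear the rows where this column is null
    end
    return Vector{Bool}(mask)          # one Bool per row
end

mask = sketch_completecases(x, y)
```

Only rows 2 and 6 have a value in both columns, so those are the only `true` entries in `mask`.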
StatsBase.describe
— Function. Summarize the columns of an AbstractDataValueTable.
describe(dt::AbstractDataValueTable)
describe(io, dt::AbstractDataValueTable)
Arguments
dt: the AbstractDataValueTable
io: optional output descriptor
Result
nothing
Details
If the column's base type derives from Number, compute the minimum, first quartile, median, mean, third quartile, and maximum. Nulls are filtered out before these statistics are computed and are reported separately.
For boolean columns, report the number of trues, falses, and nulls.
For other types, show column characteristics and the number of nulls.
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
describe(dt)
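The numeric summary listed under Details can be approximated with the Statistics standard library. This is a sketch under assumptions: `missing` stands in for a null DataValue, `sketch_numeric_summary` is a hypothetical helper, and the quartiles use Julia's default `quantile` definition, which may differ from what describe prints.

```julia
using Statistics  # standard library: mean, median, quantile

# Hypothetical helper, not the library's describe: summarize one numeric column.
function sketch_numeric_summary(col)
    vals = collect(skipmissing(col))          # nulls are filtered out first
    return (min = minimum(vals),
            q1 = quantile(vals, 0.25),
            median = median(vals),
            mean = mean(vals),
            q3 = quantile(vals, 0.75),
            max = maximum(vals),
            nulls = count(ismissing, col))    # nulls are counted separately
end

s = sketch_numeric_summary([1.0, missing, 2.0, 3.0, 4.0, missing, 5.0])
```

The statistics are computed over the five non-null values only, while `s.nulls` records the two entries that were dropped.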
DataValues.dropna
— Function. Remove rows with null values.
dropna(dt::AbstractDataValueTable)
Arguments
dt: the AbstractDataValueTable
Result
::AbstractDataValueTable: the updated copy
See also completecases and dropna!.
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
dt[[1,4,5], :x] = DataValue()
dt[[9,10], :y] = DataValue()
dropna(dt)
dropna(X::AbstractVector)
Return a vector containing only the non-missing entries of X, unwrapping DataValue entries. A copy is always returned, even when X does not contain any missing values.
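The copy semantics of the vector method can be illustrated in Base Julia. Here `missing` is used as a stand-in for a null DataValue, and `sketch_dropna` is a hypothetical helper built on `skipmissing`; it only mirrors the behaviour described above.

```julia
# Sketch only: `missing` stands in for a null DataValue.
sketch_dropna(X::AbstractVector) = collect(skipmissing(X))

X = [1, missing, 3]
Y = [1, 2, 3]

A = sketch_dropna(X)   # non-missing entries, element type narrowed to Int
B = sketch_dropna(Y)   # same contents as Y, but a freshly allocated vector
```

Because `B` is a separate vector, mutating it afterwards cannot affect `Y`, which is the practical consequence of "a copy is always returned".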
DataValues.dropna!
— Function. Remove rows with null values in-place.
dropna!(dt::AbstractDataValueTable)
Arguments
dt: the AbstractDataValueTable
Result
::AbstractDataValueTable: the updated version
See also dropna and completecases.
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
dt[[1,4,5], :x] = DataValue()
dt[[9,10], :y] = DataValue()
dropna!(dt)
dropna!(X::AbstractVector)
Remove missing entries of X in-place and return a Vector view of the unwrapped DataValue entries. If no missing values are present, this is a no-op and X is returned.
dropna!(X::DataValueVector)
Remove missing entries of X in-place and return a Vector view of the unwrapped DataValue entries.
Base.dump
— Function. Show the structure of an AbstractDataValueTable in a tree-like format.
dump(dt::AbstractDataValueTable, n::Int = 5)
dump(io::IO, dt::AbstractDataValueTable, n::Int = 5)
Arguments
dt: the AbstractDataValueTable
n: the number of levels to show
io: optional output descriptor
Result
nothing
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
dump(dt)
DataValueTables.names!
— Function. Set column names.
names!(dt::AbstractDataValueTable, vals; allow_duplicates = false)
Arguments
dt: the AbstractDataValueTable
vals: column names, normally a Vector{Symbol} the same length as the number of columns in dt
allow_duplicates: if false (the default), an error is raised when duplicate names are found; if true, duplicate names are suffixed with _i (i starting at 1 for the first duplicate)
Result
::AbstractDataValueTable: the updated result
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
names!(dt, [:a, :b, :c])
names!(dt, [:a, :b, :a]) # throws ArgumentError
names!(dt, [:a, :b, :a], allow_duplicates=true) # renames second :a to :a_1
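The allow_duplicates rule can be sketched as a small name-deduplication routine. `sketch_make_unique` is a hypothetical helper written for this illustration; the package's own handling of corner cases (for example, a suffixed name that already exists later in the list) may differ.

```julia
# Hypothetical helper illustrating the documented allow_duplicates behaviour.
function sketch_make_unique(names::Vector{Symbol}; allow_duplicates::Bool = false)
    seen = Set{Symbol}()
    out = copy(names)
    for (idx, nm) in enumerate(names)
        if nm in seen
            # Default behaviour: duplicates are an error.
            allow_duplicates || throw(ArgumentError("Duplicate variable names: $nm"))
            k = 1
            # Try nm_1, nm_2, ... until an unused name is found.
            while Symbol(nm, "_", k) in seen
                k += 1
            end
            out[idx] = Symbol(nm, "_", k)
        end
        push!(seen, out[idx])
    end
    return out
end

plain = sketch_make_unique([:a, :b, :c])
suffixed = sketch_make_unique([:a, :b, :a]; allow_duplicates = true)
```

With `allow_duplicates = true` the second `:a` becomes `:a_1`, matching the comment in the example above; without it the same input throws an ArgumentError.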
DataValueTables.nonunique
— Function. Indexes of duplicate rows (a row that is a duplicate of a prior row).
nonunique(dt::AbstractDataValueTable)
nonunique(dt::AbstractDataValueTable, cols)
Arguments
dt: the AbstractDataValueTable
cols: a column indicator (Symbol, Int, Vector{Symbol}, etc.) specifying the column(s) to compare
Result
::Vector{Bool}: indicates whether the row is a duplicate of some prior row
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
dt = vcat(dt, dt)
nonunique(dt)
nonunique(dt, 1)
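The "duplicate of a prior row" rule can be sketched with a set of rows seen so far. In this illustration rows are modelled as tuples and `sketch_nonunique` is a hypothetical helper; restricting the comparison to some columns is modelled by projecting each row onto those columns first.

```julia
# Sketch: flag every row that is equal to some earlier row.
function sketch_nonunique(rows::AbstractVector)
    seen = Set{eltype(rows)}()
    flags = Vector{Bool}(undef, length(rows))
    for (idx, row) in enumerate(rows)
        flags[idx] = row in seen    # true only if an equal row appeared earlier
        push!(seen, row)
    end
    return flags
end

rows = [(1, "a"), (2, "b"), (1, "a"), (3, "a"), (2, "b")]

all_cols = sketch_nonunique(rows)                       # compare whole rows
second_col = sketch_nonunique([r[2] for r in rows])     # compare column 2 only
```

The first occurrence of a row is never flagged, so indexing with the negated mask keeps exactly one copy of each distinct row.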
DataValueTables.rename
— Function. Rename columns.
rename!(dt::AbstractDataValueTable, from::Symbol, to::Symbol)
rename!(dt::AbstractDataValueTable, d::Associative)
rename!(f::Function, dt::AbstractDataValueTable)
rename(dt::AbstractDataValueTable, from::Symbol, to::Symbol)
rename(f::Function, dt::AbstractDataValueTable)
Arguments
dt: the AbstractDataValueTable
d: an Associative type that maps the original name to a new name
f: a function that has the old column name (a symbol) as input and new column name (a symbol) as output
Result
::AbstractDataValueTable: the updated result
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
rename(x -> Symbol(uppercase(string(x))), dt)
rename(dt, Dict(:i=>:A, :x=>:X))
rename(dt, :y, :Y)
rename!(dt, Dict(:i=>:A, :x=>:X))
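The three call forms (a function, a mapping, a single from/to pair) all reduce to computing a new list of column names. The sketch below shows that mapping on a plain vector of symbols. `sketch_rename` is a hypothetical helper in current Base Julia (where `AbstractDict` takes the place of `Associative`), and leaving unmapped names unchanged is an assumption of the sketch rather than a statement about the package.

```julia
colnames = [:i, :x, :y]

# Function form: apply f to every column name.
sketch_rename(f::Function, names::Vector{Symbol}) = map(f, names)

# Mapping form: replace names that have an entry, keep the rest as they are.
sketch_rename(names::Vector{Symbol}, d::AbstractDict) = [get(d, nm, nm) for nm in names]

# Pair form: a single from/to rename, expressed through the mapping form.
sketch_rename(names::Vector{Symbol}, from::Symbol, to::Symbol) =
    sketch_rename(names, Dict(from => to))

upper = sketch_rename(x -> Symbol(uppercase(string(x))), colnames)
mapped = sketch_rename(colnames, Dict(:i => :A, :x => :X))
single = sketch_rename(colnames, :y, :Y)
```

Each call returns a new vector, so `colnames` itself is unchanged; an in-place variant would assign the result back to the table's names.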
DataValueTables.rename!
— Function. Documented together with rename; see DataValueTables.rename above for the signatures, arguments, and examples.
DataValueTables.tail
— Function. Documented together with head; see DataValueTables.head above for the signatures, arguments, and examples.
Base.unique
— Function. Delete duplicate rows.
unique(dt::AbstractDataValueTable)
unique(dt::AbstractDataValueTable, cols)
unique!(dt::AbstractDataValueTable)
unique!(dt::AbstractDataValueTable, cols)
Arguments
dt: the AbstractDataValueTable
cols: a column indicator (Symbol, Int, Vector{Symbol}, etc.) specifying the column(s) to compare
Result
::AbstractDataValueTable: the updated version of dt with unique rows. When cols is specified, the returned DataValueTable contains complete rows, retaining in each case the first instance for which dt[cols] is unique.
See also nonunique.
Examples
dt = DataValueTable(i = 1:10, x = rand(10), y = rand(["a", "b", "c"], 10))
dt = vcat(dt, dt)
unique(dt) # doesn't modify dt
unique(dt, 1)
unique!(dt) # modifies dt
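The cols behaviour (complete rows are kept, first instance per distinct key) can be sketched as follows. Rows are modelled as tuples, and `sketch_unique_by` is a hypothetical helper whose `key` function plays the role of selecting dt[cols]; it is not the package's implementation.

```julia
# Sketch: keep the first complete row for each distinct key.
function sketch_unique_by(rows::AbstractVector, key::Function)
    seen = Set{Any}()
    kept = eltype(rows)[]
    for row in rows
        k = key(row)
        if !(k in seen)
            push!(seen, k)
            push!(kept, row)    # the whole row is retained, not just the key
        end
    end
    return kept
end

rows = [(1, "a"), (2, "b"), (1, "c"), (3, "a")]

by_first = sketch_unique_by(rows, r -> r[1])    # compare column 1 only
by_row = sketch_unique_by(rows, identity)       # compare complete rows
```

Row `(1, "c")` is dropped when comparing on column 1 because `(1, "a")` came first, even though the two rows differ in column 2. The helper returns a new vector and leaves `rows` as it was, like unique rather than unique!.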
unique(A::CategoricalArray)
unique(A::DataValueCategoricalArray)
Return the levels that appear in A, in the same order as levels (and not in their order of appearance). This function is significantly slower than levels, since it needs to check whether each level is used.
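The difference between level order and order of appearance can be shown with plain vectors. This sketch does not use CategoricalArrays: `level_order` and `data` are hypothetical stand-ins for an array's declared levels and its values.

```julia
level_order = ["low", "medium", "high"]    # declared level order
data = ["high", "low", "high", "low"]      # "medium" is never used

# Levels that actually occur, reported in declared level order.
used = Set(data)
in_level_order = [lv for lv in level_order if lv in used]

# Base.unique on a plain vector follows order of first appearance instead.
in_appearance_order = unique(data)
```

Checking membership for every level is the extra work the text refers to: the declared levels are known up front, but whether each one is used requires looking at the data.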
DataValueTables.unique!
— Function. Documented together with unique; see Base.unique above for the signatures, arguments, and examples.