Comparing changes

base repository: JuliaML/TableTransforms.jl
base: v1.33.0
...
head repository: JuliaML/TableTransforms.jl
compare: master

Commits on Jul 4, 2024

  1. Bump version

    juliohm committed Jul 4, 2024
    5ea8d2d

Commits on Jul 8, 2024

  1. Bump version

    juliohm committed Jul 8, 2024
    4fc4fe2

Commits on Jul 20, 2024

  1. 06474d0
  2. Update Project.toml

    juliohm authored Jul 20, 2024
    9d032e8

Commits on Sep 10, 2024

  1. Bump peter-evans/create-pull-request from 6 to 7 (#289)

    Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 6 to 7.
    - [Release notes](https://github.com/peter-evans/create-pull-request/releases)
    - [Commits](peter-evans/create-pull-request@v6...v7)
    
    ---
    updated-dependencies:
    - dependency-name: peter-evans/create-pull-request
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Sep 10, 2024
    4e74218

Commits on Sep 16, 2024

  1. Minor fix in qsmooth helper

    juliohm committed Sep 16, 2024
    2027420
  2. Bump version

    juliohm committed Sep 16, 2024
    56917f6
  3. 5b387cf
  4. Bump version

    juliohm committed Sep 16, 2024
    d5207e1

Commits on Oct 9, 2024

  1. Use lts in CI.yml

    juliohm committed Oct 9, 2024
    43f47aa

Commits on Oct 17, 2024

  1. Bump version

    juliohm committed Oct 17, 2024
    648d3c6
  2. da8e83a

Commits on Oct 18, 2024

  1. Bump version

    juliohm committed Oct 18, 2024
    fc1c803
  2. Improve docstring of Coerce

    juliohm committed Oct 18, 2024
    19c5079
  3. Bump version

    juliohm committed Oct 18, 2024
    a106ea3

Commits on Nov 18, 2024

  1. Bump codecov/codecov-action from 4 to 5 (#295)

    Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 4 to 5.
    - [Release notes](https://github.com/codecov/codecov-action/releases)
    - [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
    - [Commits](codecov/codecov-action@v4...v5)
    
    ---
    updated-dependencies:
    - dependency-name: codecov/codecov-action
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Nov 18, 2024
    8758da8

Commits on Jan 22, 2025

  1. 1ab1fe9
  2. Update test data

    juliohm committed Jan 22, 2025
    e6a8229
  3. Bump version

    juliohm committed Jan 22, 2025
    6574280

Commits on Jan 27, 2025

  1. 1e47869
  2. Closure is not revertible

    juliohm committed Jan 27, 2025
    f1b933f
  3. Bump version

    juliohm committed Jan 27, 2025
    cf0d5fc

Commits on Mar 19, 2025

  1. Handle unitless columns in Unit transform (#297)

    * Handle unitless columns in Unit transform
    
    * Add string column to new test
    juliohm authored Mar 19, 2025
    351d388
  2. Update Project.toml

    juliohm authored Mar 19, 2025
    c2ab404

Commits on Mar 26, 2025

  1. 2b3461c

Commits on Mar 27, 2025

  1. 1de5b52

Commits on Mar 28, 2025

  1. Add KMedoids (#298)

    * Initial implementation of KMedoids
    
    * Add basic test for KMedoids
    
    * Add KMedoids to docs
    
    * Add more tests for KMedoids
    
    * Use existing _nrows utility
    
    * Use _assert utility function
    
    * Retrieve distance type
    
    * Minor adjustments
    juliohm authored Mar 28, 2025
    1bb3db8
  2. Bump version

    juliohm committed Mar 28, 2025
    373f0fb
  3. b143548
  4. Bump version

    juliohm committed Mar 28, 2025
    ab2d836

Commits on May 30, 2025

  1. 6c8d54f
Showing with 349 additions and 171 deletions.
  1. +2 −2 .github/workflows/CI.yml
  2. +2 −2 .github/workflows/FormatPR.yml
  3. +5 −3 Project.toml
  4. +6 −0 docs/src/transforms.md
  5. +5 −3 src/TableTransforms.jl
  6. +3 −3 src/distributions.jl
  7. +1 −0 src/transforms.jl
  8. +3 −23 src/transforms/closure.jl
  9. +0 −2 src/transforms/coalesce.jl
  10. +4 −4 src/transforms/coerce.jl
  11. +8 −8 src/transforms/compose.jl
  12. +0 −2 src/transforms/dropextrema.jl
  13. +0 −2 src/transforms/dropmissing.jl
  14. +0 −2 src/transforms/dropnan.jl
  15. +0 −2 src/transforms/filter.jl
  16. +142 −0 src/transforms/kmedoids.jl
  17. +24 −18 src/transforms/logratio.jl
  18. +0 −2 src/transforms/map.jl
  19. +3 −4 src/transforms/quantile.jl
  20. +0 −2 src/transforms/replace.jl
  21. +0 −2 src/transforms/sample.jl
  22. +6 −6 src/transforms/satisfies.jl
  23. +0 −4 src/transforms/select.jl
  24. +0 −2 src/transforms/sort.jl
  25. +17 −9 src/transforms/unit.jl
  26. +3 −2 test/Project.toml
  27. BIN test/data/eigenanalysis-1.png
  28. BIN test/data/eigenanalysis-2.png
  29. BIN test/data/projectionpursuit-1.png
  30. BIN test/data/projectionpursuit-2.png
  31. BIN test/data/projectionpursuit-3.png
  32. BIN test/data/quantile.png
  33. +3 −4 test/runtests.jl
  34. +3 −3 test/shows.jl
  35. +1 −0 test/transforms.jl
  36. +1 −3 test/transforms/closure.jl
  37. +12 −12 test/transforms/compose.jl
  38. +11 −11 test/transforms/eigenanalysis.jl
  39. +1 −1 test/transforms/filter.jl
  40. +25 −0 test/transforms/kmedoids.jl
  41. +25 −11 test/transforms/logratio.jl
  42. +5 −5 test/transforms/projectionpursuit.jl
  43. +7 −7 test/transforms/sample.jl
  44. +21 −5 test/transforms/unit.jl
4 changes: 2 additions & 2 deletions .github/workflows/CI.yml
@@ -18,7 +18,7 @@ jobs:
fail-fast: false
matrix:
version:
- - '1.9'
+ - 'lts'
- '1'
os:
- ubuntu-latest
@@ -45,7 +45,7 @@ jobs:
- uses: julia-actions/julia-buildpkg@v1
- uses: julia-actions/julia-runtest@v1
- uses: julia-actions/julia-processcoverage@v1
- - uses: codecov/codecov-action@v4
+ - uses: codecov/codecov-action@v5
with:
file: lcov.info
token: ${{ secrets.CODECOV_TOKEN }}
4 changes: 2 additions & 2 deletions .github/workflows/FormatPR.yml
@@ -10,11 +10,11 @@ jobs:
- uses: actions/checkout@v4
- name: Install JuliaFormatter and format
run: |
- julia -e 'import Pkg; Pkg.add("JuliaFormatter")'
+ julia -e 'import Pkg; Pkg.add(name="JuliaFormatter", version="1")'
julia -e 'using JuliaFormatter; format(".")'
- name: Create Pull Request
id: cpr
- uses: peter-evans/create-pull-request@v6
+ uses: peter-evans/create-pull-request@v7
with:
token: ${{ secrets.GITHUB_TOKEN }}
commit-message: ":robot: Format .jl files"
8 changes: 5 additions & 3 deletions Project.toml
@@ -1,7 +1,7 @@
name = "TableTransforms"
uuid = "0d432bfd-3ee1-4ac1-886a-39f05cc69a3e"
authors = ["Júlio Hoffimann <julio.hoffimann@gmail.com> and contributors"]
- version = "1.33.0"
+ version = "1.34.1"

[deps]
AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
@@ -18,6 +18,7 @@ PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
+ TableDistances = "e5d66e97-8c70-46bb-8b66-04a2d73ad782"
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
TransformsBase = "28dd2a49-a57a-4bfb-84ca-1a49db9b96b8"
Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d"
@@ -26,8 +27,8 @@ Unitful = "1986cc42-f94f-5a68-af5c-568840ba703d"
AbstractTrees = "0.4"
CategoricalArrays = "0.10"
CoDa = "1.2"
- ColumnSelectors = "0.1"
- DataScienceTraits = "0.3"
+ ColumnSelectors = "1.0"
+ DataScienceTraits = "1.0"
Distributed = "1.9"
Distributions = "0.25"
InverseFunctions = "0.1"
@@ -37,6 +38,7 @@ PrettyTables = "2"
Random = "1.9"
Statistics = "1.9"
StatsBase = "0.33, 0.34"
+ TableDistances = "1.0"
Tables = "1.6"
TransformsBase = "1.5"
Unitful = "1.17"
6 changes: 6 additions & 0 deletions docs/src/transforms.md
@@ -242,6 +242,12 @@ SDS
ProjectionPursuit
```

+ ## KMedoids
+
+ ```@docs
+ KMedoids
+ ```
+
## Closure

```@docs
8 changes: 5 additions & 3 deletions src/TableTransforms.jl
@@ -6,19 +6,20 @@ module TableTransforms

using Tables
using Unitful
using Statistics
using PrettyTables
using AbstractTrees
using LinearAlgebra
using TableDistances
using DataScienceTraits
using CategoricalArrays
using LinearAlgebra
using Statistics
using Random
using CoDa

using DataScienceTraits: SciType, coerce
using TransformsBase: Transform, Identity,
using ColumnSelectors: ColumnSelector, SingleColumnSelector
using ColumnSelectors: AllSelector, Column, selector, selectsingle
using DataScienceTraits: SciType, Continuous, Categorical, coerce
using Unitful: AbstractQuantity, AffineQuantity, AffineUnits, Units
using Distributions: ContinuousUnivariateDistribution, Normal
using InverseFunctions: NoInverse, inverse as invfun
@@ -90,6 +91,7 @@ export
DRS,
SDS,
ProjectionPursuit,
+ KMedoids,
Closure,
Remainder,
Compose,
6 changes: 3 additions & 3 deletions src/distributions.jl
@@ -20,7 +20,7 @@ EmpiricalDistribution(values) = EmpiricalDistribution{eltype(values)}(values)

quantile(d::EmpiricalDistribution, p::Real) = quantile(d.values, p, sorted=true)

- function cdf(d::EmpiricalDistribution{T}, x::T) where {T}
+ function cdf(d::EmpiricalDistribution, x)
v = d.values
n = length(v)

@@ -37,9 +37,9 @@ function cdf(d::EmpiricalDistribution{T}, x::T) where {T}
l, u = v[head], v[tail]

if x < l
- return T(0)
+ return 0.0
elseif x > u
- return T(1)
+ return 1.0
else
if l == u
return tail / n
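
The signature change above relaxes dispatch: the old method only matched when `x` had exactly the element type `T` of the stored values, while the new one accepts any `x`. A minimal sketch of that difference follows. The `Box` type and the functions `strict` and `relaxed` are hypothetical names made up for illustration, and the body is a plain fraction of values, not the package's actual `cdf` logic.

```julia
# Hypothetical stand-in for a parametric container (not a TableTransforms.jl type)
struct Box{T}
  values::Vector{T}
end

# old-style signature: x must have the same type T as the stored values
strict(b::Box{T}, x::T) where {T} = count(<=(x), b.values) / length(b.values)

# new-style signature: any x that can be compared with the stored values
relaxed(b::Box, x) = count(<=(x), b.values) / length(b.values)

b = Box([1, 2, 3, 4])

applicable(strict, b, 2)    # an Int query matches the Int container
applicable(strict, b, 2.5)  # a Float64 query finds no matching method
relaxed(b, 2.5)             # the relaxed method accepts the mixed types
```

With the strict signature, querying integer data with a floating-point value throws a `MethodError`; the relaxed signature avoids that, and returning the literals `0.0` and `1.0` no longer requires converting to `T`.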
1 change: 1 addition & 0 deletions src/transforms.jl
@@ -286,6 +286,7 @@ include("transforms/quantile.jl")
include("transforms/functional.jl")
include("transforms/eigenanalysis.jl")
include("transforms/projectionpursuit.jl")
+ include("transforms/kmedoids.jl")
include("transforms/closure.jl")
include("transforms/remainder.jl")
include("transforms/compose.jl")
26 changes: 3 additions & 23 deletions src/transforms/closure.jl
@@ -13,19 +13,17 @@ See also [`Remainder`](@ref).
"""
struct Closure <: StatelessFeatureTransform end

- isrevertible(::Type{Closure}) = true
-
assertions(::Closure) = [scitypeassert(Continuous)]

function applyfeat(::Closure, feat, prep)
cols = Tables.columns(feat)
names = Tables.columnnames(cols)

- # table as matrix and get the sum acros dims 2
+ # convert table to matrix
X = Tables.matrix(feat)
- S = sum(X, dims=2)

- # divides each row by its sum (closure operation)
+ # divide each row by its sum (closure operation)
+ S = sum(X, dims=2)
Z = X ./ S

# table with the old columns and the new values
@@ -34,21 +32,3 @@ function applyfeat(::Closure, feat, prep)

newfeat, S
end
-
- function revertfeat(::Closure, newfeat, fcache)
- cols = Tables.columns(newfeat)
- names = Tables.columnnames(cols)
-
- # table as matrix
- Z = Tables.matrix(newfeat)
-
- # retrieve cache
- S = fcache
-
- # undo operation
- X = Z .* S
-
- # table with original columns
- 𝒯 = (; zip(names, eachcol(X))...)
- 𝒯 |> Tables.materializer(newfeat)
- end
2 changes: 0 additions & 2 deletions src/transforms/coalesce.jl
@@ -44,8 +44,6 @@ Coalesce(cols::C...; value) where {C<:Column} = Coalesce(selector(cols), value)

parameters(transform::Coalesce) = (; value=transform.value)

- isrevertible(::Type{<:Coalesce}) = false
-
colcache(::Coalesce, x) = nothing

colapply(transform::Coalesce, x, c) = coalesce.(x, transform.value)
8 changes: 4 additions & 4 deletions src/transforms/coerce.jl
@@ -16,10 +16,10 @@ This transform uses the `DataScienceTraits.coerce` function. Please see their do
# Examples
```julia
- import DataScienceTraits as DST
- Coerce(1 => DST.Continuous, 2 => DST.Continuous)
- Coerce(:a => DST.Continuous, :b => DST.Continuous)
- Coerce("a" => DST.Continuous, "b" => DST.Continuous)
+ using DataScienceTraits
+ Coerce(1 => Continuous, 2 => Continuous)
+ Coerce(:a => Continuous, :b => Continuous)
+ Coerce("a" => Continuous, "b" => Continuous)
```
"""
struct Coerce{S<:ColumnSelector,T} <: StatelessFeatureTransform
16 changes: 8 additions & 8 deletions src/transforms/compose.jl
@@ -3,37 +3,37 @@
# ------------------------------------------------------------------

"""
- Compose(; as=:CODA)
+ Compose(; as=:coda)
Converts all columns of the table into parts of a composition
in a new column named `as`, using the `CoDa.compose` function.
- Compose(col₁, col₂, ..., colₙ; as=:CODA)
- Compose([col₁, col₂, ..., colₙ]; as=:CODA)
- Compose((col₁, col₂, ..., colₙ); as=:CODA)
+ Compose(col₁, col₂, ..., colₙ; as=:coda)
+ Compose([col₁, col₂, ..., colₙ]; as=:coda)
+ Compose((col₁, col₂, ..., colₙ); as=:coda)
Converts the selected columns `col₁`, `col₂`, ..., `colₙ` into parts of a composition.
- Compose(regex; as=:CODA)
+ Compose(regex; as=:coda)
Converts the columns that match with `regex` into parts of a composition.
# Examples
```julia
- Compose(as=:comp)
+ Compose(as=:composition)
Compose([2, 3, 5])
Compose([:b, :c, :e])
Compose(("b", "c", "e"))
- Compose(r"[bce]", as="COMP")
+ Compose(r"[bce]", as="composition")
```
"""
struct Compose{S<:ColumnSelector} <: StatelessFeatureTransform
selector::S
as::Symbol
end

- Compose(selector::ColumnSelector; as=:CODA) = Compose(selector, Symbol(as))
+ Compose(selector::ColumnSelector; as=:coda) = Compose(selector, Symbol(as))

Compose(; kwargs...) = Compose(AllSelector(); kwargs...)
Compose(cols; kwargs...) = Compose(selector(cols); kwargs...)
2 changes: 0 additions & 2 deletions src/transforms/dropextrema.jl
@@ -52,8 +52,6 @@ DropExtrema(cols::C...; low=0.25, high=0.75) where {C<:Column} = DropExtrema(sel

parameters(transform::DropExtrema) = (low=transform.low, high=transform.high)

- isrevertible(::Type{<:DropExtrema}) = false
-
function preprocess(transform::DropExtrema, feat)
cols = Tables.columns(feat)
names = Tables.columnnames(cols)
2 changes: 0 additions & 2 deletions src/transforms/dropmissing.jl
@@ -41,8 +41,6 @@ DropMissing() = DropMissing(AllSelector())
DropMissing(cols) = DropMissing(selector(cols))
DropMissing(cols::C...) where {C<:Column} = DropMissing(selector(cols))

- isrevertible(::Type{<:DropMissing}) = false
-
function preprocess(transform::DropMissing, feat)
cols = Tables.columns(feat)
names = Tables.columnnames(cols)
2 changes: 0 additions & 2 deletions src/transforms/dropnan.jl
@@ -35,8 +35,6 @@ DropNaN() = DropNaN(AllSelector())
DropNaN(cols) = DropNaN(selector(cols))
DropNaN(cols::C...) where {C<:Column} = DropNaN(selector(cols))

- isrevertible(::Type{<:DropNaN}) = false
-
_isnan(_) = false
_isnan(x::Number) = isnan(x)

2 changes: 0 additions & 2 deletions src/transforms/filter.jl
@@ -27,8 +27,6 @@ struct Filter{F} <: StatelessFeatureTransform
pred::F
end

- isrevertible(::Type{<:Filter}) = false
-
function preprocess(transform::Filter, feat)
# lazy row iterator
rows = tablerows(feat)