How to Sort a DataFrame in Julia

Code

using DataFrames

Unit1 = DataFrame(
    member = ["다영","루다","수빈","진숙"],
    birth = [99,97,96,99],
    height = [161,157,159,162]
)

Unit2 = DataFrame(
    member = ["소정","주연","지연","현정"],
    birth = [95,98,95,94],
    height = [166,172,163,165]
)
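# Stack the two units vertically into one DataFrame
WJSN = vcat(Unit1, Unit2)

# Append two more members, one row at a time
push!(WJSN, ["다원",97,167])
push!(WJSN, ["연정",99,165])

sort(WJSN, 3)                    # by column position (3rd column: height)
sort(WJSN, :birth)               # by column name
sort(WJSN, [:birth, :height])    # by birth, with ties broken by height
sort(WJSN, :birth, rev = true)   # by birth, in descending order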

Let’s run the example code above and check the results.

julia> WJSN
10×3 DataFrame
 Row │ member  birth  height 
     │ String  Int64  Int64  
─────┼───────────────────────
   1 │ 다영       99     161
   2 │ 루다       97     157
   3 │ 수빈       96     159
   4 │ 진숙       99     162
   5 │ 소정       95     166
   6 │ 주연       98     172
   7 │ 지연       95     163
   8 │ 현정       94     165
   9 │ 다원       97     167
  10 │ 연정       99     165

The table above is the WJSN DataFrame that the following sections sort.

Sort by Column Number sort(df, cols::Integer)

Sorts by the column at position cols, in ascending order.

julia> sort(WJSN, 3)
10×3 DataFrame
 Row │ member  birth  height 
     │ String  Int64  Int64  
─────┼───────────────────────
   1 │ 루다       97     157
   2 │ 수빈       96     159
   3 │ 다영       99     161
   4 │ 진숙       99     162
   5 │ 지연       95     163
   6 │ 현정       94     165
   7 │ 연정       99     165
   8 │ 소정       95     166
   9 │ 다원       97     167
  10 │ 주연       98     172

The rows are now in ascending order of the 3rd column, height.
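
As a small self-contained sketch, sorting by position 3 and sorting by the name :height should give the same result, and the original DataFrame should stay untouched because sort returns a new DataFrame. The table here is an illustrative subset (the first four members only), not the full WJSN data, and the variable names are made up for the example.

```julia
using DataFrames

# Illustrative subset of the data above (Unit1 only).
unit = DataFrame(
    member = ["다영", "루다", "수빈", "진숙"],
    birth = [99, 97, 96, 99],
    height = [161, 157, 159, 162]
)

by_position = sort(unit, 3)      # 3rd column
by_name = sort(unit, :height)    # the same column, referenced by name

# Both calls yield the same ordering.
println(by_position == by_name)

# sort returns a new DataFrame; unit keeps its original row order.
println(unit.height)
```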

Sort by Column Name sort(df, cols::Symbol)

Sorts based on the column named by the symbol cols.

julia> sort(WJSN, :birth)
10×3 DataFrame
 Row │ member  birth  height 
     │ String  Int64  Int64  
─────┼───────────────────────
   1 │ 현정       94     165
   2 │ 소정       95     166
   3 │ 지연       95     163
   4 │ 수빈       96     159
   5 │ 루다       97     157
   6 │ 다원       97     167
   7 │ 주연       98     172
   8 │ 다영       99     161
   9 │ 진숙       99     162
  10 │ 연정       99     165

The rows are now in ascending order of the birth column, selected by the symbol :birth.
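
Two related variations may be useful here. In recent DataFrames versions a column can also be named with a string, and the in-place variant sort! reorders the DataFrame itself instead of returning a copy. The following is a minimal sketch on an illustrative four-row subset, not the full WJSN data.

```julia
using DataFrames

# Illustrative subset of the data above (Unit1 only).
unit = DataFrame(
    member = ["다영", "루다", "수빈", "진숙"],
    birth = [99, 97, 96, 99],
    height = [161, 157, 159, 162]
)

# A string column name selects the same column as the symbol.
by_symbol = sort(unit, :birth)
by_string = sort(unit, "birth")
println(by_symbol == by_string)

# sort! modifies unit in place rather than creating a new DataFrame.
sort!(unit, :birth)
println(unit.birth)
```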

Sorting Priority sort(df, cols::Array)

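Sorts by several columns at once. Columns listed earlier in cols take priority, and later columns only break ties among rows that are equal in the earlier ones.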

julia> sort(WJSN, [:birth, :height])
10×3 DataFrame
 Row │ member  birth  height 
     │ String  Int64  Int64  
─────┼───────────────────────
   1 │ 현정       94     165
   2 │ 지연       95     163
   3 │ 소정       95     166
   4 │ 수빈       96     159
   5 │ 루다       97     157
   6 │ 다원       97     167
   7 │ 주연       98     172
   8 │ 다영       99     161
   9 │ 진숙       99     162
  10 │ 연정       99     165

The rows are sorted by birth first, and height decides the order among members with the same birth value. Compared to sorting by birth alone, rows 2 and 3 have swapped places.
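
When the columns need different directions, for example birth ascending but the taller member first within the same birth value, DataFrames exports an order helper that can wrap a single column. The sketch below uses a four-row subset chosen so that each birth value appears twice. The variable names are illustrative only.

```julia
using DataFrames

# Illustrative subset: two members for each birth value.
df = DataFrame(
    member = ["소정", "지연", "루다", "다원"],
    birth = [95, 95, 97, 97],
    height = [166, 163, 157, 167]
)

# birth ascending; among equal birth values, height descending.
mixed = sort(df, [:birth, order(:height, rev = true)])

println(mixed.member)
```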

Sort in Reverse Order sort(df, cols; rev::Bool=false)

Set the keyword argument rev = true to sort in reverse (descending) order. The default value is false.

julia> sort(WJSN, :birth, rev = true)
10×3 DataFrame
 Row │ member  birth  height 
     │ String  Int64  Int64  
─────┼───────────────────────
   1 │ 다영       99     161
   2 │ 진숙       99     162
   3 │ 연정       99     165
   4 │ 주연       98     172
   5 │ 루다       97     157
   6 │ 다원       97     167
   7 │ 수빈       96     159
   8 │ 소정       95     166
   9 │ 지연       95     163
  10 │ 현정       94     165
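
rev can also be combined with several sort columns. In the sketch below, a single Bool reverses every column, while a vector of Bool values (one per column) sets the direction column by column. The four-row table is an illustrative subset, not the full WJSN data.

```julia
using DataFrames

# Illustrative subset: two members for each birth value.
df = DataFrame(
    member = ["소정", "지연", "루다", "다원"],
    birth = [95, 95, 97, 97],
    height = [166, 163, 157, 167]
)

# A single Bool applies to all columns: birth and height both descending.
all_desc = sort(df, [:birth, :height], rev = true)

# One Bool per column: birth descending, height ascending.
per_column = sort(df, [:birth, :height], rev = [true, false])

println(all_desc.member)
println(per_column.member)
```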

Environment

  • OS: Windows
  • julia: v1.6.3