Solution to TaskFailedException; nested task error when opening CSV files in Julia
Error
When opening a CSV file in Julia, you may encounter an error like the one below.
julia> CSV.read("test_primitive.csv", DataFrame)
┌ Warning: thread = 1 warning: only found 13 / 16 columns around data row: 3192. Filling remaining columns with `missing`
└ @ CSV C:\Users\JGH\.julia\packages\CSV\XLcqT\src\file.jl:592
ERROR: TaskFailedException
nested task error: thread = 7 fatal error, encountered an invalidly quoted field while parsing around row = 3184, col = 8:
Cause
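By default, CSV.jl splits a large CSV file into chunks and parses them concurrently in multiple tasks. If the file contains string data, particularly quoted fields, a chunk may be parsed incorrectly, which is consistent with the "invalidly quoted field" message in the error.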
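To see why quoted string data can confuse a parser that starts in the middle of a file, the following sketch compares two ways of counting row boundaries. It is only an illustration, not CSV.jl's actual chunking logic: the function names `naive_row_count` and `quoted_row_count` and the sample text are hypothetical, and the sample assumes that a column such as `cif` holds quoted text spanning several lines.

```julia
# Hypothetical illustration; this is NOT how CSV.jl detects chunk boundaries.

# Naive view: every '\n' ends a row.
naive_row_count(text::AbstractString) = count(==('\n'), text)

# Quote-aware view: a '\n' inside "..." belongs to the field, not to the row structure.
function quoted_row_count(text::AbstractString)
    rows = 0
    in_quotes = false
    for ch in text
        if ch == '"'
            # An escaped quote ("") toggles twice, so the state is preserved.
            in_quotes = !in_quotes
        elseif ch == '\n' && !in_quotes
            rows += 1
        end
    end
    return rows
end

# Header plus two data rows; the cif field of the first row spans two physical lines.
sample = "id,cif\n1,\"# generated\ndata_GaTe\"\n2,\"# generated\"\n"

println("naive line breaks:       ", naive_row_count(sample))
println("quote-aware line breaks: ", quoted_row_count(sample))
```

The two counts differ because one newline sits inside a quoted field. A task that begins reading at an arbitrary byte offset does not know whether it is inside quotes, so it can mistake such a newline for the start of a row.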
Solution
Adding the keyword argument ntasks = 1 makes CSV.jl parse the file in a single task, which resolves the problem.
julia> CSV.read("test_primitive.csv", DataFrame; ntasks = 1)
9046×16 DataFrame
Row │ Column1 material_id formation_energy_per_atom dft_band_gap pretty_formula e_above_hull elements cif ⋯
│ Int64 String15 Float64 Float64 String31 Float64 String String ⋯
──────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 0 mp-10009 -0.575092 0.898 GaTe 0.0 ['Ga', 'Te'] # generated ⋯
2 │ 1 mp-1218989 -0.942488 0.0 SmThCN 0.0441088 ['C', 'N', 'Sm', 'Th'] # generated
3 │ 2 mp-1225695 0.0648625 0.0 CuNi 0.0648625 ['Cu', 'Ni'] # generated
4 │ 3 mp-1220884 -1.45612 0.0 NaTiVS4 0.0 ['Na', 'S', 'Ti', 'V'] # generated
5 │ 4 mp-1224266 0.0241391 0.0 Ho3TmMn8 0.0364961 ['Ho', 'Mn', 'Tm'] # generated ⋯
6 │ 5 mp-1002572 -2.11731 0.8595 LiMnO2 0.042556 ['Li', 'Mn', 'O'] # generated
7 │ 6 mp-626680 -1.36116 2.4707 Fe(HO)2 0.000428191 ['Fe', 'H', 'O'] # generated
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
9041 │ 9040 mp-1232301 -1.95955 1.2452 Tb2MgSe4 0.00526556 ['Mg', 'Se', 'Tb'] # generated
9042 │ 9041 mp-21084 -1.77775 1.0134 In6Ga2PtO8 0.0 ['Ga', 'In', 'O', 'Pt'] # generated ⋯
9043 │ 9042 mp-571486 -0.359373 0.0 CuSe 0.0 ['Cu', 'Se'] # generated
9044 │ 9043 mp-14410 -1.20508 0.4594 Tl6TeO12 0.0 ['O', 'Te', 'Tl'] # generated
9045 │ 9044 mp-1079192 -2.81452 0.0 Sr2GdRuO6 0.0143609 ['Gd', 'O', 'Ru', 'Sr'] # generated
9046 │ 9045 mp-13501 -0.359 0.0 ErCoC2 0.0 ['Er', 'Co', 'C'] # generated ⋯
9 columns and 9033 rows omitted
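If you would rather keep the default parsing for files that load without trouble, one option is to retry with ntasks = 1 only when the first attempt fails. The sketch below assumes CSV.jl and DataFrames.jl are installed. The helper name `read_csv_with_fallback` and the small sample file are hypothetical, and the sample is too small to trigger the original error, so it only demonstrates the call pattern.

```julia
using CSV, DataFrames

# Hypothetical helper: try the default parse first, then retry with a single task.
function read_csv_with_fallback(path::AbstractString)
    try
        return CSV.read(path, DataFrame)
    catch err
        @warn "Default parse failed; retrying with ntasks = 1" exception = err
        return CSV.read(path, DataFrame; ntasks = 1)
    end
end

# Small stand-in file with a quoted field that contains a newline.
path = joinpath(mktempdir(), "sample.csv")
write(path, "id,cif\n1,\"# generated\ndata_GaTe\"\n2,\"# generated\"\n")

df = read_csv_with_fallback(path)
println(size(df))
```

The catch block here retries on any exception, not only TaskFailedException, so an unrelated problem such as a missing file is simply raised again by the second call.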
Environment
- OS: Windows 11
- Version: Julia 1.11.3, CSV v0.10.15, DataFrames v1.7.0