Symbols in Julia
Overview
When first encountering Julia, one might be perplexed by the symbol data type. A symbol is written with a preceding : and is identified purely by its name, without any other internal data. Symbols are commonly used as names, labels, or dictionary keys [1].
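As a minimal sketch of the uses just mentioned, the snippet below creates a symbol, checks its type, and uses symbols as dictionary keys. The names `s` and `prices` and the values stored are made up for illustration.

```julia
# A symbol literal is written with a leading colon.
s = :apple
println(typeof(s))              # Symbol

# Symbol("...") builds the same symbol from a string.
println(s === Symbol("apple"))  # true

# Symbols used as dictionary keys (hypothetical keys and values).
prices = Dict(:apple => 1.5, :pear => 2.0)
println(prices[:apple])         # 1.5
```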
Explanation
In other programming languages, options passed to a function are often given as numbers or strings whose values indicate the intended behavior. The following two functions, written here in Julia, illustrate this.
julia> function foo0(x, option = 0)
           if option == 0
               return string(x)
           elseif option == 1
               return Int(x)
           else
               error("wrong")
           end
       end
foo0 (generic function with 2 methods)
julia> foo0(3.0, 0)
"3.0"
julia> foo0(3.0, 1)
3
julia> function foo1(x, option = "string")
           if option == "string"
               return string(x)
           elseif option == "Int"
               return Int(x)
           else
               error("wrong")
           end
       end
foo1 (generic function with 2 methods)
julia> foo1(3.0, "string")
"3.0"
julia> foo1(3.0, "Int")
3
In contrast, the definition using a symbol is shown below. At first glance, it might seem no different from the two functions above.
julia> function foo2(x, option = :string)
           if option == :string
               return string(x)
           elseif option == :Int
               return Int(x)
           else
               error("wrong")
           end
       end
foo2 (generic function with 2 methods)
julia> foo2(3.0, :string)
"3.0"
julia> foo2(3.0, :Int)
3
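The reason for using symbols can be explained simply: a symbol is a fixed name, not data to be computed with. Sometimes this is inconvenient, but an option stored as an integer can be altered by arithmetic and one stored as a string can be altered by concatenation or other string operations, whereas a symbol supports neither, so there is little chance of it unexpectedly turning into something else.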
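A rough sketch of that point: an option held as an integer or a string can be turned into a different value by ordinary operations, while the same operations are simply not defined for a symbol. The variable names are hypothetical, and the `try`/`catch` is only there so the script keeps running after the expected errors.

```julia
option_int = 0
option_str = "Int"
option_sym = :Int

# Integers and strings can become other values through ordinary operations.
println(option_int + 1)        # 1
println(option_str * "eger")   # Integer

# Arithmetic and concatenation have no methods for Symbol, so both calls throw.
results = Bool[]
for op in (() -> option_sym + 1, () -> option_sym * :eger)
    try
        op()
        push!(results, false)
    catch err
        push!(results, err isa MethodError)
    end
end
println(results)               # Bool[1, 1]
```

Building a different symbol is still possible, for example with `Symbol("Int", "eger")`, but that has to be done explicitly rather than as a side effect of everyday arithmetic or string handling.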
Moreover, symbols express a designation or command directly. From an interface perspective, there's little difference between strings and symbols, but there is a subtle one: receiving the string "Int" means interpreting a piece of text as a command to return an integer, whereas receiving the symbol :Int means being handed that command itself and returning an integer without question. If this difference doesn't resonate, there's no need to force it.
Other uses of symbols include column names in data frames, where a name needs to be kept distinct from both a variable and an ordinary string. The notation may seem daunting because it is unfamiliar, but understanding its purpose and how it differs from strings eases the concern.
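To give a feel for symbols as column names without loading a data frame package, the sketch below uses a `NamedTuple` from Base as a stand-in for a small table. This is an assumption made for illustration, not how a data frame is implemented, and the column names and values are made up.

```julia
# A NamedTuple of vectors acting as a tiny two-column table (hypothetical data).
table = (name = ["Kim", "Lee"], score = [90, 85])

# The column names are symbols.
println(keys(table))                 # (:name, :score)

# A column can be looked up by its symbol.
println(table[:score])               # [90, 85]

# The symbol can also be held in a variable and passed around like any value.
col = :name
println(getproperty(table, col))     # ["Kim", "Lee"]
```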