Do you know the answer to this Elixir Challenge?
- Discussion
# What will this return ?
MapSet.new(1..32) |> Enum.to_list()
# Observe what happens if we request 33 elements instead:
MapSet.new(1..33) |> Enum.to_list()
# Let's discuss what happens here;
# What is special about the number 33 and what does it mean ?
- Explanation
A bunch of stuff is happening here. Elixir has a concept of protocols which provide polymorphism on data types and structures. In our example we pipe the result of generating a MapSet into the to_list/1 function on the Enum module. Enum is a module containing functions that work on data types implementing the Enumerable protocol; examples of these are lists, maps, and ranges. Enum can do all kinds of neat stuff on these, such as taking a slice of a range:
# Give me first 5 elements after the 10th element in the range 1 to 20
iex> Enum.slice(1..20, 10, 5)
[11, 12, 13, 14, 15]
Intuitively we would think a data structure has an order; for instance a Range is a sequence of numbers increasing (or decreasing) from the start point to the end point of the range. When we ask to turn a Range into a list using Enum.to_list/1 it will materialize into a list of numbers, going from first to last. A Range has a natural order, and so does a list—which is implemented as a linked list; the elements has an order based on the order they were inserted, as the element at the head of the list will point to the next in the list.
But this intuition fails when it comes to the Map data structure. If we ask for the list representation of a Map containing keys consisting of integers we might get fooled into believing that a Map has an reliable order:
iex> Enum.to_list(%{1 => :a, 2 => :b, 3 => :c})
[{1, :a}, {2, :b}, {3, :c}]
Which might seem odd when we try to construct a Map with 33 or more elements and observe what happens when we convert it to a list:
iex> Enum.into(1..33, %{}, fn (x) -> {x, x} end) |> Enum.to_list()
[{11, 11}, {26, 26}, {15, 15} | _output_omitted]
What is special about 33? Why did we get a seemingly random order in our return?
The technical explanation is that the Erlang Virtual Machine stores maps in different data structures depending on the number of elements in the Map. For a Map containing 32 or less elements it is stored in a sorted array—which explains why we observe an order iterating over the elements using the Enum and Stream functions—and when the Map contain more than 32 elements the Erlang VM will switch to a data structure called «hash array mapped trie,» which provides fast lookups for bigger data sets as well as very fast insertions. When we convert the big map to a list we will get whatever order the internal representation has; which is not intuitive.
For our initial example we convert a MapSet to a list. The MapSet data structure is useful for when we want to store terms in a set. It allows us to ask if a term is present, and perform comparisons between two sets, such as getting the difference or intersection, in a fast and efficient manner. Behind the scenes a MapSet uses a Map to store its values, so it inherit the performance from the Map, but also the unordered nature when the set grows above 32 elements.
To conclude:
# A Map is an unordered data structure, and we should never rely on the order of a Map
While we seemingly observe an order when we convert maps to lists, or iterate over them, we should never rely on the order of a Map. Assuming a reliable order could break our application in a future Erlang/OTP release, as the OTP team could choose to change the internal representation of maps. That said, it is fine to use the functions in the Enum module to iterate over a map, as long as no assumptions on the order of the elements are made.
- https://en.wikipedia.org/wiki/Hash_array_mapped_trie
- https://elixir-lang.org/getting-started/protocols.html
- https://hexdocs.pm/elixir/Enumerable.html#content
- https://hexdocs.pm/elixir/Map.html#content
- https://hexdocs.pm/elixir/MapSet.html#content
Did you find this interesting? Want more? Join like-minded individuals at Code Elixir LDN 2019, July 18 in London, UK. Also, for more reading, check out our previous blog post in this series.
 
        Martin Gausby
Martin is a long time Elixir developer with a keen interest in implementing network protocols. For the last couple of years he has been working with Erlang and Elixir systems for a living, and during that time spent way too much time tinkering with his Emacs configuration. Besides that he has a horrible taste in music, enjoys coffee, mechanical keyboards, and is a friend of the podcast.
ARTICLES: 1
What are Elixir's trending themes for 2019? ElixirConf EU
 
        ElixirConf EU conference is the European hub for the Elixir community, their 27 talks and are carefully picked each year. As a result, the themes of the conference are a good guide in finding out what is trending in 2019. Our Elixir expert has considered all the talks that will take place between 8–9 April, and found 6 main themes this year.
READ MORE