This is the (1 of N) installment of a series of short posts where I list some “smaller” features of Haskell that I’ve found neat and valuable.
Introduction
Haskell is a remarkable functional programming language. It’s also well-known for pushing the boundaries of “what programming languages can be” by offering a mix of features not typically seen in other languages, such as Type-Families, GADTs, and Linear Types.
However, Haskell also equips developers with loads of, perhaps “smaller”, tools that we may take for granted and thus never explicitly think of them as such, and yet they are usually present in our code. This series aims to list some of those “smaller” features that I find particularly helpful.
Today, we’ll talk about where
clauses and their support for writing functions scoped within functions.
Disclaimer: This is a series based on my personal, subjective preferences. Your mileage may vary.
Where clauses
As a functional language, it’s no surprise that Haskell has a syntax that makes it easy to write functions. It’s not uncommon to have modules composed of a lot of functions, typically small ones, perhaps just one line long, that work together via composition to solve complex problems.
Sometimes we want to break down a high-level function f
into smaller low-level functions g, h, ...
, but perhaps these low-level functions aren’t interesting enough to deserve standing out in the module.
Maybe because they’re tightly specific to f
only, or too short, or ambiguous with other low-level functions that would emerge from other outer-functions within the module, etc.
Thus, we want to let these low-level functions available only to f
. So, we will make their scopes as small as possible, and that is within the scope of f
itself.
From now on, we refer to such low-level functions as “local functions” (inner, contained, child), as in “local with regard to another (outer, containing, parent) function”.
With local functions, we:
- keep the reader’s attention on the high-level algorithm that we deem relevant and not overwhelmed with details.
- avoid duplicating knowledge that needs to stay in sync.
- hide identifiers from an outer scope where they might not make sense.
- retain locality by keeping the local function closer to the call site.
- may opt-in for shorter names, as the point of definition and point of use are close.
We can write local functions withlet
expression or where
clauses. And each construct has its differences and characteristics. But in this post, we will only talk about where
, by showing three scenarios in which it might help.
Threading the same arguments through recursive calls
Consider an implementation of map
:
map' :: (a -> b) -> [a] -> [b]
map' _ [] = []
map' f (x : xs) = f x : map' f xs
We threaded f
through the recursive call map' f xs
. However, f
doesn’t change across recursive invocations to map'
.
Perhaps we prefer not to explicitly pass it. We can write a go
helper that closes over f
to achieve that:
map'' :: (a -> b) -> [a] -> [b]
map'' f = go
where
go [] = []
go (x : xs) = f x : go xs
Here and elsewhere, I named the helper as
go
, but any other name (maybe a more descriptive one?) would have done the trick.
Now, map''
isn’t recursive any longer. Instead, it pushes the recursive step into go
and this one captures f
from the outer-scope freeing us from explicitly passing it as an argument of the recursion call go xs
.
Making recursive functions tail-call
Consider an implementation of length
:
length' :: [a] -> Integer
length' [] = 0
length' (_ : xs) = 1 + length' xs
Perhaps we prefer tail recursion. We can write a go
helper that increments the accumulator and pass it as a parameter:
length'' :: [a] -> Integer
length'' = go 0
where
go acc [] = acc
go acc (_ : xs) = go (1 + acc) xs
Now, we accumulate the list’s length in the acc
parameter that we first increment and then pass the incremented value as an argument of the recursive call to go
.
Breaking down complex functionality
Say that want to generate a simple identifier for a list of strings that should be insensitive to order. For brevity, let’s stick with the format:
Given the
prefix
and the list ofparameter_1
, …,parameter_N
, the identifier shall beprefix:{parameter_1|...|parameter_N}
.Example:
prefix
<-device
andparameter_1
<-Debian
andparameter_2
<-Arch Linux
, the identifier shall bedevice:{Arch Linux,Debian}
.
We may implement it as:
import Data.List (intercalate, sort)
generateId :: String -> [String] -> String
generateId prefix parameters = prefix <> ":" <> "{" <> intercalate "|" (sort parameters) <> "}"
Perhaps we prefer to split up the implementation into smaller pieces arbitrarily as:
import Data.List (intercalate, sort)
generateId' :: String -> [String] -> String
generateId' prefix parameters = prefix <> ":" <> suffix
where
suffix = "{" <> append parameters <> "}"
append parameters = intercalate "|" (normalize parameters)
normalize parameterrs = sort parameters
Or – don’t explicitly pass the parameters
along:
import Data.List (intercalate, sort)
generateId'' :: String -> [String] -> String
generateId'' prefix parameters = prefix <> ":" <> suffix
where
suffix = "{" <> appended <> "}"
appended = intercalate "|" normalized
normalized = sort parameters
Or – move everything into the where
clause:
import Data.List (intercalate, sort)
generateId''' :: String -> [String] -> String
generateId''' prefix parameters = formatted
where
formatted = prefix <> ":" <> suffix
suffix = "{" <> appended <> "}"
appended = intercalate "|" normalized
normalized = sort parameters
Etc.
Disclaimer: These are just some of the ways to express the same contrived algorithm. The point is not to cover all styles or decide which style is the best, but rather to illustrate how we may express it with
where
clauses.
So, we broke up the implementation of generateId
in terms of smaller pieces that helped us to organize the functionality. Yet they might not be suitable to live at the same scope as the generateId
, but rather within it.
Conclusion
We’ve shown three examples of local functions defined with where
clauses. I’d say that whether to use it or not depends on the case in hand, as sometimes we might be better off with let
expressions, lambdas, “fully-fledged” functions, unfolding the logic in-place, etc.
I’m particularly into where
clauses because not only does it allow me to scope helpers into functions and keep definition and usage closer, but also because I can refer (use, call) to the local functions before actually defining them. That helps me to concentrate on the high-level algorithm (policy) and only when I need, do I scroll down to check the details (mechanisms). By that time, I already know how they are used (context) and I suppose that helps me to stay focused and work more effectively – with less back and forth.
There’s more to where
clauses, such as manually ascribing types to functions defined within it (instead of letting GHC infers them) and how that interacts with ScopedTypeVariables
or maybe shortening the examples with LambdaCase
, etc. Nevertheless, those are topics for another day.