Note that the implementation above is not stack-safe, but I didn’t worry too much
about it. We can check that the implementation works as expected by using map
over some Tree instances:
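A sketch of what that check could look like, assuming the Tree ADT and Functor instance defined above are along these lines:

```scala
import cats.Functor
import cats.syntax.functor._ // for the map extension method

sealed trait Tree[+A]
final case class Branch[A](left: Tree[A], right: Tree[A]) extends Tree[A]
final case class Leaf[A](value: A) extends Tree[A]

implicit val treeFunctor: Functor[Tree] = new Functor[Tree] {
  def map[A, B](fa: Tree[A])(f: A => B): Tree[B] =
    fa match {
      case Branch(left, right) => Branch(map(left)(f), map(right)(f))
      case Leaf(value)         => Leaf(f(value))
    }
}

// Annotating the value as Tree[Int] lets implicit resolution find Functor[Tree].
val tree: Tree[Int] = Branch(Leaf(1), Branch(Leaf(2), Leaf(3)))
tree.map(_ * 2) // Branch(Leaf(2), Branch(Leaf(4), Leaf(6)))
```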
Note that, in the above, we can’t call map directly on instances of Branch
or Leaf, because we don’t have Functor instances for those specific types. To
make the API friendlier, we can add smart constructors to Tree (i.e.
branch and leaf methods that return values of type Tree).
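A minimal sketch of such smart constructors:

```scala
object Tree {
  // Return Tree[A] rather than the concrete subtypes, so that
  // implicit resolution finds Functor[Tree].
  def branch[A](left: Tree[A], right: Tree[A]): Tree[A] =
    Branch(left, right)

  def leaf[A](value: A): Tree[A] =
    Leaf(value)
}

Tree.leaf(100).map(_ * 2)                          // Leaf(200)
Tree.branch(Tree.leaf(1), Tree.leaf(2)).map(_ + 1) // Branch(Leaf(2), Leaf(3))
```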
Exercise 3.6.1.1: Showing off with Contramap
To implement the contramap method, we can create a Printable instance that
uses the format of the instance it’s called on (note the self reference) and
uses func to transform the value to an appropriate type:
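Reconstructed from that description, it could look like this:

```scala
trait Printable[A] { self =>
  def format(value: A): String

  // Build a Printable[B] by converting B values to A and reusing
  // this instance's format (hence the self reference).
  def contramap[B](func: B => A): Printable[B] =
    new Printable[B] {
      def format(value: B): String =
        self.format(func(value))
    }
}
```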
With this contramap method in place, it becomes simpler to define a
Printable instance for our Box case class:
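A sketch, assuming a Printable instance for the boxed type is available:

```scala
final case class Box[A](value: A)

implicit def boxPrintable[A](implicit p: Printable[A]): Printable[Box[A]] =
  p.contramap[Box[A]](_.value) // format the boxed value
```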
Exercise 3.6.2.1: Transformative Thinking with imap
To implement imap for Codec, we need to rely on the encode and decode
methods of the instance imap is called on:
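Something along these lines:

```scala
trait Codec[A] { self =>
  def encode(value: A): String
  def decode(value: String): A

  // Build a Codec[B] from this Codec[A] plus conversions in both directions.
  def imap[B](dec: A => B, enc: B => A): Codec[B] =
    new Codec[B] {
      def encode(value: B): String = self.encode(enc(value))
      def decode(value: String): B = dec(self.decode(value))
    }
}
```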
Similarly to what’s described in the chapter, we can create a Codec for
Double by piggybacking on the Codec for String that we already have in
place:
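A sketch, assuming the stringCodec from the chapter is in scope:

```scala
implicit val doubleCodec: Codec[Double] =
  stringCodec.imap(_.toDouble, _.toString)
```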
When implementing the Codec for Box, we can use imap and describe how to
box and unbox a value, respectively:
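Possibly something like:

```scala
implicit def boxCodec[A](implicit c: Codec[A]): Codec[Box[A]] =
  c.imap[Box[A]](Box(_), _.value) // box on decode, unbox on encode
```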
Exercise 2.3: The Truth About Monoids
For this exercise, rather than defining instances for the proposed types, I
defined instances for Cats’ Monoid directly. For that purpose, we need to
import cats.Monoid.
For the Boolean type, we can define four monoid instances. The first is boolean
or, with combine being equal to the application of the || operator and
empty being false:
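A sketch, using the Monoid.instance helper (the val names throughout are my own):

```scala
import cats.Monoid

val booleanOrMonoid: Monoid[Boolean] =
  Monoid.instance[Boolean](false, _ || _)
```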
The second is boolean and, with combine being equal to the application of the
&& operator and empty being true:
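Sketched the same way:

```scala
val booleanAndMonoid: Monoid[Boolean] =
  Monoid.instance[Boolean](true, _ && _)
```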
The third is boolean exclusive or, with combine being equal to the application
of the ^ operator and empty being false:
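For example:

```scala
val booleanXorMonoid: Monoid[Boolean] =
  Monoid.instance[Boolean](false, _ ^ _)
```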
The fourth is boolean exclusive nor (the negation of exclusive or), with
combine being equal to the negation of the application of the ^ operator and
empty being true:
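And finally:

```scala
val booleanXnorMonoid: Monoid[Boolean] =
  Monoid.instance[Boolean](true, (a, b) => !(a ^ b))
```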
To convince ourselves that the monoid laws hold for the proposed monoids, we can
verify them on all instances of Boolean values. Since there are only two (true
and false), it’s easy to check them all:
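A small check along these lines would do, assuming the instances above:

```scala
def monoidLawsHold(m: Monoid[Boolean]): Boolean = {
  val values = List(true, false)

  // Associativity: (a combine b) combine c == a combine (b combine c).
  val associative =
    values.forall(a => values.forall(b => values.forall(c =>
      m.combine(m.combine(a, b), c) == m.combine(a, m.combine(b, c)))))

  // Identity: combining with empty on either side is a no-op.
  val identity =
    values.forall(a =>
      m.combine(a, m.empty) == a && m.combine(m.empty, a) == a)

  associative && identity
}

List(booleanOrMonoid, booleanAndMonoid, booleanXorMonoid, booleanXnorMonoid)
  .forall(monoidLawsHold) // true
```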
Exercise 2.4: All Set for Monoids
Set union forms a monoid for sets:
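A sketch:

```scala
implicit def setUnionMonoid[A]: Monoid[Set[A]] =
  Monoid.instance[Set[A]](Set.empty, _ union _)
```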
Set intersection only forms a semigroup for sets, since we can’t define an
identity element for the general case. In theory, the identity element would be
the set containing all instances of the element type, but in practice we can’t
produce that for a generic type A:
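A sketch, using Cats’ Semigroup:

```scala
import cats.Semigroup

implicit def setIntersectionSemigroup[A]: Semigroup[Set[A]] =
  Semigroup.instance[Set[A]](_ intersect _)
```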
The book’s solutions suggest an additional monoid (symmetric difference), which
didn’t occur to me at the time:
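Which could be sketched as:

```scala
// A plain def, to avoid clashing with the union instance above.
def setSymDiffMonoid[A]: Monoid[Set[A]] =
  Monoid.instance[Set[A]](Set.empty, (a, b) => (a diff b) union (b diff a))
```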
Exercise 2.5.4: Adding All the Things
The exercise is clearly hinting at using a monoid, but the first step
can be defined in terms of Int only. The description doesn’t tell us what we
should do in case of an empty list, but, since we’re in a chapter about monoids,
I assume we want to return the identity element:
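A minimal version, returning 0 for the empty list:

```scala
def add(items: List[Int]): Int =
  items.foldLeft(0)(_ + _)
```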
Changing the code above to also work with Option[Int] and making sure there is
no code duplication can be achieved by introducing a dependency on a Monoid
instance:
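A sketch of the generic version:

```scala
import cats.Monoid

def add[A](items: List[A])(implicit monoid: Monoid[A]): A =
  items.foldLeft(monoid.empty)(monoid.combine)
```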
With the above in place, we can still add Ints, but we’re now also able to add
Option[Int]s, provided we have the appropriate Monoid instances in place:
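For example (the instance imports reflect book-era Cats; recent versions find these instances without them):

```scala
import cats.instances.int._    // for Monoid[Int]
import cats.instances.option._ // for Monoid[Option]

add(List(1, 2, 3))                // 6
add(List(Some(1), None, Some(2))) // Some(3)
```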
To be able to add Order instances without making any modifications to add,
we can define a Monoid instance for Order. In this case, we’re piggybacking
on the Monoid instance for Double, but we could’ve implemented the field sums
and the identity element directly:
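A sketch of that instance:

```scala
import cats.instances.double._ // for Monoid[Double]

case class Order(totalCost: Double, quantity: Double)

implicit val orderMonoid: Monoid[Order] = new Monoid[Order] {
  def empty: Order =
    Order(Monoid[Double].empty, Monoid[Double].empty)

  def combine(x: Order, y: Order): Order =
    Order(
      Monoid[Double].combine(x.totalCost, y.totalCost),
      Monoid[Double].combine(x.quantity, y.quantity)
    )
}
```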
I solved the exercises in a sandbox Scala project that has Cats as a
dependency. The book recommends using a Giter8 template, so that’s what I used:
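The template command from the book:

```sh
sbt new scalawithcats/cats-seed.g8
```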
The above command generates (at the time of writing) a minimal project with the
following build.sbt file:
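Roughly the following; the exact versions depend on when the template is run, so the ones here are illustrative:

```scala
name := "scala-with-cats"
version := "0.0.1-SNAPSHOT"

scalaVersion := "2.13.4" // whatever is current at generation time

libraryDependencies +=
  "org.typelevel" %% "cats-core" % "2.3.0" // likewise

scalacOptions ++= Seq("-Xfatal-warnings")
```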
The above differs a bit from what the book lists, because newer Scala 2.13 and
Cats versions have been released in the meantime, but I followed along with
these settings with minimal issues.
Exercise 1.3: Printable Library
The definition of the Printable type class can be as follows:
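Something like:

```scala
trait Printable[A] {
  def format(value: A): String
}
```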
In terms of defining the Printable instances for Scala types, I’d probably
prefer to include those in the companion object of Printable so that they would
be readily available in the implicit scope, but the exercise explicitly asks us
to create a PrintableInstances object:
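A sketch:

```scala
object PrintableInstances {
  implicit val stringPrintable: Printable[String] =
    new Printable[String] {
      def format(value: String): String = value
    }

  implicit val intPrintable: Printable[Int] =
    new Printable[Int] {
      def format(value: Int): String = value.toString
    }
}
```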
The interface methods in the companion object of Printable can be defined as
follows:
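Along these lines:

```scala
object Printable {
  def format[A](value: A)(implicit p: Printable[A]): String =
    p.format(value)

  def print[A](value: A)(implicit p: Printable[A]): Unit =
    println(p.format(value)) // calls p.format directly, not format above
}
```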
In the above, the print method could have relied on the format method
directly, but I opted to avoid the unnecessary extra call.
For the Cat example, we can define a Printable instance for that data type
directly in its companion object:
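Along these lines (the format message follows the book’s example):

```scala
final case class Cat(name: String, age: Int, color: String)

object Cat {
  implicit val catPrintable: Printable[Cat] =
    new Printable[Cat] {
      def format(cat: Cat): String =
        s"${cat.name} is a ${cat.age} year-old ${cat.color} cat."
    }
}
```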
This allows us to use the Printable instance without explicit imports:
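For example:

```scala
val cat = Cat("Garfield", 41, "ginger and black")

Printable.print(cat) // Garfield is a 41 year-old ginger and black cat.
```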
For the extension methods, we can define the PrintableSyntax object as
follows:
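A sketch, using a value class:

```scala
object PrintableSyntax {
  implicit class PrintableOps[A](val value: A) extends AnyVal {
    def format(implicit p: Printable[A]): String =
      p.format(value)

    def print(implicit p: Printable[A]): Unit =
      println(p.format(value))
  }
}
```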
I have opted to use a value class for performance reasons, but for the purpose
of this exercise it was likely unnecessary.
By importing PrintableSyntax._ we can now call print directly on our Cat
instance:
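Something like:

```scala
import PrintableSyntax._

cat.print // Garfield is a 41 year-old ginger and black cat.
```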
Exercise 1.4.6: Cat Show
To implement the previous example using Show instead of Printable, we need
to define an instance of Show for Cat. Similar to the approach taken before,
we’re defining the instance directly in the companion object of Cat:
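A sketch, using the Show.show constructor:

```scala
import cats.Show

object Cat {
  // Cat is the case class defined earlier.
  implicit val catShow: Show[Cat] =
    Show.show(cat => s"${cat.name} is a ${cat.age} year-old ${cat.color} cat.")
}
```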
Cats implements summoners for the Show type class, so we no longer need to use
implicitly.
This can be used as follows:
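For example:

```scala
import cats.syntax.show._ // for the show extension method

println(cat.show) // Garfield is a 41 year-old ginger and black cat.
```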
Cats doesn’t have an extension method to directly print an instance using its
Show instance, so we’re using println with the value returned by the show
call.
Exercise 1.5.5: Equality, Liberty, and Felinity
A possible Eq instance for Cat can be implemented as follows. As before, I’ve
opted to include it in the companion object of Cat:
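A sketch (the instance imports again reflect book-era Cats):

```scala
import cats.Eq
import cats.instances.int._    // for Eq[Int]
import cats.instances.string._ // for Eq[String]
import cats.syntax.eq._        // for ===

object Cat {
  implicit val catEq: Eq[Cat] =
    Eq.instance { (a, b) =>
      a.name === b.name && a.age === b.age && a.color === b.color
    }
}
```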
We use Scala at $WORK for multiple projects. These projects rely on various
internal libraries. Being able to rely on built artifacts between projects in a
way that is convenient for developers in different teams is a huge benefit.
The whole company uses GitHub to manage source code, so we have
recently started using GitHub Packages to share Scala
artifacts privately. After circumventing some quirks, it has turned out to be
quite a convenient way to share Scala (and other Maven) artifacts.
We use sbt as the build tool for all of our Scala projects, so the
remainder of this post is written for sbt. It should be easy to adapt the
instructions below to other build tools.
Setting Up Credentials to Authenticate with GitHub Packages
Authentication in GitHub Packages is done through personal access tokens. We can
generate one in our GitHub personal settings. The token must have the
read:packages permission (when we want to read packages from GitHub Packages)
and/or the write:packages permission (when we want to publish to GitHub Packages).
We can then set the credentials for sbt to be able to read them via the
following, replacing <username> and <token> with our username and previously
created token, respectively:
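A credentials entry along these lines, using the realm and host GitHub Packages expects:

```scala
// e.g. in ~/.sbt/1.0/github-credentials.sbt
credentials += Credentials(
  "GitHub Package Registry",
  "maven.pkg.github.com",
  "<username>",
  "<token>"
)
```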
The token is a password, so we should treat it as such. We shouldn’t commit this
into our repositories, and ideally we have this set up in a global location that
sbt has access to (like ~/.sbt/1.0/github-credentials.sbt).
Publishing an Artifact to GitHub Packages
When publishing artifacts in sbt, we always need to specify a repository where
artifacts and descriptors are uploaded. In the case of GitHub Packages, every
GitHub project provides a repository we can use to publish artifacts to. This
means that, in sbt, we can define the location of our repository by setting the
publishTo task key to something like the following:
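Something like:

```scala
publishTo := Some(
  "GitHub Package Registry" at "https://maven.pkg.github.com/<org>/<project>"
)
```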
In the snippet above, we should replace the <org> and <project> placeholders
with the organization and project we want to publish to, respectively.
If our credentials are properly set up, this now allows us to run sbt publish
and have our artifacts published to GitHub Packages. Note that packages in
GitHub Packages are immutable, so we can’t directly replace a package with the
same version. We can, however, delete an existing version in GitHub.
Downloading Artifacts from GitHub Packages
In order to download artifacts from GitHub Packages as dependencies of our
projects, we must set up the appropriate resolvers in our sbt build. For that
purpose, we can set up the same location we mentioned previously when publishing
artifacts:
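For example:

```scala
resolvers +=
  "GitHub Package Registry" at "https://maven.pkg.github.com/<org>/<project>"
```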
And then add the project as a regular library dependency:
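With <version> standing in for the published version, and assuming the artifact is published under the organization’s group id:

```scala
libraryDependencies += "<org>" %% "<project>" % "<version>"
```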
If credentials are properly set up, this now allows us to rely on GitHub
Packages as a source of dependencies.
There is one slight inconvenience with the process suggested above, which is the
fact that every project has its own resolver. When depending on multiple
projects from the same organization, this can become cumbersome to manage, since
every dependency would bring its own resolver. Fortunately, there’s a way to
work around this and have an organization-wide resolver: the <project> part of
the resolver URL doesn’t need to reference an existing repository, so we can
point it at an arbitrary name, like _:
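For example:

```scala
resolvers +=
  "GitHub Package Registry" at "https://maven.pkg.github.com/<org>/_"
```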
This will give us access to packages published on any repository within the
organization. The personal access token we use will control our access. If the
token only has access to public repositories, then this resolver won’t allow
access to private ones. If it does have access to private repositories, then all
artifacts will be visible.
With this resolver in place, we have convenient access to all artifacts
published within the organization.
Interacting with GitHub Packages in Automated Workflows
Using GitHub Packages in a continuous integration or continuous delivery
pipeline is also possible. There are various ways to manage this. One way is to
rely on an environment variable that is populated with the contents of some
secret that includes a personal access token with appropriate access. For that
purpose, we can set up something like the following in our sbt build:
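A sketch of that setup:

```scala
// Only add credentials when the GITHUB_TOKEN environment variable is set.
credentials ++= sys.env.get("GITHUB_TOKEN").map { token =>
  Credentials("GitHub Package Registry", "maven.pkg.github.com", "_", token)
}.toList
```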
With the above in place, builds of our project will check for the existence of a
GITHUB_TOKEN environment variable and use it to set up the appropriate sbt
credentials. Note that the above uses _ as the username for the credentials.
This works because GitHub Packages doesn’t care about the actual username, only
whether the token has appropriate access.
When using GitHub Actions, there’s always a GITHUB_TOKEN
secret that has access to the repository where the action is executed, so we can
reference that:
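In the workflow YAML, that could look like this (the step name is my choice):

```yaml
- name: Publish
  run: sbt publish
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```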
Note that if we need to fetch artifacts from other projects, we need to set up a
personal access token with more permissions.
Managing Snapshot Versions
It is customary for Maven artifacts to have snapshot versions, usually
versioned as X.Y.Z-SNAPSHOT. These snapshots are typically mutable and new
versions continuously replace the existing snapshot. This doesn’t play very well
with GitHub Packages because versions there are immutable and you can’t easily
replace one. It is possible to delete the existing one and publish again, but it
is cumbersome.
To allow for snapshots while using GitHub Packages, we have started using
sbt-dynver. sbt-dynver is an sbt plugin that dynamically sets the
version of our projects from git. The details of how sbt-dynver derives the
version are worth a look but, essentially, when there is a tag in the current
tree, the version of the project is the one specified in the tag; when there is
no tag, the version is a string built from the closest tag and the distance to
that reference.
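Enabling the plugin is a one-liner (the version here is illustrative):

```scala
// project/plugins.sbt
// On tag v1.2.3 the build version becomes 1.2.3; a couple of commits
// later it becomes something like 1.2.3+2-<sha>.
addSbtPlugin("com.dwijnand" % "sbt-dynver" % "4.1.1")
```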
With sbt-dynver, we get snapshot-like versions while keeping the version
immutability that GitHub Packages enforces.
Pricing
In terms of billing, we get some amount of free storage and free data transfer
per month. Anything above that costs $0.008 USD per GB of storage per day and
$0.50 USD per GB of data
transfer. One important note is that traffic using a GITHUB_TOKEN from within
GitHub Actions is always free, regardless of where the runner is hosted.
In short, using GitHub Packages is a very convenient way to share Scala
artifacts within a private organization, particularly if said organization
already uses GitHub to manage its source code.
I have recently moved
this website from DreamHost to AWS. While I was able to
automate the setup of the infrastructure, I was still deploying changes
manually. It is not a very cumbersome process; once a change is ready, it
involves the following steps:
Build the website;
Sync the new website contents with the main S3 bucket;
Invalidate the cache of the non-www CloudFront distribution;
Invalidate the cache of the www CloudFront distribution.
In essence, this involves running the following four commands, in sequence:
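Roughly the following, with the bucket name and distribution IDs elided:

```sh
bundle exec jekyll build
aws s3 sync _site "s3://<bucket>" --delete
aws cloudfront create-invalidation --distribution-id <non-www-distribution-id> --paths "/*"
aws cloudfront create-invalidation --distribution-id <www-distribution-id> --paths "/*"
```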
This is not terrible to run each time I introduce a new change, but it would be
easier if I could make it so that every push to the master branch of the
repository which holds the contents of the website would trigger
a deploy. Fortunately we can use GitHub Actions for this.
Setting Up the GitHub Action
In order to set that up, we first need to create a workflow. Workflows live in
the .github/workflows folder, and that is where I have created the
deploy.yml file.
We start by giving the workflow a name:
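For example (the name is my choice):

```yaml
name: Deploy
```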
Then, we set up which events trigger a workflow run. In this case, I want every
push to the master branch to trigger it:
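```yaml
on:
  push:
    branches:
      - master
```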
Following that, we can start defining our job. In this case, we need to specify
in which environment the job should run and the list of steps that comprise it.
We’re OK with running on the latest Ubuntu version:
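A job skeleton along these lines (the job name is my choice):

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # steps follow below
```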
To build the website, we need three steps: (1) check out the repository, (2)
set up Ruby and install dependencies, and (3) run bundle exec jekyll build:
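Sketched with the standard checkout and Ruby setup actions (action versions are illustrative):

```yaml
      - uses: actions/checkout@v2

      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true # runs bundle install

      - run: bundle exec jekyll build
```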
Once the site is built, we need to publish it to S3 and invalidate the caches of
the CloudFront distributions. The AWS Command Line Interface is
already available in GitHub-hosted virtual environments, so we just need to set
up the credentials we want to use. In this case, we want to reference some
repository secrets which we will set up later:
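One way is the official credentials action (the region is illustrative):

```yaml
      - uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
```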
With the credentials set up, we can run the commands we previously listed:
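Again with the bucket name and distribution IDs elided:

```yaml
      - name: Deploy to S3 and invalidate caches
        run: |
          aws s3 sync _site "s3://<bucket>" --delete
          aws cloudfront create-invalidation --distribution-id <non-www-distribution-id> --paths "/*"
          aws cloudfront create-invalidation --distribution-id <www-distribution-id> --paths "/*"
```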
The full YAML for the workflow definition is as follows:
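Assembled from the snippets above, with the same placeholders and illustrative versions:

```yaml
name: Deploy

on:
  push:
    branches:
      - master

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true

      - run: bundle exec jekyll build

      - uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Deploy to S3 and invalidate caches
        run: |
          aws s3 sync _site "s3://<bucket>" --delete
          aws cloudfront create-invalidation --distribution-id <non-www-distribution-id> --paths "/*"
          aws cloudfront create-invalidation --distribution-id <www-distribution-id> --paths "/*"
```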
Creating a User for GitHub Actions
To set up the credentials this workflow is going to use to interact with AWS, I
wanted to create a user with permissions to interact with the relevant S3 bucket
and CloudFront distributions only. To do that, I have added the following to the
Terraform definition (refer to the previous post for more
details on the existing Terraform definition):
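A sketch of those additions; the resource names, policy scope, and the aws_s3_bucket.www reference are illustrative, while the secret output name matches the one referenced below:

```hcl
resource "aws_iam_user" "github-actions" {
  name = "github-actions"
}

resource "aws_iam_user_policy" "github-actions" {
  name = "github-actions"
  user = aws_iam_user.github-actions.name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:ListBucket", "s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
        # Assumes the bucket resource from the previous post is named "www".
        Resource = [aws_s3_bucket.www.arn, "${aws_s3_bucket.www.arn}/*"]
      },
      {
        Effect   = "Allow"
        Action   = ["cloudfront:CreateInvalidation"]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_access_key" "github-actions" {
  user = aws_iam_user.github-actions.name
}

output "github-actions_aws_iam_access_key_id" {
  value = aws_iam_access_key.github-actions.id
}

output "github-actions_aws_iam_access_key_secret" {
  value     = aws_iam_access_key.github-actions.secret
  sensitive = true
}
```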
This creates a new IAM user, attaches a policy to it that gives it
access to the relevant S3 and CloudFront resources, and creates a new access key
which we will set up as a secret in our GitHub repository. The secret access key
gets stored in the Terraform state, but we define an output that allows us to
read it with terraform output -raw github-actions_aws_iam_access_key_secret.
With the GitHub secrets appropriately set up, we now have a
workflow that publishes this website whenever a new commit is pushed to the
master branch.