Pros and cons of Scala as a server-side programming language at ShiftForward

Joao Azevedo / @jcazevedo
http://jcazevedo.net/commit2016

About me

  • Graduated from FEUP in 2010;
  • Worked on a crew-scheduling application for a railway company at SISCOG from 2010 to 2011;
  • Worked on mobile application development and signal analysis tools at Fraunhofer Portugal from 2011 to 2012;
  • At ShiftForward since 2012, working on distributed, responsive and scalable systems for the online advertising industry.

ShiftForward:
Why Scala in the first place?

Statically typed
with decent type inference

  • Specifying program invariants at compile time:
    • Increases development speed;
    • Removes the need for some classes of tests.
  • Not having to declare every single type removes excessive verbosity.
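
A minimal sketch of what a compile-time invariant can look like. The `Status` hierarchy and `parseAge` are hypothetical names made up for illustration, not ShiftForward code. Because the trait is sealed, the compiler knows every case and can warn about a match that forgets one; the `Option[Int]` return type makes callers deal with invalid input explicitly.

```scala
object Invariants {
  // Hypothetical domain type, used only for illustration.
  sealed trait Status
  case object Active extends Status
  case object Suspended extends Status

  // The match covers every case of the sealed trait. Removing one of the
  // cases makes the compiler report a non-exhaustive match.
  def describe(s: Status): String = s match {
    case Active    => "active"
    case Suspended => "suspended"
  }

  // The return type states that parsing may fail, so no test is needed to
  // check that callers remember to handle the failure case.
  def parseAge(raw: String): Option[Int] =
    scala.util.Try(raw.toInt).toOption.filter(_ >= 0)
}
```

These are the kinds of checks that would otherwise need a unit test per forgotten case.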

JVM

  • Platform independence (predictable runtime environment) eases the deployment of services;
  • The ecosystem is very rich: libraries and tools.

Concurrency

  • Thread-based concurrency (from Java);
  • Actor-based concurrency (inspired by Erlang, available in Akka);
  • Parallel collections;
  • Controllable immutability.
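
As a rough illustration of "controllable immutability" (the names below are made up for the example): the default collections are immutable, and mutable state has to be requested explicitly, which makes it easy to see where shared mutable data could become a concurrency problem.

```scala
import scala.collection.mutable

object Immutability {
  // Immutable by default: "adding" an element builds a new list and
  // leaves the original untouched.
  val xs = List(1, 2, 3)
  val ys = 0 :: xs

  // Mutability is opt-in and visible in the type.
  val buffer = mutable.ListBuffer(1, 2, 3)
  buffer += 4
}
```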

Scala pros: The type system

Scala class hierarchy and unified types
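
A small sketch of the unified type hierarchy, with illustrative names: value types (`AnyVal`, such as `Int` and `Double`) and reference types (`AnyRef`, such as `String`) share the common root `Any`, so they can live in the same collection.

```scala
object UnifiedTypes {
  // Int and Double are value types, String and List are reference types;
  // all of them are subtypes of Any.
  val mixed: List[Any] = List(1, "two", 3.0, List(4))

  def kind(x: Any): String = x match {
    case _: Int    => "Int (AnyVal)"
    case _: Double => "Double (AnyVal)"
    case _: String => "String (AnyRef)"
    case _         => "other"
  }

  val kinds: List[String] = mixed.map(kind)
}
```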

Parametric polymorphism


def dup[T](x: T, n: Int): List[T] =
  if (n <= 0)
    Nil
  else
    x :: dup(x, n - 1)
                        

scala> dup(3, 3)
res0: List[Int] = List(3, 3, 3)

scala> dup("three", 3)
res1: List[String] = List(three, three, three)
                        

Type inference


def id[T](x: T) = x
                        

scala> val x = id(322)
x: Int = 322

scala> val x = id("hey")
x: String = hey
                        

Functions as types


def foo[A, B](l: List[A], f: A => B): List[B] =
  if (l.isEmpty)
    Nil
  else
    f(l.head) :: foo(l.tail, f)
                        

scala> foo(List(1, 2, 3), { x: Int => x * 2 })
res0: List[Int] = List(2, 4, 6)
                        

Traits

A trait is a collection of fields and behaviors that classes can extend or mix in.


trait Car {
  val brand: String
  def emitBrand: String = "My brand is " + brand
}
trait Shiny {
  val shineReflection: Int
}
class BMW extends Car with Shiny {
  val brand = "BMW"
  val shineReflection = 12
}
                        

scala> val bmw = new BMW
bmw: BMW = BMW@2da8ed80

scala> bmw.emitBrand
res0: String = My brand is BMW
                        

Scala pros:
Defining domain objects

Case classes


case class User(id: Int, name: String, age: Int)
                        
  • Can be constructed without using new;
  • Automatically get equality and hash code methods;
  • Get a readable toString method;
  • Can be used in pattern matching.

Case classes


case class User(id: Int, name: String, age: Int)
case class Site(users: List[User])

val site1 = Site(List(User(0, "alice", 37), User(1, "bob", 46)))
val site2 = Site(List(User(0, "alice", 37), User(1, "bob", 46)))
                        

scala> site1.toString
res0: String = Site(List(User(0,alice,37), User(1,bob,46)))

scala> site1 == site2
res1: Boolean = true
                        

Pattern matching


def averageAge(site: Site) = {
  def ageSum(users: List[User]): Int = users match {
    case User(_, _, age) :: rest => age + ageSum(rest)
    case Nil => 0
  }
  ageSum(site.users) / site.users.length
}

val site1 = Site(List(User(0, "alice", 37), User(1, "bob", 46)))
                        

scala> averageAge(site1)
res0: Int = 41
                        

Example: Modeling JSON


sealed trait JsValue {
  def toJsonString: String
}
case class JsNumber(n: Int) extends JsValue {
  def toJsonString = n.toString
}
case class JsString(s: String) extends JsValue {
  def toJsonString = "\"" + s + "\""
}
case class JsArray(values: Array[JsValue]) extends JsValue {
  def toJsonString = {
    val elementsString = values.map(_.toJsonString).mkString(", ")

    "[" + elementsString + "]"
  }
}
                        

Example: Modeling JSON


case class JsObject(values: Map[String, JsValue]) extends JsValue {
  def toJsonString = {
    val elements = values.map { case (k, v) =>
      "\"" + k + "\": " + v.toJsonString
    }

    "{ " + elements.mkString(", ")  + " }"
  }
}
                        

val userJson = JsObject(Map(
  "id" -> JsNumber(0),
  "name" -> JsString("alice"),
  "age" -> JsNumber(37)))
                        

scala> userJson.toJsonString
res0: String = { "id": 0, "name": "alice", "age": 37 }
                        

Scala pros: Monads

A way to abstract computations


trait M[A]
def unit[A]: A => M[A] = ???
def flatMap[A, B]: M[A] => (A => M[B]) => M[B] = ???
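
To make the signatures concrete, here is one possible instantiation with `M = Option`, keeping the curried shape used above. `safeDiv` is a hypothetical helper added for the example. The explicit type arguments are needed only because `unit` and `flatMap` take no value parameters from which the types could be inferred.

```scala
object OptionMonad {
  // unit wraps a plain value into the computation type.
  def unit[A]: A => Option[A] = a => Some(a)

  // flatMap feeds the wrapped value, if any, to the next computation.
  def flatMap[A, B]: Option[A] => (A => Option[B]) => Option[B] =
    ma => f => ma match {
      case Some(a) => f(a)
      case None    => None
    }

  // Hypothetical helper: a division that fails on a zero divisor.
  def safeDiv(n: Int, d: Int): Option[Int] =
    if (d == 0) None else Some(n / d)

  val ok     = flatMap[Int, Int](unit[Int](10))(x => safeDiv(100, x))
  val failed = flatMap[Int, Int](unit[Int](0))(x => safeDiv(100, x))
}
```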
                        

Monadic operations everywhere


scala> List(1, 2, 3).flatMap { x => List(x - 1, x, x + 1) }
res0: List[Int] = List(0, 1, 2, 1, 2, 3, 2, 3, 4)

scala> List(1, 2, 3).map { x => x * 2 }
res1: List[Int] = List(2, 4, 6)
                        

(flat)Map is the glue


val l1 = List(1, 2)
val l2 = List(8, 9)

l1 flatMap { v1 =>
  l2 map { v2 =>
    v1 * v2
  }
}
                        

res0: List[Int] = List(8, 9, 16, 18)
                        

For comprehensions


val l1 = List(1, 2)
val l2 = List(8, 9)

for {
  v1 <- l1
  v2 <- l2
} yield v1 * v2
                        

res0: List[Int] = List(8, 9, 16, 18)
                        

Reusable components


import scala.util.Try

def f(v: Int) = v * 2

val l = List(1, 2, 3, 4)
val o = Some(4)
val t = Try(5)
                        

scala> l.map(f)
res0: List[Int] = List(2, 4, 6, 8)

scala> o.map(f)
res1: Option[Int] = Some(8)

scala> t.map(f)
res2: scala.util.Try[Int] = Success(10)
                        

Scala pros: Type classes

Prerequisite:
Implicit parameters


def adder(a: Int)(implicit b: Int) = a + b
                        

scala> adder(2)
<console>:13: error: could not find
  implicit value for parameter b: Int
       adder(2)
            ^

scala> implicit val x = 5
x: Int = 5

scala> adder(2)
res0: Int = 7
                        

Prerequisite:
Implicit conversions


def double(v: Int) = v * 2

implicit def stringToInt(s: String) = s.toInt
                        

scala> double("1234")
res0: Int = 2468
                        

Ad hoc polymorphism


trait JsonFormat[A] {
  def write(value: A): JsValue
  def read(json: JsValue): A
}

def toJson[A](a: A)(implicit format: JsonFormat[A]): String = {
  format.write(a).toJsonString
}

implicit object IntJsonFormat extends JsonFormat[Int] {
  def write(value: Int) = JsNumber(value)
  def read(json: JsValue) = json match {
    case JsNumber(value) => value
    case _ => throw new Exception("Unexpected JSON type")
  }
}
                        

scala> toJson(1)
res7: String = 1

scala> toJson("I'm a string!")
<console>:17: error: could not find
  implicit value for parameter format: JsonFormat[String]
       toJson("I'm a string!")
             ^
                        

Extension methods


trait JsonWritable[A] {
  def toJson: String
}

implicit def toJsonWritable[A](v: A)(implicit format: JsonFormat[A]): JsonWritable[A] =
  new JsonWritable[A] {
    def toJson = format.write(v).toJsonString
  }
                        

scala> 1.toJson
res0: String = 1

scala> User(0, "alice", 37).toJson
res1: String = { "id": 0, "name": "alice", "age": 37 }
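
The `User(0, "alice", 37).toJson` call only compiles if an implicit `JsonFormat[User]` is in scope, and the slides do not show one. The sketch below is one way it could be written; `UserJsonFormat` is an assumed name, and the JSON model is a trimmed copy of the earlier definitions so that the block stands on its own.

```scala
import scala.language.implicitConversions

object JsonExample {
  // Trimmed copy of the JSON model from the earlier slides.
  sealed trait JsValue { def toJsonString: String }
  case class JsNumber(n: Int) extends JsValue {
    def toJsonString = n.toString
  }
  case class JsString(s: String) extends JsValue {
    def toJsonString = "\"" + s + "\""
  }
  case class JsObject(values: Map[String, JsValue]) extends JsValue {
    def toJsonString = {
      val elements = values.map { case (k, v) => "\"" + k + "\": " + v.toJsonString }
      "{ " + elements.mkString(", ") + " }"
    }
  }

  trait JsonFormat[A] {
    def write(value: A): JsValue
    def read(json: JsValue): A
  }
  trait JsonWritable[A] { def toJson: String }

  implicit def toJsonWritable[A](v: A)(implicit format: JsonFormat[A]): JsonWritable[A] =
    new JsonWritable[A] { def toJson = format.write(v).toJsonString }

  case class User(id: Int, name: String, age: Int)

  // Assumed instance, not shown in the slides.
  implicit object UserJsonFormat extends JsonFormat[User] {
    def write(u: User): JsValue = JsObject(Map(
      "id" -> JsNumber(u.id),
      "name" -> JsString(u.name),
      "age" -> JsNumber(u.age)))

    def read(json: JsValue): User = json match {
      case JsObject(fields) =>
        (fields.get("id"), fields.get("name"), fields.get("age")) match {
          case (Some(JsNumber(id)), Some(JsString(name)), Some(JsNumber(age))) =>
            User(id, name, age)
          case _ => throw new Exception("Unexpected JSON fields")
        }
      case _ => throw new Exception("Unexpected JSON type")
    }
  }

  val rendered: String = User(0, "alice", 37).toJson
}
```

`User` itself knows nothing about JSON; the behavior is attached from the outside, which is the point of the type class approach.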
                        

Scala pros: Concurrency

Future


scala> Future { /* expensive computation */ 2 }
res0: scala.concurrent.Future[Int] =
  scala.concurrent.impl.Promise$DefaultPromise@2cc44ad

scala> res0.onComplete(println)
Success(2)
                        

Composing futures

Future is a Monad


val userFuture: Future[User] =
  userInfo(request.username)
val geoFuture: Future[Geo] =
  userGeo(request.ip)
val tpData: Future[Data] =
  userFuture.flatMap(user => thirdPartyData(user.id))
val incomeInfo: Future[Double] =
  geoFuture.flatMap(geo => averageIncome(geo))
                        

for {
  user <- userFuture
  data <- tpData
  avgIncome <- incomeInfo
} yield AugmentedUser(user, data, avgIncome)
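
A self-contained sketch of the same composition. The service calls are replaced by hypothetical stubs that complete immediately, and the geo lookup is simplified to a country string; none of these stand for real ShiftForward APIs. The futures are created before the for comprehension so that independent lookups can run concurrently instead of one after the other.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureComposition {
  case class User(id: Int, name: String)
  case class AugmentedUser(user: User, data: String, avgIncome: Double)

  // Hypothetical stand-ins for remote calls.
  def userInfo(username: String): Future[User] = Future(User(0, username))
  def thirdPartyData(userId: Int): Future[String] = Future("data-for-" + userId)
  def averageIncome(country: String): Future[Double] = Future(1234.5)

  def augment(username: String, country: String): Future[AugmentedUser] = {
    // Both lookups start here, independently of each other.
    val userFuture = userInfo(username)
    val incomeInfo = averageIncome(country)
    // This one depends on the user, so it is chained with flatMap.
    val tpData = userFuture.flatMap(user => thirdPartyData(user.id))

    for {
      user <- userFuture
      data <- tpData
      avgIncome <- incomeInfo
    } yield AugmentedUser(user, data, avgIncome)
  }

  // Blocking is done here only to show the result of the sketch.
  def demo: AugmentedUser = Await.result(augment("alice", "PT"), 5.seconds)
}
```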
                        

Akka


class MyActor extends Actor {
  def receive = {
    case value: String => println("received " + value)
    case _ => println("received unknown message")
  }
}
                        

scala> val actor = system.actorOf(Props(new MyActor))
actor: akka.actor.ActorRef =
  Actor[akka://default/user/$b#1137211984]

scala> actor ! "hello"
received hello

scala> actor ! 1
received unknown message
                        

Akka


class Doubler extends Actor {
  def receive = {
    case v: Int => sender ! (v * 2)
    case _ => println("received unknown message")
  }
}
                        

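scala> // Assumes `actor` now refers to a Doubler, with akka.pattern.ask
scala> // imported and an implicit Timeout in scope.
scala> val res = actor ? 3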
res: scala.concurrent.Future[Any] = Success(6)
                        

Akka: What else?

  • The message-driven programming model promotes asynchronous interfaces;
  • Supervision strategies enable fault tolerance;
  • Location transparency promotes scalability;
  • Persistence enables recovery of internal state.

Large-scale data processing: Apache Spark

RDD is another monad!


val textFile = sc.textFile("hdfs://...")
val counts = textFile
  .flatMap(line => line.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://...")
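
For comparison, roughly the same word count on plain Scala collections, with no Spark involved. The input lines are made up, and since `reduceByKey` does not exist on standard collections, `groupBy` plus a sum stands in for it. The shape of the pipeline is otherwise the same.

```scala
object LocalWordCount {
  // A List stands in for the RDD of lines.
  val lines = List("to be or", "not to be")

  val counts: Map[String, Int] = lines
    .flatMap(line => line.split(" ").toList)
    .map(word => (word, 1))
    .groupBy { case (word, _) => word }
    .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
}
```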
                        

Other Scala pros

  • Power to create DSLs;
  • Macros;
  • Higher-kinded types.
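
A short sketch of what higher-kinded types allow. `Functor` and `doubleAll` are illustrative definitions written for this example, not taken from a specific library. `F[_]` abstracts over a type constructor such as `List` or `Option`, so `doubleAll` is written once and works for both.

```scala
object HigherKinded {
  // F[_] is a type constructor (List, Option, ...), not a concrete type.
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  implicit val listFunctor: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }
  implicit val optionFunctor: Functor[Option] = new Functor[Option] {
    def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)
  }

  // Works for any F that has a Functor instance in scope.
  def doubleAll[F[_]](fa: F[Int])(implicit functor: Functor[F]): F[Int] =
    functor.map(fa)(_ * 2)

  val doubledList = doubleAll(List(1, 2, 3))
  val doubledOption = doubleAll(Option(4))
}
```

On Scala 2.11 and 2.12 the compiler may emit a feature warning for `F[_]` unless `scala.language.higherKinds` is imported.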

Scala cons: Learning curve

Quick to start using,
hard to master

When coming from an OO background,
it's easy to write OO code in Scala.

Functional in the small,
OO in the large

Pure functions, referential transparency and immutability are applied in small, isolated areas.

More functional abstractions

OO or Functional design?

Do I design it using type classes and ad hoc polymorphism, or using subtyping and parametric polymorphism?
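
The two options side by side, as a sketch with made-up shape types. With subtyping, the behavior lives inside the hierarchy; with a type class, it lives outside the data type and can be added to types you do not own. Neither is presented here as the better choice.

```scala
object DesignChoices {
  // Option 1: subtyping. Every shape carries its own area implementation.
  sealed trait Shape { def area: Double }
  case class Square(side: Double) extends Shape { def area = side * side }
  case class Rect(w: Double, h: Double) extends Shape { def area = w * h }

  // Option 2: type class. Circle knows nothing about areas; the behavior
  // is supplied by a separate implicit instance.
  case class Circle(r: Double)
  trait HasArea[A] { def area(a: A): Double }
  implicit val circleArea: HasArea[Circle] = new HasArea[Circle] {
    def area(c: Circle): Double = math.Pi * c.r * c.r
  }
  def areaOf[A](a: A)(implicit ev: HasArea[A]): Double = ev.area(a)

  val total = List[Shape](Square(2), Rect(2, 3)).map(_.area).sum
  val circle = areaOf(Circle(1))
}
```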

Scala cons: Compilation times

Startup overhead

The Scala compiler (scalac) has a large number of classes that must be loaded and JIT-compiled at startup. Incremental compilation mitigates the resulting long startup time.

Compilation speed

  • Type inference;
  • Implicit resolution;
  • Functional idioms generate many more classes than Java does for a source file of the same size.

Scala cons: Build system

SBT is widely used

Simple (?) build tool: a type-safe programmable build configuration.

Slightly overengineered
and convoluted

  • A build definition consists of one or more Settings;
  • A Setting describes a transformation of the build description;
  • Settings are scoped along three axes: project, configuration and task;
  • A Task is a computation of values or side effects;
  • Tasks are also assigned to keys and can depend on settings and other tasks;
  • Assigning a value to a TaskKey[T] yields a Setting[Task[T]].
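
A small build.sbt fragment showing these concepts together, written in the sbt 0.13 syntax that was current at the time of the talk. The project name and the `greeting` and `greet` keys are hypothetical. Newer sbt versions prefer the `Compile / scalacOptions` form for scoping.

```scala
// build.sbt

// Plain settings: each line is a Setting that transforms the build description.
name := "example-service"
scalaVersion := "2.11.8"

// A setting scoped to the Compile configuration axis.
scalacOptions in Compile += "-deprecation"

// Hypothetical keys, defined only for this example.
lazy val greeting = settingKey[String]("Message printed by the greet task")
lazy val greet = taskKey[Unit]("Prints the greeting")

// A setting that depends on another setting.
greeting := "building " + name.value

// A task that depends on a setting; this line is a Setting[Task[Unit]].
greet := println(greeting.value)
```

Running `sbt greet` with this fragment would evaluate the task and print the message.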

Dependency resolution is slow

  • Dependency resolution (Ivy) is single-threaded, and not cached by default;
  • There are alternatives: coursier, a pure Scala dependency resolver, is around 5 times faster at resolving dependencies in a multi-project build with 28 projects (5 vs. 25 minutes).

Scala cons: IDE and tooling

IDEs

  • IntelliJ IDEA;
  • Scala IDE (for Eclipse);
  • ENSIME (for Emacs, Atom, Vim and Sublime).

Rapidly evolving ecosystem

IDE plugins are not as sophisticated as their Java counterparts, which can frustrate developers. Ongoing work on the Scala presentation compiler (used by Scala IDE and ENSIME) and other tools gives us reason to be optimistic about the future.

Questions?