
4 posts tagged with "scala"


· 8 min read

That’s the whole idea.

We want to model data in Scala, but instead of using instances of classes at the term level, we want to use their type-constructors at the type level.

Recruitment system

Let’s pick an example to help us visualize the whole process better, because, as a wise person once said, “A picture is worth a thousand words”.

We will represent candidate profiles in a recruitment process for software engineering companies. Let’s start with the term model code and I’ll walk you through it.

case class Candidate(
  name: String,
  experience: List[Experience],
  otherQualities: List[String]
)

case class Experience(
  duration: Int,
  expLevel: ExpLevel,
  company: String,
  technologies: List[String]
)

enum ExpLevel:
  case Junior
  case Regular
  case Senior
  case CEO
export ExpLevel.*

We can represent candidates by providing their:

  • name – just a String
  • experience history – list of Experience entries
  • other qualities – a list of Strings

An experience entry is represented by:

  • duration – number of months at the job
  • experience level – enum value representing experience levels in IT
  • company name – a String
  • technologies – list of technologies used

So if we were to create a very simplified profile using our model, it would look like this:

val candidate = Candidate(
  "John Paul",
  Experience(
    29,
    Junior,
    "VirtusLab",
    "Scala" :: Nil
  ) :: Nil,
  "Motivated" :: Nil
)

Cool, nothing new so far.

Making it spicier

That was some basic Scala. Now, what we want is to have all this information at the type level.

You might be asking: We already declared a model in the previous section. Can’t we just use that one? As expected, the answer is: No. That’s because Scala distinguishes terms from types. The previous model worked on the term level, and we want to do it on the type level. So, we will have to tweak it a bit.

To make our intentions 100% clear, we want to be able to declare a type like the following (or at least something similar).

type candidate = Candidate[
  "John Paul",
  Experience[
    29,
    Junior,
    "VirtusLab",
    "Scala" :: Nil
  ] :: Nil,
  "Motivated" :: Nil
]

Let’s start our work with the most basic class and work our way up the dependency graph.

Experience level

First, let’s look at ExpLevel. We declared it before as

enum ExpLevel:
case Junior
case Regular
case Senior
case CEO
export ExpLevel.*

When we think about it, its type constructors carry the same amount of information as its data constructors, so we could leave it as it is. There is a small problem with the current declaration though. When we want to access the type of Junior and use it as e.g. a type parameter for List, we cannot just write List[Junior]. That’s because there is no type named Junior, only a value. Instead, we have to write List[Junior.type]. This can be quite annoying, especially when it’s a part of the interface exposed to the user. Is there a way to fix it then? Yes, and it’s actually quite simple. Just like by writing in Python I can force myself into a crippling depression, you can force Scala to generate classes for all our cases by just adding parentheses after the constructors. Then, those won’t just be values, but classes with empty constructors.

enum ExpLevel:
  case Junior()
  case Regular()
  case Senior()
  case CEO()
export ExpLevel.*
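
A quick check that the fix works (a one-liner, assuming the enum above):

// Junior is now a proper class, so it can be used directly as a type:
val levels: List[Junior] = List(Junior())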

Nice, on to the next one.

Experience

Now that we fixed the ExpLevel data type, let’s move on to Experience. In the term model, it looked like this

case class Experience(
  duration: Int,
  expLevel: ExpLevel,
  company: String,
  technologies: List[String]
)

We want all of those term parameters to become type parameters, so let’s try just adding them. The strategy will be, for every term parameter, to:

  1. create a type parameter with the same name
  2. add a type constraint for it using the <: operator

It is important that we use <: here and not :. That is because, when used on type parameters, the former means “is a subtype of” and the latter means “has an implicit instance of”. Let’s take a look at the result of our transformation then.

case class Experience[
  duration <: Int,
  expLevel <: ExpLevel,
  company <: String,
  technologies <: List[String]
]()

At first glance, it looks fine and very similar to the term model. We have an entry for every parameter, and the constraints are the same as before. But does it work? Well, no. If I were to play the role of a build tool, I would say that we have one warning and one error.

Let’s start with the warning. Take a look at this class and think: what does the case keyword give us here? Well, it gives us an apply function for our empty constructor, getters for our non-existent fields, an unapply function for a class we will never construct, and some other extremely useful methods. Do you get the point? The case keyword here is just as useful as a cats-effect expert at Ziverge.

Cool. On to the error now. This one might not be as easy to spot. To make it easier, let’s look at how List is implemented. Skipping a lot of details, we have:

sealed abstract class List[+A]
final case class :: [+A](head: A, next: List[A]) extends List[A]
case object Nil extends List[Nothing]

We have the supertype List and two constructors: :: (cons) and Nil. Nil carries no information, since it just symbolizes an empty list. No problem there. But when we take a look at ::, it only has one type parameter – the element type. This means that, used at the type level, it can carry only one String type, not a whole sequence of them.

Let’s create our own data structure then. To make it easier, it should only contain Strings.

sealed trait StrList
class Nl extends StrList
class :|:[head <: String, tail <: StrList] extends StrList

Voilà. We just took the definition of List and moved every term parameter to the type level, like before.
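
As a quick sanity check, we can already spell out a type-level list of strings, with string literal singleton types as the elements:

// :|: ends in a colon, so it is right-associative, just like ::
type Techs = "Scala" :|: "Java" :|: Nl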

If we put all the parts together, we get:

class Experience[
  duration <: Int,
  expLevel <: ExpLevel,
  company <: String,
  technologies <: StrList
]

Candidate

Let’s take a look at our last class – Candidate.

case class Candidate(
  name: String,
  experience: List[Experience],
  otherQualities: List[String]
)

Right off the bat, we can spot the same problem as with Experience – Lists. Fortunately, we already have a structure for type-level lists of Strings from before. This means that we just need lists of Experiences. We can declare it in a similar way as with lists of strings, right? Let’s try.

sealed trait Experiences
class Empty extends Experiences
class :+:[head <: ???, tail <: Experiences] extends Experiences

Ok. This looks exactly like StrList with some minor name changes. Why is there a ??? instead of a constraint on head? That’s because we cannot use Experience there: Experience is a type constructor that takes a non-empty parameter list, so we would have to apply it fully on the spot.

Is there some trick we can use here? Or is Scala’s type system not expressive enough? Of course, there is a workaround. It is also quite a common pattern. It’s every functional programmer’s biggest nightmare and every object-oriented programmer’s wet dream: Inheritance.

If we add a supertype to our Experience class, we can use it in every place where we would usually use a type and treat Experience as the implementation.

sealed trait Exp

class Experience[
  ...
] extends Exp

Is this solution pretty? No. But as the tapeworm said: There was no other way.
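
With Exp in place, we can fill in the ??? from before:

sealed trait Experiences
class Empty extends Experiences
class :+:[head <: Exp, tail <: Experiences] extends Experiences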

Now that we have fixed this issue, there is nothing interesting left in transforming the Candidate class.

class Candidate[
  name <: String,
  experience <: Experiences,
  otherQualities <: StrList
]

Final form

After all that work, we can finally write our example instance correctly.

type mystery = Candidate[
  "John Paul",
  Experience[
    29,
    Junior,
    "VirtusLab",
    "Scala" :|: Nl
  ] :+: Empty,
  "Motivated" :|: Nl
]

And it compiles, which means that it works!

C'mon, Do Something

Now, you’re probably thinking: “Cool, we can model data now but there is more to computer systems than just data.” There is always some domain logic that needs to be implemented. In our case, we should definitely add some sanity checks. Like removing any experience in Rust and adding a “Good sense of humor” quality instead.

Can we do that? Yes, but since this blog post is already longer than the documentation for http4s, I will have to end it here. If this post gets enough views, I will write a part II.

I hope the content was at least mildly interesting and that you didn’t take anything I wrote seriously. Especially type-level programming.

Medium link: https://medium.com/virtuslab/data-modeling-in-scala-3-but-i-only-use-types-b6f11ead4c28

· 11 min read

Disclaimer

Most of the article is to be perceived as a joke or satire. The post is intended as a light read. If you manage to get any educational value from it, you will most likely also enjoy reading the ingredients list of 2% milk. Enjoy!

Intro

We computer programmers frequently state that we code because it is our passion, or because we enjoy building things, or for some other fanciful reason. At the end of the day, though, all Software Engineers write code to make money. This has been on my mind quite a bit lately. So I did some market research and analysis, and after crunching all the figures, I put my findings into this graphic.

There it is plain and simple. You can clearly see that having a job has a huge impact on the amount of money you make. This is where Scala 3 comes in. We will employ Scala 3 to ensure complete job security. How are we going to do that? It’s easy! By making the code impossible to read. If no one else can maintain, let alone read, our code, we will never get fired! In this article, I will use the most straightforward programming problems like “Hello World” or “isPrime” to demonstrate how you can master this essential skill.

Running the examples

This might be a good time to say that, since all the snippets in this blog post are GitHub Gists, you can easily run them with scala-cli using the command below.

> scala-cli run gist-url

Hello World

Ok, let’s start with the best-known coding “problem” i.e. “Hello World”. In Scala 3, the solution to this problem usually looks like this:

@main
def main =
  println("Hello World")

The good stuff

How do we make this simple code unreadable? Well, instead of writing the code explicitly, we can generate the code that prints the output. Sounds promising! After all, generating code isn't easy, right?

import scala.quoted.*

inline def helloWorld =
  ${ helloWorldImpl }

def helloWorldImpl(using Quotes): Expr[Unit] =
  '{ println("Hello World") }

Well, it turns out that it’s really straightforward. We just add an inline method that calls an actual implementation. And the actual implementation is the quoted code from our previous solution.

Abstract tree

We've hit a minor obstacle. How can we get around it? The biggest problem with our implementation is that it's obvious what was generated, since we just quoted our code and spliced it. How about we disallow quoted blocks then? That way, the generated code will be way more obscure. What exactly do we need to do here? First, we need to return something that is of type Expr[Unit] and is semantically equivalent to { println("Hello World") }. The main ways to construct an Expr are using the helper functions in its companion object, or creating an (abstract syntax) Tree and converting it to an Expr.

Yeah, right, Exprs and Trees, we don't really care about the semantics. We know that our expression should be the same as the quoted println("Hello World"). So, let's do what software developers usually do: put a bunch of printlns in random places and hope for the best.

def helloWorldImpl(using Quotes): Expr[Unit] =
  println('{ println("Hello World") }.asTerm)
  '{ () }

After running it, at compile time we get:

Inlined(Ident(helloworldMacro$package$),List(),Apply(Ident(println),List(Literal(Constant(Hello World)))))

We can ignore the Inlined because it means that the term we got was an inlined expression. The exciting part is:

Apply(Ident(println),List(Literal(Constant(Hello World))))

So it’s an Apply, which is a function application of Ident(println) to List(Literal(Constant(Hello World))). Great, let’s try recreating it using the Quotes API. So, let’s construct an Apply, with ‘something’ as the first argument and a list containing a string literal as the second:

private def helloWorldImpl(using Quotes): Expr[Unit] =
  import quotes.reflect.*

  val tree = Apply(???, List(Literal(StringConstant("Hello World"))))

  tree.asExprOf[Unit]

It's a good start, but what about the ???? Well… it has to be a reference to println, and from the definition of ApplyModule.apply, it has to be a Term. From this, we can infer that what we are looking for is most likely a Ref, which requires a Symbol.

The Symbol object has methods with a naming pattern beginning with required, for example: requiredClass, requiredMethod, requiredModule, requiredPackage and so on. Those methods let us 'summon' symbols of a specific type defined in the compilation unit or on the classpath. Seems great since we want a static method, right? Let's try.

private def helloWorldImpl(using Quotes): Expr[Unit] =
  import quotes.reflect.*

  val prntln: Ref =
    Symbol.requiredMethod("scala.Predef.println").pipe(Ref(_))

  val tree = Apply(prntln, List(Literal(StringConstant("Hello world"))))

  tree.asExprOf[Unit]

Note: We use pipe here, a function from scala.util.chaining. Although the actual implementation is slightly different, it can be thought of like so:

extension [A](a: A)
  def pipe[B](f: A => B): B = f(a)

If you see a similarity with | in bash, then you’re absolutely right. That’s one of the inspirations for it. And if you see a resemblance with $ or & from Haskell, then you should go out more often.
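
A quick usage example (assuming only the scala.util.chaining import):

import scala.util.chaining.*

// Each pipe feeds the value into the next function, left to right:
val shout = "hello".pipe(_.toUpperCase).pipe(_ + "!") // "HELLO!"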

Right, we can now run it.

[error] ./main.scala:31:3: Exception occurred while executing macro expansion.
[error] dotty.tools.dotc.core.TypeError: Failure to disambiguate overloaded reference with
[error] method println in object Predef: (x: Any): Unit and
[error] method println in object Predef: (): Unit
[error] at dotty.tools.dotc.core.Denotations$MultiDenotation.suchThat(Denotations.scala:1244)
[error] at dotty.tools.dotc.core.Denotations$Denotation.requiredSymbol(Denotations.scala:297)
[error] at dotty.tools.dotc.core.Symbols$.requiredMethod(Symbols.scala:908)
[error] at scala.quoted.runtime.impl.QuotesImpl$reflect$Symbol$.requiredMethod(QuotesImpl.scala:2450)
[error] at scala.quoted.runtime.impl.QuotesImpl$reflect$Symbol$.requiredMethod(QuotesImpl.scala:2450)
[error] at tmp.tmp$package$.helloWorldImpl(tmp.scala:12)
[error] at tmp.tmp$package$.inline$helloWorldImpl(tmp.scala:9)

Cool, we got a good old-fashioned compiler error, with many references to compiler internals in the stack trace. That’s what we wanted to see. Well, except that after skipping the first line, the message is actually pretty reasonable: there simply are two functions named println at this path. And the scaladoc confirms it.

That means that we cannot solve the problem that easily. But that’s good. The more code, the less readable it becomes. If we cannot access the method itself, let’s access the owner first and get the method from the list of its members. The way we do that is:

Symbol.requiredPackage("scala.Predef")

Get its member methods named ‘println’...

  .memberMethod("println")

Then filter out the methods with no arguments…

  .flatMap { m =>
    m.tree match
      case defdef: DefDef
          if !defdef.paramss.flatMap(_.params).isEmpty => Some(m)
      case _ => None
  }

And finally, just take the head of the list and wrap it in Ref...

  .head.pipe(Ref(_))

Cool. So, after composing it into our previous template, we get:

private def helloWorldImpl(using Quotes): Expr[Unit] =
  import quotes.reflect.*

  val prntln = Symbol.requiredPackage("scala.Predef")
    .memberMethod("println")
    .flatMap { m =>
      m.tree match
        case defdef: DefDef
            if !defdef.paramss.flatMap(_.params).isEmpty => Some(m)
        case _ => None
    }.head.pipe(Ref(_))

  val tree = Apply(prntln, List(Literal(StringConstant("Hello world"))))

  tree.asExprOf[Unit]

And testing it gives us…

> scala-cli run ...
Hello World

Great! And just like that, we were able to utilize metaprogramming to write confusing and unmaintainable code.

Not so simple algebra

Since we've already done quite a few Hello Worlds, let's change things up a bit now. Several well-known problems are usually used to learn recursion, e.g. the Fibonacci sequence, factorial, greatest common divisor, isPrime, etc. Let's pick one at random; let's choose isPrime.

Now you may start asking yourself: "How can I really be sure that it's actually at random?". And my answer to you would be: "My blog post, my rules. So if I say that it's random, then it's random, ok?"

Vanilla

Let’s implement the standard version as a warm-up.

def isPrimeCheat(a: Int): Boolean =
  2.until(a).forall(a % _ != 0)

Looks okay, right? RIGHT?! Of course not. I mean… it correctly checks if a number is prime, I’ll give you that. But we said that it’s an exercise for recursion. And do you see any recursion here? Let’s fix it then.

def isPrime(a: Int, acc: Int = 2): Boolean =
  if a <= acc then a == acc
  else a % acc != 0 && isPrime(a, acc + 1)

See? It looks better now.

No value calculations

Now that the warm-up is over, how do we make this unreadable? How about disallowing the use of values? Sounds great, but can we make it work? The answer is obviously yes: we can use types. Scala 3 has a pretty impressive type system that can operate on singleton types; we can say that the type of 1 is 1, and that 1 is a subtype of Int. Cool, right? There is also a pretty comprehensive set of type families (type-level functions) that let us operate on singleton types. This means that we can almost rewrite our standard implementation 1:1. A reasonable attempt looks like this:

import scala.compiletime.ops.int.*

type IsPrime[A <: Int] = IsPrimeRec[A, 2]

type IsPrimeRec[A <: Int, B <: Int] <: Boolean = A <= B match
  case true => A == B
  case false => A % B != 0 && IsPrimeRec[A, S[B]]

First of all, we cannot have default type parameters, so we have to create a proxy that calls our actual implementation. Looking at the actual definition, it is really similar to what we wrote at the value level. The only difference is that we used a match instead of an if expression, and that’s because Scala has match types, but there is no such thing as if-else types.

You might be thinking now: “Hold on a sec. How are we going to print the result if it’s a type?”. First of all: Ok, smarty-pants. And secondly, we can use the function constValue from scala.compiletime, which lets us summon values of a given type if the type has a single decidable inhabitant. Like so:

import scala.compiletime.*

@main
def main =
  println(constValue[IsPrime[13]])
  println(constValue[IsPrime[12]])

So when we run it, we get:

> scala-cli run ...
true
false

Success!

Manual labor

What did we learn from our exercise just now? Scala 3 makes type-level programming (at least for the chosen types) way too easy, mainly because there are so many util functions. That's right, you know what's coming. Let’s limit our usage of functions from scala.compiletime.ops.int to just S. If you don't know what S does, let me explain: it's a type-level successor function for integers, so e.g. S[0] = 1, S[1] = 2 and so on. And why did we choose S in the first place? Because, together with the literal 0, it matches the inductive definition of the natural numbers. And since all of our operations are only required to work on natural numbers, we're going to implement them that way; behaviour for negative numbers is undefined.
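
A quick demonstration of S (both lines compile, because S[n] normalizes to the literal n + 1):

import scala.compiletime.ops.int.S

val one: S[0] = 1
val two: S[1] = 2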

Since the only things that have to change are the declarations of the helper functions, the actual implementation can stay as it was.

type IsPrimeRec[A <: Int, B <: Int] <: Boolean = A <= B match
  case true => A == B
  case false => A % B != 0 && IsPrimeRec[A, S[B]]

What function do we need to implement first? Looks like it’s going to be <=. Since we are implementing it for natural numbers, it seems obvious that the implementation has to follow their inductive definition.

type <=[A <: Int, B <: Int] <: Boolean = A match
  case 0 => true
  case S[a] =>
    B match
      case 0 => false
      case S[b] => a <= b

It's pretty simple:

  • if A is zero, then it is smaller than or equal to any natural number
  • otherwise A is a successor of some a, so:
    • if B is zero, then it cannot be greater than or equal to S[a]
    • otherwise B is a successor of some b, and we check whether their predecessors satisfy the predicate.

Let's do the % next. This one is also pretty simple and looks like this:

type %[A <: Int, B <: Int] <: Int = A < B match
  case true => A
  case _ => (A - B) % B

Nothing exciting going on here. If A is smaller than B, just return A; otherwise, return (A - B) % B.

The rest of the functions are left as an exercise for the reader; a possible shape for two of them is sketched below.
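
For instance, < and - could follow the same inductive pattern (a sketch, and not the only possible shape; subtraction is simply left undefined when the result would be negative, consistent with our naturals-only rules):

type <[A <: Int, B <: Int] <: Boolean = B match
  case 0 => false
  case S[b] =>
    A match
      case 0 => true
      case S[a] => a < b

type -[A <: Int, B <: Int] <: Int = B match
  case 0 => A
  case S[b] =>
    A match
      case S[a] => a - b

With those in place, let’s check if it works: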

> scala-cli run ...
true
false

Conclusions

So, what did we learn from this article? Mainly that, in Scala 3, even code that's meant to be complicated isn't actually that bad. And writing unreadable code requires some skill.

So, why not use some of the languages that are unmaintainable by definition, like Rust or Python? In the case of Rust, it is pretty easy: nobody actually uses Rust; people just like talking about using Rust. And when it comes to Python, the problem is that every program in Python is unmaintainable, and we wanted to write code that is readable only to us.

Medium link: https://medium.com/virtuslab/achieving-indisputable-job-security-using-novel-scala-3-features-a-case-study-65180eab810a

· 8 min read

Intro

If you have decided to read this blog post, you have probably used or at least heard of macros. But just to make sure that we are on the same page: macros / metaprogramming in Scala provide a way to either generate Scala code at compile time or analyze existing code to gather syntactic data.

Since the interface for writing macros in Scala 3 is completely different from that of Scala 2, macro libraries should become easier to develop and maintain. It also means that macro libraries from Scala 2 can’t be easily migrated or ported and instead have to be rewritten using the new TASTY API.

The aim of this blog post is to serve as a manual on efficiently using and navigating the Quotes API (which is the core of metaprogramming), rather than being a migration guide for macros or Scala projects in general. So, for some preface/further reading, the macros documentation can be found here and the migration guide is here. There is also quite a powerful tool, scala3-migrate, which automates most of the migration work.

All code snippets as well as the example mini-project were tested on Scala versions 3.0.0-RC1 and 3.0.0-RC2.

Problem

I strongly believe that the best way to learn is by example. So let’s formulate a problem so that we have something to solve (because that’s how real life works). Let’s create a program that, for a class of kind * -> *, generates a neat type description. So for a case class like this:

case class NonEmpty[T](e: T, tail: Option[NonEmpty[T]])

we want to generate a string like this:

"NonEmpty(e: T, tail: Option[NonEmpty[T]])"

Base

Like the title of the article suggests, we are going to be using TASTY reflect. So let’s start by creating an empty object for our code.

import scala.quoted.*

object TypeInfo {
  inline def apply[T[_]]: String = ${ typeInfoImpl[T] }

  def typeInfoImpl[T[_]: Type](using Quotes): Expr[String] = {
    import quotes.reflect.*

    ???
  }
}

Let’s take a look at what is going on here. First, we import scala.quoted.* to have access to Type and Quotes. Then we have the apply method. It only takes a single type parameter, because our code isn’t supposed to depend on any value, but rather on the given type. The body of apply is just the spliced value of typeInfoImpl. When it comes to the typeInfoImpl declaration, it takes the same type parameter and two implicit arguments:

  • qctx (short for Quotes context) - gives us access to the reflect API
  • tpe - the type information of the type parameter

The method returns a value of type Expr[String], which after splicing yields a String.

Code <3

Cool, so now that we have a base, we can start writing actual code. Let’s start with something simple, like just getting the class’s name.

Our starting point is the tpe value, but in order to get the data we need, we have to transform this Type[T] into something from TASTY reflect. Let’s take a look at the hierarchy in dotty/Quotes.scala then. The important part is this:

+- TypeRepr -+- NamedType -+- TermRef
|            |             +- TypeRef
|            +- ConstantType

So we know that we need a TypeRepr, but in the Quotes file there are no functions that would let us obtain one. That’s because all the methods and functions for operating on TASTY types are in QuotesImpl.scala. The basic structure of this file is that, for every AST node, there are three main entries:

  • a type alias for the internal node type
  • a companion object, which implements constructor functions like apply, but also methods like unapply and copy
  • a given with extension methods for our type; the name of this given is always type_name + “Methods”

So the relevant entries for TypeRepr are:
type TypeRepr = dotc.core.Types.Type

object TypeRepr extends TypeReprModule:
  ...
  def of[T <: AnyKind](using tp: scala.quoted.Type[T]): TypeRepr =
    tp.asInstanceOf[TypeImpl].typeTree.tpe
  ...
end TypeRepr

given TypeReprMethods: TypeReprMethods with
  extension (self: TypeRepr)
    ...
    def typeSymbol: Symbol = self.typeSymbol
    ...
  end extension
end TypeReprMethods

Great, now we have a TypeRepr. Unfortunately, it doesn’t have any methods that give us access to the type’s name; to get that information, we have to access its typeSymbol. After looking through the extension methods in SymbolMethods, we can find the method name, which is exactly what we are looking for. Our very much WIP code looks like this:

val tpe = TypeRepr.of[T]
val name = tpe.typeSymbol.name
Expr(name)

Now that we have the basics covered, it’s time to handle value parameters. Once again, we start with tpe of type TypeRepr. We want to access the type declaration, so we have to get typeSymbol. After looking in SymbolMethods for something that can get us case declarations of the class, we can find:

def caseFields: List[Symbol] = ...

Which does exactly what we want. Our description displays the label and type of every parameter. Getting the label is simple because, just like with T’s name, we have a Symbol with the name method. Unfortunately, there is no method that can give us the type of a declaration straight from a Symbol. That means we have to look into the AST, which can be accessed from Symbol with the method tree (who would have thought :D). Ok, so can we deduce what types of AST nodes our Symbols are? Let’s try, by looking at the hierarchy in Quotes. We can intuitively guess that our case declarations are some kinds of definitions :o. Here is the relevant piece then:

+- Definition --+- ClassDef
|               +- TypeDef
|               +- DefDef
|               +- ValDef

Let’s go through all the options one by one:

  • ClassDef is a definition of a class, so it obviously cannot be a case declaration
  • TypeDef is a declaration of a type. Type parameters are of type TypeDef, but they aren’t considered case fields
  • DefDef is a definition of a method, which can’t be a case field either
  • ValDef is a value (or variable) definition - all case fields are of this type

Based on that, we should match on ValDefs. Let’s take a look at the code we have described so far:
val caseFields = tpe.typeSymbol.caseFields.map { s =>
  val name = s.name
  val tpe = s.tree match {
    case v: ValDef =>
      ???
  }
  s"$name: $tpe"
}

Cool, what can we get from our ValDef then? We don’t have much choice here:

given ValDefMethods: ValDefMethods with
  extension (self: ValDef)
    def tpt: TypeTree = self.tpt
    def rhs: Option[Term] = optional(self.rhs)
  end extension
end ValDefMethods

Obviously, we want the TypeTree here, and after looking at TypeTreeMethods, there is only one method - tpe: TypeRepr. TypeRepr has a bunch of possible specific types that we will have to look into in a second. But for now, let’s use the same trick as at the very beginning to get the class name (.typeSymbol.name). Now our code looks like this:

val tpe = TypeRepr.of[T]
val name = tpe.typeSymbol.name

val caseFields = tpe.typeSymbol.caseFields.map { s =>
  val name = s.name
  val tpe = s.tree match {
    case v: ValDef =>
      v.tpt.tpe.typeSymbol.name
  }
  s"$name: $tpe"
}

Expr(
  s"$name(${caseFields.mkString(",")})"
)

And it gives this output:

"NonEmpty(e: T,tail: Option)"

Looks almost done. The only thing missing is the type parameters of Option. As I mentioned before, TypeRepr has many specific node types, so let’s take a look at some of them:

+- TypeRepr -+- NamedType -+- TermRef
|            |             +- TypeRef
|            +- AppliedType
|            +- AndOrType -+- AndType
|            |             +- OrType
...

There are more of them, so in a real-life scenario we would have to handle all of them. But my example, my rules. Most of those types are structurally recursive, so we will delegate our type-extraction logic to a function. For every AST node type, we can look for the desired methods just like before. For NamedType there is a method name; for AppliedType we can just use unapply to get the tycon (type constructor) and args; and so on. The result looks like this:

def fullTypeName(tpe: TypeRepr): String = tpe match
  case t: NamedType =>
    t.name
  case o: OrType =>
    fullTypeName(o.left) + " | " + fullTypeName(o.right)
  case o: AndType =>
    fullTypeName(o.left) + " & " + fullTypeName(o.right)
  case AppliedType(base, args) =>
    fullTypeName(base) + args.map(fullTypeName).mkString("[", ",", "]")

After plugging the function into our main code, the result looks like this:

"NonEmpty(e: T,tail: Option[NonEmpty[T]])"

Which is exactly what we wanted :D
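
For reference, here is one way all the pieces from this walkthrough could be assembled into the complete macro (a sketch of the final shape; the linked repository contains the actual project):

import scala.quoted.*

object TypeInfo {
  inline def apply[T[_]]: String = ${ typeInfoImpl[T] }

  def typeInfoImpl[T[_]: Type](using Quotes): Expr[String] = {
    import quotes.reflect.*

    // Recursively render a TypeRepr, handling the node types we covered
    def fullTypeName(tpe: TypeRepr): String = tpe match
      case t: NamedType => t.name
      case o: OrType    => fullTypeName(o.left) + " | " + fullTypeName(o.right)
      case o: AndType   => fullTypeName(o.left) + " & " + fullTypeName(o.right)
      case AppliedType(base, args) =>
        fullTypeName(base) + args.map(fullTypeName).mkString("[", ",", "]")

    val tpe = TypeRepr.of[T]
    val name = tpe.typeSymbol.name

    // Render every case field as "label: type"
    val caseFields = tpe.typeSymbol.caseFields.map { s =>
      val fieldType = s.tree match {
        case v: ValDef => fullTypeName(v.tpt.tpe)
      }
      s"${s.name}: $fieldType"
    }

    Expr(s"$name(${caseFields.mkString(",")})")
  }
}

// TypeInfo[NonEmpty] then yields "NonEmpty(e: T,tail: Option[NonEmpty[T]])"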

Takeaways

The examples shown in this article are intentionally straightforward, just to show the basic process of working with TASTY reflect API. But the main ideas I wanted to show are:

  • Look for node types in Quotes
  • Look for implementation and methods in QuotesImpl
  • Macros in dotty are way easier to write than in Scala 2

Code for this example is available here.

Medium link: https://medium.com/virtuslab/tasty-way-of-re-writing-macros-in-scala-3-3ce704a2c37c

· 9 min read

Motivation

Programmers tend to use strongly typed languages for runtime safety and their own comfort while developing applications. When using a new dependency, they often have to browse the documentation by the symbolic names of classes and functions. Oftentimes, they don’t know the function name, but are convinced there must be a function somewhere that fits a given type transformation. In this post, we will focus on a prototype tool that lets you browse Kotlin docs using types as search keys.

Once in a while, every developer stumbles upon code like this:

val list = listOf("Andrzej", "Filip", "Michał")
return Pair(
  list.filter { it.length <= 5 },
  list.filter { it.length > 5 }
)

And then a thought comes in. This looks like something people might do a lot. Surely it can be done in a shorter, more readable way. So, what do we know that can help us refactor this code? Well, in order to replace this Pair(list.filter(...), list.filter(...)) we want a function that behaves like this:

<T> List<T>.((T) -> Boolean) -> Pair<List<T>, List<T>>

Ok, that’s great, but we are still pretty much nowhere, because we need this function’s name to call it. How would we conventionally find it? Well, we could use Dokka for the stdlib and look through potential functions, but that can take a lot of time. Plus, it is way too close to actual work, and we (software developers) don’t really like that. That’s where Inkuire comes in. Inkuire lets us search a library’s documentation with function signatures as search keys.

Oooo, by the way the function we are looking for is partition.

Why Scala for Kotlin tooling?

One can wonder: Why are you using Scala for Kotlin tooling? Those are actually two questions framed as one:

  • “Why for Kotlin?” - This one is really simple. As software developers, we don’t really like doing too much work. In the case of gathering Kotlin source data, dokka can do a huge share of the work for us. We just need to format the data and persist it. Additionally, Kotlin has a much simpler type system than Scala (especially Scala 3). Therefore, having Hoogle for Kotlin is a proof of concept for having a similar tool in the Scala 3 world.
  • “Why in Scala?” - The first reason is that Scala is a more mature language. Scala.js has better support and documentation than Kotlin/JS. The other reason is just our personal preference. Scala, with the use of Cats and similar libraries, allows us to write code in a more functional way, and probably everyone can agree that that is the 2020 way to code.

Gathering code data

First of all, we need a lot of data about code. It’s not plain data from source code, but rather complete information about types, provided by the Kotlin compiler. Therefore, we have to analyse the sources before we can serialize them. Of course, we could use the descriptor analysis offered by JetBrains, but there is a more convenient way of doing that, thanks to the recently released documentation tool - dokka. You can find out more about dokka here; what you have to know is that its powerful pluggability lets you access all the required data about Kotlin and Java sources through a very simple and intuitive API. If you would like to use dokka to analyse your own sources, check out this great article by Marcin Aman.

Once we have the data, it’s time to use it to find our mystery function. The first thing we have to worry about is how to tell the engine what we want; in other words, what the format of the query should be. After reading the title and motivation, it shouldn’t come as a surprise that we want to search for a function with a specific signature, so our input is just going to be a Kotlin signature.

The first step in processing an input string is parsing the given text with a grammar that recognises Kotlin function signatures, and then mapping it to our model. Ironically, searching through scala-parser-combinators with signatures as search keys would be really helpful here, since the most commonly used functions from this library are: ^^, ~, ~>, |, <~, ^^^. All of those seem pretty self-explanatory, so I won’t go into much detail about the parser itself beyond the small taste below. But if you’d like to learn more about using scala-parser-combinators, the getting started page is a nice starting point.
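
Here is that small taste: a toy fragment (an illustration only, not Inkuire's actual grammar) that parses a signature like String.(Int) -> Any into its receiver, argument list and result:

import scala.util.parsing.combinator.*

object ToySigParser extends RegexParsers {
  // A bare type identifier, e.g. "String" or "Int"
  val id: Parser[String] = """[A-Za-z][A-Za-z0-9]*""".r

  // receiver.(arg, ...) -> result
  val sig: Parser[(String, List[String], String)] =
    id ~ ("." ~> "(" ~> repsep(id, ",") <~ ")") ~ ("->" ~> id) ^^ {
      case receiver ~ args ~ result => (receiver, args, result)
    }
}

// ToySigParser.parseAll(ToySigParser.sig, "String.(Int) -> Any")
// succeeds with ("String", List("Int"), "Any")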

After parsing, we have our signature mapped into a more approachable form. So let’s look at our application from the user’s perspective. Say I input a signature like String.(Int) -> Any. What functions do I want to see as the result? In other words, what should the relation between the input signature and the result signatures be? Well, the easiest and most intuitive relation is substitution: for the given signature, anything that can be used in its place should be fine. So a function like drop, with the signature String.(Int) -> String, is a good fit, since it has the same input types and just a more specific return type. But a function like maxOf (Int.(Int) -> Int) doesn’t fit, because the receiver - Int - has nothing to do (in terms of subtyping) with the expected receiver String.
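
To make the substitution relation concrete, here is a toy model of the check (the names and the subtyping oracle are illustrative assumptions, not Inkuire's real implementation):

// A simplified signature: receiver, argument types, result type.
case class Sig(receiver: String, args: List[String], result: String)

// Toy subtyping oracle; the real engine derives this relation from the
// type hierarchy extracted by dokka.
def isSubtype(sub: String, sup: String): Boolean =
  sub == sup || sup == "Any" || (sub == "String" && sup == "CharSequence")

// A candidate substitutes for the query if it accepts at least the
// queried inputs (receiver and args) and returns at most the queried
// result type.
def fits(candidate: Sig, query: Sig): Boolean =
  isSubtype(query.receiver, candidate.receiver)
    && candidate.args.size == query.args.size
    && candidate.args.zip(query.args).forall((c, q) => isSubtype(q, c))
    && isSubtype(candidate.result, query.result)

val query = Sig("String", List("Int"), "Any")
fits(Sig("String", List("Int"), "String"), query) // drop  -> true
fits(Sig("Int",    List("Int"), "Int"),    query) // maxOf -> false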

HTTP Client

What would Inkuire be without an easily accessible, user-friendly client? The most intuitive option, and the simplest to deploy on your own, is a RESTful service. Inkuire offers a ready-to-use JAR that lets you ship the engine locally or globally without much overhead. Graphic design is not our passion, but we did our best.

You can also try it yourself here.

What if we would like to embed the engine into the documentation itself?

Imagine this: you configure dokka for your own library. Your code encourages a functional-programming style, maybe has ArrowKt as a dependency. You would like to ship your documentation as HTML pages, but the search bar in dokka’s default template only allows you to search by function names. It would be awesome if users could browse the documentation using signatures as search keys. We thought the same. So we decided to enable that using Scala.js!

Is it even possible?

Well, Scala.js has always been a dark horse of Scala. Many Scala developers remain unaware to this day that Scala.js exists. But it does. And it has really good support from community libraries. The idea is: you can transpile your Scala code to JavaScript if all your dependencies can be transpiled too, or if you only depend on the stdlib. Luckily, many popular libraries guarantee that compatibility.

You can try it yourself here.

So how does it work internally?

The querying engine is pure: it takes an input signature and produces a list of matching functions. Transpilation to JavaScript is a piece of cake. The JavaScript obtained from the Scala code lets you call the matching function the same way you would call it on a standard JVM target. The only thing missing is a way to bind the function to the DOM search bar. Luckily, Scala.js provides a DOM API, so you can include all the logic in Scala code without writing a single line of JavaScript yourself. Isn’t it awesome?
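
A minimal sketch of that binding (inkuireMatch and the element id are hypothetical stand-ins for the real engine entry point and the dokka template):

import org.scalajs.dom
import org.scalajs.dom.html

// Hypothetical stand-in for the transpiled engine's entry point.
def inkuireMatch(signature: String): List[String] = ???

@main def bindSearchBar(): Unit =
  val input = dom.document
    .getElementById("inkuire-search") // hypothetical element id
    .asInstanceOf[html.Input]
  input.addEventListener("input", (_: dom.Event) =>
    inkuireMatch(input.value).foreach(println)
  )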

Why Scala.js and not RESTful service?

Why did we decide to transpile the engine code into JavaScript instead of using the previously described RESTful server to delegate calls and present results? Mainly because we can encapsulate the whole deployment process in one plugin. The user doesn’t have to bother with deploying the JAR with the engine: anyone who can ship docs generated by dokka can ship them with our plugin attached. This approach also removes the problem of having to update the data on the server with every release. The database is built together with the documentation, so it will always be in sync with it. The cost of adding the plugin to dokka isn’t that big (memory-wise): the JavaScript code itself is only a few MB, while e.g. the JVM part of the stdlib is 15 MB.

Runtime efficiency test of JS and JVM

Is it worth using an engine running in your browser instead of on a dedicated JVM? Let’s see. The criteria of the test are: the engine processing time, and the overall time for the user from typing the signature to receiving the results. The JVM tests were conducted using Apache JMeter, and the JS ones with Selenium (Chrome runner). The table below shows the results:

Platform   Avg engine processing time   Std engine processing time   Avg overall time   Std overall time
JVM        330.26 ms                    26.64 ms                     332 ms             25.65 ms
JS         1165.57 ms                   100.98 ms                    2170.31 ms         101.43 ms

As you can see, the JVM version is about 3.5 times faster than the JS one in engine processing time, and over 6 times faster overall. The additional second of overall time in JS comes from the debounce on the input field, which we use to detect when the user stops typing. One could think it’s better to use the RESTful service; however, the latency is small enough that waiting for the results is hardly an inconvenience, and you get the advantage of jumping directly to the exact documentation subpage.

What if I would like to use it myself?

Currently, we do not publish artifacts to remote repositories. If you would like to use Inkuire for your project, the source code is here, and the installation guide can be found in the readme. Note that Inkuire has two main drawbacks. One is that multiplatform support is not fully integrated - you have to choose arbitrarily which source sets you would like to query. The other is that there is still a problem with getting the full hierarchy tree of types declared in dependencies. The rule of thumb is the same as with Scala.js: to obtain a full hierarchy tree, you must provide type databases for all dependencies. We know that going recursively deeper into the dependency tree and generating all the type databases is a tedious job, but it’s the only solution available right now. Using a type database only for a given library will cause the engine to work heuristically: it will give correct and applicable results, but it won’t see all possible substitutions, and you will not be able to query with types that you know are higher in the inheritance tree.

Medium link: https://medium.com/virtuslab/how-to-write-hoogle-for-kotlin-in-scala-and-scala-js-8c98c1c303ff