Ralsina.Me — Roberto Alsina's website

New project: croupier

Intro to Dataflow Programming

This post introduces a new project called Croupier, a library for dataflow programming.

What is that? It's a programming paradigm where you don't specify the sequence in which your code will execute.

Instead, you create a number of "tasks", declare how the data flows from one task to another, provide the initial data, and then the system runs as many or as few of the tasks as needed, in whatever order it deems best.
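To make that concrete, here is a minimal sketch of the idea in Crystal. It is not how Croupier works internally: the `Task` record, the `run_all` method and the task names are all made up for illustration. Tasks are listed in the "wrong" order on purpose, and the runner figures out a valid order from the declared inputs and outputs.

```crystal
# Hypothetical, minimal dataflow runner (an illustration, not Croupier's API).
# Each task names the data it needs and the data it produces.
record Task,
  name : String,
  inputs : Array(String),
  output : String,
  run : Proc(Hash(String, String), String)

# Runs every task once, in whatever order the declared inputs allow.
# Returns the names of the tasks in the order they actually ran.
def run_all(tasks : Array(Task), data : Hash(String, String)) : Array(String)
  order = [] of String
  pending = tasks.dup
  until pending.empty?
    # A task is ready once every input it declares is available.
    ready, pending = pending.partition { |t| t.inputs.all? { |i| data.has_key?(i) } }
    raise "unsatisfiable or cyclic dependencies" if ready.empty?
    ready.each do |t|
      data[t.output] = t.run.call(data)
      order << t.name
    end
  end
  order
end

tasks = [
  # Declared first, but it needs "lower", which nobody has produced yet.
  Task.new("shout", ["lower"], "upper", ->(d : Hash(String, String)) { d["lower"].upcase }),
  Task.new("normalize", ["raw"], "lower", ->(d : Hash(String, String)) { d["raw"].downcase }),
]

data = {"raw" => "Hello"}
order = run_all(tasks, data)
puts order.join(", ") # normalize, shout
puts data["upper"]    # HELLO
```

Nothing in the calling code says "run normalize before shout". That ordering falls out of the data declarations, which is the whole point.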

Examples

Put that way it looks scary and complex, but it's something so simple that almost every programmer has run into a tool based on this principle:

make

When you create a Makefile, you declare a number of "targets", "dependencies" and "commands" (among other things), and then when you run make a_target it's make that decides which of those commands need to run, how, and when.
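The core of that decision is a timestamp comparison: a target needs rebuilding if it doesn't exist or if any dependency is newer than it. Here is a rough sketch of that rule in Crystal. The `stale?` method is a hypothetical helper, and it takes modification times as plain values so the example doesn't depend on real files.

```crystal
# Hypothetical helper sketching make's basic rule:
# a target is stale when it is missing or older than any dependency.
# target_mtime is nil when the target file does not exist yet.
def stale?(target_mtime : Time?, dep_mtimes : Array(Time)) : Bool
  if target_mtime
    dep_mtimes.any? { |mtime| mtime > target_mtime }
  else
    true
  end
end

last_build = Time.utc(2023, 5, 1)
source_changed = Time.utc(2023, 5, 2)

puts stale?(nil, [source_changed])        # target missing
puts stale?(last_build, [source_changed]) # a dependency is newer than the target
puts stale?(source_changed, [last_build]) # target is newer than its dependency
```

With real files, the times would come from something like `File.info(path).modification_time`.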

Let's consider a more complex example: a static site generator.

Usually, these take a collection of markdown files with metadata such as title, date, tags, etc., and use that to produce a collection of HTML and other files that constitute a website.

Now, let's consider it from the POV of dataflow programming with a simplified version that only takes markdown files as inputs and builds a "blog" out of them.

For each post in a file foo.md there will be a /foo.html.

But if that file has tags tag1 and tag2, then the contents of that file will affect the output files /tags/tag1.html and /tags/tag2.html.

And if one of those tags is new, then it will affect /tags/index.html.

And if the post itself is new, then it will be in /index.html.

And also in an RSS feed. And the RSS feeds for the tags!

As you can see, adding or modifying a file can trigger a cascade of changes in the site.

Which you can model as dataflow.
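As a sketch of what that model looks like, this Crystal snippet lists every output a single post could feed into. The `affected_outputs` method is hypothetical, and the feed paths (/rss.xml and /tags/NAME.xml) are assumptions made for the example, not paths used by any particular generator.

```crystal
# Hypothetical helper: given a post's slug and tags, list every output
# file that the post feeds into. Feed paths are assumed for illustration.
def affected_outputs(slug : String, tags : Array(String)) : Array(String)
  # The post's own page, the main index and the main feed.
  outputs = ["/#{slug}.html", "/index.html", "/rss.xml"]
  tags.each do |tag|
    outputs << "/tags/#{tag}.html" # the tag's page
    outputs << "/tags/#{tag}.xml"  # the tag's feed
  end
  # The tag index can only change if the post has tags at all.
  outputs << "/tags/index.html" unless tags.empty?
  outputs
end

puts affected_outputs("foo", ["tag1", "tag2"]).join("\n")
```

Each of those paths would be the output of a task that declares foo.md as one of its inputs, so touching foo.md marks exactly those tasks as needing a rerun.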

That's the approach used by Nikola, a static site generator I wrote. Because it's implemented as dataflow, it can build only what's needed, which in most cases is just a tiny fragment of the whole site.

That is done via doit, an awesome tool more people should know about, because a lot more people should know about dataflow programming itself.

So, what is Croupier?

It's a library I am writing for dataflow programming in the Crystal language!

Here's an example of it in use, from the docs, which should be self-explanatory if you have a passing knowledge of Crystal or Ruby:

require "croupier"

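# task1 declares input.txt as its input and fileA as its output
b1 = ->{
  puts "task1 running"
  File.read("input.txt").downcase
}

Croupier::Task.new(
  name: "task1",
  output: "fileA",
  inputs: ["input.txt"],
  proc: b1
)

# task2 declares fileA (task1's output) as its input, so it depends on task1
b2 = ->{
  puts "task2 running"
  File.read("fileA").upcase
}

Croupier::Task.new(
  name: "task2",
  output: "fileB",
  inputs: ["fileA"],
  proc: b2
)

# Croupier decides which tasks need to run, and in what order
Croupier::Task.run_tasks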

Why?

Because I want to write a fast SSG in Crystal, and because dataflow programming is a fundamental tool in my toolkit.

Anything else?

I will probably also write a simple make-like tool, just as a playground for Croupier.


Contents © 2000-2023 Roberto Alsina