Ralsina.Me: Roberto Alsina's website

Posts about programming

Benchmarking Markdown in Crystal

I am working a bit (slowly) on Nicolino, a static site generator written in Crystal. One of the things it does is render markdown files.

Since the markdown usage is limited to, like, 3 lines, I thought "Why not try all the markdown libraries and see which one is fastest?"

So, I did.

The bench­mark is sim­ple:

  • An emp­ty Nicol­i­no site.
  • 4000 sim­ple mark­down files (a few lorem ip­sum para­graph­s)
  • Nicol­i­no com­piled in re­lease mode
  • Ren­der the whole site 10 times, av­er­age the last 7 runs
  • De­fault con­figs for ev­ery­thing

All this on this ma­chine:

        a8888b.           Host        -  ralsina@mindy
       d888888b.          Machine     -  Micro Computer (HK) Tech Limited Default string UM560 XT
       8P"YP"Y88          Kernel      -  6.9.9-arch1-1
       8|o||o|88          Distro      -  EndeavourOS
       8'    .88          DE          -  Qtile
       8`._.' Y8.         Packages    -  1340 (pacman), 1 (cargo), 18446744073709551615 (Homebrew)
      d/      `8b.        Terminal    -  zellij
     dP        Y8b.       Shell       -  fish
    d8:       ::88b.      Uptime      -  10d 18h 7m
   d8"         'Y88b      CPU         -  AMD Ryzen 5 5600H with Radeon Graphics (12)
  :8P           :888      Resolution  -  4096x2160, 1920x1080
   8a.         _a88P      CPU Load    -  15%
 ._/"Yaa     .| 88P|      Memory      -  9.2 GB/24.5 GB
 \    YP"    `|     `.
 /     \.___.d|    .'
 `--..__)     `._.'
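
The "render 10 times, average the last 7 runs" procedure can be sketched roughly like this. It is a minimal illustration, not Nicolino's actual benchmark code: `average_time` is a hypothetical helper, and the block passed to it is a stand-in workload where the real benchmark would render the whole site.

```crystal
# Hypothetical helper, not Nicolino code: run a block `runs` times,
# drop the first `warmup` timings, and average the rest (in seconds).
def average_time(runs : Int32, warmup : Int32, &block : ->) : Float64
  samples = Array(Float64).new
  runs.times do
    samples << Time.measure { block.call }.total_seconds
  end
  kept = samples.skip(warmup)
  kept.sum / kept.size
end

# Stand-in workload; the real benchmark rendered the whole site here.
avg = average_time(runs: 10, warmup: 3) { 4000.times { |i| i.to_s } }
puts "average of the last 7 runs: #{avg.round(6)}s"
```

Dropping the first runs keeps one-off costs such as cold caches out of the average.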

Of course Nicol­i­no does some oth­er things be­sides ren­der­ing mark­down, so to iso­late that part I ran a cou­ple of dum­my im­ple­men­ta­tion­s:

  • NOOP: It does nothing. When asked to compile a markdown string, it returns the same string.
  • TOEMPTY: It does less than nothing. When asked to compile a markdown string, it returns an empty string.

So, the difference between TOEMPTY and NOOP is probably what it takes to render some templates and store the output on disk.
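
As a rough sketch, the two dummy implementations could look like the following. The module names and the `compile` method are assumptions made for illustration, modeled on the shape of a typical markdown library entry point. They are not taken from Nicolino's source.

```crystal
# Hypothetical names; each module mimics a markdown library's entry point.
module Noop
  # Does nothing: returns the input unchanged.
  def self.compile(source : String) : String
    source
  end
end

module ToEmpty
  # Does less than nothing: discards the input and returns an empty string.
  def self.compile(source : String) : String
    ""
  end
end

text = "This *is* **markdown**"
puts Noop.compile(text) == text    # the input comes back untouched
puts ToEmpty.compile(text).empty?  # nothing comes back at all
```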

Then I ran the benchmark for the 5 markdown libraries I could find: markd, cr-discount, crystal-cmark, cr-mark-gfm and luce.

Here is a chart showing the time in seconds for each library, excluding the time spent on everything else (NOOP's time, which was 2.62 seconds).

Chart: Markdown libraries benchmark (time in seconds, NOOP's time subtracted)

  • markd: 0.38
  • cr-discount: 0.66
  • crystal-cmark: 0.33
  • cr-mark-gfm: 0.38
  • luce: 5.28

So, luce is much, much, much slower, and crystal-cmark is the fastest. And cr-discount (MY OWN BINDING) is much slower than the others (luce aside), which is a bit disappointing.

On the other hand, there are two sides to optimizing. One is choosing the fastest library; the other is not caring when the difference is small in absolute terms, even if it's large in relative terms.

What does that mean?

This is over 4000 doc­u­ments.

So, while cr-dis­count is slow­er, it's still ren­der­ing 4000 doc­u­ments in 0.66 sec­ond­s. That's 0.000165 sec­onds per doc­u­men­t.

That's 1.65 ten-thousandths of a second, or just over 6000 documents per second.
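
The arithmetic is easy to reproduce. The sketch below uses the 4000-document count and the timings reported in this post: 0.66 seconds for cr-discount and 5.28 seconds for luce, as read off the chart. The variable names are only illustrative.

```crystal
docs = 4000

# Seconds with NOOP's time already subtracted, as reported in this post.
timings = {"cr-discount" => 0.66, "luce" => 5.28}

timings.each do |name, seconds|
  per_doc = seconds / docs    # seconds spent on one document
  per_second = docs / seconds # documents rendered per second
  puts "#{name}: #{per_doc.round(6)} s/doc, about #{per_second.to_i} docs/s"
end
```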

If the normal use case were to render thousands of documents, then that would make a difference. But it's not. It's usually 3 documents.

So, as long as cr-discount has a feature I need that the other libraries lack, it's fine to use it, and the same goes for the others (except luce, I guess, but still: 757 documents per second is not bad).

Update: Compiling discount with -O3 brings it down to 0.48 seconds, which is better than 0.66 but makes no difference for the conclusions.

The Beauty and Horror of Crystal Shards

I have been us­ing a lot of Crys­tal for per­son­al projects for about a year. Think of it like "na­tive Ru­by with type in­fer­ence" or some­thing like that.

But this post is not about the lan­guage it­self, it's about one part of its ecosys­tem: the shard­s.

The Nice

Shards are like Ruby gems, or like whatever Go calls them nowadays, or Rust crates: just "libraries" that are written mostly in Crystal.

The thing is ... they are awe­some!

Suppose you want to use markdown in your app. You go to shards.info which crawls GitHub and finds things written in Crystal. You search for markdown and you find a few that look good. Suppose you want to use my cr-discount shard.

You just add the shard to your shard.yml and use it, like this:

dependencies:
  cr-discount:
    github: ralsina/cr-discount

Then in your code:

require "cr-discount"

markdown = "This *is* **markdown**"
html = Discount.compile(markdown)

And that's it, you just in­te­grat­ed it in­to your build, your code is us­ing it, and that's all there is to it.

But not on­ly that, sup­pose you are us­ing a shard like docr but you find a tiny bug or two. You fork it, fix the bugs and cre­ate a PR. If the au­thor is ac­tive, they will merge it, and you can go back to us­ing the shard, but even if they aren't ... you can just use your fork in the mean­time.

Just use ralsina/docr instead of marghidanu/docr in your shard.yml and you are using my fix.

And there's more! Sup­pose you find some code in your projects that you keep re­peat­ing. For ex­am­ple, I al­ways use the same log­ging set­up in CLI app­s:

  • Col­or­ful if it makes sense
  • Con­fig­urable ver­bosi­ty
  • Er­rors and worse in stderr
  • In­fo and bet­ter in std­out
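
As an illustration of the kind of code that ends up repeated, here is a minimal sketch of the routing and verbosity parts of such a setup. It is not taken from any published shard: `stream_for` and `emit` are hypothetical helpers, the only standard-library piece used is the `Log::Severity` enum, and coloring is left out for brevity.

```crystal
require "log"

# Hypothetical helper: errors and worse go to stderr, everything else to stdout.
def stream_for(severity : Log::Severity) : IO
  severity >= Log::Severity::Error ? STDERR : STDOUT
end

# Hypothetical helper: print a message only if it meets the verbosity threshold.
def emit(severity : Log::Severity, message : String,
         threshold : Log::Severity = Log::Severity::Info)
  return if severity < threshold
  stream_for(severity).puts "#{severity}: #{message}"
end

emit(Log::Severity::Info, "rendering 3 documents")   # goes to stdout
emit(Log::Severity::Error, "could not read config")  # goes to stderr
emit(Log::Severity::Debug, "hidden at the default verbosity")
```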

So, why copy that ev­ery­where? Just make it a tiny shard.

There is no overhead in your code, you don't have to ask users to install anything, you just add it to your shard.yml and it's there for everyone that cares.

The Not So Nice

Of course noth­ing is free of cost.

  • Be­cause it's easy to cre­ate shard­s, it's easy to aban­don shard­s.
  • Be­cause it's easy to fork shard­s, it's easy to at­om­ize the ecosys­tem.
  • Be­cause it's easy to find shards and it's de­cen­tral­ized, it's triv­ial to poi­son the sup­ply chain (although TBH it seems to be easy enough in any lan­guage).

So, you have to be careful. You have to check if the shard is maintained and if it's the best one for the job. If you fork it, you need to commit to keeping your fork working. And you always need to push the PR, even if the author doesn't pick it up, because then it's there for other users.

So, con­clu­sion­s... they are what they are. For me they are the most prac­ti­cal thing ev­er and I of­ten wish Python had some­thing like them, but they are al­so quite ... scary? And see­ing the aban­doned shards makes me sad for a lan­guage that should be much more pop­u­lar than it is.

New Project: FaaSO

Because yes, all self-hosted FaaS solutions suck, this weekend I wrote the beginnings of a new one, called FaaSO.

Is it go­ing to be great? Prob­a­bly not, but it's go­ing to do ex­act­ly what I need it to do. Be­cause the best part of rein­vent­ing the wheel is that by the sec­ond left el­bow of Kali, this wheel is go­ing to be ex­act­ly the shape I like.

Faa­SO has very strict de­sign con­straints:

  1. It needs to be easy to use. I need to be able to write a funko (func­tion in Faa­SO par­lance) in a minute and de­ploy it with one com­mand, and I won't have to con­fig­ure any­thing for that funko to work.
  2. It will run on a single machine, it will deploy in a minute, and it will be ready to take new deployment requests right away.
  3. It will have some sort of secret management API.
  4. It will sup­port mul­ti­ple lan­guages, be­cause I want to use dif­fer­ent lan­guages.
  5. It will have very lit­tle mag­ic. It will not lock you in­to need­ing it.
    • You should be able to take a funko and make it a separate app in a minute.
    • You should be able to control what you are running, and how it runs, and monitor it and so on without going through the tool if you want.
  6. It will be small. My cur­rent goal is un­der 1500 LOC.
  7. It's aimed at de­ploy­ing one ten­ant. It will not pro­tect one funko from a hos­tile funko run­ning in the same sys­tem. It will not pro­tect you from your­self.
  8. In the same way, it will be as se­cure as I can make it against ex­ter­nal threat­s, but it's not go­ing to pro­tect you from some­one with ac­cess to the same sys­tem.
  9. It will be light. I am writ­ing it in Crys­tal so it's na­tive code and runs with very lim­it­ed de­pen­den­cies and lit­tle over­head.

Can I do all that? Maybe. The cur­rent pro­to­type does about half of what I wan­t, so there is on­ly an­oth­er 90% of the work left :-)

If you want to check the prototype, it's here. I am not looking for contributors now because I want a free hand for sudden redesigns.

There is some documentation about how it works here, a braindump about the secret management, and the initial braindump about the design.

Version 0.1.3 of Hacé is out

A new release of Hacé, my make-like tool backed by Croupier, is out!

New in this version

Features

  • Set vari­ables from the com­mand line
  • Al­low pass­ing out­put files as ar­gu­ments
  • Au­to mode works bet­ter
  • Han­dle bo­gus ar­gu­ments bet­ter
  • Made --question more verbose, and only report stale tasks matching arguments
  • New -k option to keep going after errors.
  • Switched to croupi­er main, sup­ports de­pend­ing on di­rec­to­ries
  • Au­to­mat­i­cal­ly build bi­na­ries for re­lease
  • Gen­er­al house­keep­ing
  • Build it­self us­ing a Hace­file in­stead of a Make­file
  • Re­ject if two tasks share out­puts (lim­i­ta­tion of croupi­er for now)

Bugs Fixed:

  • Warn about un­known tasks used in com­mand line
  • Tasks with out­puts passed wrong tar­get to croupi­er
  • Com­mand out­put was not vis­i­ble in the log.

Full Changel­og: v0.1.2...v0.1.3

Croupier v0.5.2 released

A new release of Croupier, my Crystal library for tasks and dataflow programming, is out!

This is version 0.5.2 of Croupier, a Crystal library for dataflow-oriented programming.

Changes:

  • Fixed bugs when using auto mode with dependencies that are directories.

Full Changel­og: v0.5.1...v0.5.2


Contents © 2000-2024 Roberto Alsina