JKat - A Concatenative Language on the Java Virtual Machine


Logo by Ville Oikarinen

Introduction

JKat is a dynamically typed, stack-based (concatenative) language similar to Forth and its relatives. It is implemented as an interpreter on top of the JVM and, like Lua, aims for simplicity and easy embedding. It provides powerful tools for creating domain-specific languages when Java becomes insufficiently expressive.

Why concatenative?

Concatenative languages are interesting from many perspectives:

  1. They're almost completely syntax free. Programs are written as a sequence of whitespace-separated words that may contain virtually any character, which makes them excellent DSL platforms.
  2. (Naïve) interpreters are small, simple and straightforward to implement.
  3. They encourage functional thinking: what instead of how, constructing new programs by composing existing programs using combinators.
  4. Many seemingly complicated features found in other languages are either unnecessary or get very concrete semantics (see function composition, keyword arguments etc.).
  5. Most refactorings are very simple cut & paste operations (e.g. extract word, inline word) that do not require a complicated IDE.

Obtaining and using JKat

The latest JKat distribution can be found here.
Sources can be browsed here.

To run the JKat interpreter, you must set the environment variables JKAT_HOME and PATH to point to the extracted distribution. Two startup scripts (jkat.bat and jkat.sh) are provided for Windows and Unix, respectively.

Without any command line arguments, the interpreter starts in interactive mode. Otherwise, it executes the given script file and exits when the program finishes.

License

JKat will be distributed under a BSD license.

Basics

Everything in JKat works by either

  1. pushing literal items (numbers, strings etc.) onto the stack as they occur in the source code
  2. executing a word (called a function in other languages) that takes items off the stack and pushes the results back. Words are not required to consume anything from the stack, nor to return anything.

For example:

  interactive> 1 2 + print
  3

First, the integers 1 and 2 are pushed onto the data stack. The + word takes two elements off the stack, performs the addition and pushes the result (3) back onto the stack. The word 'print' takes one item off the stack and prints it to standard output.
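
The evaluation rule above can be modelled in a few lines. The following Python sketch is only an illustration of the two steps (push literals, execute words), not JKat's actual Java interpreter. The function name 'run' and the way tokens are handled are assumptions made for this example, and it understands only integers, '+' and 'print'.

```python
# Toy model of the evaluation rule; not JKat's real interpreter.
def run(source, stack=None):
    """Evaluate whitespace-separated tokens against a data stack."""
    stack = [] if stack is None else stack
    output = []
    for token in source.split():
        if token == "+":
            b, a = stack.pop(), stack.pop()   # take two items off the stack
            stack.append(a + b)               # push the result back
        elif token == "print":
            output.append(str(stack.pop()))   # consume one item and emit it
        else:
            stack.append(int(token))          # literals are pushed as they occur
    return stack, output

stack, output = run("1 2 + print")
print(output)   # ['3']
print(stack)    # []
```

After 'print' has consumed the result, the data stack is empty again.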

New words can be defined by using the ':' word:

  interactive> : mysquare ( n -- n^2 ) dup * ;

The 'dup' word duplicates the top element of the stack and * multiplies the top two values, squaring the number on the stack. The ( n -- n^2 ) part is a stack effect declaration: it tells what must be on the stack before the word is executed (left side of the double dash '--') and what is left on the stack after the word completes. Stack effect declarations currently serve purely as documentation, and the names given to the stack locations are just informative.
The new word is used like all other words:

   interactive> 5 mysquare print
   25

Word definitions can also be used to mimic constant definitions, since a word is not required to take anything from the stack. Just pushing value(s) is okay:

   : MY-CONSTANT "foo" ;
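
One way to picture what ':' does is as a dictionary from word names to token lists. The Python sketch below is a hypothetical model under that assumption, not a description of JKat's internals. It skips the stack effect declaration (which is documentation only), knows just 'dup' and '*' as primitives, and handles only strings without spaces.

```python
# Toy model: ':' stores a body under a name, using a word replays that body.
def run(source, words, stack=None):
    stack = [] if stack is None else stack
    tokens = source.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == ":":
            end = tokens.index(";", i)            # definition runs up to ';'
            name, body = tokens[i + 1], tokens[i + 2:end]
            if body and body[0] == "(":           # drop the stack effect declaration
                body = body[body.index(")") + 1:]
            words[name] = body
            i = end
        elif tok in words:
            run(" ".join(words[tok]), words, stack)   # execute a user-defined word
        elif tok == "dup":
            stack.append(stack[-1])
        elif tok == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif tok.startswith('"'):
            stack.append(tok.strip('"'))          # string literal
        else:
            stack.append(int(tok))                # integer literal
        i += 1
    return stack

words = {}
run(": mysquare ( n -- n^2 ) dup * ;", words)
run(': MY-CONSTANT "foo" ;', words)
print(run("5 mysquare", words))    # [25]
print(run("MY-CONSTANT", words))   # ['foo']
```

In this model a constant is simply a word whose body pushes a literal.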

Word definitions can be collected into groups called vocabularies, which are just files on the filesystem whose names end with '.jkat'. To load the word definitions and bring them into the current vocabulary's namespace, use the 'USE:' word:

  interactive> USE: base math printing ;

Note that you can load multiple vocabularies in the same USE: declaration. You can specify the vocabulary path when starting the interpreter:

   bash$ jkat -vocabularypath ../../library myprogram.jkat

The current directory is always in the vocabulary load path.
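
The lookup of a vocabulary file could work roughly as in the following Python sketch. This is an assumption-laden illustration: the function name 'find_vocabulary' is made up, and the search order (current directory first) is a guess, since the text only states that the current directory is always part of the load path.

```python
from pathlib import Path

def find_vocabulary(name, vocabulary_path=(), cwd="."):
    """Return the first existing '<name>.jkat' file, or None if there is none."""
    # Assumption: the current directory is searched before the -vocabularypath entries.
    for directory in [cwd, *vocabulary_path]:
        candidate = Path(directory) / f"{name}.jkat"
        if candidate.is_file():
            return candidate
    return None

# Mirrors: jkat -vocabularypath ../../library myprogram.jkat, followed by USE: base ;
print(find_vocabulary("base", ["../../library"]))
```

The result depends on which files exist on the machine where it is run.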

Built-in datatypes

Numbers

JKat has two numeric types: Int and Real. All basic math operations are supported. Precision is maintained: for example, if you add a Real and an Int, the result will be a Real. Dividing two Ints gives an Int, as the examples below show.

  interactive> 2 3 + print
  5
  interactive> 2 3 - print
  -1
  interactive> 2 3 * print
  6
  interactive> 2 3 / print
  0
  interactive> 2.0 3 / print
  0.666666666666666
  interactive> 5476543 43 % print
  20  
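
The Int/Real behaviour shown above can be mimicked with Python's int and float. This is a sketch, not JKat's implementation: the helper name 'divide' is hypothetical, and rounding toward zero for negative Ints is an assumption based on how Java integer division behaves (the examples above only cover positive numbers).

```python
def divide(a, b):
    """Int / Int stays Int; a Real operand makes the result Real."""
    if isinstance(a, int) and isinstance(b, int):
        # Assumption: truncate toward zero, like Java's integer division.
        quotient = abs(a) // abs(b)
        return quotient if (a < 0) == (b < 0) else -quotient
    return a / b

print(2 + 3.0)           # 5.0 (Int + Real gives Real)
print(divide(2, 3))      # 0
print(divide(2.0, 3))    # 0.6666666666666666
print(5476543 % 43)      # 20
```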

Strings

String literals are surrounded by '"' characters. The most commonly used string escapes work.

  interactive> "foo\nbar\t..." print
  foo
  bar      ...
  interactive> "foo" "bar" append print
  foobar

Arrays

Arrays are constructed using the '{' and '}' words. '{' starts parsing array elements, up to the matching '}'. Arrays can of course be nested.

   interactive> 1 { 1 { 2 "foo" } 3.14 } nth print
   { 2 "foo" }
   interactive> { 1 2 } { 3 4 } append print
   { 1 2 { 3 4 } }
   interactive> USE: base ;
   interactive> { 1 2 } { 3 4 } [ append ] each print
   { 1 2 3 4 }
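
The difference between a plain 'append' and '[ append ] each' is easy to miss, so here is a small Python model of the three array examples. The function names are made up for the illustration, and whether JKat's 'append' copies the array or modifies it in place is not stated in this document; the sketch copies.

```python
def nth(index, seq):
    return seq[index]            # indexes start at 0, as in the first example

def append(seq, item):
    return seq + [item]          # the item is added as ONE element, even if it is an array

def each_append(target, items):
    # Models { 1 2 } { 3 4 } [ append ] each: the block runs once per element of items.
    for item in items:
        target = append(target, item)
    return target

print(nth(1, [1, [2, "foo"], 3.14]))   # [2, 'foo']
print(append([1, 2], [3, 4]))          # [1, 2, [3, 4]]
print(each_append([1, 2], [3, 4]))     # [1, 2, 3, 4]
```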

Linked lists

Linked lists are constructed using the words '(' and ')'.

   interactive> 1 ( "foo" 3.14 ) cons
   interactive> dumpstack
   0:          ( 1 "foo" 3.14 )
   interactive> tail dumpstack
   0:          ( "foo" 3.14 )
   interactive> [ 1 ] dip   !! put index 1 before linked list in the stack
   interactive> nth print
   3.14
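
For readers unfamiliar with cons cells, the session above can be replayed with a minimal Python model. It is only a sketch: representing a list as nested (head, tail) pairs ending in None is an assumption for the example, not a statement about how JKat stores linked lists.

```python
def cons(head, tail):
    return (head, tail)              # a new cell in front of an existing list

def tail(lst):
    return lst[1]                    # everything after the first element

def nth(index, lst):
    while index > 0:                 # walk index cells forward
        lst = lst[1]
        index -= 1
    return lst[0]

def linked(*items):
    """Build a linked list from Python values; None is the empty list."""
    lst = None
    for item in reversed(items):
        lst = cons(item, lst)
    return lst

lst = cons(1, linked("foo", 3.14))
print(lst == linked(1, "foo", 3.14))   # True
print(nth(1, tail(lst)))               # 3.14
```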

Tuples

Tuples are nothing but special arrays. The word 'TUPLE:' generates the constructor words tuplename-new (constructs a new tuple using values from the stack) and <tuplename> (creates an empty tuple). Accessor words for tuple elements are also generated: a getter named fieldname>> and a setter named >>fieldname.

   interpreter> TUPLE: point xcoord ycoord ;
   interpreter> 2.0 3.0 point-new
   interpreter> ycoord>> print
   3.0
   interpreter> <point>
                   2.0 >>xcoord
                   3.0 >>ycoord
                xcoord>> print
   2.0
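
The word generation performed by 'TUPLE:' can be pictured as follows. This Python sketch is hypothetical: the helper 'define_tuple', the array layout (tuple name in slot 0, fields after it) and the setter returning the tuple are all assumptions chosen to match the naming scheme described above.

```python
def define_tuple(name, *fields):
    """Return the words that a TUPLE: declaration would generate, keyed by name."""
    words = {
        f"{name}-new": lambda *values: [name, *values],        # constructor from values
        f"<{name}>": lambda: [name] + [None] * len(fields),    # empty tuple
    }
    for slot, field in enumerate(fields, start=1):
        def getter(tup, slot=slot):
            return tup[slot]

        def setter(tup, value, slot=slot):
            tup[slot] = value
            return tup              # assumption: the tuple stays available afterwards

        words[f"{field}>>"] = getter
        words[f">>{field}"] = setter
    return words

w = define_tuple("point", "xcoord", "ycoord")
p = w["point-new"](2.0, 3.0)
print(w["ycoord>>"](p))             # 3.0
q = w[">>ycoord"](w[">>xcoord"](w["<point>"](), 2.0), 3.0)
print(w["xcoord>>"](q))             # 2.0
```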

Blocks

Blocks are unnamed functions similar to lambda functions in other languages. They're sequences of words surrounded by the words '[' and ']'. Blocks are always pushed onto the stack as values.

   [ 1 + ]

You can run the block from the stack by using the 'apply' word:

   2 [ 1 + ] apply

   (is the same as)
   
   2 1 +

Both result in 3 on top of the stack.
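
The idea that a block is a value until 'apply' runs it can be shown with a small Python model. It is a sketch under simplifying assumptions (blocks are stored as token lists, only integers, '+' and 'apply' are understood) and does not reflect how JKat represents blocks internally.

```python
def run(tokens, stack):
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "[":
            depth, j = 1, i + 1
            while depth:                          # find the matching ']'
                depth += {"[": 1, "]": -1}.get(tokens[j], 0)
                j += 1
            stack.append(tokens[i + 1:j - 1])     # the block is pushed, not executed
            i = j
            continue
        if tok == "apply":
            run(stack.pop(), stack)               # now the block's words are executed
        elif tok == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            stack.append(int(tok))
        i += 1
    return stack

print(run("[ 1 + ]".split(), []))           # [['1', '+']]
print(run("2 [ 1 + ] apply".split(), []))   # [3]
print(run("2 1 +".split(), []))             # [3]
```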

Stack shufflers

Stack shufflers are used to reorganize the stack. Examples:

  interactive> 2 3 drop dumpstack   !! dumpstack displays the data stack
  0:       2
  interactive> 3 2drop dumpstack
  interactive> 2 dup dumpstack
  0:       2
  1:       2
  interactive> clear dumpstack    !! clear empties the stack completely
  interactive> 2 3 swap dumpstack
  0:       2
  1:       3
  interactive> nip dumpstack
  0:       2
  interactive> 3 4 2nip dumpstack
  0:       4
  interactive> clear
  interactive> 2 3 2dup dumpstack
  0:       3
  1:       2
  2:       3
  3:       2
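
The shufflers shown above map directly onto list operations. The Python sketch below models the data stack as a list whose last element is the top; the function names are illustrative, and 'dumpstack' here returns its lines with index 0 as the top of the stack, which is what the output above suggests.

```python
def drop(s):   s.pop()                       # remove the top item
def drop2(s):  del s[-2:]                    # 2drop: remove the top two items
def dup(s):    s.append(s[-1])               # copy the top item
def swap(s):   s[-1], s[-2] = s[-2], s[-1]   # exchange the top two items
def nip(s):    del s[-2]                     # remove the item under the top
def nip2(s):   del s[-3:-1]                  # 2nip: remove the two items under the top
def dup2(s):   s.extend(s[-2:])              # 2dup: copy the top two items

def dumpstack(s):
    return [f"{i}: {v}" for i, v in enumerate(reversed(s))]

s = [2, 3]
swap(s)
print(dumpstack(s))    # ['0: 2', '1: 3']
nip(s)
s += [3, 4]
nip2(s)
print(dumpstack(s))    # ['0: 4']
s = [2, 3]
dup2(s)
print(dumpstack(s))    # ['0: 3', '1: 2', '2: 3', '3: 2']
```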

Useful combinators

  interactive> 1 1 = [ "then branch" print ] [ "else branch" print ] if
  then branch
  interactive> true [ "it is true" print ] when
  it is true
  interactive> false [ "...unless" print ] unless
  ...unless
  interactive> [ 1 + ] [ 2 + ] compose print
  [ 1 + 2 + ]
  interactive> [ 1 + ] [ 2 + ] prepose print
  [ 2 + 1 + ]
  interactive> : tagize [ "<" ] dip append ">" append ;
  interactive> { "one" "two" "three" } [ tagize ] map [ print ] each
  <one>
  <two>
  <three>
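
To make 'dip', 'map', 'compose' and 'prepose' more concrete, here is a Python model of the examples above. It is a sketch, not JKat's implementation: blocks are represented as Python functions over a stack (or as token lists for 'compose' and 'prepose'), and all helper names are assumptions.

```python
def append(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)              # "foo" "bar" append gives "foobar"

def dip(stack, block):
    saved = stack.pop()              # set the top item aside
    block(stack)                     # run the block underneath it
    stack.append(saved)              # put the item back on top

def tagize(stack):
    # : tagize [ "<" ] dip append ">" append ;
    dip(stack, lambda s: s.append("<"))
    append(stack)
    stack.append(">")
    append(stack)

def map_block(items, block):
    result = []
    for item in items:
        stack = [item]               # each element is processed on its own
        block(stack)
        result.append(stack.pop())
    return result

def compose(first, second):
    return first + second            # first block's words, then the second's

def prepose(first, second):
    return second + first            # second block's words come first

print(map_block(["one", "two", "three"], tagize))   # ['<one>', '<two>', '<three>']
print(compose(["1", "+"], ["2", "+"]))              # ['1', '+', '2', '+']
print(prepose(["1", "+"], ["2", "+"]))              # ['2', '+', '1', '+']
```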
  

Input and output

JKat currently doesn't support any means of I/O other than writing to standard output. See Embedding for more information on providing your own I/O facilities.

Embedding

TBD. Meanwhile, check Main.java.

Other resources

Factor - Slava Pestov's masterpiece on its own (blazingly fast) VM