- Lana works again (i.e. it doesn’t cheerfully move the bytecode around in memory while you’re trying to run it)
- I’ve implemented the Hough transform and the inverse Hough transform
- The following ImageLana code runs:
go = procedure()
    a=imgload("testvertline.png")
    b=a.hough()
    c=b.dehough(0)
    c.normalise()
    a.view(0)
    b.view(1)
    c.view(2)
end
go()
wait()
And I get the following images out:
Well, that bug’s fixed. Simple matter of changing the order in which reference counts are incremented and decremented. Valgrind reports no problems from the unit tests under both memcheck and exp-ptrcheck. Now for more documentation (which is about halfway through its initial release.)
I’ve reached what feels like a fairly significant milestone in Lana today – the following transcript shows dictionaries and iterators working, with all the objects and temporaries being garbage collected successfully. There are, of course, rather more extensive unit tests to write now, but it’s starting to feel like a proper language.
Script started on Sun 24 Jul 2011 11:43:44 BST
white@cranberry:~/misccode/script/lib/lana$ cat foo
for i in values(a)
for i in keys(a)
print(str(i) + " " + str(a[i]))
white@cranberry:~/misccode/script/lib/lana$ valgrind ./testmain foo
==4441== Memcheck, a memory error detector
==4441== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==4441== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==4441== Command: ./testmain foo
==4441== HEAP SUMMARY:
==4441== in use at exit: 0 bytes in 0 blocks
==4441== total heap usage: 160 allocs, 160 frees, 60,172 bytes allocated
==4441== All heap blocks were freed -- no leaks are possible
==4441== For counts of detected and suppressed errors, rerun with: -v
==4441== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 18 from 7)
Script done on Sun 24 Jul 2011 11:43:58 BST
Having an actual day job (Wii programming) has meant that work on Lana has had to take a back seat. Combined with the fact that I’m now doing the tricky stuff, this has slowed development down somewhat. But here’s what’s new:
Notwithstanding what I said in an earlier post, Lana now has garbage collection! This works for strings, objects, dictionaries, everything. I’ve used the same techniques Python uses:
- Reference counting for all managed data – whenever a new managed entity (such as a string or object) is created and referred to in a value, or a reference to an existing entity is copied into a new value, its reference count is increased; when that reference goes away the count is decreased; and when the count reaches zero the entity is destroyed. This is made rather more complex by there being two distinct allocation models – SimpleMalloc, used for strings and the like, where malloc()/free() are used; and Complex/SimpleNew, where the entity is some subclass of GarbageCollected and, in the Complex case, can contain references to other entities.
- This idea of referenced entities also containing references means that plain reference counting isn’t enough – entities which refer to each other in a loop will never be deleted. Therefore we also have a cycle detector which uses a list of all the reference counted entities which can refer to other entities (so called Complex entities.) It can detect entities in a cycle which are not referred to from outside the cycle, and delete them. This has to be run periodically, and we currently do it with the Lana
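As a rough illustration of the two mechanisms in the list above, here is a minimal C++ sketch. It is not Lana's actual source: `Entity`, `Heap`, `link` and `collectCycles` are hypothetical names invented for the example, and the cycle detector uses the general approach CPython takes (subtract the references that come from inside the tracked set, then treat anything unreachable from an externally referenced entity as garbage).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a "Complex" entity: refcounted, may refer to others.
struct Entity {
    int refcount = 0;
    std::vector<Entity*> children;  // outgoing references to other entities
};

// Hypothetical heap that tracks every live entity so cycles can be found.
class Heap {
public:
    ~Heap() { for (Entity* e : live_) delete e; }

    Entity* create() { Entity* e = new Entity; live_.insert(e); return e; }
    void incref(Entity* e) { ++e->refcount; }
    void decref(Entity* e) { if (--e->refcount == 0) destroy(e); }

    // Store a reference to 'to' inside 'from' (e.g. from.dict = to).
    void link(Entity* from, Entity* to) { incref(to); from->children.push_back(to); }

    std::size_t liveCount() const { return live_.size(); }

    // Frees entities that are only kept alive by references from each other.
    std::size_t collectCycles() {
        // 1. Work out how many references to each entity come from outside.
        std::unordered_map<Entity*, int> external;
        for (Entity* e : live_) external[e] = e->refcount;
        for (Entity* e : live_)
            for (Entity* c : e->children) --external[c];

        // 2. Everything reachable from an externally referenced entity is live.
        std::unordered_set<Entity*> reachable;
        std::vector<Entity*> work;
        for (Entity* e : live_)
            if (external[e] > 0) work.push_back(e);
        while (!work.empty()) {
            Entity* e = work.back();
            work.pop_back();
            if (!reachable.insert(e).second) continue;  // already visited
            for (Entity* c : e->children) work.push_back(c);
        }

        // 3. The rest is garbage: drop its references to live entities, then free it.
        std::vector<Entity*> garbage;
        for (Entity* e : live_)
            if (!reachable.count(e)) garbage.push_back(e);
        for (Entity* e : garbage)
            for (Entity* c : e->children)
                if (reachable.count(c)) --c->refcount;
        for (Entity* e : garbage) { live_.erase(e); delete e; }
        return garbage.size();
    }

private:
    void destroy(Entity* e) {
        live_.erase(e);
        std::vector<Entity*> kids;
        kids.swap(e->children);
        delete e;
        for (Entity* k : kids) decref(k);  // release what the dead entity held
    }
    std::unordered_set<Entity*> live_;
};

// Builds an a <-> b cycle, drops the outside references, and returns how
// many entities the cycle detector frees.
std::size_t cycleDemo() {
    Heap heap;
    Entity* a = heap.create(); heap.incref(a);  // held by a variable
    Entity* b = heap.create(); heap.incref(b);  // held by a variable
    heap.link(a, b);
    heap.link(b, a);                            // created a cycle
    heap.decref(a);                             // the variables go away
    heap.decref(b);
    assert(heap.liveCount() == 2);              // refcounting alone cannot free them
    return heap.collectCycles();
}
```

In this sketch `cycleDemo()` leaves two entities alive after the last outside reference is dropped, and only the explicit `collectCycles()` pass reclaims them, which is why such a pass has to be run periodically.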
These are containers which can be keyed on, and can contain, any type of value:
b.dict = a # created a cycle!
These proved somewhat hairy – possibly the hairiest thing in Lana. That’s because of the way variables work. When a variable is encountered in the code, what actually happens is that the virtual machine stacks a Ref – a reference to the variable. If a subsequent instruction wants to read the variable, it just calls the internal popval() method, which will pop the Ref from the stack and dereference it, returning the variable’s contents to the instruction code. If we want to set the value – and only the code for the OP_SET instruction does this – we pop the reference without dereferencing it and write to the variable it refers to.
Exactly the same technique is used for object properties – these are represented by PropRef values, which contain a pointer to the object and the ID of the property.
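The Ref mechanism can be sketched in a few lines of C++. This is an illustration under assumptions rather than Lana's real virtual machine: `MiniVM`, `pushVar`, `opSet` and `opAdd` are hypothetical names, values are limited to integers, and the operand order for the set operation (target Ref pushed first, value on top) is a guess. Only `popval()` and `OP_SET` are names taken from the text. It needs C++17 for `std::variant`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <variant>
#include <vector>

// Hypothetical value type: either a plain integer or a Ref naming a variable.
struct Ref { std::string name; };
using Value = std::variant<int, Ref>;

class MiniVM {
public:
    void pushInt(int v) { stack_.push_back(v); }

    // Encountering a variable stacks a Ref, not the variable's contents.
    void pushVar(const std::string& name) { stack_.push_back(Ref{name}); }

    // Readers use popval(): pop, and dereference if the value is a Ref.
    // (An unset variable reads as 0 in this sketch.)
    int popval() {
        Value v = pop();
        if (const Ref* r = std::get_if<Ref>(&v)) return vars_[r->name];
        return std::get<int>(v);
    }

    // The set operation: pop the new value, then pop the target Ref
    // WITHOUT dereferencing it and write through it.
    void opSet() {
        int value = popval();
        Value target = pop();
        vars_[std::get<Ref>(target).name] = value;
    }

    // An ordinary instruction only ever sees dereferenced values.
    void opAdd() {
        int b = popval();
        int a = popval();
        pushInt(a + b);
    }

private:
    Value pop() {
        Value v = stack_.back();  // no underflow check in this sketch
        stack_.pop_back();
        return v;
    }
    std::vector<Value> stack_;
    std::unordered_map<std::string, int> vars_;
};

// Runs the equivalent of:  x = 5 ; y = x + 2 ; and returns y.
int refDemo() {
    MiniVM vm;
    vm.pushVar("x"); vm.pushInt(5); vm.opSet();
    vm.pushVar("y"); vm.pushVar("x"); vm.pushInt(2); vm.opAdd(); vm.opSet();
    vm.pushVar("y");
    return vm.popval();
}
```

The point of the design is that only one instruction needs to know about references at all; every other instruction goes through `popval()` and never notices whether its operand came from a literal or a variable.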
Naturally I wanted to do the same thing with dictionaries – have a type of value (a “DictRef”) which refers to a dictionary entry. To do this, the value needs to contain both the key and a reference to the dictionary. However, if the key can be any Lana value, I need a value type which contains both a dictionary reference and any other value type. Nastily recursive, that definition. So I cheated, and used a special “packed value”: the type field holds the hash key type, the allocation type flag (SimpleMalloc, etc.) is set to a special DictRefAlloc type, half the value field holds the key itself (the “primary part” of the key value), and the other half holds the dictionary pointer. Nasty, and it means that values with both secondary and primary parts – PropRefs, DictRefs themselves, and some of the advanced types I may create later like delegates and closures – can’t serve as keys. It’s a small price.
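A sketch of what such a packed value might look like in C++ follows. The layout is an assumption made for illustration: the field names, the `Type` and `Alloc` enumerators other than SimpleMalloc and DictRefAlloc, and the helpers `makeDictRef`, `keyOf` and `dictOf` are all hypothetical. It also assumes a key's normal allocation flag can be recovered from its type, since the packed form overwrites that flag.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

struct Dict {};  // placeholder for the real dictionary class

enum class Type  : std::uint8_t { Int, String, PropRef };
enum class Alloc : std::uint8_t { None, SimpleMalloc, DictRefAlloc };

// Hypothetical value layout: a type, an allocation flag, and a value field
// split into a primary half and a secondary half.
struct Value {
    Type type;
    Alloc alloc;
    std::intptr_t primary;  // e.g. the integer, or a pointer to string data
    void* secondary;        // unused by simple values
};

Value makeInt(std::intptr_t n) { return {Type::Int, Alloc::None, n, nullptr}; }

// Values that already need both halves cannot be packed as keys.
bool usesSecondary(const Value& v) {
    return v.type == Type::PropRef || v.alloc == Alloc::DictRefAlloc;
}

// Assumption: the allocation model of a key is implied by its type.
Alloc allocFor(Type t) {
    return t == Type::String ? Alloc::SimpleMalloc : Alloc::None;
}

// Pack "entry `key` of dictionary `dict`" into a single value. The type field
// keeps the key's type; the allocation flag marks the value as a DictRef.
std::optional<Value> makeDictRef(Dict* dict, const Value& key) {
    if (usesSecondary(key)) return std::nullopt;  // no room for the dictionary
    return Value{key.type, Alloc::DictRefAlloc, key.primary, dict};
}

// Unpack the two halves again.
Value keyOf(const Value& ref) {
    return {ref.type, allocFor(ref.type), ref.primary, nullptr};
}
Dict* dictOf(const Value& ref) { return static_cast<Dict*>(ref.secondary); }
```

With this layout `makeDictRef(&d, makeInt(42))` yields a value from which both the key 42 and the pointer to `d` can be recovered, while trying to use a DictRef itself as a key fails, which mirrors the restriction described above.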
- Iterators – these are in the C++ code but there’s no Lana for statement. Also, I’ve just realised that I need to iterate over both keys and values, which opens a small can of worms.
- Array lists – again, the basic object exists in the C++ code but isn’t linked into Lana’s syntax.
- User iterables – i.e. objects the user creates in C++ (or Lana I suppose) could have the iterable interface, allowing for to work with them.
- Optimisation of common instruction sequences such as
- Serialisation – this is a big one, and comes close to the “point” of the language.
- Delegates in the C# sense – values which contain both a method reference and the object to call it on, as a single value: very useful for event-driven code!
- Closures – I have some notes on how to do this, by using a value to hold both the function pointer and the environment in which it was created (or a subset thereof.)
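To make the delegate idea in the list above concrete, here is a small C++ sketch of a value that bundles an object with a method to call on it. It only illustrates the concept and is not a design for Lana: `Delegate` and `Button` are hypothetical names, and a real implementation would also need to keep the target object alive through the reference counting scheme.

```cpp
#include <cassert>

// Hypothetical event receiver.
struct Button {
    int clicks = 0;
    void onClick(int n) { clicks += n; }
};

// A delegate holds both the object and the method to call on it,
// so the pair can be passed around as a single value.
template <class T, class Arg>
struct Delegate {
    T* object;
    void (T::*method)(Arg);

    void operator()(Arg arg) const { (object->*method)(arg); }
};

// Fires the handler twice through the delegate and returns the click count.
int delegateDemo() {
    Button button;
    Delegate<Button, int> handler{&button, &Button::onClick};
    handler(1);  // event source knows nothing about Button
    handler(2);
    return button.clicks;
}
```

An event source can store `handler` and invoke it later without knowing anything about `Button`, which is what makes the pattern convenient for event-driven code.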
I called it Lana, some wag called it Jimscript. Anyway, as those who know me will know by now, my projects tend to expand until I need to embed a scripting language in them. And my current project is no exception. Previously, however, I’ve always just bolted in some kind of Forthlike, and hang the readability. This time, I wanted to do it properly – either embed Lua or Python, or do a proper language.
What I wanted this time was a language that would allow me to “decorate” my projects’ data structures with tiny fragments of arbitrary code, and for the code to be fully integrated with my data structures. Lua and Python’s integration – particularly with C++ – is still pretty revolting, so once again I had an excuse to do it myself. And earlier today, Lana ran (after a whole bunch of other unit tests) the following program:
# create an object
a = create()
# set some values in it
# set a function in it (i.e a method)
a.setvals = function(a,b)
this.val1 = a
this.val2 = b
# and another one which does a calculation
a.somecalc = function() returns
# assert that the calculation gets the correct value
# make a clone of the object
b = clone(a)
# assert that the calculation now gives a different result
# create a new function as a first-class value
# but first copy the old function somewhere safe!
oldfunc = a.somecalc
newfunc = function() returns
# and set this in 'a' - it should also change in 'b'
# now change 'b' back to the old calc; 'a' should not change
# all done
Quick list of features:
- Fairly powerful language – repeat/until, while/endwhile, break and continue, if/elseif/else/endif, and (I’m afraid) goto.
- Reasonably rapid – code is compiled into bytecode before execution.
- Fully interactive – each line of input is compiled to bytecode and run immediately, unless you’re compiling a function expression.
- Functions are first class objects, so you can create functions inside functions, assign them to variables and properties and so on. No closures yet, because that requires garbage collection – but I can do that with simple refcounting for this one case, so maybe non-local variables and closures will appear.
- No garbage collection – I could add it, but it would be missing the point – the language needs to be fast. I’ve lived without GC in C++ for a good few years, I can cope without it in this.
- Strings, however, are tidied up properly.
- Because functions are first class objects, functions are just values and can be stored anywhere else values are stored – inside Lana objects, inside your own objects (which can also be Lana objects) etc.
- Source code is not stored, but can be regenerated at any point from the bytecode, including comments. A lot of work has gone into making this work correctly, and it’s a big focus of the unit tests.
- Object properties (members and methods) are stored in fast hashes, based on the Python hash table algorithm.
- It’s easy to register new native C++ functions, and native C++ methods for Lana objects.