|
Þ briarpig |
news?
try: log |
This new column shows recent log page blog posts.
demos
Lately I write demos under thorn with a todo list using C++ under a BriarPig mu-babel license. New toy language material about lathe (Lisp under thorn) appears under mu with update indexes itemized on toy. The run and hex demos were done on 13apr2008; crc on 14apr2008; buf on 20apr2008; in on 27apr2008; ctype on 04may2008; out on 18may2008; slice on 25may2008; quote on 31may2008; escape on 31may2008; mutex on 14jun2008; rand on 16jun2008; stat on 17jun2008; primes on 19jun2008; list on 23jun2008; heap on 29jun2008; iter on 02jul2008 and 04jul2008; atomic on 06jul2008; node on 13jul2008; page on 23jul2008; hash on 27jul2008; book on 27jul2008; pile on 03aug2008; stack on 07aug2008; mu on 12aug2008; toy on 16aug2008; weight on 23aug2008; symbol on 02sep2008; imm on 04sep2008; gc on 06sep2008; map on 14sep2008; meter on 16sep2008; thread on 22oct2008; arc on 05nov2008; this menu links all demos: mu: toy, peg, imm, tag, box, symbol, token, number, bigint, class, method, reader, writer, eval, env, vm, gc, world, pcode, compiler, asm, lathe, lisp, smalltalk, design, weight, jar, card, harp, debug, profile thorn: todo, names, iovec, assert, log, run, hex, crc, buf, in, out, quote, escape, compare, file, deck, cow, arc, blob, tree, slice, rand, time, stat, heap, node, primes, page, book, pile, stack, atomic, lock, mutex, thread, map, meter, list, iter, ctype 14nov08
¶
thoughtful faces
type classes ¶ Jonathan Shapiro kindly wrote a generous response on the bitc-dev mailing list to a question I asked recently at the Lambda The Ultimate blog. I asked about terms typeclasses and polyinstantiation — which I had not heard before (because I don't really bother to keep up with programming language terminology) — because naasking had suggested v-tables in BitC might be unnecessary given these other mechanisms. It looked like a good opportunity to ask a newbie question, since I never mind looking ignorant. (Which is convenient, since I'm often ignorant.) Over at LtU, David Barbour gave a good short description of polyinstantiation, including a comparison to static specialization in C++ templates, which made me infer the term means something like: "polymorphism via type specialization at point of instantiation — usually static but could be dynamic." (Because you can always make anything dynamic, of course: late binding is easy, while early binding is hard since it narrows options.) I'm replying here because I've written long posts at LtU in the past making me self-conscious about verbosity while still not letting me say as much as I'd like. I'm happy to respond at LtU when I can keep it short. Note I don't use my last name on this site, and I'd rather not see it used by others. (I had a taste once of greater notoriety, and I don't like it much. The first name I adopted a few years and use at LtU is just fine to keep a low profile.) Sorry for a long setup; the rest of this is response. C++ does tend to tie too many things together. I'm not an apologist for C++, just someone who uses it a lot. Please don't take me for one of those provincial coders who think v-tables are the right way to do things. An array of function pointers is just one representation of a map, and is equivalent to (but more fragile than) dictionaries in other formats. Smalltalk style method dictionaries are nice and flexible, and this hashmap approach to api dispatch is used by Self and Python and other languages to similar effect, though Python uses it for everything including member variables; I gather objects in Python are tuples in the form of mapped name/value pairs. As far as I'm concerned, Objective-C's approach to method maps and dispatch is superior to that in C++, but I haven't had opportunity to use it much. My work life would have been simpler if Objective-C had bested C++ many years ago. Oh well. When I asked about space efficiency of typeclasses as compared to v-tables in C++, I was wondering whether it resembled (just guessing blindly of course) something like Ian Piumarta's approach in cola, which I glanced at some months ago, just long enough to get an impression it aped structural aspects of both Smalltalk class pointers in objects and C++ vtable pointers in objects — embedding a reference to metainformation about a class somewhere in the physical object representation, where it can be found at runtime by dynamic dispatch for polymorphism. Oh, and yes, yes, I know Piumarta is a god (someone I admire) and I mean no disrespect by using casual verbs like ape. The answer I was hoping to hear was one like, "Every single object spends a little space on overhead to access class oriented metadata in a very direct manner, instead of finding very clever indirect ways to factor that overhead out of every object." I like my indirection straight and without too much gamesmanship in optimization. I have a knee jerk negative reaction to fractal sounding solutions (where every rule has an exception for special cases described by new rules with their own exceptions, ad infinitum). I'm not really familiar with Java interfaces, except that when I first read about them in the 90's they sounded so close to Objective-C interfaces I assumed it was a direct copy. (Or they're cousins since essentially the same community of folks would have been involved.) The material on dictionary resolution you mentioned sounds like the part I find interesting. The interactions between multiple types is likely where I'd worry about problems. What I'd really like is a simple operational definition of outcomes from a casual coder's point of view: where should I look in code to resolve conflicts myself? Can I tell — just by looking at the code — how the resolution will work out in the compiler? Or is it too complex to do by hand in practice? This bears very closely upon an aspect of generic functions in Dylan that put me off the language: I had trouble guessing which method would be called by my own casual, mental static typing. I didn't want the runtime to "do the right thing" — I wanted to know myself what would dispatch a priori. In other words, I want to do global static analysis myself, without any maybes or probablys involved. One of the reasons I liked Smalltalk with single inheritance was ease of deciding where to look to see what method executes, once I made an assumption about what actual types were passed to methods. If you like, think of me as coding in object oriented assembler. I still want to imagine a deterministic assembler model despite a nice high level language. I'm glad you wanted to resist dynamic allocation of dictionaries. As a user I would be satisfied by a feature allowing me to turn off dynamic dictionary allocation for types I cared most to control. Incidentally, I like your use of capsule as an alternative for object, since the length is short enough, the syllables few enough, the relation to encapsulation close enough, and odds of confusion with random associations low enough. Thanks a lot for expanding on typeclasses. Mechanics of dispatch is one of my favorite topics, which is why I had trouble resisting an urge to ask for elaboration. I think of you as someone whose time is much more valuable than mine, so I'm flattered to get my interest addressed. I'm not sure I was confused (and I lack the modesty to claim I was), but I sure was ignorant about typeclasses and I thank you for tutorage. I hope it helps other folks familiar with C++ become more familiar with BitC in areas permitting some cross over. In some ways C++ looks like a train headed to run into a mountain without benefit of a tunnel, so it would be nice to have options. simpler C++ rules ¶ Coincidentally (given the sub-topic in shap's remarks about interfaces generally being better than v-tables specifically) this afternoon I added similar remarks along these lines to a wiki page at work outlining future ground rules for a style of writing C++ embedded inside C oriented processes. My first draft of C++ usage rules concerned things not to do, in order to have no runtime impact on nearby C code linking no std C++ libs. The purpose of the wiki page was to inscribe a circle around a set of things to be explicitly permitted by C++ code lurking inside a C-only environment, as distinguished from another set of things explicitly denied to C++ in the same context. But a reviewer asked me to include guidelines on use of inheritance, explicitly spelling out what is considered good inheritance style. In particular, for components intended swappable with replacements, the best approach to minimize future conflict uses pure virtual abstract base class interfaces, with neither inherited state nor implementation to cause coupling in subclasses. Naturally I agreed and formalized rules a bit for the wiki page to encourage good inheritance style in the direction of abstract interfaces. But in practice, extremism of interface purism is one I seldom bother to pursue in my own casual projects, like ones I post on this site. In code you see around here, I make no effort to enforce pure separation of abstract interfaces from concrete subtypes implementing them. I could, but I don't. So, why not? When you cleanly separate abstract interfaces from concrete implementations, you tend to get at least two sets of names: the abstract ones and the concrete ones, with the obvious one-to-one mapping, of course. Proliferation of code names often upsets many people — or at least most folks I see all the time. I tend to create so many class names in the first place that I already get complaints, even before adding more by separating abstract from concrete. The number one request I get is always the same: please write less — less of everything: fewer classes, methods, docs, and checkin notes. Just less, please. When I try to make code replaceable by substitutes, I'm the only one who ever does it (currently anyway). And the abstraction making it possible is usually criticized. Because essentially, abstraction is also obfuscation. Code written abstractly is much easier to change, but is somewhat harder to grasp. The former counts for little, and the latter counts for a lot, in day jobs of late. Sample code on this site blurs edges between abstract and concrete, to make it more clear and to give me less to write, both. The result is hacking fodder, mostly. Which is more or less what it's for: grist for future mills. 12nov08
¶
war bucks
tinkering ¶ Rather than writing web pages, I've been tinkering with new code. I was aiming for a new blob api, but I veered sideways into a new file api rev because it seemed plausible to make a blob mimic a file. Then I kept veering into some pseudo memory mapping support for files. So effectively I'm cooking up some content for a file demo (which is still a stub) because it will help me frame some context for blob content. And it wouldn't hurt to have demos get slightly code heavy. But I'm sure I can go too far that way, getting too many code names. (In code design, when you favor orientation toward data, the result is often more flexible and adaptive. But when each type needs a unique name, you get a lot of names; for a complex task, like paging iovec-based files, it's easy to use many classes with only slightly different names, because their meanings are intertwined.) Since it has a high amusement factor just now, I'm not throttling much. So I might just crank out code for several related demos, all oriented around schlepping around bytes in blobs, files, and wherever. I might write demos as if Wil composed code in stream-of-consciousness mode, in response to Zé asking "what if" too many times. Half baked results would be both good and bad: fertile suggestions and lines of inquiry, but you'd need to boil it down yourself, if you want minimalism. tombstones ¶ This last week I updated code for a closed/probing hash map (using tombstones) so accumulated entropy in the form of tombstones could incrementally boil off. I was seeing too high a percentage of tombstones in map occupancy (bordering on pathological). So I thought about it; then I said oh yeah once I'd simulated it a little. In a closed hashing map which probes to find unused slots when the target bucket is used, you only know a search has ended when you see an empty slot. As a result, when you delete a map entry you must replace it with a tombstone if you don't want to rehash, in order to keep searches valid for buckets used downstream from that spot. When do tombstones go away? When a successor is an empty slot. So removing tombstones is dependent on getting them next to empty slots. High frequency of empty slots is important to maintain constant time access. I decided my last map implementation wasn't working hard enough to remove tombstones — I hadn't asked myself how to make them go away proactively. But once I did, an obvious approach suggested itself. Say you search for key K in a map, but the hash H=h(K) points at bucket B, and K is found in a slot downstream to the right of B (because the slots were used when K was added). When some older keys are removed, you might run across tombstones on the way to finding K. If you do, then swapping K with the first tombstone slot improves the map two ways: 1) K is closer to bucket B, and 2) a moved tombstone gets closer to the next empty slot. Both these improvements remove map entropy, and you can do this incrementally in every map lookup. So far I call this bubbling tombstones rightward because it pushes tombstones to the right on each swap, getting closer to the next empty slot each time, promoting conversion of tombstone slots to empty slots when they become adjacent. When the total empty slot count gets too low, the same tactic can be pursued more aggressively by considering other keys seen in passing besides K, the target of the search. For example, the next used key K' in a slot that is not equal to K might also rehash to an earlier tombstone, which would bubble another tombstone rightward. |
This column is the original sole content of this home page. It was moved to just one column to make room in the other column for a copy of recent additions to the log weblog. The main purpose of this site is to describe some open software I'm writing — so that it's documented — and to apply some code here experimentally, so I have proof of concept, and so I can teach my kids to code in real web contexts, using dynamic languages. For clarity, and for my sons' sake, part of this is an experiment in literate programming. But instead of mixing code and docs in files, I plan to mix them in the pages of the site. Eventually how the site works ought to be (approximately) self describing. license What I publish here will use some open software license† minimizing my exposure to nuisances, among which I number corporate patents and open source community code forks. What's sure to annoy me? Racing ahead, using what I make to put social and/or legal roadblocks in my path. costs What's this costing me? Not much in dollars. But writing in plain language will seriously eat into my free time to code, which interests me considerably more. I'm eager to avoid seeing the cost balloon as a result of more email. (Publication of email is partly deterrent.) It might seem I'm spending energy on appearance, with nice page formats and color, etc. But this is really easy after long practice, and the clarity of the result is at least as important as the words. benefits What do I stand to get out of this? As much as I enjoy just coding for fun, it still must justify the time to me by providing value someday, by shrinking or streamlining something I do professionally, eventually. Code that's undocumented is hard to use industrially. I hope the amount I document here later opens doors for professional use, in a more comprehensive sense than normal cherry picking over the years. But that's long term. In the short term, I only mean to benefit by seeing my sons have an environment in which they can learn to code by writing software they and their friends might care about. And of course, I expect to enjoy playing with novel web site infrastructure too. attitude Though I want to have my say here, it doesn't mean I suppose you care. I'd be surprised if you did. Most folks who keep a blog hope to be read, but I expect little. Starting February 2008, I expect to write more fiction in the third person about imaginary developers who sometimes discuss ideas, games, and charades in some mix of boring technical programming chatter and slightly interesting science fiction and fantasy. (Short named characters are virtual personas.) hand images The hand images on some pages only mean to provide visual cues to reinforce perception of place. (I made several dozen of these a couple years ago in the same format, but never used them until now.) briars & thorns Thorn imagery and a briar patch metaphor both correctly convey a clue I might be prickly or socially distant toward feedback since my purpose here isn't really social, so peer pressure is moot. peers If you insist on peer pressure, here's what would make you a peer. You need minimally the following: much more than 10 years of professional C++ coding, a focus in memory managment and performance, plus experience coding Lisp and Smalltalk runtimes in assembler, C, or C++. Naturally, I'm likely to appreciate fine technical points from folks who know a lot more than I do about Scheme, Smalltalk, and Python, as long as the perspective favors implementation. If you're a big dynamic language geek or enthusiast, that's great and your interest is welcome, but I'm really trying to get stuff done. No hard feelings. I just want to keep working without frittering away much time. I've no time at all for politics or pep rallies. Just because C++ is my tool of choice to be productive today, that doesn't mean I'm wed to it, or that I want code & libraries properly styled in C++ fashion or religion, or that I seek existing tools. I just need total runtime control. competition Many software developers are competitive to the point of sociopathy. There's a good chance something about what I'm doing will stick in someone's craw because I encroach some territory, even if I don't undermine a business model with my alternative. I'm not competing, and I'm not out for your kudos, territory, money or whatever. I'm not parading my chops, or advertising or selling anything. Whatever you need to be best at — terrific, you win. Cut me some slack, I'm not in the game. identity I'm no one of consequence. This isn't one of those masked man fighting for justice things, nor is it a Dread Pirate Roberts thing. Just think of me as that one guy — yet another geeky engineer. (On pseudonyms: if a message is missing my name or briarpig, then I didn't write it.) contact david (at) briarpig (dot) com — I seldom respond to email, but it's possible I'll make an exception in your case, if you write something interesting. Questions are rarely interesting, tending to flattery and other forms of shallow antagonism. |