

May 2009

30may09 misanthropes on ice

less

     I deleted an entire day's update last week before posting any of it. Lately I've been redacting a lot of material. Most often my reason is quite simple: I say too much about myself, and I'm not that interesting. Worse yet, having others think they understand you from a light sampling is risky.

     I have much less to say now that's not fiction. I may write a lot of fiction, but I expect to say less about code. More about code in the next section below.

trampolines

     I'm still slowly working on a basic async runtime for simulations and async programming languages, but mainly this involves imagery alone with a lot of reasoning. I ask myself, "What happens when I do this?" And then I work through the consequences of what happens in memory. I don't look at any code because it's distracting.

     I spend a lot of time closing my eyes and picturing patterns of stuff in memory, while trying out lines of analysis. Just thinking about basic non-blocking i/o for printing keeps me fairly busy, and it leads almost immediately to a bunch of issues folks usually call "operating systems." I hope I can think of something even more primitive, which isn't well behaved unless you add more constraints.

     It looks like I'm going to head straight into the teeth of trampoline style here, which is where I wanted to go in programming languages anyway, once, but I thought that was a little too much in the "being thorough" vein. Well, just to write simulations in async style with continuations might require trampolining in green threads, to avoid doing something even more exotic.
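     For readers who haven't met the term, here is a minimal sketch of trampoline style, not the runtime described above. The names Bounce, StepFn, trampoline, count_step, and sum_to are made up for illustration. Each step does a little work and returns the next step instead of calling it, so one flat loop drives the whole chain.

```cpp
#include <cassert>

struct Bounce;                          // what a step hands back
typedef Bounce (*StepFn)(void* state);  // one small unit of work
struct Bounce {
  StepFn fn;    // next step to run, or null when finished
  void* state;  // state the next step needs
};

// The trampoline: a flat loop that calls each step and "bounces" to the
// next one, so the native stack stays shallow however long the chain is.
inline long trampoline(Bounce next) {
  long bounces = 0;
  while (next.fn) {
    next = next.fn(next.state);
    ++bounces;
  }
  return bounces;
}

// Hypothetical workload: add n + (n-1) + ... + 1 into sum, one step per term.
struct CountDown {
  long long n;
  long long sum;
};

inline Bounce count_step(void* state) {
  CountDown* cd = static_cast<CountDown*>(state);
  if (cd->n == 0) return Bounce{nullptr, nullptr};  // done: stop bouncing
  cd->sum += cd->n;
  cd->n -= 1;
  return Bounce{&count_step, state};  // continue by returning, not by calling
}

inline long long sum_to(long long n) {
  CountDown cd = {n, 0};
  trampoline(Bounce{&count_step, &cd});
  return cd.sum;
}
```

     A million steps run here without a million nested frames, which is the property a green thread scheduler would want from continuations.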

infamous

     My sons and I are highly engrossed in a new video game named InFamous. It's excellent: easily as good as any other game my sons have ever played. I think it's my favorite so far, with a score of 9.5 out of 10.

     So, um, obviously I'll have to put a long reference into my story, with Zé running a tribute environment. The game is much too good for me to palm off a brief review in a few words, so I'll come back to it later.

     I was sorely tempted to give Zé a future ability along lines shown in InFamous. (Can you guess why? Of course you can.) But I'm trying to put as little supernatural content in my story as possible when it is presented as "real" — as in, not in one of Zé's alternate realities. Thus far I've written mainly what I call "living room" science fiction, since most of it happens in Wil's living room. (Yes, I think this is funny.)

     If I constrain myself to strange abilities that occur only in the mind for a while, it's possible these are mediated by nanotech—and thus vaguely possible if you squint really hard—with the exception of Zé's games which predate Finch's arrival. (Wil and Zé will never see conclusive evidence nanotech is actually involved, so they must rely on what they're told. You can make up your own mind.)

aleph

     I would have resumed writing fiction by now, but one character is taking a lot of thought. I'm putting a lot of time into Aleph's back story and point of view. None of my characters are sock puppets, although Flywheel's secretary comes close so far (but she'll get complex later).

     For autonomous characters, you must know their reaction to anything another character does. It's hard to make a story any stronger than characters involved, so weak characters yield weak stories. I need Aleph to be fairly interesting or it undermines Finch by using weak props. Also, it's a shame to waste an opportunity to go long with a character who can carry the material.

     Last night I had a good insight into who Aleph is. And I can't hint here because figuring it out is part of your fun. It won't be very hard though, so the puzzling part will be what I think I'm doing. (Hey, she needs to be a loose cannon, but an internally consistent one. Who do you think will like her more: Zé or Wil? See, you're learning how this works.)

     Mick, in contrast, is not too hard to invent because men of a certain type affect a consistent manner in their behavior, acting as a frame for content with more variability. His imagination is directed and focused, so you can predict where he might be on the map.

24may09 yes men plagues

back to basics

     I'm starting to ramp up a revision of hobby code related to async continuations and simulation tools in a programming language environment. But I don't think I'm going to write about it while doing it, because it's sure to be boring. I won't get through it unless speed is one operating parameter.

     I've been thinking about how I'll start, and what path I might take to trick myself into seeing it as vaguely interesting for its own sake.

     For example, the "main thread" hosting green threads must not make blocking calls. So I'll write the blocking calls in other threads first, one at a time, checking that each seems to work as expected. I'll pretend this is fun because I see empirical results as I go. For example, to write to standard out streams, I need a threadsafe queue to a thread devoted to writing to stdout (and perhaps other things).

     Then I'll have a thread pool for other things suited to multi-threaded treatment. (Writing to stdout is not suited to multiple threads when you want to serialize output; so some services are best handled by a single thread, rather than a pool.)
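     A minimal sketch of the single writer thread idea, assuming C++11 threads. The class name LineWriter and its methods are invented for illustration and are not taken from the hobby code. Callers only push onto a locked queue; one dedicated thread does the blocking writes, so output stays serialized.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// One thread owns the output stream; everyone else only enqueues lines.
class LineWriter {
 public:
  explicit LineWriter(std::FILE* out)
      : out_(out), closed_(false), written_(0),
        worker_(&LineWriter::run, this) {}
  ~LineWriter() { close(); }

  // Never blocks on i/o: holds the lock only long enough to push.
  void post(std::string line) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      if (closed_) return;  // posts after close() are dropped
      queue_.push_back(std::move(line));
    }
    ready_.notify_one();
  }

  // Let the writer drain what is queued, then stop it.
  void close() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      closed_ = true;
    }
    ready_.notify_one();
    if (worker_.joinable()) worker_.join();
  }

  std::size_t written() const { return written_; }  // read after close()

 private:
  void run() {
    std::unique_lock<std::mutex> lock(mu_);
    for (;;) {
      ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
      if (queue_.empty()) {  // closed and fully drained
        std::fflush(out_);
        return;
      }
      std::string line = std::move(queue_.front());
      queue_.pop_front();
      lock.unlock();  // do the blocking write outside the lock
      std::fputs(line.c_str(), out_);
      std::fputc('\n', out_);
      lock.lock();
      ++written_;
    }
  }

  std::FILE* out_;
  std::mutex mu_;
  std::condition_variable ready_;
  std::deque<std::string> queue_;
  bool closed_;
  std::size_t written_;
  std::thread worker_;  // declared last so it starts after the fields above
};
```

     The non-blocking core would call post() and move on. Whether other services share this thread is a design choice the sketch leaves open.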

     Basically I want to wire up all the usual suspects when writing a typical backend process, from signals to sockets, while letting a non-blocking core interact through async api. I won't get far if I bother to write about it. But when I'm all done sometime I might post some of it under a cy style BSD license.

23may09 fractal processes

simulation

     Okay, here's a short piece on simulation I recently said I'd write. What's the context? It was difficulty in telling whether software you write works correctly—just because complex systems are hard to evaluate. Three years ago I advocated an idea of getting more support directly from a language when profiling code for expected behavior. I had intended to write a nice language with support for runtime analysis.

     Since then I haven't found a lot of time to hack programming languages. But I could have found time if I was still as interested now as then.

     (When I first pursued languages in the early 90's, I imagined folks would use tools to make things, and these might be creations I'd enjoy. But lately I see almost nothing folks create in software systems is very interesting. A typical goal now is usually some kind of social networking system—or more generally, a way for users to manifest themselves online and interact with friends. This is basically consumer drivel. I could not care less about helping eloi self actualize without doing anything more constructive than preen and hobnob. So there's not much point in making authoring tools. Authoring happens very little. Thus I'm no longer under an illusion it matters much what I do. These days I mainly optimize existing infrastructure.)

     Let's get back to my point about simulation. Okay, so you write a piece of software that runs in a very complex distributed environment, so at any given moment you might have trouble seeing current status of either a system or your piece of software, so it's hard to tell if your code is correct.

     As an example, a typical problem is failure to engage. Negatives are very hard to evaluate: why something didn't happen. (It's much easier to study a thing that happened when it should not have, because you find a positive trail. Trail absence is problematic.) Someone asks you, "Why didn't your code do XYZ?" So you reply by asking, "Did you even call my code?" And they shrug. Hmm, we need more auditing.

     My current feeling is this: you should simulate (or emulate, which implies closer faith to detail in reality) the world outside your code. A simulation can use your component the same way real world code will, to achieve the same things. But a simulation can be under your control in ways real world systems seldom permit, because off-the-shelf code has low transparency and little consistent, reproducible, guaranteed behavior. You can run a simulation in one process for simplicity, or in multiple processes for distributed action, or using multiple machines. Comparing all of these allows you to study why single process differs from distributed behavior. Then when code works in simulations, but not in reality, you can study why reality is not like your simulation.

     A simulation helps you answer an important question: Is my code basically screwed up? When the answer comes out "no" you can focus attention on what happens in the real distributed environment, to characterize how it differs from simulations which work. This is a vaguely scientific approach in the sense of comparing things for equivalence, and breaking things into smaller pieces. The process is amenable to reductionism. It may require more effort, but effort with systematic structure your management can audit; transparency in dev groups might be better.

     Anyway, a simulation can let you demonstrate what your code does when used both correctly and incorrectly. Then all you have to do is measure how real systems correspond to your simulations. Ideally you'd like a simulation infrastructure to be rich enough to model real system calls very closely, so you can drive your module exactly the same way it gets used by real code.
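     As a rough illustration of the idea, and not a description of any real system, the sketch below puts a component behind a small interface so a simulated world can drive it. World, SimWorld, and deliver are hypothetical names. The simulated world scripts failures and keeps an audit trail, so the question "Did you even call my code?" has an answer.

```cpp
#include <string>
#include <vector>

// The outside world as the component sees it (hypothetical interface).
struct World {
  virtual ~World() {}
  virtual bool send(const std::string& msg) = 0;  // false: delivery failed
};

// Component under test: tries to deliver, retrying a bounded number of times.
// Returns how many attempts it took, or 0 if it gave up.
inline int deliver(World& world, const std::string& msg, int max_tries) {
  for (int tries = 1; tries <= max_tries; ++tries) {
    if (world.send(msg)) return tries;
  }
  return 0;
}

// Simulated world: fully under our control, and it audits every call.
struct SimWorld : World {
  int failures_left;               // scripted: fail this many sends first
  std::vector<std::string> trail;  // audit trail of everything asked of us

  explicit SimWorld(int fail_first) : failures_left(fail_first) {}

  bool send(const std::string& msg) override {
    bool ok = (failures_left == 0);
    if (!ok) --failures_left;
    trail.push_back((ok ? "sent:" : "dropped:") + msg);
    return ok;
  }
};
```

     A real deployment would supply another World implementation. Comparing its trail against the simulated one is where the differences from reality would show up.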

     There's probably a programming language aspect to this, but it hardly matters what programming language is used. However, if your components are async (like network application modules), then you need good async support in whatever language simulates your execution environment.

21may09 singular fantasies

singularity

     Since I mention a science fiction idea of singularity in fiction, maybe I should clarify my own take on it, so folks won't confuse me for a fan.

     I don't think there will ever be a singularity, in the sense of miraculous asymptotic tech nirvana yielding a nerd rapture of epic transcendent scope. Note it's not impossible: just infinitesimally likely. Using it in fiction strikes me as okay since it's conceivable. Sorta. But it's about as silly as expecting anti-gravity one day will be possible. In fact, it's worse than that, because a singularity presumes an entire interlocking canvas of unlikely things as improbable as anti-gravity.

     Why do I feel that way? An explanation close to gut level is roughly: it seems like yet another violation of thermodynamics. As a fantasy, an idea of tech singularity assumes no entropy (read: crazy, noisy, chaotic back pressure) will lace the good parts. It requires assuming benefit without cost, and faith problems cannot grow as fast as solutions. It has a flavor of belief in Santa Claus or the tooth fairy.

     Human beings tend to ignore problems, preferring the bright side to negative outlook. As a result, they tend to avoid even counting the number of ways things can go wrong. If a final result is going to be some random collection of good and bad chance outcomes, you ought to consider whether bad outcomes greatly outnumber good ones in a problem space.

     (One reason I tend to write useful code is because I exhaustively search ways it might not work as I design and code. Ways you can fail are numerous. Just avoiding all the pitfalls is a lot of work.)

     Note here I touch a meme I should handle again sometime later: we're absurdly optimistic. I've spent a lot of time thinking about this. (About as much time as some folks do when they write a book afterward, but I won't write a book.) For example, I tried the standard evolution question: how is it an advantage to throw yourself into a grotesquely optimistic plan?

     Well, evolutionary models are rather statistical: creatures survive given better odds of surviving, in context. Imagine a large population of creatures in a species threatened by harsh circumstances. If each acts the same way, every one might die if one common behavior doesn't work. (This is a mono culture idea.) But if each creature tries something different, and follows through with vigor and enthusiasm (no matter how stupid the idea), then one of the tactics might work. If it works, half-hearted effort might not be good enough. So evolution might favor enthusiastic vigor in pursuit of a plan: ie, optimism. Being right has nothing to do with it. Evolution just likes gung-ho (gonzo).

     If being gung-ho was in fact a strong survival trait, it would also be a survival trait to prefer gung-ho partners who give their all to whatever stupid plan comes up. Socially, we prefer folks who are sure of themselves, right or wrong, because it's gung-ho. Alpha males are just being gonzo. It seems irrational we like folks sure of themselves, even when bent on idiotic agendas. But it favors evolution based on luck: lucky and enthusiastic spells success because when chance favor occurs you milk a result more than reason says to expect.

     Yes, that was a long digression, but I was trying to establish optimism without basis has a structural advantage in crisis. Your brain might be pre-disposed to like ideas without weighing them well, because luck matters more than deciding correctly. So get in there and play your heart out; if your spine is crushed, we can always send in another guy from the bench.

     What am I suggesting? That when an idea strikes you as neat, or cool, or just vaguely promising, your brain might not follow up with analysis as much as you think. Instead, latching on blindly with enthusiasm might be standard operating procedure.

20may09 blooming strains

scavenger hunts

     Hey, let's start a scavenger hunt. (I'm about as serious as if I'd said: Let's put on a play.) In general, I like scavenger hunt memes a lot, especially in game structure because they can scale to give really intelligent folks something to do after easy stuff is exhausted. Note the next paragraph might seem out of context.

     Yes, that was the data structure I was talking about. Are you having fun with clues yet?

belief without proof

     Soon I can update my comments on "What do you believe about Programming Languages", mentioned on Hacker News today. I'm the one who said:

     Just to pick a single thing on the list, it takes at least as much time to analyze what a system is doing under debug. In other words, after you write the code, you spend at least as much time wondering what the system is actually doing, even if the code is nearly all correct.

     I wrote that three years ago, when I still thought I had time to develop new programming languages. (Non-toy ones, that is. I still have time for toy languages.)

     Lately I'm trying to solve the same problem described through simulation and emulation, to characterize components in isolation. This also turns simulations into effective unit tests. I guess I should write a longer piece about it. Sure: in less than a week.

16may09 extinct mega fauna

wheel spinning

     Right now I'm having trouble being interested in my programming language project. I might not resume until I exit my current coding phase at work. At the moment I'm doing something very simulation oriented, and parts are far too similar to the business of async continuations and coroutines I want to do next in language work. So when I try to think about async api along those lines, I begin working instead. I actually want to spend all day Sunday writing code for work.

     Usually I treat that like a symptom of a work/life balance violation that needs fixing. It means you're turning into a grind and you're risking a burnout when it goes on long enough. But burnout is hypothetical to me: I'm not sure I've ever experienced it related to work. Generally I don't work long hours each week. Since I officially have my kids 50% of the time, I actually have them four or five nights a week—the last couple years anyway. It's very hard to work much more than a forty hour work week. Forty-five is pushing it, unless I work on weekends, but that interferes with how smart I am in short order, so it can't be done much.

     Tomorrow might be a mix of fiction and coding for work. At work they've cleverly started giving me update deadlines on Monday, under the theory this might get me to work weekends.

     Mind you, I'm never asked to work overtime. But folks hint the world might come to an end, now and then, if I don't produce faster. The hinting is very indirect, expressed as anxiety about schedule timetables in certain granularities. There's an interesting unspoken premise—sometimes actually said explicitly in weak terms—that I'm the only person who can do this. What that really means is this: when people help me, two bad things happen. First, the schedules have to get longer to sound realistic because people are cooperating. And second, they're afraid the result may lose the crisp, uniform, well-behaved, yummy flavor I give to code I develop. That has been the result before. My boss wants a rev like those I've done myself, and is afraid of a rev like one shared among three developers. I'm reading between the lines a little here. Otherwise he'd share the load with other developers.

     There's one quality about my code folks like in particular: when I'm asked a question, I know the answer. When asked about space, time, or complexity of some aspect of code, I render a complete, articulate explanation. I don't shrug and say I don't know. When tests reveal an odd runtime behavior, I have an idea where I should look in the code to study causes.

     What I'm doing at work is interesting. This is actually quite unusual. Most jobs ask me to do something rather boring, and I've had a problem being bored out of my mind since I was a teenager. (I had to stop watching television when I was seventeen, because one day it all started being the same thing with tiny variations.) One of the reasons I've done hobby work all my life at home was to address my boredom in a way work cannot.

     Recently I noticed that, in principle, I don't have to do hobby coding to avoid boredom. My work is just as interesting as hobby coding (this is pretty good). And when I write fiction, that's even better than hobby coding with respect to relief of boredom. I might be at a point in my life where I can just work all the time and write fiction when I'm not working, and I'd never have to be bored again. That would be cool.

     Probably work will one day get more boring. It's a required phase during development: the time when almost nothing changes, except for bug hunts in the jungle. Then I'll have to do hobby coding at home, because fiction won't be enough. (It's always necessary to have exact problems with correct answers in the mix, to avoid vague ambivalence of pure imagination.)

     I'm bending my programming language project in a direction I expect to need from now on at each job I have. I'm going to need async simulations of distributed systems, or otherwise the only way to test systems I write is in real deployments. So mainly I'm interested (now anyway) in making hobby code support for writing high level simulations with very good real time async support. When I get back to it, that's likely to be my focus. (Not the next big language, which was never very interesting to me.)

13may09 evil spouses

information theory

     This week I'm too busy to think about fiction. I'd be writing code for work now still, late at night, but I'm trying to wind down. I'm in that phase of coding a prototype where folks wait impatiently for signs of progress.

     I'm the über plumber where I work, so I'm expected to pull feats that verge on impossible. But when someone asks me to do better than information theory predicts is possible, at first I think they're joking (and then I wonder whether it's a subtle kind of political strategy).

     I have a data structure which, when full, reaches maximum entropy, with no further bit density available, and there's no metadata. Nothing can be factored out. Someone asked me to think of a smaller representation, taking advantage of sparse representation. I asked: What do you want to happen when it becomes full? The answer? They want a smaller representation than max bit density. I said, I'm pretty sure there's a proof no smaller representation is possible. I was treated like I was being difficult.

     (The data structure has an information theoretic definition. It's almost defined to be the smallest form that holds the content involved. But for some reason this isn't obvious to coworkers. I would just tell you the name of the data structure, but I don't want to spell out bounds on algorithms we use.)
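     The counting argument behind "no smaller representation is possible" can be written out in general form. It assumes only that the full structure must distinguish N different contents and currently uses b bits. The symbols N, b, and b' are introduced here for illustration and say nothing about the actual structure.

```latex
\[
  \text{A lossless encoding in } b \text{ bits has at most } 2^{b} \text{ distinct codes, so }
  2^{b} \ge N \;\Longrightarrow\; b \ge \lceil \log_2 N \rceil .
\]
\[
  \text{If the full structure already has } b = \log_2 N, \text{ then any } b' < b
  \text{ gives } 2^{b'} < N, \text{ so two distinct contents must share a code.}
\]
```

     A sparse encoding can still be shorter for some contents, but only by making others longer. That is why the question about what happens when the structure becomes full decides the matter.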

     It's been thirty years since I read Atlas Shrugged, but this reminds me of a scene where Hank Rearden is told he could work miracles if he wanted to, implying failure to transcend limits was malingering. (I would work an AS reference into fiction, but there's not much to work with; maybe I'll later have Finch claim an engine is powered by static electricity in the air, invented by some nameless engineer.)

10may09 maze of twisty coroutines

coroutines

     I'm working on coroutines, and a lot of really complex related ideas. I think I'm on a useful trail, but it won't be clear for a while. However, there's a really good part: I can test anything I consider with a very small amount of code. This is great: it implies I'm pursuing something primitive I want to apply sometime soon.

     Coroutines and related ideas are simpler in garbage collected systems, because you don't have as big a problem in defining where state lives and how it's managed. Part of what I'm doing is trying out many variations in storage ideas for threads and coroutines related to a simple api. The new api looks almost the same as the continuation api posted yesterday, except the you parameter is another continuation. If we use r for (co)routine, and Rt in C++, we might get this:

typedef struct cy_r_ cy_r;
typedef void (*cy_rfn)(cy_who me, int err, const cy_r* them);
struct cy_r_ { /* coroutine return */
  cy_rfn r_fn; /* async callback */
  cy_who r_me; /* me=this context */
};

     The only thing that's different is the third argument to cy_rfn: it's another coroutine continuation instead of just a who value. (It's passed by pointer instead of by value since cy_r is not yet defined where cy_rfn is declared.)

namespace cy {
class Rt : public cy_r { // (co)routine
public:
  Rt() { r_fn = 0; r_me.who_ptr = 0; r_me.who_gen = 0; }
  Rt(cy_rfn cb) { r_fn = cb; r_me.who_ptr = 0; r_me.who_gen = 0; }
  Rt(cy_r const& x) { r_fn = x.r_fn; r_me = x.r_me; }
  Rt(cy_rfn cb, cy_who me) { r_fn = cb; r_me = me; }
  void fn_do(int err, const cy_r* them) {
    (*r_fn)(r_me, err, them); /* callback */
  }
}; // class Rt
}; // namespace cy

     Presumably space used for parameters and return values was negotiated up front, so the who values implicitly point to this space, but indirectly through state in the coroutines involved.

     Instead of passing just an integer middle argument to cy_rfn, it might be helpful to pass a more complex value to help manage any state that might be involved. Perhaps another argument would help, something like this:

typedef struct cy_r_ cy_r;
typedef void (*cy_rfn)(cy_who me, int err,
                       /*optional value:*/ const cy_who* value,
                       const cy_r* them);
struct cy_r_ { /* coroutine return */
  cy_rfn r_fn; /* async callback */
  cy_who r_me; /* me=this context */
};

     Contracts for memory management might be hairy in the fully general case. If you knew a thread stack was used with standard memory conventions, it might simplify coroutine contracts.

     I'm puzzling out some sample code to illustrate coroutines done this way, perhaps with a simple kind of thread and scheduler. I think I want to use coroutine callbacks to manage signals and other kinds of events. For example, if I write a scaling i/o facility like epoll, I might use coroutines for async event push instead of polling.
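     One guess at what such sample code might look like, using the three argument cy_rfn from this entry. Player, player_fn, and rally are invented for the example, and the two sides resume each other by direct call. A real scheduler would presumably queue the resumptions rather than nest them.

```cpp
#include <cstdint>

typedef struct cy_who_ { void* who_ptr; uint32_t who_gen; } cy_who;
typedef struct cy_r_ cy_r;
typedef void (*cy_rfn)(cy_who me, int err, const cy_r* them);
struct cy_r_ { cy_rfn r_fn; cy_who r_me; };

// Hypothetical state for one side of a ping-pong exchange.
struct Player {
  int hits;   // times this side was resumed and played on
  int limit;  // stop after this many
  cy_r self;  // how the other side resumes this one
};

static void player_fn(cy_who me, int err, const cy_r* them) {
  Player* p = static_cast<Player*>(me.who_ptr);
  if (err != 0 || p->hits >= p->limit) return;  // end the rally
  p->hits += 1;
  // Resume the other side, handing over our own continuation as "them".
  (*them->r_fn)(them->r_me, 0, &p->self);
}

// Wire up two players and let them resume each other until both hit limit.
inline int rally(int limit) {
  Player a = {0, limit, {nullptr, {nullptr, 0}}};
  Player b = {0, limit, {nullptr, {nullptr, 0}}};
  a.self.r_fn = &player_fn;
  a.self.r_me.who_ptr = &a;
  b.self.r_fn = &player_fn;
  b.self.r_me.who_ptr = &b;
  player_fn(a.self.r_me, 0, &b.self);  // a serves; b is "them"
  return a.hits + b.hits;
}
```

     Each side only knows the other through the cy_r it is handed, which is the decoupling the api is after.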

Entries appear in reverse chronological order. Content here is permanent: each entry has a permalink to the long-lived persistent copy here. Clearly, to link anything, you'd best link the permanent copy.

09may09 impetuous cadets

red, blue, green

     I saw Star Trek today. There's a scene where Kirk, Sulu, and a guy named Olson wear pressure suits that are almost exactly the same colors as nano armor worn by Wil, Zé, and Eli, respectively. Guess what happens to the guy in the red suit.

labels

     (All the rest of today's entry appears on the new label page under cy.)

label

     On this page, label is a short word for descriptor. For example, a file descriptor is a kind of label using terms defined here. But here we generalize descriptors for use in async messaging and call them labels. The idea means something very similar to name or address. In effect, a label is a name that can be turned into an address by whoever coined that name.

     The word descriptor serves my purposes perfectly, except the word is too damn long. In a thesaurus, label means the closest thing to descriptor as used here. And then if you look in a thesaurus for label, you can find synonyms with both noun and verb meanings. You can replace label with logo or mark as a noun, or with call or name as a verb. All of these are pertinent meanings, especially call when label means continuation.

why

     Here's the confusing part: Why do I want this generalization of descriptors for use as labels? You don't need to know an answer to this at first (because descriptors are easy concepts) but as soon as I start showing applied label practice, you might be stumped unless you know the end goal.

     My goal is to decouple callers and callees in async api. A label can be used to indirectly refer to return addresses, for example, to name the caller who should receive a reply to a request. But a label can also refer indirectly to a callee preparing a response. Async api works by message passing, so callers and callees have no easy way to refer to one another without labels used as tokens to route messages. If a client C sends a request to server S, then C must pass a label to show S where to reply, and if server S wants to allow C to cancel the request (or say anything about the request before a reply), then S should give C a label referring to the request in progress. One kind of label points forward, the other backward. Confusing? Absolutely.

descriptors

     The normal way to represent a descriptor is by a simple integer. For example, if you open a file you might be given a file descriptor consisting of nothing but an integer.

int fd = ::open( /* ... */);

     A big problem with that kind of descriptor is inability to invalidate stale descriptors. Let's say you close descriptor fd above and then open a new file. The new file might use the same descriptor. If someone thinks the old descriptor is still valid, you might have a problem. For example, suppose someone thinks the old file still needs to be closed to clean up a resource. If they close the old descriptor, the new file gets closed. Oops. It would be nice to see the old descriptor is no longer valid. If only the descriptor was annotated with a generation number.

typedef struct cy_fid_ {
  uint16_t fid_idx; /* index in some file array */
  uint16_t fid_gen; /* current file generation */
} cy_fid;

cy_fid fd = ::open( /* ... */); /* pretend this open() returns a cy_fid */

     Using the approach shown above, you can tell an old descriptor is invalid. If the old file descriptor was just an index into some fixed-size resource, that index now becomes the 16-bit fid_idx field. (This is big enough when you never have more than 64K instances of this resource alive at once. When you need more, obviously both fields must be 32-bit values instead.)

     Every time the indexed resource is re-allocated, it gets a new generation number. (I strongly recommend pseudo random generation numbers.) If someone tries to use an old descriptor, with the wrong generation number, they get an error. Errors are good: much better than random execution behavior.
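     A minimal sketch of the generation check, assuming a tiny fixed-size table. SlotTable, Slot, and kSlots are hypothetical names, and generation numbers are sequential here only to keep the example deterministic, where the text recommends pseudo random ones.

```cpp
#include <cstdint>

typedef struct cy_fid_ {
  uint16_t fid_idx; /* index in some file array */
  uint16_t fid_gen; /* current file generation */
} cy_fid;

const uint16_t kSlots = 4;  // hypothetical table size

struct Slot {
  bool in_use;
  uint16_t gen;
  int value;  // stand-in for the hidden resource
};

struct SlotTable {
  Slot slots[kSlots];
  uint16_t next_gen;

  SlotTable() : slots(), next_gen(1) {}

  // Returns true and fills *out on success; false when the table is full.
  bool open(int value, cy_fid* out) {
    for (uint16_t i = 0; i < kSlots; ++i) {
      if (!slots[i].in_use) {
        slots[i].in_use = true;
        slots[i].gen = next_gen++;  // new generation on every re-allocation
        slots[i].value = value;
        out->fid_idx = i;
        out->fid_gen = slots[i].gen;
        return true;
      }
    }
    return false;
  }

  // Finds a live slot, or returns null for a stale or bogus descriptor.
  Slot* find(cy_fid fd) {
    if (fd.fid_idx >= kSlots) return nullptr;
    Slot* s = &slots[fd.fid_idx];
    return (s->in_use && s->gen == fd.fid_gen) ? s : nullptr;
  }

  // A stale close is reported as an error instead of closing someone else.
  bool close(cy_fid fd) {
    Slot* s = find(fd);
    if (!s) return false;
    s->in_use = false;
    return true;
  }
};
```

     Closing the old descriptor after its slot has been reused now fails cleanly, which is the "Oops" case from the plain integer version turned into an error.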

     I use a lot of descriptors like the one above when I design a library with hidden addresses. Each different type of object gets a separate struct, but typically they all look the same. A four byte struct like this can be passed around as efficiently as an integer value. I use 32-bit fields only when necessary.

     However, this sort of descriptor is a pain in the ass, because the logical firewall hiding a physical address is a kind of friction. Guess who pays the cost of the friction? That's right, you do. A small pain like this, repeated zillions of times, can wear you down. Even worse, you don't want to impose this discipline on callers. You can do this with a library, but typically you want to represent client callbacks more directly.

callbacks

     Here's a normal async callback in a C api:

typedef void (*cy_foo_cb)(void* cx, int err);

int /* nonzero: errno; zero: you will be called back */
cy_do_some_foo(ArgA* a, ArgB* b, cy_foo_cb cb, void* cx);

     This is logically a function taking just two args, a and b, but callback cb and context cx are passed in order to get a reply later, asynchronously with respect to the call.

     In a "normal" function (with synchronous behavior) the role of cb is served by the return address in the caller. And because the caller's stack stays around until this call returns, you can keep state in the stack instead of passing a pointer to state in context pointer cx. So cb and cx manually cope with a feature normally hidden by the runtime. This async approach lacks grace: it's complex. (But it works well enough.)

     But the caller has a problem: context arg cx is actually a descriptor. It refers to a resource in the caller that must remain in place until the callback occurs. (If it's refcounted, then the call is one of the references.) But since cx has no associated generation number, it's hard for the caller to invalidate. What if the caller gets tired of waiting for the callback, and would rather attend to other business, re-using the space that cx describes? Without a generation number, how can you distinguish a callback for the earlier call (which you abandoned) from a callback for a later one?

     The next section adds a generation number to cx.

who

     This is the label type described by this page:

typedef struct cy_who_ { /* async ID */
  void* who_ptr;     /* async context pointer */
  uint32_t who_gen;  /* generation num or state */
} cy_who;

     Note: by convention, a nil pointer in who_ptr is never valid. (Nil is a synonym for zero, not any other value.)

     I changed the name of this struct several times before settling on who. You don't want to know the longer analytical names I tried first. This describes either a caller or a callee. Let's look at a new form of callback signature.

typedef void (*cy_cfn)(cy_who me, int err, cy_who you);

     This looks like the last callback, but with two differences. First, the first void* context argument is now a who which couples a generation number with the pointer. Second, a new last who argument makes it possible to identify who is replying.

     The next section explains expected usage in terms of continuations. First let's address terms me and you.

     In object oriented languages, the first parameter typically represents the object receiving a message. In C++, this is that object, and in Smalltalk self is that object. Here the use of me refers to the object receiving the callback, so me means this. Term you is the opposite of me: the label for whoever is calling back. (Weird? Other conventions are just as bad.)

continuations

     The term continuation is a fancy way to say return address plus state of the caller. The continuation object below is named cy_c because c is short for continuation. This is just the original callback cb and context cx args shown earlier, but packaged differently so the context has a generation number.

typedef struct cy_c_ { /* async continuation */
  cy_cfn c_fn; /* async callback */
  cy_who c_me; /* me=this context */
} cy_c;

     Now look at opening a file using async notation:

cy_c myself;
myself.c_fn = function_to_callback;
myself.c_me = my_descriptor; /* the field is c_me, per cy_c above */
cy_who you = cy_open(/* ...*/, myself);

     How do you tell whether this failed? By convention, zero in who_ptr is always invalid. So if you.who_ptr is nil, you can look in errno for the error value, and the callback will never be called. But if you.who_ptr is non-nil, this means the myself.c_fn callback will be called exactly one time with a non-negative err value. (We might use negative err in a callback for progress reports. Success is shown by zero in err when (*myself.c_fn)(myself.c_me, err, you) is called. Actual contracts may vary on a call by call basis. Always read docs.)

     The you value returned here allows a caller to ask about this request later—perhaps to attempt canceling the request. Note the callee appears burdened with a need to maintain the value for you returned by this request, because the same value must be passed later to the callback. But a cy_who value might be easy to get from request state; extra state might not be needed.
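
     Here is a rough sketch of that contract seen from both sides. Since the real arguments to cy_open are elided above, the sketch uses a hypothetical stand-in named demo_open, and it completes immediately instead of later, just to keep the example short. The names demo_open, demo_done, demo_calls, and demo_last_err are all assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef struct cy_who_ { void* who_ptr; uint32_t who_gen; } cy_who;
typedef void (*cy_cfn)(cy_who me, int err, cy_who you);
typedef struct cy_c_ { cy_cfn c_fn; cy_who c_me; } cy_c;

static int demo_calls = 0;     /* how many times the callback ran */
static int demo_last_err = -1; /* err seen by the last callback */

/* hypothetical callback: me is our context, you is the request */
static void demo_done(cy_who me, int err, cy_who you) {
  (void) me; (void) you;
  demo_calls++;
  demo_last_err = err;
}

/* hypothetical stand-in for cy_open(): fails up front on a nil
   path, otherwise calls back exactly once with err == 0 */
static cy_who demo_open(const char* path, cy_c caller) {
  static int request_state; /* stands in for real request state */
  cy_who you = { NULL, 0 };
  if (path == NULL) {
    errno = EINVAL; /* nil who_ptr: no callback will ever occur */
    return you;
  }
  you.who_ptr = &request_state;
  you.who_gen = 1;
  (*caller.c_fn)(caller.c_me, 0, you); /* the one callback */
  return you;
}
```

     A real callee would normally return first and call back later from its own dispatch loop; only the nil check on who_ptr and the exactly-once callback are taken from the text.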

efficiency «

     Note sizeof(cy_who) is at least eight bytes, and more when pointers exceed 32 bits in size. This might seem hefty, especially when you know 16-bit descriptors and generation numbers are enough in your app. You might statically allocate all space in a server and expect tens of thousands of outstanding requests, wasting more space than necessary using this definition of cy_who. Yes, yes, this might not be ideal.

     The cy_who shown here aims to be least annoying when writing code to use it the first time. Having an actual pointer in there saves a lot of grief. Minimizing space isn't my goal in this version. Instead I want to reduce my grief in writing very complex first drafts of async systems. I can always write a new version with a smaller definition of cy_who.
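
     For comparison, a smaller label might look like the sketch below, assuming 16-bit descriptors and 16-bit generation numbers really are enough for your app. The names demo_who16, demo_table, and demo_who16_ptr are hypothetical, and this is not a definition I've settled on.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical compact label: a table index instead of a pointer */
typedef struct demo_who16_ {
  uint16_t w_index; /* 1-based slot index; zero is never valid */
  uint16_t w_gen;   /* generation number */
} demo_who16;

enum { DEMO_SLOTS = 4 };
static int demo_table[DEMO_SLOTS]; /* statically allocated contexts */

/* map a compact label back to its context, or nil when invalid */
static int* demo_who16_ptr(demo_who16 w) {
  if (w.w_index == 0 || w.w_index > DEMO_SLOTS)
    return NULL;
  return &demo_table[w.w_index - 1];
}
```

     The cost is the grief mentioned above: every use needs a table lookup, and a debugger shows an index where a pointer would have been.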

encoding «

     Although cy_who is declared as containing a pointer and a 32-bit generation number, that doesn't mean that's what is really inside an instance of cy_who. By convention, each who instance is opaque, understood only by the cy_cfn continuation function called with that value passed in the me position.

     You can put anything inside as long as its size does not exceed that of cy_who. Implementations that want to protect themselves might put integer indexes in the pointer, to avoid revealing memory addresses. And systems with memory that moves might use a relative pointer instead of absolute encoding.

     You could even use a bit inside cy_who to say how it's encoded, if you want to encode multiple ways for some reason. The only part that isn't negotiable is cy_cfn—that has to be a pointer to a function taking args as described.
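
     One possible reading is sketched below: an integer index stored in the pointer field, plus a bit in who_gen saying how the value is encoded. The choice of tag bit and the helper names are assumptions, not a layout this page defines. The index is stored plus one so a valid who never carries a nil who_ptr.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct cy_who_ { void* who_ptr; uint32_t who_gen; } cy_who;

/* hypothetical tag: top bit of who_gen marks an index encoding */
#define DEMO_IS_INDEX 0x80000000u

/* store index + 1 so a valid who never has a nil who_ptr */
static cy_who demo_from_index(uint32_t index, uint32_t gen) {
  cy_who w;
  w.who_ptr = (void*) ((uintptr_t) index + 1);
  w.who_gen = (gen & ~DEMO_IS_INDEX) | DEMO_IS_INDEX;
  return w;
}

/* nonzero when this who was built by demo_from_index() */
static int demo_is_index(cy_who w) {
  return (w.who_gen & DEMO_IS_INDEX) != 0;
}

/* recover the index, undoing the + 1 */
static uint32_t demo_index(cy_who w) {
  return (uint32_t) ((uintptr_t) w.who_ptr - 1);
}

/* recover the generation number, minus the tag bit */
static uint32_t demo_gen(cy_who w) {
  return w.who_gen & ~DEMO_IS_INDEX;
}
```

     Only the code that created the value needs to know any of this; everyone else just hands the who back untouched.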

C vs C++ «

     So why did I write the api above in C? Why didn't I use C++ since I prefer C++? I'm sure you can guess the answer: my callers or callees might be written in C. As long as basic interaction is defined in C, no C user is shut out.

     But I plan to use this api in C++:

external continuations «

     Note class cy::Ce means exactly the same thing as cy_c, but with convenient constructors defined. (Absence of constructors in C-based structs is irritating when code is verbose.)

     I might have named this class just cy::C, but one letter class names felt a little disturbing. So I appended e for external. This allows me to use i for internal below.

namespace cy {
  class Ce : public cy_c { // continuation external
  public:
    Ce() { c_fn = 0; c_me.who_ptr = 0; c_me.who_gen = 0; }
    Ce(cy_cfn cb) { c_fn = cb; c_me.who_ptr = 0; c_me.who_gen = 0; }
    Ce(cy_c const& x) { c_fn = x.c_fn; c_me = x.c_me; }
    Ce(cy_cfn cb, cy_who me) { c_fn = cb; c_me = me; }
    // same thing as cy_cfn_do():
    void fn_do(int err, cy_who you) {
      (*c_fn)(c_me, err, you); /* callback */
    }
  }; // class Ce
}; // namespace cy

internal continuations «

     Similarly, class cy::Ci means exactly the same thing as cy_who, but with convenient constructors defined.

     Why do I say continuation referring to just a who value? Because when an async request returns a Ci denoting the future you passed to a callback, it refers to a continuation of a request: the interior of a call, as opposed to external callers.

     Internal continuations don't need a cy_cfn function pointer: you already supplied it when calling a request! Only an external continuation needs a callback function pointer, because it hasn't been called yet. (In other words, once you're inside a request method, you don't need a function pointer to get there anymore. So internal continuations are just pointers.)

namespace cy {
  class Ci : public cy_who { // continuation internal
  public:
    Ci() { who_ptr = 0; who_gen = 0; }
    Ci(void* p, uint32_t g) { who_ptr = p; who_gen = g; }
  }; // class Ci
}; // namespace cy

cross language dispatch «

mixed runtimes «

     Did you notice callers and callees do not need to use the same runtimes? Internal and external continuations can denote completely different kinds of runtime. You can call back and forth between garbage collected and non-garbage-collected runtimes, for example, or between different kinds of virtual machine.

     I wrote this page to crystalize thoughts from this afternoon (09may2009) before I go deeper into code designs for async programming language runtimes.

     Several years ago, around 2000 or so, I thought I would represent code continuations in virtual machines in a way that labels a function address with the type of runtime intended to execute the code. This would let me explore multiple VMs at the same time, as well as call between interpreted s-expressions, byte-code compiled methods, or whatever else was used.

     After I started articulating this design to myself this afternoon, I noticed it resembled that earlier idea of typing code pointers. A cy_c external continuation representing a caller's callback has the same character: passing the who state for a function allows it to interpret what runtime should process the callback.

     Of course I'm glossing over details. (When a garbage collected runtime makes a request, does it know it should encode the continuation in a way that works when calling back into the gc runtime?) But in principle they just involve work.
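
     A toy version of the idea, with two pretend runtimes, might look like this. Runtime A keeps a plain C pointer in me, while runtime B keeps a handle into a table it owns (roughly what a moving collector might want). Every demo_ name is hypothetical, and the two "runtimes" are just two callbacks.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct cy_who_ { void* who_ptr; uint32_t who_gen; } cy_who;
typedef void (*cy_cfn)(cy_who me, int err, cy_who you);
typedef struct cy_c_ { cy_cfn c_fn; cy_who c_me; } cy_c;

/* hypothetical runtime A: me.who_ptr is a plain C pointer */
static void demo_native_cb(cy_who me, int err, cy_who you) {
  (void) you;
  *(int*) me.who_ptr = err + 100;
}

/* hypothetical runtime B: me.who_ptr holds a 1-based handle into
   a table the runtime owns, so nothing depends on an address */
static int demo_handles[2];
static void demo_handle_cb(cy_who me, int err, cy_who you) {
  (void) you;
  demo_handles[(uintptr_t) me.who_ptr - 1] = err + 200;
}

/* the callee never inspects c_me; it only hands it back */
static void demo_reply(cy_c caller, int err) {
  static int callee_state; /* stands in for the callee's identity */
  cy_who you;
  you.who_ptr = &callee_state;
  you.who_gen = 0;
  (*caller.c_fn)(caller.c_me, err, you);
}
```

     The callee runs the same line of code for both callers. Which runtime handles the reply is decided entirely by c_fn and by how that function reads its own me.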

stacking «

     The main reason for cy_c on this page was to unify the two sorts of continuation I normally use: descriptors for internal continuations and void* contexts plus function pointers for external continuations.

     Now one format kinda looks like both. This allows me to make the async dispatch style universal inside and outside an async library, so I needn't use wildly different runtime styles inside and outside; that's a total pain when writing simulations and other test harnesses. This allows me to define a thread model crossing library boundaries.

     An async thread is a stack of continuations. Presumably all other state is allocated outside continuation stacks.
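
     As a minimal sketch of what I mean, assuming a fixed-depth stack for simplicity, an async thread could be little more than an array of cy_c values. The names demo_thread, demo_push, demo_return, and demo_note are hypothetical, not an api I've committed to.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct cy_who_ { void* who_ptr; uint32_t who_gen; } cy_who;
typedef void (*cy_cfn)(cy_who me, int err, cy_who you);
typedef struct cy_c_ { cy_cfn c_fn; cy_who c_me; } cy_c;

enum { DEMO_DEPTH = 8 };

/* hypothetical async thread: nothing but a stack of continuations */
typedef struct demo_thread_ {
  cy_c   t_stack[DEMO_DEPTH];
  size_t t_top; /* number of continuations currently stacked */
} demo_thread;

/* push a continuation; returns 0 on success, -1 when full */
static int demo_push(demo_thread* t, cy_c c) {
  if (t->t_top == DEMO_DEPTH) return -1;
  t->t_stack[t->t_top++] = c;
  return 0;
}

/* "return" to the topmost continuation: pop it, then call it back;
   returns -1 when the stack is empty */
static int demo_return(demo_thread* t, int err, cy_who you) {
  cy_c c;
  if (t->t_top == 0) return -1;
  c = t->t_stack[--t->t_top];
  (*c.c_fn)(c.c_me, err, you);
  return 0;
}

/* hypothetical callback: append me.who_gen as a digit to the int
   that me.who_ptr points at, to record the order of callbacks */
static void demo_note(cy_who me, int err, cy_who you) {
  int* order = (int*) me.who_ptr;
  (void) err; (void) you;
  *order = *order * 10 + (int) me.who_gen;
}
```

     Because demo_return pops before it calls, a callback is free to push new continuations onto the same thread without the C stack growing, which is the trampoline flavor of dispatch.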

     I've been working on lightweight process api lately, and I started drafting the api on this page to represent async control flow, and ended up with a threading model instead.

     Presumably lightweight processes and async stackless threads are partly orthogonal. Process semantics are mainly about defining disjoint mutable address spaces. But for a process to do anything, it needs at least one async thread inside doing something. (Since interacting lightweight processes are all inside one heavyweight OS process, it might be possible to have async stackless threads that span lightweight processes—but this sounds disturbing somehow. Hmm.)

     When an async thread modifies no state, it might look like a process. So I might have a process spawning api for immutable activities, which just makes another async thread stack. Another kind of process spawning api would need to clone mutable state using a copy-on-write protocol.

07may09 identical snowflakes

censorship

     I censor myself, typically at least once a month, every time I post a piece with negative emotions. So if I always seem charming, you're duped by an obvious kind of sample bias you ought to expect: it's human nature to gloss over bad things. But this skews your observed data set.

     Almost by definition, whatever I cut is quite interesting if you like mean-spirited gossip.

withdrawal

     Briefly abstaining from fiction is giving me a bit of withdrawal. Apparently I was really enjoying it, because its absence is now a downer. I guess I should find the right balance: a little now and then, all the time.

     When I started writing fiction here, every new piece surprised me: parts turned out rather better than I expected, usually because I gave emergent ideas free rein to manifest. In short, I wrote tactically fun things that didn't follow directly from a strategy I had in mind. Writing was more fun than consuming canned media product. I guess it still is. But now I'm less surprised.

     By the time I wrote the last two pieces, I knew they were going to look about like that. I can get a result I like without much risk of painting myself into a corner and writing a piece of crap. This makes it more tempting.

     Most likely, I'll continue where I left off, with only minor discontinuities. Too much lively fodder is sitting there ready to be explored. The dialog skeletons with Aleph and Mick are too interesting, and something juicy might pop out.

     Parts of writing are like programming. In particular, conflicts are like programming. I can think of a dozen potential eddies in story current, but some directly contradict one another: if A happens then B makes sense, but now D is impossible. Can I still save D? Hmmm. This is exactly like coding when resolving conflicts.

     For example, I had planned to give Eli an interesting ability, one he was holding out. But then I realized as a side effect, he would be able to see inside Finch, at least a little bit. Well, we can't have that, unless Finch lays a geas on him to never think of doing that. But you would find that frustrating, wouldn't you? See, one of the goals to pursue in design is this: don't piss off the reader.

     Before I resume, I need to explore more of what Aleph and Mick want. To do a good job with characters, you need to know a lot about them, so they don't act against their own interests, which you see in retrospect when you learn more. (I hate when Hollywood writers screw up and write a script where a main character's behavior makes no sense once you learn what's happening.)

     What would tempt Mick to come back to work? He knows his skin is on the line. See, this is why I don't want to skip ahead too far. You'd be surprised how often I have the characters explicitly address what seems wrong to me at a given moment in the story. This is one of the reasons I like them: they're smart enough to analyze the story on the fly, unlike so many other dumb characters hell bent on pulling the next bone-headed maneuver in the plot.

04may09 tin foil hats

downtime

     I'm just reading a little fiction tonight. The dialog is usually good by this author, but a few non-dialog sections I have to skim, as an alternative to giving up entirely. But I prefer to read everything.

pruning

     I managed to squelch an urge to write a long new section a couple nights ago, mostly from Eli's view, in three different time frames. I plan to keep all three (because they're good). But if I hold them longer, I can connect pieces to others by salting topics, references, and questions in common.

     I guess I'm looking for a balance between two tensions. On the one hand, it drives me wild to sit on a story idea for six months or a year without writing any of it. On the other hand, if I don't wait at all, I prune inferior content less. If I let stuff stew a while, and I still want to write, it means I liked it more. Sloughing ideas is part of the process.

     I throw away at least half of various potential bits of business I have planned to fit in someplace. Often there simply isn't room: a conversation moves on before a remark can be said; going back loses momentum, so I cut it.

     I first read about a practice of pruning content many years ago. I saw it in an article about the works of Dr. Seuss. When asked about his process, he said after first drafts he threw away about two thirds, keeping just the best parts. I was amazed: it seemed like a high cost to bear. But ever afterward I kept it in mind: maybe culling weak stuff helps.

subtext

     Don't get the idea that the story is about the ring. That's part of the text. The story is actually about the subtext: what the story almost seems to be about without quite saying it explicitly. I'm trying to fit in as many things as I can, so there isn't a central theme. If I wrote multiple drafts, I'd go back and fold things together and add more subtext on a second pass. Obviously I like problems in trust and epistemology, but those are part of the text too.

coding

     I'm working on a new page about async calls, for methods that mimic system calls. The main page is about lightweight process api, starting from the idea that you might want to take the Linux system call api and find a place for some calls while discarding others. (For example, I don't need any support for file systems since the engine I have in mind will always be hosted in an OS with a file system.)

     Anyway, the process is weirdly interesting at the moment. It's fun to take a system call that has a meaning you might want to preserve, like exit(), and figure out how that fits into a runtime for an async programming language that can't actually exit the lightweight process at that point. Lots of interesting things like that. Fun so far.