31dec07
¶
finding patterns
pattern categories ¶ Several days ago I found Jonathan Tran's blog with good remarks about patterns in his 27sep2007 Methods to the Aha entry. Below I quote several parts of the same piece of Tran's writing, but out of order. Let's start with the best part: I've actually attempted to devise an algorithm for finding patterns before. The thing is, there are a seemingly infinite number of ways to categorize things. So if you try to categorize things and then look for something re-occurring, you'll never know if you're simply using the wrong categorization. Almost all my items from 23dec07 — particularly the section on taxonomy — are really about this problem. There are so many ways to categorize things, how do you know the right way to categorize? You don't. (And confirmation bias suggests any categorization you try is plausible.) Every scheme you choose tends to hinder seeing others, unless you're prone to lateral thinking, which helps. I'll quote Tran more below. First let's look at the work of John Wilkins in the 1600's for an example of wonky categories. I first looked up Wilkins in 1980 when I was studying ontologies for artificial languages, but the technique of Wilkins left me aghast. Wilkins intended an analytical approach to universal language (a kind of precursor to Esperanto centuries later, you might say) but it was undermined by very poor orthogonality in his categories, which today read as surrealism, or perhaps comedy. There's a famously funny comment on Wilkins' work by Jorge Luis Borges, in his The Analytical Language of John Wilkins containing the following often quoted passage: These ambiguities, redundancies and deficiencies remind us of those which doctor Franz Kuhn attributes to a certain Chinese encyclopaedia entitled 'Celestial Empire of benevolent Knowledge'. 
In its remote pages it is written that the animals are divided into: (a) belonging to the emperor, (b) embalmed, (c) tame, (d) sucking pigs, (e) sirens, (f) fabulous, (g) stray dogs, (h) included in the present classification, (i) frenzied, (j) innumerable, (k) drawn with a very fine camelhair brush, (l) et cetera, (m) having just broken the water pitcher, (n) that from a long way off look like flies. Borges' paragraph right before this one is also very good, but I felt I had to choose one or the other. Better to cite the following instead from the same piece, in which Borges states the issue in terms closest to those worrying Jonathan Tran: I have registered the arbitrarities of Wilkins, of the unknown (or false) Chinese encyclopaedia writer and of the Bibliographic Institute of Brussels; it is clear that there is no classification of the Universe not being arbitrary and full of conjectures. The reason for this is very simple: we do not know what thing the universe is. Since I agree with Borges (not that I'm anybody) you can see why I roll my eyes when folks boosting the "semantic web" get all keen on the idea of categorizing all information in some canonical way, to render all online data rational enough for automated processing. You might as well tattoo serial numbers on everyone while you're at it, if you'll ignore all minority reports. Sorry, that had a little soapbox tone, didn't it? Well, from Borges' following remark, you'd think he forgave them in advance: The impossibility of penetrating the divine pattern of the universe cannot stop us from planning human patterns, even though we are conscious they are not definitive. seeing patterns ¶ Let's get back to Jonathan Tran's Methods to the Aha entry, so I can quote more random remarks I agree with, though they often seem different topics to me. It's just interesting to see so many insightful comments on patterns in one place. But I guess the name of his blog explains it. 
Tran notes a distinguishing characteristic of smart folks: According to Jeff Hawkins, you have to see a pattern before you can learn anything. The people who tend to be seen as smart are the ones who pick up new things quickly, and they can do this because they see the patterns quicker than others. But it is a completely unconscious activity to them. Yes, that's the attribute I find interesting and telling myself. (To others, fast talkers seem smart, but saying things fast needn't correlate with being very clever.) Fast and unconscious association of related ideas is very productive. It also seldom occurs as a result of explicit linear thinking. Instead, associations bubble up as "obvious" insights from some strata of the mind operating below a conscious threshold. It's one of the clearest signs our minds "think" at a level below conscious perception: we get unsolicited telegrams. So this leads us to our next question. How do you become good at creating analogies? I'm not sure making analogies is a learnable skill. If you're naturally good at them, you offer so many in your teens other folks tell you to stop doing it. Folks of a more literal frame of mind detest analogy because it doesn't seem factual, as if comparison of similar things is not evidentiary. But I think you can learn to make more analogies, no matter how good you are naturally, by asking the right questions all the time: what is this like? What am I reminded of? Is there another system isomorphic to this one? But all the questions assume you want a general explanation, not a specific one. This begs the question though, is there another way of finding patterns? Can we find patterns without first categorizing things? Yes, but there's definitely tail-chasing, chicken and egg aspects to the problem, as you try to group similar things inductively while testing what you see against plausible potential propositions. The first and most important step is to ask questions, which you're doing. 
It helps to imagine combinatorial permutations of options that pop into your mind. But you're somewhat constrained by whether your mind does this naturally without any prompting. Sometimes my unconscious is clearly performing a kind of exhaustive search through some space, and I'm only seeing the more interesting hits in the iteration. What you can do consciously is make the space bigger, by thinking of more dimensions in which the search might occur. But of course, any new dimension that occurs to you will appear — yes, that's right — as an unconscious insight. Wanting the insight is a good way to prime the pump though.
30dec07
¶
zoning restriction
quotes ¶ Here I'll cite a few fun quotes found on Bill Clementson's blog in his post about fortune files in Unix. This is in lieu of writing many paragraphs about creativity and what-not I thought I'd find time for at noon today. I'll restrict myself to a couple of short items from Clementson's list of fortunes. I can't understand why people are frightened of new ideas. I'm frightened of the old ones. -- John Cage That really sums it up for me: many conventional ideas give me the willies when I think about what they permit or restrict. If the status quo doesn't frighten you, god bless you. Here's where you'd hear me rustling papers if this was a radio show. Let's see, okay here's the next one: Lisp is a language for doing what you've been told is impossible. -- Kent Pitman This one explains a lot of my motivation for using Lisp for some internal purposes: when I run into something that "can't be done" I can say up yours and do it anyway, ignoring arbitrary limits. in the zone ¶ I'm in the zone and I don't want to stop coding. I worked through the holidays instead of taking any vacation, plus this weekend and last one too. It's all work for my day job. I only let it encroach on my life like this because it's what I need. This first version of an event based system I'm polishing is basically on my critical path for doing things in future programming language work, if I want to start competing with Erlang style runtimes, which I do. Some aspects of designs I fleshed out became ugly when born, bolstering the adage that you should plan to throw one away. For example, I knew C didn't seem like a good idea, but it turned out significantly more verbose than C++ to do everything. And I gave in when pressured to permit "file descriptor" style integer access to some objects managed by refcounted handles under the covers, causing weirdly awkward constraints and extra code. Though it'll work well enough, I see a simpler version of event management hiding in there. 
So I'll redesign it all when I get around to a version for my purposes, hopefully much smaller. happy birthday ¶ Here's wishing another happy birthday to a new teen. And a belated happy birthday to Luther Huffman, who I was tempted to ping just after Christmas. (But I was working all this last week, and had a spectacularly bad cold besides — I guess the flu since it caused a fever in addition to other symptoms.) Many happy returns. Of all folks I talked to on my old site, Luther Huffman was the main one who actually gave me ideas. These days I go to trouble to avoid email contact because I learned from years of experience on my old site that odds of value to me in exchange were fifty to one against. Too long of odds without a lot of spare time.
23dec07
¶
signal categories
taxonomy ¶ Yesterday I wrote several tiny pieces related to Steve Yegge's Code's Worst Enemy without ever getting to the topic I really wanted to talk about, which is the underlying issue of whether implicit categories in language actually constrain categories in your thought. It might; let's consider. If you're unaware of a means of categorizing parts of a process you see, what makes you notice a category? If words in your language lack detail and taxonomy for a category that's relevant, what will make you form useful perceptions? You might excel at making new kinds of classification. Maybe you're a wizard at induction, and you find new patterns as quickly as they show themselves. Or more realistically, when you're tired, distracted and/or busy, you're most likely to use the same old buckets to group ideas you always do. Taxonomies you use for normal experience act as blinders stopping you from seeing other ways of slicing and dicing reality. By default you see business as usual. When you make mistakes of omission — failing to respond to new conditions — the cause is often rooted here: you project old schemas. As a programmer, have you ever had an experience while debugging code like this one: while reviewing code, you miscategorize the effect of some instructions, and then later find yourself unable to see the actual effect when it's crucially relevant? Painful when that happens, isn't it? This is actually what I mean any time I talk about language constraining thought. I don't literally mean the language you speak. I really mean the patterns in how you think, since those patterns — when wrong — tend to cause grief. You must ask more questions: is what I believe actually true? Asking questions puts you in a receptive state necessary to see a new thing. Most of my worst errors in life were due to not asking the right question, since if I had merely asked, I'd have seen evidence at hand, plain as day if I just looked. 
Surprises are there, whether or not you look — 'tis best to see. hypothesis cliché ¶ Programmers interested in effects of habitual thought on coding practice often invoke the Sapir-Whorf hypothesis, as I did yesterday. For example, two years ago Reg Braithwaite wrote about signal-vs-noise in never ending language debates: I believe that programming is an idiomatic activity. We learn idioms and then apply a kind of pattern-matching to recognise problems that can be solved with an idiom we already know. Some idioms are easier to express than others in each programming language. (Does the Weak Sapir-Whorf Hypothesis apply to programming?) Of concern is the way a programming language models reality in computing systems. A language supports a way of simulating executable effects, and you use these mechanisms to get all the effects you need. Following programming language rules can gently make you suppose the rules are real — that a computing system might not work differently. It's a bit like watching a movie, where you suspend disbelief to follow the plot of a story. While immersed in drama, you're willing to hold rather odd things as true for the sake of the story. If you pretend strongly enough, you believe tenets of a parallel world, and put away your common sense until later. However, you could love a story so much, you carry it home with you and wear it like rose-colored glasses, filtering your perceptions. And then there's dogma, which pretends not to be just a story. We have lots of dogma in computing. thinking differently ¶ A week ago I sought a story I read twenty-five years ago whose punchline was, "Think in different categories." My memory of the story was: a researcher was intrigued by reputations of psychoactive drugs for inducing creative ideas, and dosed himself with some while armed with note paper. Coming back to his normal senses afterward, he found he'd only written a single sentence: "Think in different categories." 
Somewhat to my surprise, I was able to find a citation for this story. And the account is so close to the one I recalled I knew it must be the one I'd read. (Note I've no interest at all in psychoactive chemicals — I just find the theory about creativity interesting. I think you should too.) John Raithel writes in Storming the Kingdom of Heaven about P. D. Ouspensky's classic experiment. (Note Ouspensky died in 1947, well before the drug-happy 60's.) In a now-classic example of the use of drugs and perhaps their ultimate utility, P. D. Ouspensky relates how, after an experiment of his own, in which he tried desperately to convey something to himself from the higher state he was experiencing, he had managed to write down one phrase. The next day he read on the paper "Think in different categories". It is not a bad idea, in fact a good idea, but just another idea, with no particular power of its own for us, no power at all like the power that has made one read this far, the power that results from somehow knowing there is much more, and that leads one to look for a way to it. Instead of ending this piece right here, please consider quotes below found while searching for related material. (Isn't it fine how one can use Google these days for unusually productive free association?) See if you cannot find a relation between this material and Steve Yegge's objection to Java programmers—for example—blindly following herds. In his 2000 Creative Agnosticism (Center for Cognitive Liberty and Ethics, Vol. 2, Issue No. 1 pages 61-84) Robert Anton Wilson got my attention with these comments about Ouspensky: [Ouspensky] realized that "normal" consciousness is much like hypnosis indeed. People in a trance will do what they are told—even if they are told to march into battle against total strangers who have never harmed them, and attempt to murder those strangers while the strangers are attempting to murder them. 
Orders from above are tuned-in; the possibility of choice is not-tuned-in. Obviously likening normal consciousness to the susceptible state of hypnosis is interesting, if it helps explain why people conform to expectations despite bad consequences. It suggests the idea of viral meta programming (which Alan Kay might call goal cloning.) But the next paragraph is even better in the way it relates awareness, choices, models, creativity, speed of revision, and apprehension of signals. Static views filter perception: To use the brain efficiently — to be aware of where one is and what one is doing and what is going on around one, and to take responsibility for one's bets or choices — seems to increase "intelligence" and "creativity." That is hardly a surprise. Whatever our technical definitions of these mysterious functions, it is obvious that they are somehow connected with the number of signals consciously apprehended, and with the rapidity of the revision process. When one model is held statically between ourselves and experience, the number of signals drops, no revision occurs, and "intelligence" and "creativity" correspondingly decline. When many models are available, and when we are consciously involved in our choices, the number of signals consciously apprehended increases, and we behave more "intelligently" and "creatively." Finally, I think it's worth your while to read the longer quote below from Robert S. Root-Bernstein's 1999 Music, Creativity and Scientific Thinking, which posits some kind of causal relation between breadth of knowledge and experience in polymaths with their obvious creativity, presumably due to multiple choices in perspective. This is not to equate having multiple interests or skills with creativity; it is not simply that the people I have described are multi-talented, or polymathic. Their talents are correlated in such a way that they interact fruitfully. I stress the fruitfulness. 
Creativity comes from finding the unexpected connections, from making use of skills, ideas, insights and analogies from disparate fields. Thus, my concept of correlative talents and its own correlate, synosia, help explain for me why true creative ability is so rare. Of the set of multi-talented people, who are in turn a subset of all the people who are singly talented, only some will develop the necessary integration of thinking modes necessary to make their talents interactive. This isn't simply polymath adoration. He makes the point that integration of multiple perspectives is the key, and not just presence of variety in field expertise. The more you already see connections between things, the more you see new connections. (This is also a byword among cognitive psychologists, who feel you learn more easily the more you already know, through more numerous associations.) It is my belief, after many years of study, that those who do develop interactive or correlative talents often do so because they have a predisposition--learned or innate or a combination of the two, I cannot tell--to view their intellectual world globally and holistically. Thus, the view I have just given of music as a manifestation of thinking, rather than as an independent type of thinking, is colored by my interest in these polymaths and by my particular theory of creativity as being an integrative, transformational process. I read his globally and holistically category as essentially the same as intuitive mapping. A view of creativity as integrative is rather interesting in so far as it contrasts with commonplace notions of creativity as wild and/or unpredictable divergence. Apparently, when a person correlates disparate ideas because they are actually related, this looks a bit like a random walk to an observer using a different map. Presumably — given my choices of material here today — there are two different ways to stimulate creative thinking. 
You can dose yourself with a drug that jangles your normal hardwired thought process enough to see new paths. Or you can already be so immersed in strong integrative impulses to connect ideas that you jump between "normal" paths as a matter of course. How much energy do you need to get out of a normal rut? For some folks, thinking releases energy enough to break bonds.
22dec07
¶
inventing futures
Although each section is related, I won't spell it out each time, to give you something to do. (Yeah, the transitions are kinda jumpy, sorry.) thought channels ¶ Every section below is related: how you represent information can channel or bias thought. Any encoding system favors some ideas — expressions are short and brief when you say exactly what a vocabulary supports. But say a new thing, and use more words. Of course, this is related to programming languages. This week, everyone is responding to Steve Yegge's Code's Worst Enemy essay, where he says: I happen to hold a hard-won minority opinion about code bases. In particular I believe, quite staunchly I might add, that the worst thing that can happen to a code base is size. [bloat] Steve's main message is: verbose languages and/or coding styles make big programs, and big is bad for several reasons paralleling the way large government bureaucracies have soul sapping excess process. Big code systems harbor bureaucracy, which sucks. What to do? Steve says use better languages saying more in fewer lines. A good idea, but possibly flawed: you can always add bureaucracy, even if a language doesn't. (A better language won't automatically mean systems can't sap your soul anyway. But less unavoidable inbuilt baggage is doable.) Some programming languages have a community with a culture of verbosity and an ethos of big-is-good encouraging bureaucracy. Smaller is better. But it might require systems have less standardization. point of view ¶ Alan Kay's 1989 Predicting The Future (Stanford Engineering, Volume 1, Number 1, Autumn 1989, pg 1-6) contains the following useful observation on the effect of representation and notation on the ability of programmers to code effectively. At PARC we had a slogan: "Point of view is worth 80 IQ points." It was based on a few things from the past like how smart you had to be in Roman times to multiply two numbers together; only geniuses did it. 
We haven't gotten any smarter, we've just changed our representation system. We think better generally by inventing better representations; that's something that we as computer scientists recognize as one of the main things that we try to do. harder vs smarter ¶ At Apple in the 90's I had coworkers who hated a goal of working harder instead of smarter, when it was clear someone wanted to dramatize effort instead of minimizing cost for better productivity. Furious effort can look productive, but it can be much less so than careful, precisely targeted effort. Doing just the right thing, though, is hard to compare with past experience. The novel parts look new (duh). But often commercial coding is part theater: many folks prefer grading on effort to actual results. information theory ¶ Basic information theory says a message informs only to a degree it's unlikely the message occurs. A message occurring constantly says little since you've already heard it. Very informative messages are therefore strange, or at least unexpected. So highly expressive (saying something new or significant) coding styles must have a high novelty factor in source code: bits must be unexpected not to repeat what came before. Brevity must look weird, or novel, or it's not saying more new info in fewer bits. code well ¶ Responding to Yegge's essay on code size, Reg Braithwaite said in school he found a new design pattern which didn't repeat the usual ones: When I gave it to the prof, his first reaction, on reading the code, was ... utter bafflement. He and another TA actually went over what I wrote in an attempt to find and catalog the GoF patterns that I'd used when coding the application. Their conclusion after a fairly thorough review was that my main pattern was, "Code well." So apparently academics say they prefer code composed of clichés (famous patterns) even though cliché-ridden style is poor practice in fields other than coding. Are they serious? 
If so, do they reward rote regurgitation and not original thought? You can rewrite concise code as pattern-spewing verbose expression by treating standard patterns as a sort of virtual machine. Instead of using the low level machine as the simulator, you can use a pattern language as a higher level simulator. But such a rewrite would make code longer, without adding any clarity. The novel part — achieving the task — is not reduced by swapping the vocabulary used. So using patterns might simply bloat expression. Here's a game: take a short clear sentence in plain English prose, and perform successive substitutions using recursive grammatical rules to make shorter phrases longer, but with the same meaning. If you do it as many times as you like, how is clarity affected? representational ¶ An adj definition at yourdictionary.com says:
The first meaning is the one I normally have in mind, but the second is closely related. In either case, the idea is consistently one meta level above representation. It means: about representing things. The Representational_State_Transfer (REST) Wikipedia entry shows this word was also well known in computing after Roy Fielding's 2000 dissertation. Representational State Transfer (REST) is a style of software architecture for distributed hypermedia systems such as the World Wide Web. The terms "Representational State Transfer" and "REST" were introduced in 2000 in the doctoral dissertation of Roy Fielding, one of the principal authors of the Hypertext Transfer Protocol (HTTP) specification. The terms have since come into widespread use in the networking community. In the context of REST, the word representational means about representing resources. So my usage doesn't seem so weird in retrospect, in 2003. There's another meaning for representational from Art History (cf second listed at yourdictionary.com), which Tate Online explains as follows: Blanket term for art that represents some aspect of reality, in a more or less straightforward way. The term seems to have come into use after the rise of modern art and particularly abstract art as a means of referring to art not substantially touched by modern developments. That's consistent with my interest in representing data in a manner preserving similarity to an original domain, when possible. For example, in the context of languages, as a young man I pursued the idea of literal mimicry in some spatial encodings. And even today I prefer to keep structure when possible in data modeling. In this context representational might mean "not recoded arbitrarily". I heard the word used in this sense for many years during the 80's while my ex wife was getting a PhD in Art History at UC Berkeley. (I was her sounding board while theorizing about complexity in social patterns of the 17th century and the effect on northern art.) 
Now let's segue to the Sapir-Whorf hypothesis. sapir-whorf ¶ I've been interested in something called the Sapir-Whorf_hypothesis since around 1980, when I started designing a spatially oriented language. (Not a programming language: something general.) The hypothesis postulates that a particular language's nature influences the habitual thought of its speakers. Different language patterns yield different patterns of thought. This idea challenges the possibility of representing the world perfectly with language, because it acknowledges that the mechanisms of any language condition the thoughts of its speaker community. Here you can see I've returned to territory related to Steve Yegge's essay, if you apply the Sapir-Whorf hypothesis to how Yegge feels programming languages impact programmer productivity. The language used by a programmer influences what a programmer thinks. (Not determines... just influences.) Despite criticism of his hypothesis as monocausal and deterministic, Whorf sought to insist that thought and action were linguistically and socially mediated, and not monolithically determined. In doing so he opposed what he called a "natural logic" position which held, according to him, that "talking, or the use of language, is supposed only to 'express' what is essentially already formulated nonlinguistically". Obviously the way information is presented has an effect on how people think about the information. Even if you never followed Edward Tufte's theories or graphical user interface arguments, this idea must be obvious to most tech folks by now. But in 1980 I pursued an idea that a language with a spatial basis might less distort reasoning that was itself grounded in space. (In the early 90's I worked on a graphical programming language environment, on a project that let me pursue a tangent of my early 80's interests. But graphical programming languages are even more verbose than text-based programming languages.) 
iverson apl ¶ The Sapir-Whorf_hypothesis Wikipedia entry also has a section on Ken Iverson's feeling the hypothesis related to programming language design: Kenneth E. Iverson, the originator of the APL programming language, believed that the Sapir-Whorf hypothesis applied to computer languages (without actually mentioning the hypothesis by name). His Turing award lecture, "Notation as a tool of thought", was devoted to this theme, arguing that more powerful notations aided thinking about computer algorithms. From his 1979 ACM Turing Award Lecture Notation as a Tool of Thought (pdf), Iverson says: The importance of nomenclature, notation, and language as tools of thought has long been recognized. In chemistry and in botany, for example, the establishment of systems of nomenclature by Lavoisier and Linnaeus did much to stimulate and to channel later investigation. However, Iverson's opinions on the importance of programming language notation might be slightly undermined by the reputation of APL as a highly obfuscated programming syntax: From a user's standpoint, the additional characters can give APL a special elegance and concision not possible in other languages, using symbols visually mnemonic of the functions they represent. Or it can lead to a ridiculous degree of complexity and unreadability, typically when the symbols are strung together into a single mass without any comments. Or it can be unreasonably difficult and time consuming to enter then later edit those APL statements. This last paragraph illustrates the basic problem with notation that is effective at minimal expression. If you know a notation, it can express ideas very efficiently in a small amount of space — provided you get everything. But any parts not perfectly clear to you can seem very obfuscating. Readability of a language depends a lot on the training and experience of programmers. 
For any given programmer, the ideal language suits what a programmer wants to say with a minimum of notation, but without imposing any other notation than the programmer wants to use.
Entries appear in reverse chronological order. Content here is permanent: Each entry has a permalink (¶) to the long-lived persistent copy here. Clearly, to link anything, you'd best link the permanent copy.
16dec07
¶
spook country
arctic circle ¶ I'm still deep in an arctic circle of seven-day-a-week work winter, and it's endless night in the sense I seldom break from work code longer than I need to slough off fatigue. If I wasn't writing this after midnight on Sunday, I'd be coding. :-) Before I note my shared memory style, consider something else. fiction ¶ I'm always reading something. Today it's William Gibson's Spook Country, which got off to a slow start, but now shows his trademark interpersonal networks of complications. But lately I only read when taking a break. (I'm working all the time on my day job, even when I'm at home.) A few days ago I finished Orson Scott Card's A War of Gifts, which was quite good, despite being very short. And before that I finished re-reading Neal Stephenson's Confusion for the third time. My first edition copy of Confusion is signed by Stephenson, or purports to be signed by him, since I didn't see it done. (I found it signed and labeled as such in the bookstore. So, did Stephenson guerilla-sign it in passing? Or did one get shipped in each batch to wherever?) I'm always deeply impressed by Stephenson's story telling ability, and have to fight the urge to cite passages in his works that illustrate remarkable command of technique. type safety ¶ My code using shared memory in C is taking longer than I planned, mainly since it's more involved than expected. Thus I'm working all the time in pursuit of deadlines. Some extra effort goes into making code type safe, which is possible but tedious in C, because everything must be done by hand. Every kind of data that means something different goes in another uniquely typed struct. Sometimes all my values are integers, just wrapped up in specific hierarchies. I can't put pointers in shared memory when a segment can be mapped in different spots in each process address space. So each "pointer" value is just an integer offset from segment start. But to make them typesafe, I define a struct wrapping each offset. 
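Here is a minimal sketch of what such wrapped offsets could look like in C. It is only an illustration under assumptions: the names Foo, Bar, FooOff, BarOff, Segment, foo_ptr, bar_ptr and typed_offset_demo are made up for this example, two heap buffers stand in for one shared segment mapped at different addresses, and offsets are assumed to be suitably aligned by whoever produced them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record types stored inside the segment. */
typedef struct Foo { int value; } Foo;
typedef struct Bar { double weight; } Bar;

/* One wrapper struct per target type. Each holds only an integer offset
 * from segment start, but the two struct types are incompatible in C. */
typedef struct FooOff { uint32_t off; } FooOff;
typedef struct BarOff { uint32_t off; } BarOff;

/* A mapping of the segment: base may differ per process, offsets do not. */
typedef struct Segment { unsigned char *base; uint32_t size; } Segment;

/* Turn an offset-to-Foo into a pointer valid in this mapping only.
 * Returns NULL when the object would not fit inside the segment. */
static Foo *foo_ptr(const Segment *seg, FooOff at) {
    if (at.off > seg->size || seg->size - at.off < sizeof(Foo)) return NULL;
    return (Foo *)(void *)(seg->base + at.off);
}

static Bar *bar_ptr(const Segment *seg, BarOff at) {
    if (at.off > seg->size || seg->size - at.off < sizeof(Bar)) return NULL;
    return (Bar *)(void *)(seg->base + at.off);
}

/* Returns 0 when the same offsets resolve to the same data through two
 * different base addresses, and an out-of-range offset is refused. */
int typed_offset_demo(void) {
    unsigned char *m1 = calloc(1, 64), *m2 = calloc(1, 64);
    if (!m1 || !m2) { free(m1); free(m2); return -1; }
    Segment s1 = { m1, 64 }, s2 = { m2, 64 };
    FooOff foo = { 16 };
    BarOff bar = { 32 };

    foo_ptr(&s1, foo)->value = 7;
    bar_ptr(&s1, bar)->weight = 2.5;
    memcpy(m2, m1, 64);  /* stand-in for a second process seeing the same bytes */

    int ok = foo_ptr(&s2, foo)->value == 7
          && bar_ptr(&s2, bar)->weight == 2.5
          && foo_ptr(&s2, (FooOff){ 62 }) == NULL;  /* would run past the end */

    /* bar_ptr(&s2, foo);  <- rejected at compile time: FooOff is not BarOff */
    free(m1);
    free(m2);
    return ok ? 0 : 1;
}
```

Calling typed_offset_demo() from a main function exercises the sketch. The cost of the approach is visible even at this size: every new stored type needs its own wrapper struct and its own resolver, all written by hand.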
So an offset to type Foo goes in a struct named by convention to mean offset-to-Foo. The C compiler won't let me mix pointers to these with offset-to-Bar, so my integers are typesafe as pointers would be. Of course that's not safe enough, so I also check everything that might be wrong, when I can. I check objects to see if they contain correct magic signatures, and whether their offsets fall inside the range where they live, and whether every other part of object lifecycle is observably on track. This would be slow, except I check things typically already recently touched in a cache line. And I'm careful about cache lines. cache lines ¶ The speed of the server I'm writing is mainly constrained by how much I touch memory, and whether my access pattern has good locality or not. A few years ago I didn't think about cache lines in processors very much. Now I think about them all the time. All my space layout reasoning is affected by whether I think one choice has better cache line behavior than another. I also write code with cache line behavior in mind. So when A calls B I try to arrange B immediately follows A when I can. Other minor tactics sound similar. When I arrange member variables in structs, I try to place fields used earlier in front of fields used later, so when memory is loaded earlier there's a better chance fields after will also end up in the same cache line as other content pre-loaded by the processor. I put fields next to each other when used together, if I can. I avoid lists as much as possible. (But often lists are the simple first version of something.) Allocation pools, in particular, are stored contiguously as much as possible, instead of in lists. When I pre-allocate an entire population of objects, I also allocate the circular array of free references at the same time. So recently used parts of a free pool are typically already in a cache line. 
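The typed offsets and sanity checks described under type safety might look roughly like the sketch below. It's only an illustration under stated assumptions: the names Foo, Bar, FooOff, BarOff, Segment, FOO_MAGIC, foo_init and foo_at are all hypothetical, offset zero is assumed to be reserved as a "null" reference, and a calloc'd buffer can stand in for a real mapped segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FOO_MAGIC 0x466F6F21u /* hypothetical signature stamped into every Foo */

typedef struct Foo { uint32_t magic; int32_t value; } Foo;
typedef struct Bar { uint32_t magic; int32_t weight; } Bar;

/* One wrapper struct per target type: both hold a plain integer offset,
 * but the compiler treats FooOff and BarOff as unrelated types. */
typedef struct FooOff { uint32_t off; } FooOff;
typedef struct BarOff { uint32_t off; } BarOff;

/* This process's view of the segment: where it is mapped, and how big it is. */
typedef struct Segment { unsigned char *base; uint32_t size; } Segment;

/* Nonzero when [off, off+len) lies inside the segment and off is aligned.
 * Offset 0 is treated as null (an assumption of this sketch). */
static int span_ok(const Segment *seg, uint32_t off, size_t len, size_t align) {
    return off != 0 && off % align == 0 &&
           off <= seg->size && seg->size - off >= len;
}

/* Stamp a Foo at offset `at`; returns a null reference if `at` is unusable. */
static FooOff foo_init(Segment *seg, uint32_t at, int32_t value) {
    FooOff ref = { 0 };
    assert(seg != NULL && seg->base != NULL);
    if (!span_ok(seg, at, sizeof(Foo), _Alignof(Foo))) return ref;
    Foo *foo = (Foo *)(seg->base + at);
    foo->magic = FOO_MAGIC;
    foo->value = value;
    ref.off = at;
    return ref;
}

/* Turn an offset into a pointer valid in this process only, after checking
 * range, alignment, and magic signature. Returns NULL when any check fails. */
static Foo *foo_at(const Segment *seg, FooOff ref) {
    if (!span_ok(seg, ref.off, sizeof(Foo), _Alignof(Foo))) return NULL;
    Foo *foo = (Foo *)(seg->base + ref.off);
    return foo->magic == FOO_MAGIC ? foo : NULL;
}

/* Passing a BarOff where a FooOff is expected does not compile:
 *     BarOff b = { 64 };
 *     foo_at(&seg, b);    // error: incompatible type
 */
```

A caller would fill a Segment with the address and size of its own mapping, then call foo_at each time it needs a real pointer. Each process resolves the same FooOff against its own base, so nothing stored in the segment depends on where it happens to be mapped.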
Allocation by pointer bumping is always preferable, since the new pointer is likely to point to space already in a cache line. pre-allocation ecology ¶ As you'd expect, when I store dozens of different types of things in shared memory, I need to work around the fact I've pre-allocated all the space I'm going to get already. A given shared memory segment is a single, contiguous, ungrowable heap I use to allocate everything that goes inside. (And I can't use pointers, just offsets relative to segment start.) It took me a while to see what coding styles in C are good practice in this context. But for several days I had to think about everything. In C++ I would just do it, almost as fast as I can type and review what I've typed. But this shared memory code in C required a lot of thinking. Eventually I found the pivot points around which everything else turns fluidly. In C I have to use a lot more names, so I worked out naming conventions to show how several related names are associated in specific relationships. But this was time consuming anyway, because we're talking about a lot of names. Every key, value, pair, and map (and pointers to things referenced by segment offset) needs its own name. And most of these things have relationships to other keys and values in other contexts. So in shared memory code, I have a big ecology of conventions. 09dec07
¶
dancing angels
grinder ¶ I'm still being a grind; I spent most of today working on a Sunday at home, coding async events in C for my day job. I skipped my mid-week update here on purpose to focus on work. Lately I've been putting about 25% of my coding time into print methods, so everything will print in pseudo XML. It often dramatically improves debug times when I see state presented exactly the way I want to view it. creativity ¶ If I gave myself much time to write here, I'd probably write most about two recent links on creativity:
These are both good and worth reading. Asimov's treatise is basically inspired and well worth your time to understand. Since I think of myself as a creative person, I look to fun items like these two for ways to explain how I think. Normally I'd write a few hundred words to make all my points; instead I'll comment briefly. My creative thinking ability is at least as many standard deviations high as my intelligence -- maybe higher -- and explains variance from norms better than IQ does. Over decades I've read about creativity and studied how I think when I seem effective, looking for clues. In Carter's terms, juxtapositional thought is my normal mode: Perhaps the one word which best sums up the faculties so easily blown away by stress is juxtaposition. The ability to bear more than one thing in mind, compare and contrast their structures. Recognizing this, we can see that it's not just the ability to handle code that is affected. Without juxtaposition, cost/benefit analysis becomes very difficult, and indeed we frequently see people spending weeks implementing "optimizations" that cost more than they save by orders of magnitude. To juxtapose ideas, it helps to know a lot, and to think in terms of how things relate to one another. Intuitive folks are often mappers (c2.com wiki), as opposed to packers: Mapping is the kind of learning you do when, after you pick up some information, you sit and think about it in an effort to simplify the way you think about it (i.e., simplify your mental map). Mappers are the world's great thinkers: they are the inventors, the scientists, those who think and control. Through school, many folks thought I had photographic memory because my conversion to long term memory was high (about 99% of anything I noticed). Asimov notes creative folks must have many bits of information: 1) The creative person must possess as many "bits" of information as possible; i.e. he must be educated.
But as Asimov continues his analysis, he eventually reaches a more important topic: creative folks often seem odd because they have the courage to reveal the result of their creativity: It takes courage to announce the results of your creativity. The greater the creativity, the greater the necessary courage in much more than direct proportion. Asimov's point is mainly that being creative requires permission (granted by yourself or by others) to think unconventional ideas. But when you're in the habit of granting yourself such permission, it makes you seem odd to others. At the very least, it means you are not as responsive to social pressure, if you show it. Usually a man who possesses enough courage to be a scientific genius seems odd. After all, a man who has sufficient courage or irreverence to fly in the face of reason or authority must be odd, if you define "odd" as "being not like most people." And if he is courageous and irreverent in such a colossally big thing, he will certainly be courageous and irreverent in many small things so that being odd in one way, he is apt to be odd in others. Creative thinking is so commonly punished as social deviance (not following the herd in lock step) that you learn to stop showing you do it, even when you think outside the box all the time. This gives an impression you have ideas on demand when folks ask for your input. But it's simply being willing to state your ideas out loud. It's a matter of permission to speak, not to think. intuition ¶ Let's skip Asimov's section on intelligence since we take it for granted in tech circles. Consider Asimov's take on intuition instead: 3) The creative person must be able to see, with as little delay as possible, the consequences of the new combinations of "bits" which he has formed; i.e. he must be intuitive.
As far as I can tell, intuition is the limiting factor in creative thinking, because it's less common than intelligence, as Asimov notes: Now of the three criteria mentioned so far, I feel (intuitively) that intuition is the least common. It is more than that none will be intelligent or none educated. If no individual in the group is intuitive the group as a whole will not be intuitive. You cannot add not-intuition and form intuition. Part of intuition involves seeing more than one thing at once: what Carter calls juxtapositional thought. This is easier if you think in a non-linear way, which in turn is easier if your primary mode of thinking does not use words, which often encourage linear order. If you think spatially, and especially if your memory is partly photographic, you can juxtapose many possibilities at once in your mind's eye, like folding a map to bring different parts together. Or in some cases, it's like flashing through piles of mental transparencies on a desktop or projector. The more things you can imagine, the more different things you are willing to see, and the less likely you are to believe any particular case pertains in one situation when many options are present. The better your imagination, the more you test ideas to see which are true in a given situation. Intuition provokes analysis. Carter eventually relates juxtapositional awareness to the idea of falsifiability as a way to filter options: ... people thinking in a critical way about their work, looking at it deeply and holistically, and verifying what they believed to be true by attempting to prove their knowledge to be false. This idea of falsifiability is the core of the scientific method. An idea is of no value unless it makes statements which can be tested, and so shown to be false if they are wrong. Lack of intuition causes blindness. A person who thinks just one thing can be true won't see other options. Prior belief is prophylactic against insight.
first class functions ¶ This week at Lambda the Ultimate folks argued about how many angels can dance on the head of a pin. No, actually most of the comments are about the "definition" of what would make a first class function. But the result is just as tedious and pointless as counting angels. I find such arguments pointless when the result cannot change any decisions you make about anything. The discussion resembles sport: game playing for points without any real world consequence. However, several folks made correct observations, so I'll quote them here. (In the past I'd have posted a "me too" on LtU instead.) Andreas Rossberg notes C has function pointers that are first class, but not the functions themselves: Yes, function pointers are first-class -- but they aren't functions. Anton van Straaten gets top marks as usual in citing context as a critical factor in relevance: Saying "language X doesn't have first-class functions" is a particular statement that's interpreted according to its context. My point is that your definition relates to one such context, but that the term is also usefully used in other ways, and that its meaning is relative. Anton van Straaten also
applies fuzzy logic: I'm suggesting that it is useful to map the n-tuple which describes a point in this set to a real number between 1 and 2, the purpose of which is to identify where a feature fits on a single dimension which ranges from 1 meaning "as first class as possible" to 2 meaning "entirely second-class". Many other technical discussions, in most venues, suffer from similar problems (failure to be meaningful or useful) when participants insist on taking sides with "yes" or "no" with respect to questions whose actual answers depend on too many factors in different contexts to be strictly true or false. "Yes and no" is more often the correct answer. It's the distribution of gray between black and white that's interesting, not the winner of a counting coup spat. 02dec07
¶
river cards
work day ¶ What's the opposite of a sick day? Maybe weekends where you work? This month I'm working long hours to game a deadline. I'd rather be done early. I guess I don't mind working Sundays as long as I'm personally interested in the result. (This is rare in practice: a once in a ten year deal.) Expect sparse updates here for a while. microbrands ¶ Yes, I'm posing þ as my personal microbrand. It's more useful than a memorable name. (The name I go by actually means thorn ... according to me anyway.) The theme is basically "making software hurt less." The trick is holding stems lightly, so the pointy things draw less blood. Yes, I'm joking. shared memory ipc ¶ There's not much reason to use shared memory except to share state between processes for ipc (inter-process comm), by mapping shared segments into each address space, though not necessarily at the same spot in each address space. I might write about a few related practical matters. For example, suppose you put refcounts in shared memory, and each process maps segments to different addresses. You have atomic integer updates written in assembler ... but will this work across address spaces? Why would you assume atomic update in one process would also work with another? On Intel architectures, folks assure me locking the bus targets real physical memory and not logical addresses. So I'm told this is going to work. But until I see a convincing test, I'll keep worrying about it. vector algebra ¶ Sometimes I wonder how I'd code if I hadn't taken vector calculus and linear algebra in college. I do a lot of complex address arithmetic and mappings, and I often think about one-dimensional vector algebra, with stacks of linear translations changing coordinate systems. Such metaphors just help me cross-check my thinking. I don't think it helps any, beyond seeing that I'm right. It's just mental verbal patter decorating the visual images. 01dec07
¶
flushed away
mail optimization ¶ Yesterday on news.ycombinator.com someone linked Bill Clementson's post from March 2006 on Lisp is for Entrepreneurs - Part 2 (Amazon), containing a long excerpt from Steve Yegge's blog. I believe I'm quoting Yegge below, even though this comes from Clementson's blog. Yes, here it is at Tour de Babel: Shel wrote Mailman in C, and Customer Service wrapped it in Lisp. Emacs-Lisp. You don't know what Mailman is. Not unless you're a longtime Amazon employee, probably non-technical, and you've had to make our customers happy. Not indirectly, because some bullshit feature you wrote broke (because it was in C++) and pissed off our customers, so you had to go and fix it to restore happiness. That's interesting, because I think it describes a job Amazon wanted me to take 3 or 4 years ago. (They didn't non-disclose me; but even so, I'm rarely comfortable discussing interviews I had with anyone, at all. Even saying this much is out of my comfort zone. So I'll be quite brief.) There was a batch mail program needing optimization, and apparently I looked perfect for it. I don't remember Lisp being mentioned; if this was Mailman, now I understand the interest a little better. I'd have taken the job, but my kids are in San Jose and Amazon is in Seattle. Combined with other downsides, I had trouble seeing a personal silver lining. I suppose my strong Lisp interest was a bonus. I'd thought it was all just about me being a performance wonk. But it's interesting to see more connections later. de dup ¶ Today on Wes Felter's Hack the Planet, Wes links recent storage news, including Sun's Taylor Allis yesterday on Top 10 Storage Technology Trends, which contains an item related to things I've done, which goes like this: There are two emerging de-dup architectures: "Inline" - where the de-dup magic happens in real-time, as data comes into the system... As it happens, I know a lot about it after redesigning a codec to be more efficient.
But I really can't ever say anything about details. Too many secrets. longer hours ¶ To make a deadline, I expect to work a little on weekends, when my sons are not around, before the holidays this year. So this month might see little material here, unless I write about runtime stuff in common to most things I do sooner or later. Today I'm inclined to write about shared memory address games since I likely won't say how I do it in Lathe later. Tomorrow (Sunday) I'll go into the office and work. |