Background

Programmers are used to working with code as if it were a collection of short stories, because code stored on disk is traditionally organized into files and projects, and editors present it to the user the same way. For most programming languages the file structure matters, since it often corresponds to an organizational method; e.g.:

  • Header files in C-based languages are included by name and path.
  • Java uses the folder structure to mirror the package structure.
  • Java also requires the filename to match the public class defined in that file.
  • Python works similarly: packages map to directories and modules to filenames.

So presenting the file-level structure to the programmer makes sense, since it is an important organizational detail. But what about the contents of these files? Again, computer code is traditionally stored in plain text files, can be read or written by any simple text editor, and is presented in a linear fashion, like English text. Dedicated IDEs and smarter editors add features on top of this presentation, like syntax highlighting, auto-completion, compiler integration, etc., but the code itself is still presented as it appears in the text file: one line after another. There are various tricks applied on top of that to aid navigation (code folding, symbol/tag lists), but none of them fundamentally alter the linear, file-oriented presentation of the code.

A few reasons why this is still the case:

  • For some languages, order in the file is significant. In C, for example, you have to declare a function or variable before you can use it. Mostly this is accomplished by #include-ing a header file, but within a single file, a function can only be called by code that comes after its definition (or a prior declaration); if you move the definition around, code that came after the original definition could fail to compile.
  • Text editors are a lowest common denominator. Sometimes you have nothing else but need to modify some code right now; a standard text editor may not be as nice as your dedicated development environment, but they both show the code the same way, so at least you won’t get lost.

A new proposal

I won’t claim to speak for all programmers, but when I’m coding, what I care about are logical constructs. I may be looking at three different functions at once, and I don’t particularly care where in the source file they live; for that particular task, I want to look at them together. With smaller files, good code organization can mostly solve this problem, but as projects get more complex, I find myself wishing for a view that is not bound by file layout. And of course, sometimes you want to examine code from different files side by side. Editors with good splitting and navigation capabilities can make this less painful, but that is still only a band-aid on the problem.

So why not break free from the traditional text editor view of the world, and present code in a different way? Specifically:

  • Analyze source files to find the units of code the programmer is interested in. For most languages the logical unit is the function or method, but an argument could also be made for classes, provided they are small enough.
  • Extract these units, along with their original location and namespace information.
  • Present them to the user in a fashion that allows the user to organize the units on the screen however they want; this includes allowing them to fold units up and put them out of the way.

This allows the user to easily grab related units of code, place them together, and study them, regardless of where in the file(s) they actually live. Armed with the original namespace and location information, editing is also possible. Changes, as well as line insertions and deletions, can be reconciled with the source document; indeed, completely new units can be inserted, with the editor selecting an appropriate location for them.
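To make the reconciliation step concrete, here is a minimal sketch of writing an edited unit back into its source text using the recorded location. It is not part of the proof of concept below; the helper name replace_function is hypothetical, and it assumes Python 3.8 or later (for end_lineno) and only handles undecorated top-level functions.

```python
import ast


def replace_function(source: str, name: str, new_text: str) -> str:
    """Return `source` with the top-level function `name` replaced by `new_text`.

    Hypothetical helper for illustration only. Decorators are not handled:
    node.lineno points at the `def` line, not at a preceding decorator.
    """
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            # lineno is 1-based; end_lineno (Python 3.8+) is inclusive.
            start = node.lineno - 1
            end = node.end_lineno
            if not new_text.endswith("\n"):
                new_text += "\n"
            # The replacement may be longer or shorter than the original;
            # everything after it simply shifts up or down.
            lines[start:end] = [new_text]
            return "".join(lines)
    raise KeyError(name)


original = "def a():\n    return 1\n\ndef b():\n    return 2\n"
edited = replace_function(original, "a", "def a():\n    x = 10\n    return x\n")
```

Because the splice works on whole lines, inserted or deleted lines inside a unit need no special handling; the file just has to be re-parsed afterwards to refresh the locations of the units that follow.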

A proof of concept

As a proof of concept, I’ve implemented part of the above feature set. The implementation uses Python to analyze Python, and presents the results as HTML. The units extracted are functions, and the presentation allows the user to drag the units around the page. The main components are:

  • Python’s built-in ast module, which is used for analysis of the Python source.
  • The pygments library, which is used to turn the source code into syntax highlighted HTML and CSS.
  • A template library to assemble the final HTML from the snippets; tornado’s is used, but almost any one could be substituted.
  • jQuery UI, which is used to provide basic interactivity (making each source snippet draggable).
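As a rough illustration of how these pieces fit together, the sketch below assembles a page from extracted units using only the standard library. It is a stand-in, not the actual implementation: string.Template takes the place of tornado’s templates, plain HTML escaping takes the place of pygments highlighting, and the names render_page, PAGE, and UNIT are made up for this example.

```python
from html import escape
from string import Template

# Page skeleton; each unit is rendered separately and joined in.
PAGE = Template("""<!DOCTYPE html>
<html>
<body>
$units
</body>
</html>
""")

# One block per unit, carrying its namespace and original line as data attributes.
UNIT = Template(
    '<div class="unit" data-name="$name" data-line="$line">'
    "<pre>$code</pre></div>"
)


def render_page(units):
    """units: iterable of (qualified_name, first_line, source_text) tuples."""
    rendered = [
        UNIT.substitute(name=escape(name, quote=True), line=line, code=escape(code))
        for name, line, code in units
    ]
    return PAGE.substitute(units="\n".join(rendered))


page = render_page([("Greeter.hello", 2, "def hello(self):\n    return 'a < b'\n")])
```

Each unit ends up in its own element with class "unit", which is the kind of hook a script on the page could use to make the snippets draggable.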

Clearly several features discussed above are missing, most notably any sort of editing or round-trip support. Multiple file support and folding are also desired features that are absent at the moment. Getting to this point was surprisingly easy, mostly due to Python’s self-analysis abilities. Below is a snippet showing the ease of extracting all the functions from a Python source file, complete with class and line number information:

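import ast


class FunctionVisitor(ast.NodeVisitor):
    """Collects every function in a module, with its class and line numbers."""

    def __init__(self, source):
        self.source = source
        self.current_class = None     # name of the enclosing class, if any
        self.current_function = None  # Function currently being collected
        self.functions = []

    def visit_ClassDef(self, node):
        # Remember the class name while visiting the class body, then
        # restore the previous one so nested classes do not clobber it.
        previous = self.current_class
        self.current_class = node.name
        self.generic_visit(node)
        self.current_class = previous

    def visit_FunctionDef(self, node):
        # Only an outermost function starts a new unit; nested functions
        # are folded into the unit that contains them.
        reset = False
        if not self.current_function:
            reset = True
            # Function is defined elsewhere in the full implementation.
            self.current_function = Function(self.source, node.name, self.current_class)
            self.functions.append(self.current_function)

        # lineno is 1-based; lines are recorded 0-based.
        self.current_function.add_line(node.lineno - 1)

        self.generic_visit(node)

        if reset:
            self.current_function = None

    def generic_visit(self, node):
        # Record the line of every node inside the current function.
        # Nodes without a lineno attribute are skipped.
        try:
            if self.current_function:
                self.current_function.add_line(node.lineno - 1)
        except AttributeError:
            pass

        super().generic_visit(node)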

...

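# Excerpt from inside the elided code above: parse the module text,
# walk the tree, and hand back the collected functions.
top = ast.parse(text)
v = FunctionVisitor(source)
v.visit(top)

return v.functions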
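For comparison, on Python 3.8 and later the same information can be gathered more directly, because every function node carries an end_lineno. The sketch below is an alternative under that assumption, not the code from the repository; extract_functions and its tuple layout are invented for this example.

```python
import ast


def extract_functions(text):
    """Return (class_name, function_name, first_line, last_line) per function.

    Line numbers are 1-based and inclusive. Requires Python 3.8+ for end_lineno.
    """
    found = []

    def walk(node, class_name=None):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                walk(child, child.name)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # Nested functions stay part of the enclosing unit,
                # so there is no recursion into the function body.
                found.append((class_name, child.name, child.lineno, child.end_lineno))
            else:
                walk(child, class_name)

    walk(ast.parse(text))
    return found


sample = '''\
class Greeter:
    def hello(self):
        return "hi"

def main():
    def helper():
        pass
    helper()
'''

units = extract_functions(sample)
```

As in the visitor above, a nested function such as helper is reported as part of main rather than as a unit of its own.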

The full implementation can be found in a GitHub repository. Here’s a screenshot of it in action:

[Screenshot: non-linear code display in action]

Related work & future directions

I’m not the only person to think of concepts along these lines. In fact, much of the impetus for this came from reading about the Light Table project, which takes a similar idea about code editing and runs with it even further, adding features like interactivity and in-editor output. I wanted to explore the idea myself, with a language that I use, which was the genesis of this experiment. I’ve also seen this idea mentioned in various other places around the web and Twitter.

For future work on this project, code folding should be easily accomplished with JavaScript and CSS, and multi-file support is an easy extension to the parsing framework. More interesting is editing support; if the presentation continues to be HTML then some sort of backend load/save framework will be needed, but there are already code-centric editors that are HTML components. CodeMirror is one such editor, which is coincidentally also used by Light Table.
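As a sketch of what the multi-file extension might look like, the function below walks a directory tree and parses every Python file it finds. It is an assumption about one possible shape, not planned code from the project: functions_in_tree is a hypothetical name, only top-level functions are listed, and files that fail to parse are silently skipped.

```python
import ast
from pathlib import Path


def functions_in_tree(root):
    """Map each .py file under `root` to (name, first_line) of its top-level functions."""
    results = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError:
            # Skip files that do not parse rather than aborting the whole scan.
            continue
        # Key by path relative to the root, with forward slashes on every platform.
        results[path.relative_to(root).as_posix()] = [
            (node.name, node.lineno)
            for node in tree.body
            if isinstance(node, ast.FunctionDef)
        ]
    return results
```

Keeping the relative path alongside each unit preserves the location information needed to show units from different files side by side.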