The Long View

MacNosy

Humans do some things, like pattern recognition, much better than computers do. Computers are way better at keeping track of large data sets, and applying patterns over and over. I have always thought that the coolest program you could write would be one that uses the user’s extraordinary (human) abilities to see patterns and iteratively improves a computer’s best solution for a large problem. The user’s job is just to correct errors made by the program. Then the program applies the lesson learned in the form of a constraint to find a new solution. Over a few iterations, the user-computer interaction might solve problems neither could solve alone. The program learns how to solve the problem using a few small hints from its human advisor. The problem is solved when the human can’t find any more mistakes. I’ve still never written a program like that, but I’ve used one.

MacNosy is a disassembler. It reads the file containing an executable program, and expands it to assembly language. This doesn’t sound so hard to do. Maybe it seems like something the computer could do by itself. But it’s harder than it sounds, because a lot of different things are embedded in an executable file. Only some of it should be interpreted as executable code. There are also statically allocated data blocks. And data blocks contain data in different formats, including text strings, simple integers, floating point numbers, arrays of any of that, or complex data structures with multiple fields differing in interpretation. There is nothing in the executable data to indicate the boundaries between data blocks or between code and data. Of course there are no explicit statements of the format or length of data in a data block. Some data are constant, in that their values are initialized in the compiled code and so the number stored there in the executable file may provide some clue about their format. Others are uninitialized. They probably contain zeroes (but not necessarily) and they will acquire their values one or more times during execution of the program. It’s hard to know what kind of data they will contain. Some data blocks are never used at all. For example, some compilers (like MPW) put function names as strings in memory at the boundaries between named code blocks. These are used during debugging, and obviously they could be very useful during disassembly. But they aren’t always there. You’d like to discover whether they are there or not and change your interpretation of those unreferenced data blocks.
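To make the ambiguity concrete, here is a minimal sketch in Python of one guess a disassembler might make about an unlabeled block: does it look like a length-prefixed (Pascal-style) text string? The function name and the sample bytes are made up for illustration. This is not MacNosy's actual test, and it does not model the real layout of compiler-generated debugging names.

```python
def looks_like_pascal_string(block: bytes) -> bool:
    """Guess whether a block could be a length-prefixed text string.

    Hypothetical heuristic: the first byte is a length, and that many
    printable ASCII characters must follow it inside the block.
    """
    if not block:
        return False
    length = block[0]
    if length == 0 or length > len(block) - 1:
        return False
    body = block[1:1 + length]
    return all(0x20 <= b < 0x7F for b in body)


# A made-up name block: length byte 6 followed by six ASCII characters.
name_block = bytes([6]) + b"STRCAT"
# A made-up code fragment: LINK A6,#0 followed by RTS.
code_block = bytes([0x4E, 0x56, 0x00, 0x00, 0x4E, 0x75])

print(looks_like_pascal_string(name_block))  # True
print(looks_like_pascal_string(code_block))  # False
```

A test like this can only raise or lower suspicion. Plenty of real data would fail it, and some code could pass it by accident, which is why a human second opinion is so valuable.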

Even finding the boundaries between code blocks can be a challenge. You might think that all subroutines would begin with a LINK instruction, and end with an UNLK (unlink) followed by an RTS (return from subroutine). But all this is optional. To optimize (by a tiny bit), compilers or assembly language programmers sometimes do not bother with LINK or UNLK. Sometimes they even return to the caller by popping the return address into a register and doing a JMP.
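A naive boundary finder shows why this matters. The sketch below, in Python, scans big-endian 16-bit words for the LINK A6 opcode ($4E56) and reports each hit as a possible routine start. The sample bytes are invented, and the scan is deliberately simple-minded; it is not how MacNosy works.

```python
import struct

LINK_A6 = 0x4E56  # LINK A6,#disp (a 16-bit displacement word follows)
UNLK_A6 = 0x4E5E  # UNLK A6
RTS = 0x4E75      # return from subroutine


def find_link_prologues(code: bytes):
    """Return word-aligned offsets where a LINK A6 opcode appears."""
    offsets = []
    for off in range(0, len(code) - 1, 2):
        (word,) = struct.unpack_from(">H", code, off)
        if word == LINK_A6:
            offsets.append(off)
    return offsets


# Two made-up routines back to back:
#   offset 0: LINK A6,#-8 / UNLK A6 / RTS
#   offset 8: MOVEQ #0,D0 / RTS        (no LINK at all)
sample = bytes.fromhex("4E56FFF8" "4E5E" "4E75" "7000" "4E75")

print(find_link_prologues(sample))  # [0]
```

The second routine, which starts at offset 8, is never found because it has no LINK. A scan like this can also be fooled in the other direction, since a displacement word or a piece of data may happen to equal $4E56.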

When you find the start of a code block, you want to know how the subroutine gets its parameters. Most 68k Macintosh compilers (and programmers) pass function parameters on the stack. Some follow the Pascal convention of pushing arguments onto the stack from left to right, whereas others follow the opposite (C) convention. Calls to the toolbox must be made in the Pascal way, so C compilers often do some subroutine calls one way and some the other. Also, some toolbox routines don’t receive their parameters or return their return values on the stack, but rather use registers for this. Some compilers opt to pass parameters in registers by default, if there are not too many of them. If there are, they might pass the first k of them in registers, and the rest on the stack. Of course, this is the usual convention for PowerPC compilers (yech). Once you start looking around in a program and you see how its compiler created its code, you can often see the pattern right away. But it’s very tough to expect a program to figure that stuff out on its own.
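The difference between the two push orders is easy to lose track of, so here is a tiny Python model of it. The function name and the argument names are hypothetical, and the model ignores argument sizes, return-value slots and register passing entirely.

```python
def stack_after_call(args, convention):
    """Return the argument stack, top of stack first, for a call.

    "pascal" pushes arguments left to right; "c" pushes right to left.
    """
    if convention == "pascal":
        push_order = list(args)
    elif convention == "c":
        push_order = list(reversed(args))
    else:
        raise ValueError("unknown convention: " + convention)
    stack = []
    for arg in push_order:
        stack.append(arg)  # each push lands on top of the previous one
    return stack[::-1]     # report the top of the stack first


print(stack_after_call(["a", "b", "c"], "pascal"))  # ['c', 'b', 'a']
print(stack_after_call(["a", "b", "c"], "c"))       # ['a', 'b', 'c']
```

With the C order the first argument ends up on top of the stack, and with the Pascal order the last one does. A disassembler that guesses the wrong convention will label every parameter of a routine incorrectly.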

MacNosy has a lot of a priori knowledge about Macintosh code. It knows the m68k instruction set. It knows the structures of Macintosh standard data types and data structures, and it knows the library functions that are standard for some often used Macintosh development systems (like Lisa Pascal, MPW, Consulair Mac C). You can inform MacNosy about the library used by other development systems. If you have a symbols file (that associates variable and function names with addresses), it will use that. If you are looking at code you compiled yourself, you probably have one of those.

MacNosy also makes assumptions, which are true most of the time, but not always. For example, if an address is used in a JSR (jump to subroutine) instruction, it is assumed to be the entry point for a code block. If it is used by a LEA (load effective address) instruction, it is assumed to be the address of a data block. On its first pass, it tries to disassemble the program and it shows you the result.
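Those two assumptions can be written down as a very small classifier. This Python sketch is only an illustration of the idea. The input format, the function name, and the rule that a JSR reference outranks a LEA reference to the same address are all my assumptions, not anything documented about MacNosy.

```python
def classify_targets(references):
    """Guess "code" or "data" for each referenced address.

    references is an iterable of (mnemonic, target_address) pairs.
    Assumption for this sketch: JSR evidence outranks LEA evidence.
    Targets of any other instruction are left unclassified.
    """
    guesses = {}
    for mnemonic, target in references:
        if mnemonic == "JSR":
            guesses[target] = "code"
        elif mnemonic == "LEA":
            guesses.setdefault(target, "data")
    return guesses


# Made-up references collected from a first pass.
refs = [("JSR", 0x100), ("LEA", 0x200), ("LEA", 0x100), ("JMP", 0x400)]
print(classify_targets(refs))
# {256: 'code', 512: 'data'}
```

The address reached only by a JMP stays unclassified here, which is the kind of leftover a tool would have to hand to its human for a decision.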

Try it out

Start by double clicking the LaunchNosy icon. Everything changes. The menubar is gone, there are no windows. You are now entering the unique world of MacNosy.

You can use the standard file dialog to find the program you are planning to disassemble. Nosy offers some instructive examples. One is an image of the Macintosh ROM. More about that later, but I’ll start by navigating to one of my own programs, and disassembling it. It asks a couple of questions about what you want to disassemble. I tell it to disassemble CODE resources, and to read symbols from the .Map file created by Mac C.

I now get the TreeWalk options. You can see the libraries it will search. This program uses the default A5 to point to globals. Most development systems do. MacNosy can search for all the standard libraries, although I know this program was compiled with Consulair Mac C, so this is the only one I need. This is not a DRVR resource, but a regular program. Don’t click Debug. This is for debugging MacNosy. We’re not ready for that (we never will be). When we click the Continue button, MacNosy will have a go at the program, go back to regular menu and window world, and give us a list of putative code blocks in a window along the left side of the screen. And it has a list of mysteries that it knows it needs our help with.

You can see that they are JMP instructions. A look at function printdefault reveals that it has an unusual entry point that is reached by a JMP (A0) instruction. There is nothing wrong with this; it is just a violation of MacNosy’s assumptions. Put this window away, and look over the list of code blocks. Some have recognizable names, but some have names like proc_1 or com_1. It’s possible to bring up another window that has data blocks.

Some of these have names I know are associated with functions (like malloc, strcat, signal and ptocstr). Those should be code, and MacNosy has misinterpreted these blocks. It’s time for some human intervention. This is done by reviewing the data blocks. We pick Review from the Reformat menu, and we will go through all of the unTyped and unRefed data blocks and verify that they are data, or maybe convert them to code. These are data blocks whose type MacNosy can’t figure out, or that don’t seem to be used as data by any code block. If it’s data, why doesn’t any code read or write to it? Here’s the second data block, the one with the name STRCAT. That’s a funny name for a data block. Nosy shows it to us as data, and gives us some choices in the form of an ugly old command line list of things we could type. You have to get used to Jasik’s way of explaining commands. For example, the first one says <Hex|Dec|Asc|Zero><B|W|L>n_item. This means we could reformat the data block as bytes, words or longs written in Hex, Decimal, ASCII, or Zeros, with some number (n_item) of each per line. That one is cosmetic. A really useful one is New{Byt} cnt. This breaks the data block into two pieces, with the first one being cnt bytes long. Finding the boundaries between data blocks and between data and code is hard. If you see it and MacNosy doesn’t, that’s how you make the cut. MacNosy says this is an unreferenced data block. If this really is data, maybe it isn’t referenced because it is really part of the previous data block. In that case, you could try combining it with the preceding block using cOmbine (type O).
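The three review operations described above (reformat, split, combine) are simple byte-level manipulations. Here is a rough Python sketch of what each one does to a block. The function names are mine, the sample block is invented, and only the hex-words layout is shown; none of this is MacNosy's own code or command syntax.

```python
def format_hex_words(block: bytes, n_item: int):
    """Lay a block out as hex words, n_item words per line."""
    words = [block[i:i + 2].hex().upper() for i in range(0, len(block), 2)]
    return [" ".join(words[i:i + n_item]) for i in range(0, len(words), n_item)]


def split_block(block: bytes, cnt: int):
    """Cut a block in two; the first piece is cnt bytes long."""
    if not 0 < cnt < len(block):
        raise ValueError("cnt must fall strictly inside the block")
    return block[:cnt], block[cnt:]


def combine_blocks(first: bytes, second: bytes) -> bytes:
    """Merge a block with the block that follows it."""
    return first + second


block = bytes(range(8))  # made-up contents: 00 01 02 ... 07
print(format_hex_words(block, 2))  # ['0001 0203', '0405 0607']

head, tail = split_block(block, 3)
print(len(head), len(tail))                  # 3 5
print(combine_blocks(head, tail) == block)   # True
```

Reformatting changes only how the bytes are displayed. Splitting and combining change where the boundaries are, which is the part the human is usually better at seeing.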

But in this case it is probably code. If we type c it will convert the data block to code and show that. Wow that looks much better. This is the standard C string concatenation routine. Now my choices are to confirm that it is code by typing i, or revert it to data by typing r. I’m sure this is code.

Block by block, MacNosy will take you through all the questionable data blocks, and let you do your best to correct its mistakes. When that is done, you should tell it to Explore again (from the Reformat menu) so it can go through the entire program and apply everything it learned to everything it already knew. The next version of the disassembled file has a lot fewer errors. It doesn’t just correct the errors you found, but others you didn’t know about. On the other hand, if you told it something wrong, it may have applied that erroneous advice, and got itself into trouble. Like all good students, MacNosy can get pretty confused if it’s given erroneous information. Better information on the next iteration can turn that around. Going a few iterations through this reduces identifiable errors to zero. I think I am finished. Man, was that fun. Fun, but futile, because I’m looking at my own code. I already knew how it worked.
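One way to picture why a single correction fixes errors you never saw is to treat exploration as a reachability search. In this Python sketch, a block counts as code if it is reachable from a known entry point or from a block the human has confirmed. The call table and every block name in it are invented for illustration, and the real Explore pass surely does far more than this.

```python
def explore(calls, entry_points, confirmed_code=()):
    """Return the set of blocks believed to be code.

    calls maps a block name to the blocks it would call if it is code.
    confirmed_code holds blocks a human reviewer has declared to be code.
    """
    code = set()
    worklist = list(entry_points) + list(confirmed_code)
    while worklist:
        block = worklist.pop()
        if block in code:
            continue
        code.add(block)
        worklist.extend(calls.get(block, []))
    return code


# Hypothetical program: STRCAT is only reached in a way the tool can't follow.
calls = {
    "main": ["proc_1"],
    "proc_1": [],
    "STRCAT": ["helper_1"],
    "helper_1": [],
}

first_pass = explore(calls, ["main"])
second_pass = explore(calls, ["main"], confirmed_code=["STRCAT"])

print(sorted(first_pass))   # ['main', 'proc_1']
print(sorted(second_pass))  # ['STRCAT', 'helper_1', 'main', 'proc_1']
```

The reviewer only confirmed STRCAT, but helper_1 was reclassified as well. The same mechanism spreads a bad hint just as efficiently, which matches the warning about erroneous advice.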

Secrets of the ROM

The same approach will work to understand other people’s code. Is this okay? Like everything else, it depends on what you want to do it for. Are you trying to steal somebody’s trade secrets? Not okay. Are you trying to become a better programmer by understanding how your computer works? Good for you. Are you trying to remove copy protection so you can make a harmless backup of a program you purchased? Are you trying to figure out the operating system well enough to do something the designers didn’t envision and don’t approve of? Might be a little bit of a gray area legally, but I think these things ought to be allowed.

Maybe you are just trying to use a library that was provided to you only as executable code. And the library we most want to understand is...the Macintosh toolbox. This was a common problem for Macintosh programmers in the early days. Inside Macintosh documented the toolbox. It was very well written and much effort was put into making Inside Macintosh useful, but sometimes you couldn’t figure out what it was actually saying. For whatever reason, some things were inadequately documented. But really, the toolbox routines in ROM were mostly not long complicated things, but simple routines efficiently coded in assembly language. Why not just look at it and see how it works?

A common problem back in the 1980’s was implementing a floating window, with a tool palette in it. Cool programs all had those, but there was nothing useful in Inside Macintosh about how to implement one. There was no alternative but to really understand the Window Manager, and figure out how it managed the window list. Then you could make sure your floating palette didn’t get put behind a regular window, make it disappear when your program went into the background, bring it back when you want it, etc. Here is the complete text on this topic from Inside Macintosh, page I-287.

What? How will a GhostWindow behave? Does that mean I can attach my window pointer to GhostWindow and it will become a floating palette? What will happen to events sent to that window? When will it be redrawn? Let’s take a look at the Window Manager code using MacNosy.

Start MacNosy, and open the file named ROM. This is an image of the original Macintosh ROM. Jasik has already Nosied the ROM, so the data blocks already have all the right names. A routine can be found just by searching with Find, and then displaying that Code Block.

Nosing about in the Window Manager

Start with FrontWindow, since that is the only toolbox routine that Inside Macintosh mentions. The entire routine fits in a single screen. Note no LINK or UNLK. It doesn’t save any registers, and it clobbers A0 and D0. It first clears the place for its returned value, 4(A7) [A7 is the stack pointer]. It checks the system global WWExist, which is non-zero if the window manager has been initialized. It returns nil if it hasn’t. Then it gets the WindowList (also a system global), and tests to make sure it is non-zero. The WindowList should hold a pointer to the windows, which are linked in front-to-back order by the WindowRecord field nextWindow. If WindowList is zero there are no windows, so we return without changing the zero result; that is, we return nil. If there is a window at WindowList, its address is stored in A0. It should be the frontmost window, if it is not hidden. But now it compares the frontmost window in A0 with GhostWindow. If they are the same, it skips the window by putting its nextWindow field in A0 and returning to line mx1, to try again. If it finds a window that is not GhostWindow but is visible, it puts that in the return value 4(A7) and returns. Well, that agrees with what Inside Macintosh said. The real question is, what happens when I try to bring some other window to the front? Will it be put in front of my GhostWindow? It better not. This is a longer subroutine, so I ask Nosy to save it as text. System globals have been colored red, and I have added some comments.
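The logic of FrontWindow as described above can be restated in a few lines of Python. This is a model of the behavior in the paragraph, not a translation of the ROM code. The Window class, its field names, and the wmgr_initialized flag (standing in for the WWExist check) are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Window:
    name: str
    visible: bool = True
    next_window: Optional["Window"] = None  # next window toward the back


def front_window(window_list, ghost_window=None, wmgr_initialized=True):
    """Return the frontmost visible window that is not the ghost window."""
    if not wmgr_initialized:
        return None
    w = window_list
    while w is not None:
        if w is not ghost_window and w.visible:
            return w
        w = w.next_window  # skip the ghost or a hidden window and try again
    return None


# Made-up window list, front to back: hidden -> palette -> doc
doc = Window("doc")
palette = Window("palette", next_window=doc)
hidden = Window("hidden", visible=False, next_window=palette)

print(front_window(hidden).name)                        # palette
print(front_window(hidden, ghost_window=palette).name)  # doc
```

With the palette registered as the ghost window, FrontWindow reports the document behind it. That is all the ghost mechanism does in this routine; it says nothing about keeping the palette in front.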

;- $A920 BringToFront(theWindow:WindowPtr)

BringToFront

CLR.B DragFlag ;clear that old drag boolean

proc925 MOVEM.L D3-D6/A2-A4,-(A7) ; this is an alternative entry point called proc925

MOVEA.L 32(A7),A3 ;Get the windowPtr in A3

BSR proc883 ;Get old Port in D4, set the window manager's port

LEA WindowList,A2 ;Get the windowList in A2

CMPA.L (A2),A3 ;Is the top window the one we're bringing to the top?

BEQ.S mgs_5 ;if so we can just restore the port and get out

BSR proc879 ;Allocate a new region; its handle is on the stack

POP.L D3 ;pop the region handle into D3, use as clobbered rgn

MOVEA.L (A2),A4 ;copy the current top windowPtr to A4

;in this loop we walk the windowlist looking for the right window. The

;visrgn of each visible window, minus its structure region, is added to the

;accumulating clobbered rgn

;entering this loop, A4 contains the windowPtr we're testing

;A3 has the windowPtr that we want to bring to the front

mgs_1 CMPA.L A3,A4 ;is this the one we're bringing to the front?

BEQ.S mgs_3 ;if so, we've found it, go to

TST.B visible(A4) ;see if this window is visible

BEQ.S mgs_2 ;if not we can skip this window.

; Otherwise now we have to accumulate this window’s visible region

; into the part that needs to be redrawn

MOVE.L 8(A4),D6 ;get topLeft of its BitMap.bounds (global) into D6

MOVE.L visRgn(A4),D5 ;get the visRgn in D5 (local coord)

PUSH.L D5 ;push the visRgn

PUSH.L D6 ;push the topLeft corner of the bitMap

BSR.S proc926 ;make both the h and v fields negative

_OfSetRgn ; (rgn:RgnHandle; dh,dv:INTEGER) Offset the visRgn to 0,0 in the Port

PUSH.L D5 ;push the visRgn again

PUSH.L structRgn(A3) ;push the structRgn

PUSH.L D5 ;and the visRgn again

_DiffRgn ; (srcRgnA,srcRgnB,dstRgn:RgnHandle) ;subtract the structRgn out of visRgn

PUSH.L D5 ;push the visRgn again

PUSH.L D6 ;push that topLeft corner

_OfSetRgn ; (rgn:RgnHandle; dh,dv:INTEGER)Offset visRgn back to where it was

PUSH.L 114(A4) ;push that structRgn again

PUSH.L D3 ;now get that temporary rgnHandle

PUSH.L D3 ;and add to it the result of the previous work

_UnionRgn ; (srcRgnA,srcRgnB,dstRgn:RgnHandle) into our temporary handle

mgs_2 MOVEA.L 144(A4),A4 ;get a pointer to the next window in A4

MOVE.L A4,D0 ;copy it to D0

BNE mgs_1 ;and loop back until NIL (end of the list of windows)

; or target is found

; when we get here, the target is found. Now we want to adjust the WindowList

mgs_3 PUSH.L A3 ;Bypass the window we're bringing to the front,

BSR proc869 ;removing it from the WindowList

MOVE.L (A2),144(A3) ;now make its nextWindow field point at

;the former top window

MOVE.L A3,(A2) ;and make WindowList point at it. Now it's the top

TST.B DragFlag ;what about that DragFlag?

BNE.S mgs_4 ;if it's true we can skip this (e.g. selectWindow)

PUSH.L 114(A3) ;get the structRgn of the new front window

PUSH.L D3 ;and the clobberedRegion

PUSH.L D3 ;and find the intersection

_SectRgn ; (srcRgnA,srcRgnB,dstRgn:RgnHandle)

PUSH.L A3 ;then get the window being brought to the front

PUSH.L D3 ;and its intersection with the clobbered region

_PaintOne ; (window:WindowPeek ; clobbered:RgnHandle) ;draws the frame and erases

; the clobbered place

mgs_4 PUSH.L A3

_CalcVis ; (window:WindowPeek) ;Calculate the new VisRgn

PUSH.L D3

_DisposRgn ; (rgn:RgnHandle) ;dispose the working region (clobbered region)

mgs_5 BSR proc884 ;Restore the port pointer to one saved in D4

MOVEM.L (A7)+,D3-D6/A2-A4

POP.L (A7)

RTS ;that's it

Thank you Steve Jasik

Hey wait. No reference to GhostWindow in there? If some window is brought to the front, it’s going to go right over the top of my GhostWindow. That lightly-documented feature of Inside Macintosh is useless for making a floating palette that sits on top of everything else. In fact, floating palettes were not envisioned as a part of the Macintosh user interface, and Apple engineers, as good as they were, were not in any way helpful in making them happen. To invent those, programmers had to go deeper into the Macintosh operating system than they had the right to do. They had to go into the gray area, and reverse engineer the toolbox well enough to hack a new feature. Later, Apple added support for floating windows. Do you suppose this could still happen?
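Stripped of the region bookkeeping and redrawing, the list manipulation in BringToFront comes down to "unlink the target, then put it at the head." Here is a minimal Python model of just that part, using a plain list of made-up window names in front-to-back order. It leaves out the clobbered region, DragFlag and port handling from the listing.

```python
def bring_to_front(window_list, target):
    """Return the new front-to-back order after moving target to the front.

    Models only the relinking: if the target is already first, nothing
    changes; otherwise it is removed from its place and put at the head.
    """
    if window_list and window_list[0] == target:
        return list(window_list)
    rest = [w for w in window_list if w != target]
    return [target] + rest


# Made-up window list with a would-be floating palette in front.
windows = ["palette", "doc1", "doc2"]
print(bring_to_front(windows, "doc2"))  # ['doc2', 'palette', 'doc1']
```

Nothing in this step consults a ghost window, so the palette ends up behind doc2. A program that wanted a floating palette had to notice this and restore the order itself.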

Thanks Steve Jasik, for MacNosy. Many of the features that users have grown to expect from Macintosh software required programmers to stretch the user interface beyond the place envisioned by the architects of the Macintosh operating system. The tool that you created made that possible.

-- BG (basalgangster@macGUI.com)

Saturday, April 24, 2010
