Created
May 8, 2023 06:20
-
-
Save clarking/ffbc8579ca21b9ec7a3395f61b9aa26f to your computer and use it in GitHub Desktop.
bytecode-less smalltalk
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
"(C)1984-1985 Jecel Assumpcao Jr" | |
"$Id: st84.txt,v 1.1.1.1 1995/06/29 00:14:36 root Exp $" | |
"This is an implementation of Smalltalk described | |
in Smalltalk itself. This was designed in late 1984, but | |
what follows is a translation into English of the | |
February 26, 1985 version. It is only a paper design | |
as there was no way to test it then and represents | |
the first real object oriented design for the author. | |
A confusing aspect of this design is the *maxLex* | |
thing that keeps popping up - the author had assumed | |
that blocks were real closures and compiled temporary | |
variable references within blocks as (lexical level, | |
variable index) pairs with the associated complications | |
in the runtime access. The design is for a sequencial | |
interpreter, but a dataflow-like parallel implementation | |
is possible with the same object code format ( which | |
was one of the reasons why the format was chosen ). | |
-------------------------------------------------------- | |
The program below is a discription of part of the Smalltalk | |
system that will be eventually written in machine language | |
and that interprets compiled methods. It is interesting to | |
note the resemblence between the static data structures | |
( templates and blocks ) and the dynamic structures used | |
by the interpreter. These structures take up more memory | |
than the bytecodes used in other Smalltalk systems, but | |
achieve a greater uniformity, which we hope will result | |
in a higher performance. | |
Only the methods that help to make clear how the interepreter | |
works will be shown." | |
Object subclass: #Template | |
instanceVariableNames: 'expressions' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This is the basic unit of Smalltalk object code. It | |
represents a single message as an Array of expressions | |
containing the selector, the receiver and the arguments | |
of the message. While the selector is a string, the | |
receiver and the arguments are themselves Templates, | |
forming a tree." ! | |
! Template methodsFor: '' ! | |
sender: context1 index: anInteger maxLex: context2 | |
"makes an ( almost ) clone of itself and | |
saves the specified dynamic environment" | |
| tmp | | |
tmp <- MethodContext new. | |
tmp sender: context1 index: anInteger maxLex: context2. | |
tmp expressions: expressions shallowCopy. | |
^ tmp ! | |
saveContext | |
"the current environment is stored away for later" | |
| | | |
^ self sender: CurrentProcess currentContext | |
index: CurrentProcess index | |
maxLex: CurrentProcess maxLex ! | |
eval | |
"creates a new context to determine the | |
value of this code in the current environment" | |
| | | |
CurrentProcess dispatch: self saveContext !! | |
Template subclass: #ReplyTemplate | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class is for code which starts with the $^ | |
character" ! | |
! ReplyTemplate methodsFor: '' ! | |
sender: context1 index: anInteger maxLex: context2 | |
"remember the home context, rather than | |
the current one" | |
| | | |
^ super sender: context2 sender | |
index: context2 index | |
maxLex: context2 !! | |
Template subclass: #SuperTemplate | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class represents code generated from | |
messages sent to super" ! | |
! SuperTemplate methodsFor: '' ! | |
sender: context1 index: anInteger maxLex: context2 | |
"the clone is a SuperContext" | |
| tmp e | | |
tmp <- SuperContext new. | |
tmp sender: context1 index: anInteger maxLex: context2. | |
e <- expressions shallowCopy. | |
e at: 2 put: context2 receiver. | |
tmp expressions: e. | |
^ tmp !! | |
Template subclass: #SelfTemplate | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class represents code generated from | |
messages sent to self" ! | |
! SelfTemplate methodsFor: '' ! | |
sender: context1 index: anInteger maxLex: context2 | |
"the receiver is the same as context2 receiver" | |
| tmp e | | |
tmp <- MethodContext new. | |
tmp sender: context1 index: anInteger maxLex: context2. | |
e <- expressions shallowCopy. | |
e at: 2 put: context2 receiver. | |
tmp expressions: e. | |
^ tmp !! | |
Template subclass: #HCTemplate | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class represents code generated from | |
messages sent to the home context" ! | |
! HCTemplate methodsFor: '' ! | |
sender: context1 index: anInteger maxLex: context2 | |
"the receiver is the sending context" | |
| tmp e | | |
tmp <- MethodContext new. | |
tmp sender: context1 index: anInteger maxLex: context2. | |
e <- expressions shallowCopy. | |
e at: 2 put: context1. | |
tmp expressions: e. | |
^ tmp !! | |
Template subclass: #LiteralTemplate | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class represents literal objects | |
in the source code. The expressions | |
instance variable contains a single | |
object ( of any kind ) rather than an | |
Array of Templates" ! | |
! LiteralTemplate methodsFor: '' ! | |
eval | |
"this is easy: we already know the answer" | |
| | | |
CurrentProcess reply: expressions | |
to: CurrentProcess currentContext | |
index: CurrentProcess index | |
maxLex: CurrentProcess maxLex !! | |
Template subclass: #LiteralReply | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class represents code generated from | |
a literal object in the source code after | |
the $^ character" ! | |
! LiteralReply methodsFor: '' ! | |
eval | |
| | | |
"the answer goes to the home context sender" | |
CurrentProcess reply: expressions | |
to: CurrentProcess maxLex currentContext | |
index: CurrentProcess maxLex index | |
maxLex: CurrentProcess maxLex maxLex !! | |
Template subclass: #Block | |
instanceVariableNames: 'numTemporaries' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class represents blocks in the source | |
code. numTemporaries tells us how much space | |
to add when cloning. expressions is an Array | |
of Templates which are evaluated in order and | |
have all ( but the last ) values thrown away. | |
Source code methods are also compiled to | |
blocks." ! | |
! Block methodsFor: '' ! | |
sender: context1 index: anInteger maxLex: context2 | |
"cloning much reserve space for the temporaries - | |
this could be avoided by merging the temporaries | |
and expressions Arrays" | |
| tmp | | |
tmp <- BlockContext new. | |
tmp nextLex: context1. | |
tmp expressions: expressions. "no need to copy | |
them as they cannot | |
be changed ( the | |
returned values are | |
thrown away" | |
tmp temporaries: ( Array new: numTemporaries ). | |
^ tmp ! | |
eval | |
"evaluating a block results in a BlockContext" | |
| | | |
CurrentProcess reply: self saveContext | |
to: CurrentProcess currentContext | |
index: CurrentProcess index | |
maxLex: CurrentProcess maxLex !! | |
Object subclass: #Context | |
instanceVariableNames: 'sender index maxLex expressions' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This represents the current state of a computation" ! | |
! Context methodsFor: '' ! | |
sender: context1 index: anInteger maxLex: context2 | |
"initialize" | |
| | | |
sender <- context1. | |
index <- anInteger. | |
maxLex <- context2. ! | |
expressions: anArray | |
| | | |
expressions <- anArray ! | |
at: int put: value | |
| | | |
expressions at: int put: value ! | |
lexicalLevel: anInt | |
"normal Contexts may nest in the same | |
lexical level" | |
| | | |
^ sender lexicalLevel: anInt !! | |
Context subclass: #BlockContext | |
instanceVariableNames: 'temporaries nextLex' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This class represents the dynamic environment | |
of the evalutation of a block" ! | |
! BlockContext methodsFor: '' ! | |
temporaries: anArray | |
| | | |
temporaries <- anArray ! | |
nextLex: context | |
| | | |
nextLex <- context ! | |
value | |
"starts the evaluation of the block" | |
| | | |
sender <- CurrentProcess currentContext. | |
index <- CurrentProcess index. | |
maxLex <- CurrentProcess maxLex. | |
CurrentProcess dispatch: self ! | |
value: arg | |
| | | |
temporaries at: 1 put: arg. | |
self value ! | |
value: arg1 with: arg2 | |
| | | |
temporaries at: 2 put: arg2. | |
self value: arg1 ! | |
evalAt: anInt | |
"evaluate one of the block's expressions" | |
| | | |
( anInt = expressions size ) | |
ifTrue: [ CurentProcess return ] "replies | |
to whom I should reply" | |
ifFalse: [ ( expressions at: anInt ) eval ] ! | |
temporary: anInt | |
| | | |
^ temporaries at: anInt ! | |
temporary: anInt put: obj | |
| | | |
temporaries at: anInt put: obj ! | |
at: anInt put: obj | |
| | | |
^ self "just ignore it - blocks throw aways results" ! | |
lexicalLevel: anInt | |
"a block always starts a new lexical level" | |
| | | |
( anInt = 1 ) ifTrue: [ ^ self ] | |
ifFalse: [ ^ nextLex lexicalLevel: anInt - 1 ] !! | |
Context subclass: #MethodContext | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"stores the dynamic environment of evaluating a message. | |
There are two stages: | |
1) the receiver and arguments must be evaluated | |
2) the message must be sent ( the associated | |
method is found and this becomes the home | |
context of that method as it is evaluated )" ! | |
! MethodContext methodsFor: '' ! | |
evalAt: anInt | |
| | | |
( anInt > expressions size ) | |
ifTrue: [ CurentProcess send ] | |
ifFalse: [ ( expressions at: anInt ) eval ] ! | |
receiverClass | |
| | | |
^ self receiver class ! | |
receiver | |
| | | |
^ expressions at: 2 ! | |
selector | |
| | | |
^ expressions at: 1 ! | |
argument: anInt | |
| | | |
^ expressions at: anInt + 2 !! | |
MethodContext subclass: #SuperContext | |
instanceVariableNames: '' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"Kludge to implement super" ! | |
! SuperContext methodsFor: '' ! | |
receiverClass | |
"this dynamic lookup is very inefficient and only | |
must be done because a messages holder is not | |
stored anywhere" | |
| | | |
^ ( maxLex receiver class superclassWhichHas: maxLex selector ) | |
superclass !! | |
Object subclass: #Interpreter | |
instanceVariableNames: 'currentContext index maxLex' | |
classVariableNames: '' | |
poolDictionaries: '' | |
category: '' | |
"This represent a single thread of execution. The global | |
variable CurrentProcess is always an object of this | |
class" ! | |
! Interpreter methodsFor: '' ! | |
currentContext | |
| | | |
^ currentContext ! | |
index | |
| | | |
^ index ! | |
maxLex | |
| | | |
^ maxLex ! | |
step | |
"evaluate the next thing" | |
| | | |
index <- index + 1. | |
currentContext evalAt: index ! | |
dispatch: newContext | |
| | | |
"starts an additional level - like the | |
colon in FORTH" | |
currentContext <- newContext. | |
index <- 0. | |
self step ! | |
return | |
"ends a level of evaluation - like the | |
semicolon in FORTH" | |
| | | |
index <- currentContext index. | |
maxLex <- currentContext maxLex. | |
currentContext <- currentContext sender ! | |
reply: value to: context1 index: anInt maxLex: context2 | |
"replaces a context subtree with the computed | |
result" | |
| | | |
index <- anInt. | |
maxLex <- context2. | |
currentContext <- context1. | |
currentContext at: index put: value. | |
self step ! | |
send | |
"starts the second phase of MethodContext evaluation" | |
| method value | | |
method <- self lookFor: currentContext selector | |
in: currentContext receiverClass. | |
method isNil ifTrue: [ self sendMessageNotUnderstood ] | |
ifFalse: [ value <- PrimitiveFailed. | |
method primitive isNil | |
ifFalse: [ value <- self doPrimitive: | |
method primitive ]. | |
( value = PrimitiveFailed ) | |
ifFalse: [ | |
self reply: value | |
to: currentContext sender | |
index: currentContext index | |
maxLex: currentContext maxLex | |
] | |
ifTrue: [ | |
maxLex <- currentContext. | |
method compiled eval | |
] ] ! | |
lookFor: selector in: class | |
"not really needed - just to speed up things" | |
| value | | |
Cache includes: selector and: class | |
ifTrue: [ value <- Cache at: selector and: class ] | |
ifFalse: [ value <- class lookup: selector. | |
value isNil | |
ifFalse: [ Cache at: selector and: class | |
put: value ] ] | |
^ value !! | |
"----------------------------------------------- | |
To see how this might work, it is important to know | |
what the compiler output looks like. On page 546 of | |
the Blue Book ( Smalltalk-80: The Language and Its | |
Implementation ) we find this example: | |
merge: aRectangle | |
| minPoint maxPoint | | |
minPoint <- origin min: aRectangle origin. | |
maxPoint <- corner max: aRectangle corner. | |
^ Rectangle origin: minPoint | |
corner: maxPoint | |
which gives as the bytecodes: | |
#( 0 16 209 224 105 1 16 211 226 106 69 17 18 224 124 ) | |
and the literal frame: | |
#( min: origin max: corner origin:corner: Association(Rectangle,...) ) | |
For our interpreter we would have: | |
Block(2,#(HCTemplate(#(LiteralTemplate('temporary:put:') | |
nil | |
LiteralTemplate(1) | |
Template(#(LiteralTemplate('min:') | |
SelfTemplate(#(LiteralTemplate('basicAt:') | |
nil | |
LiteralTemplate(1) | |
)) | |
Template(#(LiteralTemplate('origin') | |
HCTemplate(#(LiteralTemplate | |
('argument:') | |
LiteralTemplate(1) | |
)) | |
)) | |
)) | |
HCTemplate(#(LiteralTemplate('temporary:put:') | |
nil | |
LiteralTemplate(2) | |
Template(#(LiteralTemplate('max:') | |
SelfTemplate(#(LiteralTemplate('basicAt:') | |
nil | |
LiteralTemplate(2) | |
)) | |
Template(#(LiteralTemplate('corner') | |
HCTemplate(#(LiteralTemplate | |
('argument:') | |
LiteralTemplate(1) | |
)) | |
)) | |
)) | |
ReplyTemplate( | |
#(LiteralTemplate('origin:corner:') | |
Template(#(LiteralTemplate('value') | |
LiteralTemplate(Association(Rectangle,...) | |
)) | |
HCTemplate(#(LiteralTemplate('temporary:') | |
LiteralTemplate(1) | |
)) | |
HCTemplate(#(LiteralTemplate('temporary:') | |
LiteralTemplate(2) | |
)) | |
)) | |
) | |
) | |
Now this includes 38 Template objects, which would consume a | |
huge amount of memory ( plus the 14 Array objects ). Of these, | |
14 LiteralTemplate objects are there only to quote the selector | |
name - if the interpreter were modified to skip selector evaluation | |
they could be eliminated. This would still make the memory requirements | |
of this implementation several times that of a bytecoded one. If | |
the compiler cached objects like: | |
HCTemplate(#(LiteralTemplate('temporary:') | |
LiteralTemplate(2) | |
)) | |
which occur repeatedly, the actual code expansion could be less than | |
100%. Also, the interpreter could short circuit evaluation when | |
encountering such common templates. | |
The reason for all this mess is that a bytecoded implementation of | |
Smalltalk didn't seem object-oriented enough. This proposal has | |
message passing at the bottom ( to use a Self term ) and the compiled | |
code itself is made up of objects. It was hoped that such an | |
implementation would be easier to render into hardware." |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment