Last active
February 22, 2016 02:30
-
-
Save jackrusher/57525bef17382c9a7207 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
;; This is the sort of thing that makes me think you'd enjoy using one | |
;; of the better Common Lisp implementations for certain kinds of | |
;; work. I'm using the latest version of SBCL here. | |
;; Ok, contrived example, but easy to imagine the code that it should | |
;; generate... | |
(defun add (x y) | |
(+ x y)) | |
(disassemble #'add) | |
; disassembly for ADD | |
; Size: 41 bytes. Origin: #x1004CB7393 | |
; 93: 498B4C2460 MOV RCX, [R12+96] ; thread.binding-stack-pointer | |
; no-arg-parsing entry point | |
; 98: 48894DF8 MOV [RBP-8], RCX | |
; 9C: 488BD6 MOV RDX, RSI | |
; 9F: 488BFB MOV RDI, RBX | |
; A2: 41BBD0010020 MOV R11D, 536871376 ; GENERIC-+ | |
; A8: 41FFD3 CALL R11 | |
; AB: 488B5DE8 MOV RBX, [RBP-24] | |
; AF: 488B75F0 MOV RSI, [RBP-16] | |
; B3: 488BE5 MOV RSP, RBP | |
; B6: F8 CLC | |
; B7: 5D POP RBP | |
; B8: C3 RET | |
; B9: 0F0B10 BREAK 16 ; Invalid argument count trap | |
;; ... not terrible, especially for a runtime that provides arbitrary | |
;; precision arithmetic and traps errors in a civilized way, but with | |
;; safety and generic #'+ that figures out arity and handles different | |
;; types we pay a price in performance relative to the asm we'd write. | |
;; There are a number of different ways to speed this up: declaring a | |
;; greater speed to safety preference, annotating types. Here's an | |
;; optimized version with that sort of ugly cruft added to get really | |
;; good performance specifically for 32-bit integers. | |
(defun add (x y) | |
(declare | |
(type (unsigned-byte 32) x y) | |
(optimize (speed 3) (safety 0))) | |
(the (unsigned-byte 32) (+ x y))) | |
(disassemble #'add) | |
; disassembly for ADD | |
; Size: 12 bytes. Origin: #x1004CEFD82 | |
; 2: 4801F9 ADD RCX, RDI ; no-arg-parsing entry point | |
; 5: 488BD1 MOV RDX, RCX | |
; 8: 488BE5 MOV RSP, RBP | |
; B: F8 CLC | |
; C: 5D POP RBP | |
; D: C3 RET | |
;; ... I can definitely live with that. | |
;; We can write macros that produce code with these annotation to | |
;; compile our own high-level DSLs into the assembly we would have | |
;; written. Common Lisp has its warts, but it really is amazing when | |
;; it comes building whatever abstractions you prefer while retaining | |
;; control over low-level code generation. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
;; I'm not sure what Common Lisp you're using, ot why you'd want to | |
;; use a counter to repeatedly poke some data into a value cell, but | |
;; but with SBCL's compiler I get: | |
(defun test () | |
(let ((x 0)) | |
(declare | |
(optimize (speed 3) (safety 0))) | |
(type (unsigned-byte 32) x) | |
(loop for i to 999999999 do (incf x)))) | |
(disassemble #'test) | |
; disassembly for TEST | |
; Size: 58 bytes. Origin: #x1004AA1614 | |
; 14: 498B442460 MOV RAX, [R12+96] ; thread.binding-stack-pointer | |
; no-arg-parsing entry point | |
; 19: 488945F8 MOV [RBP-8], RAX | |
; 1D: 31D2 XOR EDX, EDX | |
; 1F: 31C0 XOR EAX, EAX | |
; 21: EB15 JMP L1 | |
; 23: 660F1F840000000000 NOP | |
; 2C: 0F1F4000 NOP | |
; 30: L0: 4883C202 ADD RDX, 2 | |
; 34: 4883C002 ADD RAX, 2 | |
; 38: L1: 483DFE933577 CMP RAX, 1999999998 | |
; 3E: 7EF0 JLE L0 | |
; 40: BA17001020 MOV EDX, 537919511 | |
; 45: 488BE5 MOV RSP, RBP | |
; 48: F8 CLC | |
; 49: 5D POP RBP | |
; 4A: C3 RET | |
; 4B: 0F0B10 BREAK 16 ; Invalid argument count trap | |
;; ... which runs in 0.000586 seconds of system time at 99.67% CPU utilization: | |
;; 811,693,353 processor cycles | |
;; 0 bytes consed | |
;; ... note that this last bit means no memory was allocated during the function call and loop. |
Julia
(using the LispREPL)
lisp> (defn test []
(do
(def x 0)
(for [i (range 1 1000000000)]
(def x (+ x 1)))
(@show x)))
test (generic function with 1 method)
lisp> (@time (test)) ; JIT warmup
x = 1000000000
0.008133 seconds (2.02 k allocations: 100.414 KB)
1000000000
lisp> (@time (test))
x = 1000000000
0.000078 seconds (25 allocations: 976 bytes)
1000000000
lisp> (@code_llvm (test))
define i64 @julia_test_21611() {
top:
%0 = alloca [6 x %jl_value_t*], align 8
%.sub = getelementptr inbounds [6 x %jl_value_t*]* %0, i64 0, i64 0
%1 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 2
%2 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 4
store %jl_value_t* inttoptr (i64 8 to %jl_value_t*), %jl_value_t** %.sub, align 8
%3 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 1
%4 = load %jl_value_t*** @jl_pgcstack, align 8
%.c = bitcast %jl_value_t** %4 to %jl_value_t*
store %jl_value_t* %.c, %jl_value_t** %3, align 8
store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8
store %jl_value_t* null, %jl_value_t** %1, align 8
%5 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 3
store %jl_value_t* null, %jl_value_t** %5, align 8
store %jl_value_t* null, %jl_value_t** %2, align 8
%6 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 5
store %jl_value_t* null, %jl_value_t** %6, align 8
%7 = load %jl_value_t** inttoptr (i64 140018150460440 to %jl_value_t**), align 8
store %jl_value_t* %7, %jl_value_t** %2, align 8
%8 = call %jl_value_t* @julia_getindex1788(%jl_value_t* inttoptr (i64 140018154010064 to %jl_value_t*), %jl_value_t** %2, i32 1)
store %jl_value_t* %8, %jl_value_t** %1, align 8
%9 = getelementptr inbounds %jl_value_t* %8, i64 1
%10 = bitcast %jl_value_t* %9 to i64*
%11 = load i64* %10, align 8
%12 = call %jl_value_t* @jl_gc_allocobj(i64 48)
%13 = getelementptr inbounds %jl_value_t* %12, i64 -1, i32 0
store %jl_value_t* inttoptr (i64 140018134741360 to %jl_value_t*), %jl_value_t** %13, align 8
%14 = getelementptr inbounds %jl_value_t* %12, i64 0, i32 0
store %jl_value_t* %8, %jl_value_t** %14, align 8
store %jl_value_t* %12, %jl_value_t** %2, align 8
%15 = load %jl_value_t** @jl_true, align 8
%16 = getelementptr %jl_value_t* %12, i64 1
%17 = bitcast %jl_value_t* %16 to i8*
%18 = bitcast %jl_value_t* %15 to i8*
%19 = load i8* %18, align 16
store i8 %19, i8* %17, align 8
%20 = load %jl_value_t** @jl_true, align 8
%21 = bitcast %jl_value_t* %12 to i8*
%22 = getelementptr i8* %21, i64 9
%23 = bitcast %jl_value_t* %20 to i8*
%24 = load i8* %23, align 16
store i8 %24, i8* %22, align 1
%25 = load %jl_value_t** @jl_true, align 8
%26 = getelementptr i8* %21, i64 10
%27 = bitcast %jl_value_t* %25 to i8*
%28 = load i8* %27, align 16
store i8 %28, i8* %26, align 2
%29 = load %jl_value_t** @jl_false, align 8
%30 = getelementptr i8* %21, i64 11
%31 = bitcast %jl_value_t* %29 to i8*
%32 = load i8* %31, align 16
store i8 %32, i8* %30, align 1
%33 = getelementptr inbounds %jl_value_t* %12, i64 2, i32 0
%.c4 = inttoptr i64 %11 to %jl_value_t*
store %jl_value_t* %.c4, %jl_value_t** %33, align 16
%34 = load i64* inttoptr (i64 140018177782208 to i64*), align 64
%35 = getelementptr inbounds %jl_value_t* %12, i64 3, i32 0
%.c5 = inttoptr i64 %34 to %jl_value_t*
store %jl_value_t* %.c5, %jl_value_t** %35, align 8
%36 = load i64* inttoptr (i64 140018132803712 to i64*), align 128
%37 = getelementptr inbounds %jl_value_t* %12, i64 4, i32 0
%.c6 = inttoptr i64 %36 to %jl_value_t*
store %jl_value_t* %.c6, %jl_value_t** %37, align 16
%38 = load i64* inttoptr (i64 140018132803616 to i64*), align 32
%39 = getelementptr inbounds %jl_value_t* %12, i64 5, i32 0
%.c7 = inttoptr i64 %38 to %jl_value_t*
store %jl_value_t* %.c7, %jl_value_t** %39, align 8
store %jl_value_t* %12, %jl_value_t** %5, align 8
%40 = call %jl_value_t* @julia_showall4784(%jl_value_t* %12, i64 1000000000)
store %jl_value_t* inttoptr (i64 140018208272512 to %jl_value_t*), %jl_value_t** %2, align 8
store %jl_value_t* %12, %jl_value_t** %6, align 8
%41 = call %jl_value_t* @julia_takebuf_string(%jl_value_t* inttoptr (i64 140018152780912 to %jl_value_t*), %jl_value_t** %6, i32 1)
store %jl_value_t* %41, %jl_value_t** %6, align 8
%42 = call %jl_value_t* @jl_apply_generic(%jl_value_t* inttoptr (i64 140018154603088 to %jl_value_t*), %jl_value_t** %2, i32 2)
%43 = load %jl_value_t** %3, align 8
%44 = getelementptr inbounds %jl_value_t* %43, i64 0, i32 0
store %jl_value_t** %44, %jl_value_t*** @jl_pgcstack, align 8
ret i64 1000000000
}
lisp> (@code_native (test))
.text
Filename: show.jl
Source line: 127
pushq %rbp
movq %rsp, %rbp
Source line: 127
pushq %r15
pushq %r14
pushq %r13
pushq %r12
pushq %rbx
subq $56, %rsp
movq $8, -88(%rbp)
movabsq $jl_pgcstack, %r12
movq (%r12), %rax
movq %rax, -80(%rbp)
leaq -88(%rbp), %rax
movq %rax, (%r12)
xorps %xmm0, %xmm0
movups %xmm0, -72(%rbp)
movups %xmm0, -56(%rbp)
movabsq $140018150460440, %rax # imm = 0x7F58841ECC18
movq (%rax), %rcx
Source line: 0
leaq -56(%rbp), %r14
Source line: 127
movabsq $140026750891312, %rax # imm = 0x7F5A84BEF530
movq %rcx, -56(%rbp)
movabsq $140018154010064, %rdi # imm = 0x7F588454F5D0
movq %r14, %rsi
movl $1, %edx
callq *%rax
movq %rax, %r15
movabsq $jl_gc_allocobj, %rax
movq %r15, -72(%rbp)
movq 8(%r15), %r13
movl $48, %edi
callq *%rax
movq %rax, %rbx
movabsq $jl_true, %rax
movq (%rax), %rdx
movabsq $140018134741360, %rcx # imm = 0x7F58832EF170
movabsq $jl_false, %rax
movq (%rax), %rsi
movabsq $140026752408992, %r8 # imm = 0x7F5A84D61DA0
movq %rcx, -8(%rbx)
movabsq $140018132803616, %rcx # imm = 0x7F5883116020
movq %r15, (%rbx)
movq %rbx, -56(%rbp)
movb (%rdx), %al
movb %al, 8(%rbx)
movb (%rdx), %al
movb %al, 9(%rbx)
movb (%rdx), %al
movabsq $140018132803712, %rdx # imm = 0x7F5883116080
movb %al, 10(%rbx)
movb (%rsi), %al
movabsq $140018177782208, %rsi # imm = 0x7F5885BFB1C0
movb %al, 11(%rbx)
movq %r13, 16(%rbx)
movq (%rsi), %rax
movq %rax, 24(%rbx)
movq (%rdx), %rax
movq %rax, 32(%rbx)
movq (%rcx), %rax
movq %rax, 40(%rbx)
movq %rbx, -64(%rbp)
movq %rbx, %rdi
movl $1000000000, %esi # imm = 0x3B9ACA00
callq *%r8
leaq -48(%rbp), %rsi
movabsq $140026751044992, %rax # imm = 0x7F5A84C14D80
movabsq $140018208272512, %rcx # imm = 0x7F588790F080
movq %rcx, -56(%rbp)
movq %rbx, -48(%rbp)
movabsq $140018152780912, %rdi # imm = 0x7F5884423470
movl $1, %edx
callq *%rax
movabsq $jl_apply_generic, %rcx
movq %rax, -48(%rbp)
movabsq $140018154603088, %rdi # imm = 0x7F58845E0250
movq %r14, %rsi
movl $2, %edx
callq *%rcx
movq -80(%rbp), %rax
movq %rax, (%r12)
movl $1000000000, %eax # imm = 0x3B9ACA00
addq $56, %rsp
popq %rbx
popq %r12
popq %r13
popq %r14
popq %r15
popq %rbp
ret
Common Lisp:
How would you do this efficiently?
[1]> (defun test ()
(setq x 0)
(loop for i to 999999999
do (setq x (+ x 1)))
(print x)))
TEST
[2]> (disassemble #'test)
WARNING: in TEST : X is neither declared nor bound,
it will be treated as if it were declared SPECIAL.
WARNING: in TEST : X is neither declared nor bound,
it will be treated as if it were declared SPECIAL.
WARNING: in TEST : X is neither declared nor bound,
it will be treated as if it were declared SPECIAL.
WARNING: in TEST : X is neither declared nor bound,
it will be treated as if it were declared SPECIAL.
Disassembly of function TEST
(CONST 0) = 0
(CONST 1) = X
(CONST 2) = 999999999
0 required arguments
0 optional arguments
Sin parámetro de resto
Sin parámetros clave
reads special variable: X
writes special variable: X
18 byte-code instructions:
0 (CONST 0) ; 0
1 (SETVALUE 1) ; X
3 (PUSH)
4 (JMP L14)
6 L6
6 (GETVALUE&PUSH 1) ; X
8 (CALLS2 177) ; 1+
10 (SETVALUE 1) ; X
12 (LOAD&INC&STORE 0)
14 L14
14 (LOAD&PUSH 0)
15 (CONST&PUSH 2) ; 999999999
16 (CALLSR&JMPIFNOT 1 50 L6) ; >
20 (SKIP 1)
22 (GETVALUE&PUSH 1) ; X
24 (PUSH-UNBOUND 1)
26 (CALLS1 142) ; PRINT
28 (SKIP&RET 1)
NIL
[3]> (time (test)) ; had to put it to "sleep" after 10 minutes :(
What am I doing wrong?
Edit:
I read more about it and used let:
[5]> (defun test ()
(let ((x 0))
(loop for i to 999999999
do (setf x (+ x 1)))
(print x)))
TEST
[6]> (disassemble #'test)
Disassembly of function TEST
(CONST 0) = 0
(CONST 1) = 999999999
0 required arguments
0 optional arguments
Sin parámetro de resto
Sin parámetros clave
15 byte-code instructions:
0 (CONST&PUSH 0) ; 0
1 (CONST&PUSH 0) ; 0
2 (JMP L8)
4 L4
4 (LOAD&INC&STORE 1)
6 (LOAD&INC&STORE 0)
8 L8
8 (LOAD&PUSH 0)
9 (CONST&PUSH 1) ; 999999999
10 (CALLSR&JMPIFNOT 1 50 L4) ; >
14 (SKIP 1)
16 (LOAD&PUSH 0)
17 (PUSH-UNBOUND 1)
19 (CALLS1 142) ; PRINT
21 (SKIP&RET 2)
NIL
[7]> (time (test))
1000000000
Real time: 766.20514 sec.
Run time: 766.13995 sec.
Space: 440 Bytes
1000000000
Python
(I tried to use the Hy lang) but I didn't understand how to use loop and recur even for something so trivial, well I have to admit I am hungry and sleepy)
In [66]: def test():
....: x = 0
....: for i in range(1000000000):
....: x += 1
....: print(x)
....:
In [67]: %timeit -n 1 -r 1 test()
1000000000
1 loop, best of 1: 1min 7s per loop
@jackrusher it doesn't work for me in neither lisp implementation:
Clisp
Break 1 [3]> (defun test ()
(let ((x 0))
(declare
(optimize (speed 3) (safety 0)))
(type (unsigned-byte 32) x)
(loop for i to 999999999 do (incf x))))
TEST
Break 1 [3]> (time (test))
*** - EVAL: undefined function TYPE
The following restarts are available:
USE-VALUE :R1 Input a value to be used instead of (FDEFINITION 'TYPE).
RETRY :R2 Retry
STORE-VALUE :R3 Input a new value for (FDEFINITION 'TYPE).
ABORT :R4 Abort debug loop
Break 2 [4]> (disassemble #'test)
WARNING: in TEST : Function TYPE is not defined
WARNING: in TEST : Function UNSIGNED-BYTE is not defined
Disassembly of function TEST
(CONST 0) = 0
(CONST 1) = 32
(CONST 2) = UNSIGNED-BYTE
(CONST 3) = TYPE
(CONST 4) = 999999999
0 required arguments
0 optional arguments
No rest parameter
No keyword parameters
16 byte-code instructions:
0 (CONST&PUSH 0) ; 0
1 (CONST&PUSH 1) ; 32
2 (CALL1&PUSH 2) ; UNSIGNED-BYTE
4 (LOAD&PUSH 1)
5 (CALL2 3) ; TYPE
7 (CONST&PUSH 0) ; 0
8 (JMP L14)
10 L10
10 (LOAD&INC&STORE 1)
12 (LOAD&INC&STORE 0)
14 L14
14 (LOAD&PUSH 0)
15 (CONST&PUSH 4) ; 999999999
16 (CALLSR&JMPIFNOT 1 50 L10) ; >
20 (NIL)
21 (SKIP&RET
CBSL
* (defun test ()
(let ((x 0))
(declare
(optimize (speed 3) (safety 0)))
(type (unsigned-byte 32) x)
(loop for i to 999999999 do (incf x))))
; in: DEFUN TEST
; (TYPE (UNSIGNED-BYTE 32) X)
;
; caught WARNING:
; The function TYPE is undefined, and its name is reserved by ANSI CL so that
; even if it were defined later, the code doing so would not be portable.
; (UNSIGNED-BYTE 32)
;
; caught WARNING:
; The function UNSIGNED-BYTE is undefined, and its name is reserved by ANSI CL so
; that even if it were defined later, the code doing so would not be portable.
;
; compilation unit finished
; Undefined functions:
; TYPE UNSIGNED-BYTE
; caught 2 WARNING conditions
TEST
* (time (test))
debugger invoked on a UNDEFINED-FUNCTION in thread
#<THREAD "main thread" RUNNING {1002AAC483}>:
The function COMMON-LISP:UNSIGNED-BYTE is undefined.
Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.
restarts (invokable by number or by possibly-abbreviated name):
0: [ABORT] Exit debugger, returning to top level.
("undefined function")
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
@jackrusher how about Julia with some LispSyntax (work in progress)? 😄
Note: Julia uses machine integer arithmetic by default with no automatic promotion to arbitrary precision numbers, but that may change in the future: