@ghtdak
Last active August 29, 2015 14:06

Revisions

  1. ghtdak revised this gist Sep 17, 2014. 1 changed file with 0 additions and 7 deletions.
    7 changes: 0 additions & 7 deletions about.txt
    Original file line number Diff line number Diff line change
    @@ -1,7 +0,0 @@

    Some info from: http://stackoverflow.com/a/19761645/4022960, though that answer has a bug: it uses '' instead of None.

    I found that running the Python script directly as a filter was too slow, because Git tools such as tig and
    git diff invoke it many times.

    Launching a tiny Python daemon and using Netcat to pipe the filter's input and output through it works well.
  2. ghtdak revised this gist Sep 17, 2014. 5 changed files with 33 additions and 10 deletions.
    7 changes: 7 additions & 0 deletions about.txt
    @@ -0,0 +1,7 @@

    Some info from: http://stackoverflow.com/a/19761645/4022960, though that answer has a bug: it uses '' instead of None.

    I found that running the Python script directly as a filter was too slow, because Git tools such as tig and
    git diff invoke it many times.

    Launching a tiny Python daemon and using Netcat to pipe the filter's input and output through it works well.
    3 changes: 3 additions & 0 deletions attributes
    @@ -0,0 +1,3 @@
    # This goes in .git/info

    *.ipynb filter=dropoutput_ipynb
    12 changes: 12 additions & 0 deletions initalize.sh
    @@ -0,0 +1,12 @@
    #!/bin/bash

    # Tell the repository to use the strip filter

    git config filter.dropoutput_ipynb.clean /home/change_this_path/bin/stripfilter.sh
    git config filter.dropoutput_ipynb.smudge cat

    # Your .git/config will have the following lines added to it
    #
    # [filter "dropoutput_ipynb"]
    # clean = /home/change_this_path/bin/stripfilter.sh
    # smudge = cat
    20 changes: 10 additions & 10 deletions nbstripdaemon.py
    @@ -1,9 +1,12 @@
    #!/usr/bin/python

    import socket
    import itertools as it
    from IPython.nbformat.current import reads, writes

    TCP_IP = '127.0.0.1'
    TCP_PORT = 5005
    BUFFER_SIZE = 1024

    def processIPython(strin):
        json_in = reads(strin, 'json')

    @@ -17,29 +20,26 @@ def processIPython(strin):
        data = writes(json_in, 'json')
        return data

    TCP_IP = '127.0.0.1'
    TCP_PORT = 5005
    BUFFER_SIZE = 1024 # Normally 1024, but we want fast response

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((TCP_IP, TCP_PORT))
    s.listen(1)

    while True:
        try:
            conn, addr = s.accept()
            print 'Connection address:', addr
            ghtBuffer = []
            #print 'Connection address:', addr
            buffer = []
            while 1:
                data = conn.recv(BUFFER_SIZE)
                if not data: break
                ghtBuffer.append(data)
                buffer.append(data)

            data = "".join(ghtBuffer)
            data = "".join(buffer)

            data = processIPython(data)

            conn.send(data) # echo
            conn.close()
        except:
            print "weird exception"
            print "weird exception, handle to taste"

    1 change: 1 addition & 0 deletions stripfilter.sh
    @@ -1,3 +1,4 @@
    #!/bin/bash
    # This is my all time favorite script

    nc 127.0.0.1 5005
  3. ghtdak revised this gist Sep 17, 2014. 3 changed files with 48 additions and 7 deletions.
    7 changes: 0 additions & 7 deletions Git for IPython Notebooks
    @@ -1,7 +0,0 @@
    Managing IPython Notebooks under Git is problematic because the output (derived data) is stored in the notebook (*.ipynb) file. This bloats the repository and makes it very difficult to use Git as anything more than blob storage.

    Stripping the output from the notebook file itself is straightforward, but it gives up the advantages of keeping output in the notebook.

    Fortunately, Git has filters (the clean and smudge filters set through gitattributes) which can be used to "do the right thing" most of the time.

    The best discussion of how to do this right is http://stackoverflow.com/a/20844506/4022960.
    45 changes: 45 additions & 0 deletions nbstripdaemon.py
    @@ -0,0 +1,45 @@
    #!/usr/bin/python

    import socket
    import itertools as it
    from IPython.nbformat.current import reads, writes

    def processIPython(strin):
        json_in = reads(strin, 'json')

        for sheet in json_in.worksheets:
            for cell in sheet.cells:
                if "outputs" in cell:
                    cell.outputs = []
                if "prompt_number" in cell:
                    cell.prompt_number = None

        data = writes(json_in, 'json')
        return data

    TCP_IP = '127.0.0.1'
    TCP_PORT = 5005
    BUFFER_SIZE = 1024 # Normally 1024, but we want fast response

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((TCP_IP, TCP_PORT))
    s.listen(1)

    while True:
        try:
            conn, addr = s.accept()
            print 'Connection address:', addr
            ghtBuffer = []
            while 1:
                data = conn.recv(BUFFER_SIZE)
                if not data: break
                ghtBuffer.append(data)

            data = "".join(ghtBuffer)

            data = processIPython(data)

            conn.send(data) # echo
            conn.close()
        except:
            print "weird exception"
    3 changes: 3 additions & 0 deletions stripfilter.sh
    @@ -0,0 +1,3 @@
    #!/bin/bash
    # This is my all time favorite script
    nc 127.0.0.1 5005
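
The daemon in nbstripdaemon.py reads until recv() returns nothing, so the client has to close the sending half of the connection while still reading the reply. The sketch below walks through that round trip in Python 3. It is not the gist's code: serve_once, filter_through, and demo are hypothetical names, the transform is a stand-in for processIPython, and port 0 (any free port) replaces the gist's 5005. It uses sendall() rather than send(), since send() may transmit only part of a large notebook.

```python
import socket
import threading


def serve_once(server_sock, transform):
    """Accept one connection, read until the peer half-closes, then reply."""
    conn, _addr = server_sock.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:  # empty read: the client shut down its write side
                break
            chunks.append(data)
        conn.sendall(transform(b"".join(chunks)))


def filter_through(address, payload):
    """Client side of the filter: send everything, half-close, read the reply."""
    with socket.create_connection(address) as sock:
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)  # signal end of input, keep reading
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Run one request through a throwaway server on a free local port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # assumption: port 0 instead of the gist's 5005
    server.listen(1)
    # bytes.upper stands in for the real notebook-stripping function.
    worker = threading.Thread(target=serve_once, args=(server, bytes.upper))
    worker.start()
    try:
        return filter_through(server.getsockname(), b"notebook bytes")
    finally:
        worker.join()
        server.close()
```

Netcat variants differ in how they treat end of input, so if the filter appears to hang, the half-close step shown in filter_through is the first thing to check.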
  4. ghtdak renamed this gist Sep 17, 2014. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  5. ghtdak created this gist Sep 17, 2014.
    7 changes: 7 additions & 0 deletions README
    @@ -0,0 +1,7 @@
    Managing IPython Notebooks under Git is problematic because the output (derived data) is stored in the notebook (*.ipynb) file. This bloats the repository and makes it very difficult to use Git as anything more than blob storage.

    Stripping the output from the notebook file itself is straightforward, but it gives up the advantages of keeping output in the notebook.

    Fortunately, Git has filters (the clean and smudge filters set through gitattributes) which can be used to "do the right thing" most of the time.

    The best discussion of how to do this right is http://stackoverflow.com/a/20844506/4022960.
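
As a rough illustration of the stripping step described above, here is a minimal sketch that uses only the standard json module instead of IPython.nbformat. It assumes the older notebook layout used elsewhere in this gist, where cells live under "worksheets" and code cells carry "outputs" and "prompt_number"; strip_outputs is a hypothetical name, and the output formatting (indent, key order) is an arbitrary choice rather than what nbformat's writes() produces.

```python
import json


def strip_outputs(notebook_text):
    """Return notebook JSON text with outputs and prompt numbers cleared."""
    nb = json.loads(notebook_text)
    # Assumption: cells are nested under "worksheets", as in the gist's code.
    for sheet in nb.get("worksheets", []):
        for cell in sheet.get("cells", []):
            if "outputs" in cell:
                cell["outputs"] = []
            if "prompt_number" in cell:
                cell["prompt_number"] = None  # None, not '', so it serializes as null
    return json.dumps(nb, indent=1, sort_keys=True)
```

Because the function takes and returns text, it could be dropped into a daemon or called directly from a clean filter that reads standard input and writes standard output.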