Monthly Archives: July 2015

Read a file in chunks in Python

This article demonstrates how to read a file in chunks rather than all at once.

This is useful in a number of cases, such as chunked uploading or encryption, or when the file you want to work with is larger than your machine's memory.

# chunked file reading
from __future__ import division
import os

def get_chunks(file_size):
    # yield (chunk_start, chunk_size) pairs that together cover the whole file
    chunk_start = 0
    chunk_size = 0x20000  # 131072 bytes, default max ssl buffer size
    while chunk_start + chunk_size < file_size:
        yield chunk_start, chunk_size
        chunk_start += chunk_size

    # whatever is left over is the final, usually smaller, chunk
    final_chunk_size = file_size - chunk_start
    yield chunk_start, final_chunk_size

def read_file_chunked(file_path):
    # open in binary mode so that chunk sizes are byte counts
    with open(file_path, 'rb') as file_:
        file_size = os.path.getsize(file_path)

        print('File size: {}'.format(file_size))

        progress = 0

        for chunk_start, chunk_size in get_chunks(file_size):

            file_chunk = file_.read(chunk_size)

            # do something with the chunk, encrypt it, write to another file...

            progress += len(file_chunk)
            # avoid dividing by zero when the file is empty
            percent = int(progress / file_size * 100) if file_size else 100
            print('{0} of {1} bytes read ({2}%)'.format(
                progress, file_size, percent)
            )

if __name__ == '__main__':
    read_file_chunked('some-file.gif')

Also available as a Gist (https://gist.github.com/richardasaurus/21d4b970a202d2fffa9c)

The above will output:

File size: 698837
131072 of 698837 bytes read (18%)
262144 of 698837 bytes read (37%)
393216 of 698837 bytes read (56%)
524288 of 698837 bytes read (75%)
655360 of 698837 bytes read (93%)
698837 of 698837 bytes read (100%)

Hopefully handy to someone. This of course isn't the only way; you could also use the file object's `file.seek` method to jump straight to a given chunk.
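As a rough sketch of the `file.seek` approach, the helper below reads a single chunk from an arbitrary byte offset. The name `read_chunk_at` is made up for this example and is not part of the Gist; it assumes the file is opened in binary mode so that offsets are byte positions.

```python
def read_chunk_at(file_path, chunk_start, chunk_size):
    """Return up to chunk_size bytes starting at byte offset chunk_start."""
    # binary mode, so seek() offsets and read() sizes are in bytes
    with open(file_path, 'rb') as file_:
        file_.seek(chunk_start)
        # read() returns fewer bytes (or b'') if the file ends early
        return file_.read(chunk_size)
```

Paired with `get_chunks` from above, something like `read_chunk_at(path, chunk_start, chunk_size)` would let you fetch chunks in any order, for example to retry one failed upload part, at the cost of reopening the file on each call.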

Getting console.log errors with Selenium, PhantomJS in Python

I had some functional tests passing on my workstation, but when pushed to the CI environment they failed with an “ElementNotVisibleException”, because the scripts that created the element weren’t doing their job.

I wanted to view the browser's console.log output to get some clues about what went wrong on the front end.

The Selenium docs say to use:

driver.get_log('browser')

But in my case that returned an empty list, which wasn't very useful.
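When one log type comes back empty, it can help to ask the driver what else it offers. The sketch below assumes the `log_types` property from the Selenium Python bindings is available on your driver; I haven't checked which drivers and versions support it, and `dump_all_logs` is just a name made up for this example.

```python
def dump_all_logs(driver):
    """Return a dict mapping each advertised log type to its entries."""
    logs = {}
    # log_types is assumed to list the names accepted by get_log()
    for log_type in driver.log_types:
        logs[log_type] = driver.get_log(log_type)
    return logs
```

Calling `dump_all_logs(driver)` after the page has loaded and printing the keys should show which log types are worth a closer look.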

I’ve found that if you use a log type of “har”, which isn’t in the docs:

driver.get_log('har')

It will return a bunch of information including, if you look carefully, some “NOT FOUND” errors for requests triggered by JavaScript code.

[{
    'timestamp': 1436900661766,
    'message': '{"log":{"version":"1.2","creator":{"name":"PhantomJS","version":"2.0.0"},"pages":[{"startedDateTime":"2015-07-14T19:01:40.795Z","id":"http://myapp.domain.com:8081/people/1","title":"John Smith - MyAppName","pageTimings":{"onLoad":2900}}],"entries":[{"startedDateTime":"2015-07-14T19:03:22.559Z","time":148,"request":{"method":"GET","url":"http://myapp.domain.com:8081/people/1","httpVersion":"HTTP/1.1","cookies":[],"headers":[{"name":"Accept","value":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},{"name":"Cache-Control","value":"max-age=0"},{"name":"User-Agent","value":"Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 Safari/538.1"}],"queryString":[],"headersSize":-1,"bodySize":-1},"response":{"status":200,"statusText":"OK","httpVersion":"HTTP/1.1","cookies":[],"headers":[{"name":"Date","value":"Tue, 14 Jul 2015 19:03:22 GMT"},{"name":"Server","value":"WSGIServer/0.1 Python/2.7.6"},{"name":"Vary","value":"Cookie"},{"name":"X-Frame-Options","value":"SAMEORIGIN"},{"name":"Content-Type","value":"text/html; charset=utf-8"},{"name":"Set-Cookie","value":"csrftoken=wFzWPTm9aVkGLtPuOCcc1tIs6ve5KosW; expires=Tue, 12-Jul-2016 19:03:22 GMT; Max-Age=31449600; Path=/"}],"redirectURL":"","headersSize":-1,"bodySize":5776,"content":{"size":5776,"mimeType":"text/html; charset=utf-8"}},"cache":{},"timings":{"blocked":0,"dns":-1,"connect":-1,"send":0,"wait":140,"receive":8,"ssl":-1},"pageref":"http://myapp.domain.com:8081/people/1"},{"startedDateTime":"2015-07-14T19:03:22.705Z","time":7,"request":{"method":"GET","url":"http://myapp.domain.com:8081/static/js/lib/bower_components/requirejs/require.js","httpVersion":"HTTP/1.1","cookies":[],"headers":[{"name":"Accept","value":"*/*"},{"name":"Referer","value":"http://myapp.domain.com:8081/people/1"},{"name":"User-Agent","value":"Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 
Safari/538.1"}],"queryString":[],"headersSize":-1,"bodySize":-1},"response":{"status":null,"statusText":"Error downloading http://myapp.domain.com:8081/static/js/lib/bower_components/requirejs/require.js - server replied: NOT FOUND","httpVersion":"HTTP/1.1","cookies":[],"headers":[{"name":"Date","value":"Tue, 14 Jul 2015 19:03:22 GMT"},{"name":"Server","value":"WSGIServer/0.1 Python/2.7.6"},{"name":"X-Frame-Options","value":"SAMEORIGIN"},{"name":"Content-Type","value":"text/html"}],"redirectURL":"","headersSize":-1,"bodySize":125,"content":{"size":125,"mimeType":"text/html"}},"cache":{},"timings":{"blocked":0,"dns":-1,"connect":-1,"send":0,"wait":6,"receive":1,"ssl":-1},"pageref":"http://myapp.domain.com:8081/people/1"},{"startedDateTime":"2015-07-14T19:03:22.706Z","time":15,"request":{"method":"GET","url":"http://myapp.domain.com:8081/static/js/lib/bower_components/jquery-ui/themes/ui-lightness/jquery.ui.theme.css","httpVersion":"HTTP/1.1","cookies":[],"headers":[{"name":"Accept","value":"text/css,*/*;q=0.1"},{"name":"Referer","value":"http://myapp.domain.com:8081/people/1"},{"name":"User-Agent","value":"Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 Safari/538.1"}],"queryString":[],"headersSize":-1,"bodySize":-1},"response":{"status":null,"statusText":"Error downloading http://myapp.domain.com:8081/static/js/lib/bower_components/jquery-ui/themes/ui-lightness/jquery.ui.theme.css - server replied: NOT FOUND","httpVersion":"HTTP/1.1","cookies":[],"headers":[{"name":"Date","value":"Tue, 14 Jul 2015 19:03:22 GMT"},{"name":"Server","value":"WSGIServer/0.1 Python/2.7.6"},{"name":"X-Frame-Options","value":"SAMEORIGIN"},{"name":"Content-Type","value":"text/html"}],"redirectURL":"","headersSize":-1,"bodySize":154,"content":{"size":154,"mimeType":"text/html"}},"cache":{},"timings":{"blocked":0,"dns":-1,"connect":-1,"send":0,"wait":15,"receive":0,"ssl":-1},"pageref":"http://myapp.domain.com:8081/people/1"}]}}',
    'level': 'INFO'
}]

With that I found that the JS assets weren’t being compiled in my CI environment, and I was able to go ahead and fix it :)
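Reading that output by eye is tedious, so here is a minimal sketch that pulls out the requests that look like failures. It assumes the shape shown above: a list of dicts whose 'message' is a JSON string containing log.entries, where failed downloads have a null status. Treating status codes of 400 and above as failures too is my own addition, and `failed_requests` is a made-up name.

```python
import json

def failed_requests(log_entries):
    """Return (url, status, status_text) tuples for failed HAR entries."""
    failures = []
    for log_entry in log_entries:
        # each 'message' is itself a JSON document in HAR format
        har = json.loads(log_entry['message'])
        for entry in har['log']['entries']:
            response = entry['response']
            status = response['status']
            # a null status is what the failed downloads above report
            if status is None or status >= 400:
                failures.append(
                    (entry['request']['url'], status, response['statusText'])
                )
    return failures
```

With a live driver this would be used as `failed_requests(driver.get_log('har'))`, for instance in a test teardown so that missing assets show up in the CI output.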

Hopefully that’s useful to someone out there.

Why is Programming Fun?

An extract from Fred Brooks’ (Frederick P. Brooks, Jr.) book, The Mythical Man-Month.

Why is programming fun? What delights may its practitioner expect as his reward?

First is the sheer joy of making things. As the child delights in his mud pie, so the adult enjoys building things, especially things of his own design. I think this delight must be an image of God’s delight in making things, a delight shown in the distinctness and newness of each leaf and each snowflake.

Second is the pleasure of making things that are useful to other people. Deep within, we want others to use our work and to find it helpful. In this respect the programming system is not essentially different from the child’s first clay pencil holder “for Daddy’s office.”

Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts and watching them work in subtle cycles, playing out the consequences of principles built in from the beginning. The programmed computer has all the fascination of the pinball machine or the jukebox mechanism, carried to the ultimate.

Fourth is the joy of always learning, which springs from the nonrepeating nature of the task. In one way or another the problem is ever new, and its solver learns something: sometimes practical, sometimes theoretical, and sometimes both.

Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures. (…)
Yet the program construct, unlike the poet’s words, is real in the sense that it moves and works, producing visible outputs separately from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.

Programming then is fun because it gratifies creative longings built deep within us and delights sensibilities we have in common with all men.