Category Archives: Development

How to use ipdb with docker-compose

Sometimes there may be an issue you need to debug which only occurs within a Docker container. However, by default ipdb.set_trace() won’t work. Here’s how to get it working.

Enable interactive mode by adding stdin_open and tty to your docker-compose.yml. For example:

version: "3"
services:
  app_tests:
    build: .
    stdin_open: true
    tty: true
    command: ./run_my_tests.sh

Now when you run your tests (docker-compose run app_tests) the terminal will stop at your breakpoint.

Attach your local terminal to the container. In another terminal window run docker attach <container_id>, which will bring up the ipdb interactive session in your terminal.
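The breakpoint itself can be guarded so that it only fires when the container really has an interactive terminal. The following is a minimal sketch under that assumption; stdin_is_interactive and example_test_body are hypothetical names for illustration, not part of ipdb or docker-compose.

```python
import sys


def stdin_is_interactive(stream=None):
    """Return True when the given stream (default: sys.stdin) is a TTY."""
    stream = sys.stdin if stream is None else stream
    try:
        return bool(stream.isatty())
    except (AttributeError, ValueError):
        # No isatty() method, or the stream has already been closed
        return False


def example_test_body():
    value = 1 + 1
    if stdin_is_interactive():
        # Only drop into the debugger when stdin_open and tty are in effect
        import ipdb
        ipdb.set_trace()
    assert value == 2
```

Without the guard, a run that has no TTY (a CI job, for instance) would hit ipdb.set_trace() with nothing to read from.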

How to use localstack with GitLab CI

If you’re using a standard-style .gitlab-ci.yml such as the one below, it won’t work: the job cannot reach the localstack service.

image: ubuntu

services:
  - localstack/localstack:latest

variables:
  SERVICES: s3
  DEFAULT_REGION: eu-west-1
  AWS_ACCESS_KEY_ID: localkey
  AWS_SECRET_ACCESS_KEY: localsecret
  HOSTNAME_EXTERNAL: localstack
  HOSTNAME: localstack
  S3_PORT_EXTERNAL: 4572
  LOCALSTACK_HOSTNAME: localstack

test:python36:
  script:
    - pip install awscli
    - aws s3api list-buckets --endpoint-url=http://localstack:4572

Could not connect to the endpoint URL: "http://localstack:4572/"
ERROR: Job failed: exit code 1

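However, if you use build stages instead, as below, it will work. The likely reason is the service alias: without one, GitLab derives the service hostname from the image name (localstack__localstack or localstack-localstack), so the hostname localstack does not resolve.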

stages:
  - test

test-application:
  stage: test
  image: ubuntu
  variables:
    SERVICES: s3:4572
    HOSTNAME_EXTERNAL: localstack 
    DEFAULT_REGION: eu-west-1
    AWS_ACCESS_KEY_ID: localkey
    AWS_SECRET_ACCESS_KEY: localsecret
  services:
    - name: localstack/localstack
      alias: localstack
  script:
    - pip install awscli
    - aws s3api list-buckets --endpoint-url=http://localstack:4572

{
    "Buckets": [],
    "Owner": {
        "DisplayName": "webfile",
        "ID": "bcaf1ffd86f41161ca5fb16fd081034f"
    }
}
Job succeeded
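The same check can be made from Python test code rather than the awscli. This is a rough sketch, assuming boto3 is installed in the job image; LOCALSTACK_ENDPOINT is a hypothetical override variable, and the other variable names and defaults are taken from the job above.

```python
import os


def s3_client_kwargs(environ=None):
    """Build keyword arguments for boto3.client('s3', ...) from the job variables."""
    environ = os.environ if environ is None else environ
    return {
        # LOCALSTACK_ENDPOINT is a hypothetical override; the default matches the job above
        "endpoint_url": environ.get("LOCALSTACK_ENDPOINT", "http://localstack:4572"),
        "region_name": environ.get("DEFAULT_REGION", "eu-west-1"),
        "aws_access_key_id": environ.get("AWS_ACCESS_KEY_ID", "localkey"),
        "aws_secret_access_key": environ.get("AWS_SECRET_ACCESS_KEY", "localsecret"),
    }


def list_bucket_names():
    """List bucket names through localstack; requires boto3 to be installed."""
    import boto3  # imported lazily so s3_client_kwargs works without boto3

    client = boto3.client("s3", **s3_client_kwargs())
    return [bucket["Name"] for bucket in client.list_buckets()["Buckets"]]
```

Against a freshly started localstack, list_bucket_names() should return an empty list, mirroring the empty "Buckets" array above.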

How to migrate from multi-version Python Travis-CI builds to GitLab CI

With Travis-CI you can set up a CI build to run against multiple Python versions fairly easily.

.travis.yml

sudo: false
language: python
python:
    - 2.7
    - 3.6
env:
  - TOXENV=py-normal
install: pip install tox
script: tox

tox.ini

[tox]
envlist = py{27,36}-normal

[testenv]
commands =
    pytest

deps =
    -rtest_requirements.txt

You can achieve something similar with GitLab CI through the following .gitlab-ci.yml configuration. Your tox.ini can remain the same.

before_script:
  # Install pyenv
  - apt-get update
  - apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev
  - git clone https://github.com/pyenv/pyenv.git ~/.pyenv
  - export PYENV_ROOT="$HOME/.pyenv"
  - export PATH="$PYENV_ROOT/bin:$PATH"
  - eval "$(pyenv init -)"
  # Install tox
  - pip install tox

test:python27:
  script:
  - pyenv install 2.7.14
  - pyenv shell 2.7.14
  - tox -e py27-normal

test:python36:
  script:
  - pyenv install 3.6.4
  - pyenv shell 3.6.4
  - tox -e py36-normal

The only downside with this is the extra time it takes to install pyenv and the interpreter of choice in every job. A small price to pay to free your project from GitHub ;)

Read a file in chunks in Python

This article demonstrates how to read a file in chunks rather than all at once.

This is useful in a number of cases, such as chunked uploading or encryption, or when the file you want to work with is larger than your machine’s memory.

# chunked file reading
from __future__ import division
import os

def get_chunks(file_size):
    """Yield (chunk_start, chunk_size) pairs covering file_size bytes."""
    chunk_start = 0
    chunk_size = 0x20000  # 131072 bytes, default max ssl buffer size
    while chunk_start + chunk_size < file_size:
        yield chunk_start, chunk_size
        chunk_start += chunk_size

    final_chunk_size = file_size - chunk_start
    yield chunk_start, final_chunk_size

def read_file_chunked(file_path):
    # open in binary mode so read() counts bytes and never tries to decode them
    with open(file_path, 'rb') as file_:
        file_size = os.path.getsize(file_path)

        print('File size: {}'.format(file_size))

        progress = 0

        for chunk_start, chunk_size in get_chunks(file_size):
            # reads are sequential, so chunk_start is informational only here
            file_chunk = file_.read(chunk_size)

            # do something with the chunk, encrypt it, write to another file...

            progress += len(file_chunk)
            # guard against division by zero for an empty file
            percent = int(progress / file_size * 100) if file_size else 100
            print('{0} of {1} bytes read ({2}%)'.format(
                progress, file_size, percent))

if __name__ == '__main__':
    read_file_chunked('some-file.gif')

Also available as a Gist (https://gist.github.com/richardasaurus/21d4b970a202d2fffa9c)

The above will output:

File size: 698837
131072 of 698837 bytes read (18%)
262144 of 698837 bytes read (37%)
393216 of 698837 bytes read (56%)
524288 of 698837 bytes read (75%)
655360 of 698837 bytes read (93%)
698837 of 698837 bytes read (100%)

Hopefully this is handy to someone. It isn't the only way, of course: you could also use `file.seek` in the standard library to target chunks.
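A minimal sketch of the `seek` approach follows. The helper names iter_chunk_offsets and read_chunk are hypothetical, and the chunk size is the same 0x20000 used above. Because each chunk is addressed by its offset, chunks can be read in any order or retried individually.

```python
def iter_chunk_offsets(file_size, chunk_size=0x20000):
    """Yield (chunk_start, chunk_size) pairs covering file_size bytes."""
    for chunk_start in range(0, file_size, chunk_size):
        # the last chunk may be shorter than chunk_size
        yield chunk_start, min(chunk_size, file_size - chunk_start)


def read_chunk(file_path, chunk_start, chunk_size):
    """Read one chunk by seeking straight to its offset."""
    with open(file_path, 'rb') as file_:
        file_.seek(chunk_start)
        return file_.read(chunk_size)
```

For example, read_chunk('some-file.gif', 262144, 131072) would return the third chunk without reading the first two.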

Why is Programming Fun?

An extract from Fred Brooks’ (Frederick P. Brooks, Jr.) book, The Mythical Man-Month.

Why is programming fun? What delights may its practitioner expect as his reward?

First is the sheer joy of making things. As the child delights in his mud pie, so the adult enjoys building things, especially things of his own design. I think this delight must be an image of God’s delight in making things, a delight shown in the distinctness and newness of each leaf and each snowflake.

Second is the pleasure of making things that are useful to other people. Deep within, we want others to use our work and to find it helpful. In this respect the programming system is not essentially different from the child’s first clay pencil holder “for Daddy’s office.”

Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts and watching them work in subtle cycles, playing out the consequences of principles built in from the beginning. The programmed computer has all the fascination of the pinball machine or the jukebox mechanism, carried to the ultimate.

Fourth is the joy of always learning, which springs from the nonrepeating nature of the task. In one way or another the problem is ever new, and its solver learns something: sometimes practical, sometimes theoretical, and sometimes both.

Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures. (…)
Yet the program construct, unlike the poet’s words, is real in the sense that it moves and works, producing visible outputs separately from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.

Programming then is fun because it gratifies creative longings built deep within us and delights sensibilities we have in common with all men.

Concurrent Jenkins builds of a Django application

If you try to run multiple concurrent builds of a single Django project on Jenkins out of the box, you might be met with a message similar to:

Got an error creating the test database: database "test_projectdb" already exists

Got an error recreating the test database: database "test_projectdb" is being accessed by other users
DETAIL:  There is 1 other session using the database.

To fix this you need to edit the ‘DATABASES’ dictionary within your Django project settings module, adding another key, ‘TEST_NAME’.

TEST_NAME is the name of the test database Django will create when running your tests with manage.py.

We can make this name unique per build by adding the following function to our Django settings module:

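import hashlib
import os

def get_test_db_name():
    # BUILD_TAG is set by Jenkins; encode it because md5 needs bytes on Python 3
    build_tag = os.environ.get('BUILD_TAG', 'no-tag')
    return hashlib.md5(build_tag.encode('utf-8')).hexdigest()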

(This takes the BUILD_TAG environment variable, which Jenkins sets to a unique value for each build, and returns its MD5 hex digest.)

Then call it within the DATABASES dictionary:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'projectdb',
        'USER': 'django',
        'PASSWORD': 'django',
        'HOST': '127.0.0.1',
        'PORT': '',
        'TEST_NAME': get_test_db_name(),
    }
}

That’s it. Jenkins should now work fine with concurrent builds of your application.
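On Django 1.7 and later, the per-database test settings live in a nested TEST dictionary, and the flat TEST_NAME key was later removed. A sketch of the same idea in that layout is shown here. The 'test_' prefix is an addition for readability rather than part of the original function, and the connection values are copied from the example above.

```python
import hashlib
import os


def get_test_db_name():
    # BUILD_TAG is set by Jenkins; fall back to a fixed value for local runs
    build_tag = os.environ.get('BUILD_TAG', 'no-tag')
    # 'test_' prefix is an assumption, added so the database is easy to spot
    return 'test_' + hashlib.md5(build_tag.encode('utf-8')).hexdigest()


DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',  # kept from the example above
        'NAME': 'projectdb',
        'USER': 'django',
        'PASSWORD': 'django',
        'HOST': '127.0.0.1',
        'PORT': '',
        'TEST': {
            'NAME': get_test_db_name(),
        },
    }
}
```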

 

Separation of logic in Django Projects

Currently I work mostly on a large Django code-base 4+ years old in which business logic is tangled throughout the three MVC components.

With such a large framework around them, developers often forget how to write the well-organized Python business logic they would certainly produce if the framework were absent.

I have worked on projects with these symptoms, as well as on older code bases before my Django days, and the situation isn’t as bad as in some very old PHP projects I’ve seen. Anyway, here are some tips which can be applied to Django applications to aid overall organization.

Keep only database code within the models module

You’ll often see lengthy ORM queries randomly plastered throughout an application. Keeping these within the model class makes everything more maintainable.

  • Place ORM queries within your models module
  • Wrap up the queries you need using these features
  • Keep business logic out of here

Create modules for business logic

As you would with a regular Python program, create your own modules outside of the Django component structure.
Make your logic functions responsible only for logic, so that they do not care about the presentation or data layer (use dependency injection).
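As a rough illustration of dependency injection here, the logic function below receives its data source as an argument rather than importing the ORM itself. All names (calculate_order_total, fetch_line_items, the tax rate) are hypothetical, and amounts are integers in pence to keep the arithmetic exact.

```python
def calculate_order_total(order_id, fetch_line_items, tax_percent=20):
    """Return the order total in pence, including tax.

    fetch_line_items is injected: any callable that takes an order id and
    returns (unit_price_pence, quantity) pairs. In a Django project it could
    wrap a query kept in the models module; in a test it can be a plain function.
    """
    line_items = fetch_line_items(order_id)
    subtotal = sum(price * quantity for price, quantity in line_items)
    tax = subtotal * tax_percent // 100
    return subtotal + tax


def fake_fetch_line_items(order_id):
    # Stand-in data source for tests; no database required
    return [(500, 2), (250, 1)]
```

Calling calculate_order_total(1, fake_fetch_line_items) exercises the logic without touching the database, which is what makes this kind of module easy to test.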

Views should be light

Your views should only be used to glue things together, linking requests to forms, forms to your business logic and outputs to templates.
The view then becomes a simple description of how a feature is coupled.

Override forms for validation

  • Override the Django form methods to add any complex custom validation.

Keeping all of your validation code inside the forms module means errors can always be tied back to the individual inputs.

A result of the above rules is increased testability and easier adaptability, and of course it’s in keeping with the separation of concerns design principle.

Fix for: character of encoding “UTF8” has no equivalent in “LATIN1”, Ubuntu & Vagrant

This post is mostly a note to myself in case I run into this problem again.

Related to: “DETAIL: The chosen LC_CTYPE setting requires encoding LATIN1.”

The solution I found is a bit of a hack. Really you want to find out why postgres created its databases in LATIN1 encoding and fix that before installing postgres.

This script however will recreate them correctly so you can get on with some work. Run it before creating your application database(s).

#!/usr/bin/env bash
# This script changes postgres from LATIN1 to UTF8
pg_dumpall > /tmp/postgres.sql
pg_dropcluster --stop 9.1 main
pg_createcluster --locale en_US.UTF-8 --start 9.1 main
psql -f /tmp/postgres.sql

Navigation active state in Django

Here’s a clean way to display a navigation menu item’s active state in Django.

Wherever the app you’re doing this for is located, you’ll have a urls.py. Ensure the name is set for each url entry.

from django.conf.urls import patterns, url
from apps.pages import views

urlpatterns = patterns('',
    url(r'^pages/$', views.pages.index, name='pages.index'),
    url(r'^pages/about$', views.pages.about, name='pages.about'),
)

Next, create a directory called templatetags within your app folder.
Add a blank __init__.py and a nav_active.py to it, giving the latter the content below.

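try:
    from django.urls import resolve  # Django 1.10 and later
except ImportError:
    from django.core.urlresolvers import resolve  # older Django versions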
from django.template import Library

register = Library()

@register.simple_tag
def nav_active(request, url):
    """
    In template: {% nav_active request "url_name_here" %}
    """
    url_name = resolve(request.path).url_name
    if url_name == url:
        return "active"
    return ""

# nav_active() will check the web request url_name and compare it 
# to the named url group within urls.py, 
# setting the active class if they match.

Now to finish up, in your template .html file you need to load in the template tag and add it to each navigation item.

{% load nav_active %}