Natural order sorting of strings with numbers

The following Python code makes it a little easier to sort sequences of mixed lexical and numerical values in natural order. It accepts any iterable of strings that contain embedded numbers. In short, it gives you this:

foo1 < foo2 < foo10

instead of this:

foo1 < foo10 < foo2

As an example, if you have this sequence:

>>> seq = ['foo', 'foo1', 'foo2', 'foo10', 'foobar10', '20', '100', '1', '3', 'bar1']

a regular sort would produce this:

>>> sorted(seq)
['1', '100', '20', '3', 'bar1', 'foo', 'foo1', 'foo10', 'foo2', 'foobar10']
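
The reason for that ordering is that plain string comparison works one character at a time. A minimal sketch of the effect (the `names` list is just an illustrative sample, not part of the code in this post):

```python
names = ['foo2', 'foo10', 'foo1']

# Strings are compared character by character, so the comparison of
# 'foo10' and 'foo2' is decided at '1' versus '2'; the trailing '0'
# is never looked at.
print('foo10' < 'foo2')   # True

# Compared as numbers, the order is the other way around.
print(10 < 2)             # False

# A regular sort therefore puts 'foo10' ahead of 'foo2'.
print(sorted(names))      # ['foo1', 'foo10', 'foo2']
```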

whereas a natural sort would produce this:

>>> natural_sort(seq)
['1', '3', '20', '100', 'bar1', 'foo', 'foo1', 'foo2', 'foo10', 'foobar10']

Here is the code:

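import re

def natsort_key(item):
    # Split on integers or decimals. The capturing group keeps the numbers
    # in the result, e.g. 'foo10' -> ['foo', '10', ''].
    chunks = re.split(r'(\d+(?:\.\d+)?)', item)
    for ii in range(len(chunks)):
        if chunks[ii] and chunks[ii][0] in '0123456789':
            # Numeric chunk: compare by value, tagged with 0 so that
            # numbers sort ahead of text.
            numtype = float if '.' in chunks[ii] else int
            chunks[ii] = (0, numtype(chunks[ii]))
        else:
            # Text chunk: tagged with 1 so it is never compared with a number.
            chunks[ii] = (1, chunks[ii])
    # The original string breaks ties between items whose chunks compare
    # equal (e.g. 'foo1' and 'foo01').
    return (chunks, item)

def natural_sort(seq):
    # Works on any iterable and returns a new sorted list.
    return sorted(seq, key=natsort_key)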
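
To see what the sort key is made of, here is a small sketch that reproduces the splitting and tagging step on its own. The names `NUMBER` and `tagged_chunks` are hypothetical helpers introduced only for this illustration; the pattern and the 0/1 tags are the same as in the code above.

```python
import re

# Same pattern as above: an integer or a decimal number, captured so that
# re.split keeps the numeric pieces in its result.
NUMBER = re.compile(r'(\d+(?:\.\d+)?)')

def tagged_chunks(text):
    """Split text into chunks, tagging numbers with 0 and other text with 1."""
    tagged = []
    for chunk in NUMBER.split(text):
        if chunk and chunk[0] in '0123456789':
            number = float(chunk) if '.' in chunk else int(chunk)
            tagged.append((0, number))
        else:
            tagged.append((1, chunk))
    return tagged

print(NUMBER.split('foo10'))    # ['foo', '10', '']
print(tagged_chunks('foo10'))   # [(1, 'foo'), (0, 10), (1, '')]
print(tagged_chunks('20'))      # [(1, ''), (0, 20), (1, '')]
print(tagged_chunks('v1.10'))   # [(1, 'v'), (0, 1.1), (1, '')]
```

The tags are what keep the comparison safe: two chunks in the same position are compared by tag first, so a number is never compared directly with a string (which raises TypeError in Python 3). Note the last line: because the pattern accepts decimals, a dotted string such as 'v1.10' is read as the number 1.1, which may not be what you want for version-style strings.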
About Gregory Armer
Sometimes I just want to give it all up and become a handsome billionaire.