Friday, August 22, 2014

Installing Python 2.7.8 on CentOS 6.5

CentOS 6.5 ships with Python 2.6.6 and you can't yum update past that. I really need Python 2.7. Since, 2.7.8 is the latest version (excluding Python 3) as of today, I'll go with that. I might be drunk, but I'm not so drunk that I can live without the argparse module, the unittest enhancements, OrderedDict and Counter classes.
I did all these commands from the root account, without that your mileage may vary.

Update CentOS and install the development tools

yum -y update
yum groupinstall -y 'development tools'
I'll want the following packages enabled for Python, and so should you.
yum install -y zlib-devel bzip2-devel openssl-devel xz-libs wget

Install Python 2.7.8 from Source

Download and extract the source:
wget http://www.python.org/ftp/python/2.7.8/Python-2.7.8.tar.xz

xz -d Python-2.7.8.tar.xz

tar -xvf Python-2.7.8.tar

Compile the Source and Install

# Enter the directory:
cd Python-2.7.8

# Run the configure:
./configure --prefix=/usr/local

# compile and install it:
make
make altinstall

# Checking Python version:
python2.7 -V
# Python 2.7.8

Install pip and virtualenv

No Pythonista worth two dead parrots would be caught dead putting together anything of complexity without these tools. So here's how to install them.
# Install pip
curl --silent --show-error --retry 5 https://bootstrap.pypa.io/get-pip.py | python2.7

# Install virtualenv
pip2.7 install virtualenv
And you're done.

Sunday, August 17, 2014

Project Euler Problem 2 and Fibonacci numbers

We can't talk about Fibonaacci numbers in a programming blog without at least touching on the idea of recursion. Of course, it's not a terribly efficient way to calculate Fibonacci numbers, it still warrants attention. That said, what is a Fibonacci number? Well, more accurately, it's the Fibonacci sequence and it starts with [0, 1]. So, we'll notate that by saying F0 = 0 and F1 = 1. Each subsequent number in the series is the sum of the previous two numbers.
F2 = F1 + F0
1 = 0 + 1
And more generally:
Fn = Fn - 1 + Fn - 2
The first 10 numbers in the series looks like this:
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
With that overly simple explanation, on to question #2!
Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.
Notice that the Project Euler question starts with 1, 2. That's fine, it isn't going to affect us here. The first thing we'll want to do is calculate Fibonacci numbers, I'm almost certain of that. Since the sequence is defined as a self referential sum of the previous 2 numbers, we could create a function like this:
def fib(n):
    """Recursive Fibonacci function.  
    Return the nth Fibonacci number.
    """
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return(fib(n - 1) + fib(n - 2))
Now, you can call that function with any number and get the nth Fibonacci number. Sadly, this is terribly slow, and it's not really what we need to solve the problem. I just put it here to show you an example of recursion. If you don't know what recursion is, then you don't know what recursion is. So, hopefully, now you know. Maybe we really want a generator function, that will just continuously yield subsequent fibonacci numbers. Something like this:
def fib_gen():
    """Fibonacci generator function.  
    Just keep yielding Fibonacci numbers.
    """
    a, b = 0, 1
    while 1:
        yield a
        a, b = b, a + b
Or maybe we can just brute force the whole thing? This is how I actually solved it without really thinking too hard at all:
second_previous = 0
previous = 1
fib_num = 0
total = 0

while fib_num < 4000000:
    fib_num = previous + second_previous
    if fib_num < 4000000:
        if fib_num % 2 == 0:
            total = total + fib_num
        second_previous = previous
        previous = fib_num
print(total)
It prints out the total that problem 2 is asking for. That solution runs very fast for me. There is one other interesting property, and that is that every third Fibonacci number is even.

Saturday, August 16, 2014

Project Euler problem 1

Apparently, some real jerks hacked the Project Euler website, so it wasn't operating at full functionality for quite some time.  However, it's back now and you can once again spend some time going through their over 400 mathematical and programming challenges posted there.

As an introduction, I'll start off with problem 1, how the site works and solve the problem for you, because even drunk we can do that.

Now that the sire is back you'll want to create a username and password.  They have removed email addresses from the site, so it's important that you remember the password, as there is no more password recovery option.

Once you've done that, you've got to start solving the problems.  The problems are usually stated very simply, and then they will kick your ass.  The first one isn't much of a challenge, but it'll give you an idea to the approach I use.  However, even the first question offers some intriguing ways to be solved, using math, more so than raw computing power.

So the first problem is titled

Multiples of 3 and 5


And it states:
If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below 1000.
Just using the most obvious algorithm, we can solve the problem like so:

def sum3and5(val):
    """Sum the multiples of 3 or 5 below val."""
    total = 0
    for x in range(val):
        if x % 3 == 0 or x % 5 == 0:
            total += x
    return total

assert(sum3and5(10) == 23)
print(sum3and5(1000))
You can see I'm simply looping from 1 to 999. If the value is a multiple of 3 or 5, I add it to the total. I also test my function is working properly by running it against 10 and verifying that it totals to 23. In order to make the code a little sexier you could write this as a generator function, and sum the output. That means you can reduce the whole function to a single line:
sum(x for x in range(1000) if x % 3 == 0 or x % 5 == 0)
However, that doesn't speed up the execution much. In order to make this really efficient, we need to do some math. The function we really want is the sum of all the numbers divisible by another number in a range. We want to know the sum of all the numbers divisible by 3 up to 1000. That sum looks like this: 3 + 6 + 9 + 12 ... 999. But, we can see that it's the same as 3 * (1 + 2 + 3 + 4 + 5 ... 333). We also know from math that the sum 1 + 2 + 3 + 4 + 5 ... p = p * (p + 1) / 2 Putting those two facts together, I created a new function that looks like this:
def sum_divisible_by(x, y):
    """Sum of all the numbers up to y, divisible by x."""
    z = y // x
    return x * (z * (z + 1)) // 2
But that alone will only tell us the sum for a single divisor. We'll need to do it for 3 and 5. However, we'll be summing some numbers twice. To get rid of those we'll need to run the same function with 15 and subtract that sum. Putting that all together, we end up with this:
print(sum_divisible_by(3, 999)
      + sum_divisible_by(5, 999) 
      - sum_divisible_by(15, 999))
Sure enough, that gives the same answer, and it does it much more efficiently. In order to demonstrate that, I wrote this code, which runs each of the functions 10,000 times and times the duration.
t = time.time()
for x in range(10000):
    sum3and5(1000)
u = time.time()
print(u - t)

t = time.time()
for x in range(10000):
    sum3and5gen(1000)
u = time.time()
print(u - t)

t = time.time()
for x in range(10000):
    sum_divisible_by(3, 999)\
    + sum_divisible_by(5, 999)\
    - sum_divisible_by(15, 999)
u = time.time()
print(u - t)
This was the result:
1.87544608116
1.84872412682
0.0155010223389
Basically, the final function was more than 100 times faster.