Monday, September 24, 2012

Adventures in Python

"6502" (cc) by Blakespot 
I’m an old-school coder.
Cut my teeth on BASIC and FORTRAN.
That was over two dozen languages ago.
Programmed primarily in Assembler for about 15 years.
I’ve dabbled in the dark art of self-modifying Assembler.
I fell in love with the off-putting perl.
Lua and I will always remember that time in Azeroth.
I've seen things you people wouldn't believe. Attack ships on fire off the shoulder of Orion. I watched c-beams glitter...

But this story isn’t about the past.


It’s about the future, and Google’s dominion.

Google loves python.
Google runs the Internet.
Best learn to love the python.
There are reasons besides App Engine to learn Python: The Raspberry Pi is designed to teach Python and will be a very Python-centric environment. I’ve also noticed a trend of programmers I respect slowly moving to Python from PHP.
I think PHP still performs a valuable role, and that good code can be created in PHP, but I’ll happily acknowledge there has to be a better way.

My dabblings in Python have been limited to modifying the occasional command-line script and attempting to code some XBMC addons.

And I didn’t like it.

I’m assuming this was because there were some basic things about the language that I didn’t understand, and that the framework for XBMC is its own gnarly beast.

I even have a real project in mind to test my hopefully growing skills.

So let’s learn some Python!
My favorite computer language book is Advanced Perl Programming http://shop.oreilly.com/product/9780596004569.do

Most of the Python educational materials are aimed at beginning programmers. While reviewing the basics is a good exercise, I’m more concerned about variable scoping, abstraction, and data structures than “hello world.”

Given the title, I thought “Learn Python the Hard Way” would provide a lot of CS context, but it’s aimed at an even more naive audience than all the learn-to-program-with-python courses and books. The content has more to do with how to read and type code than with syntax and data structures.
I actually did read the entire thing on the web. It clearly explained some basic nomenclature and how projects are laid out.

“Dive Into Python” runs at the right speed, but hasn’t been updated since 2004. So I’ll pass.

LearnPython is current, covers the basics and has advanced tutorials. I don’t think I’ll start there, but it looks like a great reference.


The most promising appears to be the course Google uses to get programmers up-to-speed on Python.
http://code.google.com/edu/languages/google-python-class/

Not pretty. Not simple. Just right!



Google's Python Class

Section 1: Python Setup

I’m doing this on a Mac with BBedit.
Nothing really to do.
Everything worked as expected.
Google recommends spaces instead of tabs for indenting.
I’ve run into this on a number of open-source projects recently.
Again, being old-school, I remember the days when bytes were scarce. It also seems like “hard-coding” to use spaces instead of tabs.
But since it appears this has become a standard, I’ll embrace the space.

Section 2: Introduction

Interpreted - duh, it’s a scripting language.
Loosely Typed - like.
Run-time evaluation - potentially better for security.
Case Sensitive - like.
Newline for statement delimiters - I have no preference.
Indentation matters? I’m intrigued and alarmed. Actually this explains why my earlier attempts at modifying Python code failed. It never occurred to me that changing from spaces to tabs would break code.
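A small sketch of the kind of failure I must have been causing. This assumes Python 3, which refuses to guess when tabs and spaces are mixed in one block; the two-line snippet being compiled is made up for illustration.

```python
# Assumes Python 3. The "if" body below is indented once with a tab and
# once with eight spaces, which look identical in many editors.
mixed_source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(mixed_source, "<example>", "exec")
    outcome = "compiled"
except TabError:
    # TabError is the specific IndentationError raised for tab/space mixes.
    outcome = "TabError"

print(outcome)
```

So an editor that silently converts indentation can break a script without changing a single visible character.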

Now we get into an actual program file. The traditional “hello world.”
We see an *import*, the main(), and the boilerplate that calls main().
Everything is perfectly sensible except the way main() is called. Seems odd to do it by using a special variable, but I can see the flexibility down the road.
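For reference, here is roughly what that boilerplate looks like. This is my own minimal sketch in Python 3 syntax, not the class's exact file; the greeting() helper is a name I made up.

```python
import sys


def greeting(name):
    # Hypothetical helper, only here so main() has something to call.
    return 'Hello ' + name


def main():
    # Use the first command-line argument if there is one.
    name = sys.argv[1] if len(sys.argv) > 1 else 'World'
    print(greeting(name))


# __name__ is '__main__' only when the file is run directly,
# not when it is imported as a module by another file.
if __name__ == '__main__':
    main()
```

The payoff of the special variable is that the same file can be run as a script or imported for its functions without main() firing.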
Variables don’t have a delimiter. This makes sense with a loosely typed language, but I love perl’s way of being able to cast a variable by just using the appropriate delimiter.

Plus sign for string concatenation. This I don’t like for loosely typed languages, since it’s ambiguous with addition.

Each file defines a module with its own namespace. This I like a lot.

That was a great introduction! Let’s learn some specifics.

Section 3: Strings

Clear description of delimiters and escape sequences. I think this might be a typo:
String literals inside triple quotes, """" or ''', can multiple lines of text.
Perhaps “can contain?”
And here we encounter the most consistent failing of Google. How do I provide feedback on this typo?

Back to the lesson...
Here we learn that there is no automatic conversion of string and numeric values. So that makes the + concatenator a lot more acceptable.
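A quick sketch of what that means in practice (Python 3 assumed, values made up): mixing a string and a number with + is an error rather than a silent guess, so the conversion has to be spelled out.

```python
pi = 3.14

try:
    text = 'The value of pi is ' + pi        # str + float is refused
except TypeError:
    text = 'The value of pi is ' + str(pi)   # explicit conversion works

print(text)
```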

The section on printf-style string expansion mentions using parentheses to span multiple lines. I thought we needed \ to cross multiple lines? I fear this is gonna get confusing...
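As far as I can tell, both are valid: anything inside parentheses, brackets, or braces may continue onto the next line, and the backslash is only needed outside of them. A minimal sketch, with made-up values:

```python
# Inside parentheses a statement can span lines with no backslash.
text = ('%d files scanned, %d names found in %s'
        % (3, 120, 'baby1990.html'))

# Outside of brackets, a trailing backslash joins the lines instead.
total = 1 + 2 + \
        3

print(text)
print(total)
```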

Section 4: Lists

Bracket syntax is the common form for both declaration and indexing. Assignment (=) doesn’t duplicate lists; it just creates another symbol table pointer to the same object. While I like this, I’ve seen many inexperienced programmers spend a lot of time tracking down problems related to having two variables point to the same object.
I wonder what the Python equivalent of .duplicate() is?
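From what I can find, there is no .duplicate(), but a slice or the copy module does the job. A small sketch with made-up data, showing the aliasing problem and two ways around it:

```python
import copy

colors = ['red', 'blue', 'green']
alias = colors        # a second name for the same list
clone = colors[:]     # shallow copy via a slice; list(colors) also works

colors.append('yellow')
print(alias)          # follows the change, because it is the same object
print(clone)          # unaffected

# A shallow copy still shares any nested lists; deepcopy does not.
nested = [[1, 2], [3, 4]]
shallow = nested[:]
deep = copy.deepcopy(nested)
nested[0].append(99)
print(shallow[0])
print(deep[0])
```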
The for, in, and range constructs all make sense, but I must admit, I’m still a bit bothered by the lack of an indicator at the end of a loop. This indentation-as-structure thing just feels odd.
Linear arrays (Lists) are all well and good, but how about lists of lists? Associative arrays? We see some of the object nature in the List methods, but I feel this section ended too soon. I look forward to more information on complex structures.

Section 5: Sorting

This section is a bit disorganized. The bits about sorting are clear enough. The sorted() construct seems very practical and easier to understand than the sort in perl, for instance.
sort() is a lot like Lingo’s sort in that it just modifies the array. Although it is unclear if it keeps it sorted like Lingo does.
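A quick experiment seems to settle the question: the list is sorted once and not kept sorted afterwards. A minimal sketch with made-up numbers:

```python
numbers = [5, 1, 4, 3]

ordered = sorted(numbers)   # returns a new list; numbers is untouched
result = numbers.sort()     # sorts in place and returns None

numbers.append(2)           # a later addition is not re-sorted

print(ordered)
print(result)
print(numbers)
```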
But then we introduce “tuples” without explaining why they should be used. From a compiler design point-of-view they are intriguing, since we could make some optimizations, though I suspect they exist more for conformance checking of objects. Either way, it seems this part should be in the Lists section.

List Comprehensions also seem like they’re in the wrong section. A handy construct for array munging. I can see why web devs would like this - perfect for processing form input.
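To make the form-input idea concrete, here is a sketch of the sort of cleanup I have in mind. The field values are invented for illustration.

```python
# Hypothetical raw strings, as they might arrive from a web form.
raw_fields = ['  alice ', '', 'BOB', '   ', 'Carol']

# Trim, lowercase, and drop the blanks in a single expression.
cleaned = [field.strip().lower() for field in raw_fields if field.strip()]

print(cleaned)
```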

Section 6: Dicts and Files

Ah... Here we get our associative arrays/objects. Apparently they are called “dicts” in Python.
We also learn of a reason to use tuples: since they are immutable, they can be reliably used as keys in hash arrays.
We also revisit the % expansion in strings. This time with an associative array basically doing token expansion.
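Both points are easy to try out. A minimal sketch, assuming Python 3, with keys and values I made up:

```python
# Tuples are immutable, so they are hashable and can be dict keys.
grid = {}
grid[(0, 0)] = 'origin'
grid[(2, 3)] = 'treasure'

# A list is mutable, so using one as a key is refused.
list_key_rejected = False
try:
    grid[[1, 1]] = 'nope'
except TypeError:
    list_key_rejected = True

# % expansion with a dict: each %(name) token is looked up by key.
fields = {'count': 2, 'word': 'scripts'}
sentence = 'I fixed %(count)d %(word)s today' % fields

print(grid[(2, 3)])
print(list_key_rejected)
print(sentence)
```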
File operations are pretty familiar to anyone who’s done file operations in pretty much any language.
There’s an odd little piece of basic programming advice at the end of this section called Exercise Incremental Development. Seems out of place and slightly condescending here.

Section 7: Regular Expressions

Apparently, I’m “the regex guy” on most of the teams I work with. I learned sed back when I was first introduced to unix-style operating systems and have used regular expressions pretty much daily since then. Let’s learn about the regex in Python!
search() and findall() are pretty clear, but I’d just as soon return a string or array instead of a “match object.” Perhaps we’ll learn of some reason for this added complexity later.
Non-greedy syntax is the same as perl. Whew! I’m thrilled that “Python includes pcre support.” Makes my life a whole lot more consistent.
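One plausible reason for the match object, as far as I can see, is that it carries the groups and the position along with the matched text, while findall() does hand back a plain list. A sketch with sample strings made up for illustration:

```python
import re

text = 'purple alice@google.com, blah monkey bob@abc.com blah dishwasher'

# search() returns a match object (or None when nothing matches).
match = re.search(r'([\w.-]+)@([\w.-]+)', text)
whole = match.group()    # the full matched text
user = match.group(1)    # first parenthesized group
host = match.group(2)    # second parenthesized group
start = match.start()    # offset of the match within text

# findall() returns a plain list of strings.
emails = re.findall(r'[\w.-]+@[\w.-]+', text)

# Greedy versus non-greedy, same syntax as perl.
html = '<b>foo</b> and <i>so on</i>'
greedy = re.findall(r'<.*>', html)
lazy = re.findall(r'<.*?>', html)

print(whole, user, host, start)
print(emails)
print(greedy)
print(lazy)
```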
I’ve skipped the exercises so far, since most of this has been straightforward information, but I’m going to play with the Baby Name Exercise.
The Baby Name Exercise drops us immediately in the deep end. We are given a folder of HTML files with baby name popularity information in tables.
We get to read files, parse HTML using regular expressions, build structures to hold the data, format the data and return it from a function.
It took a lot of flipping back to the lessons, but it was a good exercise.
I got the syntax for join() wrong a couple of times though...
text = '\n'.join(names)
just seems odd. I realize it’s a string method iterating over a parameter, but traditionally we see this as a list or object method.
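Spelled out, in case it helps anyone else who keeps writing it backwards (names and numbers here are made up):

```python
names = ['Aaron', 'Abby', 'Adam']

# The separator is the object; the list is the argument.
text = '\n'.join(names)

# join() accepts any iterable of strings, so numbers need str() first.
csv_line = ', '.join(str(n) for n in [1, 2, 3])

print(text)
print(csv_line)
```

One upside of the odd shape is that a single method covers lists, tuples, and generators alike.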
After completing the exercise, I looked at the provided solution. Their solution did not handle cases where the same name was in both the boys and girls columns. I learned how to reference elements in a more complex data structure while constructing my solution.
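To illustrate the duplicate-name case, here is one way it could be handled. This is a simplified sketch, not my actual solution or the provided one, and the rows are invented rather than taken from the exercise files.

```python
# Made-up sample rows in (rank, boy_name, girl_name) form; not real data.
rows = [('1', 'Alex', 'Emma'), ('2', 'Jordan', 'Alex'), ('3', 'Liam', 'Jordan')]

ranks = {}
for rank, boy, girl in rows:
    for name in (boy, girl):
        # Rows arrive in rank order, so the first rank seen is the best one.
        # A name that shows up again in the other column is left alone.
        if name not in ranks:
            ranks[name] = rank

# Alphabetical "name rank" strings, built from the dict.
summary = [name + ' ' + ranks[name] for name in sorted(ranks)]

print(summary)
```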

Section 8: Utilities

Here we get the usual interfaces for shell and file operations. Try/Except construct is introduced. This construct seems very similar to the try/catch in most languages.
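A minimal sketch of the construct, assuming Python 3 (where IOError is an alias of OSError). The read_text() helper and the file path are names I made up.

```python
import os


def read_text(path):
    """Return the file's contents, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except IOError as err:
        # Report the problem and carry on instead of crashing.
        print('problem reading ' + path + ': ' + str(err))
        return None


# Hypothetical path that is assumed not to exist.
missing = os.path.join('no-such-dir', 'no-such-file.txt')
content = read_text(missing)
print(content)
```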

The exercises were great and applied to things I want to do with Python right away.

After the introduction

I was able to put my newfound Python skills to work right away fixing a gPodder scraper.
I also used Python for a little log analysis where I would normally use perl.

So far so good!

The next section of the Google class is a bunch of videos. I've never found video to be a particularly good medium for learning programming, but I'll give it a shot in the next article in this series.







