Python Page #2: Tips and Tricks

There's strictly no warranty for the correctness of this text. You use any of the information provided here at your own risk.

Introduction
File-operations and Filename globbing
Extracting HTML-Tags
Date and Time
Starting other processes from inside a Python-script
Conversions Between Decimal, Hexadecimal, Binary and Octal Numbers
Installing modules
Generating Bytecode of any script
Encrypting and Decrypting Text Using a Password
Scripting LibreOffice.org
Plotting to the screen with Pygame
Messageboxes with tkinter
Parse configuration-files using module "ConfigParser"
PDF and Python
Sending Emails
Receiving Emails from a POP3-Server
Setting up a small html-Server
Creating Windows ".exe"-Files From Python Scripts
Encodings

1. Introduction

This page describes some things that can be done with Python.
In 2025, the page was updated to work with Python 3. In the process, a few chapters were taken out.

2. File-operations and Filename globbing

This script demonstrates some often-used file-operations:

#!/usr/bin/python
# coding: utf-8

import os
import sys

import glob

# Get current working directory:
mypath = os.getcwd()

myfile = os.path.join(mypath, "myfile.txt")

# Test, if file exists:
if os.path.exists(myfile):
    print("File already exists.")
    sys.exit(1)

fh = open(myfile, "w")

fh.write("This is a line of text.\n")
fh.write("This is another line.\n")

fh.close()

fh = open(myfile, "r")

a = fh.readlines()

fh.close()

for i in a:
    print(i.rstrip("\n"))

# os.remove(myfile)

print("The directory contains:")
b = os.listdir(mypath)
for i in b:
    print(i)

# Filename globbing:
print()
print("The directory contains the following html-files:")
b = glob.glob("*.html")
if b:
    for i in b:
        print(i)
else:
    print("None.")

For copying and moving files and directories and for deleting directories recursively, it is recommended to use the module "shutil".
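
The most common "shutil"-operations could look like the following sketch. The file- and directory-names ("src", "backup", "copy.txt", "moved.txt") are made up for this example, and everything happens inside a temporary directory, so nothing else is touched:

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory:
base = tempfile.mkdtemp()

src_dir = os.path.join(base, "src")
os.mkdir(src_dir)
with open(os.path.join(src_dir, "myfile.txt"), "w") as fh:
    fh.write("This is a line of text.\n")

# Copy a single file (copy2() also tries to preserve the timestamps):
shutil.copy2(os.path.join(src_dir, "myfile.txt"), os.path.join(base, "copy.txt"))

# Copy a whole directory tree; the target must not exist yet:
backup_dir = os.path.join(base, "backup")
shutil.copytree(src_dir, backup_dir)

# Move (or rename) a file:
shutil.move(os.path.join(base, "copy.txt"), os.path.join(base, "moved.txt"))

top_level = sorted(os.listdir(base))
backup_content = os.listdir(backup_dir)
print(top_level)
print(backup_content)

# Delete a directory recursively:
shutil.rmtree(base)
print(os.path.exists(base))
```

Be careful with "shutil.rmtree()": It deletes without asking, and the files don't go to a trash-folder.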

3. Extracting HTML-Tags

If you want to extract html-tags from a html-file, there's a problem with greedy matching: a pattern like ".*" grabs as much text as possible. You should then use a regular expression that has a negated character-class in front of the closing tag:

#!/usr/bin/python
# coding: utf-8

import re

html = """<html>
<body>
<h2>Caption 1</h2>
<p>Some text.
<h2>Caption 2</h2>
<p>Some more text.
<br>Going on.
</body>
</html>"""

html = html.split("\n")

patobj = re.compile(r"<h2>([^>]*)<\/h2>") # Notice the regexp here (raw string, so "\/" isn't an invalid escape).
for line in html:
    line = line.rstrip("\n")
    matchobj = patobj.search(line)
    if matchobj:
        print(matchobj.groups()[0])

The expression "<h2>([^>]*)<\/h2>" means:

Find "<h2>", then any amount of characters that aren't ">" (and mark them for extraction), then "</h2>".

The tutorial says, you can also append "?" to a quantifier (as in ".*?") to mark it as "not greedy", but the way above should be even more reliable.
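
The difference between the three variants can be seen in a small sketch. The test-line with two captions on a single line is made up for this example:

```python
import re

line = "<h2>Caption 1</h2><h2>Caption 2</h2>"

# Greedy: ".*" grabs as much as possible, up to the LAST "</h2>":
greedy = re.findall(r"<h2>(.*)</h2>", line)

# Non-greedy: ".*?" stops at the FIRST "</h2>" it can find:
nongreedy = re.findall(r"<h2>(.*?)</h2>", line)

# Negated character-class, as used in the script above:
negated = re.findall(r"<h2>([^>]*)</h2>", line)

print(greedy)
print(nongreedy)
print(negated)
```

Only the greedy pattern returns both captions glued together as a single match.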

The best practice to extract text from a HTML-page in Python would be to use the module "Beautiful Soup".

4. Date and Time

This script demonstrates some often-used operations concerning date and time (using the somewhat low-level time-module):

#!/usr/bin/python
# coding: utf-8

import time

# Print current time in central European format including seconds:
print(time.strftime("%H:%M:%S"))

# On Linux, you can read more about format-strings like "%H:%M:%S"
# by executing the shell-command "info date".

# Print today's date in central European format; the numbers of days and months
# below 10 are zero-padded (like in 09.08.2007):
print(time.strftime("%d.%m.%Y"))

# Print seconds since epoch (on Linux, epoch is 1.1.1970):
print(time.time())

# Generate a time-tuple from seconds since epoch and print it:
tmtp = time.localtime(time.time())
print(tmtp)

# Print today's date in central European format using the time-tuple generated above;
# the numbers of days and months below 10 are not padded (like in 9.8.2007):
print(str(tmtp[2]) + "." + str(tmtp[1]) + "." + str(tmtp[0]))

# Convert the time-tuple back to seconds since epoch:
print(time.mktime(tmtp))

If you want to calculate the time or the number of days between two or more dates, the datetime-module is even more useful:

#!/usr/bin/python
# coding: utf-8

import datetime

# Generate a datetime.date-object for today:
d1 = datetime.date.today()
print(d1.strftime("%d.%m.%Y"))

# Generate a datetime.date-object for "30.01.2010":
d2 = datetime.date(2010, 1, 30)

# Now we can do some calculations with a datetime.timedelta-object:
td = d1 - d2
a = "Today is "
if d1 > d2:
    a += str(td.days) + " days after "
else:
    a += str(td.days * -1) + " days before "
a += d2.isoformat() + "."
print(a)

try:
    feb29 = datetime.date(d1.year, 2, 29)
    print("This year is a leap year.")
except ValueError:
    print("This year is not a leap year.")
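
A few more operations that may be handy in this context, as a short sketch with fixed dates (so the results don't depend on the day the script is run):

```python
import calendar
import datetime

d = datetime.date(2010, 1, 30)

# Adding a datetime.timedelta-object to a date gives a new date:
later = d + datetime.timedelta(days=30)
print(later.isoformat())

# calendar.isleap() tests for leap years directly:
print(calendar.isleap(2024), calendar.isleap(2025))

# strptime() parses a date-string in a given format:
parsed = datetime.datetime.strptime("30.01.2010", "%d.%m.%Y").date()
print(parsed == d)
```

"calendar.isleap()" is an alternative to the try/except-construction above, it doesn't replace anything else in that script.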

5. Starting other processes from inside a Python-script

You can start other processes like this:

#!/usr/bin/python
# coding: utf-8

import os

os.system("ls")

If you want to grab the process's output, you can do:

#!/usr/bin/python
# coding: utf-8

import os

ph = os.popen("ls")
a = ph.readlines()
ph.close()

print(a)

If you want to have more control over the started process, for example have it running for some time and pass commands to it while it's running, you should take a look at the Python-module

subprocess
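
A minimal sketch of both uses of "subprocess" follows. The child processes are started with "sys.executable" (the running Python interpreter) only because that program exists on every system; in a real script, it would be a command like "ls":

```python
import subprocess
import sys

# Run a command, wait for it and capture its output:
result = subprocess.run(
    [sys.executable, "-c", "print('Hello from the child process.')"],
    capture_output=True,   # collect stdout and stderr
    text=True,             # return strings instead of bytes
    check=True,            # raise an exception, if the command fails
)
print(result.returncode)
print(result.stdout.rstrip("\n"))

# Start a process and pass data to it through its stdin:
proc = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
out, _ = proc.communicate("some input\n")
print(out.rstrip("\n"))
```

Passing the command as a list (instead of a single string) avoids trouble with spaces and quotes in arguments, because no shell is involved.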

On Microsoft Windows the command

os.startfile("somefile.html")

is very useful: It starts a file with its connected application (like clicking on its icon in "Explorer").
So in the example above, the system's default-browser is started showing "somefile.html".
That way, you don't have to worry about the application's correct start-command. Very nice.

6. Conversions Between Decimal, Hexadecimal, Binary and Octal Numbers

The following script shows, how these number-conversions can be done:

#!/usr/bin/python
# coding: utf-8

hexnum = "FF"
decnum = 255
binnum = "11111111"
octnum = "377"

# Hexadecimal to decimal:
print(int(hexnum, 16))

# Hexadecimal numbers start with "0x":
print(0xFF)

# Decimal to hexadecimal:
print(hex(decnum))
print("%X" % decnum)

# Binary to decimal:
print(int(binnum, 2))

# Decimal to binary:
print("{0:08b}".format(decnum))

# Octal to decimal:
print(int(octnum, 8))

# Octal numbers start with "0o" (in Python 2, a leading "0" was enough).
# They are used for example in functions like os.chmod().
print(0o377)

# Decimal to octal:
print(oct(decnum))
print("%o" % decnum)

Notice, that the functions hex() and oct() return a string.

In earlier Python versions (before Python 2.6, which introduced "bin()"), there wasn't built-in support for decimal-to-binary-conversion. A function "posdec2bin()" could have been used instead:

def posdec2bin(decnum):

    hexbin = {"0":"0000", "1":"0001", "2":"0010", "3":"0011",
              "4":"0100", "5":"0101", "6":"0110", "7":"0111",
              "8":"1000", "9":"1001", "A":"1010", "B":"1011",
              "C":"1100", "D":"1101", "E":"1110", "F":"1111"}

    hexnum = hex(decnum)[2:]
    hexnum = hexnum.upper()

    binlist = []

    for i in hexnum:
        binlist.append(hexbin[i])

    binstring = "".join(binlist)
    # Strip leading zeros, but keep a single "0" for the input 0:
    binstring = binstring.lstrip("0") or "0"
    return binstring
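
In current Python versions, the built-in ways look like this (a short sketch, the numbers are just examples):

```python
decnum = 255

# bin() returns a string with the prefix "0b":
with_prefix = bin(decnum)
print(with_prefix)

# format() gives the digits without the prefix:
without_prefix = format(decnum, "b")
print(without_prefix)

# Zero-padding to 8 digits in an f-string:
padded = f"{5:08b}"
print(padded)

# Negative numbers keep their sign:
negative = bin(-5)
print(negative)
```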

7. Installing modules

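Today, most modules are installed with "pip" (for example "python -m pip install modulename"). Older modules often come with a "setup.py"-file instead. Usually, it should be run as root with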

python setup.py install

Help on more commands "setup.py" knows, can be found doing

python setup.py --help-commands

8. Generating Bytecode of any script

When you execute a Python-script, bytecode is generated. Sometimes it is saved as a .pyc-file. These files can be executed by Python just like .py-files. They start a little faster, but don't run faster. They don't contain plain text, so they can't be viewed in an editor directly, but they can be transformed back to .py-files with a decompiler.
However, if you want to generate a .pyc-file for a certain .py-file, you can use the module "compileall" (script "compileall.py"), that comes with your Python-distribution.
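
For a single file, the standard module "py_compile" can be called from a script. The following is a sketch; the script "hello.py" and the temporary directory are only created for this demonstration. On the command-line, "python -m compileall ." would compile all scripts below the current directory.

```python
import os
import py_compile
import tempfile

# Create a small script to compile (only for this demonstration):
workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "hello.py")
with open(script, "w") as fh:
    fh.write('print("Hello.")\n')

# Compile it. By default, the result goes to a "__pycache__"-directory
# next to the script; the path of the .pyc-file is returned:
pyc = py_compile.compile(script, doraise=True)
print(pyc)

# The argument "cfile" sets the name of the .pyc-file explicitly:
pyc2 = py_compile.compile(script, cfile=os.path.join(workdir, "hello.pyc"),
                          doraise=True)
print(os.path.exists(pyc2))
```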

9. Encrypting and Decrypting Text Using a Password

This chapter used to contain some Python-code that encrypted text with the "pycrypto"-module and another old crypto-module. As those modules are deprecated by now, that code doesn't work any more. On Linux, the encryption/decryption can be done with the shell-command "gpg" though (provided it is installed on the system).
Sometimes it makes sense to write the data processing in Python, and finally call shell-commands with "os.system()":

#!/usr/bin/python
# coding: utf-8

import os, sys

def askAndDelete(fname):
    c = input("Do you want to delete file '" + fname + "'? (y/n) ")
    if c == "y":
        os.remove(fname)
    else:
        print("Didn't remove file '" + fname + "'. Bye.")
        sys.exit()

a = ("unencrypted.txt", "encrypted.txt", "decrypted.txt")
s = "For your eyes only!"

for i in a:
    fname = os.path.join(os.getcwd(), i)
    if os.path.exists(fname):
        askAndDelete(fname)

print("Writing string '" + s + "' to file 'unencrypted.txt'.\n")
fh = open("unencrypted.txt", "w")
fh.write(s + "\n")
fh.close()

print("Encrypting to newly created file 'encrypted.txt' using password \"Melina\".\n")
e = """echo 'Melina' | gpg --batch --textmode --passphrase-fd 0 -o encrypted.txt -c unencrypted.txt"""
os.system(e)

print("Decrypting to newly created file 'decrypted.txt' using password \"Melina\".\n")
e = """echo 'Melina' | gpg --batch --textmode --passphrase-fd 0 -d encrypted.txt > decrypted.txt"""
os.system(e)

10. Scripting LibreOffice.org

You can use Python as a scripting language for LibreOffice. To be able to do that, you have to do two things first:

1. There's a Python-module called "uno.py" that needs to be imported in the Python-script. "uno.py" comes with LibreOffice. It should be in its "program"-directory. You have to make sure, that it can be found by Python. This can be done by putting a file "uno.pth" into the python-path, for example to "/usr/lib/python/site-packages". The file "uno.pth" needs to contain just the path to the LibreOffice-"program"-directory, for example

/usr/lib/libreoffice/program

2. To accept python-scripting, LibreOffice needs to be activated in a special server-mode. This is done by starting it with

soffice --writer --accept="socket,host=localhost,port=2002;urp;"

If it's started this way, it accepts scripting from a script like this:

#!/usr/bin/python

import uno
from com.sun.star.text.ControlCharacter import PARAGRAPH_BREAK

# Initializing:
local = uno.getComponentContext()
resolver = local.ServiceManager.createInstanceWithContext ("com.sun.star.bridge.UnoUrlResolver", local)
context = resolver.resolve ("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
desktop = context.ServiceManager.createInstanceWithContext ("com.sun.star.frame.Desktop", context)

# Creating a new document:
document = desktop.loadComponentFromURL( "private:factory/swriter", "_blank", 0, () )

# This would write into the current document instead, if one was already loaded:
# document = desktop.getCurrentComponent()

cursor = document.Text.createTextCursor()

document.Text.insertString(cursor, "Hellow", 0)

# Doing a "Backspace" deleting the "w" from "Hellow":
cursor.goLeft(1,1)
cursor.setString("")

# Insert a "Return":
document.Text.insertControlCharacter(cursor, PARAGRAPH_BREAK,0)

document.Text.insertString(cursor, "Hello from Python.", 0)

"""
Other things (this part is currently not executed):

Loading a new document would be:
document = desktop.loadComponentFromURL("file:///home/user/letter.odt", "_blank", 0, () )
cursor = document.Text.createTextCursor()

Going to an already existing Bookmark named "bmark" would be:
Bookmark = document.Bookmarks.getByName(bmark)
cursor = document.Text.createTextCursorByRange(Bookmark.Anchor)

Inserting into an already existing TextFrame called "name" would be:
frames=document.getTextFrames()
frame=frames.getByName(name)
cursor = frame.Text.createTextCursor()
frame.Text.insertString(cursor, "Hello.", 0)
"""

11. Plotting to the screen with Pygame

The Pygame-modules provide Python-access to "SDL", a multimedia-library similar to "DirectX" on Windows.
Pygame provides a window in which you can draw graphical objects, move them around and remove them again. It also provides modules for processing keyboard- and mouse-events and for playing several sounds and additionally music at the same time.
So you can use Pygame to program 2D-games. As this is a subject of its own, I have another webpage about it.
Some years ago, I had to do

export SDL_AUDIODRIVER=alsa

to get sound working correctly with my soundcard before starting a Pygame-application. Today, this may not be necessary any more.

Pygame's capabilities can also be used for other purposes than game-programming. Here is a script, that calculates some mathematical object and plots it to the screen.
As an exception, in this script the picture is plotted only once. But in games, it would be necessary to clear and redraw the screen each "frame", that is each time the program is going through the main-loop.
Press "q" to quit it:

#!/usr/bin/python
# coding: utf-8

import pygame

import os
import math

RESX = 800
RESY = 600

def drawGauss(surface):
    b = 0.3
    ULIM = 100.
    UDIV = 1.8

    for u in range(0, int(ULIM), 1):
        # Some mathematical, Gaussian calculations:
        a = float(u) / (ULIM / UDIV)
        if u > ULIM / 2:
            a = UDIV - (float(u) / (ULIM / UDIV))
        m = u / 16.

        for x in range(50, RESX - 50):
            c = x / (RESX / 16.) - 5.
            y = a * math.exp(-b * (c - m) * (c - m))

            # down:
            y = RESY / 10 + (y * RESY / 1.5 + u * 3)

            # up:
            # y = RESY * 9 / 10 - (y * RESY / 1.5 + u * 3)

            pygame.draw.line(surface, (200,200,200), (x, y), (x, y), 1)

def processEvents():
    pygame.event.pump()
    pressed = pygame.key.get_pressed()
    if pressed[pygame.K_q]:
        return "quit"
    return 0

# Main:

os.environ['SDL_VIDEO_WINDOW_POS'] = "245, 40"
# Initialize Pygame before opening the window:
pygame.init()
screen = pygame.display.set_mode((RESX, RESY))
pygame.display.set_caption('Gauss')
drawGauss(screen)
clock = pygame.time.Clock()
running = True
# Main loop:
while running:
    clock.tick(60)

    # We're not clearing all sprites and rebuilding the screen this time:
    # screen.fill(BLACK)

    if processEvents() == "quit":
        running = False
        pygame.quit()

    if running:
        pygame.display.flip()

12. Messageboxes with tkinter

tkinter is one of the GUI-toolkits available for Python. With it, you can create window-applications.
Actually tkinter is an interface to Tk, the GUI-toolkit of the language "Tcl", but usually you won't notice that.
tkinter is quite small in relation to other GUI-toolkits. Its windows don't look too beautiful, but they are usable.

The following script "msgbox.py" provides a simple messagebox in tkinter:

#!/usr/bin/python
# coding: utf-8

# msgbox.py

import os
import tkinter as tk

class MyMsgBox:

    def __init__(self, title = "My Title", msg = "Some text"): 
        if os.name == "nt":
            self.appfont = "{Arial} 10 {normal}"
        elif os.name == "posix":
            self.appfont = "{suse sans} 15 {normal}"
        else:
            self.appfont = "{suse sans} 15 {normal}"
        self.mw = tk.Tk()
        self.mw.title(title)
        # self.mw.iconname(iconname)
        self.mw.geometry("+350+300")
        self.mw.option_add("*font", self.appfont)
        self.label = tk.Label(self.mw, text = msg)
        self.label.pack(side = tk.TOP, anchor = tk.W, padx = 50, pady = 10)

        self.btn_exit = tk.Button(self.mw, text = "Ok", command = self.mw.destroy)
        self.btn_exit.focus()
        self.btn_exit.bind('<Return>', self.mwDestroy)
        self.btn_exit.pack(side = tk.RIGHT, anchor = tk.E, padx = 20, pady = 15)
        self.mw.mainloop()

    def mwDestroy(self, a):
        self.mw.destroy()

if __name__ == "__main__":
    app = MyMsgBox("Hello", "Some text.")

You can run "msgbox.py" directly, but in general it is meant to be used as a module that is called from another script:

#!/usr/bin/python
# coding: utf-8

from msgbox import MyMsgBox

msgb = MyMsgBox("Window", "Hello from the controlling script.")

If you already have an open Toplevel-tkinter-window, you can even do this instead:

#!/usr/bin/python
# coding: utf-8

import tkinter.messagebox

tkinter.messagebox.showinfo(title = "Hello",
                            message = "Some text")

13. Parse configuration-files using module "ConfigParser"

With Python you can parse configuration-files using the module "configparser" (called "ConfigParser" in Python 2), which is part of the Python-distribution.
The configuration-files have to be in a standard-format: They must have sections, followed by options and values in a format like this:

[section]
option=value
option=value

Take for example the configuration-file for the KDE-editor "kate" called "katerc". If "kate" is installed, it can be found here:

/home/user/.kde/share/config/katerc

If your "katerc" looks like this:

[$Version]
update_info=kate-2.4.upd:kate2.4

[Filelist]
Edit Shade=255,102,153
Shading Enabled=true
Sort Type=0
View Shade=51,204,255

[General]
Days Meta Infos=30
Modified Notification=false

you can use the following Python-script in its directory to parse it:

#!/usr/bin/python
# coding: utf-8

import configparser

c = configparser.ConfigParser()

# Read in the configuration in file "katerc":
c.read("katerc")

secs = c.sections()

print()
print("These are all sections, options and values:")
print()

for i in secs:
    print(i)
    opts = c.options(i)
    for u in opts:
        value = c.get(i, u)
        print("\t" + u + "\t\t\t" + value)
print()

# Setting all values to "MyValue":
for i in secs:
    opts = c.options(i)
    for u in opts:
        c.set(i, u, "MyValue")

# Uncommenting the following lines would write the configuration
# stored in object "c" to a file "out.conf":
#
# fh = open("out.conf", "w")
# c.write(fh)
# fh.close()
# print("New configuration written to file 'out.conf'." )

Please see "pydoc3 configparser" for further information.
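
Some more features of "configparser" are shown in the following sketch. It reads a shortened version of the "katerc"-example from a string instead of a file, so it can be tried without having "kate" installed. The option "No Such Option" is made up, to show the fallback:

```python
import configparser

text = """[Filelist]
Edit Shade=255,102,153
Shading Enabled=true
Sort Type=0

[General]
Days Meta Infos=30
Modified Notification=false
"""

c = configparser.ConfigParser()
c.read_string(text)

# Option names are converted to lowercase by default:
general_opts = c.options("General")
print(general_opts)

# Typed access instead of plain strings:
days = c.getint("General", "Days Meta Infos")
shading = c.getboolean("Filelist", "Shading Enabled")
print(days, shading)

# A fallback is returned, if the option doesn't exist:
missing = c.get("General", "No Such Option", fallback="n/a")
print(missing)

# Values can be split further by hand:
rgb = [int(x) for x in c.get("Filelist", "Edit Shade").split(",")]
print(rgb)
```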

14. PDF and Python

For creating new PDFs using Python, I suggest taking a look at the open-source-version of the "ReportLab PDF Toolkit" module-library.

For postprocessing already existing PDFs, on Linux I'd use the console-tool "pdftk". For Python, there's also "pypdf", the successor of "pyPdf".

15. Sending Emails

Python comes with everything that is needed to establish an email-client without the help of other programs. Modules included are "smtplib" and "email".

Sending emails is very easy with the script simplemail. On its page is an example of how to use it. It's for Python 2.x, so the module may need a bit of adaptation.
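
Using only the standard modules "smtplib" and "email", sending a plain-text email could look like the following sketch. The functions "build_message()" and "send_message()" are made-up helpers, and the server-name, the addresses and the password are placeholders. It is also an assumption, that the server accepts SSL-connections on port 465; other servers may need "smtplib.SMTP()" with "starttls()" instead:

```python
import smtplib
from email.message import EmailMessage

def build_message(sender, recipient, subject, text):
    # Put together headers and body of a plain-text email:
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(text)
    return msg

def send_message(msg, host, port, user, password):
    # SMTP_SSL opens an encrypted connection right away:
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(user, password)
        server.send_message(msg)

msg = build_message("user@exampleserver.com", "friend@exampleserver.com",
                    "Hello", "Some text.")
print(msg)

# Not executed here, because the server is only a placeholder:
# send_message(msg, "smtp.exampleserver.com", 465,
#              "user@exampleserver.com", "examplepassword")
```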

16. Receiving Emails from a POP3-Server

Email's data-format was originally meant for transmitting just plain text. So all data, even of attachments like .jpg- or .mp3-files, is encoded and put together into a single text-message. Such a message has headers beginning with "To:" or "Date:", followed by the email's text and so on.
So after receiving an email-message from a server, you have to parse it into the parts you're interested in. You can do this with the module "email.message".

The following script accesses a POP3-email-server "pop3.exampleserver.com", fetches all emails for user "user@exampleserver.com" there using password "examplepassword" and stores all email-parts including attachments in a handy Python-list. The script just prints this list, but of course you can do much more interesting things with it, for example extract only the messages' texts or save the attachments.
As you can see, the fifth element (index 4) of each email's list holds the attachments. Each attachment is stored as another list with just two elements, the attachment-filename and the attachment-data. The latter can be saved using "fp = open(filename, "wb")" and proceeding as described above in "File-operations":

#!/usr/bin/python
# coding: utf-8

import sys
import poplib
import email
from email import message 

HOST = "pop3.exampleserver.com"
PORT = 995
USER = "user@exampleserver.com"
PASSWORD = "examplepassword"

def getAllEmails(host, port, user, passw): 

    try:
        M = poplib.POP3_SSL(host, port)
    except:
        print("Error: Email-server not found.")
        sys.exit(1)

    try:
        M.user(user) 
        M.pass_(passw)
    except:
        print("Error logging in: Maybe invalid username or password.")
        sys.exit(2)
    
    emails = [] 
         
    maillist = M.list()[1]

    if len(maillist) > 0: 

        for mailnr in range(len(maillist)):

            adress = "" 
            subject = "" 
            content = "" 
            attachment = [] 

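            # M.retr() returns the message as a list of byte-strings, one per line:
            textbyteslist = M.retr(mailnr + 1)[1]

            # Parse the raw bytes directly, so emails in other encodings
            # than UTF-8 don't raise a UnicodeDecodeError:
            m = email.message_from_bytes(b"\n".join(textbyteslist))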

            types = m.items() 
                 
            for a in types:

                if a[0] == "From": 
                    adress = a[1] 

                if a[0] == "Subject":
                    subject = a[1]
                                 
            for part in m.walk(): 

                if part.get_content_maintype() == 'multipart': 
                    continue 
 
                if part.get_content_maintype() == "text":
                    content = part.get_payload(decode = True)
                         
                if  part.get_filename() is not None: 
                    attachment.append([part.get_filename(), part.get_payload(decode = True)]) 
                 
            emails.append([mailnr + 1, adress, subject, content, attachment]) 

    M.quit() 

    return emails

a = getAllEmails(HOST, PORT, USER, PASSWORD)
print(a)

It is also possible to delete emails directly on the distant server using the method "dele()" of the POP3-object (in the script above, that would be "M.dele(mailnr + 1)").
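
The parsing-part of the script can be tried without any server. The following sketch first builds a test-message with a small attachment (sender, subject and the filename "data.bin" are made up) and then takes it apart again in the same way as "getAllEmails()" does:

```python
import email
from email.message import EmailMessage

# Build a test-message with one attachment:
orig = EmailMessage()
orig["From"] = "user@exampleserver.com"
orig["Subject"] = "Test"
orig.set_content("Some text.")
orig.add_attachment(b"\x00\x01\x02", maintype="application",
                    subtype="octet-stream", filename="data.bin")

# This is roughly what a server would deliver:
raw = orig.as_bytes()

# Parse the raw bytes and walk through the parts of the message:
m = email.message_from_bytes(raw)

content = b""
attachment = []
for part in m.walk():
    if part.get_content_maintype() == "multipart":
        continue
    if part.get_content_maintype() == "text":
        content = part.get_payload(decode=True)
    if part.get_filename() is not None:
        attachment.append([part.get_filename(), part.get_payload(decode=True)])

print(m["From"], m["Subject"])
print(content)
print(attachment)

# Saving the attachments would then be:
# for filename, data in attachment:
#     with open(filename, "wb") as fp:
#         fp.write(data)
```

Notice, that "get_payload(decode = True)" returns bytes, not strings. That's why the attachment has to be written in mode "wb".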

17. Setting up a small html-Server

It's amazing, that it's possible to set up a small html-server with just a few lines of Python-code:

#!/usr/bin/python
# coding: utf-8

# httpd.py

import sys

from http.server import HTTPServer
from http.server import CGIHTTPRequestHandler

serveradresse = ("", 8080)
server = HTTPServer(serveradresse, CGIHTTPRequestHandler)

try:
    print()
    print("Serving:")
    print("Please open a browser and point it to \"http://localhost:8080/test.html\"")
    print("Press \"CTRL+c\" to exit.")
    server.serve_forever()

except KeyboardInterrupt:
    print("Bye.")
    sys.exit()

It can serve .html-files found in and below its directory.
For example, you can put a file "test.html" there and access it with an internet-browser pointing to "http://localhost:8080/test.html".
If you just go to "http://localhost:8080/", the server tries to show you a page "index.html" in its directory.
You can also use the server to test your CGI-scripts. Notice, that "CGIHTTPRequestHandler" is deprecated since Python 3.13.

I wouldn't take this small server to the internet, as there may be security risks. For internet purposes, I suggest using a larger server like "Apache".
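
If no CGI-scripts are needed, a variant with "SimpleHTTPRequestHandler" may be enough. The following sketch is not the script from above: The helper-functions "start_server()" and "fetch_test_page()" are made up, the server only listens on "127.0.0.1", and it is stopped again after delivering a single test-page from a temporary directory:

```python
import functools
import http.client
import os
import tempfile
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

def start_server(directory, port=0):
    # Port 0 lets the operating system pick a free port.
    # Binding to "127.0.0.1" keeps the server unreachable from other machines.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server

def fetch_test_page():
    # Create a directory with a single page "test.html" to serve:
    docroot = tempfile.mkdtemp()
    with open(os.path.join(docroot, "test.html"), "w") as fh:
        fh.write("<html><body>Hello.</body></html>")

    server = start_server(docroot)
    try:
        # Act as a very simple browser and request the page:
        conn = http.client.HTTPConnection("127.0.0.1",
                                          server.server_address[1], timeout=5)
        conn.request("GET", "/test.html")
        body = conn.getresponse().read().decode("utf-8")
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()

print(fetch_test_page())
```

For just looking at some files in a browser, "python -m http.server 8080" on the command-line does the same without writing any script.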

18. Creating Windows ".exe"-Files From Python Scripts

Especially on Windows, many users seem to be reluctant to install the Python interpreter. They are used to computer programs coming as Windows executables, that is as ".exe"-files, not as scripts with the file extensions ".py" or ".pyw", which look unfamiliar to them.
However, it is possible to pack the parts of the Python interpreter and all the modules a script needs to run into a single ".exe"-file. That way, the script can also be handed out to users who can't cope with installing the Python interpreter or additional modules.

A program to create such a Windows executable is PyInstaller. To turn a GUI-script into a single ".exe"-file, it would be run like this:

pyinstaller --onefile --windowed yourscript.pyw

More options of PyInstaller can be found here.

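Another way to create such an ".exe"-file would be py2exe. As it was written as an extension to the "distutils"-package (which was removed from the standard library in Python 3.12, so the example below may need adjusting on newer Python versions), it has to be used a bit differently.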
First, a file called "setup.py" has to be created, which holds all the options and points to your Python script. "setup.py" may look like this:

# setup.py
from distutils.core import setup
import py2exe, sys, os

sys.argv.append('py2exe')

setup(
    options = {'py2exe': {'bundle_files': 1, 'compressed': True}},
    windows = [{'script': "yourscript.pyw"}],
    zipfile = None,
)

Then, this command is run to create the ".exe"-file:

python setup.py py2exe

By default (with a simpler "setup.py"-file) py2exe creates a large directory with a much smaller ".exe"-file and all the library-files needed to run it.

More options of py2exe's "setup.py"-file can be found here.

19. Encodings

The encoding-line at the beginning of a script tells the Python interpreter, which encoding the script-file uses:

# coding: utf-8

for UTF-8 or

# coding: iso-8859-1

for ISO-8859-1 (= Latin 1). In Python 3, UTF-8 is assumed, if the line is missing.

Sometimes, there can be problems with encodings when you try to output text, either to the console or to GUIs (like tkinter). I suggest experimenting with "str.encode()" then. At least, this works on my system:

#!/usr/bin/python
# coding: utf-8

a = "Zwölf Boxkämpfer jagen Eva quer über den großen Sylter Deich."
print(a)

On Linux, the result depends on the system locales, printed by the "locale"-command, like for example "$LANG". "$XTERM_LOCALE" may be another variable to look for.
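
Experimenting with "str.encode()" could start like this sketch. It shows, that the same text has a different length in bytes, depending on the encoding:

```python
import sys

a = "Zwölf Boxkämpfer"

# str.encode() turns text into bytes in a given encoding:
utf8 = a.encode("utf-8")
latin1 = a.encode("iso-8859-1")
print(len(a), len(utf8), len(latin1))

# bytes.decode() is the way back; the encoding has to match:
roundtrip = utf8.decode("utf-8")
print(roundtrip == a)

# Characters unknown to the target encoding can be replaced:
ascii_only = a.encode("ascii", errors="replace")
print(ascii_only)

# The encoding Python uses for printing to the console:
print(sys.stdout.encoding)
```

In UTF-8, each of the two umlauts takes two bytes, in ISO-8859-1 just one. Decoding bytes with the wrong encoding is a typical reason for garbled umlauts.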

Author: hlubenow2 {at-symbol} gmx.net