Last week, I complained about how bulky VBA (Visual Basic for Applications) can be. My current project is to create a spreadsheet to keep track of upcoming NCOERs. This is an easy task in anything but the most used and abused application among Army admin personnel, Microsoft Excel. Now, the data is easy enough to pull once you get an SSL connection with this website (to access it, you need an AKO username and password). The Interactive Web Response System will allow you to pull data on any past, due or current NCOERs (evaluations). This is great, as I can track our evaluations as they move up the chain and see when they are late. Unfortunately, the web designers have only sorted the information by last name.
So, my basic program is complete. I can make a connection (fortunately, MSXML 6.0 makes it easy to send a POST request to an SSL website, something that is not an easy task in Python). It appears to use Internet Explorer’s library for the connection. Once the connection is made and the data is loaded, I parse it. Here is where VBA gets bulky. Fortunately, I can use a reference to VBScript’s RegExp object, but it is far from being as complete as the regex engine in just about any other language. On top of that, parsing text through splits and such is an amazing pain. Here’s an example, first in Python and then in VBA.
import re
test = 'A simple "test of the languages" will show how bad VBA is!'
a = re.findall(re.compile(r'".*"', re.I), test)
# findall returns a list, so print the first match rather than the list itself
print("Result = %s" % a[0])
## Result = "test of the languages"
Private Sub CommandButton1_Click()
' must include Microsoft VBScript Regular Expressions 5.5 in references
Dim re As New RegExp
Dim testing As Variant
Dim test As String
' to escape quotes "" can be used or chr(34).
' unbelievably, chr(34) is easier to read than ""
test = "A simple " & Chr(34) & "test of the languages" & Chr(34) & " will show how bad VBA is!"
re.Pattern = Chr(34) & ".*" & Chr(34)
re.IgnoreCase = True
Set testing = re.Execute(test)
Range("A1").Value = "Result = " & testing(0)
' Result = "test of the languages" in "A1"
End Sub
See the difference? I know it’s pretty close, but keep in mind this is a single line of text. Multiple lines, large text files, etc. can be horrendous. What in the world are Left, Mid and Right anyhow? I know 90% of this is that I refuse to touch this language if I can avoid it, but still, this is ridiculous!!
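To give a feel for the multi-line case, here is a minimal Python 3 sketch. The sample text is made up for illustration. It pulls every quoted phrase out of a block of text and shows the difference between the greedy `.*` used above and the lazy `.*?`.

```python
import re

# Made-up sample text spanning two lines.
text = '''He said "first quote" and then "second quote".
On the next line there is "a third".'''

# Greedy: runs from the first quote on a line to the last quote on that line.
greedy = re.findall(r'".*"', text)

# Lazy: stops at the nearest closing quote; the group drops the quote marks.
lazy = re.findall(r'"(.*?)"', text)

print(greedy)
print(lazy)
```

The lazy version returns each phrase separately, which is usually what you want when a line holds more than one quoted string.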
More for the ongoing series on producing an XFDL viewer in Linux. In the previous tutorial, we decompressed an XFDL file, but I have had trouble recompressing it. It turns out that I need to experiment to find the exact compression settings gzip needs for the form to be readable again. That will be for the next update, though. For now, here is a short preview of what’s next.
An XFDL (Extensible Forms Description Language) file is an XML-based form from IBM, meant to run through their interpreter. IBM has some great documentation on this format. PureEdge works much like a browser does: it decompresses the file according to its MIME type, then parses and reads it, including embedded binaries (pictures, files, etc.) and embedded code (custom functions). My interpreter has a long way to go, so I’ll be happy just to place my values in the correct fields. I’m re-reading up on XML parsing in Python to make this an easy function, so be patient on that part. But for those eager to see what I’m talking about, I’ve pasted a small section of XML from a decompressed XFDL.
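For reference, the naive way to go back the other direction looks something like the Python 3 sketch below. It is only a starting point for the experimentation, not a fix: the function name `xml_to_xfdl` is mine, the header string is an assumption about what the first line of a real file looks like (in practice, reuse the first line of the original file), and whether the viewer cares about the gzip level or the base64 line wrapping is still an open question.

```python
import base64
import gzip

def xml_to_xfdl(xml_bytes, header):
    """Gzip and base64-encode XML bytes, then prepend the header line."""
    compressed = gzip.compress(xml_bytes)
    # encodebytes wraps its output at 76 characters per line; whether the
    # viewer expects wrapped lines has not been verified.
    payload = base64.encodebytes(compressed).decode("ascii")
    return header + "\n" + payload

# Assumed header; copy the real first line from the original file instead.
HEADER = 'application/vnd.xfdl;content-encoding="base64-gzip"'
doc = xml_to_xfdl(b"<XFDL/>", HEADER)

# Round trip: split off the header, then decode and gunzip the rest.
first_line, _, rest = doc.partition("\n")
restored = gzip.decompress(base64.b64decode(rest))
print(restored == b"<XFDL/>")
```

The round trip only proves that Python can read back what Python wrote. It says nothing yet about what PureEdge will accept.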
<ae>Times New Roman</ae>
<acclabel>d ay form 46 44-r, december 19 82.
ay p d. p e version 1.00.
edition of 1 august 19 77 is obsolete.
army reserve reenlistment data.
for use of this form, see ay r 1 40-1 11, the proponent agency is r c p ay c.
item 1. enter name using last name comma first name comma middle initial format.</acclabel>
As you can see, there is a <value> tag for these nodes. For my next post, I’ll write some Python code to break this XML into an object that can print the label and insert a value into the XML. There is a lot of work ahead to interpret the embedded items, code and other tags, but this will be a start!
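As a rough idea of where this is heading, here is a minimal Python 3 sketch using the standard library’s ElementTree. The fragment, the `field` element, the `sid` attribute and the helper name `set_field_value` are all hypothetical stand-ins; a real decompressed form is namespaced and much more deeply nested, so treat this as the shape of the approach rather than working XFDL code.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only; not copied from a real form.
SAMPLE = """<page sid="PAGE1">
  <field sid="NAME">
    <acclabel>item 1. enter name using last name comma first name comma middle initial format.</acclabel>
    <value></value>
  </field>
</page>"""

def set_field_value(root, sid, text):
    """Set the <value> of the field with the given sid and return its label."""
    for field in root.iter("field"):
        if field.get("sid") == sid:
            value = field.find("value")
            if value is None:
                # Create the <value> node if the field does not have one yet.
                value = ET.SubElement(field, "value")
            value.text = text
            return field.findtext("acclabel", default="")
    raise KeyError(sid)

root = ET.fromstring(SAMPLE)
label = set_field_value(root, "NAME", "DOE, JOHN A.")
print(label)
print(ET.tostring(root, encoding="unicode"))
```

Returning the label alongside the update makes it easy to print what a field is asking for before filling it in.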
Earlier, I wrote about using PureEdge Viewer, which is Windows software from IBM, on Linux through Wine. This got me thinking: do we need to use Windows software at all? A quick look at the file and a Google search make it easy to see that an *.xfdl file is a gzipped, base64-encoded XML file. So this is part one of what I hope will be a tutorial on designing software on Linux, using Python, to open, read and write XFDL files in the same way as PureEdge Viewer. It gets annoying, to say the least, to need to open Windows in VirtualBox or an actual install to read and edit *.xfdl files. The barriers I see at this preliminary point are converting the XML to a readable image and then making that editable where possible.
So, in this first part, we will do the very basics: converting an *.xfdl file to an *.xml file. The code should be self-explanatory, but if there are questions, post them in the comments and I’ll do my best to get you an answer.
""" IMPORTS """
from base64 import b64decode
import gzip, os, sys

""" DEATH!!! """
# Standard way to die...
def die(msg=None):
    if msg is None:
        msg = "Unknown error."
    print(" [*] ERROR - %s" % msg)
    sys.exit(1)

""" CHECK FOR FILE """
# No file name, then we have nothing to do!
if len(sys.argv) < 2:
    die("Did not specify a file name.")

""" GET FILE """
# In a more advanced version, this will check the magic value of the file as well to
# ensure it is an *.xfdl file.
filename = sys.argv[1]
print("Using %s" % filename)

""" OPEN FILE AND SPLIT """
# Nothing tough, grab the magic number (1st line) and then store the rest as a variable.
f = open(filename, 'r').read()
magic = f.splitlines()[0]
print("magic: %s" % magic)
data = f.split(magic, 1)[1]
print("Got Data.")

""" BASE64 DECODE """
# First we decode the base64 and write the gzipped bytes to a temporary file.
f = open('temp.gz', 'wb')
f.write(b64decode(data))
f.close()
print("Base64 Decoded.")

""" GUNZIP DATA """
# Yes, I know this writing to a file and then deleting it is ugly but I have not found
# a way to gunzip from a data stream.
f = gzip.open('temp.gz', 'rb')
gunzip_data = f.read()
f.close()
os.remove('temp.gz')
print("Gunzipped Data.")

""" SAVE XML FILE """
# As this gets more advanced, it should be able to stay as a data stream for editing.
filename = os.path.splitext(filename)[0] + ".xml"
f = open(filename, 'wb')
f.write(gunzip_data)
f.close()
print("Saved to temporary file '%s'" % filename)
Nothing too involved here: simply open the file, strip out the first line, then decode the rest from base64 and gunzip that data to get the XML inside. In the next tutorial, we’ll look at the structure of that XML, once I actually understand it or find decent documentation on it!
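On the temporary file: in Python 3, `gzip.decompress` works directly on bytes, so the same steps can probably stay in memory. Below is a minimal sketch under that assumption. The function name `xfdl_to_xml` is mine, and the header line in the stand-in document is a guess at what a real file starts with, used here only so there is something to decode.

```python
import base64
import gzip

def xfdl_to_xml(text):
    """Return the XML bytes inside an XFDL document given as a string."""
    # The first line is the header ("magic"); everything after it is base64.
    header, _, payload = text.partition("\n")
    compressed = base64.b64decode(payload)
    # Decompress the gzip bytes in memory, with no temp file involved.
    return gzip.decompress(compressed)

# Build a tiny stand-in document to exercise the function.
# The header below is an assumption, not taken from a real form.
xml = b"<XFDL><value>test</value></XFDL>"
sample = ('application/vnd.xfdl;content-encoding="base64-gzip"\n'
          + base64.b64encode(gzip.compress(xml)).decode("ascii"))

print(xfdl_to_xml(sample))
```

For a real file, something like `xfdl_to_xml(open(filename).read())` should give the same bytes the script above writes to disk.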
As a military member, I am issued a CAC card, which is a smart card carrying a PKI certificate for logging onto web interfaces and signing documents. As a Linux user, I’ve been frustrated by the Windows-specific tools for CAC cards and have been happy to find alternatives through libcoolkey and pcsc_lite. These work well with Firefox and I can log onto web interfaces just fine, but I have not had a viable Linux solution for signing documents. In the military, we use an IBM product, PureEdge, for our forms, which are *.xfdl documents.
The solution I have come to, although not purely Linux, allows me to sign documents without rebooting to another OS. For testing purposes, I have had Windows XP running in VirtualBox, so I decided to use that. The problem was that I was using virtualbox-ose and needed the non-open-source version for access to USB devices. The process from there is relatively simple and well documented through simple Google searches:
- Add your user to the vboxusers group.
- Configure your USB devices (in this case the CAC card reader) to be detected by the guest OS.
- Run the guest OS and install ActivClient 6.0 and Silanis ApproveIt to be able to sign *.xfdl documents.
See, simple… hmm, that sounds too close to a Windows 7 ad, but I promise it is not! I may be a PC but Windows 7 was NOT my idea!!