Get data from XML and CSV files to SQL using Python.
I recently started to help manage a university-wide grant. Although the grant has a long history, nothing has been done to keep a record of the applicants’ information. I intend to build a simple database that will allow us to track such information. The first step is to collect basic information about faculty: last name, first name, department, phone number, email address, and the semester in which they applied for the grant. I know, I know, the grant information should be in a different table. Anyway, the data I can retrieve from the last two semesters are in different formats: one is an XML file and the other is a CSV file. My goal is to collect the faculty information into a single table in SQLite. Python’s sqlite3 module is part of the standard library, so you do not need to install anything else to work with SQLite. My code runs in Python 3.5.
I first worked with the XML file, which is named ‘Spring16Contact.xml’. Below is the screen capture of the code:
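In case the screen capture is hard to read, here is a minimal sketch of the same idea rather than my exact script. It assumes each person is stored in a `<faculty>` element with child elements named `lastname`, `firstname`, `department`, `phone`, and `email`; those tag names, the table name `Faculty`, and the database file name in the usage note are all assumptions for illustration, so adjust them to match your own file.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed XML child tags, in the same order as the table columns.
FIELDS = ["lastname", "firstname", "department", "phone", "email"]


def load_xml_contacts(xml_source, conn, semester):
    """Insert one row per <faculty> element and return the row count."""
    # Create the table on first use (hypothetical table name and columns).
    conn.execute(
        """CREATE TABLE IF NOT EXISTS Faculty (
               lastname TEXT, firstname TEXT, department TEXT,
               phone TEXT, email TEXT, semester TEXT)"""
    )
    root = ET.parse(xml_source).getroot()
    rows = []
    for person in root.iter("faculty"):
        # findtext returns the default when a child element is missing.
        values = [(person.findtext(tag, default="") or "").strip()
                  for tag in FIELDS]
        rows.append(tuple(values) + (semester,))
    # Placeholders (?) let sqlite3 handle quoting of the values.
    conn.executemany("INSERT INTO Faculty VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

To use it, open a connection with something like `conn = sqlite3.connect("grant.sqlite")` (a made-up file name) and call `load_xml_contacts("Spring16Contact.xml", conn, "Spring 2016")`.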
I then worked with the CSV file. Since the table has already been created, I only need to get data from the CSV file and then add them to the existing table. Note: I deleted the header row in the CSV file so the header information will not be inserted in the table. Below is the screen capture of the code:
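Again, as a rough sketch rather than my exact script: the function below appends CSV rows to an existing six-column table. It assumes the table is called `Faculty` and that the CSV columns are in the order last name, first name, department, phone, email; the `has_header` option is something I added for illustration, so that the header row can be skipped in code instead of being deleted from the file.

```python
import csv
import sqlite3


def load_csv_contacts(csv_file, conn, semester, has_header=False):
    """Append rows from an open CSV file to the existing Faculty table."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the header row instead of deleting it
    rows = []
    for record in reader:
        if len(record) < 5:
            continue  # ignore blank or incomplete lines
        # Keep the first five fields and add the semester as the sixth.
        rows.append(tuple(field.strip() for field in record[:5]) + (semester,))
    conn.executemany("INSERT INTO Faculty VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

You would call it with an open file, for example `with open("contacts.csv", newline="") as f: load_csv_contacts(f, conn, "Fall 2015")`, where the file name and the semester label are placeholders.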
The end result in SQLite is shown in the DB Browser for SQLite. I masked faculty’s personal information, but you can see the records from the XML and CSV files have been put into one table.
Combine multiple screen captures into one PDF file using the Automator under Mac OS X.
In my last post, I showed how to use the command line in the Terminal window in Mac OS to combine multiple screen captures into one PDF file. If you are not familiar with the command line, it might be a little bit daunting. Today, I will show how to achieve the same goal using the Automator tool in Mac OS, which does not require any programming background at all.
Automator comes with Mac OS 10.4 or later. You can open it from the Applications folder. Automator allows you to create workflows and services for tasks that are repetitive in nature and tedious to do manually. What we wish to accomplish here is exactly of that nature.
Once in Automator, you can choose the Workflow icon from the template to start a new workflow.
What we want to do here involves three actions in Automator. First, select the folder that holds the screen capture images to be combined. Second, get the folder contents. Third, combine the images into one PDF file. You can find the relevant actions in the Files & Folders library and the PDFs library, respectively. Drag the actions to the right panel in the order in which they will be executed and specify the options for each action. Save your workflow so that you can use it later. Click the Run button at the top right corner. Automator will ask you to select the folder. Once the folder is selected, the workflow will run. The log in the bottom right panel will show each step and the time used. Check the PDF file to make sure it is what you want. You are done. It is that simple.
Combine multiple screen captures into one PDF file under Mac OS X.
You may have encountered the following situation: you have taken multiple screen captures under Mac OS and you want to combine them into one PDF file without too much manual work. Here is what I did using the Mac OS Terminal; I hope it helps if you are in a similar situation. This solution has an advantage when you have a lot of files to process: it is much easier than dragging and dropping in Preview.
There are one or two steps involved, depending on whether your default screen capture is in an image format or PDF format. If your default setting is PDF format, you can skip Step 1.
Step 1: Convert image files to PDF files.
If, like me, your default screen capture is in an image format, such as .png, you need to convert each .png file to a PDF file first. You can tweak the code a little bit if the images are in a different format, for instance, .jpg.
- Make sure you have sips (scriptable image processing system) on your computer; it comes with Mac OS 10.4 or later. Type sips -h and you should see something similar to the image below if you have sips.
- Put all the image files in a folder and use the command line to go to that folder. Then run sips as a batch process (note: there are two short “-”s before “out”; the display may show them as one long dash). The quotes around $i keep file names that contain spaces in one piece:
for i in *.png; do sips -s format pdf "$i" --out "$i.pdf"; done
- You can double check your folder to make sure everything is done right. Your file names may look like image1.png.pdf, which is OK and you do not need to change the file names.
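If you prefer Python to a shell loop, the same batch step can be sketched as below. This is only an illustration, not something I ran for this post: the function names are made up, and it only uses the sips options already shown above (`-s format pdf` and `--out`). Passing the command as a list means file names with spaces need no extra quoting.

```python
import subprocess
from pathlib import Path


def sips_command(image_path):
    """Build the sips command that converts one image to a PDF file."""
    image_path = Path(image_path)
    # image1.png becomes image1.png.pdf, the same naming as the shell loop.
    output_path = image_path.with_name(image_path.name + ".pdf")
    return ["sips", "-s", "format", "pdf",
            str(image_path), "--out", str(output_path)]


def convert_folder(folder, pattern="*.png"):
    """Run sips on every matching image in the folder (Mac OS only)."""
    for image_path in sorted(Path(folder).glob(pattern)):
        # check=True raises an error if sips fails on a file.
        subprocess.run(sips_command(image_path), check=True)
```

Calling `convert_folder(".")` from the folder that holds the images would then do the same job as the one-line loop.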
Step 2: Merge PDF files into one PDF file
- From Tiger onwards, Mac OS comes with a Python script that allows you to combine PDF files. Type or copy the command below into the Terminal window. The -o option names the combined file, here merged.pdf. After that, you list the files to combine: either all the PDF files in a folder (as in pdfs/*.pdf) or the individual PDF files by path and name. For me, all the PDF files waiting to be combined are in a folder named pdfs, and merged.pdf will be created in the directory that contains the pdfs folder.
"/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py" -o merged.pdf pdfs/*.pdf
That’s it. You are done. Check the file named merged.pdf to make sure everything is alright.
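One thing to watch: a pattern like pdfs/*.pdf lists files in plain alphabetical order, so image10.png.pdf comes before image2.png.pdf. If the page order matters, the file list can be sorted by number first. Below is a small sketch of that idea in Python; the helper names are made up for illustration, and it only uses the join.py path and the -o option shown above.

```python
import re
import subprocess

JOIN_SCRIPT = ("/System/Library/Automator/Combine PDF Pages.action"
               "/Contents/Resources/join.py")


def natural_key(name):
    """Split a name into text and number parts so image10 sorts after image2."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def join_command(pdf_names, output="merged.pdf"):
    """Build the join.py command with the PDF files in natural order."""
    ordered = sorted(pdf_names, key=natural_key)
    return [JOIN_SCRIPT, "-o", output] + ordered
```

On a Mac that has the script, `subprocess.run(join_command(names), check=True)` would run the merge, where `names` is your own list of PDF paths.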