Converting Internet Archive Books to LUNA Books

(7.5.3.4+) These instructions describe how to add Internet Archive books to LUNA. The process was carried out entirely on a Linux platform and was designed for processing a large number of books, although the bash script is also helpful for just a few.

Identify the books you want and generate download links

You will need a list of the Internet Archive URLs for the books you want to load.

If you go to archive.org and open a book, its URL will look like the one below.

https://archive.org/details/harpersweeklyv7bonn

The Internet Archive identifier for this book is the last part of that URL: harpersweeklyv7bonn

To download the book for upload into LUNA, you will need links to two files: the book's jp2 files in zip format and the pdf file.

https://archive.org/download/<Identifier>/<Identifier>_jp2.zip

https://archive.org/download/harpersweeklyv7bonn/harpersweeklyv7bonn_jp2.zip

https://archive.org/download/<Identifier>/<Identifier>.pdf

https://archive.org/download/harpersweeklyv7bonn/harpersweeklyv7bonn.pdf

Here is a link to a spreadsheet in Google Docs that builds the URLs from an identifier:

https://docs.google.com/spreadsheets/d/1kjOP1z7T7Bto4a4rQBgLwg5G6KP-3Tu8UsF6wBTKnLA/edit?usp=sharing

It takes the Internet Archive identifier (in cell A1 in the formula below) and applies a formula to create the download links:

=CONCATENATE("https://archive.org/download/",A1,"/",A1,"_jp2.zip")
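
If you would rather build the links on the Linux machine, the same URL pattern can be sketched in bash. This is only an illustration, not part of the LUNA tooling: the file name identifiers.txt (one identifier per line) and the function name make_links are assumptions made for this example.

```bash
#!/usr/bin/env bash
# Sketch: print the _jp2.zip and pdf download URLs for Internet Archive identifiers.
# identifiers.txt is a hypothetical file with one identifier per line.

make_links() {
    local id="$1"
    printf 'https://archive.org/download/%s/%s_jp2.zip\n' "$id" "$id"
    printf 'https://archive.org/download/%s/%s.pdf\n' "$id" "$id"
}

if [ -f identifiers.txt ]; then
    # The "|| [ -n "$id" ]" part also handles a last line with no trailing newline.
    while IFS= read -r id || [ -n "$id" ]; do
        # Skip blank lines.
        if [ -n "$id" ]; then
            make_links "$id"
        fi
    done < identifiers.txt
fi
```

Redirecting the output to a text file gives a list of links that can be copied in one go.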

Download the _jp2.zip and pdf files

You can use JDownloader 2 to download the books in bulk.

https://jdownloader.org/jdownloader2

Set the directory where you want to store the downloads

Copy the URLs that you want to download and paste them into the tab called LinkGrabber

Start the download and choose the download speed you want
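
With a large batch it is easy to end up with a book that has only one of its two files. The following bash sketch reports such cases before you move on to the conversion. It assumes that both files for a book sit in the same directory and keep their Internet Archive names (<Identifier>_jp2.zip and <Identifier>.pdf); the function name check_pairs is made up for this example and is not part of the conversion script.

```bash
#!/usr/bin/env bash
# Sketch: list identifiers that are missing either the _jp2.zip or the pdf file.
# Assumes files are named <Identifier>_jp2.zip and <Identifier>.pdf in one directory.

check_pairs() {
    local dir="${1:-.}" zip pdf id missing=0

    for zip in "$dir"/*_jp2.zip; do
        [ -e "$zip" ] || continue              # no zip files: the glob matched nothing
        id="$(basename "$zip" _jp2.zip)"
        if [ ! -f "$dir/$id.pdf" ]; then
            echo "missing pdf: $id"
            missing=1
        fi
    done

    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue              # no pdf files: the glob matched nothing
        id="$(basename "$pdf" .pdf)"
        if [ ! -f "$dir/${id}_jp2.zip" ]; then
            echo "missing zip: $id"
            missing=1
        fi
    done

    # Returns 0 when every book has both files, 1 otherwise.
    return "$missing"
}
```

Calling check_pairs with the download directory as its argument prints one line per incomplete book and prints nothing when every book has both files.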


Convert downloaded zip files and pdf files into LUNA book zip format

On the Linux machine, install “rename”:

>sudo apt-get install rename

We made a bash script that converts the IA books into LUNA book upload zip files.

Place the bash script file, makeLUNABookFromIA.sh, in the directory where all the zip and pdf files are stored.

The script provided is to be used at your own risk and is not a supported product of Luna Imaging Inc. That said, it should be safe to use. It just does some unzipping, renaming, moving of files and zipping.

Book zip files can be very large. The script has an option to delete or keep the source Internet Archive book zip and pdf files, so make sure you review this option before running the script.

Run the script

Once it’s finished you will have two directories: readyforUpload and processed.

The files in readyforUpload are ready to load into LUNA as book objects.