Uploading Books
What files do I need to prepare for a book upload?
The minimum you need is a book.properties file and an image for each page of the book.
If you want the book to be text searchable, you will need a book.properties file, the images for each page, a .pdf file, and an .xml file (the .xml file is not required in 7.1.6+, where it is generated automatically during processing).
How do I upload books? What are the steps?
- Gather the files needed for the book.
- In LUNA, choose Uploader from the Tools menu.
- Choose the collection you want to add the book to.
- Create a new batch.
- Click the "Books" tab at the top.
- Select the files and add them to the batch.
- Upload the files.
- Process the book by clicking the green "Process Book" button.
Processing the book will take some time and slow LUNA down. Large books may take quite some time to process. Consider processing books when fewer users are on the system.
I don't need the book text to be searchable in LUNA:
The collection you intend to upload the book to needs to be created in LUNA.
You need to get the following book items together and ready for upload.
Create a folder/directory containing:
- an images sub directory filled with the images for each page.
- a book.properties file.
I want the book to be text searchable in LUNA:
The collection you intend to upload the book to needs to be created in LUNA.
You need to get the following book items together and ready for upload.
Create a folder/directory containing:
- an images sub directory filled with the images for each page.
- a book.properties file
- a text-searchable PDF
- an XML file converted from the PDF
Preparing the BookReader objects for upload into LUNA
Images Sub Directory
Prepare all the images that will make up your book. Each image represents a page. JPEG and TIFF files are supported and must be numbered sequentially, starting with 0001. We understand that you may have already named your files with unique identifiers; we suggest that you simply copy your files into a new folder and apply the sequential numbering described in this section so that a BookReader object can be created. The file names must be 4 digits long, with leading zeros:
0001.tif
0002.tif
0003.tif
Place the image files in a sub-directory (or folder) called Images within the top level directory of the book.
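The renumbering described above can be scripted. The following is a minimal sketch, not part of LUNA: the function name copy_with_page_numbers is hypothetical, and it assumes that sorting your original file names alphabetically gives the correct page order and that the extensions listed in SUPPORTED are the ones you use.

```python
import shutil
from pathlib import Path

# Assumption: these are the extensions your scans use (JPEG and TIFF).
SUPPORTED = {".jpg", ".jpeg", ".tif", ".tiff"}


def copy_with_page_numbers(source_dir, book_dir):
    """Copy page images into <book_dir>/Images as 0001.ext, 0002.ext, ..."""
    source = Path(source_dir)
    images_dir = Path(book_dir) / "Images"
    images_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: alphabetical order of the original names is the page order.
    pages = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
    if len(pages) > 9999:
        raise ValueError("four-digit file names allow at most 9999 pages")

    copied = []
    for number, page in enumerate(pages, start=1):
        # Four digits with leading zeros; the original files are left untouched.
        target = images_dir / f"{number:04d}{page.suffix.lower()}"
        shutil.copy2(page, target)
        copied.append(target.name)
    return copied
```

For example, copy_with_page_numbers("scans", "Cats") would fill Cats/Images with the numbered copies and return the list of new file names. Check the result against your page order before uploading.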
Create a text file called book.properties
Create a text file called "book.properties" and place it in the top level directory. When you save the book.properties file there should be no extension appended to it.
In this book.properties file include three properties:
BookIdentifier = Cats
ThumbnailImage = 0005
TotalPages = 6
"BookIdentifier" will be the name of the book. The XML and PDF files MUST use this exact name. ("Cats" is the name of the book in the example above.)
"ThumbnailImage" indicates which image file will be used as the thumbnail to represent the book in LUNA. Do NOT include the file extension here. Indicate the image with the four digit number. For example: 0005
"TotalPages" should match the number of images/pages you have in the Images folder.
More than one book can have the same BookIdentifier as long as it is not processed in the same batch.
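If you prepare many books, the book.properties file can be generated from the contents of the Images folder so that TotalPages always matches. This is a minimal sketch under stated assumptions: write_book_properties is a hypothetical helper (not a LUNA tool), every file in Images is assumed to be a page image, and UTF-8 is an assumed encoding.

```python
from pathlib import Path


def write_book_properties(book_dir, book_identifier, thumbnail_page=1):
    """Write <book_dir>/book.properties with the three required properties."""
    book = Path(book_dir)
    # Assumption: every file in the Images folder is one page of the book.
    pages = [p for p in (book / "Images").iterdir() if p.is_file()]
    if not 1 <= thumbnail_page <= len(pages):
        raise ValueError("thumbnail_page must be an existing page number")

    lines = [
        f"BookIdentifier = {book_identifier}",
        # Four-digit page number, without the file extension.
        f"ThumbnailImage = {thumbnail_page:04d}",
        f"TotalPages = {len(pages)}",
    ]
    # The file name is exactly "book.properties", with nothing appended.
    target = book / "book.properties"
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target
```

Calling write_book_properties("Cats", "Cats", thumbnail_page=5) on a folder with six page images would write the three lines shown in the example above.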
Convert your text-searchable PDF into an XML file
CONVERTING YOUR TEXT-SEARCHABLE PDF INTO AN XML FILE IS NO LONGER REQUIRED IN v7.1.6+ (this happens automatically now).
This will make your book searchable. If you do not need the contents of the book to be searchable, skip this step.
You can determine whether your PDF is searchable by opening it and trying to search the contents of the book. If you are able to find results, then you are good to go.
If you want your book to be text searchable you will need to use an Optical Character Recognition (OCR) program to create the PDF from the images you have.
- http://blogs.adobe.com/acrobat/acrobat_ocr_make_your_scanned/
- http://finereader.abbyy.com/
- For small books (10 or fewer pages) https://support.google.com/drive/answer/176692?hl=en
The name of the .pdf must match the name of the book you entered in the book.properties file. If the name of the book you entered in book.properties is Great_Expectations, then the .pdf would be "Great_Expectations.pdf".
Optical Character Recognition software is not perfect. It will probably get the majority of the text correct, but there will likely be some errors. If you are concerned about accuracy, you will need to proofread the PDFs that your OCR software creates.
Once you have a single PDF you are ready to make the special XML file that is used to search the book's contents. To convert your text-searchable PDF into this special XML file (LUNA uses the XML to enable text searching in the Viewer):
Use the New Online Conversion Utility
or
Download the utility called PDFtoXML and install it. (This utility is only available for Windows.)
- This utility makes use of the open source utility pdftohtml but is not the same thing.
After you've installed the utility:
- Copy and paste the PDF into the PDF folder (located in the install directory of PDFtoXML).
- Click on "Convert_PDF_to_XML.bat".
- Open the XML folder, and grab the newly created XML file.
Do not edit the name of the newly created XML file. The utility automatically adds "_text" to the file name. Each time you perform this conversion, be sure that both the PDF and the XML folders are empty beforehand.
Uploading BookReader objects to LUNA
Once you've prepared all your BookReader objects, you can publish them to LUNA using the Uploader. Be sure to select the Books tab.
Can I turn my PDFs into books?
Yes! For best results, we recommend that you create the PDF as a PDF/A-2b. Then you can use the Books tab in the Uploader to upload the .pdf file. Once it is uploaded, you can click the green "Process" button at the top and it will turn the .pdf file you uploaded into a book.
To make books that are created from .pdf files text searchable, you'll need 3 things in each zip file: the .pdf file, the XML file converted from the PDF file, and a book.properties file.
- My.pdf
- My_text.xml
- book.properties
CAUTION: If you upload .pdf files using the Media Items tab, they will not become book objects.
Some PDF files will not look right when processed directly into BookReader objects. In these cases we recommend that you convert the PDF to PDF/A-2b and upload it again.
Can I upload record data with my books?
You are able to upload a record for the book. It will be in the schema that you chose for the collection. If you need a blank .csv with the field names already inserted, you can click "Download blank data schema .csv file" in the lower left corner of the Uploader page. You can then enter the information you want for each field and upload the .csv in the same batch as the book.
My book(s) did not process successfully, how can I tell what’s wrong?
Double check the following:
- Were the images in the sequence and format: 0001.jpg, 0002.jpg, 0003.jpg, etc?
- Did the book.properties file get included in the upload?
- Did the book.properties file include lines for "BookIdentifier", "ThumbnailImage", and "TotalPages"?
- Was the .pdf name the same name you chose for "BookIdentifier" (in book.properties) followed by ".pdf"?
- Was the .xml name the same name you chose for "BookIdentifier" (in book.properties) followed by "_text.xml"?
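The checklist above can be run against a book folder before uploading. The sketch below is one possible way to do that, not a LUNA tool: check_book_folder and read_properties are hypothetical names, the parser assumes simple "Key = Value" lines as in the example book.properties, and it assumes the Images folder holds only page images.

```python
from pathlib import Path

REQUIRED_KEYS = ("BookIdentifier", "ThumbnailImage", "TotalPages")


def read_properties(path):
    """Parse simple 'Key = Value' lines (assumed format) into a dict."""
    props = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props


def check_book_folder(book_dir):
    """Return a list of problems found; an empty list means none were found."""
    book = Path(book_dir)
    props_file = book / "book.properties"
    if not props_file.is_file():
        return ["book.properties is missing"]

    problems = []
    props = read_properties(props_file)
    for key in REQUIRED_KEYS:
        if key not in props:
            problems.append(f"book.properties has no {key} line")

    images_dir = book / "Images"
    images = []
    if images_dir.is_dir():
        images = sorted(p for p in images_dir.iterdir() if p.is_file())
    if not images:
        problems.append("Images folder is missing or empty")

    # Pages must be numbered 0001, 0002, 0003, ... with no gaps.
    for number, image in enumerate(images, start=1):
        if image.stem != f"{number:04d}":
            problems.append(f"expected page {number:04d}, found {image.name}")
            break

    total = props.get("TotalPages")
    if total is not None and not (total.isdigit() and int(total) == len(images)):
        problems.append(f"TotalPages is {total} but Images has {len(images)} files")

    thumb = props.get("ThumbnailImage")
    if thumb is not None and images and all(i.stem != thumb for i in images):
        problems.append(f"ThumbnailImage {thumb} does not match any image")

    identifier = props.get("BookIdentifier")
    if identifier:
        # The PDF and XML are optional, but if present they must be named
        # <BookIdentifier>.pdf and <BookIdentifier>_text.xml.
        for pdf in book.glob("*.pdf"):
            if pdf.name != f"{identifier}.pdf":
                problems.append(f"{pdf.name} should be named {identifier}.pdf")
        for xml in book.glob("*.xml"):
            if xml.name != f"{identifier}_text.xml":
                problems.append(f"{xml.name} should be named {identifier}_text.xml")
    return problems
```

Printing the result of check_book_folder("Cats") lists anything that needs fixing. An empty list only means these particular checks passed; it does not guarantee that processing in LUNA will succeed.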
Books are taking a long time to process and other users need access to the Uploader. What can I do?
You can install the standalone uploader we have available so that all the processing is done on your computer before the book is pushed up to the server to go live. If you only have a few books to upload, this will not be necessary. If you have a huge number of books to upload and uploading 1-3 books a day is not going to be fast enough, let support know and we will help you set up an uploader on your work machine so it can use your system to process the books and then upload them to LUNA.
The LUNA Uploader will be able to upload and process your books. Processing all the images that make up a book can take some time and will put some load on LUNA while the processing is being done. This load can slow down the normal use of LUNA. If you are only going to upload a few books, and not that often, it should not be a problem. You could also upload the books at the end of the day when there are few, if any, users on LUNA.
If you are going to be uploading many books, big books, or many big books, you may want to think about installing the standalone uploader on your machine. This will allow you to access the LUNA interface as usual and upload your books, but all the processing will be done on your computer and the final processed book uploaded to your online LUNA instance. This will take all of the load off LUNA and allow users to continue browsing and working on projects without any stress on the system.
Can I upload more than one book at a time?
Yes. Make a .zip file for each book and upload them. For each book, you will need to get the following together in a folder:
- an images sub directory (filled with your images: 0001.jpg, 0002.jpg, 0003.jpg, etc...)
- a book.properties file
- a text-searchable PDF (optional)
- an XML file converted from the PDF (optional)
Highlight all of these items and create a .zip file with the same name as the book you are uploading. Upload the .zip files in the Uploader under the Books tab.
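When there are many books, the zipping step can be scripted. The following is a minimal sketch, assuming each book folder is named after its book; zip_book is a hypothetical helper, not part of LUNA. It places the folder's contents at the top level of the archive, which mirrors highlighting the items themselves rather than the enclosing folder.

```python
import zipfile
from pathlib import Path


def zip_book(book_dir, output_dir):
    """Create <output_dir>/<book folder name>.zip from the folder's contents."""
    book = Path(book_dir)
    output = Path(output_dir)
    # Keep output_dir outside book_dir so the archive does not include itself.
    output.mkdir(parents=True, exist_ok=True)

    # Assumption: the folder name is the name of the book.
    archive = output / f"{book.name}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(book.rglob("*")):
            if path.is_file():
                # Store paths relative to the book folder, e.g. Images/0001.jpg.
                zf.write(path, path.relative_to(book).as_posix())
    return archive
```

For example, looping over every sub-folder of a "books" directory and calling zip_book(folder, "to_upload") would leave one .zip per book in to_upload, ready to add under the Books tab.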