
Friday, January 29, 2016

Reporting Solution Found For Vault-OS


The HTML-to-PDF solution produces the most brilliant-looking reports imaginable. Custom styles, formatting and layout are all faithful to the original. The sample output above shows the style. Reports can be output in plain black and white or with high contrast applied in the stylesheet.

I tried sending some of my inventory grids from the old VOS, which I had saved as HTML files (except the paginated lists), to the PDF converter, and they came out in the browser window correctly paginated for printing. I had always theorised this was possible, and it is a fantastic way to generate a myriad of reports for the inventory manager, the shopping list, duty roster, radio contacts, inspection lists ... all as PDFs that can be viewed in the browser or sent to a printer.

My 10-year-old Compaq T5710 thin client, which I purchased for $2 on eBay, displays all of VOS correctly, including jQuery controls and Raphael vectors, and it comes with a built-in PDF viewer installed. If the VOS server is running somewhere, it is just plug-n-play and this box becomes a distributed terminal to control or monitor anything else in the shelter. My whole system costs a few dollars for the lot, if you can't find half of it thrown out somewhere.

The drawback is that you need a 20 megabyte open source PDF converter and an accompanying 22 MB .DLL in the "bin" directory for all this to come off correctly. The Lua script that generates the reports as HTML calls the converter when it is done. It is easy to disable this feature and instead pop up the HTML in a new browser window, which the user can then send to a freeware PDF printer, or even directly to the printer if he thinks it will turn out looking okay.

The thing that bugs me about this approach is that it breaks my all-in-one .EXE superserver architecture by requiring a platform-specific utility in the "bin" directory as well. It also adds a physical cache that needs automated management (deleting old files on a regular basis to keep it from growing too big). This isn't too hard to implement, however; most web servers do this. The open source converter comes with executable binaries compiled for almost all major OSes, as well as all source code, so it is totally portable.
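For what it's worth, the cache housekeeping could look something like the sketch below. It is in Python rather than Lua purely for illustration, and the function name `prune_cache` and its parameters are made up for this example, not anything in VOS. It assumes the cache is a flat directory of generated files: anything older than a cutoff is deleted first, then the oldest survivors go until the directory fits under a size cap.

```python
import time
from pathlib import Path


def prune_cache(cache_dir, max_age_seconds, max_total_bytes, now=None):
    """Delete stale files, then the oldest remaining files until the
    cache fits under max_total_bytes. Returns the removed file names.

    All names here are hypothetical; this is a sketch, not VOS code.
    """
    now = time.time() if now is None else now
    removed = []

    # Only plain files in the top level of the cache, oldest first.
    files = [p for p in Path(cache_dir).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime)

    # Pass 1: drop everything older than the age limit.
    survivors = []
    for p in files:
        if now - p.stat().st_mtime > max_age_seconds:
            p.unlink()
            removed.append(p.name)
        else:
            survivors.append(p)

    # Pass 2: drop the oldest survivors until the total size fits.
    total = sum(p.stat().st_size for p in survivors)
    for p in survivors:
        if total <= max_total_bytes:
            break
        total -= p.stat().st_size
        p.unlink()
        removed.append(p.name)

    return removed
```

Called from a timer or at server start-up, something like `prune_cache("cache", 24 * 3600, 50 * 1024 * 1024)` would keep a day's worth of reports and at most 50 MB on disk. Both limits are arbitrary numbers picked for the example.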

Of course, I could just add a configuration flag that turns this process off and launches the HTML report in a browser window and leave it to the user to output the document however their local setup permits.
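A rough sketch of that flag logic follows, in Python for illustration only (the real thing would live in the Lua report script). The names `plan_report_delivery` and `deliver_report` are hypothetical, and the converter is assumed to be a command-line tool that takes an input HTML path and an output PDF path as its two arguments. The default name `wkhtmltopdf` is just a placeholder for whatever binary sits in the "bin" directory; on Windows it would carry an .exe suffix.

```python
import subprocess
import webbrowser
from pathlib import Path


def plan_report_delivery(html_path, bin_dir, pdf_enabled=True,
                         converter_name="wkhtmltopdf"):
    """Decide how a finished HTML report should be delivered.

    Returns ("pdf", command_list) when conversion is enabled and the
    converter binary is present, otherwise ("browser", file_url).
    converter_name is an assumed placeholder, not a VOS setting.
    """
    html_path = Path(html_path).resolve()
    converter = Path(bin_dir) / converter_name
    if pdf_enabled and converter.is_file():
        pdf_path = html_path.with_suffix(".pdf")
        return "pdf", [str(converter), str(html_path), str(pdf_path)]
    # No converter (or flag turned off): hand the HTML to the browser.
    return "browser", html_path.as_uri()


def deliver_report(html_path, bin_dir, pdf_enabled=True):
    """Run the converter if available, else open the HTML report."""
    mode, target = plan_report_delivery(html_path, bin_dir, pdf_enabled)
    if mode == "pdf":
        subprocess.run(target, check=True)
    else:
        webbrowser.open_new(target)
    return mode
```

Keeping the decision separate from the action means the same check covers both cases: the flag being switched off and the converter simply not being installed.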

The reason I am telling Vault-Co readers all these details is that I am hoping somebody out there knows an even better way. For example ...

1. A much smaller HTML-to-PDF converter that either comes as a library or with source under a megabyte. Or a portable renderer that is an executable that takes up less space than 40+ MB.

2. Another way altogether to approach the problem ... like a report formatting tool that would plug into my former solution, LibHaru, with commands to draw tables, grids and layouts and then dump it into a PDF.

3. Another alternative, like a way to build rich text files with tables from a Lua script, or a Lua plug-in with some C code to render to some other document type that would be smaller and more efficient than what I am using. Most importantly, it should compile right in with what I have, so I can return to my all-in-one total solution server that can do everything.

I may be making a fool of myself, but I wonder whether, if I throw a net out there, somebody might have a better paradigm. Forty megabytes in the "bin" directory is a little heavyweight compared with all the other code I have now.

9 comments:

Sam said...

Maybe I'll just annoy you but maybe not. You're not the first person with this problem and some say that the problem is that HTML just doesn't conform well to pdf conversion.

http://www.perlmonks.org/?node_id=744177

Maybe you could make a simpler format than HTML. If so here's an option that says the output can be pdf.

https://en.wikipedia.org/wiki/MakeDoc

http://www.rebol.org/documentation.r?script=makedoc2.r

There's a game library Allegro that has something called makedoc also. Allegro is bundled with makedoc. I wonder if it's the same type makedoc? Not sure. It's very strange but outputs to a lot of file types including pdf.

http://liballeg.org/stabledocs/en/makedoc.html


https://www.allegro.cc/manual/4/tools/makedoc/

Of course none of these are exactly what you want. I saw a php library that converted HTML to pdf. It was 50MB. Ouch!


Hmm...maybe this but no tables.

http://www.codeproject.com/Articles/5872/Pdfizer-a-dumb-HTML-to-PDF-converter-in-C

Enough.

Texas Arcane said...

@Sam

You're not annoying me at all. I am checking on every single link you posted. You've been a huge help.

I think this will be solved by conditional installation. If a person has room for it, they will be able to install the PDF generator; otherwise they can just pop up a browser window and handle print formatting in the local machine environment.

Texas Arcane said...

@Sam

In most cases every single feature I have added (to the version I have been developing at my day job) has been well justified by the size-vs-benefit ratio.

My employer needs PDF report output and this has worked very well, but it should probably be an option during installation of Vault-OS; otherwise it will simply generate formatted reports to HTML windows.

Sam said...

I found a MakeDocPro.

http://www.rebol.org/documentation.r?script=make-doc-pro.r

This is the ticket. It does tables. Its output is normally HTML, and it can be combined with the pdf-maker program.

http://www.rebol.org/documentation.r?script=pdf-maker.r

http://www.rebol.org/view-script.r?script=pdf-tables.r

He also has pdf-maker 2. Very nice and more recent.

http://www.colellachiara.com/soft/PDFM2/

If you look at this pdf-maker document, down at the graphics section, you can see that you could readily display knobs for entering or rendering values. Since Rebol has meta-programming the values could be updated into the script. Not as easy as the other libraries you already have because you might have to work out how to draw the knobs or output meters but I'll bet a lot smaller and easier to code. More upfront thinking but probably less trouble to use once you have them done. No complicated linking to C code.

http://www.colellachiara.com/soft/Misc/pdf-maker-doc.pdf

This doc may be only for pdf-maker 1. Maybe ver 2 has a little more. I think version 2 has tables.


After looking at the docs further, it is apparent that this is not all plug and play. It's going to be hard to find all this "do everything" stuff without having large libraries. I think it's possible. The pdf-maker 2 is 166K, but it's going to take some work to get dials, output graphs and some stuff like that. On the other hand, you're going to have to input all the values to all the other libraries somehow anyway.

He has more stuff if you go up to the preceding dir from the 2nd edition of pdf-maker. There is a pdf-maker-doc.r here. Maybe it's better docs in program format?

http://www.colellachiara.com/soft/Misc/

Anyways enough of this.

Eric Green said...

In the past I've used htmldoc to good effect:

https://www.msweet.org/downloads.php?L+Z1

Htmldoc includes a makefile and project files so you can compile it with gcc or VC++, but the tarballed/zipped source comes in at around 4 MB, so it's likely heavier than what you want. It has a GUI for generating its documents, but you can also call it from the command line, or from PHP, Perl, etc. scripts, assuming it's accessible to the running process or web server.

Texas Arcane said...

@Eric currently looking at it, thanks. Sam had also listed that one. The source code could be trimmed down drastically by excluding a lot of the font code and support routines, including the GUI. If it would come in around a meg it would be excellent; LibHaru was 800K plus font resources compiled in.

Texas Arcane said...

@Eric

Been looking at the source. Definitely could be compiled in with 50% of the source it has now, probably less than 250K added to the program.

Biggest problem is no support for styles. These make everything look pretty nice with little effort. If it doesn't support styles, it won't support my supercool trick to print out barcode 39 with no images, just using CSS. The barcode is critical because the whole reason for the reports originally was to print out scannable barcodes in grids from inventory on adhesive paper, so instant tracking is possible with a CueCat or other compatible barcode scanner. Before I got this job I was working quite hard on a floating HTML window that shows your CueCat scans in real time and, while you are scanning, matches them to a huge master UPC database in the background so you can fill in most of the fields in the inventory form automatically.

Eric Green said...

@ Tex

Based on your previous postings here I assume you mean the solution from http://www.codeproject.com/Articles/146336/Creating-a-Code-Barcode-using-HTML-CSS-and-Java

Other than HTML2PDF and htmldoc, the most recommended HTML->PDF converters that I find seem to be:

dompdf - http://dompdf.github.io/ - fairly sophisticated, fast, lightweight, supports CSS 2.1, but dependent on PHP.

wkhtmltopdf - http://wkhtmltopdf.org/ - handles CSS, uses the Qt/WebKit rendering engine on the command line, offers C bindings, source is on GitHub. But the binaries are pretty big in their own right, and building it yourself means including the Qt/WebKit support libraries.

In the past I've used LibreOffice, which is a solid document converter that can run headless - https://ask.libreoffice.org/en/question/2641/convert-to-command-line-parameter/ - problem is a "minimal" install of LO and all its support libraries is 250+ MB.

All in all, I'd probably just leave PDF printing out of Vault-OS. I can't think of any time I've converted a document to PDF except to ensure consistent formatting for someone outside the current organization who may not have Word 2013, etc. That likely won't be an issue when everyone outside the "current organization" just melted in a nuclear blast.

If you're determined to use htmldoc, the first non-CSS barcode solution I thought of was to draw a table with columns of varying widths and then set their background color to black as appropriate. But the ConnectCode guys thought of that too, and they mention that modern browsers may disable printing of background colors -- hard to permanently override this on a stateless machine. Maybe you could set both the CSS border-left property for each bar to black (will be ignored by htmldoc) and the HTML bgcolor property to black as well (will be dropped by a browser whose background printing is switched off). You could probably position one barcode on top of the other or else just display two copies of it. Either way, if the user prints to PDF or hard copy they should only see one barcode.
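To make the table idea concrete, here is a minimal sketch in Python (illustration only; the function name `bars_to_table` and its parameters are invented for this example). It takes a list of alternating bar and space widths in "narrow units", starting with a bar, and emits one table cell per element with the HTML bgcolor attribute set. It does not contain a Code 39 lookup table; the widths have to come from a proper Code 39 reference.

```python
def bars_to_table(widths, narrow_px=2, height_px=40):
    """Render alternating bar/space widths as a one-row HTML table.

    widths: sequence of ints in narrow units; even positions are black
    bars, odd positions are white spaces. Hypothetical helper, not part
    of htmldoc or any barcode library.
    """
    cells = []
    for i, w in enumerate(widths):
        color = "#000000" if i % 2 == 0 else "#FFFFFF"
        cells.append(
            '<td bgcolor="%s" width="%d" height="%d"></td>'
            % (color, w * narrow_px, height_px)
        )
    return (
        '<table cellspacing="0" cellpadding="0" border="0"><tr>'
        + "".join(cells)
        + "</tr></table>"
    )


# Example input: nine elements (five bars, four spaces) with wide = 2
# narrow units. This is the pattern I understand the Code 39 start/stop
# character to use; check it against a Code 39 reference before relying on it.
html = bars_to_table([1, 2, 1, 1, 2, 1, 2, 1, 1])
```

The same cells could carry a CSS border as well, per the suggestion above, but this sketch sticks to bgcolor only.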

Looking over the htmldoc mailing list, the lack of stylesheet support has been a persistent complaint, but I can understand Mike Sweet's thinking -- no single web browser supports any CSS spec 100% either, and features are always being added and dropped without notice, so it's a moving target and ultimately you're stuck favoring the IE/Gecko/WebKit implementation over the others. No matter what you do, somebody will be getting PDF documents that don't match what appears in the browser window. This seems like it could really be an issue for browsers like Arachne or Mosaic that can run on ancient embedded systems.

Texas Arcane said...

@Eric Green

I will leave it out of the basic minimal installation, but the installer could offer to add wkhtmltopdf, which is automatically integrated if present. The output looks terrific, but it is just as easy to install a free PDF print driver that can take an HTML document as input. I will eventually figure out how to configure it to send output to a network printer as well. Can't be too hard if it is just another URL.

Without the PDF kit the full server is still coming in at 1.8 megs with all mandatory resources.
