Generate Clean, Testable PDF Reports in Ruby/Rails with Prawn

<p>I generally hate PDFs. The file format is complex and designed to mimic physical paper documents, which really has little to do with the web. But unfortunately, PDFs are still very common and often expected, particularly when working on business applications. I have a legacy Ruby on Rails application with a number of PDF reports, and I recently took the time to refactor them in a clean and testable manner. Here&rsquo;s how I went about that process:</p>
<h3 id="the-report-requirements">The Report Requirements</h3>
<p><img src='/assets/images/example_pdf_report.png' style='float:right; margin: 10px; margin-right:0;' /></p>
<p>For my project, most of my PDF reports consisted primarily of large tables of data, along with a few other miscellaneous pieces of information. Here is a screenshot of one of the simpler ones:</p>
<h3 id="prawn">Prawn</h3>
<p>The reports I&rsquo;m working on have been built with the <a href="http://prawn.majesticseacreature.com">Prawn library</a> from the beginning. This is the only direct PDF generation library for Ruby that I&rsquo;m aware of. The early days of Prawn were a bit shaky and it didn&rsquo;t support a number of features you&rsquo;d expect, but the more recent versions are quite robust. Its <a href="http://prawn.majesticseacreature.com/manual.pdf">self-documenting manual</a> is generated using the library itself and is quite useful.</p>
<h3 id="why-not-use-an-html-to-pdf-library">Why not use an HTML-to-PDF Library?</h3>
<p>There are several libraries available that let you write your PDFs in HTML and CSS, including <a href="https://github.com/pdfkit/pdfkit">PDFKit</a> and <a href="https://github.com/mileszs/wicked_pdf">wicked_pdf</a>. Both of these use <a href="https://code.google.com/p/wkhtmltopdf/">wkhtmltopdf</a> behind the scenes.
While this may suit your needs, I found that creating reports in this manner made it more difficult to manage the layout of the document, particularly when there was more than a single page.</p>
<h3 id="keep-it-dry">Keep it DRY</h3>
<p>The first thing I noticed was how much was shared between the reports. Each had either identical or very similar headers and footers, along with a number of often-repeated design elements that could easily be encapsulated into helper methods and constants. So I created a parent class that all my PDF reports would inherit from. It looked something like this:</p>
<div class="highlight"><pre class="highlight ruby"><code><span class="k">class</span> <span class="nc">PdfReport</span> <span class="o">&lt;</span> <span class="no">Prawn</span><span class="o">::</span><span class="no">Document</span>
  <span class="c1"># Often-Used Constants</span>
  <span class="no">TABLE_ROW_COLORS</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"FFFFFF"</span><span class="p">,</span><span class="s2">"DDDDDD"</span><span class="p">]</span>
  <span class="no">TABLE_FONT_SIZE</span> <span class="o">=</span> <span class="mi">9</span>
  <span class="no">TABLE_BORDER_STYLE</span> <span class="o">=</span> <span class="ss">:grid</span>

  <span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">default_prawn_options</span><span class="o">=</span><span class="p">{})</span>
    <span class="k">super</span><span class="p">(</span><span class="n">default_prawn_options</span><span class="p">)</span>
    <span class="n">font_size</span> <span class="mi">10</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">header</span><span class="p">(</span><span class="n">title</span><span class="o">=</span><span class="kp">nil</span><span class="p">)</span>
    <span class="n">image</span> <span class="s2">"</span><span class="si">#{</span><span class="no">Rails</span><span class="p">.</span><span class="nf">root</span><span class="si">}</span><span class="s2">/public/logo.png"</span><span class="p">,</span> <span class="ss">height: </span><span class="mi">30</span>
    <span class="n">text</span> <span class="s2">"My Organization"</span><span class="p">,</span> <span class="ss">size: </span><span class="mi">18</span><span class="p">,</span> <span class="ss">style: :bold</span><span class="p">,</span> <span class="ss">align: :center</span>
    <span class="k">if</span> <span class="n">title</span>
      <span class="n">text</span> <span class="n">title</span><span class="p">,</span> <span class="ss">size: </span><span class="mi">14</span><span class="p">,</span> <span class="ss">style: :bold_italic</span><span class="p">,</span> <span class="ss">align: :center</span>
    <span class="k">end</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">footer</span>
    <span class="c1"># ...</span>
  <span class="k">end</span>

  <span class="c1"># ... More helpers</span>
<span class="k">end</span>
</code></pre></div>
<h3 id="pdf-report-classes">PDF Report Classes</h3>
<p>Then I built my actual reports, each of which is its own class that inherits from the above <code>PdfReport</code>.
I broke up each section of the PDF into its own private method in order to make the code easy to follow.</p>
<div class="highlight"><pre class="highlight ruby"><code><span class="k">class</span> <span class="nc">EventSummaryReportPdf</span> <span class="o">&lt;</span> <span class="no">PdfReport</span>
  <span class="no">TABLE_WIDTHS</span> <span class="o">=</span> <span class="p">[</span><span class="mi">20</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">60</span><span class="p">]</span>
  <span class="no">TABLE_HEADERS</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"ID"</span><span class="p">,</span> <span class="s2">"Name"</span><span class="p">,</span> <span class="s2">"Date"</span><span class="p">,</span> <span class="s2">"User"</span><span class="p">]</span>

  <span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">events</span><span class="o">=</span><span class="p">[])</span>
    <span class="k">super</span><span class="p">()</span>
    <span class="vi">@events</span> <span class="o">=</span> <span class="n">events</span>
    <span class="n">header</span> <span class="s1">'Event Summary Report'</span>
    <span class="n">display_event_table</span>
    <span class="n">footer</span>
  <span class="k">end</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">display_event_table</span>
    <span class="k">if</span> <span class="n">table_data</span><span class="p">.</span><span class="nf">empty?</span>
      <span class="n">text</span> <span class="s2">"No Events Found"</span>
    <span class="k">else</span>
      <span class="n">table</span> <span class="n">table_data</span><span class="p">,</span>
        <span class="ss">headers: </span><span class="no">TABLE_HEADERS</span><span class="p">,</span>
        <span class="ss">column_widths: </span><span class="no">TABLE_WIDTHS</span><span class="p">,</span>
        <span class="ss">row_colors: </span><span class="no">TABLE_ROW_COLORS</span><span class="p">,</span>
        <span class="ss">font_size: </span><span class="no">TABLE_FONT_SIZE</span>
    <span class="k">end</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">table_data</span>
    <span class="vi">@table_data</span> <span class="o">||=</span> <span class="vi">@events</span><span class="p">.</span><span class="nf">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">e</span><span class="o">|</span> <span class="p">[</span><span class="n">e</span><span class="p">.</span><span class="nf">id</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="nf">name</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="nf">created_at</span><span class="p">.</span><span class="nf">strftime</span><span class="p">(</span><span class="s2">"%m/%d/%y"</span><span class="p">),</span> <span class="n">e</span><span class="p">.</span><span class="nf">created_by</span><span class="p">.</span><span class="nf">try</span><span class="p">(</span><span class="ss">:full_name</span><span class="p">)]</span> <span class="p">}</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div>
<p>This PDF report could then be generated by the following:</p>
<div class="highlight"><pre class="highlight ruby"><code><span class="c1"># events = [...]</span>
<span class="n">pdf</span> <span class="o">=</span> <span class="no">EventSummaryReportPdf</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">events</span><span class="p">)</span>
<span class="n">pdf</span><span class="p">.</span><span class="nf">render_file</span> <span class="s2">"/tmp/my_report.pdf"</span>
</code></pre></div>
<h3 id="integration-with-rails">Integration with Rails</h3>
<p>As Ryan Bates suggested in his <a 
href="http://railscasts.com/episodes/153-pdfs-with-prawn-revised">excellent screencast</a>, I created a separate <code>app/pdfs</code> folder where I placed all my reports. Then in my controller, I would have the following action method:</p>
<div class="highlight"><pre class="highlight ruby"><code><span class="k">def</span> <span class="nf">summary_report</span>
  <span class="n">events</span> <span class="o">=</span> <span class="no">Event</span><span class="p">.</span><span class="nf">all</span>
  <span class="n">pdf</span> <span class="o">=</span> <span class="no">EventSummaryReportPdf</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">events</span><span class="p">)</span>
  <span class="n">respond_to</span> <span class="k">do</span> <span class="o">|</span><span class="nb">format</span><span class="o">|</span>
    <span class="nb">format</span><span class="p">.</span><span class="nf">pdf</span> <span class="p">{</span> <span class="n">send_data</span> <span class="n">pdf</span><span class="p">.</span><span class="nf">render</span><span class="p">,</span> <span class="ss">filename: </span><span class="s1">'summary_report.pdf'</span><span class="p">,</span> <span class="ss">type: </span><span class="s1">'application/pdf'</span><span class="p">,</span> <span class="ss">disposition: </span><span class="s1">'inline'</span> <span class="p">}</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div>
<h3 id="testing">Testing</h3>
<p>What&rsquo;s great about this solution is how well it lends itself to testing. By using the <a href="https://github.com/yob/pdf-reader">pdf-reader gem</a>, we can convert the rendered PDF into a string and assert that the proper content is included.
So a couple of example tests (using RSpec) might look something like this:</p>
<div class="highlight"><pre class="highlight ruby"><code><span class="n">describe</span> <span class="no">EventSummaryReportPdf</span> <span class="k">do</span>
  <span class="n">context</span> <span class="s1">'Given an array containing a single event'</span> <span class="k">do</span>
    <span class="c1"># Doubles stand in for Event and User records, since the report calls methods such as e.name</span>
    <span class="n">let</span><span class="p">(</span><span class="ss">:events</span><span class="p">)</span> <span class="p">{</span> <span class="p">[</span><span class="n">double</span><span class="p">(</span><span class="ss">id: </span><span class="mi">10</span><span class="p">,</span> <span class="ss">name: </span><span class="s2">"Company Meeting"</span><span class="p">,</span> <span class="ss">created_at: </span><span class="mi">1</span><span class="p">.</span><span class="nf">day</span><span class="p">.</span><span class="nf">ago</span><span class="p">,</span> <span class="ss">created_by: </span><span class="n">double</span><span class="p">(</span><span class="ss">full_name: </span><span class="s1">'John Doe'</span><span class="p">))]</span> <span class="p">}</span>

    <span class="n">context</span> <span class="s1">'The rendered pdf content'</span> <span class="k">do</span>
      <span class="n">let</span><span class="p">(</span><span class="ss">:pdf</span><span class="p">)</span> <span class="p">{</span> <span class="no">EventSummaryReportPdf</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">events</span><span class="p">)</span> <span class="p">}</span>
      <span class="n">let</span><span class="p">(</span><span class="ss">:pdf_content</span><span class="p">)</span> <span class="p">{</span> <span class="no">PDF</span><span class="o">::</span><span class="no">Reader</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">pdf</span><span class="p">.</span><span class="nf">render</span><span class="p">)).</span><span class="nf">page</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nf">to_s</span> <span class="p">}</span>

      <span class="n">it</span> <span class="s1">'contains the name of the event'</span> <span class="k">do</span>
        <span class="n">expect</span><span class="p">(</span><span class="n">pdf_content</span><span class="p">).</span><span class="nf">to</span> <span class="kp">include</span><span class="p">(</span><span class="s1">'Company Meeting'</span><span class="p">)</span>
      <span class="k">end</span>

      <span class="n">it</span> <span class="s1">'contains the full name of the user'</span> <span class="k">do</span>
        <span class="n">expect</span><span class="p">(</span><span class="n">pdf_content</span><span class="p">).</span><span class="nf">to</span> <span class="kp">include</span><span class="p">(</span><span class="s1">'John Doe'</span><span class="p">)</span>
      <span class="k">end</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div>
<h3 id="conclusion">Conclusion</h3>
<p>This makes the process of creating PDF documents <em>a little</em> less painful. For those of you who are HTML and CSS wizards, be prepared to get really frustrated by how hard it is to lay out your documents in code. Just be mindful of anything that can be encapsulated into a helper method for later use.</p>
<p>This covers one type of PDF report generation, one that is often more appropriate for variable-length reports. In my next blog post, I&rsquo;m going to go over the process of pre-filling PDF form documents with data from your application.</p>
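<p>As one illustration of the helper-method advice, here is a minimal sketch of pulling the row-building logic out of a report class so it can be checked without rendering a PDF at all. The <code>EventRows</code> module and the <code>Event</code> and <code>User</code> structs are hypothetical names used only for this example; they are not part of Prawn or of the report classes in this post, and the structs merely stand in for ActiveRecord models.</p>

```ruby
# Hypothetical helper (not part of Prawn): builds table rows from event-like objects.
module EventRows
  DATE_FORMAT = "%m/%d/%y"

  # Returns one array per event: [id, name, formatted date, creator name or nil].
  def self.build(events)
    events.map do |e|
      creator = e.created_by
      [e.id, e.name, e.created_at.strftime(DATE_FORMAT), creator ? creator.full_name : nil]
    end
  end
end

# Plain structs standing in for ActiveRecord models, so the sketch runs without Rails.
User  = Struct.new(:full_name)
Event = Struct.new(:id, :name, :created_at, :created_by)

events = [
  Event.new(10, "Company Meeting", Time.new(2014, 3, 5), User.new("John Doe")),
  Event.new(11, "Orphaned Event",  Time.new(2014, 3, 6), nil)
]

rows = EventRows.build(events)
rows.each { |row| puts row.inspect }
```

<p>A report class could then hand the result of <code>EventRows.build</code> straight to its table call, leaving only layout concerns for the slower PDF-level tests.</p>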