[Officeshots] Pixelwise comparison of pdf files

robert_weir at us.ibm.com robert_weir at us.ibm.com
Thu Jun 16 17:40:50 CEST 2011


officeshots-bounces at opendocsociety.org wrote on 06/16/2011 01:08:19 AM:

> 
> Hi,
> 
> I am new on this list, so let me introduce myself first: my name is
> Milos Sramek. I found the officeshots page when I was looking for tools
> to compare quality of ODF renderings created by different applications.
> I have been talking to Michiel for some time now, and I have even set
> up my own factory (the OOO3.4 and LO3.4 ones, others may follow).
> 
> Officeshots can create pdfs with many tools so that they can be inspected
> visually. However, only significant differences are visible.  Thus I
> wrote a script which takes pdfs and compares them: it computes some
> simple statistics and creates difference images.
> 


Very interesting. I can see this also being very useful as a regression 
test tool for an editor.  So we could compare version N+1 beta of an 
editor with the layout of version N.  Ideally there would be no 
differences.  If there were differences this would mean that users' 
documents would be rendered differently in the new version, which is 
generally a bad thing.
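
To make the regression idea concrete, here is a rough sketch of the check I
have in mind. It is not Milos's script: it assumes each page has already been
rasterized into a grid of grayscale values, and the names diff_stats and
is_regression are made up for illustration.

```python
def diff_stats(page_a, page_b, tolerance=0):
    """Return (differing_pixels, fraction) for two same-sized grayscale pages.

    Each page is assumed to be a list of rows, each row a list of ints.
    A pixel counts as different when the values differ by more than tolerance.
    """
    if len(page_a) != len(page_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(page_a, page_b)
    ):
        raise ValueError("pages must have the same dimensions")
    total = 0
    differing = 0
    for row_a, row_b in zip(page_a, page_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if abs(a - b) > tolerance:
                differing += 1
    return differing, (differing / total if total else 0.0)


def is_regression(pages_old, pages_new, tolerance=0):
    """True if version N+1 renders any page differently from version N."""
    if len(pages_old) != len(pages_new):
        return True  # a changed page count is itself a layout change
    return any(
        diff_stats(old, new, tolerance)[0] > 0
        for old, new in zip(pages_old, pages_new)
    )
```

A nonzero tolerance would let small anti-aliasing differences pass while
still flagging real layout changes; what value is reasonable is something
one would have to find out by experiment.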


One concern with interpreting the data is that a single layout error can 
cause all of the document after that error to be incorrect.  So it is hard 
to distinguish between a single error and many independent errors. 

This is a similar problem to what a compiler writer faces.  Once you get 
the first syntax error, a simple parser will report a large cascade of 
additional errors that are not really that useful to report.  A smarter 
compiler will try to restart the parsing at the next logical block, and if 
the source code can be interpreted consistently from that point forward, 
report no further error messages.  I wonder whether something similar 
could be done in the document comparisons, e.g., resynchronizing at the 
next paragraph.
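
One possible way to sketch the resynchronization, under a strong assumption:
that each rendering can be split into paragraph blocks and each block reduced
to a comparable fingerprint (for example a hash of its cropped pixels). How
to do that splitting is the hard part and is not shown here. The function
names are hypothetical; the alignment itself uses Python's standard difflib.

```python
from difflib import SequenceMatcher


def naive_mismatches(blocks_old, blocks_new):
    """Compare blocks position by position; one early change cascades."""
    return sum(1 for old, new in zip(blocks_old, blocks_new) if old != new)


def independent_differences(blocks_old, blocks_new):
    """Align the two block sequences and report only the unmatched regions.

    Each result is (tag, i1, i2, j1, j2): blocks_old[i1:i2] was replaced,
    deleted, or had blocks_new[j1:j2] inserted. Matching resumes after each
    region, so later identical paragraphs are not reported again.
    """
    matcher = SequenceMatcher(None, blocks_old, blocks_new, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


# Fingerprints stand in for rendered paragraphs (illustrative values only).
old = ["p1", "p2", "p3"]
new = ["p1", "extra", "p2", "p3"]
print(naive_mismatches(old, new))         # positional comparison
print(independent_differences(old, new))  # aligned comparison
```

In this toy case the positional comparison flags every block after the
inserted one, while the aligned comparison reports a single insertion.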


-Rob


