[WebODF] SelectionMover and standardising lingo

Wed Oct 9 02:16:17 CEST 2013

On 09/10/2013, at 9:10 AM, Friedrich W. H. Kossebau wrote:

> E.g. to me (as non-native speaker obviously) steps are a more relative 
> measure, giving a distance, while position is a more absolute measure, giving 
> a, well, position or location. Then also the meaning of "Point", "Position" 
> and "Step" is slightly arbitrary and will have to be learned and cannot 
> directly derived from the terms itself. So I will forget that again and again. 
> The only one?

I agree. I'm not attached to the chosen terms at all… but now that I've put some forward
we can debate in an informed manner :)

> 
> Perhaps because my brain is structured by my German mother tongue, but I would 
> prefer prefixes to mark the meaning of the positions. Let's see how that can 
> be derived from the definitions of the terms you gave:
> 
>> Points
>> ======
>> Raw DOM positions. This is any valid node & offset pair.
> 
>> Positions
>> =========
>> A set of positions as returned by the position iterator. This is a subset of
>> points as these can be filtered at the node level via a NodeFilter.
> 
>> Steps
>> =====
>> A filtered subset of positions as returned by the position iterator. This is
>> controlled using a PositionFilter.
> 
> So "Positions" depend on the type of the node filter (+ built-in 
> EmptyTextNodeFilter). Now given the variety of possible filters the semantics 
> of "Positions" is quite broad, and I actually see no real advantage in having 
> an explicit general distinctive name for a set of non-raw-DOM positions. 
> Especially as the positions are not direct subsets, but actually in different 
> address spaces, as the "offset" property depends on the "seen" nodes, so 
> node:x, offset:y can mean something different with a node-filtered view on the 
> DOM.

Even though the variety of filters is infinite in theory, in practice, there is only a
single node filter (pair) in use in WebODF, being CursorFilter + EmptyTextNodeFilter.

> I would propose to call real DOM positions always "DomPositions" and equally 
> prefix the node-filtered positions by whatever xmltree-view the nodefilter is 
> used for, e.g. with "ODF" or "WebODF" (webodf-extended ODF) to "OdfPosition" 
> or "WebOdfPosition". That way it is more clear against what nodefilter view on 
> the raw DOM a certain position is valid.

+1 on this. Good proposal.

> 
> Similar for "Steps" vs. "Position". While the position-filtered set of 
> positions is in the same address space like the position-unfiltered set, the 
> exact meaning depends on the type of the positionfilter, so that should be 
> indicated by a proper prefix. They are still positions in a certain (virtual) 
> node tree, and by their object type themselves not restricted to other 
> positions.
> Actually in the current code steps are always used as distance between two 
> (filtered) positions, with the single-numeric distance estimated from the 
> mapping of the position to the number in the iterating sequence, so actually 
> another address system, but with 1:1 mapping (you know that, just noting for 
> myself for the record).

Right. I did a bad job explaining this. The key reason for Points vs. Positions
(or DOM Positions vs OdfPositions) is that OdfPosition is actually an integer
number. The integer number that means "starting at root, call nextPosition X times".

Any time the code uses passes around direct container & offset pairs, I consider this
speaking in DOM Positions.

The key point of confusion is that there are two integer-based coord systems that
mean drastically different things:

OdfPosition (Positions in the original email)
=================================
PositionIterator + CursorFilter + EmptyTextNodeFilter
Integer number indicates how many times to call .nextPosition from root

CursorStep (? Steps in the original email)
===================================
(PositionIterator + CursorFilter + EmptyTextNodeFilter) + TextPositionFilter
Integer number indicates how many times acceptPosition(iterator) will equal  FILTER_ACCEPT
before the PositionIterator arrives at the desired position.

> Now SelectionMover operates on a certain node tree, defined by the passed root 
> node, which then is node-filtered by the CursorFilter+EmptyTextNodeFilter. And 
> thus has a certain address space, in which the used positions (by node & 
> offset) are valid.
> Which makes me wonder if there are not incompatibilities, as the users of 
> SelectionMover only pass in a position filter and get back distances which are 
> only valid against that position filter and the address space of the 
> SelectionMover node tree. Seems any results from the methods of SelectionMover 
> are only used again with SelectionMover and thus the same address space, so 
> possibly no problem exists. Okay.

The key problem with the current design is that SelectionMover mixes OdfPosition
and CursorStep input & output. It returns or takes a number of "steps", but depending
on which function depends on which coordinate system the integer is in :-/.

The only place that currently uses an additional PositionFilter is the annotation walking, but
the mechanism for translation in and out of this is somewhat painful to use (one of the 
recommendations highlights this).

> One thing I am missing: what is the purpose of the CursorFilter? Why does it 
> filter only webodf-namespaced elements? And not more, like skipping HTML 
> elements, as odf.OdfNodeFilter does?
> (hm, btw, could there already be alien elements matching HTML tags be loaded 
> from the odt file? not expected, but might need handling WRT to our extension 
> of )

The next level of filtering you're referring to is taken care of in TextPositionFilter.
This is actually one of the first places I will start to play with for some easy perf
improvements.

> 
>> My recommendations (I'm assuming you want to hear these :) )
>> 
>> 1. Standardise SelectionMover on Points and Steps only.
>> Any attempts to move a cursor to a Position that is not also a Step is
>> likely a bug. The only place where we deliberately move in non-Step
>> increments is when trying to put the cursor back into a filtered Step. This
>> same effect could be achieved by calling moveForwardStep(0, filter).
>> 
>> 2. Remove or rewrite convert(Forward|Backward)StepsBetweenFilters.
>> The current implementation will not work for arbitrary filter pairs. These
>> should likely be re-written to return the number of filter1-steps when
>> travelling filter2-steps in distance. I would also suggest this takes the
>> same approach as countStepsToPosition and round the cursor back to the
>> Point of the last filter2-step that is smaller-than or equal-to the current
>> cursor Point.
> 
> Still undecided on these recommendations (to be blamed on my lack of 
> experience in this field and current state of my mind, thank you, cold).

You have a compromised immune system and you're willing to read emails
like this? I salute you, you brave person :)

I fear you passed that code on to me during our code review on IRC the
other night :-(. I need better virus scanners in my chat client...

> 
> As if things are not confusing enough we also use the term "position" with 
> regard to the cursor-walkable textpositions in the linear addressing system. 
> Not only in the op specs, but also in the API of OdtDocument etc. Seems we 
> will get out of words #)

My fear also :-/

> 
> I need some more time for this.
> 
> Cheers
> Friedrich
> 
> PS: I also think in the var names we should always explicitly talk about 
> nodeFilter or positionFilter, because that makes a difference as I only now 
> fully realized :)

I agree here too.