Ticket #134 (closed defect: fixed)

Opened 10 months ago

Last modified 5 months ago

GDAL transform translates the points incorrectly when there is a scale or translation on the points

Reported by: TnyLtSprGy Owned by: TnyLtSprGy
Priority: minor Milestone: 1.2.1
Component: General Version: 1.2
Keywords: Spatial Reference Cc:
LAS Format Version: Not Applicable

Description (last modified by hobu) (diff)

In Writer::FillPointRecord?() we are re-scaling the point to write it to the file before we are projecting it.

Attachments

commit.zip Download (11.1 KB) - added by TnyLtSprGy 10 months ago.
Changes for Ticket #134. I also changed the location of the Project() function to LASPoint.
ticket-134-proposals.patch Download (7.0 KB) - added by mloskot 7 months ago.
Regular patch created from non-patch commit.zip. In order to help reviewing proposed fixes, please, try to 1) use svn diff command to create patches instead of packing a bunch of changed files in a bundle 2) use spaces not tabs. Thanks!

Change History

  Changed 10 months ago by TnyLtSprGy

  • status changed from new to assigned

Changed 10 months ago by TnyLtSprGy

Changes for Ticket #134. I also changed the location of the Project() function to LASPoint.

  Changed 10 months ago by TnyLtSprGy

  • status changed from assigned to closed
  • resolution set to fixed

  Changed 10 months ago by hobu

  • status changed from closed to reopened
  • resolution fixed deleted

The proper way to file bugs is to file a patch. You can create a patch by issuing 'svn diff > mypatch.patch' against your subversion source tree of libLAS and then attaching that to a ticket. Also, a bug is not "fixed" until one of the source committers has committed a patch to fix the issue to the subversion repository.

On to your code. The Project method is on the Reader:: and Writer:: so it is not public to LASPoint, which is user-focused. Do you have a good/strong use case for having this public on the LASPoint?

Another issue your FillPointRecord? ends up copying the point. For 100s of millions of points, this is prohibitive. I wanted to ensure that the point could be used and transformed in place.

I will try to fix the LASWriter scaling issue which is indeed a bug. Thanks for the report.

  Changed 10 months ago by hobu

Fix attempted in r1261. Please confirm that it works for you.

follow-ups: ↓ 7 ↓ 10   Changed 10 months ago by TnyLtSprGy

Isn't 'record' empty at that point? It first needs to be filled with the information from 'point'.

I was thinking about what you said about LASPoint. I still think it makes more sense to have the project member in LASPoint. This way we could just call the project method on the point and be done with it. I think there are lots of times when the project method might be used by a user. I had to project a point the coordinates of the bounding box to WGS84 the other day to get the UTM zone of the file. Thinking about what I did the other day, It might be best to have a static member Project that takes three coordinates and re-writes them reprojected to a different SRS. I think we could avoid creating a transform object each time if we had another static SetTransform? method that would take an input and output SRS, and then a Project method that took only the three coordinates as input and used the last SRS that was set. We could also have another method that took the three coordinates and an input and an output LASSpatialReference. In using LASSpatialReference the user wouldn't even have to know anything about GDAL.

Anyway, record is supposed to represent a point as it is in the actual file (scaled and transformed) not as the point actually is (as in the point record). It doesn't make sense the way it is being used there.

  Changed 10 months ago by TnyLtSprGy

As to your question about usage, anyone who's going to want to be spatial reference aware is probably going to want to project points at some time. Sorry if I'm writing about this on the wrong thread which is supposed to be about the bug in the project code.

in reply to: ↑ 5   Changed 10 months ago by hobu

duh. I'll make another attempt.

You should go into your user preferences and add your email address so that you are emailed whenever a ticket with your uid is modified. That way you can follow happenings in tickets when they happen and not have to remember to go back and look at them.

Replying to TnyLtSprGy:

I think there are lots of times when the project method might be used by a user. I had to project a point the coordinates of the bounding box to WGS84 the other day to get the UTM zone of the file.

I would rather folks just use GDAL/OSR than wrapping it up ourselves. You are going to have much more flexibility and usability to use the OSR projection methods in GDAL if you want to do this. The ability to set an output SRS on the reader/writer was mostly just an experiment to see if it was natural and useful :)

I think we could avoid creating a transform object each time if we had another static SetTransform? method that would take an input and output SRS ...

We really don't want to be passing GDAL objects in/out of libLAS. I've only made one exception to this, and that is the ability of LASSpatialReference to return the GTIF*. Because GDAL/libgeotiff are optional usage in libLAS, it complicates things quite a bit to start adding lots of fancy.

In using LASSpatialReference the user wouldn't even have to know anything about GDAL.

LASSpatialReference isn't meant to be a replacement for GDAL's OSR or CS-Map's description dictionaries. It is meant to be a container for the LAS VLR records you need to make GeoTIFF keys for a LAS file.

Anyway, record is supposed to represent a point as it is in the actual file (scaled and transformed) not as the point actually is (as in the point record). It doesn't make sense the way it is being used there.

Yes, I messed up :)

  Changed 9 months ago by hobu

  • milestone set to 1.2.1

Changed 7 months ago by mloskot

Regular patch created from non-patch commit.zip. In order to help reviewing proposed fixes, please, try to 1) use svn diff command to create patches instead of packing a bunch of changed files in a bundle 2) use spaces not tabs. Thanks!

  Changed 7 months ago by mloskot

From the first look, this is not a trivial issue. Please, give me a few hours to think about it and I'll be back with my comments today yet.

Generally, LASPoint::Project is not a good idea to me, but let me to explain it later today.

in reply to: ↑ 5 ; follow-up: ↓ 11   Changed 7 months ago by mloskot

Replying to TnyLtSprGy:

I was thinking about what you said about LASPoint. I still think it makes more sense to have the project member in LASPoint.

I'm not convinced to the proposed:

void LASPoint::Project([OGRCoordinateTransformationH const& transform);

A point and transformation of a point are orthogonal concepts, so they should not be tightly coupled. These are like a data and an algorithm. IMHO, it's inappropriate to encapsulate the latter in definition of the former.

Also, passing transformation object for every point is a kind of redundant, an ineffective option.

Before jumping into implementation, I'd suggest to discuss a couple of issues:

1. There are two situations: a LAS file with defined SRS and a file with not defined SRS

2. If a file has no SRS assigned, then header and points have know "knowledge" of in space of what SRS they are defined.

3. Let's assume that LASPoint object stores information about SRS, for instance we can achieve it easily by redesigning LASPoint class to link to its source (target) LASHeader:

struct LASPoint
{
LASHeader const& m_header;
LASPoint(LASHeader const&);
}

We still need to deal with situation in which actual SRS stored in LASHeader object is unknown.

4. In case 3. SRS is an optional and external information that may be provided by user.

5. Now, in case that LASPoint object (through linked LASHeader) has no SRS assigned, then general concept of LASPoint::Project is useless, IMHO.

6. Important question is: do we need in-place transformations or we accept that transformation produces a copy of original object of type of LASPoint. The specified concept of

void LASPoint::Project(OGRCoordinateTransformationH const& transform);

makes sense only if transformation is made in-place.

7. IMHO, transformation should be applied on point but not by point. I imagine this patterns:

Transformer t(utm, wgs84);
Point p1;
Point p2;
t.forward(p1); // in place
p2 = t.forward(); // by copy

Transformer t(utm, wgs84);
list<Point> putm;
p.insert(p1)
p.insert(p2)
t.forward(p); // in-place bulk transformation

or by copy using common idiom of input + output + operation (Transformer as a functor):

Transformer t(utm, wgs84);
list<Point> pwgs84;
std::transform(p.begin(), p.end(), t, pwgs84.begin());

8. Separation of data and algorithms scales better and is more flexible to user than built-in LASPoint::Project method.

9. Furthermore, Transformer could be applicable on all types of objects for which transformation does make sense: * single point (in-place and by copy) * collection of points (in-place and by copy) * LAS files (by copy only) * collections of LAS files (by copy only)

There is an important assumption I usually make for LAS implementation: LAS file is a collection of points usable in the same way as input *or* output stream (or iterator): reading/writing are always single pass operations, so in-place transformations of files do not make much sense. Disks are cheap, so copying does not hurt :-)

Thinking about what I did the other day, It might be best to have a static member Project that takes three coordinates and re-writes them reprojected to a different SRS.

How would that be different and more usable in comparison to  OGRCoordinateTransformation class from GDAL/OGR ? I can't see any difference in usability, but I see that static method would pollute LASPoint interface with unrelated operation.

I think we could avoid creating a transform object each time if we had another static SetTransform? > method that would take an input and output SRS, and then a Project method that took only the three coordinates as input and used the last SRS that was set.

I clearly lack of enough imagination :-) Could you give a sample (pseudo)code that combines the static method with instance-only operation (Project) ?

We could also have another method that took the three coordinates and an input and an output LASSpatialReference.

Such method would just duplicate responsibility of OGRCoordinateTransformation

In using LASSpatialReference the user wouldn't even have to know anything about GDAL.

OK, good argument. Let's provide LASCoordinateTransformation, a simple(ifying) adaptor on top of OGRCoordinateTransformation class.

Anyway, record is supposed to represent a point as it is in the actual file (scaled and transformed) not as the point actually is (as in the point record). It doesn't make sense the way it is being used there.

I don't understand. Point record stores 3 numbers - let's call them raw coordinates. Next, user (libLAS) applies scaling to those 3 numbers and these are real coordinates in space of defined (or assumed) SRS. Three is no transformation required to get real geospatial coordinates. Or I'm missing something.

in reply to: ↑ 10 ; follow-up: ↓ 14   Changed 7 months ago by hobu

Replying to mloskot:

Replying to TnyLtSprGy:

I was thinking about what you said about LASPoint. I still think it makes more sense to have the project member in LASPoint.

I'm not convinced to the proposed: {{{ void LASPoint::Project([OGRCoordinateTransformationH const& transform); }}} A point and transformation of a point are orthogonal concepts, so they should not be tightly coupled. These are like a data and an algorithm. IMHO, it's inappropriate to encapsulate the latter in definition of the former.

I am -1 on this as well. The biggest problem is what do we do when OGRCoordinateTransformationH is not defined because we don't have GDAL? Our behavior changes. You could argue that having no GDAL and putting coordinate system definitions on the readers/writers is the same behavioral disconnect and you would be right, however.

Also, passing transformation object for every point is a kind of redundant, an ineffective option. Before jumping into implementation, I'd suggest to discuss a couple of issues: 1. There are two situations: a LAS file with defined SRS and a file with not defined SRS 2. If a file has no SRS assigned, then header and points have know "knowledge" of in space of what SRS they are defined. 3. Let's assume that LASPoint object stores information about SRS, for instance we can achieve it easily by redesigning LASPoint class to link to its source (target) LASHeader: {{{ struct LASPoint { LASHeader const& m_header; LASPoint(LASHeader const&); } }}}

I should take a little time to explain what my concept was for the transformation stuff in the first place. My idea was for a simple way to provide for on-the-fly transformation when the input SRS was known and when an output SRS was set on the reader/writer. I wanted something like a boost-style filtering iostream. It hasn't quite achieved that yet, but that is the concept I was going for. It may be that coordinate system transformation is a bad match. The entire exercise was mostly an experiment regardless. This concept has a few problems however:

* The points are not transitive due to the nature of the offset/scale storage of int32_t. This means that a Reader/Writer must have the header around to be able to scale/descale, but it more importantly means that the numbers will not exactly round trip through the projection process. If a user is not careful, like say for instance they set a scale on data from UTM they were projecting into DD of something like 0.01, it turns into a giant mess.

* Your situation #2. What do you do when you don't have an SRS defined for the header. 80% of the files I have found in the wild do not have SRS's defined. My concept was an error would be returned if no SRS was defined for the Header that the Reader/Writer is using. I was providing no way at all for the user to *set* an SRS either. If it wasn't defined as part of the file (for a Reader), or defined as part of the Header (for the Writer), the user is SOL.

5. Now, in case that LASPoint object (through linked LASHeader) has no SRS assigned, then general concept of LASPoint::Project is useless, IMHO. 6. Important question is: do we need in-place transformations or we accept that transformation produces a copy of original object of type of LASPoint. The specified concept of {{{ void LASPoint::Project(OGRCoordinateTransformationH const& transform); }}} makes sense only if transformation is made in-place.

We don't want to mix this with my concept of boost-style filtering/projecting readers/writers due to the fact that writers are LASPoint const& and we shouldn't be changing data we're going to write. What if you wanted to write out *two* different files of different SRSs? You wouldn't want the writer changing your input data...

7. IMHO, transformation should be applied on point but not by point. I imagine this patterns: {{{ Transformer t(utm, wgs84); Point p1; Point p2; t.forward(p1); // in place p2 = t.forward(); // by copy Transformer t(utm, wgs84); list<Point> putm; p.insert(p1) p.insert(p2) t.forward(p); // in-place bulk transformation }}} or by copy using common idiom of input + output + operation (Transformer as a functor): {{{ Transformer t(utm, wgs84); list<Point> pwgs84; std::transform(p.begin(), p.end(), t, pwgs84.begin()); }}}

These are concepts GDAL should adopt, frankly. Our success at getting much traction in GDAL for something like this sucks though. We should make a new coordinate systems library that sits atop cs-map, proj, and OSR that behaves this way :) I'll find the funding... oh, you're busy though hehe.

In using LASSpatialReference the user wouldn't even have to know anything about GDAL.

I'll reiterate that LASSpatialReference is just a container for the GeoTIFF keys stored as VLR methods. That it has the ability to output WKT (when GDAL is available) or Proj.4 (when libproj is available) is just sugar. It doesn't need those libraries if a user wanted to work with raw VLR GeoTIFF key data.

OK, good argument. Let's provide LASCoordinateTransformation, a simple(ifying) adaptor on top of OGRCoordinateTransformation class.

Anyway, record is supposed to represent a point as it is in the actual file (scaled and transformed) not as the point actually is (as in the point record). It doesn't make sense the way it is being used there.

I don't understand. Point record stores 3 numbers - let's call them raw coordinates. Next, user (libLAS) applies scaling to those 3 numbers and these are real coordinates in space of defined (or assumed) SRS. Three is no transformation required to get real geospatial coordinates. Or I'm missing something.

I think this discussion is PointRecord? vs. LASPoint and the idea that one might be scaled/descaled (LASPoint) and the other is always descaled (PointRecord?). Projecting PointRecords? is bad because of the precision loss.

Short term, for the 1.2.1 release, I am going to fix this problem by having the Writer internally copy the point, descale it, reproject it, scale it, and write it. I really wanted to avoid this extra copy, but there really isn't another choice. Longer term, I am completely open to changing this stuff entirely. I think the concept as I have implemented is too rigid and does not fit very well with other established usage patterns (like std::transform, etc), and I think it should.

  Changed 7 months ago by hobu

  • description modified (diff)

r1331 attempts to fix this for 1.2.1. r1332 brings it forward to trunk for us too work on there.

  Changed 7 months ago by hobu

  • description modified (diff)

A messy test for the Python side was added in r1333/r1334 that demonstrates the usage.

in reply to: ↑ 11   Changed 7 months ago by mloskot

Replying to hobu:

Replying to mloskot: I am -1 on this as well. The biggest problem is what do we do when OGRCoordinateTransformationH is not defined because we don't have GDAL? Our behavior changes. You could argue that having no GDAL and putting coordinate system definitions on the readers/writers is the same behavioral disconnect and you would be right, however.

Agreed, this would be fragile.

I should take a little time to explain what my concept was for the transformation stuff in the first place. My idea was for a simple way to provide for on-the-fly transformation when the input SRS was known and when an output SRS was set on the reader/writer.

Sounds like a good idea.

Looks like we may want to introduce some solution involving traits and policy-based design (links below) of LASReader/LASWriter, so users can control its internal behaviour in compile-time, by composing a final type of reader/writer according to their needs.

C++ Templates: The Complete Guide]

I wanted something like a boost-style filtering iostream. It hasn't quite achieved that yet, but that is the concept I was going for.

Have you posted this idea from high-level? What syntax are you looking for?

It may be that coordinate system transformation is a bad match. The entire exercise was mostly an experiment regardless. This concept has a few problems however: * The points are not transitive due to the nature of the offset/scale storage of int32_t. This means that a Reader/Writer must have the header around to be able to scale/descale,

The header and point records are mutually inclusive. Points (so, reader/writer) without header are meaningless.

but it more importantly means that the numbers will not exactly round trip through the projection process. If a user is not careful, like say for instance they set a scale on data from UTM they were projecting into DD of something like 0.01, it turns into a giant mess.

Right. This looks like a domain error that is under user's control.

* Your situation #2. What do you do when you don't have an SRS defined for the header. 80% of the files I have found in the wild do not have SRS's defined. My concept was an error would be returned if no SRS was defined for the Header that the Reader/Writer is using.

Probably, best approach.

I was providing no way at all for the user to *set* an SRS either. If it wasn't defined as part of the file (for a Reader), or defined as part of the Header (for the Writer), the user is SOL.

Again, user must know their data in order to process it.

We don't want to mix this with my concept of boost-style filtering/projecting readers/writers due to the fact that writers are LASPoint const& and we shouldn't be changing data we're going to write.

Yes, good point.

However, I may still not understand well what boost-style filtering means. Let's discuss it on the list?

What if you wanted to write out *two* different files of different SRSs? You wouldn't want the writer changing your input data...

Yes, you are right.

Let's focus on processing single files only and provide users with toolset that enables them with composing their own processing chains.

7. IMHO, transformation should be applied on point but not by point. I imagine this patterns: (...) or by copy using common idiom of input + output + operation (Transformer as a functor): (...)

These are concepts GDAL should adopt, frankly.

yup, that's only I can say :-)

Our success at getting much traction in GDAL for something like this sucks though. We should make a new coordinate systems library that sits atop cs-map, proj, and OSR that behaves this way :) I'll find the funding... oh, you're busy though hehe.

I am busy, but perhaps  Martin would be interested too. Otherwise, I might be available as a volunteer.

I don't understand. Point record stores 3 numbers - let's call them raw coordinates. Next, user (libLAS) applies scaling to those 3 numbers and these are real coordinates in space of defined (or assumed) SRS. Three is no transformation required to get real geospatial coordinates. Or I'm missing something.

I think this discussion is PointRecord? vs. LASPoint and the idea that one might be scaled/descaled (LASPoint) and the other is always descaled (PointRecord?). Projecting PointRecords? is bad because of the precision loss.

Who asks about PointRecord? :-)

Common users are supposed to have no idea of its existence - it is an implementation detail. It can even disappear one day. Full stop.

Short term, for the 1.2.1 release, I am going to fix this problem by having the Writer internally copy the point, descale it, reproject it, scale it, and write it. I really wanted to avoid this extra copy, but there really isn't another choice.

OK for me.

Longer term, I am completely open to changing this stuff entirely. I think the concept as I have implemented is too rigid does not fit very well with other established usage patterns (like std::transform, etc), and I think it should.

OK. Let's continue the discussion about future options on the list.

  Changed 5 months ago by hobu

  • status changed from reopened to closed
  • resolution set to fixed

This was fixed by copying the points in both the 1.2 and main branches. Let's revisit this when we have boost in the mix.

Note: See TracTickets for help on using tickets.