.. _annual_report_2010:

******************************************************************************
2010 Annual Report
******************************************************************************

:Author: Howard Butler
:Contact: hobu.inc@gmail.com
:Date: 12/21/2010

The following document is a synopsis of the major development activities that
have taken place in the libLAS project (or related projects) in the 2010
calendar year.

Vertical Datum Reprojection and Transformation
..............................................................................

Frank Warmerdam implemented vertical datum reprojection and transformation
across the entire open source GIS stack in the past year (`proj.4`_, `GDAL`_,
and `libgeotiff`_). This work makes it possible to perform vertical datum
transformations via command-line utilities like :ref:`las2las <las2las>`, in
addition to providing the tools for software developers to implement the
features in their own software.

.. _`libgeotiff`: http://www.remotesensing.org/geotiff/geotiff.html
.. _`proj.4`: http://proj.maptools.org/
.. _`GDAL`: http://www.gdal.org

libLAS Processing Kernel
..............................................................................

libLAS gained something I call the "libLAS Processing Kernel" in the past
year. It is really just a set of common functions that all application-level
software can reuse to implement filtering and transformation of LAS data.
Collecting this functionality in a common place has meant the reuse of the
same operations in many of the libLAS utilities, including
:ref:`las2las <las2las>`, :ref:`lasinfo <lasinfo>`, and
:ref:`las2txt <las2txt>`. New features implemented in the kernel include:

* Color fetching from GDAL raster data sources
* Data reprojection and vertical datum transformation
* Numerous filtering operations
* Header modification

These operations are available to all software developers who wish to reuse
them in their own tools.

Re-engineering of :ref:`las2las <las2las>`, :ref:`las2txt <las2txt>`, and :ref:`lasinfo <lasinfo>`
..............................................................................................................

:ref:`las2las <las2las>`, :ref:`las2txt <las2txt>`, and :ref:`lasinfo <lasinfo>`
were re-imagined in light of the development of the `libLAS Processing
Kernel`_ to take advantage of the new functionality and to regularize
command-line argument handling and parsing. The previous versions of the
utilities have been preserved under the `las2las-old`, `las2txt-old`, and
`lasinfo-old` monikers in case people have significant processing workflows
built on them. In many cases it is advantageous to upgrade to the new
versions -- both for the significantly improved functionality and for speed
that is sometimes double that of the -old versions. Some new features the
utilities gained as part of this effort include:

* Setting color information from GDAL rasters
* Splitting files based on a point count or a file size in MB
* Chaining many filter operations together into a single call
* Modifying header information, including setting coordinate system info
* Summarizing data more fully and more flexibly (XML, per-point)
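The same kernel operations are also available programmatically. The following
is a minimal sketch of applying a reprojection (vertical datum transformation
is driven the same way) while reading, assuming the 1.6-era
liblas::ReprojectionTransform and Reader::SetTransforms interfaces; the target
SRS and file name are placeholders, and the exact spellings should be treated
as illustrative rather than definitive::

    #include <liblas/liblas.hpp>
    #include <fstream>
    #include <vector>

    int main()
    {
        std::ifstream ifs("input.las", std::ios::in | std::ios::binary);
        liblas::ReaderFactory factory;
        liblas::Reader reader = factory.CreateWithStream(ifs);

        // The source SRS comes from the file header; the target SRS is what
        // we want the coordinates transformed into (hypothetical choice).
        liblas::SpatialReference in_srs = reader.GetHeader().GetSRS();
        liblas::SpatialReference out_srs;
        out_srs.SetFromUserInput("EPSG:4326");

        std::vector<liblas::TransformPtr> transforms;
        transforms.push_back(liblas::TransformPtr(
            new liblas::ReprojectionTransform(in_srs, out_srs)));
        reader.SetTransforms(transforms);

        // Points now come back transformed as they are read.
        while (reader.ReadNextPoint())
        {
            liblas::Point const& p = reader.GetPoint();
            // ... use p.GetX(), p.GetY(), p.GetZ() ...
        }
        return 0;
    }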
Chipper
..............................................................................

Andrew Bell developed a specialized point partitioning process called
`lasblock` to bucketize point data. The process aims to optimize the fill
capacity, shape, and speed of processing. More specifically, it attempts to
keep the blocks as full as possible and as square as possible to improve
querying characteristics for the `Oracle Point Cloud`_.

This pre-processing is needed as a precursor step in the processing chain
that ends with actually loading the data into Oracle via `las2oci`.
`lasblock` can also be used as a LAS tiling process, although it is not
especially memory efficient.

Indexing
..............................................................................

Gary Huber developed an octree-based spatial index for libLAS to speed up
random, bounding-box-based queries to LAS files. It is released as part of
libLAS 1.6, but its full implementation within the library is not yet
complete. The index can store its data within VLRs (which requires a file
rewrite) in addition to storing it in a file alongside the .las file.

CMake
..............................................................................

libLAS was migrated to `CMake`_ for its configuration system. CMake allows
easy generation of MSVC project files, XCode project files, and makefiles
from a common configuration. This effort eliminated three parallel build
system configurations (MSVC projects, GNU autoconf, MSVC makefiles) and
provided more flexibility for packaging, testing, and build types. In my
opinion, its use has been a boon to the project.

.. _`CMake`: http://www.cmake.org/

OSGeo4W
..............................................................................

For the first time ever, we have released fully-capable Windows libLAS
packages in the form of an OSGeo4W release. These releases contain the full
range of libLAS functionality, including coordinate system support, Oracle
support, vertical datum transformation, and chipping. Head to
http://trac.osgeo.org/osgeo4w to obtain your copy and start testing libLAS
immediately.

New Website
..............................................................................

We rewrote the libLAS website and transformed it from a bunch of wiki pages
in Trac to a Sphinx-backed HTML website. We have added tons more
documentation, provided it in formats such as PDF, and organized things
significantly.

Generic LAS Schema Support
..............................................................................

Though it is specifically allowed by the standard, it is not widely
implemented: extra data may be stored attached to each point after the
requisite PointFormat data. There is neither a regularized way to describe
these data nor a way to capture metadata about them. To this end, I have
proposed an XML schema document that can be stored in a VLR, as well as
schema-aware reader and writer implementations that utilize that VLR to work
with the data. More details are available in the initial proposal of schema
support.

libLAS now implements a class called liblas::Schema that is driven by the
Point Data Format ID of the header in addition to any extra dimensions you
wish to store with the point. This work is used for both the `Oracle Point
Cloud`_ effort and the upcoming LASzip compression integration.
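As an illustration, the schema attached to a file's header can be inspected
through the C++ API. This is a rough sketch assuming the 1.6-era
liblas::Schema interface; the method name GetByteSize is written from memory
and may differ in detail::

    #include <liblas/liblas.hpp>
    #include <fstream>
    #include <iostream>

    int main()
    {
        std::ifstream ifs("input.las", std::ios::in | std::ios::binary);
        liblas::ReaderFactory factory;
        liblas::Reader reader = factory.CreateWithStream(ifs);

        // The schema describes the dimensions implied by the header's
        // Point Data Format ID plus any extra dimensions added to it.
        liblas::Schema const& schema = reader.GetHeader().GetSchema();
        std::cout << "point record size: " << schema.GetByteSize()
                  << " bytes" << std::endl;
        return 0;
    }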
Refactoring of liblas::Point class
..............................................................................

We significantly refactored the liblas::Point class for the 1.6.0 release.
The first thing that was done was to make it "thinner," in the sense that it
no longer stores a union of all point-format-derived dimensions; instead, it
stores a reference to a schema that informs the class about which dimensions
exist. Additionally, data are interpreted on-the-fly from the raw bytes which
compose the point, eliminating data fidelity issues.

libLAS 1.2.1 and below utilized a liblas::Point that was kind of fat. It
carried around interpreted data members for all of the dimensions on the
point -- x, y, z, intensity, etc. -- and if you asked for one of these, it
just returned it to you directly. The interpretation of those data happened
as the data were read, and again as the data were written (back into raw
bytes).

libLAS 1.6+ has changed liblas::Point in a number of important ways.
liblas::Point now only carries along the raw bytes for the point, and if you
ask for one of the dimensions, it interprets it on-the-fly. For example, a
GetX() call now requires going into the liblas::Point byte array, pulling the
first four bytes off of it, asking the point's header for scaling
information, and rescaling the integer data into a double. If you only call
GetX() one time, things are roughly equivalent to what we were doing before
-- interpreting and caching interpreted data directly on the liblas::Point --
but every subsequent call to GetX() pays this interpretation cost again. You
need to cache the interpreted values yourself if you are reusing them.
Alternatively, you can control when your data have scaling applied by using
GetRawX(), which was not possible before libLAS 1.6.

The rationale for moving to this approach was three-fold. First, the LAS
committee continually adds new dimensions to the specification, and I wanted
an extensible way to add them to libLAS without a full re-engineering of the
liblas::Point class every time it does. Second, liblas::Point now has a
schema attached to it (based on the list of dimensions that a LAS file's
point format defines, plus any custom dimensions you wish to add to the point
record). The schema allows you to extend the point format and add your own
dimensions, and it provides generic descriptive information about what exists
in the file. You can see the description of these schemas in the new
:ref:`lasinfo <lasinfo>` output from libLAS. Lastly, previous versions of
libLAS did not allow you to work with raw data, and did not allow the user to
transform the data (coordinate data, especially) with perfect fidelity. The
new approach explicitly supports this out of the box.

Here's something that is now possible with the new (C++) API that was not
previously::

    liblas::Point const& p = reader.GetPoint();
    std::vector<boost::uint8_t> data = p.GetData();

    ... // do something with the raw data like stuff it into a database.

Refactoring of internal Reader and Writer code
..............................................................................

The previous (< libLAS 1.2) C++ reader and writer code of libLAS was a bit
inflexible and contained significant duplication for each file format
version. Giant updates to the code were required as the ASPRS LAS standard
committee added new specification versions with new required point formats.
Additionally, the old code's design was a bit rigid for adding things like
generic schema support.

Both the liblas::Reader and liblas::Writer have been significantly
`refactored`_. The reduction in duplication means there is only one place to
go to make changes to the code. In addition to not repeating ourselves, it
provides more flexibility to add new features and the extensibility to allow
the readers and writers to be overridden by user code.
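As a quick illustration of the current Reader and Writer in use, here is a
minimal sketch that copies points from one file to another, assuming the
1.6-era liblas::ReaderFactory; the file names are placeholders::

    #include <liblas/liblas.hpp>
    #include <fstream>

    int main()
    {
        std::ifstream ifs("input.las", std::ios::in | std::ios::binary);
        std::ofstream ofs("output.las", std::ios::out | std::ios::binary);

        liblas::ReaderFactory factory;
        liblas::Reader reader = factory.CreateWithStream(ifs);

        // Reuse the input header so scale, offset, and point format carry
        // over to the output file.
        liblas::Header header = reader.GetHeader();
        liblas::Writer writer(ofs, header);

        while (reader.ReadNextPoint())
        {
            // GetX()/GetY()/GetZ() are interpreted from raw bytes on every
            // call, so cache them in locals if a value is used repeatedly.
            liblas::Point const& p = reader.GetPoint();
            writer.WritePoint(p);
        }
        return 0;
    }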
Generic interfaces
..............................................................................

A number of generic interfaces have been added to libLAS to support dynamic
polymorphism. By implementing these interfaces, you can add your own reader
and writer implementations as well as provide custom filtering and
transformation capability.

Faster binary i/o
..............................................................................

Mateusz Loskot developed a more savvy implementation of libLAS's binary i/o,
which provides some significant performance improvements.

Caching reader
..............................................................................

A reader implementation that caches data is provided with libLAS 1.6. If your
processing involves reading the data in multiple passes through the file, you
can utilize the cached reader to cache the points (up to the size of the
entire file) for faster repeated and random access.

Seek support
..............................................................................

It is now possible to seek to a specific point in the file and start reading
points from there. This can significantly speed up the "random sampling"
access strategy, where one reads a run of points at a specific location in
the file and then moves to a different location.

Classification class
..............................................................................

A class is now provided to abstract the LAS classification value and help
interpret the bit fields that are present for the synthetic, key point, and
withheld types.

.. _refactored: http://en.wikipedia.org/wiki/Code_refactoring
.. _`Oracle Point Cloud`: http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28400/sdo_pc_pkg_ref.htm
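To close, here is a short sketch combining the seek support and the
classification helper described above, assuming the 1.6-era Reader::Seek and
liblas::Classification methods (GetClass, IsSynthetic, IsKeyPoint,
IsWithheld); the file name and the point index are placeholders::

    #include <liblas/liblas.hpp>
    #include <fstream>
    #include <iostream>

    int main()
    {
        std::ifstream ifs("input.las", std::ios::in | std::ios::binary);
        liblas::ReaderFactory factory;
        liblas::Reader reader = factory.CreateWithStream(ifs);

        // Jump directly to point index 1000 (hypothetical) instead of
        // reading everything before it.
        if (reader.Seek(1000) && reader.ReadNextPoint())
        {
            liblas::Point const& p = reader.GetPoint();

            // The Classification helper decodes the class number and the
            // synthetic/key point/withheld bit fields.
            liblas::Classification cls = p.GetClassification();
            std::cout << std::boolalpha
                      << "class: " << static_cast<int>(cls.GetClass())
                      << " synthetic: " << cls.IsSynthetic()
                      << " key point: " << cls.IsKeyPoint()
                      << " withheld: " << cls.IsWithheld()
                      << std::endl;
        }
        return 0;
    }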