Changes between Version 16 and Version 17 of Processing/lagdevelopersfaq

Timestamp: Jul 9, 2012, 3:07:36 PM
Author: jaho
Comment: --

Processing/lagdevelopersfaq

-                      v16
+                      v17
 === 2.2 What are LAS files? [=#q2.2] ===
 LAS is a binary, public file format designed to hold LiDAR points data. One alternative to LAS are ASCII files(.csv or .txt), which are however less efficient in terms of both processing time and file size. \\ \\
 LAS files consist of several parts: Public Header Block, Variable Length Records, Point Records and in format 1.3 and later Waveform Records. \\ \\
+LAS is a binary, public file format designed to hold LiDAR points data. One alternative to LAS are ASCII files (.csv or .txt), which are however less efficient in terms of both processing time and file size. \\ \\
+LAS files consist of several parts: Public Header Block containing meta-data about the records (like min and max x, y, z values, number of points etc.), Variable Length Records with additional meta-data (like projection information), Point Records and in format 1.3 and later Waveform Records. \\ \\
 One thing to note about LAS is that point coordinates are stored as scaled integers and are then unscaled upon loading to double values with the use of scale factors and offsets stored in the header. \\ \\
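The unscaling step described above can be sketched as follows. This is a minimal illustration, not laslib code: the struct and function names ({{{LasTransform}}}, {{{ScaledPoint}}}, {{{Point3d}}}, {{{unscale}}}) are hypothetical, and only the formula itself (real value = stored integer * scale factor + offset) comes from the LAS format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the LAS public header: only the fields needed here.
struct LasTransform {
    double x_scale, y_scale, z_scale;     // scale factors
    double x_offset, y_offset, z_offset;  // offsets
};

// A point as stored in the file (scaled integers) and after loading (doubles).
struct ScaledPoint { std::int32_t x, y, z; };
struct Point3d { double x, y, z; };

// real coordinate = stored integer * scale factor + offset
inline Point3d unscale(const ScaledPoint& p, const LasTransform& t) {
    // A zero scale factor would collapse every coordinate onto the offset.
    assert(t.x_scale != 0.0 && t.y_scale != 0.0 && t.z_scale != 0.0);
    return {p.x * t.x_scale + t.x_offset,
            p.y * t.y_scale + t.y_offset,
            p.z * t.z_scale + t.z_offset};
}
```

Storing integers keeps the point records compact, while the scale factors and offsets in the header decide how much precision survives the round trip.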
 A detailed LAS format specification can be found at:
 …
 In addition to discrete point records LAS 1.3 files also hold full waveform data which is quite huge in volume. The point records have also been modified to include a number of new attributes related to waveform data. \\ \\
 New point format causes a problem to the quadtree library because point objects are much bigger in size thus affecting the performance and memory consumption. We don't currently have tools to process full waveform data, however we do sometimes include it in deliveries so we need some way of handling it.
 …
 There is no official documentation for laslib API, however it is a fairly well written C++ which provides an easy to use interface. The best way to learn laslib is to study the source code of programs included in lastools. It is quite powerful and even provides its own tools for spatial indexing (totally undocumented though). \\ \\
 The main classes of interest to us are {{{LASreader}}} and {{{LASwriter}}}. A simple program that reads points from a file, filters out points with classification 7 (noise), prints the points' coordinates and then saves them to another file may look like this (note that there's no error checking or exception handling for simplicity, but you should always include it in real code):
 {{{
 …
 === 3.4 How does caching work? [=#q3.4] ===
 Upon creation of the quadtree the user specifies a maximum number of points to be held in memory. Whenever a new bucket is created inside {{{PointBucket::cache()}}}, the {{{CacheMinder::updateCache(int requestSize, PointBucket* pointBucket, bool force)}}} method is called with the {{{requestSize}}} parameter equal to the number of points that will be held in that bucket. This number is then added to {{{CacheMinder::cacheUsed}}} variable and compared to {{{CacheMinder::totalCache}}} equal to the maximum number of points that can be held in memory. If the sum of the two is smaller than the maximum cache size, the {{{cacheUsed}}} variable is updated and a pointer to the bucket is added to the queue ({{{std::deque}}}) of cached buckets. If the total amount of cache requested is greater than the allowed maximum, buckets are taken from the front of the queue and uncached until there's enough space for the new bucket. (A note here: the {{{force}}} parameter of the {{{updateCache()}}} method is actually obsolete as it's always set to true). \\ \\
+Upon creation of the quadtree the user specifies a maximum number of points to be held in memory. Whenever a new bucket is created inside {{{PointBucket::cache()}}}, the {{{CacheMinder::updateCache(int requestSize, PointBucket* pointBucket, bool force)}}} method is called with the {{{requestSize}}} parameter equal to the number of points that will be held in that bucket. This number is then added to the value of the {{{CacheMinder::cacheUsed}}} variable and compared to {{{CacheMinder::totalCache}}}, which is equal to the maximum number of points that can be held in memory. If the sum of the two is smaller than the maximum cache size, the {{{cacheUsed}}} variable is updated and a pointer to the bucket is added to the queue ({{{std::deque}}}) of cached buckets. If the total amount of cache requested is greater than the allowed maximum, buckets are taken from the front of the queue and uncached until there's enough space for the new bucket. (A note here: the {{{force}}} parameter of the {{{updateCache()}}} method is actually obsolete as it's always set to true). \\ \\
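The eviction policy described above can be sketched roughly as follows. This is not the actual {{{CacheMinder}}} code: {{{Bucket}}} and {{{CacheSketch}}} are hypothetical stand-ins, the obsolete {{{force}}} parameter is left out, and the rejection of requests larger than the whole cache is an assumption added to keep the sketch well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for PointBucket.
struct Bucket {
    int numPoints;
    bool cached;
    // The real uncache() also compresses the bucket and writes it to a file.
    void uncache() { cached = false; }
};

class CacheSketch {
public:
    explicit CacheSketch(int totalCache) : totalCache_(totalCache), cacheUsed_(0) {}

    // Returns false only if the request could never fit (assumed behaviour).
    bool updateCache(int requestSize, Bucket* bucket) {
        assert(requestSize >= 0 && bucket != nullptr);
        if (requestSize > totalCache_) return false;
        // Uncache the oldest buckets (front of the queue) until there is room.
        while (cacheUsed_ + requestSize > totalCache_ && !queue_.empty()) {
            Entry oldest = queue_.front();
            queue_.pop_front();
            oldest.bucket->uncache();
            cacheUsed_ -= oldest.size;
        }
        cacheUsed_ += requestSize;
        bucket->cached = true;
        queue_.push_back({bucket, requestSize});
        return true;
    }

    int cacheUsed() const { return cacheUsed_; }
    std::size_t cachedBuckets() const { return queue_.size(); }

private:
    struct Entry { Bucket* bucket; int size; };
    int totalCache_;           // maximum number of points held in memory
    int cacheUsed_;            // number of points currently held in memory
    std::deque<Entry> queue_;  // cached buckets, oldest at the front
};
```

With a limit of 100 points, caching three buckets of 40 points each would evict the first bucket when the third one arrives, leaving 80 points in memory.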
 When a bucket needs to be uncached because there is no more room for new buckets in memory, the {{{PointBucket::uncache()}}} method is called, which serializes the bucket by compressing it and writing it to a file. LZO compression is used for this, together with two static variables to reduce the memory overhead. The serialization code looks like this:
 …
 Worker.h
   An abstract worker class.
 ui classes
+UI classes
   These classes represent top-level interface elements (windows and dialogs) and are responsible for connecting the UI to signal handlers.
 …
 === 4.6 How is threading done? [=#q4.6] ===
 It uses [http://developer.gnome.org/glib/2.30/glib-Threads.html Gtkmm threading system] and implements worker classes which do the work in separate threads, without blocking the GUI. All worker classes inherit from {{{Worker.h}}} which is pretty simple. It provides a {{{start()}}} method with launches a new thread and a virtual {{{run()}}} method with code to be executed. It also has a {{{Glib::Dispatcher}}} object for inter-thread communication (works like signals).
+It uses [http://developer.gnome.org/glib/2.30/glib-Threads.html Gtkmm threading system] and implements worker classes which do the work in separate threads, without blocking the GUI. All worker classes inherit from {{{Worker.h}}} which is pretty simple. It provides a {{{start()}}} method which launches a new thread and a virtual {{{run()}}} method with code to be executed. It also has a {{{Glib::Dispatcher}}} object for inter-thread communication (works like signals).
 {{{
 …
   worker->sig_done().connect(sigc::mem_fun(*this, &GUIClass::on_worker_work_finished));
   // Display a progress bar or a busy cursor
+  // Signal to the user that work is in progress by displaying a progress bar or a busy cursor
   }}}
 }}}
 Once the worker has finished it emits {{{sig_done()}}} signal in its {{{run()}}} method notifying the caller class and {{{on_worker_work_finished()}}} method gets called. Most worker classes implement more signals to notify about different events (for example load progress to move the progress bar). The idea stays the same though. \\ \\
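The same start/run/done pattern can be sketched in a self-contained way with the standard library. This is an illustration only, not the {{{Worker.h}}} code: it uses {{{std::thread}}} and a plain {{{std::function}}} callback in place of Glib threads and {{{Glib::Dispatcher}}}, and the names {{{WorkerSketch}}}, {{{SumWorker}}}, {{{on_done()}}} and {{{join()}}} are hypothetical. Note one important difference: here the callback runs in the worker thread, whereas a {{{Glib::Dispatcher}}} delivers the notification in the receiving (GUI) thread.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Base class: start() launches a thread that executes the virtual run().
class WorkerSketch {
public:
    virtual ~WorkerSketch() { join(); }

    // Register a callback invoked (in the worker thread) when run() returns.
    void on_done(std::function<void()> cb) { done_ = std::move(cb); }

    void start() {
        assert(!thread_.joinable());  // a worker may only be started once
        thread_ = std::thread([this] {
            run();
            if (done_) done_();
        });
    }

    // Wait for the worker thread to finish, if it is running.
    void join() {
        if (thread_.joinable()) thread_.join();
    }

protected:
    virtual void run() = 0;  // code to be executed in the new thread

private:
    std::function<void()> done_;
    std::thread thread_;
};

// Example worker: sums the integers 1..n in the background.
class SumWorker : public WorkerSketch {
public:
    explicit SumWorker(int n) : n_(n), result_(0) {}
    // Join here so the thread never outlives this object's members.
    ~SumWorker() override { join(); }
    long result() const { return result_; }  // only valid after join()

protected:
    void run() override {
        for (int i = 1; i <= n_; ++i) result_ += i;
    }

private:
    int n_;
    long result_;
};
```

A caller would construct a {{{SumWorker}}}, register a callback with {{{on_done()}}}, call {{{start()}}} and carry on with other work, reading {{{result()}}} only after {{{join()}}} has returned.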
 Additionally in GUI classes {{{Glib::Mutex}}} objects are used at the beginning of each critical section to avoid race conditions. Using brackets around the section makes sure that the {{{Mutex::Lock}}} object get automatically destroyed.
+Additionally in GUI classes {{{Glib::Mutex}}} objects are used at the beginning of each critical section to avoid race conditions. Using brackets around the section makes sure that the {{{Mutex::Lock}}} object gets automatically destroyed.
 {{{
 …
 === 4.7 How does rendering work? [=#q4.7] ===
+It's difficult to tell. It's quite a mess and it definitely needs some refactoring for maintainability. It does work however so I haven't changed it, focusing on other things. \\ \\
+I'm not quite sure. It's a mess and it definitely needs some refactoring. It does work however so I haven't changed it, focusing on other things. \\ \\
 Once the files have been loaded, a pointer to the quadtree is passed to {{{LagDisplay}}}, {{{TwoDeeOverview}}} and {{{Profile}}}. First {{{LagDisplay::prepare_image()}}} method is called which sets up OpenGl and gets all the buckets from the quadtree with {{{advSubset()}}} method. Based on the ranges of z and intensity values it then sets up arrays for coloring and brightness of the points in {{{LagDisplay::coloursandshades()}}}. Then {{{resetview()}}} method is called which sets up orthographic projection and conversion of world coordinates to screen coordinates. \\ \\
 After that a {{{drawviewable()}}} method is called in {{{TwoDeeOverview}}} and/or {{{Profile}}} which checks what part of the data is currently being viewed on the screen and gets the relevant buckets from the quadtree. It also uses a good number of boolean variables which are changed all over the place to make things harder to understand. If you were wondering how NOT to do threading in C++, here's a great example. Eventually we create a new thread and call the {{{TwoDeeOverview::mainimage()}}} method, passing it the buckets we've got. \\ \\
 …
 Luckily there aren't many any more; the major one, however, is memory consumption. This is quite a difficult problem, but it should be looked at before lag can be released to the public. \\ \\
 Another one is the handling of LAS 1.3 files, which works but can be improved (e.g. saving to the same file). This may not be possible though as long as laslib just ignores full waveform data. \\ \\
+Also there is an issue with rendering/loading thread where forcing ON_EXPOSE event while loading a file at the same time may sometimes cause a segfault. For example moving some window on top of the TwoDeeOverview window while file is being loaded may cause this. This is most likely cause by both loading and rendering thread trying to access the quadtree at the same time and I believe sorting out the rendering thread should solve this problem.
+Also there is an issue with rendering/loading thread where forcing ON_EXPOSE event while loading a file at the same time may sometimes cause a segfault. For example moving some window on top of the TwoDeeOverview window while file is being loaded may cause this. This is most likely caused by both loading and rendering thread trying to access the quadtree at the same time and I believe sorting out the rendering thread should solve this problem.
 === 5.3 What is lag memory issue? [=#q5.3] ===
 …
 === 5.4 Which parts of LAG's code need improvement? [=#q5.4] ===
+All that I haven't written. \\ \\
+On a more serious note the threading is not quite finished. While {{{LoadWorker}}} and {{{SaveWorker}}} have been fully implemented, the {{{ProfileWorker}}} and {{{ClassifyWorker}}} need some improvement (so the work is actually done inside these classes). To add to that a progress indication for classifying points and loading a profile could be added. \\ \\
+The threading is not quite finished. While {{{LoadWorker}}} and {{{SaveWorker}}} have been fully implemented, the {{{ProfileWorker}}} and {{{ClassifyWorker}}} need some improvement (so the work is actually done inside these classes). To add to that a progress indication for classifying points and loading a profile could be added. \\ \\
 {{{PointBucket}}} and {{{CacheMinder}}} may use some work, possibly implementing a custom memory allocator for better memory management. \\ \\
 There's still plenty of refactoring to be done (possibly a good thing to start with to understand the code better). Particularly the rendering, which works, but is kind of scattered between different classes and methods and its threading which is just bad. Ideally there should be a separate Renderer/RenderingWorker class with all the code in one place. \\ \\
 …
   * A 3D overview. Not so useful for processing, but pretty cool for visualisation and possibly attractive for other users outside ARSF (once LAG is released to the public).
   * Additional Loading/Saving features. For example an option to use shape files to filter out points, ability to import/export different file formats, support for more projections etc. Also improved support for waveform data.
   * Plugins or scripting support. Probably scripting would be easier to implement, for example using boost::python. The idea is to provide an interface for lag/quadtree to a scripting language like python so even users that don't know C++ could add their modifications. For example classify_las could then be made a script or tool inside lag.
+  * Plugins or scripting support. Probably scripting would be easier to implement, for example using [http://www.boost.org/doc/libs/1_50_0/libs/python/doc/ Boost::Python]. The idea is to provide an interface for lag/quadtree to a scripting language like python so even users that don't know C++ could add their modifications. For example classify_las could then be made a script or tool inside lag.
   * Further GUI improvements. Eg. progress bars for loading a profile and classifying points, tool tips and other user-friendly stuff.
   * Waveform visualisation.
 …
 valgrind --tool=callgrind lag
 }}}
 Lag will start as usual, but it will run extremely slowly. Run some operations (ideally ones you can repeat with no variations, like loading a single file) and close the program. Callgrind will have generated a call-graph file called callgrind.out.xxxxx which you can open in [http://kcachegrind.sourceforge.net/html/Home.html kcachegrind] for visualisation.
+Lag will start as usual, but it will run extremely slowly. Run some operations (ideally ones you can repeat with no variations, like loading a single file) and close the program. Callgrind will have generated a call-graph file called callgrind.out.xxxxx which you can open in [http://kcachegrind.sourceforge.net/html/Home.html kcachegrind] for visualisation. \\
 [[Image(kcachegrind.png,align=center)]]
 {{{