DesignGuide < TModeling

Design Goals

Simple: We want simple, straightforward, easy-to-understand code. Avoid complex solutions to simple problems. Use encapsulation and modularization to minimize the amount of complexity that anyone has to juggle in their mind to understand, use, or maintain the code.

Readable: We want well-written, self-documenting code with good summary and intent comments. Make it easy for new students to learn the code base and start using it as quickly as possible.

Correct: Our code should work correctly. It should work as intended in the original design. Use defensive programming (pre-conditions, post-conditions, loop invariants, asserts, parameter checking, validation routines, etc.) to make sure code is actually correct.
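
As a minimal sketch of what pre-conditions, loop invariants, and post-conditions can look like in practice, consider a binary search over a sorted vector. The function name findSorted is hypothetical and chosen only for illustration; it is not part of our code base.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of key in a sorted vector, or -1.
int findSorted(const std::vector<int>& sorted, int key)
{
    // Pre-condition: the input must already be sorted in ascending order.
    assert(std::is_sorted(sorted.begin(), sorted.end()));

    std::size_t lo = 0;
    std::size_t hi = sorted.size();
    while (lo < hi) {
        // Loop invariant: if key is present, its index lies in [lo, hi).
        const std::size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < key) {
            lo = mid + 1;
        } else if (key < sorted[mid]) {
            hi = mid;
        } else {
            // Post-condition: the reported position really holds key.
            assert(sorted[mid] == key);
            return static_cast<int>(mid);
        }
    }
    return -1;  // key is not present
}
```

The asserts are compiled out when NDEBUG is defined, so checks like these cost nothing in release builds.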

Reliable: Our code should not crash and should be relatively bug-free. We should use a variety of techniques to find and fix potential bugs in our code. Unit testing, debugging sessions, stress tests, code reviews, etc. can all help decrease our error rate.

Robust: Similar to, but distinct from, 'Reliable'. Our code should gracefully handle all invalid inputs, bad data files, etc. We should write our code to validate inputs and either replace them with valid defaults or give the user a useful error message that identifies the actual problem, so it can be quickly corrected.
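
One possible shape for this kind of validation is sketched below. The function parseCellSize, its parameters, and the message wording are all assumptions made for the example: it parses a positive number from text and, on bad input, falls back to a default while reporting what was wrong.

```cpp
#include <cmath>
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical helper: parse a positive, finite cell size from user text.
// On invalid input, return fallback and describe the problem in warning.
double parseCellSize(const std::string& text, double fallback, std::string& warning)
{
    warning.clear();
    try {
        std::size_t used = 0;
        const double value = std::stod(text, &used);
        if (used != text.size()) {
            warning = "cell size '" + text + "' has trailing characters; using default";
            return fallback;
        }
        if (!std::isfinite(value) || value <= 0.0) {
            warning = "cell size '" + text + "' must be a positive finite number; using default";
            return fallback;
        }
        return value;
    } catch (const std::exception&) {
        // std::stod throws std::invalid_argument or std::out_of_range.
        warning = "cell size '" + text + "' is not a valid number; using default";
        return fallback;
    }
}
```

The caller decides whether to print the warning, log it, or treat it as a hard error; the parser itself never crashes on bad text.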

Re-usable: Design code in a modular, general way so we can reuse functions and modules in other systems, tools, and apps.

Portable: Design code so that it can be easily moved back and forth between multiple environments (Windows, Unix, Mac, etc.). This means avoiding or wrapping dependencies on platform-specific functionality.
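
A small sketch of wrapping a platform difference behind one function follows. The names kPathSeparator and joinPath are hypothetical; the point is only that the #ifdef lives in one place and the rest of the code calls the wrapper.

```cpp
#include <string>

// The only platform-specific piece: the preferred path separator.
#ifdef _WIN32
const char kPathSeparator = '\\';
#else
const char kPathSeparator = '/';
#endif

// Hypothetical wrapper: join a directory and a file name portably.
std::string joinPath(const std::string& dir, const std::string& file)
{
    if (dir.empty()) {
        return file;
    }
    if (dir[dir.size() - 1] == kPathSeparator) {
        return dir + file;  // separator is already there
    }
    return dir + kPathSeparator + file;
}
```

Code that calls joinPath("data", "mesh.txt") compiles unchanged on every platform, and only this one header needs attention when a new platform is added.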

Lean: Design the system so that it has no extra leftover parts. Leftover code still needs to be tested, debugged, and maintained as we modify our code base. Useful leftover code should be moved into an archive. Not-so-useful code should be deleted (we can always recover it from Subversion if we really need it).

Performance: We want fast code. Make good choices in algorithms and data structures up front: O(1) < O(log n) < O(n) < O(n log n) < O(n^2), etc. Look for avoidable costs in memory allocation, data structure copying, redundant calculations, etc. Correct use of inline functions can speed up code at no cost in readability. If necessary, rewrite code at choke points to be as fast as possible. Use simple timers to test overall performance. Use profilers to identify problem areas and fix them. This goal is often in conflict with simplicity and readability, so use good comments to explain code that was rewritten in a tricky way for performance reasons.
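
For the "simple timers" mentioned above, a wall-clock stopwatch is often enough to get a first measurement. The class SimpleTimer below is a hypothetical sketch that assumes a C++11 compiler and uses std::chrono::steady_clock, which is monotonic.

```cpp
#include <chrono>

// Hypothetical stopwatch for coarse, whole-program or per-phase timing.
class SimpleTimer {
public:
    SimpleTimer() : start_(std::chrono::steady_clock::now()) {}

    // Restart the measurement from the current moment.
    void reset() { start_ = std::chrono::steady_clock::now(); }

    // Seconds elapsed since construction or the last reset().
    double elapsedSeconds() const
    {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        return elapsed.count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};
```

Create a SimpleTimer before the phase of interest and read elapsedSeconds() after it. A timer like this shows whether a phase is slow; a profiler is still needed to find out why.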

Handle Large Data-sets: Our code should be written to handle very large datasets (files) that contain far more data than can fit into memory at once. Use streaming, buffering, binning, and finalization techniques to accomplish this goal.

Scalable: We should write our code to take advantage of additional computational resources to speed up the overall solution where possible (e.g., pipelining, parallel programming [multiple threads in one application vs. farming out tasks across multiple computers], GPGPU programming, etc.).
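
As one illustration of the "multiple threads in one application" option, the sketch below splits a sum across worker threads. The function parallelSum is hypothetical and assumes a C++11 compiler with std::thread support (on some systems this requires linking with a threading flag such as -pthread). Each thread writes only to its own slot, so no locking is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical example: sum a vector using threadCount worker threads.
long long parallelSum(const std::vector<int>& values, std::size_t threadCount)
{
    if (threadCount == 0) {
        threadCount = 1;  // replace an invalid request with a safe default
    }
    std::vector<long long> partial(threadCount, 0);
    std::vector<std::thread> workers;

    // Ceiling division so every element lands in exactly one chunk.
    const std::size_t chunk = (values.size() + threadCount - 1) / threadCount;
    for (std::size_t t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min(values.size(), t * chunk);
        const std::size_t end = std::min(values.size(), begin + chunk);
        workers.emplace_back([&values, &partial, t, begin, end]() {
            partial[t] = std::accumulate(
                values.begin() + static_cast<std::ptrdiff_t>(begin),
                values.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
        });
    }
    for (std::thread& worker : workers) {
        worker.join();  // wait for every partial result
    }
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

For a workload this small the thread start-up cost outweighs any gain; the pattern pays off only when each chunk represents substantial work.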

-- ShawnDB - 15 Jun 2008

Revision: r1.1 - 26 Jun 2008 - 16:31 - Main.guest
