About Mastrave: scientific knowledge freedom for robust computational science

What I cannot create,
I do not understand.


Richard P. Feynman, 1988

Mastrave is a library written to be as compatible as possible with both the GNU Octave and Matlab computing environments.  The Mastrave project aims to make interoperability between them quicker and more effective by providing a reasonably well documented wrapper around their main incompatibilities and by promoting a reasonably general idiom based on their common, stable syntagms.  It aims to build a solid layer of coherently designed utilities on top of the already valuable abstraction provided by these computing environments, so as to smooth the learning curve that must be climbed before one can conceive and implement non-trivial scientific contributions.  To this end it also promotes the systematic adoption of lightweight semantic constraints, which enable concise and reliable implementations of models according to the paradigm of semantic array programming.  There are a couple of underlying ideas: library design is language design and vice versa (Bell Labs); language notation is definitely a "tool of thought" (Iverson), in the sense that there is a feedback between programming/mathematical notation and the ability to conceive new scientific insights.  And perhaps ethical ones.

Knowledge and culture freedom

Mastrave is free software, that is, software respecting your freedom.  It has been designed with some difficulties in mind (difficulties also experienced first-hand) that arise when attempting to carry out collaborative scientific research by sharing software, data and metadata and, most importantly, by communicating the underlying patterns of ideas together with practicable ways of critically assessing and collectively enhancing all of them.

Since errors stem from human nature, we should try to ease their discovery and correction even from a cultural perspective.  Is competitiveness the panacea?  If unrestrained, competitiveness applied to scientific knowledge (the publish or perish mechanism[1][2]) may induce the fear of "external" a posteriori falsification and therefore favor ambiguity and obfuscation (as short-sighted patches to protect against external competitive aggression) in publishing the relevant information needed to verify scientific claims.  This strategy may perhaps be locally effective for surviving in a hostile cultural context; however, it can easily degrade the maintainability and durability of scientific insights and ultimately compromise their long-term relevance.

The community-based success of free software should suggest the global sub-optimality of renouncing a cooperative effort to mitigate human fallibility.  Its goal of spreading freedom and cooperation has a pragmatic foundation, which can be applied to scientific knowledge by valuing, instead of fearing, the design of error-aware, non-overconfident research while helping to reduce ambiguity and obfuscation.  The public and complete availability of the technologies used to accomplish scientific research, together with the ease of understanding and improving them, is a first methodological step.

Like many other free scientific software packages, Mastrave is offered to the community of researchers to promote the development of a free society more concerned with cooperation than with competitiveness, heading toward knowledge and culture freedom.  This also implies the possibility for motivated individuals to freely access and contribute even to cutting-edge academic culture.  This possibility relies on the development of tools and methodologies that help to overcome economic, organizational and institutional barriers (i.e. knowledge oligopolies).  This is a long-term goal to which the free software paradigm can contribute and has already been actively contributing.

Adding on top:
portability, scalability

Mastrave was originally conceived and written by Daniele de Rigo (around 2005) to perform highly vectorized computation under the constraint of using a sufficiently portable, scalable architecture and data/metadata manipulation abstraction.  These requirements are common to many otherwise heterogeneous domains related to natural resources modelling, a field that is integrating ever more extensive and detailed data sets and models of increasing complexity.  This trend is showing just the early signs of its impact on the way science is done and therefore requires support designed to last.

Portability required dealing with the essential intersection of the GNU Octave and Matlab languages, without forsaking the efficiency and conciseness provided by the vectorized approach that makes it sensible to use such computing environments.  The Bash shell has been transparently integrated within the Mastrave core of GNU Octave and Matlab compatible functions, to take advantage of some relevant, stable and largely portable features of that shell which have been oriented toward the array-programming paradigm.

Although powerful GNU Octave language and library extensions and Matlab Toolboxes reach great "unilateral" capabilities, until now there has been no systematic attempt to improve both of them with general purpose, portable and freely available features to make complex modelling manageable and sustainable.  Mastrave's peculiar approach promotes strong modularization and the systematic adoption of lightweight semantic constraints to enable concise and reliable implementations of (sub)models according to the paradigm of semantic array programming.  Array semantic constraints may be contextualized, in analogy with behavioural subtyping[3], as lightweight array-oriented behavioural contracts.
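
The idea of an array semantic constraint acting as a lightweight behavioural contract can be illustrated with a minimal sketch.  It is written in Python purely for illustration and is not Mastrave's API: the helper `require`, the constraint names and the example function `weighted_mean` are all hypothetical.  The point is only that a function states what its array arguments must mean (not just their type) in one concise line, and fails early with a clear message otherwise.

```python
import math

# Hypothetical constraint names, chosen for illustration only
# (they are NOT taken from Mastrave).
CONSTRAINTS = {
    "numeric": lambda xs: all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in xs
    ),
    "nonnegative": lambda xs: all(x >= 0 for x in xs),
    "sorted": lambda xs: all(a <= b for a, b in zip(xs, xs[1:])),
    # Every element in [0, 1] and the elements sum to 1.
    "probability": lambda xs: all(0 <= x <= 1 for x in xs)
    and math.isclose(sum(xs), 1.0),
}


def require(values, *names):
    """Raise ValueError unless `values` satisfies every named constraint."""
    for name in names:
        if not CONSTRAINTS[name](values):
            raise ValueError(f"array violates semantic constraint '{name}'")
    return values


def weighted_mean(values, weights):
    """Weighted mean whose preconditions are declared as array constraints."""
    require(values, "numeric")
    require(weights, "numeric", "probability")
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))
```

With this sketch, `weighted_mean([1.0, 3.0], [0.5, 0.5])` is computed normally, while weights such as `[0.7, 0.7]` are rejected before any computation takes place, because they do not sum to one.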

Since many other languages that do not properly support array programming could still be enriched to manifest at least an array-programming flavor, Mastrave is open to experimenting with their peculiarities to create a common basis for reasoning and to encourage mutual paradigm contamination (such research forms a set of experimental extensions to the core of Mastrave).

On the other hand, implementing some interesting features seems to require a choice among divergent objectives, such as preserving the vectorial approach, achieving time efficiency or avoiding memory exhaustion for large data sets.  The Mastrave project has tried to mitigate this dilemma, also through an extensive use of sparse matrices and irregular grid indexing.  Most of the resulting algorithms are designed to inherently manifest scalable parallelism and scalable locality.
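
How a sparse representation eases the tension between memory use and whole-array operations can be shown with a small sketch.  It is written in Python for illustration only and does not reproduce Mastrave's algorithms; the function names `to_sparse` and `sparse_matvec` are hypothetical.  Only the nonzero entries are stored, and the matrix-vector product visits just those entries, so both memory and time grow with the number of nonzeros rather than with the full matrix size.

```python
def to_sparse(dense):
    """Keep only the nonzero entries of a dense matrix as {(row, col): value}."""
    return {
        (i, j): v
        for i, row in enumerate(dense)
        for j, v in enumerate(row)
        if v != 0
    }


def sparse_matvec(entries, n_rows, x):
    """Multiply a sparse matrix by the vector x, touching only stored entries."""
    y = [0.0] * n_rows
    for (i, j), v in entries.items():
        y[i] += v * x[j]
    return y


# A 3x3 matrix with only three nonzero entries.
dense = [
    [0, 2, 0],
    [0, 0, 0],
    [1, 0, 3],
]
entries = to_sparse(dense)
result = sparse_matvec(entries, len(dense), [1, 1, 1])
```

Here `entries` holds three items instead of nine, and `result` equals the row sums of the dense matrix.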

Freedom to review
software-based scientific claims

In scientific environments, the free and easy exchange and diffusion of information are the foundations of knowledge building.  Independent validation, the possibility of independently reviewing and repeating an experiment or a process with full access to structural information, and the freedom to explore different variants are well known to be at the core of the scientific method.

Scientifically oriented software should guarantee the same free and easy exchange and diffusion of information about itself.  If the main result of a scientific publication is a new algorithm or a new software package claimed to be useful, it is in the scientific community's interest that the author make available a detailed description and source code that other scientists can understand, peer-review and improve.

The possibility of deep verification is especially needed for those results that infer theoretical conclusions by means of non-trivial numerical tasks.  A theoretical assumption could be justified a posteriori in the most persuasive manner.  However, it may ignominiously collapse when a subtle bug is discovered in the code used to infer it.

Cooperative scientific patterns

Mastrave encourages conciseness and a focus on the problem rather than on the dusty corners of the array programming language and on portability.  Its general purpose abstraction and its pervasive requirement checks try to move code development away from the individual debugging of both the actual problem and other expensive, unspecific tasks, toward more problem-specific coding, leaving to the community the collaborative debugging and evolution of those general, common tasks.
That is the meaning of a free library.

On the other hand, a new algorithm created for a prosaic application may be brilliantly adapted, opening unexpected opportunities in some other field and maybe becoming a general purpose utility.
But it needs to be accessible to other people, widely and freely.  Such things as closed, armor-plated black-box software or software patents might sound extravagant if applied to science.  What would have happened if the Fast Fourier Transform had been hidden behind a software patent?

The source code for Mastrave is freely redistributable under the terms of the GNU General Public License (GPL) as published by the Free Software Foundation.  As a handy reference for understanding some key aspects of the GNU General Public License, it could be said that the GNU GPL guarantees that anyone who redistributes the software, with or without changes, must pass along the freedom to further copy and change it.

By distributing the complete source code for Mastrave under the terms of the GNU GPL, it is possible to guarantee that you and all other users have the freedom to redistribute and change both Mastrave and the growing collection of public scientific works that use Mastrave as a library.  Moreover, you have the freedom to analyze in detail how each computation is performed and how to improve it.

Everyone is encouraged to share this software with others under the terms of the GNU General Public License.
Mastrave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.

You are also invited to help improve Mastrave by contributing new ideas for it and by reporting any problems you may have.  Gold plating aside, science remains "the pleasure of finding the thing out, the kick in the discovery, the observation that other people use it".

Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 Daniele de Rigo

This page is licensed under a Creative Commons Attribution-NoDerivs 3.0 Italy License.

