
.. _library_random_projection:

random_projection

Random projection reducer for continuous datasets. The library implements the dimension_reducer_protocol defined in the dimension_reduction_protocols library. Learning centers the training data, optionally standardizes the continuous attributes, and then samples a seeded dense Rademacher projection matrix using the portable fast_random pseudo-random generator. The matrix entries are drawn from {$-1/\sqrt{k}$, $+1/\sqrt{k}$}, where $k$ is the requested reduced dimensionality.
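
The steps described above can be summarized in a few formulas. This is a sketch under stated assumptions, not a description of the library internals: the symbols $d$ (number of attributes), $\mu_i$ and $s_i$ (per-attribute mean and scale), and $R$ (the sampled matrix) are notation introduced here, and $s_i = 1$ is assumed when feature scaling is disabled.

.. math::

   \tilde{x}_i = \frac{x_i - \mu_i}{s_i}, \qquad
   R_{ij} \in \left\{ -\frac{1}{\sqrt{k}}, +\frac{1}{\sqrt{k}} \right\}, \qquad
   z_j = \sum_{i=1}^{d} R_{ij}\,\tilde{x}_i, \quad j = 1, \ldots, k

Under this reading, $z_j$ would be the value reported for the $j$-th component of a transformed instance.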

API documentation

Open the `../../apis/library_index.html#random_projection <../../apis/library_index.html#random_projection>`__ link in a web browser.

Loading

To load this library, load the loader.lgt file:

::

    | ?- logtalk_load(random_projection(loader)).

Testing

To test this library's predicates, load the tester.lgt file:

::

    | ?- logtalk_load(random_projection(tester)).

Features

  • Continuous Datasets: Accepts datasets containing only continuous attributes. Missing or nonnumeric values are rejected.
  • Centering and Optional Scaling: Centers all attributes and optionally standardizes them before projection.
  • Portable Seeded Sampling: Uses fast_random(xoshiro128pp) so learned projection matrices are portable and reproducible.
  • Projection API: Transforms a new instance into a list of component_N-Value pairs.
  • Model Export: Learned reducers can be exported as predicate clauses or written to a file.

Options

The learn/3 predicate accepts the following options:

  • n_components/1: Number of random projection components to sample. Requests that exceed the number of features raise domain_error(component_count, Requested-Maximum). The default is 2.
  • feature_scaling/1: Whether to standardize continuous attributes before projection. Possible values: true (default) or false.
  • random_seed/1: Positive integer used to seed the portable pseudo-random generator before sampling the projection matrix. The default is 1357911.
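
The error raised for an oversized n_components/1 request can be observed with the standard catch/3 predicate. The sketch below assumes that the correlated_plane sample dataset from the dimension_reduction_protocols library has fewer than five attributes and that the exception follows the usual error(Error, Context) convention. Neither assumption is confirmed here:

::

    | ?- logtalk_load(dimension_reduction_protocols('test_datasets/correlated_plane')),
         catch(
             random_projection::learn(correlated_plane, _, [n_components(5)]),
             error(Error, _),
             true
         ).

Under those assumptions, Error would be bound to a domain_error(component_count, Requested-Maximum) term.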

Usage

The following examples use the sample datasets shipped with the dimension_reduction_protocols library:

::

    | ?- logtalk_load(dimension_reduction_protocols('test_datasets/correlated_plane')),
         logtalk_load(dimension_reduction_protocols('test_datasets/high_dimensional_measurements')).

Learning a reducer
~~~~~~~~~~~~~~~~~~

::

    | ?- random_projection::learn(correlated_plane, DimensionReducer).

    | ?- random_projection::learn(correlated_plane, DimensionReducer, [n_components(1), feature_scaling(false), random_seed(17)]).

Transforming new instances
~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    | ?- random_projection::learn(high_dimensional_measurements, DimensionReducer, [random_seed(11)]),
         random_projection::transform(DimensionReducer, [f1-0.9, f2-1.1, f3-1.0, f4-2.0, f5-2.2, f6-2.1], ReducedInstance).

    | ?- random_projection::learn(correlated_plane, DimensionReducer, [n_components(1), random_seed(19)]),
         random_projection::transform(DimensionReducer, [x-1.0, y-2.0, z-3.0], ReducedInstance).
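
Because the projection matrix is sampled from a seeded generator, learning twice with the same seed is expected to give the same projection. A minimal check, assuming the sample datasets are loaded as shown at the start of this section (the variable names are illustrative only):

::

    | ?- Instance = [x-1.0, y-2.0, z-3.0],
         Options = [n_components(1), random_seed(19)],
         random_projection::learn(correlated_plane, Reducer1, Options),
         random_projection::learn(correlated_plane, Reducer2, Options),
         random_projection::transform(Reducer1, Instance, Reduced1),
         random_projection::transform(Reducer2, Instance, Reduced2),
         Reduced1 == Reduced2.

The final comparison should succeed if both reducers were learned from the same data, options, and seed.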

Exporting and reusing the reducer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    | ?- random_projection::learn(correlated_plane, DimensionReducer, [n_components(1), random_seed(29)]),
         random_projection::export_to_file(correlated_plane, DimensionReducer, reducer, 'random_projection_reducer.pl').

    | ?- logtalk_load('random_projection_reducer.pl'),
         reducer(Reducer),
         random_projection::transform(Reducer, [x-1.0, y-2.0, z-3.0], ReducedInstance).

Dimension reducer representation

The learned dimension reducer is represented by a compound term with the functor chosen by the implementation and arity 3. For example:

::

    random_projection_reducer(Encoders, Components, Diagnostics)

Where:

  • Encoders: List of continuous attribute encoders storing attribute name, mean, and scale.
  • Components: List of sampled projection vectors in component order.
  • Diagnostics: Learned reducer metadata including the effective training options and reproducibility details.

When exported using export_to_clauses/4 or export_to_file/4, this reducer term is serialized directly as the single argument of the generated predicate clause so that the exported model can be loaded and reused as-is.
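
Since the reducer is an ordinary compound term of arity 3, its parts can be taken apart with the standard =../2 (univ) predicate without relying on the functor name. A small sketch, assuming the correlated_plane sample dataset is loaded as in the Usage section:

::

    | ?- random_projection::learn(correlated_plane, DimensionReducer, [n_components(1), random_seed(29)]),
         DimensionReducer =.. [Functor, Encoders, Components, Diagnostics].

With n_components(1), Components would be expected to hold a single projection vector.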

References

  1. Johnson, W. B. and Lindenstrauss, J. (1984) - "Extensions of Lipschitz mappings into a Hilbert space".
  2. Achlioptas, D. (2003) - "Database-friendly random projections: Johnson-Lindenstrauss with binary coins".