| Did you know ... | Search Documentation: |
| Pack logtalk -- logtalk-3.100.1/docs/handbook/_sources/libraries/dimension_reduction_protocols.rst.txt |
.. _library_dimension_reduction_protocols:
dimension_reduction_protocols
This library provides protocols and reusable support predicates used in
the implementation of machine learning dimension reduction algorithms.
Unsupervised datasets are represented as objects implementing the
dimension_reduction_dataset_protocol protocol. Labeled datasets for
supervised reducers such as Linear Discriminant Analysis are represented
as objects implementing the
supervised_dimension_reduction_dataset_protocol protocol.
Target-valued datasets for reducers such as partial least squares
projections are represented as objects implementing the
target_supervised_dimension_reduction_dataset_protocol protocol.
Dimension reducers are represented as objects importing the
dimension_reducer_common category.
This library also provides reusable test datasets and a small smoke-test suite.
Concrete dimension reduction algorithms are intentionally out of scope
for this package. The goal is to provide a portable foundation for
libraries such as pca_projection, truncated_svd_projection,
random_projection, nmf_projection, lda_projection, and
pls_projection, including shared reducer helpers for common tasks
such as feature encoding, projection, reducer validation, diagnostics,
export, and pretty-printing.
Exported reducers are serialized protocol-wide as single-argument
predicates such as reducer(Reducer), allowing the saved reducer term
to be loaded and passed directly to transform/3.
All reducers importing the shared category are also expected to record a diagnostics list containing at least model/1 and options/1 terms. The shared protocol surface exposes this metadata using the diagnostics/2, diagnostic/2, and dimension_reducer_options/2 predicates and validates serialized reducers using check_dimension_reducer/1.
Open the `../../apis/library_index.html#dimension_reduction_protocols <../../apis/library_index.html#dimension_reduction_protocols>`__ link in a web browser.
To load all entities in this library, load the loader.lgt file:
::
| ?- logtalk_load(dimension_reduction_protocols(loader)).
The dimension_reducer_common category supports the
feature_scaling/1 option used by importing reducers to control
continuous feature normalization before projection:
feature_scaling(true) standardizes each continuous attribute using
its training-set mean and standard deviation.feature_scaling(false) only centers each continuous attribute
using its training-set mean.
The current pca_projection, random_projection,
lda_projection, and pls_projection libraries all define
feature_scaling(true) as their default.
Reducers can also specialize preprocessing behavior. For example, the
nmf_projection library only supports center(false) because the
learned feature representation must remain non-negative.
To run the library smoke tests, load the tester.lgt file:
::
| ?- logtalk_load(dimension_reduction_protocols(tester)).
Several sample datasets are included in the test_datasets directory:
correlated_plane.lgt â A compact continuous dataset with 8
examples and 3 continuous attributes (x, y, z) where the
features are strongly correlated. It is intended for PCA and
projection smoke tests.high_dimensional_measurements.lgt â A small continuous dataset
with 10 examples and 6 continuous attributes (f1 through f6).
It is intended for testing projected dimensionality and reducer output
shape.low_rank_rectangular.lgt â A compact continuous dataset with 4
examples and 3 continuous attributes whose third feature is the sum of
the first two. It is intended for testing matrix-rank truncation and
singular-value-based reducers such as truncated_svd_projection.labeled_measurements.lgt â A compact labeled continuous dataset
with 9 examples, 4 continuous attributes (length, width,
height, weight), and 3 class labels (alpha, beta,
gamma). It is intended for testing supervised reducers such as
lda_projection.parts_based_measurements.lgt â A compact non-negative continuous
dataset with 5 examples and 4 continuous attributes (f1 through
f4) built from two latent additive parts. It is intended for
testing parts-based reducers such as nmf_projection.target_latent_measurements.lgt â A compact target-valued
continuous dataset with 6 examples, 4 continuous attributes (f1
through f4), and a numeric target attribute (score) influenced
by two latent directions. It is intended for testing target-supervised
reducers such as pls_projection.target_measurements object in the library smoke tests, which
exercises the
target_supervised_dimension_reduction_dataset_protocol contract.